Deep Learning Applications & Advanced Analysis in Hydrocarbon Refineries

Deep learning is an umbrella term that covers many things. In a strict sense, it means using computer programs to do what intelligent humans can do, and often to do it even better. Deep learning (cognitive science) is now the most popular computer science course at universities in the USA and Europe.

Five Attributes of Deep Learning

The cognitive tasks of AI can be divided into five categories:

(1) Perception.

(2) Learning.

(3) Forecasting.

(4) Reasoning.

(5) Coordinating.

With perception, deep learning can understand the environment with sensing, and detect and recognize occurrences; is that smell a fuel leak? From this it can learn by synthesizing that information into knowledge; this could be learning the relationship between temperature set points and distillate yield.

With forecasting, it extracts value from the generated data by predicting with high precision. We can simulate outcomes such as reservoir performance (IPR) at various topside operating conditions.

When it comes to solving logical problems, or reasoning, deep learning can make decisions or suggest the best solutions; given what I know, what is the optimal distribution of my products at different terminal sites?

Despite the expanding range of problems deep learning can solve, there is one thing which no deep learning program has been able to replace humans in: defining the problem itself.

Given the obvious benefits that can be derived from adopting ML in refineries, what are the challenges that downstream oil and gas companies face when they embark on a program?

One of the biggest mistakes that companies make is that they embark on deep learning without first defining the problem. They collect lots of data, but do not know what to do with it, since they do not know what problem they are trying to solve by collecting all this data.

Machine Learning roles in Dynamic Nodal Analysis

One of the most popular subcategories today is machine learning. In fact, machine learning has become so popular that many people equate machine learning to deep learning.

Machine learning is popular because it overcomes scientific unknowns through large quantities of historical data, and hence has made fortunes for companies that in the past found their data too complex to interpret.

Machine Learning Algorithm Definition:

Machine learning is based on pattern recognition, and machine learning methods consider all data as either inputs (features) or outputs (prediction). Multiple inputs are fed into an algorithm that produces an output. If the output does not match the actual data, the algorithm is tweaked to do better next time. This is called training in machine learning.
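
The training loop described above can be illustrated with a minimal sketch. It assumes a single input feature, a straight-line model, and plain gradient descent as the "tweaking" rule; the data, the function name train, and the learning rate are all made up for illustration and do not come from any refinery data set.

```python
# Illustrative only: a one-feature linear model trained by gradient descent.
# The data below is invented for this sketch (generated from y = 2x + 1).

def train(inputs, targets, learning_rate=0.05, epochs=2000):
    """Fit prediction = weight * x + bias by repeatedly reducing the error."""
    weight, bias = 0.0, 0.0
    n = len(inputs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(inputs, targets):
            error = (weight * x + bias) - y   # output minus actual data
            grad_w += 2 * error * x / n       # gradient of the mean squared error
            grad_b += 2 * error / n
        # "Tweak" the algorithm so that it does better next time.
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias

features = [0.0, 1.0, 2.0, 3.0, 4.0]   # inputs (features)
actuals = [1.0, 3.0, 5.0, 7.0, 9.0]    # outputs the model should reproduce
weight, bias = train(features, actuals)
```

After training, weight and bias should sit close to the slope and intercept that generated the data; real applications use many features and a library rather than a hand-written loop.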

Because machine learning relies on large quantities of data about the same subject, it is better at very focused problems and parameters, such as what is the relationship between vibration and engine failure?

Machine learning behaves poorly when the problem is a system problem with more complexity, such as a refining process or a logistics supply chain for oil that has many moving parts, which prevents repeating patterns.

It can also struggle when most of the information is domain specific, such as the pressure setting on a steam boiler, which has a certain relationship with the steam energy generated and, subsequently, with the processes in the distillation column. Such domain-specific information cannot be extracted from the data unless an engineer or data scientist has spent time structuring and correlating the data so that it correctly represents these relationships; this is something that machine learning cannot replace. The cost of this manual work is often ignored when companies want to train models on their data, and they end up without meaningful conclusions.

Another problem occurs when time and sequence are important. Most machine learning programs do not incorporate time-based patterns. For example, the best way to predict the loading queue at the terminal in the next hour is to count the current queue length. Likewise, fuel demand estimates at a retail fuel station need information such as the month of the year and the day of the week in order to predict more accurately. This is where time series come in. The central point that differentiates time series problems from most other statistical problems is that in a time series the observations are not mutually independent. Rather, a single chance event may affect all later data points.
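
One common way to give a model access to time is to turn each observation into a row that carries calendar information and the previous value. The sketch below assumes daily fuel demand figures; the function build_features, the field names, and the numbers are hypothetical.

```python
from datetime import datetime, timedelta

def build_features(timestamps, demand, lag=1):
    """Return one feature row per observation that has `lag` earlier values."""
    rows = []
    for i in range(lag, len(demand)):
        ts = timestamps[i]
        rows.append({
            "month": ts.month,                   # month of the year
            "day_of_week": ts.weekday(),         # Monday = 0 ... Sunday = 6
            "previous_demand": demand[i - lag],  # earlier observation as an input
            "target": demand[i],                 # value the model should predict
        })
    return rows

start = datetime(2024, 1, 1)                       # a Monday
timestamps = [start + timedelta(days=d) for d in range(5)]
daily_demand = [100, 110, 105, 120, 130]           # hypothetical daily volumes
rows = build_features(timestamps, daily_demand)
```

Because each row contains the previous day's demand, the dependence between successive observations becomes something the model can learn instead of something it ignores.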

Yet existing time series technology alone does not solve all the new problems either. Enterprises are trying to aggregate and store all data in a time series format, which understands time but misses all domain correlations. This correlation across the domain of operations is critical for gaining contextual intelligence. Even though the data historian is a familiar technology to start with, it is not sufficient.

Companies should consider the nature of the problems before they invest. You need the right AI tool based on the problem you have defined. Define the problem first, so that you can select the right tool. Do not make the auto industry’s mistake.

Categories of problem in downstream oil and gas

1-Scheduling/ allocation/ coordination problems:

2-Process optimization:

3-Monitoring, detection, faster responses:

4-Supply chain logistics:

Terminal Efficiency

In downstream terminals, maximizing loading efficiency can have a significant impact on the performance of terminal operations. Scheduling is a complex process: it has to account for truck arrival time, terminal queue, loading bay queue, and loading time, as well as the many combinations of trucks that require different products, against the required volumes and flow rates from pumps into different loading bays. The number of calculations grows exponentially as you consider all the variables in this process, and it becomes a nearly impossible task for humans.

Today’s manual process is typically experience based with some amount of guesswork, which does not optimize terminal operations. With predictive analysis from A-Stack, these variables can be used to calculate an optimized schedule that determines, for each truck, which loading bay it should use. This minimizes overall queuing for the terminal, maximizes loading efficiency, and improves supply chain logistics.
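
To make the scheduling problem concrete, here is a small sketch of a greedy heuristic: each truck, in arrival order, takes the compatible bay where it can start loading soonest. This is not A-Stack's method, which the text does not describe; the function assign_bays, the bay names, and the truck data are assumptions made for illustration, and a greedy rule does not guarantee the optimal schedule.

```python
def assign_bays(trucks, bays):
    """Greedy heuristic: each truck, in arrival order, takes the compatible bay
    that lets it start loading soonest. Returns (schedule, total_wait)."""
    free_at = {bay: 0 for bay in bays}   # minute at which each bay becomes free
    schedule = []
    total_wait = 0
    for name, arrival, product, load_time in sorted(trucks, key=lambda t: t[1]):
        candidates = [bay for bay, products in bays.items() if product in products]
        if not candidates:
            raise ValueError(f"no bay can load {product}")
        bay = min(candidates, key=lambda b: max(free_at[b], arrival))
        start = max(free_at[bay], arrival)
        free_at[bay] = start + load_time
        total_wait += start - arrival
        schedule.append((name, bay, start))
    return schedule, total_wait

# Hypothetical terminal: which products each bay can load.
bays = {"bay_1": {"diesel", "gasoline"}, "bay_2": {"diesel"}}
# Hypothetical trucks: (name, arrival minute, product, loading minutes).
trucks = [
    ("T1", 0, "gasoline", 30),
    ("T2", 5, "diesel", 20),
    ("T3", 10, "gasoline", 25),
    ("T4", 12, "diesel", 15),
]
schedule, total_wait = assign_bays(trucks, bays)
```

Even in this tiny case the bay choice matters: sending a diesel truck to the only gasoline-capable bay would lengthen the queue for later gasoline trucks, which is the kind of interaction that grows quickly with more trucks, products, and bays.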


Five categories of intelligence can be identified, based on the various modes of operation and the different areas of application in the oil and gas industry.

1 Future intelligence: the ability to forecast future events with good confidence & high accuracy based on the learning from current events.

2 Historical intelligence: the ability to understand what happened.

3 Contextual intelligence: the ability to correlate multiple factors in a context and make sense of what is happening.

4 Domain intelligence: the ability to deepen domain knowledge/science.

5 Logical intelligence: the ability to compute numerous logical conditions simultaneously and find the solution.

Until next time,

Emad Gebesy, Founder of Optimize Global Solutions, Subject Matter Expert

Machine Learning and Deep Learning for Fire Detection

By definition, a fire is a process in which substances combine chemically with oxygen from the air and typically give out bright light, heat, and smoke; in other words, combustion or burning. In the oil and gas industry, fire hazards are a major issue that is faced on a regular basis. For companies to understand how to reduce these fires, as well as the damage and injuries they cause, it is important to know the differences between the types of fires that can occur. This article is about understanding those types, and more specifically about comparing a jet fire, a pool fire, and a flash fire. We will also look into how these fires can be prevented with the use of machine learning. To understand the differences, we first need to see what causes these fires, which variables are the same across all three, and which variables distinguish them.

Jet fires are caused by high-pressure releases of hydrocarbon, which make the flames shoot out in one direction, much like a flamethrower. For this fire to occur, oxygen is also needed, as it allows the fire to breathe. However, for the fire to start, there needs to be a source of ignition. There are many potential sources of ignition, such as a spark, heat from hot surfaces, or even the reflection from mobile or tablet devices. Another source could be cigarettes, which is why they are forbidden even in gas stations. The image above represents a case of a jet fire, which we can identify with certainty by the way the fire is shooting in one direction.

Pool fires, on the other hand, are the result of liquid hydrocarbon interacting with oxygen and a source of ignition. Since in this case the hydrocarbon is liquid, the fire spreads out along the liquid rather than shooting out from a highly pressurized point. This can be seen in the image above: the flames are not under high pressure and are a lot more spread out. This type of fire can start when liquid hydrocarbon comes into contact with air and with any one of the sources of ignition mentioned previously. This in turn starts a flame that then spreads throughout the liquid, as shown in the image.

Finally, a flash fire is caused when hydrocarbon gas slowly seeps out until it is set alight by a source of ignition, whether a spark or high temperatures. The fire then travels through the air until it has consumed all the gas in the air. The image above represents a scenario like this, where the fire is fully spread out in the air, consuming the gases all around.

Understanding how the different fires happen allows companies to optimize their work and fix the roots of the problems before fires are able to happen. For jet fires, it is important that highly pressurized gases are not able to come into contact with oxygen and a source of ignition. To prevent pool fires, pipe leaks of liquid hydrocarbon must be sealed off so that the liquid cannot catch fire. As for flash fires, hydrocarbon gas should not be able to seep out from pipes, as it can catch fire in an instant, causing safety risks for workers. These steps are simple in theory, but a lot harder in practice considering how big the working facilities are and how many pipes need to be maintained.

Luckily, as technology progresses rapidly, we have more equipment at our disposal, such as sensors and IoT devices. However, it is still hard to process the data they provide in order to prevent potential fires. This is why deep learning is used to create a predictive model that helps companies predict disasters ahead of time, allowing them to be proactive rather than reactive. This is done by combining information from past results with real-time results, creating simple insights that allow users to detect defects that would have led to a malfunction in the system.


The challenge lies in the fact that the oil and gas industry is structured towards avoiding failure, meaning that the needed examples of failure patterns can be hard to find, though not impossible. Still, what deep learning brings, if done correctly, is an improved approach that can prevent many catastrophic events both for the environment and for workers. Over time these methods will improve further, removing a lot more risk from an industry that is heavily connected to our everyday lives.

Power of Machine Learning in Dynamic QRA using PyRISK™

We are celebrating our ongoing successful business case with PetroGulf Misr: performing a Dynamic QRA for their offshore topside facilities to understand the baseline risk and any additional risk, with the intention of making recommendations, if required, to reduce the risk to tolerable limits.

One of the requirements of this project is to perform a Quantitative Risk Assessment to identify the risk. We highlighted to the PetroGulf Misr team the limitations of static QRA, which usually provides an overview of the risk at a fixed reference point in time. Such studies involve hundreds, sometimes thousands, of scenarios covering operating conditions, manning levels, and maintenance activities. The cost can be substantial, and a study may need repeating every few years.

As an alternative, Optimize Global Solutions and Kageera have offered Dynamic QRA for this challenging project to empower the decision making of HSE managers and executives. Optimize Global Solutions’ product in cooperation with Kageera, PyRISK™, unlocks the opportunities of machine learning in risk management. It supports customers already using PyRISK™ to perform Dynamic QRA in getting even more value from the data at no additional cost.

Data Preparation

Given the critical importance of having the right, clean data, Optimize Global Solutions and Kageera apply a holistic approach to executing their oil and gas projects, one that sets them apart from competitors: every process, from inception to culmination, falls within their own expertise and knowledge base. The following processes are applied in chronological order to ensure that the data is fit for the subsequent machine learning activities.

1-Problem Definition and Framing out.

2-Functional Block Assessment.

3-Frequency Calculations.

4-Consequence Modeling for governing Cases.

5-Sensitivity Analysis using VBA-based PyRISK™.

Problem Frame-out

Geisum North Field platform comprises four (4) decks, namely

1-Upper Deck.

2-Main Deck.

3-Machinery Deck.

4-Lower Deck

Figure: red blocks denote the hazardous operations, where the facilities handle hydrocarbons.

Functional Block Assessment

The objective of the functional blocks is to identify the global scenarios and the static and dynamic inventories for each key unit operation. Each functional block should include the operating temperature and pressure plus the fluid compositions, for further consequence modeling using DNV GL Phast.

From the P&IDs, the pipe sizes, approximate lengths, and numbers of flanges, manual valves, instrument connections, etc. are counted in order to calculate the frequency of failure.

Frequency Calculations

The frequency of a leak event is important for understanding the likelihood of each scenario and the frequency of occurrence of each hazard. Following the latest codes and standards in the oil and gas industry, and drawing on programming experience, the VBA-based application CCE™ calculates accurate frequencies for scenarios of various leak sizes, e.g. small, medium, and large leaks.
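
A parts-count calculation of this kind can be sketched as follows. This is not the CCE™ implementation, which is not shown in the text: the per-component leak frequencies below are hypothetical placeholders, and a real study would take them from a published failure-rate database. The function block_leak_frequency and the component names are likewise assumptions.

```python
# HYPOTHETICAL leak frequencies (per item per year; per metre per year for piping).
# Placeholder values for illustration only, not from any standard or database.
LEAK_FREQ = {
    "flange":       {"small": 1e-4, "medium": 1e-5, "large": 1e-6},
    "manual_valve": {"small": 2e-4, "medium": 2e-5, "large": 2e-6},
    "instrument":   {"small": 5e-4, "medium": 5e-5, "large": 5e-6},
    "pipe_metre":   {"small": 1e-5, "medium": 1e-6, "large": 1e-7},
}

def block_leak_frequency(parts_count):
    """Sum count x frequency over all components, separately per leak size."""
    totals = {"small": 0.0, "medium": 0.0, "large": 0.0}
    for component, count in parts_count.items():
        for size, freq in LEAK_FREQ[component].items():
            totals[size] += count * freq
    return totals

# Hypothetical parts count for one functional block, as read off the P&IDs.
separator_block = {"flange": 20, "manual_valve": 10, "instrument": 6, "pipe_metre": 50}
frequencies = block_leak_frequency(separator_block)
```

The result is one leak frequency per year for each leak size of the functional block, which is the quantity that feeds the later risk calculation.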

Consequence Modeling

Quantifying the hazard intensity threshold / Severity of each potential hazard is essential for our machine learning solutions.

Data Generation & Dynamic QRA

This step is a combination of science and art: the produced data should be properly distributed, with the intention of avoiding overfitting and underfitting.

The VBA-based application PyRISK™ (an in-house tool programmed by Optimize and Kageera) is supplied with thermodynamic data from Phast and is able to run various sensitivity cases. This tool was instrumental in cutting the analysis time by nearly 65% compared with conventional ways of working.

Front-end Machine Learning Application

Starting from exploratory data analysis, continuing through algorithm testing, and culminating in the selection of the most accurate predictive model: this is one of the outstanding strengths of Kageera, and it enables our smart solutions to take their place in the industry with full recognition from our clients.

(1) Quantitative risk assessment (QRA) benefits an operator more than simply complying with varying regulations.

(2) Dynamically updating QRA offers cost, production, and safety benefits

(3) Dynamic QRA supports decision making and planning to improve risk management

(4) Digitization assisted by online tools enables dynamic QRA

Optimize Oil & Gas Production with Digital Twins (LENAᵀᴹ)

We believe data, algorithms, and software should power the industry and its people, so that they can use their creativity to shape a profitable, safe, and sustainable present and future. Today, heavy-asset industries like oil and gas, renewables, and energy have reached a digitalization tipping point. Increasing access to data has made data handling a game changer, even in industries that have historically been considered far from high-tech.

We believe machine learning is not a magic wand, although it is an entirely new technology that is just at the beginning of its usage. Only 4% of companies in the whole world are in the early-practice phase. But humans will still be the ones to drive the change.

Companies need to consider investing in the technology of digital twins (LENAᵀᴹ) that will amplify the experience and skills of their own people and assets. Digital twin technology is there to inform users about their operations and suggest measures to avoid any downtime. Once again, humans bring their own expertise to the table, supplemented by data-driven decisions and the ability to examine the data more deeply before taking any action. A creative mind is something that cannot be automated, and humans are essential in the artificial intelligence reality that is coming.

Online Digital Twins

As the operational life continues, the digital copy is updated automatically, in real-time, with current data, work records, and engineering information to optimize maintenance and operational activities. Using this information, engineers, managers, and operators can easily search the asset tags to access critical up-to-date engineering and work information and find the health of a particular asset. Previously, such tasks would take considerable time and effort, and would often lead to issues being missed, leading to failures or production outages. With Online LENAᵀᴹ, operational and asset issues are flagged and addressed early on, and the workflow becomes preventative, instead of reactive.

The reliable, real-time process data from the digital twin can be fed into simulation and analytics to optimize overall production, process conditions, and even predict failures ahead of time. A digital twin, when combined with powerful analytics and machine learning, enables predictive maintenance and optimized processes. Analytics leverage advanced pattern recognition, statistical models, mathematical models and machine learning algorithms to model an asset’s operating profile and processes and predict future performance. Appropriate, timely actions are then recommended to reduce unplanned downtime and to optimize operating conditions. With the digital twin, process simulation can also be performed to optimize the operating models based on their physical properties and thermodynamic laws.

The following three-step approach, enabled by the digital twin, to optimizing oil and gas production (from gathering systems to gas processing plants) is fundamental to improving performance and boosting profitability:

1. Steady State

At the Front-End Engineering and Design (FEED) stage, steady-state simulation models of gas processing and other units can be created to optimize the design. During operations, engineers and operators can perform engineering studies offline to identify design changes that will significantly increase throughput and the reliability and safety of plant operation.

Data analytics can be used to model fluid flow behavior in pipelines, in multiphase or single-phase flow, to predict pipeline holdup and potential slugging in the network. Understanding flow performance is key to optimizing the gathering network design, reducing CAPEX, and optimizing pipeline operation. With a unified simulation platform, the evolution from a steady-state to a dynamic simulation can be achieved effortlessly.

2. Dynamic Modeling

Dynamic modeling based on ordinary differential equations and partial differential equations can be performed on these models to validate process design such as relief and flare systems, changes in feedstock, production capacity adjustment, and controls, enabling engineers to optimize the design and reduce CAPEX and OPEX. In addition, the dynamic simulation allows effective troubleshooting, control system checkouts, and comprehensive evaluations of standard and emergency operating procedures to shorten time requirements for safe plant start-up and shutdown.
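
As a toy illustration of ODE-based dynamic modeling, the sketch below integrates the level of a single vessel with a constant inflow and a gravity-driven outflow, using the explicit Euler method. The model dh/dt = (q_in - k*sqrt(h)) / area, the function simulate_level, and all parameter values are assumptions chosen for illustration; commercial dynamic simulators solve far larger equation systems with more robust integrators.

```python
import math

def simulate_level(h0, q_in, area=2.0, k=0.5, dt=0.1, steps=2000):
    """Explicit Euler integration of dh/dt = (q_in - k*sqrt(h)) / area."""
    h = h0
    history = [h]
    for _ in range(steps):
        outflow = k * math.sqrt(max(h, 0.0))            # gravity-driven drain
        h = max(h + dt * (q_in - outflow) / area, 0.0)  # level cannot go negative
        history.append(h)
    return history

levels = simulate_level(h0=1.0, q_in=1.0)
steady_state = (1.0 / 0.5) ** 2   # level at which inflow equals outflow
```

The trajectory in levels shows how the vessel responds over time to a step in inflow before settling near steady_state, which is the kind of transient that a steady-state model cannot show.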

3. Predictive Analytics to Monitor Equipment Health

Minimizing plant downtime is key to improving production, and this is where predictive analytics comes in. Predictive analytics enables modeling of rotating equipment performance (such as pumps, compressors, and turbines) using advanced pattern recognition and machine learning algorithms to identify and diagnose any potential operating issues days or weeks before failures occur. Operating models including past loading, ambient, and operational conditions are used to create a unique asset signature for each type of equipment. Real-time operating data is then compared against these models to detect any subtle deviations from expected equipment behavior, allowing reliable and effective monitoring of different types of equipment with no programming required during setup. The early-warning notification allows reliability and maintenance teams to assess, identify, and resolve problems, preventing major breakdowns that can cost companies millions of dollars in production slowdowns or stoppages.
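
The idea of comparing live data against an asset signature can be sketched in its simplest form, assuming the signature is just the mean and standard deviation of one healthy-period sensor and a deviation is any reading more than three standard deviations away. Commercial tools use much richer multivariate models; the function names and the vibration values below are made up for illustration.

```python
from statistics import mean, stdev

def build_signature(history):
    """Summarize healthy operation as (mean, sample standard deviation)."""
    return mean(history), stdev(history)

def detect_deviations(signature, readings, z_limit=3.0):
    """Return the indices of readings that fall outside mean +/- z_limit * sigma."""
    mu, sigma = signature
    return [i for i, value in enumerate(readings) if abs(value - mu) > z_limit * sigma]

# Hypothetical pump vibration readings (mm/s) from a healthy period.
baseline_vibration = [2.0, 2.1, 1.9, 2.0, 2.2, 1.8, 2.0, 2.1, 1.9, 2.0]
signature = build_signature(baseline_vibration)

# Hypothetical real-time readings; the third and fourth drift upwards.
live_readings = [2.0, 2.1, 2.6, 3.4, 1.9]
alerts = detect_deviations(signature, live_readings)
```

The indices in alerts are the readings that would trigger an early-warning notification for the maintenance team to investigate.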

Brief Overview of Digital Twins

A digital twin works as a rigorous back-end model that runs to report key data, such as production throughput and product purity, but it also includes extra features and tools for optimizing oil and gas production. Digital twins can work either offline (case studies and what-if analysis) or online (data acquisition and optimization), depending on the business value required.

Digital twins enable operational excellence by helping oil and gas engineers, managers, operators, and owners take a model-focused approach that quickly turns massive amounts of data into business value.

These powerful data insights mean:

  1. Asset failure can be predicted.
  2. Hidden revenue opportunities can be uncovered and realized.
  3. Businesses can continuously improve in the ever-changing, competitive marketplace.

In a nutshell, the digital twin is the foundation of a digital transformation that optimizes production, detects equipment problems before failure occurs, and uncovers new opportunities for process improvement, all while reducing unplanned downtime. Depending on the facility itself, you can combine data analytics, machine learning, Big Data, and software applications in order to utilize the power of the data and turn it into business value for every oil and gas company.

Until next time,

Manja Bogicevic



6 APPLICATIONS OF MACHINE LEARNING IN OIL AND GAS

The issue of environmental sustainability is a major concern for governments and for players in the oil and gas industry worldwide. The negative impacts of activities carried out by oil and gas companies, such as pollution, have taken a major toll not only on the livelihood and health of people but even more so on the environment.

The risk, hazards, and severity of environmental pollution can and will be reduced with the use of machine learning and deep learning in the years to come. There have been impressive applications in chemical engineering, process safety, process control tuning, and advanced dynamic and process optimization. Yet when it comes to software, and especially machine learning solutions, the oil and gas industry has seen hardly any innovation in the last 15 years.

If you are a manager in oil and gas, you are afraid that a hazard will occur and that you cannot reduce the risk. What can you do about it?

1. Predictive analytics for MPPL (Multiple Pipeline Product)

Model-based, predictive analytics lends itself to crunching multiple data sources to pinpoint risks for pipeline integrity management, while analytics for process monitoring and measurement evolve to better discover crucial variances. Kageera and Optimize Global Solutions vendors are starting to use rules and heuristic techniques to spot deviations impacting the accurate understanding of flow, production, and gas quality. Through the ability to set thresholds on deviations, measurement software can help spot problems such as missing data, suspect data, or uncollected data.
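
A rule-based check of this kind can be sketched as follows, assuming each meter has an expected reference value and a relative tolerance. The function check_measurements, the meter tags, and the 5% tolerance are illustrative assumptions, not a description of any vendor's measurement software.

```python
def check_measurements(readings, expected, tolerance=0.05):
    """Classify each expected tag as 'ok', 'missing' or 'suspect'."""
    report = {}
    for tag, reference in expected.items():
        value = readings.get(tag)
        if value is None:
            report[tag] = "missing"      # no data collected for this tag
        elif abs(value - reference) > tolerance * abs(reference):
            report[tag] = "suspect"      # deviation exceeds the threshold
        else:
            report[tag] = "ok"
    return report

# Hypothetical expected flows and the latest collected readings.
expected_flow = {"meter_A": 1000.0, "meter_B": 750.0, "meter_C": 500.0}
latest = {"meter_A": 1012.0, "meter_B": 610.0}
report = check_measurements(latest, expected_flow)
```

Here meter_A is within tolerance, meter_B deviates by more than 5% and is flagged as suspect, and meter_C has no reading and is flagged as missing.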

Predictive analytics for pipeline integrity is emerging, but engineers must advise analytics experts on underlying data sources and configuring KPIs.


is crucial to the midstream, as are ways of converting measurement and flow data into production reports. With solutions, look for analytics and reports that can quickly update trends and generate new reports as new data from metering or gas analysis become available.

The foundation for analytics goes all the way back to plant and asset design. Install enough instrumentation to drive better predictions.

To work well, an analytics program might identify the need for some newer instrumentation, such as replacing gas charts with EFMs, investing in new acoustic leak detection technology, or online corrosion measurement transmitters.

2. Boosting the Business Values via PyRISK™ Dynamic QRA

The Frontier of Dynamic QRA and Static QRA

The oil and gas industry has been using static quantitative risk assessment (QRA) and related studies for more than 50 years to evaluate risks of major accident hazards. It is applied to demonstrate a risk to the public and employees as part of complying with regulatory requirements, which vary worldwide.

Static QRA studies typically play an important role early in the design stage of a project’s capital expenditure (CAPEX) phase; for example, when evaluating concepts, optimizing design, and establishing cost-effective risk management.

Currently, static QRA for oil and gas assets usually provides an overview of risk at a fixed reference point in time. Such studies involve hundreds, sometimes thousands, of scenarios covering operating conditions, manning levels, and maintenance activities. The cost can be substantial, and a study may need repeating every few years. Despite this, some operators have recognized a need to use dynamic QRA as a fundamental decision-support tool for their projects and operations, beyond the explicit regulatory requirements. By doing so, they gain insights into how to save costs and maintain or increase production efficiency while operating safely.

Some operators are now moving beyond static QRA models to make better use of the huge volume of data that is collected in a process involving considerable time, effort, and cost. They are starting, or planning, to update data more frequently using a dynamic approach to risk-based safety management. New cloud-based tools, combining secure data storage and analytics, are assisting them on this journey in pursuit of potential cost, operational, and safety benefits.


The Quantitative Risk Assessment project is expected to save 60% of the work effort through the full utilization of machine learning.

The outcome is faster and better decision support and simpler, easier risk communication in all phases of an asset’s life cycle. We estimate that PyRISK™ can reduce the overall cost of safety studies by 40 to 60% over the lifetime of an asset.

Dynamic Risk Estimation & Decision Making

Kageera’s product in cooperation with Optimize Global Solutions for PyRISK™ unlocks opportunities for machine learning in risk management. It supports customers already using PyRISK™ to perform Dynamic QRA to get even more value from the data at no additional cost.

The PyRISK™ approach offers significant cost advantages. Our estimate is that an operator may end up spending 30–40% less on safety studies over the 25- to 30-year lifecycle of a medium- to large-sized asset, while getting much more value out of the data generated in their QRA studies.


3. PySEP

PySEP™: An integrated and holistic approach to delivering bespoke models, covering a wide diversity of process modeling, automation and optimization, linear programming, data analytics, and machine learning solutions for wide-ranging applications in offshore processing, gas processing, LPG, refineries, and chemical plants. PySEP™ solutions provide functions including, but not limited to:

  • Field life cycle analysis and project evaluation.
  • Reservoir management and dynamic nodal analysis.
  • Flow assurance and hydro-dynamic slug management.
  • Refinery linear programming.
  • Operator training simulation.
  • The data-driven decision platform empowers senior management to take the best-suited decisions in a timely manner.
  • Avoid unplanned downtime.

4. PyFCC

PyFCC™: A machine learning smart solution for Fluid Catalytic Cracking, delivered with out-of-the-box integrated architectures that seamlessly connect the flow of information between the operating facilities and the final predictive model in order to cut through field complexities. The out-of-the-box integration provides an opportunity to:

  • Proactively predict the FCC yield in response to any potential change in the refinery feedstock.
  • Predict kinetic parameters and overall conversion efficiency, taking the actual operating conditions into account.
  • The data-driven decision platform empowers senior management to take the best-suited decisions in a timely manner.
  • Avoid unplanned downtime.

5. Remote operation and performance shutdown using IoT (PyMACHINA)

Machine learning, deep learning, and the Internet of Things (IoT) could potentially revolutionize the oil and gas industry. Having already taken various other industries by storm, including consumer electronics, they could not come at a better time for the oil industry, which currently faces dramatic drops in the price of oil.

  • Remote operation with the use of IoT (there is no need for people to check anything on the platforms)
  • Remote notification alarms and performance shutdown

Roughly every 20 years there is a massive platform explosion; we can reduce this risk with deep learning algorithms and IoT.

  • Remote operations with the use of Deep Q learning and robotics (PyROBOT)


No accidents, no harm to people, no damage to the environment: that is the goal of our solutions in the oil and gas industry.

6. Deep Learning risk detecting and predictive diagnostics (PyVision™)

In an industry where oil prices are constantly rising, it has become imperative to act on operating costs, productivity and lead times in order to have better returns on investment.

There are different categories of Computer Vision such as image processing (including image recognition), facial recognition, optical character recognition. This variety means that CV can be useful for many different types of industries, and various practical use cases. Here are some concrete examples of CV applications that are currently in production:

  • Smart video surveillance systems that help detect, for example, a pool fire.
  • Quality-control automation for preventing fires in a refinery.


Everything mentioned above entails smart and quick solutions, a big reduction in companies’ operating expenses, and consequently lower overheads. This recipe for success would help these entities survive fierce competition and reduced oil prices.

On the other hand, process and chemical engineers should start pulling up their socks and refresh their minds by revisiting the basics and fundamentals of mathematics, so that they are fully equipped with the proper knowledge when they use ML and its applications in the O&G industry. To have a successful machine learning or deep learning solution, you need:

  • Initial screening with Experienced Process Flow Assurance Engineer and Machine Learning Engineer with a proven record in the field
  • An agile action plan with applicable and rising risks
  • Digest and analyze numerous sources of data

Until next time,

Happy machine learning in Oil and Gas.

If you are interested in our Oil and Gas solutions send me an email at manja.bogicevic[et]