Oil & Natural Gas Analytics
Optimizing Oil & Natural Gas Trading Strategies Using Data Analytics

Problem Statement
The client is a Malaysia-based Fortune 500 global conglomerate in the petroleum, gas & energy sector, with a presence in over 100 countries. Its core business is the exploration and production of crude oil and natural gas, in Malaysia and internationally, spanning both onshore and offshore operations.
The company is a significant player in LNG trading, buying and selling LNG cargoes internationally. It engages in term trading to generate stable revenues and in spot trades to capitalize on market conditions. Because various teams and vendors used multiple ERP systems, trade postings did not match across systems, and no single system showed the complete trade information. The finance team wanted consolidated dashboards to understand the exact position of trade finances and P&L statements.
Challenges
After conversations with the business partners and IT teams, and an audit of the existing systems, we found the following:
- There was no single source of truth for the data. Trades were booked in both SAP and TRMS (ERP) and the data was later synced between them, so some trades were missing in SAP but existed in TRMS, and vice versa. This caused numerous issues in the source data.
- Data had to be reconciled manually against the finance team's Excel sheets to capture all trades, whether sourced from invoices (SAP) or the trade booking system (TRMS).
- Because the project sat under the client's finance department, the expected accuracy and precision were very high, even when the source data itself was unreliable.
- Every department used its own analytics tools and frameworks, which made it difficult for the finance department to derive data and analytics for its own use. With no single standard being followed, inefficiencies multiplied.
- Multiple analytics systems led to significant data redundancy as well as data mismatches between departments, causing interdepartmental confusion.
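To make the reconciliation gap concrete, the mismatch between the two booking systems reduces to a two-way comparison of trade identifiers. This is a minimal sketch in plain Python (the client's actual reconciliation ran as PySpark notebooks); the trade IDs below are hypothetical.

```python
# Hypothetical trade postings pulled from the two systems.
sap_trades = {"T-1001", "T-1002", "T-1004"}   # trades invoiced in SAP
trms_trades = {"T-1001", "T-1003", "T-1004"}  # trades booked in TRMS

# Trades present in one system but missing in the other.
missing_in_sap = trms_trades - sap_trades    # booked but never invoiced
missing_in_trms = sap_trades - trms_trades   # invoiced but never booked
matched = sap_trades & trms_trades           # consistent in both systems

print(f"Missing in SAP:  {sorted(missing_in_sap)}")
print(f"Missing in TRMS: {sorted(missing_in_trms)}")
print(f"Matched:         {sorted(matched)}")
```

At scale the same logic is an anti-join between the two trade tables rather than in-memory sets, but the shape of the problem is identical.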
Solution Design & Implementation

Our data architects and analysts first collaborated with the client's business focals and data-source owners to understand data velocity, volume, skewness, and variety, the consumers and producers of the data, and the decisive key metrics. We then implemented the following design:
- Data Lakehouse Architecture: All source data was loaded into a single location in its native form, so it could be analyzed on one platform.
- Data Reconciliation & Data Quality Framework (DQF): The need for a single source of truth, reflecting business trades accurately in their native form/schema, was addressed in the silver zone, where clean, reconciled data was loaded. The reconciliation was automated using PySpark notebooks.
- Databricks-powered Transformations: The bronze-to-silver data movement, comprising the reconciliation algorithms and the DQF, was implemented in PySpark notebooks on Databricks. Azure Databricks was also used for the transformation processes that move data from the silver to the gold layer.
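The bronze-to-silver pass above amounts to filtering raw records through a set of data-quality rules, keeping clean rows for the silver zone and quarantining the rest. The following is a minimal sketch of that logic in plain Python (the production version ran as PySpark notebooks on Databricks); the rule names and record fields are hypothetical.

```python
# Hypothetical raw (bronze-zone) trade records.
bronze = [
    {"trade_id": "T-1001", "volume_mmbtu": 3_200_000, "counterparty": "Buyer A"},
    {"trade_id": None,     "volume_mmbtu": 1_500_000, "counterparty": "Buyer B"},
    {"trade_id": "T-1004", "volume_mmbtu": -10,       "counterparty": "Buyer C"},
]

# DQF rules: each rule returns True when a record passes the check.
rules = {
    "has_trade_id": lambda r: r["trade_id"] is not None,
    "positive_volume": lambda r: r["volume_mmbtu"] > 0,
}

silver, rejected = [], []
for record in bronze:
    failed = [name for name, rule in rules.items() if not rule(record)]
    if failed:
        # Quarantined rows carry the names of the rules they violated,
        # which feeds the reconciliation and audit reporting.
        rejected.append({**record, "failed_rules": failed})
    else:
        silver.append(record)

print(f"Silver rows: {len(silver)}, rejected: {len(rejected)}")
```

In PySpark the same pattern is typically expressed as column-level filter expressions over a DataFrame, with rejected rows written to a separate quarantine table.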
Key Outcomes
- Reduction in reconciliations
- Accurate dashboards
- Reduced infra costs
- Savings on human capital
Future Plans
The DQF Project
The data quality algorithms are being extracted into a separate Data Quality Framework (DQF) for reuse in other products and applications.
Source Data Audits
Source data audits will be implemented to ensure source data correctness is maintained for quality reporting.
Data Literacy
Data Lakehouse adoption is being encouraged to replace manual reporting with automated reports built on the lakehouse.
Data Governance
Data governance policies for security, standardization, and compliance are to be implemented across the organization.