Date: 2024-10-23

Job summary

Are you looking for a career in tech that truly helps make the world a better place? Bp is moving through the biggest transition in its 100+ year history with the goal of becoming one of the world’s largest renewable energy providers and achieving net zero carbon emissions by 2050!

To make this transition, we are looking for a Lead Data Scientist to deploy our integrated capability and standards in service of our net zero and safety ambitions. We will be collaborating to deliver competitive customer-focused energy solutions. We will be originating, scaling and commercialising innovative ideas and crafting ground-breaking new businesses.

Join us in creating, growing, and delivering innovation at pace, enabling us to thrive while transitioning to a net zero world!

What You Will Do

Define and implement sophisticated machine learning pipelines including data processing and aggregation, machine learning training and inference (e.g., time-series forecasting, classification, regression and clustering)

Work closely with data scientists and R&D staff to improve algorithms and machine learning models and adapt them to production environments (AWS-based)

Develop scalable machine learning models, improving efficiency, reducing run time and minimising costs.

Ensure code quality and adherence to best software engineering practice.

Work on end-to-end machine learning engagements, understanding the data models, processes and decisions that the machine learning models automate.

Ensure pipelines are automated to industry standard, with automated monitoring and failure detection, and ensure high availability of the solution.

What You Will Need To Be Successful

More than 8 years of advanced Python experience in a machine learning production context

Advanced, hands-on, end-to-end Python and SQL experience crafting sophisticated data and machine learning pipelines: real-time inference, batch inference, machine learning training, data persistence, use of feature stores, etc.

Extensive experience with the main Python packages, in particular pandas, numpy, pyspark, tensorflow and torch.

Ability to optimize Python code to improve performance and memory usage and to ensure scalability in a production environment (both horizontal and vertical)

API deployment experience. In-depth understanding of API technologies such as FastAPI (synchronous and asynchronous)

Experience working in an AWS environment using AWS tooling



Negligible travel should be expected with this role



Relocation may be negotiable for this role



This position is a hybrid of office/remote working



Commercial Acumen, Communication, Data Analysis, Data cleansing and transformation, Data domain knowledge, Data Integration, Data Management, Data Manipulation, Data Sourcing, Data strategy and governance, Data Structures and Algorithms, Data visualization and interpretation, Digital Security, Extract, transform and load, Group Problem Solving

