Strata + Hadoop Big Data
Kaushik Deka, Director of Engineering, Novantas
Phil Jarymiszyn, Director, Novantas
Enterprise adoption of big data in financial services is proving to be a key challenge because there is no cohesive framework for building high-value business applications across different lines of business. One ingredient of success is establishing a feature store: a library of documented metrics for various analyses, including predictive analytics, based on a shared data model. This central, well-governed repository of standardized, reusable, and transparent features empowers data scientists and business analysts to deliver results on a range of business problems across the enterprise.
Inspired by a real-world use case at a top-10 US bank, we will discuss the benefits of a Spark-based feature store and explore the three challenges we faced:
- Semantic Data Integration – Project the data from a data lake into a common conceptual data model and then leverage the data model to develop features
- High Performance Feature Engineering – Compute features efficiently at the customer level on top of the conceptual data model
- Metadata Governance – Enforce business metadata governance on the feature store
…and, most importantly, how we overcame these challenges to develop an ensemble of behavioral models to optimize the value of promotional pricing, as well as to analyze cash flow trends by customer segment.
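
To make the three challenges above concrete, here is a minimal sketch of the feature store idea in plain Python. It is not the implementation from the talk, which is Spark-based. The `Transaction` record stands in for a common conceptual data model, and the `Feature` and `FeatureStore` classes, their fields, and the rule that every feature needs a description and an owner are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Transaction:
    """Hypothetical record in a shared conceptual data model."""
    customer_id: str
    amount: float  # positive = inflow, negative = outflow


@dataclass(frozen=True)
class Feature:
    """A documented, reusable metric computed per customer."""
    name: str
    description: str  # business metadata (assumed to be mandatory)
    owner: str        # accountable team or person (assumed to be mandatory)
    compute: Callable[[List[Transaction]], float]


class FeatureStore:
    """Illustrative registry: features are defined once and reused."""

    def __init__(self) -> None:
        self._features: Dict[str, Feature] = {}

    def register(self, feature: Feature) -> None:
        # Assumed governance rule: reject undocumented or duplicate features.
        if not feature.description.strip() or not feature.owner.strip():
            raise ValueError(f"feature {feature.name!r} lacks metadata")
        if feature.name in self._features:
            raise ValueError(f"feature {feature.name!r} already registered")
        self._features[feature.name] = feature

    def compute(self, transactions: List[Transaction]) -> Dict[str, Dict[str, float]]:
        # Group records by customer, then evaluate every registered feature.
        by_customer: Dict[str, List[Transaction]] = {}
        for txn in transactions:
            by_customer.setdefault(txn.customer_id, []).append(txn)
        return {
            customer_id: {f.name: f.compute(rows) for f in self._features.values()}
            for customer_id, rows in by_customer.items()
        }


store = FeatureStore()
store.register(Feature(
    name="net_cash_flow",
    description="Sum of inflows and outflows for the customer",
    owner="analytics",
    compute=lambda rows: sum(r.amount for r in rows),
))
store.register(Feature(
    name="txn_count",
    description="Number of transactions for the customer",
    owner="analytics",
    compute=lambda rows: float(len(rows)),
))

features = store.compute([
    Transaction("c1", 100.0),
    Transaction("c1", -40.0),
    Transaction("c2", 25.0),
])
print(features["c1"]["net_cash_flow"])  # 60.0
```

In a Spark setting, the grouping step would typically become a group-by on the customer key over a distributed dataset, while the registry and its metadata checks could stay much the same.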