Data & Governance
Retail Store Chain Data Analytics
Integrating Intempt with a BI Data Warehouse platform enables you to perform data analysis at the most granular level.
By centralizing data from all sources, you can track customers from the first touch to the latest payment or interaction. From that data, you can derive an accurate customer lifetime value and incorporate it into revenue projections.
With the Intempt-powered BI Data Warehouse platform, you can also:
- Deploy an end-to-end data analytics solution tailored to the business need.
- Provide estimates and ensure adherence to timelines for business requests.
- Identify problem areas and use supporting data to solve them.
- Automate operational processes.
Retail Store Data Integration Flow
1. POS zip file generation
At the end of each business day, all transactions generated at the POS machines are bundled into a zip file (e.g., XYZ_POS_EU_1129_20200907_20200907030423.zip). Each zip file contains four subfiles.
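As an illustration, the example file name can be split into its parts before further processing. This is a minimal sketch: the field meanings (a free-form prefix, a store number, a business date, and a creation timestamp) are assumptions inferred from the example name, and `parse_zip_name` is a hypothetical helper, not part of any Intempt API.

```python
import re
from datetime import datetime

# Assumed layout: <prefix>_<store>_<business date>_<creation timestamp>.zip
# The field meanings are inferred from the example name, not from a documented spec.
ZIP_NAME = re.compile(
    r"^(?P<prefix>.+)_(?P<store>\d+)_(?P<business_date>\d{8})_(?P<created_at>\d{14})\.zip$"
)


def parse_zip_name(name):
    """Return the parsed fields of a POS zip file name, or None if it does not match."""
    match = ZIP_NAME.match(name)
    if match is None:
        return None
    try:
        return {
            "prefix": match["prefix"],
            "store": match["store"],
            "business_date": datetime.strptime(match["business_date"], "%Y%m%d").date(),
            "created_at": datetime.strptime(match["created_at"], "%Y%m%d%H%M%S"),
        }
    except ValueError:
        # Digits were present but did not form a valid date or time.
        return None


info = parse_zip_name("XYZ_POS_EU_1129_20200907_20200907030423.zip")
```

Names that do not follow the assumed pattern return `None`, which lets a caller set them aside instead of failing.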
2. Zip files delivery to EC2
The zip files from the POS machines are sent to the landing zones of the EC2 instances.
3. File Watcher data filtering
Files are passed to a File Watcher application that filters and sends them to the Hadoop Distributed File System (HDFS) for ETL processing. Rejected files are kept in a separate folder inside the EC2 instance.
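The filtering step could look roughly like the sketch below. The acceptance rule used here (a readable archive with exactly four members) is an assumption based on the description of the zip files, not the actual File Watcher logic, and a local `accepted_dir` stands in for the upload to HDFS. `route_zip` and `EXPECTED_SUBFILES` are hypothetical names.

```python
import shutil
import zipfile
from pathlib import Path

EXPECTED_SUBFILES = 4  # assumption: each POS zip holds four subfiles


def route_zip(path, accepted_dir, rejected_dir):
    """Move a landed file to accepted_dir if it passes the checks, else to rejected_dir.

    Returns True when the file was accepted.
    """
    path = Path(path)
    ok = False
    if zipfile.is_zipfile(path):
        try:
            with zipfile.ZipFile(path) as archive:
                # testzip() returns the first corrupt member name, or None if all pass.
                intact = archive.testzip() is None
                ok = intact and len(archive.namelist()) == EXPECTED_SUBFILES
        except zipfile.BadZipFile:
            ok = False
    target_dir = Path(accepted_dir if ok else rejected_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    shutil.move(str(path), str(target_dir / path.name))
    return ok
```

Keeping rejected files in their own folder, rather than deleting them, makes it possible to inspect and resend them later.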
4. ETL processing
A Hadoop MapReduce job takes these *.zip files as input and processes them. The different data tables are combined into one master *.psv output file (Daily_Sales).
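To show what "combining tables into one pipe-separated file" means, here is a small sketch in plain Python rather than MapReduce. The two input tables (transaction headers and detail rows), the join key `transaction_id`, and the column names are all hypothetical; the real subfile layout is not described here.

```python
import csv
import io


def combine_to_psv(headers, details, fieldnames, key="transaction_id"):
    """Join each detail row to its header row on `key` and return pipe-separated text."""
    by_key = {row[key]: row for row in headers}
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=fieldnames,
        delimiter="|",
        lineterminator="\n",
        extrasaction="ignore",  # drop columns that are not part of the output layout
    )
    writer.writeheader()
    for detail in details:
        header = by_key.get(detail[key])
        if header is None:
            continue  # orphan detail row; a real job would log or reject it
        writer.writerow({**header, **detail})
    return out.getvalue()


psv = combine_to_psv(
    headers=[{"transaction_id": "T1", "store_number": "1129"}],
    details=[{"transaction_id": "T1", "sku": "A-1", "amount": "9.99"}],
    fieldnames=["transaction_id", "store_number", "sku", "amount"],
)
# psv == "transaction_id|store_number|sku|amount\nT1|1129|A-1|9.99\n"
```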
5. Output file generation
Generated output files are sent back to HDFS, and a copy of them is stored in Amazon S3.
6. Data enrichment, cleaning and validation
The Intempt Platform receives and processes the Daily_Sales.psv files. Data is enriched, validated, and cleaned.
- Data validation: dw_file_id, column count, and column values are aggregated and compared against the initial XML file columns (e.g., Transaction_Detail_1129_20200907_20200907030423.xml). The validation process ensures that no data is lost after combining different subfiles.
- Data enrichment: new columns are added based on lookup tables (e.g., store_number -> store name, zip_code -> state_name, ctry_code -> cntry_name).
- Data transformation: input property names are transformed to fit the schema of the Amazon Redshift cluster.
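The three operations above can be sketched as follows. This is an illustrative outline under stated assumptions, not the Intempt Platform's implementation: the function names, the `amount` column, the lookup contents, and the rename map are hypothetical, and the rows are plain dictionaries rather than parsed XML or PSV.

```python
from decimal import Decimal


def column_totals(rows, columns):
    """Sum each numeric column exactly, using Decimal to avoid float rounding."""
    return {col: sum((Decimal(r[col]) for r in rows), Decimal("0")) for col in columns}


def validate(source_rows, output_rows, numeric_columns, expected_column_count):
    """Return a list of problems found when comparing the output with its source."""
    errors = []
    if len(source_rows) != len(output_rows):
        errors.append("row count mismatch")
    if any(len(row) != expected_column_count for row in output_rows):
        errors.append("column count mismatch")
    if column_totals(source_rows, numeric_columns) != column_totals(output_rows, numeric_columns):
        errors.append("column totals mismatch")
    return errors


def enrich(rows, lookup, source_col, target_col, default=None):
    """Add target_col to every row by looking up the value of source_col."""
    return [{**row, target_col: lookup.get(row[source_col], default)} for row in rows]


def rename_columns(rows, rename_map):
    """Rename input property names so they match the warehouse column names."""
    return [{rename_map.get(k, k): v for k, v in row.items()} for row in rows]


source = [{"store_number": "1129", "amount": "9.99"}]
output = [{"store_number": "1129", "amount": "9.99"}]
problems = validate(source, output, ["amount"], expected_column_count=2)

store_names = {"1129": "Example Store"}  # hypothetical lookup table
enriched = enrich(output, store_names, "store_number", "store_name")
final_rows = rename_columns(enriched, {"store_number": "store_nbr"})  # hypothetical target name
```

An empty `problems` list means the combined output kept the same rows, columns, and totals as its source; any entry in the list signals that data was lost or altered along the way.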
7. Data delivery to the warehouse
Transformed data is sent from the Intempt Platform to the Amazon Redshift cluster.
8. Report generation
Looker generates ad hoc daily, weekly, and monthly sales reports from the data in the warehouse.