Hi Eugene,

Thanks for your response. I guess I did not make myself clear. I am able to create and run jobs; the issue is not there. I have the past 5 months of data in a Hive table (partitioned by day). By default, Griffin aggregates the entire data and runs the job on the aggregated dataset. I want to process aggregated weekly data starting from week 1, then week 2, week 3, and so on for 5 months, but I am unable to find a way to do this. Does Griffin have this capability at present? Or do I need to create custom rules using Spark-SQL mentioning the date range and submit them directly via the API/Livy?

Any help is much appreciated.
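For illustration, if the custom Spark-SQL route is taken, the weekly windows could be generated in a driver script and templated into one query per week. This is only a minimal sketch under stated assumptions: the table name `my_db.my_table`, the partition column `dt` with `yyyyMMdd` string values, and the profiled column `id` are all hypothetical, not from this thread.

```python
from datetime import date, timedelta

def weekly_ranges(start, end):
    """Yield (week_start, week_end) pairs covering [start, end), one per week."""
    cur = start
    while cur < end:
        nxt = min(cur + timedelta(days=7), end)
        yield cur, nxt
        cur = nxt

def weekly_query(table, week_start, week_end, part_col="dt"):
    """Build a Spark-SQL profiling query restricted to one week of day partitions.

    Assumes partitions are strings in yyyyMMdd form; the metrics selected here
    are placeholders for whatever profiling rules are actually needed.
    """
    return (
        f"SELECT COUNT(*) AS total, COUNT(DISTINCT id) AS distinct_ids "
        f"FROM {table} "
        f"WHERE {part_col} >= '{week_start:%Y%m%d}' "
        f"AND {part_col} < '{week_end:%Y%m%d}'"
    )

# Example: roughly 5 months of daily partitions, one query per week.
for ws, we in weekly_ranges(date(2018, 8, 1), date(2019, 1, 1)):
    sql = weekly_query("my_db.my_table", ws, we)
    # Each query could then be wrapped in a custom-rule measure and
    # submitted via the Griffin API, or run directly through Livy.
```

Filtering on the partition column keeps each run pruned to one week of partitions, so every execution processes only its incremental slice rather than the full 5 months.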
Regards,
Vikram

From: Eugene Liu <[email protected]>
Sent: Friday, January 18, 2019 7:10 AM
To: Vikram Jain <[email protected]>; [email protected]
Subject: Re: Analyzing past batch data periodically

Hi Vikram,

If you hope to create a profiling job, there are user/API guides which can help you:

https://github.com/apache/griffin/blob/master/griffin-doc/service/api-guide.md
https://github.com/apache/griffin/blob/master/griffin-doc/ui/user-guide.md

thx,
Eugene

________________________________
From: Vikram Jain <[email protected]>
Sent: Friday, January 18, 2019 3:41 AM
To: [email protected]
Subject: Analyzing past batch data periodically

Hi,

I have 5 months of data in my Hive table, which is partitioned day-wise. I want to run a profiling job on this data and analyze the results and trend on weekly data, i.e. I want to create a job that, on each execution, processes one week of data incrementally and stores the metrics week-wise. I could not find a way to do this in Griffin. Can someone please help me with a solution if it exists?

Regards,
Vikram
