Dear all,

In our data warehouse system we have about 600 ETL (Extract, Transform, Load) 
jobs that build interim data models. Some jobs depend on the completion of 
others.

Assume I group the jobs by dependency and give each group an ID. Say group G1 
contains 100 jobs, G2 contains another 200 jobs that depend on the completion 
of group G1, and so on.

Can we leverage Hadoop so that it executes G1 first, does not execute G2 if G1 
fails, but otherwise continues on to G2 and so on?
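
To make the intent concrete, this is roughly the behaviour I am hoping Hadoop 
can provide, sketched with the MapReduce JobControl / ControlledJob API that I 
found in the documentation. The G1/G2 Job objects here are placeholders, not 
our real ETL job configurations:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob;
    import org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl;

    public class GroupedEtlDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();

            // Placeholder jobs standing in for groups G1 and G2; in
            // practice each group would be a fully configured MapReduce
            // job (mapper, reducer, input/output paths, etc.).
            Job g1Job = Job.getInstance(conf, "G1-etl-group");
            Job g2Job = Job.getInstance(conf, "G2-etl-group");

            ControlledJob g1 = new ControlledJob(g1Job, null);
            ControlledJob g2 = new ControlledJob(g2Job, null);

            // G2 is submitted only after G1 succeeds; if G1 fails,
            // G2 is marked DEPENDENT_FAILED and never runs.
            g2.addDependingJob(g1);

            JobControl control = new JobControl("etl-groups");
            control.addJob(g1);
            control.addJob(g2);

            // JobControl is a Runnable: drive it in a thread and
            // poll until all jobs have finished or been skipped.
            Thread runner = new Thread(control);
            runner.start();
            while (!control.allFinished()) {
                Thread.sleep(5000);
            }
            control.stop();

            System.out.println("Failed jobs: " + control.getFailedJobList());
        }
    }

From what I can tell, JobControl skips jobs whose dependencies have failed, 
which looks like exactly the G1 -> G2 behaviour described above, but I would 
appreciate confirmation from anyone who has used it at this scale.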

Or do I need to configure N (where N = the total number of groups) Hadoop jobs 
independently and handle the sequencing ourselves?

Please share your thoughts. Thanks!

Warmest regards,
Ravion
