All,


Does anyone know if it is possible to do asynchronous forking in Oozie?
Currently we are running a set of ETL extractions that are pairs of actions
(a Sqoop action followed by a Hive transformation), but we would like the
Sqoop actions to run serially and each Hive action to be launched
asynchronously as soon as its paired Sqoop job finishes. The Sqoop actions
are serial because we want to limit the number of concurrent mappers hitting
the data source. We could do this through the fair scheduler, but that would
require a pool per data source. Attached is a picture of the suggested ETL flow.
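
To make the ordering we are after concrete, here is a rough Python sketch of
the intended behaviour. It is not Oozie code, only a model of the schedule:
the function names (run_sqoop, run_hive, run_etl) and the source names are
placeholders I made up for illustration, and the two step functions just
record what ran instead of launching real jobs.

```python
from concurrent.futures import ThreadPoolExecutor


def run_sqoop(source, log):
    # Placeholder for a Sqoop extraction (hypothetical; records the call only).
    log.append(("sqoop", source))


def run_hive(source, log):
    # Placeholder for the paired Hive transformation (hypothetical).
    log.append(("hive", source))


def run_etl(sources):
    """Run Sqoop steps one at a time; start each Hive step in the background
    as soon as its paired Sqoop step has finished."""
    log = []
    with ThreadPoolExecutor() as pool:
        pending = []
        for source in sources:
            # Serial: only one extraction hits the data source at a time.
            run_sqoop(source, log)
            # Asynchronous: the Hive step runs while the next Sqoop starts.
            pending.append(pool.submit(run_hive, source, log))
        # Wait for every Hive step before declaring the run finished.
        for future in pending:
            future.result()
    return log


if __name__ == "__main__":
    for step in run_etl(["orders", "customers", "products"]):
        print(step)
```

The Sqoop entries always appear in the order the sources were given, and each
Hive entry appears somewhere after its own Sqoop entry. The relative order of
the Hive entries is not guaranteed, which is the behaviour we want.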



If anyone has any suggestions on best practices around this I would love to
hear them.



Thanks,

Matt
