Matt,

It's always better to have a join for each corresponding fork. It would help if you clarified in the question more about your workflow design and why you need asynchronous spikes.
Thanks,
Virag

On 7/17/12 2:30 PM, "Matt Goeke" <[email protected]> wrote:

> Virag,
>
> Thanks for the response. I have read the workflow spec, and while I realize
> there is the ability to fork within a workflow, my issue is that all forks
> must be paired with joins. What I was looking for was some way to fork but
> not require all of the forked nodes to rejoin the primary workflow (hence
> some of the nodes becoming asynchronous spikes). I feel like this
> capability might already exist and this might just be an issue of
> workflow/subworkflow composition.
>
> --
> Matt Goeke
>
> On Tue, Jul 17, 2012 at 2:00 PM, Virag Kothari <[email protected]> wrote:
>
>> Hi Matt,
>> I think you can fork the Hive actions using the fork/join control nodes in
>> Oozie.
>>
>> http://incubator.apache.org/oozie/docs/3.2.0-incubating/docs/WorkflowFunctionalSpec.html#a3.1.5_Fork_and_Join_Control_Nodes.
>>
>> I have no idea why the attachment doesn't work.
>>
>> Thanks,
>> Virag
>>
>>
>> On 7/17/12 12:13 PM, "Matt Goeke" <[email protected]> wrote:
>>
>>> Apparently when I put an Imgur link in the reply, the spam score gets high
>>> enough that the delivery is denied... is there any way to link an image?
>>> Also, if not, is there anything I can clarify in the question that
>>> would make it more straightforward?
>>>
>>> --
>>> Matt Goeke
>>>
>>> On Tue, Jul 17, 2012 at 11:22 AM, Mona Chitnis <[email protected]> wrote:
>>>
>>>> The attachment hasn't come through. This also happened with an earlier
>>>> email with the Oozie Meetup slides attachments. Any solutions?
>>>>
>>>> --
>>>> Mona Chitnis
>>>>
>>>> From: Matt Goeke <[email protected]<mailto:[email protected]>>
>>>> Reply-To: "[email protected]<mailto:[email protected]>" <[email protected]<mailto:[email protected]>>
>>>> To: "[email protected]<mailto:[email protected]>" <[email protected]<mailto:[email protected]>>
>>>> Subject: Oozie: asynchronous forking
>>>>
>>>> All,
>>>>
>>>> Does anyone know if it is possible to do asynchronous forking in Oozie?
>>>> Currently we are running a set of ETL extractions that are pairs of
>>>> actions (a Sqoop action, then a Hive transformation), but we would like
>>>> to have the Sqoop actions be serial and the Hive actions be called
>>>> asynchronously when the paired Sqoop job finishes. The reason the Sqoop
>>>> actions are serial is that we would like to limit the number of
>>>> concurrent mappers hitting the data source; we could do this through the
>>>> fair scheduler, but that would require a pool per data source. Attached
>>>> is a picture of the suggested ETL flow.
>>>>
>>>> If anyone has any suggestions on best practices around this I would love
>>>> to hear them.
>>>>
>>>> Thanks,
>>>> Matt
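
Since the picture of the ETL flow never made it to the list, here is a rough sketch of the ordering described in the original question: Sqoop extractions run one at a time, and each Hive transformation is handed off to run in the background as soon as its paired extraction finishes. This is plain Python, not Oozie, and it is only meant to illustrate the dependency shape. The names `run_sqoop`, `run_hive`, `run_etl`, and `max_hive_workers` are hypothetical placeholders, not anything from Oozie, Sqoop, or Hive.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def run_sqoop(source):
    # Hypothetical stand-in for a Sqoop extraction against one data source.
    return f"sqoop:{source}"


def run_hive(extracted):
    # Hypothetical stand-in for the Hive transformation paired with an extraction.
    return f"hive:{extracted}"


def run_etl(sources, max_hive_workers=4):
    """Run extractions serially; start each transformation without waiting on it."""
    hive_futures = []
    with ThreadPoolExecutor(max_workers=max_hive_workers) as pool:
        for source in sources:
            # Serial step: only one extraction touches the data source at a time.
            extracted = run_sqoop(source)
            # Asynchronous step: the next extraction starts while this runs.
            hive_futures.append(pool.submit(run_hive, extracted))
        # A single wait at the very end, after all extractions are done.
        wait(hive_futures)
    # Results come back in submission order, one per source.
    return [future.result() for future in hive_futures]


if __name__ == "__main__":
    print(run_etl(["orders", "customers", "items"]))
```

Note that even in this sketch the background transformations are all waited on once at the end, so every "fork" still has a matching "join"; what changes is that the join is deferred rather than placed after each pair.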
