Vincent,

Sorry for the late reply, but here it is.

Based on what you have described, it appears you have a mix of two problems: Work Flow Orchestration and Data Flow. On the surface it's not always easy to tell the two apart, but I'll try.

Work Flow Orchestration allows one to orchestrate a single process by breaking it down into a set of individual components (primarily for simplicity and modularization) and then composing those components into one cohesive process. Data Flow manages individual processes, their lifecycle, execution, input and output from a central command/control facility.

With the above in mind, I'd say you have a mixed problem: a Data Flow consisting of simple and complex processors. It's the complex processors, the ones that need to invoke an MR job and then act on its result, that fall into the category of Work Flow Orchestration, where individual components within such a process must work with awareness of the overall process they represent. For example:

1. GetFile (NiFi)
2. PutHDFS (NiFi)
3. Process (NiFi Custom Processor), where you execute the MR job and react to its completion (success or failure) and possibly put something on the output queue in NiFi so the next element of the Data Flow can kick in
...

So the 3 elements of the Data Flow above are individual NiFi Processors, yet the 3rd one internally represents a complex, orchestrated process.

Now, orchestration is just a term, and without relying on outside frameworks that specifically address orchestration it would be just a lot of custom code. Thankfully, NiFi provides support for the Spring Application Context container, which allows you to implement your NiFi processor using work flow orchestration frameworks such as Spring Integration and/or Apache Camel.

I'd be more than willing to help you further with that if you're interested, but wanted to see how you feel about the above architecture. I am also working on a blog post that describes exactly that and how Data Flow and Work Flow can complement one another.
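
To make the third step a bit more concrete, here is a minimal sketch of the "run the job, then route on its outcome" idea. It is plain Java, not the NiFi Processor API, and every name in it (OrchestratedStep, Route, runAndRoute) is hypothetical. The job is passed in as a Callable, so the code that actually submits the MR job and waits for it is assumed to live elsewhere.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for the third, "complex" step of the flow.
// Plain Java only; this does not use or imitate the NiFi Processor API.
class OrchestratedStep {

    // Where the result goes next, mirroring success/failure routing.
    enum Route { SUCCESS, FAILURE }

    // Runs the job (assumed to submit an MR job and wait for it to finish)
    // and maps its outcome to a route for the next element of the flow.
    static Route runAndRoute(Callable<Boolean> job) {
        try {
            return job.call() ? Route.SUCCESS : Route.FAILURE;
        } catch (Exception e) {
            // A job that throws is treated the same as a job that reports failure.
            return Route.FAILURE;
        }
    }

    public static void main(String[] args) {
        System.out.println(runAndRoute(() -> true));
        System.out.println(runAndRoute(() -> false));
        System.out.println(runAndRoute(() -> {
            throw new IllegalStateException("job crashed");
        }));
    }
}
```

In a real custom processor the two routes would presumably correspond to the processor's outgoing relationships, and the orchestration inside the job could be delegated to a framework such as Spring Integration or Apache Camel.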

Let me know

Cheers
Oleg

On Mar 29, 2016, at 9:26 AM, Oleg Zhurakousky <ozhurakou...@hortonworks.com> wrote:

Vincent

I do have a suggestion for you but need a bit more time to craft my response. Give me till tonight EST.

Cheers
Oleg

On Mar 29, 2016, at 8:55 AM, Vincent Russell <vincent.russ...@gmail.com> wrote:

Thanks Oleg and Joe,

I am not currently convinced that NiFi is the solution either, but it is a nice way for us to manage actions based on the result of a MapReduce job.

Our use case is to have follow-on processors that perform actions based on the results of the MapReduce jobs. One processor kicks off the M/R process and then the results are sent down the flow.

The problem with our current scenario is that we have two separate flows that utilize the same location as the output for the M/R jobs.

One simple way might be to use mongo itself as a locking mechanism.

On Mon, Mar 28, 2016 at 7:07 PM, Oleg Zhurakousky <ozhurakou...@hortonworks.com> wrote:

Vincent

This sounds more like an architectural question, and even outside of NiFi, achieving that, especially in a distributed environment, would need some kind of coordination component. And while we can think of a variety of ways to accomplish that, I am not entirely convinced that this is the right direction. Would you mind sharing a bit more about your use case so that perhaps we can jointly come up with a better and hopefully simpler solution?

Cheers
Oleg

On Mar 28, 2016, at 6:45 PM, Vincent Russell <vincent.russ...@gmail.com> wrote:

I have two processors (that aren't part of the same flow) that write to the same resource (a mongo collection) via a map reduce job. I don't want both to run at the same time.

On Mar 28, 2016 6:28 PM, "Joe Witt" <joe.w...@gmail.com> wrote:

Vincent,

Not really and that would largely be by design.

Can you describe the use case more so we can suggest alternatives or perhaps understand the motivation better?

Thanks
Joe

On Mon, Mar 28, 2016 at 4:00 PM, Vincent Russell <vincent.russ...@gmail.com> wrote:
>
> Is it possible to have one processor block while another specified processor
> is running (within the onTrigger method)?
>
> I can do this on a non-clustered NiFi with a synchronized block I guess, but
> I wanted to know if there was a more idiomatic way of doing this.
>
> Thanks,
> Vincent
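
For reference, the single-JVM approach mentioned in the original question could look roughly like the sketch below. It swaps the synchronized block for a ReentrantLock with tryLock, so a second caller is turned away instead of blocking. The class and method names (SharedResourceGuard, runExclusively) are hypothetical and nothing here is NiFi API. As the question already notes, this only helps on a non-clustered instance, because the lock lives inside one JVM; across nodes some external coordination component would be needed.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical JVM-local guard shared by two unrelated callers (for example,
// two processors in different flows that write to the same resource).
class SharedResourceGuard {

    // One lock per JVM; it offers no protection across cluster nodes.
    private static final ReentrantLock LOCK = new ReentrantLock();

    // Runs the task only if no other thread currently holds the lock.
    // Returns false without blocking when the resource is busy, so the
    // caller can back off and try again later.
    static boolean runExclusively(Runnable task) {
        if (!LOCK.tryLock()) {
            return false;
        }
        try {
            task.run();
            return true;
        } finally {
            LOCK.unlock();
        }
    }

    public static void main(String[] args) {
        boolean[] secondRan = new boolean[1];
        boolean firstRan = runExclusively(() -> {
            // While the first job holds the lock, a caller on another thread is rejected.
            Thread other = new Thread(() -> secondRan[0] = runExclusively(() -> { }));
            other.start();
            try {
                other.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("first ran: " + firstRan + ", second ran: " + secondRan[0]);
    }
}
```

Note that the lock is reentrant: the same thread can acquire it again, so the exclusion only applies between different threads.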