I want to run a flow like this:

1. Notice a file in a directory
2. Call a script, passing it the path to the file
3. The script calls a MapReduce job
4. Take the output of the MapReduce job (files) and move those files to a new HDFS folder
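For context, the script in step 3 is roughly this shape (the jar, job class, and paths are placeholders, not my real code):

#!/usr/bin/env python3
# Hypothetical wrapper for step 3: takes an input file path, runs a
# MapReduce job on it, and leaves the job's output files in an output
# directory for step 4 to pick up.
import os
import subprocess
import sys

def main():
    input_path = sys.argv[1]  # path handed to the script by the flow
    output_dir = "/data/mr-output/" + os.path.basename(input_path)

    # Kick off the MapReduce job; "hadoop jar" blocks until the job finishes.
    subprocess.check_call([
        "hadoop", "jar", "/opt/jobs/my-job.jar", "com.example.MyJob",
        input_path, output_dir,
    ])

if __name__ == "__main__":
    main()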
I see there is an ExecuteStreamCommand processor, which passes a FlowFile to the command's stdin and then turns its stdout into the outgoing FlowFile. In my case, though, the script reads and writes files based on a path rather than using stdin/stdout. I wanted to be able to create these steps and connect them in the UI, but I'm thinking that what I need to do instead is have:

1. A file watcher on the original directory that then calls the script
2. A file watcher on the script's output directory

Does that make sense?

-Dave
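P.S. In case it helps, here is the general shape of what I mean by "file watcher" (purely illustrative -- the directory, polling interval, and wrapper script name are made up):

# Minimal polling watcher sketch: notices new files in a directory and
# hands each one's path to the wrapper script, which runs the MapReduce job.
import subprocess
import time
from pathlib import Path

watch_dir = Path("/data/incoming")  # hypothetical landing directory
seen = set()

while True:
    for path in watch_dir.iterdir():
        if path not in seen:
            seen.add(path)
            # Call the wrapper with the new file's path (step 2 of the flow).
            subprocess.check_call(["/opt/scripts/run_job.py", str(path)])
    time.sleep(5)  # poll every few seconds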
