Hello,

I am looking into using NiFi as a data pipeline orchestration tool. I'm evaluating it alongside other tools such as Airflow and AWS Step Functions/Lambda.
Has anyone used NiFi as an orchestration/scheduling tool for tasks such as submitting Spark jobs to an EMR cluster? These are some of the requirements we are considering while evaluating such a tool:

1. SSH capabilities to execute remote commands
2. Rich scheduling (cron)
3. Ability to write custom routines and import custom libraries
4. Event-based triggering of a pipeline

Any insight would be helpful. We have used NiFi for about a year now for data movement and are familiar with its capabilities. My biggest worry is the ability to coordinate with other machines using SSH.

Thanks,
Jon