We are considering using Kylo on top of NiFi. It is my understanding that while Kylo manages both NiFi and Spark, its designers chose to use Sqoop from Spark to ingest data from relational databases. I am also aware that it is possible to drive Sqoop from NiFi using one of the processors that can run scripts. Why would the Kylo designers rely on Sqoop rather than on NiFi? It is possible to set up a stand-alone NiFi instance or a NiFi cluster for parallel database access. Sqoop, for its part, parallelizes extraction from databases by relying on MapReduce. We are a Hortonworks on Azure shop, so we already have the infrastructure for both approaches. Does anyone have feedback on why one approach would be preferable to the other?
