Hi Clay,

A common use case for NiFi and Kafka together is when you want the capabilities of a message broker like Kafka, with very low latency and multiple publishers/consumers, but you also need some of the features NiFi provides, such as backpressure, as you mentioned. This combination is frequently found in industrial control systems or hardware/IoT integration (sometimes interfacing with MQTT).

In the scenario you call out in 1), yes, NiFi can be a complete solution for record transformation and writing to HDFS. I am not a Storm expert, but you correctly identify NiFi as a good “deliverer” of data to stream processing applications. I also won’t address microbatching, but I’m confident some other community members will have good input on the topic.

I’ve included a couple of resources you may find helpful. I would also suggest sending this email to the [email protected] mailing list, as this list tends to focus more on the internals of NiFi, extensibility, and feature development. The users list has many contributors who deploy NiFi in real-world scenarios and integrate with other systems, and who may not monitor this list.

Good luck.

https://bryanbende.com/development/2016/09/15/apache-nifi-and-apache-kafka
https://hortonworks.com/webinar/apache-kafka-apache-nifi-better-together/
https://hortonworks.com/tutorial/realtime-event-processing-in-hadoop-with-nifi-kafka-and-storm/

Andy LoPresto
[email protected]
[email protected]
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69

> On May 2, 2018, at 7:45 PM, Clay Teahouse <[email protected]> wrote:
>
> Hello All,
>
> 1) Why would one need both NiFi and Kafka in an environment, considering
> that NiFi can handle back pressure, set up and deal with queues? I have an
> environment where I would be collecting data via NiFi and I would need to
> write the data to HDFS after some post processing. Can't I just process the
> records in NiFi, change the format and write the data to HDFS via a NiFi
> HDFS processor, PutHDFS?
>
> 2) Similarly, if I need to do some stream processing, can't I just pull the
> data from a NiFi processor via NiFiSpout, do the processing via some bolts,
> and write the data to HDFS either via an HDFS bolt or a NiFi HDFS processor?
>
> Do I even need Storm in the picture?
>
> 3) Is NiFi suited for microbatch processing? Would it be better to pull the
> data from NiFi via Spark Streaming and do microbatching there? Which
> approach is most performant and reliable?
>
> thanks
>
> Clay
