Hi All,

I currently have a small Hadoop cluster running HDFS and Hive. My
ultimate goal is to use NiFi's ingestion and flow capabilities to
store real-time, JSON-formatted event data from external sources.

What I am unclear about is the best strategy/design for storing
FlowFile data (i.e. JSON events in my case) in HDFS so that it can then
be accessed and analysed in Hive tables.

Is much of the storage design handled within the NiFi flow, or do I
need to set something up outside of NiFi to ensure I can query each
JSON-formatted event as a record in, for example, a Hive log table?

Any examples or suggestions would be much appreciated.

Thanks,
M
