Hi all, I currently have a small Hadoop cluster running HDFS and Hive. My ultimate goal is to use NiFi's ingestion and flow capabilities to store real-time, JSON-formatted event data from external sources.
What I am unclear about is the best strategy/design for storing FlowFile data (i.e. JSON events in my case) in HDFS so that it can then be accessed and analysed in Hive tables. Is most of the storage design handled in the NiFi flow, or do I need to set something up outside NiFi to ensure I can query each JSON-formatted event as a record in, for example, a Hive log table? Any examples or suggestions would be much appreciated.

Thanks,
M