Hello.

In a single flow, we would like to:

1) Read data from Kafka

2) Merge records

3) PutParquet on HDFS

4) Load the data into Hive using a HiveQL LOAD DATA statement.

The first three steps are no problem; they are working fine.

However, we are unsure of the best way to launch the HiveQL command. We planned
to use the PutHiveQL processor, but it needs the command in the flowfile content.
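
To make this concrete, here is a rough sketch of the statement we would need to end up in the flowfile content. The path and table name are made up for illustration, and the helper function is hypothetical (it is not part of NiFi); it only shows the shape of the text. It does no quoting or escaping of its inputs.

```python
def build_load_statement(hdfs_path: str, table: str) -> str:
    """Build a HiveQL LOAD DATA statement for a file already written to HDFS.

    Assumption: the path and table name are trusted values, so no escaping
    is applied here.
    """
    return f"LOAD DATA INPATH '{hdfs_path}' INTO TABLE {table}"


# Hypothetical path and table name, for illustration only.
statement = build_load_statement("/data/landing/events/part-0001.parquet", "events")
print(statement)
```

In the flow, the path would come from wherever PutParquet wrote the file rather than from a hard-coded string.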

Using GenerateFlowFile would be nice, but we cannot generate an event to
trigger it, which also prevents us from using Wait/Notify.

What is the best way? I suppose ReplaceText or something similar would be too
"heavy", as it requires loading the message into memory?

Thanks for any pointers or ideas.


Aurélien DEHAY
Big Data Architect
+33 616 815 441
aurelien.de...@faurecia.com

23/27 avenue des Champs Pierreux
92735 Nanterre Cedex - France