Hi William,

From another NiFi newbie: I also think that both the MergeContent and
MergeRecord processors are really hard to use. Sometimes you just need to merge
multiple flowfiles together, based on line count or size, and it is really hard
to accomplish this.
The docs have some tips that may affect you (I'm using 1.11.4 at the moment):
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.4/org.apache.nifi.processors.standard.MergeContent/index.html

Merge Strategy: Bin-Packing Algorithm - generates a FlowFile populated by 
arbitrarily chosen FlowFiles.
Attribute Strategy: Keep Only Common Attributes - only the attributes that 
exist on all FlowFiles in the bundle, with the same value, will be preserved.
Correlation Attribute Name: No Value - If not specified, FlowFiles are bundled 
by the order in which they are pulled from the queue.
But...
Metadata Strategy: Do Not Merge Uncommon Metadata - For any input format that
supports metadata (Avro, e.g.), any FlowFile whose metadata values do not match
those of the first FlowFile in the bin will not be merged.

So, there are a lot of things that you need to check:
1) Attributes of the flowfiles
2) Values of the attributes
3) Metadata of the "first" flowfile that reaches the processor
4) Metadata of the other flowfiles
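
To make the checklist above concrete, here is a rough sketch in plain Python of
the two rules as the docs describe them. It is only a model of the documented
behavior, not NiFi's actual implementation; the function names and the
attribute names in the example are made up for illustration.

```python
from typing import Dict, List, Tuple

# A flowfile's attributes (or metadata) modeled as a simple string map.
Attrs = Dict[str, str]


def common_attributes(flowfiles: List[Attrs]) -> Attrs:
    """Model of "Keep Only Common Attributes": keep an attribute only if it
    exists on every flowfile in the bundle with the same value."""
    if not flowfiles:
        return {}
    first, rest = flowfiles[0], flowfiles[1:]
    return {
        key: value
        for key, value in first.items()
        if all(key in other and other[key] == value for other in rest)
    }


def split_by_first_metadata(metadata: List[Attrs]) -> Tuple[List[int], List[int]]:
    """Model of "Do Not Merge Uncommon Metadata": compare every entry with the
    first one in the bin and return (merged indices, left-out indices)."""
    if not metadata:
        return [], []
    reference = metadata[0]
    merged: List[int] = []
    left_out: List[int] = []
    for index, entry in enumerate(metadata):
        (merged if entry == reference else left_out).append(index)
    return merged, left_out


# Hypothetical bundle of two flowfiles (attribute names are illustrative).
bundle = [
    {"kafka.topic": "events", "kafka.partition": "0", "filename": "a"},
    {"kafka.topic": "events", "kafka.partition": "1", "filename": "b"},
]
print(common_attributes(bundle))  # {'kafka.topic': 'events'}
```

Running something like this against the attributes you see in the queue
(List Queue shows them per flowfile) is a quick way to spot which attributes
or metadata values differ between flowfiles.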

I hope this helps you clarify your flow a bit more...
Henrique

________________________________
From: william.jn.zhang <william.jn.zh...@gmail.com>
Sent: Thursday, November 5, 2020 23:16
To: users@nifi.apache.org <users@nifi.apache.org>
Subject: Question about [MergeContent] processor


Hi all,

I have a job consisting of the following steps: first consuming data from
Kafka, then packing the data every 5 minutes into one file, and finally putting
the packed file into HDFS.

I use the [MergeContent] processor to accomplish the “packing” step. The
MergeContent properties I configured are listed below:



----------------------

Merge Strategy: Bin-Packing Algorithm

Merge Format: Binary Concatenation

Attribute Strategy: Keep Only Common Attributes

Correlation Attribute Name: No value set

Metadata Strategy: Do Not Merge Uncommon Metadata

Minimum Number of Entries: 1

Maximum Number of Entries: 999999999

Minimum Group Size: 255 MB

Maximum Group Size: No value set

Max Bin Age: 5 minutes

Maximum number of Bins: 1

----------------------



I found the behavior of the MergeContent processor very hard to control.
There are several workflows running on NiFi with the same MergeContent
configuration; some of them correctly pack the data into one file every 5
minutes, but others don’t. It has even happened that some MergeContent
processors generate one flowfile per record.



I am wondering if I am misunderstanding the mechanism of the MergeContent
processor.



I am a NiFi newbie, so please help me.



Thanks!
