Hi William,

From another NiFi newbie: I also think that both the MergeContent and MergeRecord processors are really hard to use. Sometimes you just need to merge multiple flowfiles together, based on line count or size, and that is really hard to accomplish.

The docs have some tips that may affect you (I'm using 1.11.4 here at the moment):
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.4/org.apache.nifi.processors.standard.MergeContent/index.html

Merge Strategy: Bin-Packing Algorithm - generates a FlowFile populated by arbitrarily chosen FlowFiles.

Attribute Strategy: Keep Only Common Attributes - only the attributes that exist on all FlowFiles in the bundle, with the same value, will be preserved.

Correlation Attribute Name: No Value - If not specified, FlowFiles are bundled by the order in which they are pulled from the queue.

But...

Metadata Strategy: Do Not Merge Uncommon Metadata - For any input format that supports metadata (Avro, e.g.), any FlowFile whose metadata values do not match those of the first FlowFile in the bin will not be merged.

So, there are a lot of things that you need to check:
1) Attributes of the flowfiles
2) Values of the attributes
3) Metadata of the "first" flowfile that reaches the processor
4) Metadata of the other flowfiles

I hope this helps clarify your flow a bit more...

Henrique

________________________________
From: william.jn.zhang <william.jn.zh...@gmail.com>
Sent: Thursday, November 5, 2020 23:16
To: users@nifi.apache.org <users@nifi.apache.org>
Subject: Question about [MergeContent] processor

Hi all,

I have a job consisting of the following steps: first consuming data from Kafka, then packing the data every 5 minutes into one file, and finally putting the packed file into HDFS. I use the [MergeContent] processor to accomplish the “packing” step. The MergeContent properties I configured are listed below:

----------------------
Merge Strategy: Bin-Packing Algorithm
Merge Format: Binary Concatenation
Attribute Strategy: Keep Only Common Attributes
Correlation Attribute Name: No value set
Metadata Strategy: Do Not Merge Uncommon Metadata
Minimum Number of Entries: 1
Maximum Number of Entries: 999999999
Minimum Group Size: 255 MB
Maximum Group Size: No value set
Max Bin Age: 5 minutes
Maximum number of Bins: 1
----------------------

I found the behavior of the MergeContent processor very hard to control.
There are several workflows running on NiFi with the same MergeContent configuration. Some of them correctly pack the data into one file every 5 minutes, but others don’t. Sometimes a MergeContent processor even generates one flowfile per record. I am wondering whether I misunderstand the mechanism of the MergeContent processor. I am new to NiFi, so please help me. Thanks!
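
For anyone following the checklist in the reply above (attributes, their values, and metadata compared against the "first" flowfile), here is a minimal Python sketch of the two rules as the quoted docs describe them. It is only an illustrative model, not NiFi's actual implementation: the function names, the attribute names, and the `schema` metadata key are all made up for the example.

```python
from typing import Dict, List, Tuple


def keep_only_common_attributes(attr_sets: List[Dict[str, str]]) -> Dict[str, str]:
    """Model of "Keep Only Common Attributes" (hypothetical helper, not NiFi code).

    An attribute survives only if every flowfile in the bundle has it
    and all of them carry the same value.
    """
    if not attr_sets:
        return {}
    first, rest = attr_sets[0], attr_sets[1:]
    return {
        key: value
        for key, value in first.items()
        if all(key in other and other[key] == value for other in rest)
    }


def partition_by_first_metadata(
    metadata: List[Dict[str, str]],
) -> Tuple[List[int], List[int]]:
    """Model of "Do Not Merge Uncommon Metadata" (hypothetical helper, not NiFi code).

    Returns (merged, skipped) index lists: a flowfile is merged only if its
    metadata equals the metadata of the first flowfile in the bin.
    """
    if not metadata:
        return [], []
    reference = metadata[0]
    merged = [i for i, m in enumerate(metadata) if m == reference]
    skipped = [i for i, m in enumerate(metadata) if m != reference]
    return merged, skipped


if __name__ == "__main__":
    # Made-up attributes for three flowfiles in one bin.
    attrs = [
        {"topic": "events", "offset": "100", "mime.type": "text/plain"},
        {"topic": "events", "offset": "101", "mime.type": "text/plain"},
        {"topic": "events", "offset": "102"},
    ]
    print(keep_only_common_attributes(attrs))  # {'topic': 'events'}

    # Made-up metadata (e.g. for an Avro-like format) for the same three flowfiles.
    meta = [{"schema": "v1"}, {"schema": "v1"}, {"schema": "v2"}]
    print(partition_by_first_metadata(meta))  # ([0, 1], [2])
```

In this toy run, `offset` is dropped because its value differs between flowfiles, `mime.type` is dropped because one flowfile lacks it, and the third flowfile is left out of the merge because its metadata does not match the first one in the bin.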