[
https://issues.apache.org/jira/browse/NIFI-10553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pierre Villard resolved NIFI-10553.
-----------------------------------
Resolution: Feedback Received
Apache NiFi 1.x is no longer maintained and no new release is planned on the
1.x release line. Marking as resolved as part of a cleanup operation. Please
open a new one with an updated description if this is still relevant for NiFi
2.x.
> MergeContent Prematurely Evicts Bins
> ------------------------------------
>
> Key: NIFI-10553
> URL: https://issues.apache.org/jira/browse/NIFI-10553
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.14.0, 1.16.3
> Reporter: Eric Secules
> Priority: Major
>
> When NiFi's merge processors are configured to defragment, the user wants
> flowfiles merged in a specific way according to the `fragment.` attributes.
> However, when MergeDocuments is handling many unique values for
> `fragment.identifier` it opens up one bin per value until it reaches the
> `MAX_BIN_COUNT` parameter configured on this processor. This parameter is
> there to limit the memory used by merging too many things at once. The user
> may not be able to set it to an appropriate value for every flow, and
> evicting a partially filled bin can cause downstream issues and leave
> flowfiles stuck in the input connection of MergeDocuments.
>
> Instead of this behaviour, the merge processor should penalize and requeue
> flowfiles that don't fit in any of the existing bins if we have reached the
> max number of bins already. Penalizing non-matching flowfiles will give time
> for the ones needed to complete the existing bins to arrive.
>
> I wrote a unit test on my fork of NiFi which covers this bug:
> https://github.com/esecules/nifi/blob/2e5074eabfc0be100491fa007329ce9492382af7/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/test/java/org/apache/nifi/processors/standard/TestMergeContent.java#L1091
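
The proposed behaviour can be sketched roughly as follows. This is a minimal, self-contained illustration in plain Java and not NiFi's actual bin manager: the class `DefragmentBinSketch`, the `Outcome` enum, and the `offer` method are hypothetical names invented for this example, and flowfiles are modelled as plain strings. It assumes that a bin is complete once it holds as many fragments as the fragment count reported for it, and that a flowfile whose `fragment.identifier` has no bin is penalized instead of forcing an eviction when the bin limit is reached.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of "penalize instead of evict"; not NiFi code. */
class DefragmentBinSketch {

    /** What happened to the flowfile that was offered. */
    enum Outcome { BINNED, COMPLETED, PENALIZED }

    private final int maxBinCount;
    // One open bin per fragment identifier, holding the fragments seen so far.
    private final Map<String, List<String>> bins = new LinkedHashMap<>();

    DefragmentBinSketch(int maxBinCount) {
        this.maxBinCount = maxBinCount;
    }

    /**
     * Offers one fragment. If no bin exists for its identifier and the bin
     * limit is already reached, the fragment is reported as PENALIZED and no
     * existing bin is touched.
     */
    Outcome offer(String fragmentId, int fragmentCount, String flowFile) {
        List<String> bin = bins.get(fragmentId);
        if (bin == null) {
            if (bins.size() >= maxBinCount) {
                // Assumption: the caller would penalize and requeue here.
                return Outcome.PENALIZED;
            }
            bin = new ArrayList<>();
            bins.put(fragmentId, bin);
        }
        bin.add(flowFile);
        if (bin.size() >= fragmentCount) {
            // All fragments arrived: the bin can be merged and released.
            bins.remove(fragmentId);
            return Outcome.COMPLETED;
        }
        return Outcome.BINNED;
    }

    int openBins() {
        return bins.size();
    }
}
```

In a real processor the `PENALIZED` branch would presumably correspond to penalizing the flowfile and returning it to the input queue, so that the fragments needed by the open bins get a chance to arrive first. How that interacts with bin age limits is left open here.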
--
This message was sent by Atlassian Jira
(v8.20.10#820010)