[ https://issues.apache.org/jira/browse/NIFI-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
endzeit updated NIFI-12386:
---------------------------
    Status: In Progress  (was: Patch Available)

> Add a FilterAttributes processor
> --------------------------------
>
>                 Key: NIFI-12386
>                 URL: https://issues.apache.org/jira/browse/NIFI-12386
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Core Framework
>            Reporter: endzeit
>            Assignee: endzeit
>            Priority: Major
>          Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> Flows in Apache NiFi can get quite sophisticated, consisting of long chains
> of both {{ProcessGroup}} and {{Processor}} components.
> Oftentimes {{Processor}} components, including those in the NiFi standard
> bundle, enrich an incoming {{FlowFile}} with additional FlowFile attributes.
> As a result, a large number of different FlowFile attributes can accumulate
> over the FlowFile's lifecycle.
> To prevent subsequent {{ProcessGroup}} / {{Processor}} components from
> accidentally relying on implementation details of preceding components, a
> good practice is to:
> # define which FlowFile attributes should exist at selected points in the
> {{Flow}}
> # reduce the attributes of the FlowFile at each selected point to those
> defined
> This can be achieved by using the
> [UpdateAttribute|https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-update-attribute-nar/1.23.2/org.apache.nifi.processors.attributes.UpdateAttribute/index.html]
> processor of the standard processor bundle.
> However, the {{UpdateAttribute}} processor only accepts a regular expression
> that defines the set of attributes to remove, whereas the practice outlined
> above calls for explicitly stating the set of attributes to keep. This can
> be expressed as a regular expression as well, but writing the inverse
> pattern (for example with a negative lookahead) is not the easiest
> endeavor, to put it mildly.
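To illustrate why the inverse pattern is awkward, here is a minimal sketch that builds a "match everything except these names" expression with a negative lookahead and checks it with plain {{java.util.regex}}. The class name, the helper {{removeAllExcept}}, and the attribute names are hypothetical and chosen for illustration only; the sketch does not use any NiFi API and makes no claim about how {{UpdateAttribute}} evaluates its expression internally.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class KeepListRegex {

    // Builds a pattern that matches every attribute name EXCEPT the ones to keep.
    // Each name is quoted so that characters such as '.' are taken literally.
    static Pattern removeAllExcept(List<String> attributesToKeep) {
        String alternatives = attributesToKeep.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // Negative lookahead: fail if the whole name equals one of the alternatives.
        return Pattern.compile("^(?!(?:" + alternatives + ")$).*$");
    }

    public static void main(String[] args) {
        Pattern toRemove = removeAllExcept(List.of("filename", "mime.type"));
        List<String> names =
                List.of("filename", "mime.type", "filename.old", "http.headers.accept");
        for (String name : names) {
            boolean remove = toRemove.matcher(name).matches();
            System.out.println(name + " -> " + (remove ? "remove" : "keep"));
        }
    }
}
```

Even for two attribute names, the expression needs quoting, grouping and an anchored lookahead, which is easy to get subtly wrong when maintained by hand in a processor property.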
> This issue proposes a new processor {{FilterAttributes}} to be added to the
> library of {{nifi-standard-processors}}, which can be configured with a
> set of attributes and removes all attributes of an incoming FlowFile other
> than the ones configured.
> The processor should
> * have a required, non-blank property "Attributes to keep", which takes a
> list of attribute names separated by a delimiter, e.g. a comma (,).
> ** leading and trailing whitespace around attribute names should be ignored
> ** leading or trailing delimiters should be ignored
> * have a required, non-blank property "Delimiter", which is used to delimit
> the individual attribute names, with a default value of "," (comma).
> * have a single relationship "success" to which all FlowFiles are routed,
> similar to {{UpdateAttribute}}
> * have an
> [InputRequirement|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/annotation/behavior/InputRequirement.html]
> of
> [INPUT_REQUIRED|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/annotation/behavior/InputRequirement.Requirement.html]
> * support batching
> ([@SupportsBatching|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/annotation/behavior/SupportsBatching.html])
> * be
> [@SideEffectFree|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/annotation/behavior/SideEffectFree.html]
> Possible extensions might be:
> * have a required property "Core attributes", with allowable values of "Keep
> UUID only", "Keep all", with a default of "Keep UUID only"
> ** an additional allowable value, e.g. "Specify behaviour", may be added,
> which allows for more customization
> * have a required property "Mode", with allowable values of "Retain" and
> "Remove", with a default of "Retain"

--
This message was sent by Atlassian Jira (v8.20.10#820010)
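The parsing rules proposed for the "Attributes to keep" and "Delimiter" properties can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not the processor's implementation: the class and method names are hypothetical, ordinary {{Map}} objects stand in for FlowFile attributes, and no NiFi API is used.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class FilterAttributesSketch {

    // Splits the property value on the (literal) delimiter, trims whitespace
    // around each name and drops empty entries, so that leading, trailing or
    // doubled delimiters are ignored.
    static Set<String> parseAttributesToKeep(String propertyValue, String delimiter) {
        return Arrays.stream(propertyValue.split(Pattern.quote(delimiter)))
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                .collect(Collectors.toSet());
    }

    // Returns a copy of the attributes containing only the names to keep.
    static Map<String, String> retain(Map<String, String> attributes, Set<String> keep) {
        Map<String, String> result = new LinkedHashMap<>(attributes);
        result.keySet().retainAll(keep);
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("filename", "a.txt");
        attributes.put("path", "/tmp");
        attributes.put("mime.type", "text/plain");
        attributes.put("http.headers.accept", "*/*");

        Set<String> keep = parseAttributesToKeep(",filename, mime.type ,,", ",");
        System.out.println(retain(attributes, keep));
    }
}
```

The sketch deliberately leaves out the handling of core attributes such as the UUID and the "Mode" switch, since both are only listed as possible extensions.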