[ https://issues.apache.org/jira/browse/NIFI-13288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Stieglitz updated NIFI-13288:
------------------------------------
    Description: 
Per [~markap14] in the following [post|https://lists.apache.org/thread/7zo2px31r3377c7vhby4h6nrngdf3llf], one should avoid calling session.putAttribute many times: to maintain object immutability it has to create a new FlowFile object (and a new HashMap of all attributes!) for every call to putAttribute, which can create a huge amount of garbage. Given this advice, the split processors SplitJson, SplitXml, and SplitAvro need attention: each has a loop that creates a new flow file for each split and calls putAttribute more than once per flow file (to populate the split attributes FRAGMENT_ID, FRAGMENT_INDEX, etc.). These should be fixed to populate the attributes in a Map and then make one call to session.putAttributes.

  was:
Per [~markap14] in the following [post|https://lists.apache.org/thread/7zo2px31r3377c7vhby4h6nrngdf3llf], one should avoid calling session.putAttribute many times: to maintain object immutability it has to create a new FlowFile object (and a new HashMap of all attributes!) for every call to putAttribute, which can create a huge amount of garbage. Given this advice, the split processors SplitJson, SplitXml, and SplitAvro need attention: each has a loop that creates a new flow file for each split and calls putAttribute more than once per flow file (to populate the split attributes FRAGMENT_ID, FRAGMENT_INDEX, etc.). These should be fixed to populate the attributes in a Map and then make one call to putAttributes.
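The proposed change can be illustrated with a small, self-contained sketch. It does not use the NiFi API: MiniFlowFile and MiniSession are hypothetical stand-ins that only mimic the copy-on-write behaviour described above, and the attribute keys are illustrative. As far as I know, the bulk method on NiFi's ProcessSession is named putAllAttributes, so the stand-in borrows that name; treat that as an assumption to check against the Javadoc.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; this is NOT the NiFi API.
final class AttributeBatchingSketch {

    /** Immutable attribute holder, loosely modeled on a FlowFile. */
    static final class MiniFlowFile {
        private final Map<String, String> attributes;

        MiniFlowFile(Map<String, String> attributes) {
            this.attributes = Collections.unmodifiableMap(attributes);
        }

        Map<String, String> getAttributes() {
            return attributes;
        }
    }

    /** Session that counts how many full attribute-map copies it makes. */
    static final class MiniSession {
        int copies = 0;

        MiniFlowFile create() {
            return new MiniFlowFile(new HashMap<>());
        }

        // Every single-attribute update copies the whole map into a new object.
        MiniFlowFile putAttribute(MiniFlowFile flowFile, String key, String value) {
            Map<String, String> copy = new HashMap<>(flowFile.getAttributes());
            copies++;
            copy.put(key, value);
            return new MiniFlowFile(copy);
        }

        // A bulk update copies the map once, however many attributes are added.
        MiniFlowFile putAllAttributes(MiniFlowFile flowFile, Map<String, String> extra) {
            Map<String, String> copy = new HashMap<>(flowFile.getAttributes());
            copies++;
            copy.putAll(extra);
            return new MiniFlowFile(copy);
        }
    }

    /** Current pattern: one call per split attribute. */
    static MiniFlowFile splitWithRepeatedPuts(MiniSession session, String id, int index, int count) {
        MiniFlowFile split = session.create();
        split = session.putAttribute(split, "fragment.identifier", id);
        split = session.putAttribute(split, "fragment.index", Integer.toString(index));
        split = session.putAttribute(split, "fragment.count", Integer.toString(count));
        return split;
    }

    /** Proposed pattern: collect the attributes in a Map, then make one call. */
    static MiniFlowFile splitWithBulkPut(MiniSession session, String id, int index, int count) {
        Map<String, String> attributes = new HashMap<>();
        attributes.put("fragment.identifier", id);
        attributes.put("fragment.index", Integer.toString(index));
        attributes.put("fragment.count", Integer.toString(count));
        return session.putAllAttributes(session.create(), attributes);
    }

    public static void main(String[] args) {
        int splits = 1000;
        MiniSession repeated = new MiniSession();
        MiniSession bulk = new MiniSession();
        for (int i = 0; i < splits; i++) {
            splitWithRepeatedPuts(repeated, "id-1", i, splits);
            splitWithBulkPut(bulk, "id-1", i, splits);
        }
        System.out.println(repeated.copies); // prints 3000
        System.out.println(bulk.copies);     // prints 1000
    }
}
```

In this sketch the number of map copies per split drops from the number of attributes set to one, which is the garbage reduction the issue is after. The saving would grow with the number of attributes already on the flow file, since each copy duplicates all of them.
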
> Fix SplitJson, SplitXml, and SplitAvro processor not to call session.putAttribute multiple times
> ------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-13288
>                 URL: https://issues.apache.org/jira/browse/NIFI-13288
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Daniel Stieglitz
>            Assignee: Daniel Stieglitz
>            Priority: Major
>
> Per [~markap14] in the following
> [post|https://lists.apache.org/thread/7zo2px31r3377c7vhby4h6nrngdf3llf], one
> should avoid calling session.putAttribute many times: to maintain object
> immutability it has to create a new FlowFile object (and a new HashMap of all
> attributes!) for every call to putAttribute, which can create a huge amount
> of garbage. Given this advice, the split processors SplitJson, SplitXml, and
> SplitAvro need attention: each has a loop that creates a new flow file for
> each split and calls putAttribute more than once per flow file (to populate
> the split attributes FRAGMENT_ID, FRAGMENT_INDEX, etc.). These should be
> fixed to populate the attributes in a Map and then make one call to
> session.putAttributes.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)