[ https://issues.apache.org/jira/browse/NIFI-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joseph Witt updated NIFI-1008:
------------------------------
    Fix Version/s: 0.5.0

> NiFi should swap out FlowFiles to disk even before the session is committed
> ---------------------------------------------------------------------------
>
>                 Key: NIFI-1008
>                 URL: https://issues.apache.org/jira/browse/NIFI-1008
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>            Reporter: Mark Payne
>             Fix For: 0.5.0
>
>
> Currently, NiFi will swap out FlowFiles if there are a large number in a
> FlowFile Queue. This is done to avoid running out of JVM heap space. However,
> if we have a simple flow like GetFile -> SplitText and GetFile pulls in a
> large file, SplitText can quickly cause an OutOfMemoryError. This is not
> because it buffers the content of the FlowFile in memory but rather because
> it holds millions of FlowFile objects in memory. We can do better.
> When we call session.transfer for the FlowFiles, once we hit a magical
> threshold (say 10,000), we should swap those FlowFiles to disk and the
> session should transfer them to the queue as "swapped out" FlowFiles, rather
> than having to buffer all of them in memory and then swap them out once
> they land in the queue.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
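
The behaviour proposed in the description above (swap transferred FlowFiles to disk once a threshold is reached, rather than holding all of them on the heap until commit) might be sketched roughly as follows. This is only an illustration under stated assumptions, not NiFi code: the class SwappingTransferBuffer, its methods, the one-ID-per-line swap file layout, and the use of plain strings in place of FlowFile records are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, NOT part of NiFi: buffers transferred records and
// spills each full batch to a temporary swap file before any commit.
class SwappingTransferBuffer {
    private final int swapThreshold;                          // e.g. 10,000, as suggested in the issue
    private final List<String> inMemory = new ArrayList<>();  // stand-in for FlowFile records
    private final List<Path> swapFiles = new ArrayList<>();   // locations of swapped-out batches
    private long swappedCount = 0;

    SwappingTransferBuffer(int swapThreshold) {
        if (swapThreshold < 1) {
            throw new IllegalArgumentException("swapThreshold must be >= 1");
        }
        this.swapThreshold = swapThreshold;
    }

    // Rough analogue of calling session.transfer for a single FlowFile.
    void transfer(String flowFileId) {
        inMemory.add(flowFileId);
        if (inMemory.size() >= swapThreshold) {
            swapOut();
        }
    }

    // Writes the buffered batch to disk (one ID per line) and frees the heap copy.
    private void swapOut() {
        try {
            Path file = Files.createTempFile("swap-", ".txt");
            file.toFile().deleteOnExit();
            Files.write(file, inMemory, StandardCharsets.UTF_8);
            swapFiles.add(file);
            swappedCount += inMemory.size();
            inMemory.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int inMemoryCount() { return inMemory.size(); }

    long swappedCount() { return swappedCount; }

    // Defensive copy so callers cannot alter the internal list.
    List<Path> swapFiles() { return new ArrayList<>(swapFiles); }

    public static void main(String[] args) {
        SwappingTransferBuffer buffer = new SwappingTransferBuffer(10_000);
        for (int i = 0; i < 25_000; i++) {
            buffer.transfer("flowfile-" + i);
        }
        // prints "20000 swapped, 5000 in memory"
        System.out.println(buffer.swappedCount() + " swapped, "
                + buffer.inMemoryCount() + " in memory");
    }
}
```

With this shape, the number of records held on the heap never exceeds the threshold, however many FlowFiles a single session produces. In a real implementation the queue would presumably be handed the swap file locations at commit time instead of the objects themselves.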