[ 
https://issues.apache.org/jira/browse/NIFI-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Witt updated NIFI-1008:
------------------------------
    Fix Version/s: 0.5.0

> NiFi should swap out FlowFiles to disk even before the session is committed
> ---------------------------------------------------------------------------
>
>                 Key: NIFI-1008
>                 URL: https://issues.apache.org/jira/browse/NIFI-1008
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>            Reporter: Mark Payne
>             Fix For: 0.5.0
>
>
> Currently, NiFi will swap out FlowFiles if there are a large number in a 
> FlowFile Queue. This is done to avoid running out of JVM heap space. However, 
> if we have a simple flow like GetFile -> SplitText and GetFile pulls in a 
> large file, SplitText can quickly cause OutOfMemoryError. This is not because 
> it buffers the content of the FlowFile in memory but rather because it holds 
> the millions of FlowFile objects in memory. We can do better.
> When we call session.transfer for the FlowFiles, once we hit a magical 
> threshold (say 10,000), we should swap those FlowFiles to disk and the 
> session should transfer them to the queue as "swapped out" FlowFiles, rather 
> than having to buffer all of them in memory and then swap them out once 
> they land in the queue.
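
A rough sketch of the idea described above, assuming a hypothetical
SwappingTransferBuffer class that stands in for the session's transfer path.
None of these names are NiFi's actual API. FlowFiles are represented by plain
String IDs and the "swap file" is just a temp file with one ID per line. It
only illustrates flushing to disk each time the in-memory count reaches the
threshold, rather than holding everything until commit.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not NiFi's ProcessSession or swap manager.
class SwappingTransferBuffer {
    private final int swapThreshold;
    // FlowFile IDs transferred but not yet swapped out (held on the heap).
    private final List<String> inMemory = new ArrayList<>();
    // One temp file per swapped-out batch, in the order the batches were written.
    private final List<Path> swapFiles = new ArrayList<>();

    SwappingTransferBuffer(int swapThreshold) {
        if (swapThreshold <= 0) {
            throw new IllegalArgumentException("swapThreshold must be positive");
        }
        this.swapThreshold = swapThreshold;
    }

    // Called once per transferred FlowFile; swaps out when the threshold is hit.
    void transfer(String flowFileId) {
        inMemory.add(flowFileId);
        if (inMemory.size() >= swapThreshold) {
            swapOut();
        }
    }

    // Writes the buffered IDs to a temp file and releases them from the heap.
    private void swapOut() {
        try {
            Path swapFile = Files.createTempFile("flowfile-swap-", ".txt");
            swapFile.toFile().deleteOnExit();
            Files.write(swapFile, inMemory);
            swapFiles.add(swapFile);
            inMemory.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int inMemoryCount() {
        return inMemory.size();
    }

    int swapFileCount() {
        return swapFiles.size();
    }

    // Reads one swapped-out batch back from disk, e.g. when the queue needs it.
    List<String> readSwapFile(int index) {
        try {
            return Files.readAllLines(swapFiles.get(index));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a threshold of 10,000, a SplitText-style processor emitting millions of
FlowFiles would never hold more than 9,999 entries in this buffer at once. A
real implementation would also have to hand the swap files to the destination
queue on commit and discard them on rollback, which this sketch leaves out.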



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
