[ 
https://issues.apache.org/jira/browse/NIFI-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172612#comment-15172612
 ] 

Mark Payne commented on NIFI-1577:
----------------------------------

The easiest way that I've found to test this is to run a Processor such as 
ListenSyslog, which supports batching and calls session.append(). If the Run 
Duration is set to 25 ms and a fairly large amount of data is pushed to it, the 
logs quickly fill with "Too many open files" errors. Once this patch is 
applied, those errors no longer occur.

Unfortunately, the patch does not lend itself well to unit tests, as they would 
require inspecting a lot of internal private state of the 
StandardProcessSession, which would result in very brittle tests. However, the 
reasoning is straightforward: the open streams are kept in a mapping from 
ContentClaim (which belongs to exactly 1 RepositoryRecord in the 'records' map) 
to OutputStream. Since checkpoint() clears the 'records' map, there is no 
longer any way to reach those OutputStreams after a checkpoint, so they were 
being held open without any benefit.
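
To make the reasoning above concrete, here is a minimal toy model of the 
leak. It is only a sketch and is not the StandardProcessSession code: the class 
CheckpointSketch, its fields, and the methods checkpointLeaky() and 
checkpointFixed() are all hypothetical names, and plain Strings stand in for 
RepositoryRecord and ContentClaim.

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;

/** Toy model of the leak. Every name here is hypothetical, not a NiFi class. */
public class CheckpointSketch {

    /** In-memory stream that remembers whether close() was called. */
    public static class TrackedStream extends ByteArrayOutputStream {
        public boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    // record id -> claim id (stands in for the session's 'records' map)
    public final Map<String, String> records = new HashMap<>();

    // claim id -> open stream (stands in for ContentClaim -> OutputStream)
    public final Map<String, TrackedStream> appendableStreams = new HashMap<>();

    /** Appends to the record's claim and keeps the stream open for later appends. */
    public TrackedStream append(String recordId, byte[] data) {
        String claim = records.computeIfAbsent(recordId, id -> "claim-" + id);
        TrackedStream out = appendableStreams.computeIfAbsent(claim, c -> new TrackedStream());
        out.write(data, 0, data.length);
        return out;
    }

    /** Clears the records but leaves the streams open: nothing can reach them now. */
    public void checkpointLeaky() {
        records.clear();
    }

    /** Clears the records and also closes and forgets the streams held for them. */
    public void checkpointFixed() {
        records.clear();
        for (TrackedStream out : appendableStreams.values()) {
            out.close();
        }
        appendableStreams.clear();
    }
}
```

In this model, after checkpointLeaky() the stream is still open even though no 
record refers to its claim any more, whereas checkpointFixed() closes it. With 
real file-backed streams, each stream left open this way would hold a file 
handle.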

> NiFi holds open too many files when using a Run Duration > 0 ms and calling 
> session.append
> ------------------------------------------------------------------------------------------
>
>                 Key: NIFI-1577
>                 URL: https://issues.apache.org/jira/browse/NIFI-1577
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>            Reporter: Mark Payne
>         Attachments: 
> 0001-NIFI-1577-Close-any-streams-that-are-left-open-for-a.patch
>
>
> If a Processor calls ProcessSession.append() and has a Run Duration scheduled 
> > 0 ms, we quickly end up with "Too many open files" exceptions.
> This appears to be due to the fact that calling append() holds the content 
> repository's stream open so that the session can keep appending to it, but on 
> checkpoint() the session does not close these streams. It should close these 
> streams on checkpoint, since the Processor is no longer allowed to reference 
> these FlowFiles anyway at that point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
