[ 
https://issues.apache.org/jira/browse/NIFI-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963355#comment-15963355
 ] 

Michael Moser commented on NIFI-3686:
-------------------------------------

Note: I didn't encounter this on a production system; I simulated it by 
truncating a swap file while NiFi was not running.

I have a simple code patch to StandardFlowFileQueue that removes the swap 
contents from the swapQueue if the swap summary is valid.  This improves the 
user experience: the EOFException ERROR is still logged to nifi-app.log, but 
the queue size then goes to 0 and the processor reading from this queue is no 
longer triggered.  On the next NiFi restart, if the corrupt swap file is still 
there, the EOFException ERROR happens again.  I'm not sure this is the desired 
approach, though.
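
To make the idea concrete, here is a minimal, self-contained sketch of the 
behavior described above.  It is not the actual patch and does not use NiFi's 
real classes: SwapReader, SketchQueue, and every method name are hypothetical 
stand-ins, and FlowFiles are represented as plain strings.  It only shows the 
accounting: when a swap-in fails, the swap location stops counting toward the 
queue size, the error is recorded once, and the file itself is left alone.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for whatever reads a swap file back in.
interface SwapReader {
    List<String> swapIn(String swapLocation) throws IOException;
}

// Hypothetical, simplified queue.  Its size includes FlowFiles that are
// still sitting in swap files, as reported by each file's summary.
class SketchQueue {
    // swap location -> record count taken from the swap summary
    private final Map<String, Integer> swapSummaries = new LinkedHashMap<>();
    private final Deque<String> active = new ArrayDeque<>();
    private final SwapReader reader;
    final List<String> errors = new ArrayList<>();

    SketchQueue(SwapReader reader) {
        this.reader = reader;
    }

    void addSwapFile(String location, int recordCount) {
        swapSummaries.put(location, recordCount);
    }

    int size() {
        int swapped = 0;
        for (int count : swapSummaries.values()) {
            swapped += count;
        }
        return active.size() + swapped;
    }

    String poll() {
        while (active.isEmpty() && !swapSummaries.isEmpty()) {
            Iterator<String> it = swapSummaries.keySet().iterator();
            String location = it.next();
            // Stop counting this location whether or not the read succeeds,
            // so a corrupt file cannot keep the queue looking non-empty.
            it.remove();
            try {
                active.addAll(reader.swapIn(location));
            } catch (IOException e) {
                // Logged once; the file on disk is not deleted here.
                errors.add("Failed to swap in " + location + ": " + e);
            }
        }
        return active.poll();
    }
}
```

With a reader that always throws EOFException for the corrupt file, the 
first poll() returns null, size() drops from the summary count to 0, and 
later polls do not touch the file again until the queue is rebuilt.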

[~markap14] if you can ponder this, please let me know if I should submit this 
as a PR or if it should be resolved in another way.  Thanks!

> EOFException on swap in causes tight loop in polling for flowfiles
> ------------------------------------------------------------------
>
>                 Key: NIFI-3686
>                 URL: https://issues.apache.org/jira/browse/NIFI-3686
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.1.1
>            Reporter: Michael Moser
>
> If the flowfile_repository partition fills to 100% while swapping files out 
> to a new swap file, then this swap file becomes corrupt (partially written).  
> When NiFi tries to swap this file in, an EOFException occurs and we get the 
> following ERROR, which is nice.
> 2017-04-10 18:02:58,855 ERROR [Timer-Driven Process Thread-3] 
> o.a.n.controller.StandardFlowFileQueue Failed to swap in FlowFiles from Swap 
> File 
> /local/mwmoser/nifi-1.2.0-SNAPSHOT/./flowfile_repository/swap/1491574631605-2840b630-57fc-4f49-615b-0b37d77bec66-5dbc0ad0-921c-483e-a05d-5c65d014fa48.swap;
>  Swap File appears to be corrupt!
> However, once all other dataflow stops, the queue now shows 10000 flowfiles 
> in it.  The processor reading from this queue constantly has its onTrigger() 
> called, and session.get() polls the queue and gets 0 files returned.  This 
> happens in a tight loop, with no other errors.
> To a user it appears that the processor is doing lots of work but just not 
> processing those 10000 files.  The error message above only appears once in 
> the nifi-app.log, so you don't see anything wrong if you tail the log. 
>  When you restart NiFi, the error message above appears again, but the user 
> experience of 10000 files not processing remains.
> The new SchemaSwapDeserializer does not (and perhaps cannot) implement the 
> IncompleteSwapFileException that the old SimpleSwapDeserializer does.  So, 
> reading a swap file is currently all-or-nothing.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
