[ 
https://issues.apache.org/jira/browse/NIFI-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15589332#comment-15589332
 ] 

Mark Payne commented on NIFI-2920:
----------------------------------

I have been able to reproduce this locally. To do so, I created a simple flow: 
GenerateFlowFile (1 KB file size) with success going to 2 different 
UpdateAttribute Processors (so that the same Content Claim is held by 2 
different FlowFiles). I let about 150,000 FlowFiles queue up (with backpressure 
turned off). I then started one of the UpdateAttribute processors, which 
drained its queue. I could then look at my content repo for any files not archived:

{code}
content_repository $ find . -type f | grep -v archive | wc -l
     192
{code}

After a few minutes, the FlowFile repo is checkpointed, which will result in 
things getting cleaned up if they can be. The above command shows the same 
result (expected, since the FlowFiles in the other queue are still held). I 
then empty that queue. After the FlowFile repo checkpoints again, I should see 
nothing in the content repo outside of archive, but I see:

{code}
content_repository $ find . -type f | grep -v archive | wc -l
     167
{code}

I see the same thing happening if I turn on expiration to remove the FlowFiles 
instead of clicking Empty Queue.
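
For anyone repeating the check above, a small helper along these lines might 
make it easier to re-run the count after each checkpoint. This is only a 
sketch: the name count_unarchived is hypothetical, and it assumes archived 
content lives under directories named "archive". It is slightly stricter than 
the grep above, which drops any path containing the string "archive".

```shell
# Hypothetical helper: count regular files under a content repository
# that are not inside a directory named "archive".
count_unarchived() {
  repo_dir="${1:-.}"   # default to the current directory
  # Skip anything beneath an archive/ directory, count what remains,
  # and strip the padding some wc implementations add.
  find "$repo_dir" -type f ! -path '*/archive/*' | wc -l | tr -d ' '
}
```

Run from inside the content repository as "count_unarchived ." once before 
and once after emptying the queue; if cleanup worked, the second count should 
drop to 0.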


> Swapped FlowFiles are not removed from content repo when a queue is emptied.
> ----------------------------------------------------------------------------
>
>                 Key: NIFI-2920
>                 URL: https://issues.apache.org/jira/browse/NIFI-2920
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.0.0, 0.6.1
>         Environment: Linux
>            Reporter: Matthew Clarke
>            Priority: Critical
>
> If a queue contains enough FlowFiles to trigger swapping and a user 
> selects "empty queue" or sets file expiration, only the content claims 
> associated with FlowFiles that were not swapped get removed from the content 
> repository.  All other content claims are left in the content repository and 
> are not moved to archive and/or purged.
> A restart of NiFi will produce app log messages about these unknown files 
> left in the content repo and will at that time move them to archive or purge 
> them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
