[ https://issues.apache.org/jira/browse/NIFI-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15754490#comment-15754490 ]
ASF GitHub Bot commented on NIFI-3205:
--------------------------------------

GitHub user markap14 opened a pull request:

https://github.com/apache/nifi/pull/1336

NIFI-3205: Ensure that FlowFile Repository is updated with any Transi…

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
- [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [ ] Has your PR been rebased against the latest commit within the target branch (typically master)?
- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
…ent Content Claims when session rollback occurs

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/markap14/nifi NIFI-3205

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi/pull/1336.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1336

----
commit a9d32443eef9d4b6afd48600e3fedfb0bfe23b0c
Author: Mark Payne <marka...@hotmail.com>
Date:   2016-12-16T14:01:01Z

    NIFI-3205: Ensure that FlowFile Repository is updated with any Transient Content Claims when session rollback occurs

----

> Uncaught Failures Can Leave New Flow Files on Disk
> --------------------------------------------------
>
>                 Key: NIFI-3205
>                 URL: https://issues.apache.org/jira/browse/NIFI-3205
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Alan Jackoway
>            Assignee: Mark Payne
>            Priority: Critical
>             Fix For: 1.2.0
>
>
> We have been hitting a situation where our content repository quickly fills
> the entire disk despite having archiving off and close to nothing queued.
> We believe this problem happens more often when a processor that creates many
> flow files fails in the middle.
> I then created this test script and deployed it on a new nifi with a 100KB
> GenerateFlowFile in front of it. The script makes 5 copies of the incoming
> flow file, then does session.remove on those copies, then throws a
> RuntimeException. However, the content repository grows 500KB every time it
> runs.
> Then when you restart nifi, it cleans up the content repository with
> messages like this:
> {noformat}
> 2016-12-15 11:17:29,774 INFO [main] o.a.n.c.repository.FileSystemRepository Found unknown file /Users/alanj/nifi-1.1.0/content_repository/1/1481818525279-1 (1126400 bytes) in File System Repository; archiving file
> 2016-12-15 11:17:29,778 INFO [main] o.a.n.c.repository.FileSystemRepository Found unknown file /Users/alanj/nifi-1.1.0/content_repository/2/1481818585493-2 (409600 bytes) in File System Repository; archiving file
> {noformat}
> The test processor is the following:
> {code:java}
> // Copyright 2016 (c) Cloudera
> package com.cloudera.edh.nifi.processors.bundles;
>
> import com.google.common.collect.Lists;
>
> import java.io.IOException;
> import java.io.InputStream;
> import java.io.OutputStream;
> import java.util.List;
>
> import org.apache.nifi.annotation.behavior.InputRequirement;
> import org.apache.nifi.annotation.behavior.InputRequirement.Requirement;
> import org.apache.nifi.flowfile.FlowFile;
> import org.apache.nifi.processor.AbstractProcessor;
> import org.apache.nifi.processor.ProcessContext;
> import org.apache.nifi.processor.ProcessSession;
> import org.apache.nifi.processor.exception.ProcessException;
> import org.apache.nifi.processor.io.InputStreamCallback;
> import org.apache.nifi.processor.io.OutputStreamCallback;
> import org.apache.nifi.stream.io.StreamUtils;
>
> /**
>  * Makes 5 copies of an incoming file, then fails and rolls back.
>  */
> @InputRequirement(value = Requirement.INPUT_REQUIRED)
> public class CopyAndFail extends AbstractProcessor {
>   @Override
>   public void onTrigger(ProcessContext context, ProcessSession session)
>       throws ProcessException {
>     FlowFile inputFile = session.get();
>     if (inputFile == null) {
>       context.yield();
>       return;
>     }
>     final List<FlowFile> newFiles = Lists.newArrayList();
>
>     // Copy the file 5 times (simulates us opening a zip file and unpacking its contents)
>     for (int i = 0; i < 5; i++) {
>       session.read(inputFile, new InputStreamCallback() {
>         @Override
>         public void process(InputStream inputStream) throws IOException {
>           FlowFile ff = session.create(inputFile);
>           ff = session.write(ff, new OutputStreamCallback() {
>             @Override
>             public void process(final OutputStream out) throws IOException {
>               StreamUtils.copy(inputStream, out);
>             }
>           });
>           newFiles.add(ff);
>         }
>       });
>     }
>
>     getLogger().warn("Removing the new files");
>     System.err.println("Removing the new files");
>     session.remove(newFiles);
>
>     // Simulate an error handling some file in the zip after unpacking the rest
>     throw new RuntimeException();
>   }
> }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
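
The reported growth of 500KB per run matches 5 copies of a 100KB flow file. One way to check it is to total the bytes on disk under the content repository before and after each trigger of the test processor. The following is only a minimal sketch of such a check, not part of NiFi or of the pull request: the class name `RepoSize` and the method `totalBytes` are hypothetical, and the default path `content_repository` is an assumption about where the repository lives relative to the working directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

/** Hypothetical helper (not part of NiFi): sums the size of all regular files under a directory. */
public class RepoSize {

    /** Walks the tree rooted at {@code root} and returns the total size in bytes of its regular files. */
    static long totalBytes(Path root) throws IOException {
        // Files.walk returns a lazily populated stream that must be closed.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .mapToLong(p -> {
                    try {
                        return Files.size(p);
                    } catch (IOException e) {
                        // Surface I/O failures instead of silently undercounting.
                        throw new UncheckedIOException(e);
                    }
                })
                .sum();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the repository path is given as the first argument;
        // "content_repository" is only a placeholder default.
        Path root = Paths.get(args.length > 0 ? args[0] : "content_repository");
        System.out.println(totalBytes(root) + " bytes under " + root);
    }
}
```

Running it once before and once after the processor fires, and subtracting the two totals, gives the per-run growth. Files may be written or removed while the walk is in progress, so the number is best read on an otherwise idle instance.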