[ https://issues.apache.org/jira/browse/NIFI-11232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694679#comment-17694679 ]
Christian Wahl edited comment on NIFI-11232 at 2/28/23 6:30 PM:
----------------------------------------------------------------

Sure, here is a test case (added to TestContentClaimInputStream) that causes this error on the release commit of 1.20. I also attached the whole file containing that test case.

{code:java}
@Test
public void test_NIFI_11232() throws IOException {
    final InputStream original = new BufferedInputStream(new ByteArrayInputStream(new byte[10_000]));
    // use an offset greater than the first two reads
    final ContentClaimInputStream in = new ContentClaimInputStream(repo, contentClaim, 3L, original, new NopPerformanceTracker());

    in.mark(1);
    in.read();
    in.reset();

    in.mark(1);
    in.read();
    // in this reset the "original" stream is closed, but the buffer is still there and is still being used
    in.reset();

    // This fails because it tries to read from the buffer, and the buffer in turn reads from the closed original stream.
    // To force the buffer to read from the original stream, a read larger than the buffer size (8192) is required.
    in.readNBytes(9000);
}
{code}

In the ContentClaimInputStream I identified two problematic parts causing this behaviour.

Line 176: bytesConsumed is only ever incremented and does not reflect the bytes read since the last mark.
{code:java}
if (bufferedIn != null && bytesConsumed <= markReadLimit)
{code}

Lines 217-221: currentOffset is first reset to the claim offset, so it cannot be used to calculate the bytes read since the last mark. In addition, currentOffset can easily be larger than markReadLimit (since markReadLimit is not the mark offset).
{code:java}
currentOffset = claimOffset;
if (markReadLimit > 0) {
    final int limitLeft = (int) (markReadLimit - currentOffset);
    if (limitLeft > 0) {
{code}

> FlowFileAccessException using ContentClaimInputStream
> -----------------------------------------------------
>
>                 Key: NIFI-11232
>                 URL: https://issues.apache.org/jira/browse/NIFI-11232
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.20.0
>            Reporter: Christian Wahl
>            Priority: Major
>         Attachments: TestContentClaimInputStream.java
>
>
> NIFI-10888 introduced a BufferedInputStream inside of the
> ContentClaimInputStream to speed up rewinding in small flow files (<1MB).
> Under some circumstances it can happen in reset that the delegate stream is
> closed and a new delegate stream is created, but the bufferedIn is not
> recreated with the new delegate.
> During the next read this leads to a situation where it tries to read from
> bufferedIn and bufferedIn in turn tries to read from the old and closed
> delegate stream causing an IOException or FlowFileAccessException.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
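
The failure mode described in the issue (a bufferedIn that still wraps the old, closed delegate) can be reproduced outside NiFi. The following is only a minimal sketch using plain java.io classes. StaleBufferSketch, ClosableSource and both method names are hypothetical and are not part of NiFi, and rebuilding the wrapper is shown as one plausible remedy, not necessarily the fix that was adopted.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class StaleBufferSketch {

    /** Hypothetical delegate that, unlike ByteArrayInputStream, rejects reads after close(). */
    static final class ClosableSource extends InputStream {
        private final ByteArrayInputStream data;
        private boolean closed;

        ClosableSource(final byte[] bytes) {
            this.data = new ByteArrayInputStream(bytes);
        }

        @Override
        public int read() throws IOException {
            if (closed) {
                throw new IOException("Stream closed");
            }
            return data.read();
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Returns true if reading through a wrapper whose delegate was closed and replaced fails. */
    public static boolean staleWrapperFails() {
        final byte[] content = new byte[10_000];
        try {
            InputStream delegate = new ClosableSource(content);
            final InputStream bufferedIn = new BufferedInputStream(delegate); // default buffer: 8192 bytes
            bufferedIn.read(); // fills the buffer from the first delegate

            // Simulated "reset": the delegate is closed and replaced, the wrapper is NOT rebuilt.
            delegate.close();
            delegate = new ClosableSource(content);

            // Reading past the buffered bytes forces a refill from the old, closed delegate.
            for (int i = 0; i < 9000; i++) {
                bufferedIn.read();
            }
            return false;
        } catch (final IOException e) {
            return true;
        }
    }

    /** Returns true if rebuilding the wrapper around the new delegate lets all reads succeed. */
    public static boolean rebuiltWrapperWorks() {
        final byte[] content = new byte[10_000];
        try {
            InputStream delegate = new ClosableSource(content);
            InputStream bufferedIn = new BufferedInputStream(delegate);
            bufferedIn.read();

            // Simulated "reset": replace the delegate AND the wrapper together.
            delegate.close();
            delegate = new ClosableSource(content);
            bufferedIn = new BufferedInputStream(delegate);

            int bytesRead = 0;
            for (int i = 0; i < 9000; i++) {
                if (bufferedIn.read() >= 0) {
                    bytesRead++;
                }
            }
            return bytesRead == 9000;
        } catch (final IOException e) {
            return false;
        }
    }

    public static void main(final String[] args) {
        System.out.println("stale wrapper fails: " + staleWrapperFails());
        System.out.println("rebuilt wrapper works: " + rebuiltWrapperWorks());
    }
}
```

As in the test case from the comment, the read has to go past the 8192 buffered bytes before the stale wrapper touches the closed delegate, which is why short reads after reset can appear to work.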