[ 
https://issues.apache.org/jira/browse/NIFI-11232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694679#comment-17694679
 ] 

Christian Wahl edited comment on NIFI-11232 at 2/28/23 6:30 PM:
----------------------------------------------------------------

Sure, here is a test case (added to TestContentClaimInputStream) that triggers 
this error on the release commit of 1.20.

I have also attached the whole file containing that test case.
{code:java}
@Test
public void test_NIFI_11232() throws IOException {
    final InputStream original = new BufferedInputStream(new ByteArrayInputStream(new byte[10_000]));
    // Use a claim offset (3) greater than the bytes consumed by the first two reads
    final ContentClaimInputStream in = new ContentClaimInputStream(repo, contentClaim, 3L, original, new NopPerformanceTracker());
    in.mark(1);
    in.read();
    in.reset();

    in.mark(1);
    in.read();
    // This reset closes the "original" stream, but the buffer survives and keeps wrapping it
    in.reset();

    // This fails: the read goes to the buffer, and the buffer reads from the closed original stream.
    // Forcing the buffer to read from the original stream requires more bytes than the buffer size (8192).
    in.readNBytes(9000);
}
{code}
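
The failure mode the test relies on can be shown with plain java.io, without any NiFi classes. The sketch below is an illustration only: StaleBufferDemo and StrictStream are hypothetical names, and StrictStream exists just because ByteArrayInputStream ignores close(). It assumes Java 11 or later for readNBytes(int).

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class StaleBufferDemo {

    /** Hypothetical delegate that refuses reads after close(), unlike ByteArrayInputStream. */
    static final class StrictStream extends FilterInputStream {
        private boolean closed;

        StrictStream(final InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            ensureOpen();
            return super.read();
        }

        @Override
        public int read(final byte[] b, final int off, final int len) throws IOException {
            ensureOpen();
            return super.read(b, off, len);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }

        private void ensureOpen() throws IOException {
            if (closed) {
                throw new IOException("Stream closed");
            }
        }
    }

    /** Returns true if reading bytesToRead bytes fails once the delegate has been closed. */
    static boolean readFailsAfterDelegateClosed(final int bytesToRead) {
        final InputStream delegate = new StrictStream(new ByteArrayInputStream(new byte[10_000]));
        final BufferedInputStream buffered = new BufferedInputStream(delegate); // default buffer: 8192 bytes
        try {
            buffered.read();   // fills the internal buffer from the delegate
            delegate.close();  // the delegate is closed, the buffered wrapper is kept
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
        try {
            buffered.readNBytes(bytesToRead);
            return false;
        } catch (final IOException e) {
            return true;
        }
    }

    public static void main(final String[] args) {
        System.out.println(readFailsAfterDelegateClosed(100));   // false: served from the buffer
        System.out.println(readFailsAfterDelegateClosed(9000));  // true: must refill from the closed delegate
    }
}
```

A small read succeeds because the bytes are already buffered, which is why the test needs readNBytes(9000) to surface the error.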
 

In ContentClaimInputStream I identified two problematic parts that cause this 
behaviour.

Line 176: bytesConsumed is only ever incremented, so it does not reflect the 
bytes read since the last mark.
{code:java}
if (bufferedIn != null && bytesConsumed <= markReadLimit)
{code}
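
A minimal sketch of the counter problem, replaying the mark/read/reset sequence from the test case. This is not the NiFi implementation: ConsumedCounterSketch and the bytesSinceMark field are hypothetical, and only the counter half of the condition is modelled (the bufferedIn != null part is left out).

```java
final class ConsumedCounterSketch {
    private long bytesConsumed;   // only ever incremented, as described above
    private long bytesSinceMark;  // hypothetical alternative: cleared on mark() and reset()
    private int markReadLimit;

    void mark(final int readLimit) {
        markReadLimit = readLimit;
        bytesSinceMark = 0;
    }

    void read() {
        bytesConsumed++;
        bytesSinceMark++;
    }

    void reset() {
        bytesSinceMark = 0;
    }

    /** Mirrors the counter part of the quoted check: bytesConsumed <= markReadLimit. */
    boolean cumulativeCheckPasses() {
        return bytesConsumed <= markReadLimit;
    }

    /** Same check against the bytes read since the last mark (assumed fix direction). */
    boolean sinceMarkCheckPasses() {
        return bytesSinceMark <= markReadLimit;
    }

    public static void main(final String[] args) {
        final ConsumedCounterSketch s = new ConsumedCounterSketch();
        s.mark(1);
        s.read();
        System.out.println(s.cumulativeCheckPasses()); // true: 1 <= 1
        s.reset();

        s.mark(1);
        s.read();
        System.out.println(s.cumulativeCheckPasses()); // false: 2 <= 1
        System.out.println(s.sinceMarkCheckPasses());  // true: 1 <= 1
    }
}
```

With the cumulative counter the check already fails on the second reset, even though only one byte was read since the mark.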
Lines 217-221: currentOffset is first reset to the claim offset, so it cannot 
be used to calculate the bytes read since the last mark. In addition, 
currentOffset can easily be bigger than markReadLimit (since markReadLimit 
is a read limit, not the mark offset).
{code:java}
currentOffset = claimOffset;

if (markReadLimit > 0) {
    final int limitLeft = (int) (markReadLimit - currentOffset);
    if (limitLeft > 0) {
{code}
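
The arithmetic can be checked with the values from the test case (claim offset 3, mark(1)). This is a sketch under assumptions: MarkLimitSketch, the markOffset parameter, and the mark-relative variant are hypothetical and only illustrate the difference between an absolute offset and bytes read since the mark.

```java
final class MarkLimitSketch {

    /** The quoted arithmetic: read limit minus the absolute offset. */
    static int limitLeftFromAbsoluteOffset(final int markReadLimit, final long currentOffset) {
        return (int) (markReadLimit - currentOffset);
    }

    /** Hypothetical alternative: read limit minus the bytes read since the mark. */
    static int limitLeftSinceMark(final int markReadLimit, final long currentOffset, final long markOffset) {
        return (int) (markReadLimit - (currentOffset - markOffset));
    }

    public static void main(final String[] args) {
        final long claimOffset = 3L;   // offset used in the test case
        final int markReadLimit = 1;   // from in.mark(1)
        final long currentOffset = claimOffset; // state after "currentOffset = claimOffset"

        // Absolute offset: 1 - 3 = -2, so the "limitLeft > 0" branch is skipped
        System.out.println(limitLeftFromAbsoluteOffset(markReadLimit, currentOffset)); // -2

        // Relative to a mark placed at the claim offset (assumption): 1 - (3 - 3) = 1
        System.out.println(limitLeftSinceMark(markReadLimit, currentOffset, claimOffset)); // 1
    }
}
```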


> FlowFileAccessException using ContentClaimInputStream
> -----------------------------------------------------
>
>                 Key: NIFI-11232
>                 URL: https://issues.apache.org/jira/browse/NIFI-11232
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.20.0
>            Reporter: Christian Wahl
>            Priority: Major
>         Attachments: TestContentClaimInputStream.java
>
>
> NIFI-10888 introduced a BufferedInputStream inside of the 
> ContentClaimInputStream to speed up rewinding in small flow files (<1MB).
> Under some circumstances it can happen in reset that the delegate stream is 
> closed and a new delegate stream is created, but the bufferedIn is not 
> recreated with the new delegate.
> During the next read this leads to a situation where it tries to read from 
> bufferedIn and bufferedIn in turn tries to read from the old and closed 
> delegate stream causing an IOException or FlowFileAccessException.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
