[ https://issues.apache.org/jira/browse/NIFI-11971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17756945#comment-17756945 ]
Mark Payne commented on NIFI-11971:
-----------------------------------
Thanks [~exceptionfactory] and [~Serhii Nesterov]
As for the Exception handling there: if we think through the logic, I'd say
that it's generally not going to be a concern. But there's a very small corner
case where it could be. In particular, a Processor would have to use a
ProcessSession to write to a FlowFile using ProcessSession.append(), and then
afterward call ProcessSession.write(), which would be rather odd to have both
in the same Processor. That call to ProcessSession.write() would then have to
encounter an Exception when trying to flush (e.g., the disk is full), and that
Exception would be re-thrown. The Processor would then have to catch that
Exception, essentially ignore it, and keep writing to the ProcessSession
without committing or rolling back.
So that super awkward series of events is feasible, but incredibly unlikely.
Regardless, it is worth ensuring that we never hit the problem, even in such
an odd corner case. I think there's actually a super trivial fix to ensure
that, so I'll push up a new one-liner PR for it as well.
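
The series of events described above can be pictured with a minimal sketch. It does not use the NiFi API: SketchSession, FailingSession and both helper methods are hypothetical stand-ins, and UncheckedIOException is used here only to simulate a failed flush (the real ProcessSession reports failures through its own exception types).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class CornerCaseSketch {
    // Hypothetical stand-in for the session calls named in the comment; not the NiFi API.
    interface SketchSession {
        void append(String data);
        void write(String data);
        void rollback();
    }

    // Fake session that records calls and fails on write(), simulating a full disk on flush.
    static final class FailingSession implements SketchSession {
        final List<String> calls = new ArrayList<>();

        @Override
        public void append(String data) {
            calls.add("append");
        }

        @Override
        public void write(String data) {
            calls.add("write");
            throw new UncheckedIOException(new IOException("simulated: disk full on flush"));
        }

        @Override
        public void rollback() {
            calls.add("rollback");
        }
    }

    // The awkward pattern: the failure is swallowed and the session keeps being used.
    static void swallowAndContinue(SketchSession session) {
        session.append("a");
        try {
            session.write("b");
        } catch (UncheckedIOException e) {
            // Ignored on purpose: no commit and no rollback happens here.
        }
        session.append("c");
    }

    // The more usual pattern: roll back and let the failure propagate.
    static void rollbackOnFailure(SketchSession session) {
        session.append("a");
        try {
            session.write("b");
        } catch (UncheckedIOException e) {
            session.rollback();
            throw e;
        }
    }
}
```

Only the first helper matches the corner case: it needs append(), then a failing write(), then a swallowed Exception, then more writing on the same session.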
> FlowFile content is corrupted across the whole NiFi instance throughout
> ProcessSession::write with omitting writing any byte to OutputStream
> --------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: NIFI-11971
> URL: https://issues.apache.org/jira/browse/NIFI-11971
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.23.0, 1.23.1
> Reporter: Serhii Nesterov
> Assignee: Mark Payne
> Priority: Blocker
> Labels: corruption
> Fix For: 2.0.0, 1.24.0, 1.23.2
>
> Attachments: image-2023-08-20-19-31-16-598.png,
> image-2023-08-20-19-37-43-772.png, image-2023-08-20-19-38-03-391.png,
> image-2023-08-20-19-42-37-029.png, image-2023-08-20-19-43-03-697.png,
> image-2023-08-20-20-01-50-445.png, image-2023-08-21-13-21-31-091.png
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> One of the scenarios for ProcessSession::write was broken after recent code
> refactoring within the following pull request:
> [https://github.com/apache/nifi/pull/7363/files]
> The issue is located in _StandardContentClaimWriteCache.java_ in the
> _write(final ContentClaim claim)_ method, which returns the _OutputStream_
> used in the _OutputStreamCallback_ interface to let NiFi processors write
> FlowFile content through the _ProcessSession::write_ method.
> If a processor calls _session.write_ but does not write any data to the
> output stream, then none of the write methods in the _OutputStream_ is
> invoked, so the length of the content claim is never recomputed and keeps
> its default value of *-1*. Because the recent refactoring creates a new
> content claim on each _ProcessSession::write_ invocation, the following
> formula gives the wrong result:
> {code:java}
> previous offset + previous length = new offset.{code}
> or as in the codebase:
> {code:java}
> scc = new StandardContentClaim(scc.getResourceClaim(), scc.getOffset() +
> scc.getLength());{code}
> For example, if the previous offset was 1000 and nothing was written to the
> stream (length is -1), then 1000 + (-1) will give us 999 which means that the
> offset is shifted back by one, hence the next content will have an extra
> character from the previous content at the beginning and will lose the last
> character at the end, and all other FlowFiles anywhere in NiFi will be
> corrupted by this defect until the NiFi instance is restarted.
> The following steps can be taken to reproduce the issue (critical in our
> commercial project):
> * Create an empty text file (“a.txt”);
> * Create a text file with any text (“b.txt”);
> * Package these files into a .zip archive;
> * Put it into a file system on Azure Cloud (we use ADLS Gen2);
> * Read the zip file and unpack its content on the NiFi Canvas using the
> _FetchAzureDataLakeStorage_ and _UnpackContent_ processors;
> * Start a flow with the _GenerateFlowFile_ processor. See the results. The
> empty file must be extracted before the non-empty file, otherwise the issue
> won’t reproduce. You’ll see that the second FlowFile content will be
> corrupted – the first character is an unreadable character from the zip
> archive (last character of the content with zip) fetched with
> _FetchAzureDataLakeStorage_ and the last character will be lost. Starting
> from this point, NiFi cannot be used at all because any other processors will
> lead to FlowFile content corruption across the entire NiFi instance due to
> the shifted offset.
> A sample canvas:
> !image-2023-08-20-19-31-16-598.png|width=969,height=492!
>
> Important note: the issue is not reproducible if the empty file is the last
> file to be extracted (the length will be reset when the processor completes),
> or if you do not call _session.write()_ when a file has 0 bytes (for example,
> if you create your own processor with such logic).
> The offsets for the above picture look as follows (#1 - after fetching and
> unpacking an empty file, #2 - before unpacking the second file):
> !image-2023-08-20-19-37-43-772.png|width=961,height=32!
> !image-2023-08-20-19-38-03-391.png|width=960,height=35!
> 1524 is the offset after FetchAzureDataLakeStorage and UnpackContent for the
> empty file. The length *-1* is kept instead of *0* and used for the next
> file, which is why the next offset is equal to 1523 (*1524 + (-1) = 1523*).
> If your file has the "Hello world" text inside, then after downloading this
> unpacked file from NiFi you'll see (the first character here is a space):
> !image-2023-08-20-20-01-50-445.png!
> Different processors will give you various errors due to the corrupted
> content, especially for the JSON format and queries:
> !image-2023-08-20-19-42-37-029.png!
> !image-2023-08-20-19-43-03-697.png!
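
The offset arithmetic from the description can be checked with a small self-contained sketch. It is not NiFi code: ClaimOffsetSketch, its methods and the sample bytes are illustrative assumptions, and the guarded variant only shows one conceivable way to avoid the shift, not necessarily the fix that was merged.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Simplified stand-in for content claim bookkeeping; not NiFi's StandardContentClaim.
final class ClaimOffsetSketch {
    // Default length of a claim to which no byte has been written (per the report).
    static final long UNSET_LENGTH = -1L;

    // Formula quoted in the report: previous offset + previous length = new offset.
    static long nextOffsetNaive(long previousOffset, long previousLength) {
        return previousOffset + previousLength;
    }

    // Hypothetical guard: treat an unset (negative) length as zero bytes written.
    static long nextOffsetGuarded(long previousOffset, long previousLength) {
        return previousOffset + Math.max(0L, previousLength);
    }

    // Reads `length` bytes starting at `offset` from a shared container of claim bytes.
    static String read(byte[] container, long offset, int length) {
        int from = (int) offset;
        return new String(Arrays.copyOfRange(container, from, from + length), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Four bytes of earlier content, then an empty claim at offset 4, then "Hello world".
        byte[] container = "ZIP!Hello world".getBytes(StandardCharsets.US_ASCII);
        long emptyClaimOffset = 4L;

        long naive = nextOffsetNaive(emptyClaimOffset, UNSET_LENGTH);     // 3
        long guarded = nextOffsetGuarded(emptyClaimOffset, UNSET_LENGTH); // 4

        System.out.println(read(container, naive, 11));   // !Hello worl
        System.out.println(read(container, guarded, 11)); // Hello world
    }
}
```

With the naive formula the read starts one byte early, so the content gains the last byte of the preceding claim and loses its own last byte, which matches the symptom in the report.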
--
This message was sent by Atlassian Jira
(v8.20.10#820010)