Jens,

Thanks for sharing the images.

I tried to set up a test to reproduce the issue. I’ve had it running for quite 
some time, through millions of iterations.

I’ve used 5 KB files, 50 KB files, 50 MB files, and larger (to the tune of 
hundreds of MB). I’ve been unable to reproduce an issue after millions of 
iterations.

So far I cannot replicate it. And since you’re pulling the data via SFTP and 
then unpacking, which preserves all original attributes from a different 
system, this can easily become confusing.

I recommend trying to reproduce with the SFTP-related processors out of the 
picture, as Joe mentioned, using either GetFile/FetchFile or GenerateFlowFile. 
Then immediately use CryptographicHashContent to generate an ‘initial hash’, 
copy that value to another attribute, and then loop, generating the hash and 
comparing it against the original one. I’ll attach a flow that does this, but 
I’m not sure whether the email server will strip out the attachment.
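
For anyone who wants to run the same kind of check outside NiFi, here is a 
minimal Python sketch of the idea: hash the file once to get the ‘initial 
hash’, then re-hash it in a loop and compare. It is not the attached flow. The 
function names, the chunk size, and the temporary test file are assumptions 
made up for illustration.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Stream the file through SHA-256 so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_hash_mismatches(path, iterations):
    """Hash once for the 'initial hash', then re-hash and compare each time.

    Returns the initial hash and a list of (iteration, hash) pairs for every
    iteration whose hash differed from the initial one.
    """
    initial = sha256_of_file(path)
    mismatches = []
    for i in range(iterations):
        current = sha256_of_file(path)
        if current != initial:
            mismatches.append((i, current))
    return initial, mismatches


if __name__ == "__main__":
    # Hypothetical test data: 5 MB of random bytes in a temporary file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(5 * 1024 * 1024))
    try:
        initial, mismatches = find_hash_mismatches(tmp.name, iterations=20)
        print(initial, "mismatches:", len(mismatches))
    finally:
        os.remove(tmp.name)
```

Pointing this at one of the large files on the same disk as the content 
repository would help separate a storage or hardware problem from a NiFi one, 
since any mismatch here could not come from the flow itself.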

This way we remove any possibility of actual corruption between the two NiFi 
instances. If we can still see corruption / different hashes within a single 
NiFi instance, then it certainly warrants further investigation, but I can’t 
see any issues so far.

Thanks
-Mark





On Oct 20, 2021, at 10:21 AM, Joe Witt 
<joe.w...@gmail.com> wrote:

Jens

Actually, is this current loop test contained within a single NiFi instance, 
and is that where you see the corruption happen?

Joe

On Wed, Oct 20, 2021 at 7:14 AM Joe Witt 
<joe.w...@gmail.com> wrote:
Jens,

You have a very involved setup including other systems (non NiFi).  Have you 
removed those systems from the equation so you have more evidence to support 
your expectation that NiFi is doing something other than you expect?

Joe

On Wed, Oct 20, 2021 at 7:10 AM Jens M. Kofoed 
<jmkofoed....@gmail.com> wrote:
Hi

Today I have another file which has been through the retry loop once. To test 
the processors and the algorithm, I added the HashContent processor and also 
added hashing with SHA-1.
The file has gone through the system, and both the SHA-1 and SHA-256 hashes are 
different than expected. After a 1-minute delay the file goes back into the 
content hashing flow, and this time both hashes are calculated correctly.

I don't believe that the hashing is buggy, but something is very very strange. 
What can influence the processors/algorithm to calculate a different hash???
All the input/output claim information is exactly the same. It is the same 
flow/content file going in a loop. It happens on all 3 nodes.

Any suggestions for where to dig?

Regards
Jens M. Kofoed



On Wed, Oct 20, 2021 at 06:34, Jens M. Kofoed 
<jmkofoed....@gmail.com> wrote:
Hi Mark

Thanks for replying and for the suggestion to look at the Content Claim.
These 3 pictures are from the first attempt:
<image.png>   <image.png>   <image.png>

Yesterday I realized that the content was still in the archive, so I could 
Replay the file.
<image.png>
So here are the same pictures but for the replay and as you can see the 
Identifier, offset and Size are all the same.
<image.png>   <image.png>   <image.png>

In my flow if the hash does not match my original first calculated hash, it 
goes into a retry loop. Here are the pictures for the 4th time the file went 
through:
<image.png>   <image.png>   <image.png>
Here the content Claim is all the same.

We see these issues very rarely (fewer than 1 in 1,000,000 files) and only with 
large files. Only once have I seen the error with a 110 MB file; the other 
times the file sizes were above 800 MB.
This time it was a "flowfile-stream-v3" file, which had been exported from one 
system and imported into another. But once the file has been imported, it is 
the same file inside NiFi and it stays on the same node. It goes through the 
same loop of processors multiple times, and in the end CryptographicHashContent 
calculates a different SHA-256 than it did earlier. This should not be 
possible! And that is what concerns me the most.
What can cause the same processor to calculate two different SHA-256 values for 
the exact same content?

Regards
Jens M. Kofoed


On Tue, Oct 19, 2021 at 16:51, Mark Payne 
<marka...@hotmail.com> wrote:
Jens,

In the two provenance events (one showing a hash of dd4cc… and the other 
showing f6f0…), if you go to the Content tab, do they both show the same 
Content Claim? I.e., do the Input Claim / Output Claim show the same values for 
Container, Section, Identifier, Offset, and Size?

Thanks
-Mark

On Oct 19, 2021, at 1:22 AM, Jens M. Kofoed 
<jmkofoed....@gmail.com> wrote:

Dear NIFI Users

I have posted this mail to the developers mailing list and just want to inform 
all of you about a very odd behavior we are facing.
The background:
We have data going between 2 different NiFi systems which have no direct 
network access to each other. Therefore we calculate a SHA-256 hash value of 
the content at system 1, before the flowfile and data are combined and saved as 
a "flowfile-stream-v3" pkg file. The file is then transported to system 2, 
where the pkg file is unpacked and the flow can continue. To be sure about file 
integrity we calculate a new SHA-256 at system 2. But sometimes we see that the 
SHA-256 gets another value, which might suggest the file was corrupted. But 
recalculating the SHA-256 again gives a new hash value.

----

Tonight I had yet another file which didn't match the expected SHA-256 hash 
value. The content is a 1.7 GB file and the Event Duration to calculate the 
hash was "00:00:17.539".
I have created a retry loop, where the file goes to a Wait processor that 
delays it for 1 minute and then sends it back to CryptographicHashContent for a 
new calculation. After 3 retries the file goes to retries_exceeded and on to a 
disabled processor, just so it sits in a queue where I can look at it manually. 
This morning I rerouted the file from my retries_exceeded queue back to 
CryptographicHashContent for a new calculation, and this time it calculated the 
correct hash value.
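
The retry loop described above can be sketched in Python roughly as follows. 
This is only an illustration of the logic, not the actual NiFi flow: the 
function name, the `read_content` callback, and the in-memory byte strings are 
assumptions, and the real content is a large file read from the content 
repository rather than bytes held in memory.

```python
import hashlib
import time


def verify_with_retries(read_content, expected_sha256, max_retries=3,
                        delay_seconds=60.0, sleep=time.sleep):
    """Re-hash the content until it matches or the retries are used up.

    read_content is a hypothetical callback returning the content as bytes.
    Returns ("matched", attempts) or ("retries_exceeded", attempts), where
    attempts counts the first calculation plus every retry.
    """
    attempts = 0
    while True:
        attempts += 1
        actual = hashlib.sha256(read_content()).hexdigest()
        if actual == expected_sha256:
            return "matched", attempts
        if attempts > max_retries:
            # Corresponds to routing the file to the retries_exceeded queue.
            return "retries_exceeded", attempts
        # Corresponds to the Wait step that delays the file before retrying.
        sleep(delay_seconds)


if __name__ == "__main__":
    content = b"example payload"  # stand-in for the real file content
    expected = hashlib.sha256(content).hexdigest()
    print(verify_with_retries(lambda: content, expected, delay_seconds=0))
```

With `max_retries=3` a file that never matches is hashed four times in total, 
which lines up with the four calculations at system 2 listed further down 
before the file ended up in retries_exceeded.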

THIS CAN'T BE TRUE :-( :-( But it is. - Something very very strange is 
happening.
<image.png>

We are running NiFi 1.13.2 in a 3-node cluster on Ubuntu 20.04.02 with openjdk 
version "1.8.0_292", OpenJDK Runtime Environment (build 
1.8.0_292-8u292-b10-0ubuntu1~20.04-b10), OpenJDK 64-Bit Server VM (build 
25.292-b10, mixed mode). Each server is a VM with 4 CPUs and 8 GB RAM on VMware 
ESXi 7.0.2. Each NiFi node VM runs on a different physical host.
I have inspected different logs to see whether anything else happened at the 
same time the file was going through my loop, but there are no events/tasks at 
that exact time.

System 1:
At 10/19/2021 00:15:11.247 CEST my file is going through a 
CryptographicHashContent: SHA256 value: 
dd4cc7ef8dbc8d70528e8aa788581f0ab88d297c9c9f39b6b542df68952efd20
The file is exported as a "FlowFile Stream, v3" to System 2

System 2:
At 10/19/2021 00:18:10.528 CEST the file is going through a 
CryptographicHashContent: SHA256 value: 
f6f0909aacae4952f10f6fa7704f3e55d0481ec211d495993550aedbb3fe0819
<image.png>
At 10/19/2021 00:19:08.996 CEST the file is going through the same 
CryptographicHashContent at system 2: SHA256 value: 
f6f0909aacae4952f10f6fa7704f3e55d0481ec211d495993550aedbb3fe0819
At 10/19/2021 00:20:04.376 CEST the file is going through the same 
CryptographicHashContent at system 2: SHA256 value: 
f6f0909aacae4952f10f6fa7704f3e55d0481ec211d495993550aedbb3fe0819
At 10/19/2021 00:21:01.711 CEST the file is going through the same 
CryptographicHashContent at system 2: SHA256 value: 
f6f0909aacae4952f10f6fa7704f3e55d0481ec211d495993550aedbb3fe0819

At 10/19/2021 06:07:43.376 CEST the file is going through the same 
CryptographicHashContent at system 2: SHA256 value: 
dd4cc7ef8dbc8d70528e8aa788581f0ab88d297c9c9f39b6b542df68952efd20
<image.png>

How on earth can this happen???

Kind Regards
Jens M. Kofoed



Attachment: Repro.json