Re: CryptographicHashContent calculates 2 different sha256 hashes on the same content

2021-11-03 Thread Joe Witt
Jens,

I think we're at a loss as to how to help with your specific
installation, then.  We have attempted to recreate the scenario
with no luck.  We've offered suggestions on experiments that would
help us narrow in, but you don't think they will help.

At this point we'll probably have to leave this thread here.  If you
used the forced sync properties we mentioned and it is still happening,
then you can be fairly confident the issue is with the JVM or the
virtual file system mechanism.

Thanks
Joe

On Wed, Nov 3, 2021 at 8:09 AM Jens M. Kofoed  wrote:
>
> Hi Mark
>
> All the files in my testflow are 1GB files. But it happens in my production 
> flow with different file sizes.
>
> When these issues have happened, I have the flowfile routed to an
> UpdateAttribute processor which is disabled, just to keep the file in a queue.
> When I enable the processor and send the file back for a new hash calculation,
> the file is OK. So I don’t think the test with backup and compare makes any
> sense to do.
>
> Regards
> Jens
>
> > Den 3. nov. 2021 kl. 15.57 skrev Mark Payne :
> >
> > So what I found interesting about the histogram output was that in each 
> > case, the input file was 1 GB. The number of bytes that differed between 
> > the ‘good’ and ‘bad’ hashes was something like 500-700 bytes whose values 
> > were different. But the values ranged significantly. There was no 
> > indication that the type of thing we’ve seen with NFS mounts was happening, 
> > where data was nulled out until received and then updated. If that had been 
> > the case we’d have seen the NUL byte (or some other value) have a very 
> > significant change in the histogram, but we didn’t see that.
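
The histogram in question counts how many times each byte value (0-255) occurs in the content. A rough command-line equivalent of those per-byte counts (illustrative only, not Mark's script) is:

```shell
# Count occurrences of each byte value in a file, printed in the same
# "histogram.N;count" form seen later in this thread. Illustrative only.
byte_histogram() {
  od -An -v -tu1 "$1" | tr -s ' ' '\n' | grep -v '^$' | sort -n | uniq -c \
    | awk '{ printf "histogram.%s;%s\n", $2, $1 }'
}
```

Running it twice on the same claim file from the content repository and diffing the output would reveal the same kind of count drift reported in the error logs.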
> >
> > So a couple more ideas that I think can be useful.
> >
> > 1) Which garbage collector are you using? It’s configured in the 
> > bootstrap.conf file
> >
> > 2) We can try to definitively prove out whether the content on the disk is 
> > changing or if there’s an issue reading the content. To do this:
> >
> > 1. Stop all processors.
> > 2. Shutdown nifi
> > 3. rm -rf content_repository; rm -rf flowfile_repository   (warning, this 
> > will delete all FlowFiles & content, so only do this on a dev/test system 
> > where you’re comfortable deleting it!)
> > 4. Start nifi
> > 5. Let exactly 1 FlowFile into your flow.
> > 6. While it is looping through, create a copy of your entire Content 
> > Repository: cp -r content_repository content_backup1; zip 
> > content_backup1.zip content_backup1
> > 7. Wait for the hashes to differ
> > 8. Create another copy of the Content Repository: cp -r content_repository 
> > content_backup2
> > 9. Find the files within the content_backup1 and content_backup2 and 
> > compare them to see if they are identical. Would recommend comparing them 
> > using each of the 3 methods: sha256, sha512, diff
> >
> > This should make it pretty clear that either:
> > (1) the issue resides in the software: either NiFi or the JVM
> > (2) the issue resides outside of the software: the disk, the disk driver, 
> > the operating system, the VM hypervisor, etc.
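
The step-9 comparison can be scripted; a minimal sketch (directory names are the ones from steps 6 and 8; adjust paths to your repository layout):

```shell
# Compare two content-repository snapshots file by file. Prints the
# SHA-256 of each file that differs; prints nothing if they match.
compare_snapshots() {
  b1=$1; b2=$2
  ( cd "$b1" && find . -type f ) | while read -r f; do
    if ! cmp -s "$b1/$f" "$b2/$f"; then
      echo "DIFFERS: $f"
      sha256sum "$b1/$f" "$b2/$f"
    fi
  done
}

# Example: compare_snapshots content_backup1 content_backup2
```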
> >
> > Thanks
> > -Mark
> >
> >> On Nov 3, 2021, at 10:44 AM, Joe Witt  wrote:
> >>
> >> Jens,
> >>
> >> 184 hours (7.6 days) in and zero issues.
> >>
> >> Will need to turn this off soon but wanted to give a final update.
> >> Looks great.  Given the information on your system there appears to be
> >> something we don't understand related to the virtual file
> >> system involved.
> >>
> >> Thanks
> >>
> >>> On Tue, Nov 2, 2021 at 10:55 PM Jens M. Kofoed  
> >>> wrote:
> >>>
> >>> Hi Mark
> >>>
> >>> Of course, sorry :-)  Looking at the error messages, I can see that
> >>> only the histogram values with differences are listed. And all 3
> >>> have their first issue at histogram.9. Don't know what that means.
> >>>
> >>> /Jens
> >>>
> >>> Here are the error log:
> >>> 2021-11-01 23:57:21,955 ERROR [Timer-Driven Process Thread-10] 
> >>> org.apache.nifi.processors.script.ExecuteScript 
> >>> ExecuteScript[id=c7d3335b-1045-14ed--a0d62c70] There are 
> >>> differences in the histogram
> >>> Byte Value: histogram.10, Previous Count: 11926720, New Count: 11926721
> >>> Byte Value: histogram.100, Previous Count: 11927504, New Count: 11927503
> >>> Byte Value: histogram.101, Previous Count: 11925396, New Count: 11925407
> >>> Byte Value: histogram.102, Previous Count: 11929923, New Count: 11929941
> >>> Byte Value: histogram.103, Previous Count: 11931596, New Count: 11931591
> >>> Byte Value: histogram.104, Previous Count: 11929071, New Count: 11929064
> >>> Byte Value: histogram.105, Previous Count: 11931365, New Count: 11931348
> >>> Byte Value: histogram.106, Previous Count: 11928661, New Count: 11928645
> >>> Byte Value: histogram.107, Previous Count: 11929864, New Count: 11929866
> >>> Byte Value: histogram.108, Previous Count: 11931611, New Count: 11931642
> >>> Byte Value: histogram.109, Previous Count: 11932758, New Count: 11932763
> >>> Byte 

Re: CryptographicHashContent calculates 2 different sha256 hashes on the same content

2021-11-03 Thread Jens M. Kofoed
Hi Mark

All the files in my testflow are 1GB files. But it happens in my production 
flow with different file sizes. 

When these issues have happened, I have the flowfile routed to an
UpdateAttribute processor which is disabled, just to keep the file in a queue.
When I enable the processor and send the file back for a new hash calculation,
the file is OK. So I don’t think the test with backup and compare makes any
sense to do.

Regards 
Jens

> Den 3. nov. 2021 kl. 15.57 skrev Mark Payne :
> 
> So what I found interesting about the histogram output was that in each case, 
> the input file was 1 GB. The number of bytes that differed between the ‘good’ 
> and ‘bad’ hashes was something like 500-700 bytes whose values were 
> different. But the values ranged significantly. There was no indication that 
> the type of thing we’ve seen with NFS mounts was happening, where data was 
> nulled out until received and then updated. If that had been the case we’d 
> have seen the NUL byte (or some other value) have a very significant change 
> in the histogram, but we didn’t see that.
> 
> So a couple more ideas that I think can be useful.
> 
> 1) Which garbage collector are you using? It’s configured in the 
> bootstrap.conf file
> 
> 2) We can try to definitively prove out whether the content on the disk is 
> changing or if there’s an issue reading the content. To do this:
> 
> 1. Stop all processors.
> 2. Shutdown nifi
> 3. rm -rf content_repository; rm -rf flowfile_repository   (warning, this 
> will delete all FlowFiles & content, so only do this on a dev/test system 
> where you’re comfortable deleting it!)
> 4. Start nifi
> 5. Let exactly 1 FlowFile into your flow.
> 6. While it is looping through, create a copy of your entire Content 
> Repository: cp -r content_repository content_backup1; zip content_backup1.zip 
> content_backup1
> 7. Wait for the hashes to differ
> 8. Create another copy of the Content Repository: cp -r content_repository 
> content_backup2
> 9. Find the files within the content_backup1 and content_backup2 and compare 
> them to see if they are identical. Would recommend comparing them using each 
> of the 3 methods: sha256, sha512, diff
> 
> This should make it pretty clear that either:
> (1) the issue resides in the software: either NiFi or the JVM
> (2) the issue resides outside of the software: the disk, the disk driver, the 
> operating system, the VM hypervisor, etc.
> 
> Thanks
> -Mark
> 
>> On Nov 3, 2021, at 10:44 AM, Joe Witt  wrote:
>> 
>> Jens,
>> 
>> 184 hours (7.6 days) in and zero issues.
>> 
>> Will need to turn this off soon but wanted to give a final update.
>> Looks great.  Given the information on your system there appears to be
>> something we don't understand related to the virtual file
>> system involved.
>> 
>> Thanks
>> 
>>> On Tue, Nov 2, 2021 at 10:55 PM Jens M. Kofoed  
>>> wrote:
>>> 
>>> Hi Mark
>>> 
>>> Of course, sorry :-)  Looking at the error messages, I can see that
>>> only the histogram values with differences are listed. And all 3
>>> have their first issue at histogram.9. Don't know what that means.
>>> 
>>> /Jens
>>> 
>>> Here are the error log:
>>> 2021-11-01 23:57:21,955 ERROR [Timer-Driven Process Thread-10] 
>>> org.apache.nifi.processors.script.ExecuteScript 
>>> ExecuteScript[id=c7d3335b-1045-14ed--a0d62c70] There are 
>>> differences in the histogram
>>> Byte Value: histogram.10, Previous Count: 11926720, New Count: 11926721
>>> Byte Value: histogram.100, Previous Count: 11927504, New Count: 11927503
>>> Byte Value: histogram.101, Previous Count: 11925396, New Count: 11925407
>>> Byte Value: histogram.102, Previous Count: 11929923, New Count: 11929941
>>> Byte Value: histogram.103, Previous Count: 11931596, New Count: 11931591
>>> Byte Value: histogram.104, Previous Count: 11929071, New Count: 11929064
>>> Byte Value: histogram.105, Previous Count: 11931365, New Count: 11931348
>>> Byte Value: histogram.106, Previous Count: 11928661, New Count: 11928645
>>> Byte Value: histogram.107, Previous Count: 11929864, New Count: 11929866
>>> Byte Value: histogram.108, Previous Count: 11931611, New Count: 11931642
>>> Byte Value: histogram.109, Previous Count: 11932758, New Count: 11932763
>>> Byte Value: histogram.110, Previous Count: 11927893, New Count: 11927895
>>> Byte Value: histogram.111, Previous Count: 11933519, New Count: 11933522
>>> Byte Value: histogram.112, Previous Count: 11931392, New Count: 11931397
>>> Byte Value: histogram.113, Previous Count: 11928534, New Count: 11928548
>>> Byte Value: histogram.114, Previous Count: 11936879, New Count: 11936874
>>> Byte Value: histogram.115, Previous Count: 11932818, New Count: 11932804
>>> Byte Value: histogram.117, Previous Count: 11929143, New Count: 11929151
>>> Byte Value: histogram.118, Previous Count: 11931854, New Count: 11931829
>>> Byte Value: histogram.119, Previous Count: 11926333, New Count: 11926327
>>> Byte Value: histogram.120, Previous Count: 11928731, 

Re: CryptographicHashContent calculates 2 different sha256 hashes on the same content

2021-11-03 Thread Mark Payne
So what I found interesting about the histogram output was that in each case, 
the input file was 1 GB. The number of bytes that differed between the ‘good’ 
and ‘bad’ hashes was something like 500-700 bytes whose values were different. 
But the values ranged significantly. There was no indication that the type of 
thing we’ve seen with NFS mounts was happening, where data was nulled out until 
received and then updated. If that had been the case we’d have seen the NUL 
byte (or some other value) have a very significant change in the histogram, but 
we didn’t see that.

So a couple more ideas that I think can be useful.

1) Which garbage collector are you using? It’s configured in the bootstrap.conf 
file

2) We can try to definitively prove out whether the content on the disk is 
changing or if there’s an issue reading the content. To do this:

1. Stop all processors.
2. Shutdown nifi
3. rm -rf content_repository; rm -rf flowfile_repository   (warning, this will 
delete all FlowFiles & content, so only do this on a dev/test system where 
you’re comfortable deleting it!)
4. Start nifi
5. Let exactly 1 FlowFile into your flow.
6. While it is looping through, create a copy of your entire Content 
Repository: cp -r content_repository content_backup1; zip content_backup1.zip 
content_backup1
7. Wait for the hashes to differ
8. Create another copy of the Content Repository: cp -r content_repository 
content_backup2
9. Find the files within the content_backup1 and content_backup2 and compare 
them to see if they are identical. Would recommend comparing them using each of 
the 3 methods: sha256, sha512, diff

This should make it pretty clear that either:
(1) the issue resides in the software: either NiFi or the JVM
(2) the issue resides outside of the software: the disk, the disk driver, the 
operating system, the VM hypervisor, etc.

Thanks
-Mark

> On Nov 3, 2021, at 10:44 AM, Joe Witt  wrote:
> 
> Jens,
> 
> 184 hours (7.6 days) in and zero issues.
> 
> Will need to turn this off soon but wanted to give a final update.
> Looks great.  Given the information on your system there appears to be
> something we don't understand related to the virtual file
> system involved.
> 
> Thanks
> 
> On Tue, Nov 2, 2021 at 10:55 PM Jens M. Kofoed  wrote:
>> 
>> Hi Mark
>> 
>> Of course, sorry :-)  Looking at the error messages, I can see that only
>> the histogram values with differences are listed. And all 3 have
>> their first issue at histogram.9. Don't know what that means.
>> 
>> /Jens
>> 
>> Here are the error log:
>> 2021-11-01 23:57:21,955 ERROR [Timer-Driven Process Thread-10] 
>> org.apache.nifi.processors.script.ExecuteScript 
>> ExecuteScript[id=c7d3335b-1045-14ed--a0d62c70] There are differences 
>> in the histogram
>> Byte Value: histogram.10, Previous Count: 11926720, New Count: 11926721
>> Byte Value: histogram.100, Previous Count: 11927504, New Count: 11927503
>> Byte Value: histogram.101, Previous Count: 11925396, New Count: 11925407
>> Byte Value: histogram.102, Previous Count: 11929923, New Count: 11929941
>> Byte Value: histogram.103, Previous Count: 11931596, New Count: 11931591
>> Byte Value: histogram.104, Previous Count: 11929071, New Count: 11929064
>> Byte Value: histogram.105, Previous Count: 11931365, New Count: 11931348
>> Byte Value: histogram.106, Previous Count: 11928661, New Count: 11928645
>> Byte Value: histogram.107, Previous Count: 11929864, New Count: 11929866
>> Byte Value: histogram.108, Previous Count: 11931611, New Count: 11931642
>> Byte Value: histogram.109, Previous Count: 11932758, New Count: 11932763
>> Byte Value: histogram.110, Previous Count: 11927893, New Count: 11927895
>> Byte Value: histogram.111, Previous Count: 11933519, New Count: 11933522
>> Byte Value: histogram.112, Previous Count: 11931392, New Count: 11931397
>> Byte Value: histogram.113, Previous Count: 11928534, New Count: 11928548
>> Byte Value: histogram.114, Previous Count: 11936879, New Count: 11936874
>> Byte Value: histogram.115, Previous Count: 11932818, New Count: 11932804
>> Byte Value: histogram.117, Previous Count: 11929143, New Count: 11929151
>> Byte Value: histogram.118, Previous Count: 11931854, New Count: 11931829
>> Byte Value: histogram.119, Previous Count: 11926333, New Count: 11926327
>> Byte Value: histogram.120, Previous Count: 11928731, New Count: 11928740
>> Byte Value: histogram.121, Previous Count: 11931149, New Count: 11931162
>> Byte Value: histogram.122, Previous Count: 11926725, New Count: 11926733
>> Byte Value: histogram.32, Previous Count: 11930422, New Count: 11930425
>> Byte Value: histogram.33, Previous Count: 11934311, New Count: 11934313
>> Byte Value: histogram.34, Previous Count: 11930459, New Count: 11930446
>> Byte Value: histogram.35, Previous Count: 11924776, New Count: 11924758
>> Byte Value: histogram.36, Previous Count: 11924186, New Count: 11924183
>> Byte Value: histogram.37, Previous Count: 11928616, New Count: 11928627
>> Byte 

Re: CryptographicHashContent calculates 2 different sha256 hashes on the same content

2021-11-03 Thread Joe Witt
Jens,

184 hours (7.6 days) in and zero issues.

Will need to turn this off soon but wanted to give a final update.
Looks great.  Given the information on your system there appears to be
something we don't understand related to the virtual file
system involved.

Thanks

On Tue, Nov 2, 2021 at 10:55 PM Jens M. Kofoed  wrote:
>
> Hi Mark
>
> Of course, sorry :-)  Looking at the error messages, I can see that only
> the histogram values with differences are listed. And all 3 have their
> first issue at histogram.9. Don't know what that means.
>
> /Jens
>
> Here are the error log:
> 2021-11-01 23:57:21,955 ERROR [Timer-Driven Process Thread-10] 
> org.apache.nifi.processors.script.ExecuteScript 
> ExecuteScript[id=c7d3335b-1045-14ed--a0d62c70] There are differences 
> in the histogram
> Byte Value: histogram.10, Previous Count: 11926720, New Count: 11926721
> Byte Value: histogram.100, Previous Count: 11927504, New Count: 11927503
> Byte Value: histogram.101, Previous Count: 11925396, New Count: 11925407
> Byte Value: histogram.102, Previous Count: 11929923, New Count: 11929941
> Byte Value: histogram.103, Previous Count: 11931596, New Count: 11931591
> Byte Value: histogram.104, Previous Count: 11929071, New Count: 11929064
> Byte Value: histogram.105, Previous Count: 11931365, New Count: 11931348
> Byte Value: histogram.106, Previous Count: 11928661, New Count: 11928645
> Byte Value: histogram.107, Previous Count: 11929864, New Count: 11929866
> Byte Value: histogram.108, Previous Count: 11931611, New Count: 11931642
> Byte Value: histogram.109, Previous Count: 11932758, New Count: 11932763
> Byte Value: histogram.110, Previous Count: 11927893, New Count: 11927895
> Byte Value: histogram.111, Previous Count: 11933519, New Count: 11933522
> Byte Value: histogram.112, Previous Count: 11931392, New Count: 11931397
> Byte Value: histogram.113, Previous Count: 11928534, New Count: 11928548
> Byte Value: histogram.114, Previous Count: 11936879, New Count: 11936874
> Byte Value: histogram.115, Previous Count: 11932818, New Count: 11932804
> Byte Value: histogram.117, Previous Count: 11929143, New Count: 11929151
> Byte Value: histogram.118, Previous Count: 11931854, New Count: 11931829
> Byte Value: histogram.119, Previous Count: 11926333, New Count: 11926327
> Byte Value: histogram.120, Previous Count: 11928731, New Count: 11928740
> Byte Value: histogram.121, Previous Count: 11931149, New Count: 11931162
> Byte Value: histogram.122, Previous Count: 11926725, New Count: 11926733
> Byte Value: histogram.32, Previous Count: 11930422, New Count: 11930425
> Byte Value: histogram.33, Previous Count: 11934311, New Count: 11934313
> Byte Value: histogram.34, Previous Count: 11930459, New Count: 11930446
> Byte Value: histogram.35, Previous Count: 11924776, New Count: 11924758
> Byte Value: histogram.36, Previous Count: 11924186, New Count: 11924183
> Byte Value: histogram.37, Previous Count: 11928616, New Count: 11928627
> Byte Value: histogram.38, Previous Count: 11929474, New Count: 11929490
> Byte Value: histogram.39, Previous Count: 11929607, New Count: 11929600
> Byte Value: histogram.40, Previous Count: 11928053, New Count: 11928048
> Byte Value: histogram.41, Previous Count: 11930402, New Count: 11930399
> Byte Value: histogram.42, Previous Count: 11926830, New Count: 11926846
> Byte Value: histogram.44, Previous Count: 11932536, New Count: 11932538
> Byte Value: histogram.45, Previous Count: 11931053, New Count: 11931044
> Byte Value: histogram.46, Previous Count: 11930008, New Count: 11930011
> Byte Value: histogram.47, Previous Count: 11927747, New Count: 11927734
> Byte Value: histogram.48, Previous Count: 11936055, New Count: 11936057
> Byte Value: histogram.49, Previous Count: 11931471, New Count: 11931474
> Byte Value: histogram.50, Previous Count: 11931921, New Count: 11931908
> Byte Value: histogram.51, Previous Count: 11929643, New Count: 11929637
> Byte Value: histogram.52, Previous Count: 11923847, New Count: 11923854
> Byte Value: histogram.53, Previous Count: 11927311, New Count: 11927303
> Byte Value: histogram.54, Previous Count: 11933754, New Count: 11933766
> Byte Value: histogram.55, Previous Count: 11925964, New Count: 11925970
> Byte Value: histogram.56, Previous Count: 11928872, New Count: 11928873
> Byte Value: histogram.57, Previous Count: 11931124, New Count: 11931127
> Byte Value: histogram.58, Previous Count: 11928474, New Count: 11928477
> Byte Value: histogram.59, Previous Count: 11925814, New Count: 11925812
> Byte Value: histogram.60, Previous Count: 11933978, New Count: 11933991
> Byte Value: histogram.61, Previous Count: 11934136, New Count: 11934123
> Byte Value: histogram.62, Previous Count: 11932016, New Count: 11932011
> Byte Value: histogram.63, Previous Count: 23864588, New Count: 23864584
> Byte Value: histogram.64, Previous Count: 11924792, New Count: 11924789
> Byte Value: histogram.65, Previous Count: 11934789, New Count: 11934797
> Byte 

Re: CryptographicHashContent calculates 2 different sha256 hashes on the same content

2021-11-01 Thread Jens M. Kofoed
Hi Mark and Joe

Yesterday morning I implemented Mark's script in my 2 test flows: one
using SFTP, the other MergeContent/UnpackContent. Both test flows are
running on a test cluster with 3 nodes and NiFi 1.14.0.
The 1st flow, with SFTP, has had 1 file go into the failure queue after
about 16 hours.
The 2nd flow has had 2 files go into the failure queue after about 15
and 17 hours.

There is definitely something going wrong in my setup, but I can't figure
out what.

Information from file 1:
histogram.0;0
histogram.1;0
histogram.10;11926720
histogram.100;11927504
histogram.101;11925396
histogram.102;11929923
histogram.103;11931596
histogram.104;11929071
histogram.105;11931365
histogram.106;11928661
histogram.107;11929864
histogram.108;11931611
histogram.109;11932758
histogram.11;0
histogram.110;11927893
histogram.111;11933519
histogram.112;11931392
histogram.113;11928534
histogram.114;11936879
histogram.115;11932818
histogram.116;11934767
histogram.117;11929143
histogram.118;11931854
histogram.119;11926333
histogram.12;0
histogram.120;11928731
histogram.121;11931149
histogram.122;11926725
histogram.123;0
histogram.124;0
histogram.125;0
histogram.126;0
histogram.127;0
histogram.128;0
histogram.129;0
histogram.13;0
histogram.130;0
histogram.131;0
histogram.132;0
histogram.133;0
histogram.134;0
histogram.135;0
histogram.136;0
histogram.137;0
histogram.138;0
histogram.139;0
histogram.14;0
histogram.140;0
histogram.141;0
histogram.142;0
histogram.143;0
histogram.144;0
histogram.145;0
histogram.146;0
histogram.147;0
histogram.148;0
histogram.149;0
histogram.15;0
histogram.150;0
histogram.151;0
histogram.152;0
histogram.153;0
histogram.154;0
histogram.155;0
histogram.156;0
histogram.157;0
histogram.158;0
histogram.159;0
histogram.16;0
histogram.160;0
histogram.161;0
histogram.162;0
histogram.163;0
histogram.164;0
histogram.165;0
histogram.166;0
histogram.167;0
histogram.168;0
histogram.169;0
histogram.17;0
histogram.170;0
histogram.171;0
histogram.172;0
histogram.173;0
histogram.174;0
histogram.175;0
histogram.176;0
histogram.177;0
histogram.178;0
histogram.179;0
histogram.18;0
histogram.180;0
histogram.181;0
histogram.182;0
histogram.183;0
histogram.184;0
histogram.185;0
histogram.186;0
histogram.187;0
histogram.188;0
histogram.189;0
histogram.19;0
histogram.190;0
histogram.191;0
histogram.192;0
histogram.193;0
histogram.194;0
histogram.195;0
histogram.196;0
histogram.197;0
histogram.198;0
histogram.199;0
histogram.2;0
histogram.20;0
histogram.200;0
histogram.201;0
histogram.202;0
histogram.203;0
histogram.204;0
histogram.205;0
histogram.206;0
histogram.207;0
histogram.208;0
histogram.209;0
histogram.21;0
histogram.210;0
histogram.211;0
histogram.212;0
histogram.213;0
histogram.214;0
histogram.215;0
histogram.216;0
histogram.217;0
histogram.218;0
histogram.219;0
histogram.22;0
histogram.220;0
histogram.221;0
histogram.222;0
histogram.223;0
histogram.224;0
histogram.225;0
histogram.226;0
histogram.227;0
histogram.228;0
histogram.229;0
histogram.23;0
histogram.230;0
histogram.231;0
histogram.232;0
histogram.233;0
histogram.234;0
histogram.235;0
histogram.236;0
histogram.237;0
histogram.238;0
histogram.239;0
histogram.24;0
histogram.240;0
histogram.241;0
histogram.242;0
histogram.243;0
histogram.244;0
histogram.245;0
histogram.246;0
histogram.247;0
histogram.248;0
histogram.249;0
histogram.25;0
histogram.250;0
histogram.251;0
histogram.252;0
histogram.253;0
histogram.254;0
histogram.255;0
histogram.26;0
histogram.27;0
histogram.28;0
histogram.29;0
histogram.3;0
histogram.30;0
histogram.31;0
histogram.32;11930422
histogram.33;11934311
histogram.34;11930459
histogram.35;11924776
histogram.36;11924186
histogram.37;11928616
histogram.38;11929474
histogram.39;11929607
histogram.4;0
histogram.40;11928053
histogram.41;11930402
histogram.42;11926830
histogram.43;11938138
histogram.44;11932536
histogram.45;11931053
histogram.46;11930008
histogram.47;11927747
histogram.48;11936055
histogram.49;11931471
histogram.5;0
histogram.50;11931921
histogram.51;11929643
histogram.52;11923847
histogram.53;11927311
histogram.54;11933754
histogram.55;11925964
histogram.56;11928872
histogram.57;11931124
histogram.58;11928474
histogram.59;11925814
histogram.6;0
histogram.60;11933978
histogram.61;11934136
histogram.62;11932016
histogram.63;23864588
histogram.64;11924792
histogram.65;11934789
histogram.66;11933047
histogram.67;11931899
histogram.68;11935615
histogram.69;11927249
histogram.7;0
histogram.70;11933276
histogram.71;11927953
histogram.72;11929275
histogram.73;11930292
histogram.74;11935428
histogram.75;11930317
histogram.76;11935737
histogram.77;11932127
histogram.78;11932344
histogram.79;11932094
histogram.8;0
histogram.80;11930688
histogram.81;11928415
histogram.82;11931559
histogram.83;11934192
histogram.84;11927224
histogram.85;11929491
histogram.86;11930624
histogram.87;11932201
histogram.88;11930694
histogram.89;11936439
histogram.9;11933187
histogram.90;11926445
histogram.91;0
histogram.92;0
histogram.93;0
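
A quick sanity check on a dump like the one above is that the per-byte counts should sum to the content size; this small helper (illustrative, assuming the `histogram.N;count` format shown) totals them:

```shell
# Sum the per-byte counts of a "histogram.N;count" dump. For a 1 GB file
# the total should be 1073741824 (or 10^9, depending on what "1GB" means).
histogram_total() {
  awk -F';' '/^histogram\./ { sum += $2 } END { print sum + 0 }' "$1"
}
```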

Re: CryptographicHashContent calculates 2 different sha256 hashes on the same content

2021-10-31 Thread Joe Witt
Jens,

118 hours in - still good.

Thanks

On Fri, Oct 29, 2021 at 10:22 AM Joe Witt  wrote:
>
> Jens
>
> Update from hour 67.  Still lookin' good.
>
> Will advise.
>
> Thanks
>
> On Thu, Oct 28, 2021 at 8:08 AM Jens M. Kofoed  wrote:
> >
> > Many, many thanks Joe for looking into this. My test flow was running for
> > 6 days before the first error occurred.
> >
> > Thanks
> >
> > > Den 28. okt. 2021 kl. 16.57 skrev Joe Witt :
> > >
> > > Jens,
> > >
> > > Am 40+ hours in running both your flow and mine to reproduce.  So far
> > > neither have shown any sign of trouble.  Will keep running for another
> > > week or so if I can.
> > >
> > > Thanks
> > >
> > >> On Wed, Oct 27, 2021 at 12:42 PM Jens M. Kofoed  
> > >> wrote:
> > >>
> > >> The physical hosts with VMware are using VMFS, but the VMs running on
> > >> the hosts can’t see that.
> > >> But you asked about the underlying file system, and since my first
> > >> answer with the copy from the fstab file wasn’t enough, I just wanted to
> > >> give all the details.
> > >>
> > >> If you create a VM for Windows you would probably use NTFS (on top of
> > >> VMFS); for Linux, EXT3, EXT4, BTRFS, XFS and so on.
> > >>
> > >> All the partitions on my NiFi nodes are local devices (sda, sdb, sdc
> > >> and sdd) for each Linux machine. I don’t use NFS.
> > >>
> > >> Kind regards
> > >> Jens
> > >>
> > >>
> > >>
> > >> Den 27. okt. 2021 kl. 17.47 skrev Joe Witt :
> > >>
> > >> Jens,
> > >>
> > >> I don't quite follow the EXT4 usage on top of VMFS but the point here
> > >> is you'll ultimately need to truly understand your underlying storage
> > >> system and what sorts of guarantees it is giving you.  If linux/the
> > >> jvm/nifi think it has a typical EXT4 type block storage system to work
> > >> with it can only be safe/operate within those constraints.  I have no
> > >> idea about what VMFS brings to the table or the settings for it.
> > >>
> > >> The sync properties I shared previously might help force the issue of
> > >> ensuring a formal sync/flush cycle all the way through the disk has
> > >> occurred which we'd normally not do or need to do but again in some
> > >> cases offers a stronger guarantee in exchange for performance.
> > >>
> > >> In any case...Mark's path for you here will help identify what we're
> > >> dealing with and we can go from there.
> > >>
> > >> I am aware of significant usage of NiFi on VMWare configurations
> > >> without issue at high rates for many years so whatever it is here is
> > >> likely solvable.
> > >>
> > >> Thanks
> > >>
> > >> On Wed, Oct 27, 2021 at 7:28 AM Jens M. Kofoed  
> > >> wrote:
> > >>
> > >>
> > >> Hi Mark
> > >>
> > >>
> > >> Thanks for the clarification. I will implement the script when I return 
> > >> to the office at Monday next week ( November 1st).
> > >>
> > >> I don’t use NFS, but ext4. But I will implement the script so we can
> > >> check if that’s the case here. But I think the issue might occur after
> > >> the processors write content to the repository.
> > >>
> > >> I have a test flow running for more than 2 weeks without any errors. But
> > >> this flow only calculates hashes and compares them.
> > >>
> > >>
> > >> Two other flows both create errors. One flow uses
> > >> PutSFTP->FetchSFTP->CryptographicHashContent->compares. The other flow
> > >> uses MergeContent->UnpackContent->CryptographicHashContent->compares. The
> > >> last flow is entirely inside NiFi, excluding other network/server issues.
> > >>
> > >>
> > >> In both cases the CryptographicHashContent is right after a processor
> > >> which writes new content to the repository. But in one case a file in
> > >> our production flow calculated a wrong hash 4 times, with a 1-minute
> > >> delay between each calculation. A few hours later I looped the file back
> > >> and this time it was OK.
> > >>
> > >> Just like the case in step 5 and 12 in the pdf file
> > >>
> > >>
> > >> I will let you all know more later next week
> > >>
> > >>
> > >> Kind regards
> > >>
> > >> Jens
> > >>
> > >>
> > >>
> > >>
> > >> Den 27. okt. 2021 kl. 15.43 skrev Mark Payne :
> > >>
> > >>
> > >> And the actual script:
> > >>
> > >> import org.apache.nifi.flowfile.FlowFile
> > >> import java.util.stream.Collectors
> > >>
> > >> Map getPreviousHistogram(final FlowFile flowFile) {
> > >>    final Map histogram = flowFile.getAttributes().entrySet().stream()
> > >>        .filter({ entry -> entry.getKey().startsWith("histogram.") })
> > >>        .collect(Collectors.toMap({ entry -> entry.key }, { entry -> entry.value }))
> > >>    return histogram;
> > >> }
> > >>
> > >> Map createHistogram(final FlowFile flowFile, final InputStream inStream) {
> > >>    final Map histogram = new HashMap<>();
> > >>    final int[] distribution = new int[256];
> > >>    Arrays.fill(distribution, 0);
> > >>
> > >>    long total = 0L;
> > >>    final
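
The script is cut off above, but the previous-vs-new comparison it goes on to perform can also be sketched outside NiFi. Given two dumps in the `histogram.N;count` format, this (illustrative) helper prints any changed count in the same shape as the ExecuteScript error log lines:

```shell
# Join two histogram dumps on the byte-value key and report any count
# that changed, formatted like the error log lines earlier in the thread.
# Both inputs must be sorted on the key, as the dumps above already are.
histogram_diff() {
  join -t';' -j1 "$1" "$2" \
    | awk -F';' '$2 != $3 {
        printf "Byte Value: %s, Previous Count: %s, New Count: %s\n", $1, $2, $3
      }'
}
```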

Re: CryptographicHashContent calculates 2 different sha256 hashes on the same content

2021-10-29 Thread Joe Witt
Jens

Update from hour 67.  Still lookin' good.

Will advise.

Thanks

On Thu, Oct 28, 2021 at 8:08 AM Jens M. Kofoed  wrote:
>
> Many, many thanks Joe for looking into this. My test flow was running for 6
> days before the first error occurred.
>
> Thanks
>
> > Den 28. okt. 2021 kl. 16.57 skrev Joe Witt :
> >
> > Jens,
> >
> > Am 40+ hours in running both your flow and mine to reproduce.  So far
> > neither have shown any sign of trouble.  Will keep running for another
> > week or so if I can.
> >
> > Thanks
> >
> >> On Wed, Oct 27, 2021 at 12:42 PM Jens M. Kofoed  
> >> wrote:
> >>
> >> The physical hosts with VMware are using VMFS, but the VMs running on the
> >> hosts can’t see that.
> >> But you asked about the underlying file system, and since my first answer
> >> with the copy from the fstab file wasn’t enough, I just wanted to give all
> >> the details.
> >>
> >> If you create a VM for Windows you would probably use NTFS (on top of
> >> VMFS); for Linux, EXT3, EXT4, BTRFS, XFS and so on.
> >>
> >> All the partitions on my NiFi nodes are local devices (sda, sdb, sdc and
> >> sdd) for each Linux machine. I don’t use NFS.
> >>
> >> Kind regards
> >> Jens
> >>
> >>
> >>
> >> Den 27. okt. 2021 kl. 17.47 skrev Joe Witt :
> >>
> >> Jens,
> >>
> >> I don't quite follow the EXT4 usage on top of VMFS but the point here
> >> is you'll ultimately need to truly understand your underlying storage
> >> system and what sorts of guarantees it is giving you.  If linux/the
> >> jvm/nifi think it has a typical EXT4 type block storage system to work
> >> with it can only be safe/operate within those constraints.  I have no
> >> idea about what VMFS brings to the table or the settings for it.
> >>
> >> The sync properties I shared previously might help force the issue of
> >> ensuring a formal sync/flush cycle all the way through the disk has
> >> occurred which we'd normally not do or need to do but again in some
> >> cases offers a stronger guarantee in exchange for performance.
> >>
> >> In any case...Mark's path for you here will help identify what we're
> >> dealing with and we can go from there.
> >>
> >> I am aware of significant usage of NiFi on VMWare configurations
> >> without issue at high rates for many years so whatever it is here is
> >> likely solvable.
> >>
> >> Thanks
> >>
> >> On Wed, Oct 27, 2021 at 7:28 AM Jens M. Kofoed  
> >> wrote:
> >>
> >>
> >> Hi Mark
> >>
> >>
> >> Thanks for the clarification. I will implement the script when I return to 
> >> the office at Monday next week ( November 1st).
> >>
> >> I don’t use NFS, but ext4. But I will implement the script so we can check 
> >> if it’s the case here. But I think the issue might be after the processors 
> >> writing content to the repository.
> >>
> >> I have a test flow running for more than 2 weeks without any errors. But 
> >> this flow only calculate hash and comparing.
> >>
> >>
> >> Two other flows both create errors. One flow use 
> >> PutSFTP->FetchSFTP->CryptographicHashContent->compares. The other flow use 
> >> MergeContent->UnpackContent->CryptographicHashContent->compares. The last 
> >> flow is totally inside nifi, excluding other network/server issues.
> >>
> >>
> >> In both cases the CryptographicHashContent is right after a process which 
> >> writes new content to the repository. But in one case a file in our 
> >> production flow did calculate a wrong hash 4 times with a 1 minutes delay 
> >> between each calculation. A few hours later I looped the file back and 
> >> this time it was OK.
> >>
> >> Just like the case in step 5 and 12 in the pdf file
> >>
> >>
> >> I will let you all know more later next week
> >>
> >>
> >> Kind regards
> >>
> >> Jens
> >>
> >>
> >>
> >>
> >> Den 27. okt. 2021 kl. 15.43 skrev Mark Payne :
> >>
> >>
> >> And the actual script:
> >>
> >>
> >>
> >> import org.apache.nifi.flowfile.FlowFile
> >>
> >>
> >> import java.util.stream.Collectors
> >>
> >>
> >> Map getPreviousHistogram(final FlowFile flowFile) {
> >>
> >>   final Map histogram = 
> >> flowFile.getAttributes().entrySet().stream()
> >>
> >>   .filter({ entry -> entry.getKey().startsWith("histogram.") })
> >>
> >>   .collect(Collectors.toMap({ entry -> entry.key}, { entry -> 
> >> entry.value }))
> >>
> >>   return histogram;
> >>
> >> }
> >>
> >>
> >> Map createHistogram(final FlowFile flowFile, final 
> >> InputStream inStream) {
> >>
> >>   final Map histogram = new HashMap<>();
> >>
> >>   final int[] distribution = new int[256];
> >>
> >>   Arrays.fill(distribution, 0);
> >>
> >>
> >>   long total = 0L;
> >>
> >>   final byte[] buffer = new byte[8192];
> >>
> >>   int len;
> >>
> >>   while ((len = inStream.read(buffer)) > 0) {
> >>
> >>   for (int i=0; i < len; i++) {
> >>
> >>   final int val = buffer[i];
> >>
> >>   distribution[val]++;
> >>
> >>   total++;
> >>
> >>   }
> >>
> >>   }
> >>
> >>
> >>   for (int i=0; i < 256; i++) {
> >>
> >>   histogram.put("histogram." + 

Re: CryptographicHashContent calculates 2 differents sha256 hashes on the same content

2021-10-28 Thread Jens M. Kofoed
Many many thanks  Joe for looking into this. My test flow was running for 6 
days before the first error occurred

Thanks


Re: CryptographicHashContent calculates 2 differents sha256 hashes on the same content

2021-10-28 Thread Joe Witt
Jens,

Am 40+ hours in running both your flow and mine to reproduce.  So far
neither have shown any sign of trouble.  Will keep running for another
week or so if I can.

Thanks

On Wed, Oct 27, 2021 at 12:42 PM Jens M. Kofoed  wrote:
>
> The Physical hosts with VMWare is using the vmfs but the vm machines running 
> at hosts can’t see that.
> But you asked about the underlying file system  and since my first answer 
> with the copy from the fstab file wasn’t enough I just wanted to give all the 
> details .
>
> If you create a vm for windows you would probably use NTFS (on top of vmfs). 
> For Linux EXT3, EXT4, BTRFS, XFS and so on.
>
> All the partitions at my nifi nodes, are local devices (sda, sdb, sdc and 
> sdd) for each Linux machine. I don’t use nfs
>
> Kind regards
> Jens

Re: CryptographicHashContent calculates 2 differents sha256 hashes on the same content

2021-10-27 Thread Joe Witt
Jens,

I don't quite follow the EXT4 usage on top of VMFS but the point here
is you'll ultimately need to truly understand your underlying storage
system and what sorts of guarantees it is giving you.  If linux/the
jvm/nifi think it has a typical EXT4 type block storage system to work
with it can only be safe/operate within those constraints.  I have no
idea about what VMFS brings to the table or the settings for it.

The sync properties I shared previously might help force the issue of
ensuring a formal sync/flush cycle all the way through the disk has
occurred which we'd normally not do or need to do but again in some
cases offers a stronger guarantee in exchange for performance.

In any case...Mark's path for you here will help identify what we're
dealing with and we can go from there.

I am aware of significant usage of NiFi on VMWare configurations
without issue at high rates for many years so whatever it is here is
likely solvable.

Thanks


Re: CryptographicHashContent calculates 2 differents sha256 hashes on the same content

2021-10-27 Thread Jens M. Kofoed
Hi Mark

Thanks for the clarification. I will implement the script when I return to the 
office on Monday next week (November 1st). 
I don’t use NFS, but ext4. But I will implement the script so we can check if 
it’s the case here. But I think the issue might be after the processors writing 
content to the repository.
I have a test flow running for more than 2 weeks without any errors. But this 
flow only calculates hashes and compares them. 

Two other flows both create errors. One flow uses 
PutSFTP->FetchSFTP->CryptographicHashContent->compares. The other flow uses 
MergeContent->UnpackContent->CryptographicHashContent->compares. The last flow 
is totally inside nifi, excluding other network/server issues. 

In both cases the CryptographicHashContent is right after a process which 
writes new content to the repository. But in one case a file in our production 
flow did calculate a wrong hash 4 times with a 1 minute delay between each 
calculation. A few hours later I looped the file back and this time it was OK. 
Just like the case in step 5 and 12 in the pdf file

I will let you all know more later next week 

Kind regards 
Jens




Re: CryptographicHashContent calculates 2 differents sha256 hashes on the same content

2021-10-27 Thread Mark Payne
And the actual script:



import org.apache.nifi.flowfile.FlowFile

import java.util.stream.Collectors

Map<String, String> getPreviousHistogram(final FlowFile flowFile) {
    final Map<String, String> histogram = flowFile.getAttributes().entrySet().stream()
        .filter({ entry -> entry.getKey().startsWith("histogram.") })
        .collect(Collectors.toMap({ entry -> entry.key }, { entry -> entry.value }))
    return histogram;
}

Map<String, String> createHistogram(final FlowFile flowFile, final InputStream inStream) {
    final Map<String, String> histogram = new HashMap<>();
    final int[] distribution = new int[256];
    Arrays.fill(distribution, 0);

    long total = 0L;
    final byte[] buffer = new byte[8192];
    int len;
    while ((len = inStream.read(buffer)) > 0) {
        for (int i = 0; i < len; i++) {
            // Mask to 0-255: Java/Groovy bytes are signed, so a raw buffer[i]
            // of e.g. -1 would otherwise index outside the distribution array.
            final int val = buffer[i] & 0xFF;
            distribution[val]++;
            total++;
        }
    }

    for (int i = 0; i < 256; i++) {
        histogram.put("histogram." + i, String.valueOf(distribution[i]));
    }
    histogram.put("histogram.totalBytes", String.valueOf(total));

    return histogram;
}

void logHistogramDifferences(final Map<String, String> previous, final Map<String, String> updated) {
    final StringBuilder sb = new StringBuilder("There are differences in the histogram\n");
    final Map<String, String> sorted = new TreeMap<>(previous)
    for (final Map.Entry<String, String> entry : sorted.entrySet()) {
        final String key = entry.getKey();
        final String previousValue = entry.getValue();
        final String updatedValue = updated.get(entry.getKey())

        if (!Objects.equals(previousValue, updatedValue)) {
            sb.append("Byte Value: ").append(key).append(", Previous Count: ")
                .append(previousValue).append(", New Count: ").append(updatedValue).append("\n");
        }
    }

    log.error(sb.toString());
}


def flowFile = session.get()
if (flowFile == null) {
    return
}

final Map<String, String> previousHistogram = getPreviousHistogram(flowFile)
Map<String, String> histogram = null;

final InputStream inStream = session.read(flowFile);
try {
    histogram = createHistogram(flowFile, inStream);
} finally {
    inStream.close()
}

if (!previousHistogram.isEmpty()) {
    if (previousHistogram.equals(histogram)) {
        log.info("Histograms match")
    } else {
        logHistogramDifferences(previousHistogram, histogram)
        session.transfer(flowFile, REL_FAILURE)
        return;
    }
}

flowFile = session.putAllAttributes(flowFile, histogram)
session.transfer(flowFile, REL_SUCCESS)






Re: CryptographicHashContent calculates 2 differents sha256 hashes on the same content

2021-10-27 Thread Mark Payne
Jens,

For a bit of background here, the reason that Joe and I have expressed interest 
in NFS file systems is that the way the protocol works, it is allowed to 
receive packets/chunks of the file out-of-order. So, what happens is let’s say 
a 1 MB file is being written. The first 500 KB are received. Then instead of 
the 501st KB it receives the 503rd KB. What happens is that the size of the 
file on the file system becomes 503 KB. But what about 501 & 502? Well when you 
read the data, the file system just returns ASCII NUL characters (byte 0) for 
those bytes. Once the NFS server receives those bytes, it then goes back and 
fills in the proper bytes. So if you’re running on NFS, it is possible for the 
contents of the file on the underlying file system to change out from under 
you. It’s not clear to me what other types of file system might do something 
similar.
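The NUL-fill behavior described above can be reproduced with an ordinary sparse file on a local POSIX file system: a byte written past the end of a file leaves a hole, and the hole reads back as zero bytes. A small illustrative sketch (the class and file name are mine, not from the thread; POSIX semantics assumed):

```java
import java.io.IOException;
import java.io.RandomAccessFile;

public class SparseHole {
    // Write a single byte at offset 1000 without ever writing bytes 0-999,
    // then read back from the unwritten "hole". POSIX fills the gap with
    // zero bytes, analogous to the NFS out-of-order case described above.
    public static int holeByte(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
            raf.seek(1000);    // position past EOF, leaving an unwritten gap
            raf.write(0x41);   // file length becomes 1001
            raf.seek(0);
            return raf.read(); // bytes in the gap read back as 0 (NUL)
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(holeByte("sparse.bin"));
    }
}
```

On Linux/ext4 this prints 0: the file claims 1001 bytes, but only one of them was ever written.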

So, one thing that we can do is to find out whether or not the contents of the 
underlying file have changed in some way, or if there’s something else 
happening that could perhaps result in the hashes being wrong. I’ve put 
together a script that should help diagnose this.

Can you insert an ExecuteScript processor either just before or just after your 
CryptographicHashContent processor? Doesn’t really matter whether it’s run just 
before or just after. I’ll attach the script here. It’s a Groovy Script so you 
should be able to use ExecuteScript with Script Engine = Groovy and the 
following script as the Script Body. No other changes needed.

The way the script works, it reads in the contents of the FlowFile, and then it 
builds up a histogram of all byte values (0-255) that it sees in the contents, 
and then adds that as attributes. So it adds attributes such as:
histogram.0 = 280273
histogram.1 = 2820
histogram.2 = 48202
histogram.3 = 3820
…
histogram.totalBytes = 1780928732

It then checks if those attributes have already been added. If so, after 
calculating that histogram, it checks against the previous values (in the 
attributes). If they are the same, the FlowFile goes to ’success’. If they are 
different, it logs an error indicating the before/after value for any byte 
whose distribution was different, and it routes to failure.

So, if for example, the first time through it sees 280,273 bytes with a value 
of ‘0’, and the second time it only sees 12,001, then we know there were a 
bunch of 0’s previously that were updated to be some other value. And it 
includes the total number of bytes in case somehow we find that we’re reading 
too many bytes or not enough bytes or something like that. This should help 
narrow down what’s happening.
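The tally Mark describes is just 256 counters indexed by byte value; the one pitfall for anyone adapting it is that Java (and Groovy) bytes are signed, so each byte must be masked to 0-255 before being used as an index. A minimal standalone sketch (class name is illustrative):

```java
public class ByteHistogram {
    // Count occurrences of each byte value 0-255 in the given data.
    public static long[] histogram(byte[] data) {
        long[] counts = new long[256];
        for (byte b : data) {
            // b & 0xFF maps the signed byte (-128..127) onto 0..255;
            // without the mask, a byte like 0xFF would index counts[-1].
            counts[b & 0xFF]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] h = histogram(new byte[] { 0, 0, (byte) 0xFF, 65 });
        System.out.println(h[0] + " " + h[255] + " " + h[65]); // prints: 2 1 1
    }
}
```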

Thanks
-Mark



On Oct 26, 2021, at 6:25 PM, Joe Witt <joe.w...@gmail.com> wrote:

Jens

Attached is the flow I was using (now running yours and this one).  Curious if 
that one reproduces the issue for you as well.

Thanks

On Tue, Oct 26, 2021 at 3:09 PM Joe Witt <joe.w...@gmail.com> wrote:
Jens

I have your flow running and will keep it running for several days/week to see 
if I can reproduce.  Also of note please use your same test flow but use 
HashContent instead of crypto hash.  Curious if that matters for any reason...

Still want to know more about your underlying storage system.

You could also try updating nifi.properties and changing the following lines:
nifi.flowfile.repository.always.sync=true
nifi.content.repository.always.sync=true
nifi.provenance.repository.always.sync=true

It will hurt performance but can be useful/necessary on certain storage 
subsystems.
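For readers wondering what the always.sync flags buy: they make NiFi force repository writes through to stable storage rather than stopping at the OS page cache. In plain Java the distinction looks roughly like this (a sketch of the general fsync pattern, not NiFi's actual repository code):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class SyncedWrite {
    // Write bytes and then fsync: force(true) asks the OS to push both the
    // data and the file metadata to the physical device before returning.
    public static void writeDurably(Path path, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ch.write(ByteBuffer.wrap(data));
            ch.force(true); // without this, data may sit in the page cache
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = Paths.get("synced.bin");
        writeDurably(p, new byte[] { 1, 2, 3 });
        System.out.println(Files.size(p)); // prints: 3
    }
}
```

The extra force() per write is exactly where the performance cost Joe mentions comes from.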

Thanks

On Tue, Oct 26, 2021 at 12:05 PM Joe Witt <joe.w...@gmail.com> wrote:
Ignore "For the scenario where you can replicate this please share the 
flow.xml.gz for which it is reproducible."  I see the uploaded JSON

On Tue, Oct 26, 2021 at 12:04 PM Joe Witt  wrote:
Jens,

We asked about the underlying storage system.  You replied with some info but 
not the specifics.  Do you know precisely what the underlying storage is and 
how it is presented to the operating system?  For instance is it NFS or 
something similar?

I've setup a very similar flow at extremely high rates running for the past 
several days with no issue.  In my case though I know precisely what the config 
is and the disk setup is.  Didn't do anything special to be clear but still it 
is important to know.

For the scenario where you can replicate this please share the flow.xml.gz for 
which it is reproducible.

Thanks
Joe

On Sun, Oct 24, 2021 at 9:53 PM Jens M. Kofoed  wrote:
Dear Joe and Mark

I have created a test flow without the sftp processors, which don't create any 
errors. Therefore I created a new test flow where I use a MergeContent and 
UnpackContent instead of the sftp processors. This keeps all data internal in 
NIFI, but force NIFI to write and read new files totally local.
My flow has been running for 7 days, and this morning there were 2 files where 
the

Re: CryptographicHashContent calculates 2 differents sha256 hashes on the same content

2021-10-20 Thread Joe Witt
Jens,

Also what type of file system/storage system are you running NiFi on
in this case?  We'll need to know this for the NiFi
content/flowfile/provenance repositories? Is it NFS?

Thanks

On Wed, Oct 20, 2021 at 11:14 AM Joe Witt  wrote:
>
> Jens,
>
> And to further narrow this down
>
> "I have a test flow, where a GenerateFlowfile has created 6x 1GB files
> (2 files per node) and next process was a hashcontent before it run
> into a test loop. Where files are uploaded via PutSFTP to a test
> server, and downloaded again and recalculated the hash. I have had one
> issue after 3 days of running."
>
> So to be clear with GenerateFlowFile making these files and then you
> looping the content is wholly and fully exclusively within the control
> of NiFI.  No Get/Fetch/Put-SFTP of any kind at all. In by looping the
> same files over and over in nifi itself you can make this happen or
> cannot?
>
> Thanks
>
> On Wed, Oct 20, 2021 at 11:08 AM Joe Witt  wrote:
> >
> > Jens,
> >
> > "After fetching a FlowFile-stream file and unpacked it back into NiFi
> > I calculate a sha256. 1 minutes later I recalculate the sha256 on the
> > exact same file. And got a new hash. That is what worry’s me.
> > The fact that the same file can be recalculated and produce two
> > different hashes, is very strange, but it happens. "
> >
> > Ok so to confirm you are saying that in each case this happens you see
> > it first compute the wrong hash, but then if you retry the same
> > flowfile it then provides the correct hash?
> >
> > Can you please also show/share the lineage history for such a flow
> > file then?  It should have events for the initial hash, second hash,
> > the unpacking, trace to the original stream, etc...
> >
> > Thanks
> >
> > On Wed, Oct 20, 2021 at 11:00 AM Jens M. Kofoed  
> > wrote:
> > >
> > > Dear Mark and Joe
> > >
> > > I know my setup isn’t normal for many people. But if we only looks at my 
> > > receive side, which the last mails is about. Every thing is happening at 
> > > the same NIFI instance. It is the same 3 node NIFI cluster.
> > > After fetching a FlowFile-stream file and unpacked it back into NiFi I 
> > > calculate a sha256. 1 minutes later I recalculate the sha256 on the exact 
> > > same file. And got a new hash. That is what worry’s me.
> > > The fact that the same file can be recalculated and produce two different 
> > > hashes, is very strange, but it happens. Over the last 5 months it have 
> > > only happen 35-40 times.
> > >
> > > I can understand if the file is not completely loaded and saved into the 
> > > content repository before the hashing starts. But I believe that the 
> > > unpack process don’t forward the flow file to the next process before it 
> > > is 100% finish unpacking and saving the new content to the repository.
> > >
> > > I have a test flow, where a GenerateFlowfile has created 6x 1GB files (2 
> > > files per node) and next process was a hashcontent before it run into a 
> > > test loop. Where files are uploaded via PutSFTP to a test server, and 
> > > downloaded again and recalculated the hash. I have had one issue after 3 
> > > days of running.
> > > Now the test flow is running without the Put/Fetch sftp processors.
> > >
> > > Another problem is that I can’t find any correlation to other events. Not 
> > > within NIFI, nor the server itself or VMWare. If I just could find any 
> > > other event which happens at the same time, I might be able to force some 
> > > kind of event to trigger the issue.
> > > I have tried to force VMware to migrate a NiFi node to another host. 
> > > Forcing it to do a snapshot and deleting snapshots, but nothing can 
> > > trigger and error.
> > >
> > > I know it will be very very difficult to reproduce. But I will setup 
> > > multiple NiFi instances running different test flows to see if I can find 
> > > any reason why it behaves as it does.
> > >
> > > Kind Regards
> > > Jens M. Kofoed
> > >
> > > On Oct 20, 2021, at 16:39, Mark Payne  wrote:
> > >
> > > Jens,
> > >
> > > Thanks for sharing the images.
> > >
> > > I tried to setup a test to reproduce the issue. I’ve had it running for 
> > > quite some time. Running through millions of iterations.
> > >
> > > I’ve used 5 KB files, 50 KB files, 50 MB files, and larger (to the tune 
> > > of hundreds of MB). I’ve been unable to reproduce an issue after millions 
> > > of iterations.
> > >
> > > So far I cannot replicate. And since you’re pulling the data via SFTP and 
> > > then unpacking, which preserves all original attributes from a different 
> > > system, this can easily become confusing.
> > >
> > > Recommend trying to reproduce with SFTP-related processors out of the 
> > > picture, as Joe is mentioning. Either using GetFile/FetchFile or 
> > > GenerateFlowFile. Then immediately use CryptographicHashContent to 
> > > generate an ‘initial hash’, copy that value to another attribute, and 
> > > then loop, generating the hash and comparing against the original one. 
> > > 

Re: CryptographicHashContent calculates 2 differents sha256 hashes on the same content

2021-10-20 Thread Joe Witt
Jens,

And to further narrow this down

"I have a test flow, where a GenerateFlowfile has created 6x 1GB files
(2 files per node) and next process was a hashcontent before it run
into a test loop. Where files are uploaded via PutSFTP to a test
server, and downloaded again and recalculated the hash. I have had one
issue after 3 days of running."

So to be clear: with GenerateFlowFile making these files and then you
looping the content, this is wholly and fully exclusively within the control
of NiFi.  No Get/Fetch/Put-SFTP of any kind at all. By looping the
same files over and over in NiFi itself, can you make this happen or
not?

Thanks

On Wed, Oct 20, 2021 at 11:08 AM Joe Witt  wrote:
>
> Jens,
>
> "After fetching a FlowFile-stream file and unpacked it back into NiFi
> I calculate a sha256. 1 minutes later I recalculate the sha256 on the
> exact same file. And got a new hash. That is what worry’s me.
> The fact that the same file can be recalculated and produce two
> different hashes, is very strange, but it happens. "
>
> Ok so to confirm you are saying that in each case this happens you see
> it first compute the wrong hash, but then if you retry the same
> flowfile it then provides the correct hash?
>
> Can you please also show/share the lineage history for such a flow
> file then?  It should have events for the initial hash, second hash,
> the unpacking, trace to the original stream, etc...
>
> Thanks
>
> On Wed, Oct 20, 2021 at 11:00 AM Jens M. Kofoed  
> wrote:
> >
> > Dear Mark and Joe
> >
> > I know my setup isn’t normal for many people. But if we only looks at my 
> > receive side, which the last mails is about. Every thing is happening at 
> > the same NIFI instance. It is the same 3 node NIFI cluster.
> > After fetching a FlowFile-stream file and unpacked it back into NiFi I 
> > calculate a sha256. 1 minutes later I recalculate the sha256 on the exact 
> > same file. And got a new hash. That is what worry’s me.
> > The fact that the same file can be recalculated and produce two different 
> > hashes, is very strange, but it happens. Over the last 5 months it have 
> > only happen 35-40 times.
> >
> > I can understand if the file is not completely loaded and saved into the 
> > content repository before the hashing starts. But I believe that the unpack 
> > process don’t forward the flow file to the next process before it is 100% 
> > finish unpacking and saving the new content to the repository.
> >
> > I have a test flow, where a GenerateFlowfile has created 6x 1GB files (2 
> > files per node) and next process was a hashcontent before it run into a 
> > test loop. Where files are uploaded via PutSFTP to a test server, and 
> > downloaded again and recalculated the hash. I have had one issue after 3 
> > days of running.
> > Now the test flow is running without the Put/Fetch sftp processors.
> >
> > Another problem is that I can’t find any correlation to other events. Not 
> > within NIFI, nor the server itself or VMWare. If I just could find any 
> > other event which happens at the same time, I might be able to force some 
> > kind of event to trigger the issue.
> > I have tried to force VMware to migrate a NiFi node to another host. 
> > Forcing it to do a snapshot and deleting snapshots, but nothing can trigger 
> > and error.
> >
> > I know it will be very very difficult to reproduce. But I will setup 
> > multiple NiFi instances running different test flows to see if I can find 
> > any reason why it behaves as it does.
> >
> > Kind Regards
> > Jens M. Kofoed
> >
> > On Oct 20, 2021, at 16:39, Mark Payne  wrote:
> >
> > Jens,
> >
> > Thanks for sharing the images.
> >
> > I tried to setup a test to reproduce the issue. I’ve had it running for 
> > quite some time. Running through millions of iterations.
> >
> > I’ve used 5 KB files, 50 KB files, 50 MB files, and larger (to the tune of 
> > hundreds of MB). I’ve been unable to reproduce an issue after millions of 
> > iterations.
> >
> > So far I cannot replicate. And since you’re pulling the data via SFTP and 
> > then unpacking, which preserves all original attributes from a different 
> > system, this can easily become confusing.
> >
> > Recommend trying to reproduce with SFTP-related processors out of the 
> > picture, as Joe is mentioning. Either using GetFile/FetchFile or 
> > GenerateFlowFile. Then immediately use CryptographicHashContent to generate 
> > an ‘initial hash’, copy that value to another attribute, and then loop, 
> > generating the hash and comparing against the original one. I’ll attach a 
> > flow that does this, but not sure if the email server will strip out the 
> > attachment or not.
> >
> > This way we remove any possibility of actual corruption between the two 
> > nifi instances. If we can still see corruption / different hashes within a 
> > single nifi instance, then it certainly warrants further investigation but 
> > i can’t see any issues so far.
> >
> > Thanks
> > -Mark
> >
> >
> >
> >
> >
> > On Oct 

Re: CryptographicHashContent calculates 2 differents sha256 hashes on the same content

2021-10-20 Thread Joe Witt
Jens,

"After fetching a FlowFile-stream file and unpacked it back into NiFi
I calculate a sha256. 1 minutes later I recalculate the sha256 on the
exact same file. And got a new hash. That is what worry’s me.
The fact that the same file can be recalculated and produce two
different hashes, is very strange, but it happens. "

Ok, so to confirm: you are saying that in each case this happens, you see
it first compute the wrong hash, but then if you retry the same
flowfile it provides the correct hash?

Can you please also show/share the lineage history for such a flow
file then?  It should have events for the initial hash, second hash,
the unpacking, trace to the original stream, etc...

Thanks

On Wed, Oct 20, 2021 at 11:00 AM Jens M. Kofoed  wrote:
>
> Dear Mark and Joe
>
> I know my setup isn’t normal for many people. But if we only looks at my 
> receive side, which the last mails is about. Every thing is happening at the 
> same NIFI instance. It is the same 3 node NIFI cluster.
> After fetching a FlowFile-stream file and unpacked it back into NiFi I 
> calculate a sha256. 1 minutes later I recalculate the sha256 on the exact 
> same file. And got a new hash. That is what worry’s me.
> The fact that the same file can be recalculated and produce two different 
> hashes, is very strange, but it happens. Over the last 5 months it have only 
> happen 35-40 times.
>
> I can understand if the file is not completely loaded and saved into the 
> content repository before the hashing starts. But I believe that the unpack 
> process don’t forward the flow file to the next process before it is 100% 
> finish unpacking and saving the new content to the repository.
>
> I have a test flow, where a GenerateFlowfile has created 6x 1GB files (2 
> files per node) and next process was a hashcontent before it run into a test 
> loop. Where files are uploaded via PutSFTP to a test server, and downloaded 
> again and recalculated the hash. I have had one issue after 3 days of running.
> Now the test flow is running without the Put/Fetch sftp processors.
>
> Another problem is that I can’t find any correlation to other events. Not 
> within NIFI, nor the server itself or VMWare. If I just could find any other 
> event which happens at the same time, I might be able to force some kind of 
> event to trigger the issue.
> I have tried to force VMware to migrate a NiFi node to another host. Forcing 
> it to do a snapshot and deleting snapshots, but nothing can trigger and error.
>
> I know it will be very very difficult to reproduce. But I will setup multiple 
> NiFi instances running different test flows to see if I can find any reason 
> why it behaves as it does.
>
> Kind Regards
> Jens M. Kofoed
>
> On Oct 20, 2021, at 16:39, Mark Payne  wrote:
>
> Jens,
>
> Thanks for sharing the images.
>
> I tried to setup a test to reproduce the issue. I’ve had it running for quite 
> some time. Running through millions of iterations.
>
> I’ve used 5 KB files, 50 KB files, 50 MB files, and larger (to the tune of 
> hundreds of MB). I’ve been unable to reproduce an issue after millions of 
> iterations.
>
> So far I cannot replicate. And since you’re pulling the data via SFTP and 
> then unpacking, which preserves all original attributes from a different 
> system, this can easily become confusing.
>
> Recommend trying to reproduce with SFTP-related processors out of the 
> picture, as Joe is mentioning. Either using GetFile/FetchFile or 
> GenerateFlowFile. Then immediately use CryptographicHashContent to generate 
> an ‘initial hash’, copy that value to another attribute, and then loop, 
> generating the hash and comparing against the original one. I’ll attach a 
> flow that does this, but not sure if the email server will strip out the 
> attachment or not.
>
> This way we remove any possibility of actual corruption between the two nifi 
> instances. If we can still see corruption / different hashes within a single 
> nifi instance, then it certainly warrants further investigation but i can’t 
> see any issues so far.
>
> Thanks
> -Mark
>
>
>
>
>
> On Oct 20, 2021, at 10:21 AM, Joe Witt  wrote:
>
> Jens
>
> Actually is this current loop test contained within a single nifi and there 
> you see corruption happen?
>
> Joe
>
> On Wed, Oct 20, 2021 at 7:14 AM Joe Witt  wrote:
>
> Jens,
>
> You have a very involved setup including other systems (non NiFi).  Have you 
> removed those systems from the equation so you have more evidence to support 
> your expectation that NiFi is doing something other than you expect?
>
> Joe
>
> On Wed, Oct 20, 2021 at 7:10 AM Jens M. Kofoed  wrote:
>
> Hi
>
> Today I have another file which have been running through the retry loop one 
> time. To test the processors and the algorithm I added the HashContent 
> processor and also added hashing by SHA-1.
> I file have been going through the system, and both the SHA-1 and SHA-256 are 
> both different than expected. with a 1 minutes delay the file is going 

Re: CryptographicHashContent calculates 2 differents sha256 hashes on the same content

2021-10-20 Thread Jens M. Kofoed
Dear Mark and Joe

I know my setup isn’t normal for many people. But if we only look at my 
receive side, which the last mails are about: everything is happening on the 
same NiFi instance. It is the same 3-node NiFi cluster.
After fetching a FlowFile-stream file and unpacking it back into NiFi, I 
calculate a SHA-256. One minute later I recalculate the SHA-256 on the exact 
same file, and get a new hash. That is what worries me.
The fact that the same file can be recalculated and produce two different 
hashes is very strange, but it happens. Over the last 5 months it has only 
happened 35-40 times.

I could understand it if the file were not completely loaded and saved into the 
content repository before the hashing starts. But I believe the unpack 
process doesn’t forward the FlowFile to the next processor before it is 100% 
finished unpacking and saving the new content to the repository.

I have a test flow where a GenerateFlowFile has created 6x 1GB files (2 files 
per node), and the next processor was a hash content calculation before it ran 
into a test loop, where files are uploaded via PutSFTP to a test server, then 
downloaded again and the hash recalculated. I have had one issue after 3 days 
of running.
Now the test flow is running without the Put/Fetch SFTP processors.

Another problem is that I can’t find any correlation to other events, not 
within NiFi, nor on the server itself or in VMware. If I could just find any 
other event that happens at the same time, I might be able to force some kind 
of event to trigger the issue.
I have tried to force VMware to migrate a NiFi node to another host, and to 
take and delete snapshots, but nothing triggers an error.

I know it will be very, very difficult to reproduce. But I will set up multiple 
NiFi instances running different test flows to see if I can find any reason why 
it behaves as it does.

Kind Regards
Jens M. Kofoed

> On Oct 20, 2021, at 16:39, Mark Payne  wrote:
> 
> Jens,
> 
> Thanks for sharing the images.
> 
> I tried to setup a test to reproduce the issue. I’ve had it running for quite 
> some time. Running through millions of iterations.
> 
> I’ve used 5 KB files, 50 KB files, 50 MB files, and larger (to the tune of 
> hundreds of MB). I’ve been unable to reproduce an issue after millions of 
> iterations.
> 
> So far I cannot replicate. And since you’re pulling the data via SFTP and 
> then unpacking, which preserves all original attributes from a different 
> system, this can easily become confusing.
> 
> Recommend trying to reproduce with SFTP-related processors out of the 
> picture, as Joe is mentioning. Either using GetFile/FetchFile or 
> GenerateFlowFile. Then immediately use CryptographicHashContent to generate 
> an ‘initial hash’, copy that value to another attribute, and then loop, 
> generating the hash and comparing against the original one. I’ll attach a 
> flow that does this, but not sure if the email server will strip out the 
> attachment or not.
> 
> This way we remove any possibility of actual corruption between the two nifi 
> instances. If we can still see corruption / different hashes within a single 
> nifi instance, then it certainly warrants further investigation but i can’t 
> see any issues so far.
> 
> Thanks
> -Mark
> 
> 
> 
> 
> 
>> On Oct 20, 2021, at 10:21 AM, Joe Witt  wrote:
>> 
>> Jens
>> 
>> Actually is this current loop test contained within a single nifi and there 
>> you see corruption happen?
>> 
>> Joe
>> 
>> On Wed, Oct 20, 2021 at 7:14 AM Joe Witt  wrote:
>> Jens,
>> 
>> You have a very involved setup including other systems (non NiFi).  Have you 
>> removed those systems from the equation so you have more evidence to support 
>> your expectation that NiFi is doing something other than you expect?
>> 
>> Joe
>> 
>> On Wed, Oct 20, 2021 at 7:10 AM Jens M. Kofoed  
>> wrote:
>> Hi
>> 
>> Today I have another file which have been running through the retry loop one 
>> time. To test the processors and the algorithm I added the HashContent 
>> processor and also added hashing by SHA-1.
>> A file has been going through the system, and both the SHA-1 and SHA-256 
>> are different than expected. With a 1-minute delay the file goes back into 
>> the hashing content flow, and this time it calculates both hashes fine.
>> 
>> I don't believe that the hashing is buggy, but something is very very 
>> strange. What can influence the processors/algorithm to calculate a 
>> different hash???
>> All the input/output claim information is exactly the same. It is the same 
>> flow/content file going in a loop. It happens on all 3 nodes.
>> 
>> Any suggestions for where to dig ?
>> 
>> Regards
>> Jens M. Kofoed
>> 
>> 
>> 
>> On Wed, Oct 20, 2021 at 06:34, Jens M. Kofoed  wrote:
>> Hi Mark
>> 
>> Thanks for replying and for the suggestion to look at the content Claim.
>> These 3 pictures are from the first attempt: [screenshots omitted]
>> 
>> Yesterday I realized that the content was still in the archive, so I could 
>> 

Re: CryptographicHashContent calculates 2 differents sha256 hashes on the same content

2021-10-20 Thread Mark Payne
Jens,

Thanks for sharing the images.

I tried to set up a test to reproduce the issue. I’ve had it running for quite 
some time, through millions of iterations.

I’ve used 5 KB files, 50 KB files, 50 MB files, and larger (to the tune of 
hundreds of MB). I’ve been unable to reproduce an issue after millions of 
iterations.

So far I cannot replicate. And since you’re pulling the data via SFTP and then 
unpacking, which preserves all original attributes from a different system, 
this can easily become confusing.

Recommend trying to reproduce with SFTP-related processors out of the picture, 
as Joe is mentioning. Either using GetFile/FetchFile or GenerateFlowFile. Then 
immediately use CryptographicHashContent to generate an ‘initial hash’, copy 
that value to another attribute, and then loop, generating the hash and 
comparing against the original one. I’ll attach a flow that does this, but not 
sure if the email server will strip out the attachment or not.
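The generate-an-initial-hash-then-loop check Mark recommends can be sketched outside NiFi like this. A minimal Python sketch under the assumption of plain local files; the function names and the retry count are illustrative, not part of the attached flow.

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 in chunks, which is conceptually what
    CryptographicHashContent does with a FlowFile's content."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def verify_loop(path, initial_hash, retries=3):
    """Recompute the hash `retries` times and collect any digests that
    disagree with the initial value (mirrors the retry/Wait loop)."""
    mismatches = []
    for _ in range(retries):
        digest = sha256_of(path)
        if digest != initial_hash:
            mismatches.append(digest)
    return mismatches
```

On stable content, every recomputation should match and the mismatch list stays empty; any non-empty result is the anomaly described in this thread.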

This way we remove any possibility of actual corruption between the two nifi 
instances. If we can still see corruption / different hashes within a single 
nifi instance, then it certainly warrants further investigation but i can’t see 
any issues so far.

Thanks
-Mark





On Oct 20, 2021, at 10:21 AM, Joe Witt  wrote:

Jens

Actually is this current loop test contained within a single nifi and there you 
see corruption happen?

Joe

On Wed, Oct 20, 2021 at 7:14 AM Joe Witt  wrote:
Jens,

You have a very involved setup including other systems (non NiFi).  Have you 
removed those systems from the equation so you have more evidence to support 
your expectation that NiFi is doing something other than you expect?

Joe

On Wed, Oct 20, 2021 at 7:10 AM Jens M. Kofoed  wrote:
Hi

Today I have another file which have been running through the retry loop one 
time. To test the processors and the algorithm I added the HashContent 
processor and also added hashing by SHA-1.
A file has been going through the system, and both the SHA-1 and SHA-256 are 
different than expected. With a 1-minute delay the file goes back into the 
hashing content flow, and this time it calculates both hashes fine.

I don't believe that the hashing is buggy, but something is very very strange. 
What can influence the processors/algorithm to calculate a different hash???
All the input/output claim information is exactly the same. It is the same 
flow/content file going in a loop. It happens on all 3 nodes.

Any suggestions for where to dig ?

Regards
Jens M. Kofoed



On Wed, Oct 20, 2021 at 06:34, Jens M. Kofoed  wrote:
Hi Mark

Thanks for replying and for the suggestion to look at the content Claim.
These 3 pictures are from the first attempt: [screenshots omitted]

Yesterday I realized that the content was still in the archive, so I could 
replay the file.

So here are the same pictures, but for the replay; as you can see, the 
Identifier, Offset and Size are all the same: [screenshots omitted]

In my flow, if the hash does not match my originally calculated hash, the file 
goes into a retry loop. Here are the pictures for the 4th time the file went 
through: [screenshots omitted]
Here the content Claim is all the same.

It is very rare that we see these issues, <1 in 1,000,000 files, and only with 
large files. Only once have I seen the error with a 110MB file; the other times 
the file sizes were above 800MB.
This time it was a NiFi flowfile-stream-v3 file, which had been exported from 
one system and imported into another. But once the file has been imported, it 
is the same file inside NiFi and it stays on the same node, going through the 
same loop of processors multiple times; and in the end the 
CryptographicHashContent calculates a different SHA-256 than it did earlier. 
This should not be possible!!! And that is what concerns me the most.
What can influence the same processor to calculate 2 different SHA-256 hashes 
on the exact same content???

Regards
Jens M. Kofoed


On Tue, Oct 19, 2021 at 16:51, Mark Payne  wrote:
Jens,

In the two provenance events - one showing a hash of dd4cc… and the other 
showing f6f0….
If you go to the Content tab, do they both show the same Content Claim? I.e., 
do the Input Claim / Output Claim show the same values for Container, Section, 
Identifier, Offset, and Size?

Thanks
-Mark

On Oct 19, 2021, at 1:22 AM, Jens M. Kofoed  wrote:

Dear NIFI Users

I have posted this mail in the developers mailing list and just want to inform 
all of our about a very odd behavior we are facing.
The background:
We have data going between 2 different NIFI systems which has no direct network 
access to each other. Therefore we calculate a SHA256 hash value of the content 
at system 1, before the flowfile and data are combined and saved as a 
"flowfile-stream-v3" pkg file. The file is then transported to system 2, where 
the pkg file 

Re: CryptographicHashContent calculates 2 differents sha256 hashes on the same content

2021-10-19 Thread Mark Payne
Jens,

In the two provenance events - one showing a hash of dd4cc… and the other 
showing f6f0….
If you go to the Content tab, do they both show the same Content Claim? I.e., 
do the Input Claim / Output Claim show the same values for Container, Section, 
Identifier, Offset, and Size?
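To take that check one step further, one can hash exactly the byte range the Content tab reports, straight off the content repository on disk. A hedged Python sketch: the claim path, offset, and size are whatever the Input/Output Claim shows for the event, not fixed values.

```python
import hashlib

def hash_claim_slice(claim_path, offset, size, chunk_size=1 << 20):
    """SHA-256 over the exact byte range of a content-claim file, so the
    result can be compared with what CryptographicHashContent reported.
    claim_path is the file under the content repository identified by
    Container/Section/Identifier; offset and size come from the same
    provenance Content tab."""
    h = hashlib.sha256()
    remaining = size
    with open(claim_path, "rb") as f:
        f.seek(offset)
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                break  # claim file shorter than the recorded size
            h.update(chunk)
            remaining -= len(chunk)
    return h.hexdigest()
```

If the on-disk slice hashes to the 'good' value while the processor intermittently reports another, the corruption is happening on the read path rather than at rest.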

Thanks
-Mark

On Oct 19, 2021, at 1:22 AM, Jens M. Kofoed  wrote:

Dear NIFI Users

I have posted this mail to the developers mailing list and just want to inform 
all of you about a very odd behavior we are facing.
The background:
We have data going between 2 different NiFi systems which have no direct network 
access to each other. Therefore we calculate a SHA-256 hash value of the content 
at system 1, before the flowfile and data are combined and saved as a 
"flowfile-stream-v3" pkg file. The file is then transported to system 2, where 
the pkg file is unpacked and the flow can continue. To be sure about file 
integrity we calculate a new SHA-256 at system 2. But sometimes we see that the 
SHA-256 gets another value, which might suggest the file was corrupted; yet 
recalculating the SHA-256 again gives a new hash value.



Tonight I had yet another file which didn't match the expected SHA-256 hash 
value. The content is a 1.7GB file, and the Event Duration was "00:00:17.539" to 
calculate the hash.
I have created a retry loop, where the file goes to a Wait processor that delays 
it 1 minute before sending it back to the CryptographicHashContent for a new 
calculation. After 3 retries the file goes to retries_exceeded and into a 
disabled processor, just to sit in a queue so I can look at it manually. This 
morning I rerouted the file from my retries_exceeded queue back to the 
CryptographicHashContent for a new calculation, and this time it calculated the 
correct hash value.

THIS CAN'T BE TRUE :-( :-( But it is. - Something very very strange is 
happening.


We are running NiFi 1.13.2 in a 3-node cluster on Ubuntu 20.04.2 with openjdk 
version "1.8.0_292", OpenJDK Runtime Environment (build 
1.8.0_292-8u292-b10-0ubuntu1~20.04-b10), OpenJDK 64-Bit Server VM (build 
25.292-b10, mixed mode). Each server is a VM with 4 CPUs and 8GB RAM on VMware 
ESXi 7.0.2. Each NiFi node runs on a different physical host.
I have inspected different logs to see if I can find any correlation with what 
happened at the same time the file was going through my loop, but there is no 
event/task at that exact time.

System 1:
At 10/19/2021 00:15:11.247 CEST my file is going through a 
CryptographicHashContent: SHA256 value: 
dd4cc7ef8dbc8d70528e8aa788581f0ab88d297c9c9f39b6b542df68952efd20
The file is exported as a "FlowFile Stream, v3" to System 2

SYSTEM 2:
At 10/19/2021 00:18:10.528 CEST the file is going through a 
CryptographicHashContent: SHA256 value: 
f6f0909aacae4952f10f6fa7704f3e55d0481ec211d495993550aedbb3fe0819

At 10/19/2021 00:19:08.996 CEST the file is going through the same 
CryptographicHashContent at system 2: SHA256 value: 
f6f0909aacae4952f10f6fa7704f3e55d0481ec211d495993550aedbb3fe0819
At 10/19/2021 00:20:04.376 CEST the file is going through the same a 
CryptographicHashContent at system 2: SHA256 value: 
f6f0909aacae4952f10f6fa7704f3e55d0481ec211d495993550aedbb3fe0819
At 10/19/2021 00:21:01.711 CEST the file is going through the same a 
CryptographicHashContent at system 2: SHA256 value: 
f6f0909aacae4952f10f6fa7704f3e55d0481ec211d495993550aedbb3fe0819

At 10/19/2021 06:07:43.376 CEST the file is going through the same a 
CryptographicHashContent at system 2: SHA256 value: 
dd4cc7ef8dbc8d70528e8aa788581f0ab88d297c9c9f39b6b542df68952efd20


How on earth can this happen???

Kind Regards
Jens M. Kofoed