Re: NPE MergeContent processor
Hi Mark,

That is really good news. As to whether it sounds familiar – I tried so many things when I was upgrading from 0.6.1 to 1.0.0 that I couldn't say whether this was indeed the cause. It may have been – I think, though, that whilst the outcome is probably the same, the exact cause may have been due to the aforementioned 'titting about' ☺

Anyway, I was thinking of splitting my cluster in two anyway to make better use of the syslog ingestion (I think that will give me better throughput, as I am seeing the syslog ingestion as a bottleneck, with repeated warnings over a full buffer), at which point I will delete the provenance repository anyway, which will get rid of this, won't it? I'm assuming I can delete all repositories and just leave the flowfile.xml to have the same starting point for the workflows?

Anyway, thanks again for pursuing this, and once again I am incredibly impressed with the reaction to issues/bugs etc. in this community.

Regards
Conrad

From: Mark Payne
Reply-To: "users@nifi.apache.org"
Date: Tuesday, 15 November 2016 at 18:09
To: "users@nifi.apache.org"
Subject: Re: NPE MergeContent processor
Re: NPE MergeContent processor
Conrad,

Good news - I have been able to replicate the issue and track down the problem. I created a JIRA to address it - https://issues.apache.org/jira/browse/NIFI-3040. I have a PR up to address the issue.

It looks like the problem is due to replaying a FlowFile from Provenance and then restarting NiFi before the replayed FlowFile has completed processing. Does that sound familiar? In the case of MergeContent you'd see a NullPointerException. In other cases it will generally just complain because the UUID is null. The issue has to do with the FlowFile not being properly persisted when a REPLAY event occurs. So if you still have the FlowFile that is causing this, you'd have to manually remove it from its queue to address the issue, but the issue shouldn't happen any more after this fix makes its way in.

Sorry that this has happened to you, but thanks for working with us to give us all we needed to investigate. And thanks for being patient as we've diagnosed and dug in.

Cheers
-Mark

From: Oleg Zhurakousky
Sent: Friday, November 11, 2016 2:07 PM
To: users@nifi.apache.org
Subject: Re: NPE MergeContent processor
Re: NPE MergeContent processor
Sorry, I should have been more clear. I've spent a considerable amount of time slicing and dicing this thing, and while I am still validating a few possibilities, this is most likely due to a FlowFile being rehydrated from the corrupted repo with a missing UUID; when such a file's ID ends up as a parent/child of a ProvenanceEventRecord, we get this issue. Basically, a FlowFile must never exist without a UUID, similar to the way a provenance event record works, where the existence of the UUID is validated during the call to build(). We should definitely do the same in a builder for FlowFile; even though it will not eliminate the issue, it may help to pinpoint its origin. I'll raise the corresponding JIRA to improve FlowFile validation.

Cheers
Oleg

> On Nov 11, 2016, at 3:00 PM, Joe Witt wrote:
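The build()-time validation Oleg proposes above could be sketched as follows. This is purely illustrative: the class and method names (FlowFileSketch, Builder) are hypothetical and are not NiFi's actual FlowFile API.

```java
import java.util.Objects;

// Illustrative sketch only: a FlowFile-style builder that, as suggested,
// refuses to produce an object without a UUID -- mirroring how the
// provenance event record builder validates the UUID during build().
// The names here are hypothetical, not part of NiFi's API.
class FlowFileSketch {
    private final String uuid;

    private FlowFileSketch(String uuid) {
        this.uuid = uuid;
    }

    public String getUuid() {
        return uuid;
    }

    static class Builder {
        private String uuid;

        Builder uuid(String uuid) {
            this.uuid = uuid;
            return this;
        }

        FlowFileSketch build() {
            // Fail fast at construction time instead of with a later,
            // harder-to-trace NullPointerException deep in the
            // provenance writer.
            Objects.requireNonNull(uuid, "FlowFile must never exist without a UUID");
            return new FlowFileSketch(uuid);
        }
    }
}
```

Failing in build() would not fix the underlying persistence bug, but it would move the failure to the point where the invalid FlowFile is created, which is exactly why it helps pinpoint the origin.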
Re: NPE MergeContent processor
that said, even if it is due to crashes or even disk-full cases, we should figure out what happened and make it not possible. We must always work to eliminate the possibility of corruption-causing events and work to recover well in the face of corruption...

On Fri, Nov 11, 2016 at 2:57 PM, Oleg Zhurakousky wrote:
Re: NPE MergeContent processor
Conrad

Is it possible that you may be dealing with corrupted repositories (swap, flow file etc.) due to your upgrades, or maybe even possible crashes?

Cheers
Oleg

On Nov 11, 2016, at 3:11 AM, Conrad Crampton <conrad.cramp...@secdata.com> wrote:

Hi,
This is the flow. The incoming flow is basically a syslog message which is parsed, enriched, then saved to HDFS:
1. Parse (ExtractText)
2. Assign matching parts to attributes
3. Enrich IP address location
4. Assign attributes with geo-enrichment
5. Execute Python script to parse user agent
6. Create JSON from attributes
7. Convert to Avro (all strings)
8. Convert to target Avro schema (had to do 7 & 8 due to a bug(?) where I couldn't go directly from JSON to Avro with integers/longs)
9. Merge into bins (see props below)
10. Append '.avro' to filenames (for reading in Spark subsequently)
11. Save to HDFS

Does this help at all? If you need anything else, just shout.
Regards
Conrad

additional out of shot
· compression level : 1
· Keep Path : false

From: Oleg Zhurakousky
Reply-To: "users@nifi.apache.org"
Date: Thursday, 10 November 2016 at 18:40
To: "users@nifi.apache.org"
Subject: Re: NPE MergeContent processor
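The numbered flow above is mostly attribute manipulation plus format conversion. As a rough plain-Java illustration of a few of those stages (parse, assign attributes, build JSON, append the '.avro' suffix) – the regex and attribute names below are invented for illustration and are not Conrad's actual processor configuration:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of steps 1, 2, 6 and 10 of the flow above: parse a
// syslog-style line (the ExtractText step), assign matched groups to
// attributes, build a JSON string from them, and append ".avro" to the
// filename. Regex and field names are invented for illustration.
class FlowSketch {
    private static final Pattern SYSLOG = Pattern.compile("<(\\d+)>(\\S+) (\\S+) (.*)");

    // Step 1 & 2: parse the line and assign matched parts to attributes.
    static Map<String, String> extract(String line) {
        Matcher m = SYSLOG.matcher(line);
        Map<String, String> attrs = new LinkedHashMap<>();
        if (m.matches()) {
            attrs.put("priority", m.group(1));
            attrs.put("timestamp", m.group(2));
            attrs.put("host", m.group(3));
            attrs.put("message", m.group(4));
        }
        return attrs;
    }

    // Step 6: create JSON from the attributes.
    static String toJson(Map<String, String> attrs) {
        StringBuilder sb = new StringBuilder("{");
        attrs.forEach((k, v) -> sb.append("\"").append(k).append("\":\"").append(v).append("\","));
        if (sb.charAt(sb.length() - 1) == ',') sb.setLength(sb.length() - 1);
        return sb.append("}").toString();
    }

    // Step 10: suffix the filename so Spark recognises the files as Avro.
    static String avroName(String filename) {
        return filename + ".avro";
    }
}
```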
Re: NPE MergeContent processor
Conrad

Any chance you can provide a bit more info about your flow? I was able to find a condition when something like this can happen, but it would have to be with some legacy NiFi distribution, so it's a bit puzzling, but I really want to see if we can close the loop on this. In any event, I think it is safe to raise a JIRA on this one.

Cheers
Oleg

On Nov 10, 2016, at 10:06 AM, Conrad Crampton <conrad.cramp...@secdata.com> wrote:
Re: NPE MergeContent processor
Hi,
The processor continues to write (to HDFS – the next processor in the flow) and doesn't block any others coming into this processor (MergeContent), so not quite the same observed behaviour as NIFI-2015. If there is anything else you would like me to do to help with this, more than happy to help.
Regards
Conrad

From: Bryan Bende
Reply-To: "users@nifi.apache.org"
Date: Thursday, 10 November 2016 at 14:59
To: "users@nifi.apache.org"
Subject: Re: NPE MergeContent processor

On Wed, Nov 9, 2016 at 11:57 AM, Conrad Crampton <conrad.cramp...@secdata.com> wrote:

Hi,
I saw this error after I upgraded to 1.0.0 but thought it was maybe due to the issues I had with that upgrade (entirely my fault, it turns out!), but I have seen it a number of times since, so I turned debugging on to get a better stack trace. The relevant log section is below. Nothing out of the ordinary, and I never saw this in v0.6.1 or below. I would have raised a Jira issue, but after logging in to Jira it only let me create a service desk request (which didn't sound right).
Regards
Conrad

2016-11-09 16:43:46,413 DEBUG [Timer-Driven Process Thread-5] o.a.n.processors.standard.MergeContent MergeContent[id=12c0bec7-68b7-3b60-a020-afcc7b4599e7] has chosen to yield its resources; will not be scheduled to run again for 1000 milliseconds

2016-11-09 16:43:46,414 DEBUG [Timer-Driven Process Thread-5] o.a.n.processors.standard.MergeContent MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] Binned 42 FlowFiles

2016-11-09 16:43:46,418 INFO [Timer-Driven Process Thread-5] o.a.n.processors.standard.MergeContent MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] Merged [
StandardFlowFileRecord[uuid=5e846136-0a7a-46fb-be96-8200d5cdd33d,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1475059643340-275849, container=default, section=393], offset=567158, length=2337],offset=0,name=17453303363322987,size=2337],
StandardFlowFileRecord[uuid=a5f4bd55-82e3-40cb-9fa9-86b9e6816f67,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1475059643340-275849, container=default, section=393], offset=573643, length=2279],offset=0,name=17453303351196175,size=2279],
StandardFlowFileRecord[uuid=c1ca745b-660a-49cd-82e5-fa8b9a2f4165,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1475059643340-275849, container=default, section=393], offset=583957, length=2223],offset=0,name=17453303531879367,size=2223],
StandardFlowFileRecord[uuid=,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1475059643340-275849, container=default, section=393], offset=595617, length=2356],offset=0,name=,size=2356],
StandardFlowFileRecord[uuid=,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1475059643340-275849, container=default, section=393], offset=705637, length=2317],offset=0,name=,size=2317],
StandardFlowFileRecord[uuid=,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1475059643340-275849, container=default, section=393], offset=725376, length=2333],offset=0,name=,size=2333],
StandardFlowFileRecord[uuid=,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1475059643340-275849, container=default, section=393], offset=728703, length=2377],offset=0,name=,size=2377]]
into StandardFlowFileRecord[uuid=1ef3e5a0-f8db-49eb-935d-ed3c991fd631,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1478709819819-416, container=default, section=416], offset=982498, length=4576],offset=0,name=3649103647775837,size=4576]

2016-11-09 16:43:46,418 ERROR [Timer-Driven Process Thread-5] o.a.n.processors.standard.MergeContent MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] failed to process session due to java.lang.NullPointerException: java.lang.NullPointerException

2016-11-09 16:43:46,422 ERROR [Timer-Driven Process Thread-5] o.a.n.processors.standard.MergeContent
java.lang.NullPointerException: null
at org.apache.nifi.stream.io.DataOutputStream.writeUTF(DataOutputStream.java:300) ~[nifi-utils-1.0.0.jar:1.0.0]
at org.apache.nifi.stream.io.DataOutputStream.writeUTF(DataOutputStream.java:281) ~[nifi-utils-1.0.0.jar:1.0.0]
at org.apache.nifi.provenance.StandardRecordWriter.writeUUID(StandardRecordWriter.java:257) ~[na:na]
at org.apache.nifi.provenance.StandardRecordWriter.writeUUIDs(StandardRecordWriter.java:266) ~[na:na]
at org.apache.nifi.provenance.StandardRecordWriter.writeRecord(StandardRecordWriter.java:232) ~[na:na]
at org.apache.nifi.provenance.PersistentProvenanceRepository.persistRecord(PersistentP
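The top frame of that stack trace, DataOutputStream.writeUTF, fails precisely when handed a null string – consistent with the empty uuid= and name= fields in the merged records above. A minimal JDK-only reproduction follows; it uses java.io.DataOutputStream as a stand-in, on the assumption (not verified against the NiFi source) that NiFi's org.apache.nifi.stream.io.DataOutputStream behaves the same way for a null argument:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Demonstrates the failure mode from the stack trace above: writing a
// null UUID string through DataOutputStream.writeUTF throws a
// NullPointerException. (java.io.DataOutputStream stands in for NiFi's
// own org.apache.nifi.stream.io.DataOutputStream -- an assumption.)
class NullUuidWrite {
    static boolean throwsOnNullUuid() {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            String uuid = null; // a FlowFile rehydrated without its UUID
            out.writeUTF(uuid);
            return false;
        } catch (NullPointerException e) {
            return true; // writeUTF dereferences the string immediately
        } catch (IOException e) {
            return false;
        }
    }
}
```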
Re: NPE MergeContent processor
Conrad,

Thanks for reporting this. I wonder if this is also related to: https://issues.apache.org/jira/browse/NIFI-2015

Seems like there is some case where the UUID is ending up as null.

-Bryan

On Wed, Nov 9, 2016 at 11:57 AM, Conrad Crampton <conrad.cramp...@secdata.com> wrote:
> Hi,
> I saw this error after I upgraded to 1.0.0 but thought it was maybe due to the issues I had with that upgrade (entirely my fault it turns out!), but I have seen it a number of times since so I turned debugging on to get a better stacktrace. Relevant log section as below.
> Nothing out of the ordinary, and I never saw this in v0.6.1 or below.
> I would have raised a Jira issue, but after logging in to Jira it only let me create a service desk request (which didn't sound right).
> Regards
> Conrad
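Since the common thread is a UUID "ending up as null", one mitigation raised on this list is to fail fast when a record is built without one, rather than letting the null surface deep inside the provenance writer. A minimal sketch of that idea (FlowFileRecordBuilder here is a hypothetical illustration, not NiFi's actual API):

```java
import java.util.Objects;

// Hypothetical fail-fast builder: reject a record whose UUID is missing
// at build() time instead of letting the null reach writeUTF later.
public class FlowFileRecordBuilder {
    private String uuid;
    private String name;

    public FlowFileRecordBuilder uuid(String uuid) { this.uuid = uuid; return this; }
    public FlowFileRecordBuilder name(String name) { this.name = name; return this; }

    public String build() {
        // Validate eagerly: a FlowFile must never exist without a UUID.
        Objects.requireNonNull(uuid, "FlowFile must never exist without a UUID");
        if (uuid.isEmpty()) {
            throw new IllegalArgumentException("FlowFile must never exist without a UUID");
        }
        return "StandardFlowFileRecord[uuid=" + uuid + ",name=" + name + "]";
    }

    public static void main(String[] args) {
        System.out.println(new FlowFileRecordBuilder()
                .uuid("c1ca745b-660a-49cd-82e5-fa8b9a2f4165").name("demo").build());
        try {
            new FlowFileRecordBuilder().name("no-uuid").build();
        } catch (NullPointerException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Eager validation like this would not remove the underlying repository corruption, but it would turn a confusing NPE at persist time into a pointed error at the moment the bad record is created.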
NPE MergeContent processor
Hi,

I saw this error after I upgraded to 1.0.0 but thought it was maybe due to the issues I had with that upgrade (entirely my fault it turns out!), but I have seen it a number of times since, so I turned debugging on to get a better stacktrace. The relevant log section is below.

Nothing out of the ordinary, and I never saw this in v0.6.1 or below.

I would have raised a Jira issue, but after logging in to Jira it only let me create a service desk request (which didn't sound right).

Regards
Conrad

2016-11-09 16:43:46,413 DEBUG [Timer-Driven Process Thread-5] o.a.n.processors.standard.MergeContent MergeContent[id=12c0bec7-68b7-3b60-a020-afcc7b4599e7] has chosen to yield its resources; will not be scheduled to run again for 1000 milliseconds
2016-11-09 16:43:46,414 DEBUG [Timer-Driven Process Thread-5] o.a.n.processors.standard.MergeContent MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] Binned 42 FlowFiles
2016-11-09 16:43:46,418 INFO [Timer-Driven Process Thread-5] o.a.n.processors.standard.MergeContent MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] Merged [
  StandardFlowFileRecord[uuid=5e846136-0a7a-46fb-be96-8200d5cdd33d,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1475059643340-275849, container=default, section=393], offset=567158, length=2337],offset=0,name=17453303363322987,size=2337],
  StandardFlowFileRecord[uuid=a5f4bd55-82e3-40cb-9fa9-86b9e6816f67,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1475059643340-275849, container=default, section=393], offset=573643, length=2279],offset=0,name=17453303351196175,size=2279],
  StandardFlowFileRecord[uuid=c1ca745b-660a-49cd-82e5-fa8b9a2f4165,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1475059643340-275849, container=default, section=393], offset=583957, length=2223],offset=0,name=17453303531879367,size=2223],
  StandardFlowFileRecord[uuid=,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1475059643340-275849, container=default, section=393], offset=595617, length=2356],offset=0,name=,size=2356],
  StandardFlowFileRecord[uuid=,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1475059643340-275849, container=default, section=393], offset=705637, length=2317],offset=0,name=,size=2317],
  StandardFlowFileRecord[uuid=,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1475059643340-275849, container=default, section=393], offset=725376, length=2333],offset=0,name=,size=2333],
  StandardFlowFileRecord[uuid=,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1475059643340-275849, container=default, section=393], offset=728703, length=2377],offset=0,name=,size=2377]
] into StandardFlowFileRecord[uuid=1ef3e5a0-f8db-49eb-935d-ed3c991fd631,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1478709819819-416, container=default, section=416], offset=982498, length=4576],offset=0,name=3649103647775837,size=4576]
2016-11-09 16:43:46,418 ERROR [Timer-Driven Process Thread-5] o.a.n.processors.standard.MergeContent MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] failed to process session due to java.lang.NullPointerException: java.lang.NullPointerException
2016-11-09 16:43:46,422 ERROR [Timer-Driven Process Thread-5] o.a.n.processors.standard.MergeContent
java.lang.NullPointerException: null
    at org.apache.nifi.stream.io.DataOutputStream.writeUTF(DataOutputStream.java:300) ~[nifi-utils-1.0.0.jar:1.0.0]
    at org.apache.nifi.stream.io.DataOutputStream.writeUTF(DataOutputStream.java:281) ~[nifi-utils-1.0.0.jar:1.0.0]
    at org.apache.nifi.provenance.StandardRecordWriter.writeUUID(StandardRecordWriter.java:257) ~[na:na]
    at org.apache.nifi.provenance.StandardRecordWriter.writeUUIDs(StandardRecordWriter.java:266) ~[na:na]
    at org.apache.nifi.provenance.StandardRecordWriter.writeRecord(StandardRecordWriter.java:232) ~[na:na]
    at org.apache.nifi.provenance.PersistentProvenanceRepository.persistRecord(PersistentProvenanceRepository.java:766) ~[na:na]
    at org.apache.nifi.provenance.PersistentProvenanceRepository.registerEvents(PersistentProvenanceRepository.java:432) ~[na:na]
    at org.apache.nifi.controller.repository.StandardProcessSession.updateProvenanceRepo(StandardProcessSession.java:713) ~[nifi-framework-core-1.0.0.jar:1.0.0]
    at org.apache.nifi.controller.repository.StandardProcessSession.commit(StandardProcessSession.java:311) ~[nifi-framework-core-1.0.0.jar:1.0.0]
    at org.apache.nifi.controller.repository.StandardProcessSession.commit(StandardProcessSession.java:299) ~[nifi-framework-core-1.0.0.jar:1.0.0]
    at org.apache.nifi.processor.util.bin.BinFiles.processBins(BinFiles.java:256) ~[nifi-processor-utils-1.0.0.jar:1.0.0]
    at org.apache.nifi.processor.util.bin.BinFiles.onTrigger(BinFiles.java:190) ~[nifi-processor-utils-1.0.0