Hi Mark,
That is really good news. As to whether it sounds familiar – I tried so many 
things when I was upgrading from 0.6.1 to 1.0.0 that I couldn’t say whether this 
was indeed the cause. It may have been – though whilst the outcome is probably 
the same, the exact cause may have been due to the aforementioned 
‘titting about’ ☺
Anyway, I was thinking of splitting my cluster in two to make better use of the 
syslog ingestion (I think that will give me better throughput, as I am seeing 
the syslog ingestion as a bottleneck, with repeated warnings about a full 
buffer). At that point I will delete the provenance repository, which will get 
rid of this, won’t it? I’m assuming I can delete all repositories and just 
leave the flowfile.xml to have the same starting point for the workflows?
Anyway, thanks again for pursuing this, and once again I am incredibly impressed 
with this community’s response to issues/bugs.
Regards
Conrad

From: Mark Payne <marka...@hotmail.com>
Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
Date: Tuesday, 15 November 2016 at 18:09
To: "users@nifi.apache.org" <users@nifi.apache.org>
Subject: Re: NPE MergeContent processor


Conrad,



Good news - I have been able to replicate the issue and track down the problem. 
I created a JIRA to address it - 
https://issues.apache.org/jira/browse/NIFI-3040.

I have a PR up to address the issue. It looks like the problem is due to 
replaying a FlowFile from Provenance and then restarting NiFi before the 
replayed FlowFile has completed processing. Does that sound familiar?



In the case of MergeContent you'd see a NullPointerException. In other cases it 
will generally just complain because the UUID is null.



The issue has to do with the FlowFile not being properly persisted when a 
REPLAY event occurs. So if you still have the FlowFile that is causing this, 
you'd have to manually remove it from its queue to address the issue, but the 
issue shouldn't happen any more after this fix makes its way in.
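A minimal sketch of why the missing UUID surfaces exactly where it does: the provenance writer eventually hands the UUID string to `DataOutputStream.writeUTF()`, which cannot accept null. This uses the JDK's `java.io.DataOutputStream` as a stand-in for NiFi's own subclass seen in the stack trace:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class NullUuidDemo {
    // Simplified stand-in for StandardRecordWriter.writeUUID(): returns
    // true if the UUID could be serialized, false if writeUTF() rejected
    // it with a NullPointerException (as in the MergeContent error above).
    static boolean writeUuid(String uuid) {
        DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
        try {
            out.writeUTF(uuid); // throws NullPointerException when uuid is null
            return true;
        } catch (NullPointerException e) {
            return false;
        } catch (IOException e) {
            // Cannot happen with an in-memory stream.
            throw new RuntimeException(e);
        }
    }
}
```

So any FlowFile that reaches session commit with a null UUID will fail provenance persistence in exactly this way.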



Sorry that this has bitten you, but thanks for working with us to give us all we 
needed to investigate. And thanks for being patient as we've diagnosed and dug 
in.



Cheers

-Mark

________________________________
From: Oleg Zhurakousky <ozhurakou...@hortonworks.com>
Sent: Friday, November 11, 2016 2:07 PM
To: users@nifi.apache.org
Subject: Re: NPE MergeContent processor

Sorry, I should have been more clear.
I’ve spent a considerable amount of time slicing and dicing this thing, and while 
I am still validating a few possibilities, this is most likely due to a FlowFile 
being rehydrated from the corrupted repo with a missing UUID; when such a file’s 
ID ends up as a parent/child of a ProvenanceEventRecord, we get this issue.
Basically, a FlowFile must never exist without a UUID, similar to a provenance 
event record, where the existence of the UUID is validated during the call to 
build(). We should definitely do the same in a builder for FlowFile; even though 
it will not eliminate the issue, it may help to pinpoint its origin.

I’ll raise the corresponding JIRA to improve FlowFile validation.
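A rough illustration of the validation being suggested here — failing fast in build() rather than letting a null UUID travel downstream and blow up in the provenance writer. Class and method names are hypothetical, not NiFi's actual FlowFile API:

```java
import java.util.Objects;

// Illustrative sketch only: a builder that refuses to create a record
// without a UUID, in the spirit of the proposed FlowFile validation.
public class FlowFileRecord {
    private final String uuid;

    private FlowFileRecord(String uuid) {
        this.uuid = uuid;
    }

    public String getUuid() {
        return uuid;
    }

    public static class Builder {
        private String uuid;

        public Builder uuid(String uuid) {
            this.uuid = uuid;
            return this;
        }

        public FlowFileRecord build() {
            // Fail fast at creation time instead of surfacing much later
            // as an NPE deep inside the provenance record writer.
            Objects.requireNonNull(uuid, "FlowFile UUID must not be null");
            return new FlowFileRecord(uuid);
        }
    }
}
```

As noted, this wouldn't eliminate the corruption itself, but it would move the failure to the point of creation, where the origin is visible.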

Cheers
Oleg

> On Nov 11, 2016, at 3:00 PM, Joe Witt <joe.w...@gmail.com> wrote:
>
> That said, even if it is due to crashes or even disk-full cases, we
> should figure out what happened and make it not possible. We must
> always work to eliminate the possibility of corruption-causing events
> and work to recover well in the face of corruption...
>
> On Fri, Nov 11, 2016 at 2:57 PM, Oleg Zhurakousky
> <ozhurakou...@hortonworks.com> wrote:
>> Conrad
>>
>> Is it possible that you may be dealing with corrupted repositories (swap,
>> flow file etc.) due to your upgrades, or maybe even possible crashes?
>>
>> Cheers
>> Oleg
>>
>> On Nov 11, 2016, at 3:11 AM, Conrad Crampton <conrad.cramp...@secdata.com>
>> wrote:
>>
>> Hi,
>> This is the flow. The incoming flow is basically a syslog message which is
>> parsed, enriched, then saved to HDFS:
>> 1. Parse (ExtractText)
>> 2. Assign matching parts to attributes
>> 3. Enrich IP address location
>> 4. Assign attributes with geo-enrichment
>> 5. Execute Python script to parse user agent
>> 6. Create JSON from attributes
>> 7. Convert to Avro (all strings)
>> 8. Convert to target Avro schema (had to do 7 & 8 due to a bug(?) where I
>> couldn’t go directly from JSON to Avro with integers/longs)
>> 9. Merge into bins (see props below)
>> 10. Append ‘.avro’ to filenames (for reading in Spark subsequently)
>> 11. Save to HDFS
>>
>> Does this help at all?
>> If you need anything else just shout.
>> Regards
>> Conrad
>>
>> <image001.png>
>>
>>
>> <image002.png>
>> Additional properties out of shot:
>> · Compression Level: 1
>> · Keep Path: false
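The two-step conversion in steps 7 & 8 above can be sketched in plain Java: first treat every parsed field as a string, then coerce string values into the typed target schema. This is an illustrative stand-in only — field names and the coercion rule are hypothetical, and the real flow does this with NiFi's JSON/Avro processors:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TwoStepConvert {
    // Step 7 analogue: keep every field as a string, as produced by the
    // all-strings JSON-to-Avro conversion.
    static Map<String, String> allStrings(Map<String, String> parsed) {
        return new LinkedHashMap<>(parsed);
    }

    // Step 8 analogue: coerce into the typed target schema. Hypothetical
    // rule for illustration: numeric-looking values become longs.
    static Map<String, Object> toTyped(Map<String, String> rec) {
        Map<String, Object> typed = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : rec.entrySet()) {
            String v = e.getValue();
            typed.put(e.getKey(), v.matches("-?\\d+") ? Long.parseLong(v) : v);
        }
        return typed;
    }
}
```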
>>
>>
>> From: Oleg Zhurakousky <ozhurakou...@hortonworks.com>
>> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
>> Date: Thursday, 10 November 2016 at 18:40
>> To: "users@nifi.apache.org" <users@nifi.apache.org>
>> Subject: Re: NPE MergeContent processor
>>
>> Conrad
>>
>> Any chance you can provide a bit more info about your flow?
>> I was able to find a condition when something like this can happen, but it
>> would have to be with some legacy NiFi distribution, so it’s a bit puzzling,
>> but I really want to see if we can close the loop on this.
>> In any event, I think it is safe to raise a JIRA on this one.
>>
>> Cheers
>> Oleg
>>
>>
>> On Nov 10, 2016, at 10:06 AM, Conrad Crampton <conrad.cramp...@secdata.com>
>> wrote:
>>
>> Hi,
>> The processor continues to write (to HDFS – the next processor in the flow)
>> and doesn’t block any others coming into this processor (MergeContent), so
>> it's not quite the same observed behaviour as NIFI-2015.
>> If there is anything else you would like me to do to help with this, I'm
>> more than happy to help.
>> Regards
>> Conrad
>>
>> From: Bryan Bende <bbe...@gmail.com>
>> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
>> Date: Thursday, 10 November 2016 at 14:59
>> To: "users@nifi.apache.org" <users@nifi.apache.org>
>> Subject: Re: NPE MergeContent processor
>>
>> Conrad,
>>
>> Thanks for reporting this. I wonder if this is also related to:
>>
>> https://issues.apache.org/jira/browse/NIFI-2015
>>
>> Seems like there is some case where the UUID is ending up as null.
>>
>> -Bryan
>>
>>
>> On Wed, Nov 9, 2016 at 11:57 AM, Conrad Crampton
>> <conrad.cramp...@secdata.com> wrote:
>>
>> Hi,
>> I saw this error after I upgraded to 1.0.0 but thought it was maybe due to
>> the issues I had with that upgrade (entirely my fault, it turns out!).
>> However, I have seen it a number of times since, so I turned debugging on to
>> get a better stacktrace; the relevant log section is below.
>> Nothing out of the ordinary, and I never saw this in v0.6.1 or below.
>> I would have raised a JIRA issue, but after logging in to JIRA it only let
>> me create a service desk request (which didn’t sound right).
>> Regards
>> Conrad
>>
>> 2016-11-09 16:43:46,413 DEBUG [Timer-Driven Process Thread-5]
>> o.a.n.processors.standard.MergeContent
>> MergeContent[id=12c0bec7-68b7-3b60-a020-afcc7b4599e7] has chosen to yield
>> its resources; will not be scheduled to run again for 1000 milliseconds
>> 2016-11-09 16:43:46,414 DEBUG [Timer-Driven Process Thread-5]
>> o.a.n.processors.standard.MergeContent
>> MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] Binned 42 FlowFiles
>> 2016-11-09 16:43:46,418 INFO [Timer-Driven Process Thread-5]
>> o.a.n.processors.standard.MergeContent
>> MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] Merged
>> [StandardFlowFileRecord[uuid=5e846136-0a7a-46fb-be96-8200d5cdd33d,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1475059643340-275849,
>> container=default, section=393], offset=567158,
>> length=2337],offset=0,name=17453303363322987,size=2337],
>> StandardFlowFileRecord[uuid=a5f4bd55-82e3-40cb-9fa9-86b9e6816f67,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1475059643340-275849,
>> container=default, section=393], offset=573643,
>> length=2279],offset=0,name=17453303351196175,size=2279],
>> StandardFlowFileRecord[uuid=c1ca745b-660a-49cd-82e5-fa8b9a2f4165,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1475059643340-275849,
>> container=default, section=393], offset=583957,
>> length=2223],offset=0,name=17453303531879367,size=2223],
>> StandardFlowFileRecord[uuid=<null>,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1475059643340-275849,
>> container=default, section=393], offset=595617,
>> length=2356],offset=0,name=<null>,size=2356],
>> StandardFlowFileRecord[uuid=<null>,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1475059643340-275849,
>> container=default, section=393], offset=705637,
>> length=2317],offset=0,name=<null>,size=2317],
>> StandardFlowFileRecord[uuid=<null>,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1475059643340-275849,
>> container=default, section=393], offset=725376,
>> length=2333],offset=0,name=<null>,size=2333],
>> StandardFlowFileRecord[uuid=<null>,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1475059643340-275849,
>> container=default, section=393], offset=728703,
>> length=2377],offset=0,name=<null>,size=2377]] into
>> StandardFlowFileRecord[uuid=1ef3e5a0-f8db-49eb-935d-ed3c991fd631,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1478709819819-416,
>> container=default, section=416], offset=982498,
>> length=4576],offset=0,name=3649103647775837,size=4576]
>> 2016-11-09 16:43:46,418 ERROR [Timer-Driven Process Thread-5]
>> o.a.n.processors.standard.MergeContent
>> MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116]
>> MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] failed to process
>> session due to java.lang.NullPointerException:
>> java.lang.NullPointerException
>> 2016-11-09 16:43:46,422 ERROR [Timer-Driven Process Thread-5]
>> o.a.n.processors.standard.MergeContent
>> java.lang.NullPointerException: null
>>        at
>> org.apache.nifi.stream.io.DataOutputStream.writeUTF(DataOutputStream.java:300)
>> ~[nifi-utils-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.stream.io.DataOutputStream.writeUTF(DataOutputStream.java:281)
>> ~[nifi-utils-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.provenance.StandardRecordWriter.writeUUID(StandardRecordWriter.java:257)
>> ~[na:na]
>>        at
>> org.apache.nifi.provenance.StandardRecordWriter.writeUUIDs(StandardRecordWriter.java:266)
>> ~[na:na]
>>        at
>> org.apache.nifi.provenance.StandardRecordWriter.writeRecord(StandardRecordWriter.java:232)
>> ~[na:na]
>>        at
>> org.apache.nifi.provenance.PersistentProvenanceRepository.persistRecord(PersistentProvenanceRepository.java:766)
>> ~[na:na]
>>        at
>> org.apache.nifi.provenance.PersistentProvenanceRepository.registerEvents(PersistentProvenanceRepository.java:432)
>> ~[na:na]
>>        at
>> org.apache.nifi.controller.repository.StandardProcessSession.updateProvenanceRepo(StandardProcessSession.java:713)
>> ~[nifi-framework-core-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.controller.repository.StandardProcessSession.commit(StandardProcessSession.java:311)
>> ~[nifi-framework-core-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.controller.repository.StandardProcessSession.commit(StandardProcessSession.java:299)
>> ~[nifi-framework-core-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.processor.util.bin.BinFiles.processBins(BinFiles.java:256)
>> ~[nifi-processor-utils-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.processor.util.bin.BinFiles.onTrigger(BinFiles.java:190)
>> ~[nifi-processor-utils-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1064)
>> ~[nifi-framework-core-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136)
>> [nifi-framework-core-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
>> [nifi-framework-core-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
>> [nifi-framework-core-1.0.0.jar:1.0.0]
>>        at
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>> [na:1.8.0_51]
>>        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>> [na:1.8.0_51]
>>        at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>> [na:1.8.0_51]
>>        at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>> [na:1.8.0_51]
>>        at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> [na:1.8.0_51]
>>        at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> [na:1.8.0_51]
>>        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
>> 2016-11-09 16:43:46,422 WARN [Timer-Driven Process Thread-5]
>> o.a.n.processors.standard.MergeContent
>> MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] Processor
>> Administratively Yielded for 1 sec due to processing failure
>> 2016-11-09 16:43:46,422 WARN [Timer-Driven Process Thread-5]
>> o.a.n.c.t.ContinuallyRunProcessorTask Administratively Yielding
>> MergeContent[id=8db3bb68-0354-3116-96c5-dc80854ef116] due to uncaught
>> Exception: java.lang.NullPointerException
>> 2016-11-09 16:43:46,423 WARN [Timer-Driven Process Thread-5]
>> o.a.n.c.t.ContinuallyRunProcessorTask
>> java.lang.NullPointerException: null
>>        at
>> org.apache.nifi.stream.io.DataOutputStream.writeUTF(DataOutputStream.java:300)
>> ~[nifi-utils-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.stream.io.DataOutputStream.writeUTF(DataOutputStream.java:281)
>> ~[nifi-utils-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.provenance.StandardRecordWriter.writeUUID(StandardRecordWriter.java:257)
>> ~[na:na]
>>        at
>> org.apache.nifi.provenance.StandardRecordWriter.writeUUIDs(StandardRecordWriter.java:266)
>> ~[na:na]
>>        at
>> org.apache.nifi.provenance.StandardRecordWriter.writeRecord(StandardRecordWriter.java:232)
>> ~[na:na]
>>        at
>> org.apache.nifi.provenance.PersistentProvenanceRepository.persistRecord(PersistentProvenanceRepository.java:766)
>> ~[na:na]
>>        at
>> org.apache.nifi.provenance.PersistentProvenanceRepository.registerEvents(PersistentProvenanceRepository.java:432)
>> ~[na:na]
>>        at
>> org.apache.nifi.controller.repository.StandardProcessSession.updateProvenanceRepo(StandardProcessSession.java:713)
>> ~[nifi-framework-core-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.controller.repository.StandardProcessSession.commit(StandardProcessSession.java:311)
>> ~[nifi-framework-core-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.controller.repository.StandardProcessSession.commit(StandardProcessSession.java:299)
>> ~[nifi-framework-core-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.processor.util.bin.BinFiles.processBins(BinFiles.java:256)
>> ~[nifi-processor-utils-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.processor.util.bin.BinFiles.onTrigger(BinFiles.java:190)
>> ~[nifi-processor-utils-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1064)
>> ~[nifi-framework-core-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136)
>> [nifi-framework-core-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
>> [nifi-framework-core-1.0.0.jar:1.0.0]
>>        at
>> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
>> [nifi-framework-core-1.0.0.jar:1.0.0]
>>        at
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>> [na:1.8.0_51]
>>        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>> [na:1.8.0_51]
>>        at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>> [na:1.8.0_51]
>>        at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>> [na:1.8.0_51]
>>        at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> [na:1.8.0_51]
>>        at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> [na:1.8.0_51]
>>        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
>>
>>
>>
>
