ListS3 processor question (duplicate files / maintaining state)

2016-06-26 Thread ddewaele
Hi,

I have a question about the ListS3 processor.
I'm using it to monitor the contents of an S3 bucket.
The idea is that when new files come in, they are picked up and sent
through the dataflow, using a FetchS3Object to fetch each file. This all
works, but I have 2 questions:

1. Where does the ListS3 processor keep its state? How does it know which
files it has already processed, and is there a way to clear this state?
2. Sometimes, when syncing files to my S3 buckets, I notice that the ListS3
processor picks up the same file twice. Is there a way to avoid that?





--
View this message in context: 
http://apache-nifi-developer-list.39713.n7.nabble.com/ListS3-processor-question-duplicate-files-maintaining-state-tp12278.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.


Re: UI can take a very long time to become available

2016-06-26 Thread ddewaele
Hi,

I just did a fresh build of 1.0.0-SNAPSHOT and it seems to have solved the
issue.
I replayed the 2 scenarios without an issue.

(I was using an "old" snapshot from 2 weeks ago.)

Thanks a lot for the support on this.





Re: UI can take a very long time to become available

2016-06-25 Thread ddewaele
OK, I will provide some more details tomorrow (need to sleep now).
This is on 1.0.0-SNAPSHOT running on Mac OS X with JDK 8.

On Sunday, 26 June 2016, Joe Witt wrote:

> Ok that was definitely good to see (the video).  I tried a similar
> setup and could not recreate it.
>
> Let's see if we can get the perspective of a few other folks.
>
> If you can get a thread dump at all during the spinning that would be
> really helpful.
>
> Thanks
> Joe

Re: UI can take a very long time to become available

2016-06-25 Thread ddewaele
Here you can see the 2 crash scenarios mentioned in the previous post:
https://dl.dropboxusercontent.com/u/13246619/nifi-crash/NifiCrash.mp4

- The first one is deleting a processor with a non-empty queue attached.
- The second one is deleting a processor with a destination that is running.

Both actions fail (error messages in a popup). However, after that, any action
in the UI causes the flow to freeze.

Hope this helps







Re: UI can take a very long time to become available

2016-06-25 Thread ddewaele
That did the trick. I suspected it would be something like that, but I had
assumed that sufficient back-pressure would be applied to stop the flow from
getting flooded.
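
For readers unfamiliar with the term, back-pressure here means that a queue between two steps refuses (or stops requesting) new items once it holds a configured number of them, so a fast producer cannot flood a slow consumer. The sketch below is a generic illustration of that idea, not NiFi's implementation; the `Connection` class and its method names are hypothetical.

```python
from collections import deque
from typing import Any, Deque, Optional


class Connection:
    """Hypothetical bounded queue that signals back-pressure at a threshold."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.queue: Deque[Any] = deque()

    def is_full(self) -> bool:
        # Back-pressure is engaged once the queue reaches the threshold.
        return len(self.queue) >= self.threshold

    def offer(self, item: Any) -> bool:
        """Accept the item unless back-pressure is engaged."""
        if self.is_full():
            return False  # the producer must wait and retry later
        self.queue.append(item)
        return True

    def poll(self) -> Optional[Any]:
        """Remove and return the oldest item, or None when empty."""
        return self.queue.popleft() if self.queue else None
```

A producer that checks `offer` and backs off on `False` never pushes the queue past its threshold. A producer that ignores the signal, or that is only checked between large bursts, can still flood the downstream step.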

With the 1 x 100 approach things have stabilised, but I'm still seeing
frequent hangs in the UI, forcing me to restart NiFi. (This usually happens
when I edit the flow, stop/start processors, or delete/edit/add
connections.) The refresh icon appears in the toolbar and sits there
forever. The date/time to the left of it remains stale.

- At one point it "crashed" when all queues were empty and 90% of the flow was
stopped. I was simply modifying the flow (I copy-pasted a processor, the
refresh icon kicked in and remained there; CPU was spiking as well).

There was some logging in the log files, but the UI / API was not responding.




Is this a known issue (perhaps with a JIRA ticket already created)?







Re: UI can take a very long time to become available

2016-06-24 Thread ddewaele
Thanks for the tip! The flow is up and running again.






Re: UI can take a very long time to become available

2016-06-24 Thread ddewaele
I've added a gist
<https://gist.github.com/ddewaele/f84c223fff2c18af4efe3b7fabcab946> to
show you some logging and thread dumps from when the issue occurs. In this
particular case I was unable to bring up NiFi.

I had to kill my MQTT broker so that the ConsumeMQTT processor wouldn't
pick up any new messages.
Then I was able to start NiFi and enter the UI. However, I was unable to
stop the ConsumeMQTT processor (spinning refresh wheel of death).

There is some activity in the logs, but no messages are processed and the
UI / API is unresponsive.

nifi.sh dump also doesn't work due to a socket timeout (it probably tries to
contact the API).








UI can take a very long time to become available

2016-06-24 Thread ddewaele
I have a simple dataflow running on 1.0.0-SNAPSHOT that looks like this:

ConsumeMQTT -> MergeContent (batch them up by 100 items) -> PutHDFS

The ConsumeMQTT is receiving about 100 msgs per 5 minutes.
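
As a rough illustration of the middle step of this flow, the sketch below groups a stream of messages into bundles of 100. It is not how MergeContent is implemented; the `batch` function is a hypothetical stand-in, and it flushes a final partial bundle when the input ends, whereas MergeContent's handling of incomplete bins depends on its configuration.

```python
from typing import Iterable, Iterator, List


def batch(messages: Iterable[bytes], size: int = 100) -> Iterator[List[bytes]]:
    """Yield lists of up to `size` messages; the last bundle may be smaller."""
    bundle: List[bytes] = []
    for msg in messages:
        bundle.append(msg)
        if len(bundle) == size:
            yield bundle
            bundle = []
    if bundle:
        # Flush whatever is left once the input is exhausted.
        yield bundle
```

At roughly 100 messages per 5 minutes, a bundle size of 100 produces about one merged file per 5 minutes for the final write step.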

From time to time I need to restart NiFi because it seems to hang or starts
processing very slowly. I believe this is due to memory-related issues
(sometimes I see OutOfMemory errors, even though it is already running with
a 6GB heap).

The problem is that when restarting the flow, it sometimes takes a very
long time before the UI comes up.

I had to restart the flow when there were 100 messages in the queue (50MB)
between ConsumeMQTT and MergeContent. (For some reason ConsumeMQTT was
grinding to a halt.)

After restarting, it took over 7 minutes for the UI to come up.



Sometimes the UI doesn't come up at all, even after waiting for 20 minutes.
At that point you see similar messages in the logs. Sometimes it fails with
an OutOfMemory error. At that point I never get the opportunity to stop the
flow, as I cannot access the UI / API.

Are there any tips on optimizing memory usage? And are there any plans to
make the UI / API available immediately upon startup, so that corrective
actions can be taken before NiFi starts grinding to a halt?






