ListS3 processor question (duplicate files / maintaining state)
Hi,

I have a question about the ListS3 processor. I'm using it to monitor the contents of an S3 bucket. The idea is that when new files come in, they are picked up and sent through the dataflow, using a FetchS3Object processor to process each file. This all works, but I have 2 questions:

1. Where does the ListS3 processor keep its state? How does it know which files it has already processed, and is there a way to clear this state?
2. Sometimes, when syncing files to my S3 buckets, I notice that the ListS3 processor picks up the same file twice. Is there a way to avoid that?

--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/ListS3-processor-question-duplicate-files-maintaining-state-tp12278.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.
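To make the second question concrete, here is a minimal sketch of how a listing step could remember what it has already emitted, by tracking the newest modification timestamp seen and the keys seen at that timestamp. This is an illustration only, not a statement of how ListS3 is implemented; `ListingState` and `list_new` are hypothetical names, and the integer timestamps are made up.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set, Tuple


@dataclass
class ListingState:
    """Hypothetical persisted state: newest timestamp seen, and the keys seen at it."""
    latest_ts: int = 0
    keys_at_latest: Set[str] = field(default_factory=set)


def list_new(objects: Iterable[Tuple[str, int]], state: ListingState) -> List[str]:
    """Return the keys not emitted before, given (key, last_modified) pairs."""
    new_keys: List[str] = []
    # Process oldest first so the state only ever moves forward in time.
    for key, ts in sorted(objects, key=lambda o: o[1]):
        if ts < state.latest_ts:
            continue  # older than anything already listed
        if ts == state.latest_ts and key in state.keys_at_latest:
            continue  # same timestamp and already emitted
        if ts > state.latest_ts:
            state.latest_ts = ts
            state.keys_at_latest = set()
        state.keys_at_latest.add(key)
        new_keys.append(key)
    return new_keys
```

Under these assumptions, a sync tool that rewrites an unchanged object gives it a newer timestamp, so the same key would be listed a second time. Clearing the state would simply mean starting again from an empty `ListingState`, which re-lists everything.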
Re: UI can take a very long time to become available
Hi,

I just did a fresh build of 1.0.0-SNAPSHOT and it seems to have solved the issue. I replayed the 2 scenarios without a problem (I was using an "old" snapshot from 2 weeks ago).

Thanks a lot for the support on this ...

--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/UI-can-take-a-very-long-time-to-become-available-tp12201p12276.html
Re: UI can take a very long time to become available
OK, I will provide some more details tomorrow (need to sleep now). This is on 1.0.0-SNAPSHOT running on Mac OS X with JDK 8.

On Sunday, 26 June 2016, Joe Witt [via Apache NiFi Developer List] <ml-node+s39713n12270...@n7.nabble.com> wrote:

> Ok that was definitely good to see (the video). I tried a similar
> setup and could not recreate it.
>
> Let's see if we can get the perspective of a few other folks.
>
> If you can get a thread dump at all during the spinning that would be
> really helpful.
>
> Thanks
> Joe
>
> On Sat, Jun 25, 2016 at 4:28 PM, ddewaele <[hidden email]> wrote:
>
> > Here you can see the 2 crash scenarios mentioned in the previous post :
> > https://dl.dropboxusercontent.com/u/13246619/nifi-crash/NifiCrash.mp4
> >
> > - First one is deleting a processor with a non-empty queue attached
> > - Second one is deleting a processor with a destination that is running
> >
> > both actions fail (error msgs in a popup). However, any action in the UI
> > will now cause the flow to freeze.
> >
> > Hope this helps

--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/UI-can-take-a-very-long-time-to-become-available-tp12201p12271.html
Re: UI can take a very long time to become available
Here you can see the 2 crash scenarios mentioned in the previous post:
https://dl.dropboxusercontent.com/u/13246619/nifi-crash/NifiCrash.mp4

- The first one is deleting a processor with a non-empty queue attached
- The second one is deleting a processor with a destination that is running

Both actions fail (error messages in a popup). However, any action in the UI will now cause the flow to freeze.

Hope this helps

--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/UI-can-take-a-very-long-time-to-become-available-tp12201p12269.html
Re: UI can take a very long time to become available
I got rid of the large merges yesterday and switched to a two-phase merge. That indeed solved a lot of issues.

But I think I can reproduce the UI hang. It seems to always occur after performing an action in the flow where a previous action resulted in an error / warning message.

Examples:

- Attempting to delete a processor that is in the process of shutting down shows an error (popup)
- Deleting a processor that has a non-empty queue attached to it results in an error (popup)

If you execute a simple refresh on the UI after these messages, it will hang.

(I can post a video of the crashes if it would help.)

--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/UI-can-take-a-very-long-time-to-become-available-tp12201p12267.html
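For readers unfamiliar with the term, the two-phase merge mentioned above can be sketched as follows: instead of holding every item for one huge merge, items are first merged into small bundles, and those bundles are then merged again. This is a minimal illustration in plain Python, not NiFi code; the function names and the batch sizes are assumptions, since the post does not state the sizes used. In a NiFi flow the equivalent would be two merge steps in series.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller batch


def two_phase_merge(items: Iterable[bytes], first_size: int, second_size: int) -> List[bytes]:
    """Merge items into bundles of `first_size`, then merge `second_size` bundles each."""
    # Phase one: only `first_size` small items are held at a time.
    phase_one = (b"".join(batch) for batch in batches(items, first_size))
    # Phase two: only `second_size` bundles are held at a time.
    return [b"".join(batch) for batch in batches(phase_one, second_size)]
```

With `first_size=100` and `second_size=100`, for example, 10,000 inputs end up in a single output, while no single merge step ever tracks more than 100 entries at once.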
Re: UI can take a very long time to become available
Thanks a lot ... I was actually already at a 6 GB heap. I guess the large number of merges (1,000,000) was causing the original issue.

The hangs in the UI are, I think, unrelated. Again, I'm on 1.0.0-SNAPSHOT. I have a very simple flow right now that reads an S3 bucket and writes it to a file. The bucket contains 6 files (50 KB each).

I don't know exactly how to reproduce it, but this usually does it:

- Stop a processor that's active
- Delete it (pressing backspace quickly; NiFi says the processor is still running, probably because it is still shutting down)
- Delete it again (pressing backspace quickly; the UI / NiFi process hangs)

When it hangs, it hangs good... I can't take thread dumps of the Java process anymore, and cannot attach with jstat.

--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/UI-can-take-a-very-long-time-to-become-available-tp12201p12265.html
Re: UI can take a very long time to become available
That did the trick... I assumed it would be something like that, but I also assumed it would apply sufficient back-pressure to stop the flow from getting flooded.

With the 1 x 100 approach things have stabilised, but I'm still seeing frequent hangs in the UI, forcing me to restart NiFi. This usually happens when I edit the flow (stopping/starting processors, deleting/editing/adding connections). The refresh icon appears in the toolbar and sits there forever. The date/time to the left of it remains stale.

- At one point it "crashed" when all queues were empty and 90% of the flow was stopped. I was simply modifying the flow (I did a copy-paste of a processor; the refresh icon kicked in and remained there, and the CPU was spiking as well). There was some logging in the log files, but the UI / API were not responding.

Is this a known issue (perhaps with a JIRA ticket already created)?

--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/UI-can-take-a-very-long-time-to-become-available-tp12201p12246.html
Re: UI can take a very long time to become available
Thx for the tip! Flow is up and running again.

--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/UI-can-take-a-very-long-time-to-become-available-tp12201p12209.html
Re: UI can take a very long time to become available
I've added a gist <https://gist.github.com/ddewaele/f84c223fff2c18af4efe3b7fabcab946> to show you some logging and thread dumps from when the issue occurs.

In this particular case I was unable to bring up NiFi. I had to kill my MQTT broker so that the ConsumeMQTT processor wouldn't pick up any new messages. Then I was able to start NiFi and enter the UI.

However, I was unable to stop the ConsumeMQTT processor (spinning refresh wheel of death). There is some activity in the logs, but no messages are processed and the UI / API is unresponsive.

nifi.sh dump also doesn't work, due to a socket timeout (it probably tries to contact the API).

--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/UI-can-take-a-very-long-time-to-become-available-tp12201p12204.html
UI can take a very long time to become available
I have a simple dataflow running on 1.0.0-SNAPSHOT that looks like this:

ConsumeMQTT -> MergeContent (batch them up by 100 items) -> PutHDFS

The ConsumeMQTT processor is receiving about 100 messages per 5 minutes.

From time to time I need to restart NiFi because it seems to hang or starts processing very slowly. I believe this is due to memory-related issues (sometimes I see OutOfMemory errors, but I already have it running on a 6 GB heap).

The problem is that when restarting the flow, it sometimes takes a very long time before the UI comes up.

I had to restart the flow when there were 100 messages in the queue (50 MB) between ConsumeMQTT and MergeContent. (For some reason ConsumeMQTT was grinding to a halt.) After restarting, it took over 7 minutes for the UI to come up.

Sometimes the UI doesn't come up at all, even after waiting for 20 minutes. At that point you see similar messages in the logs. Sometimes it fails with an out-of-memory error. At that point I never get the opportunity to stop the flow, as I cannot access the UI / API.

Any tips on optimizing memory usage? And are there any plans to make the UI / API available immediately upon startup, so that corrective actions can take place before NiFi starts grinding to a halt?

--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/UI-can-take-a-very-long-time-to-become-available-tp12201.html