I'll do that. Thank you again, Joe.

On Thu, May 25, 2017 at 11:14 AM, Joe Witt <joe.w...@gmail.com> wrote:
> I would increase index threads from the default of 2 (I think) to a bit
> larger. You can also tune the other properties like shard size and
> the like. They should be described in the admin docs.
>
> On Thu, May 25, 2017 at 11:12 AM, James McMahon <jsmcmah...@gmail.com>
> wrote:
> > Thank you Joe. Do you advise, then, that we tune some parameters now, or
> > is it acceptable to allow NiFi to ..... self-regulate .... as it appears
> > to be doing? If you suggest tuning, which ones should I look at -
> > index.threads? I notice that at present I have that set to a robust 1.
> >
> > That improvement with 1.2.0 sounds like it will make a big difference.
> > Sadly, as you recall, it may be some time before 1.2.x is available to me.
> >
> > On Thu, May 25, 2017 at 11:05 AM, Joe Witt <joe.w...@gmail.com> wrote:
> >>
> >> Jim,
> >>
> >> That provenance warning is not related to archive/retention. It is
> >> provenance telling you it can only index events so fast, and at present
> >> it is falling behind, so it will slow the flow to ensure things don't get
> >> too far out of balance. However, there are configuration properties
> >> that let you give provenance indexing more threads. Also, we created
> >> a new provenance implementation, available in NiFi 1.2.0, which is
> >> multiple times faster with immediate indexing.
> >>
> >> Thanks
> >>
> >> On Thu, May 25, 2017 at 11:03 AM, James McMahon <jsmcmah...@gmail.com>
> >> wrote:
> >> > Absolutely. Thank you for looking into this, Aldrin.
> >> >
> >> > I do indeed have NiFi configured as a service. I've stopped and started
> >> > it dozens of times through the life of my workflow development these
> >> > recent months. It has always started up like a champ. On this
> >> > particular occasion I did this:
> >> > service nifi stop
> >> > as user nifi. It shut down, and the logs presented no errors.
> >> > I then did this:
> >> > service nifi start
> >> > as user nifi.
> >> > The bootstrap log contained the INFO messages I shared with you above.
> >> >
> >> > Our data flow has not taxed NiFi much at all. There was no data
> >> > processing through at the time. We had recently done two bulk ingests
> >> > of large data directories. The content repo had indicated 46% full, but
> >> > after I let it sit overnight it had dropped back down to a typical level
> >> > of 3-6%. As I learned yesterday, my archive retention being set to
> >> > 12 hours explained why I was seeing the content repo hold on to all that
> >> > capacity after all my 100,000 files had processed through late yesterday.
> >> >
> >> > Early this morning I modified my conf/nifi.properties to drop my archive
> >> > retention to 1 hour from 12 hours. This was when I tried and failed to
> >> > restart.
> >> >
> >> > We've since rebooted the host and NiFi came right up. With my new
> >> > archive retention value in place, I tried processing about 16,000 files
> >> > through. They flew through, but I have noticed a warning that I believe
> >> > is caused by my change to archive retention:
> >> > WARNING The rate of the dataflow is exceeding the provenance recording
> >> > rate. Slowing down flow to accommodate.
> >> >
> >> > What else can I tell you? I suppose it would help to mention that my
> >> > three major repos - content, flowfile, provenance - are on separate
> >> > local disk devices.
> >> >
> >> > My workflow load peaks when I try to process approximately 100,000 files
> >> > totaling 50 GB through the flow. The content repo maxes out at 46% of
> >> > our 50 GB capacity. The provenance and flowfile repos never peak into
> >> > the double digits. I do some custom parsing and custom logging in
> >> > InvokeScriptedProcessors. I employ HandleHttpResponse and
> >> > HandleHttpRequest processors.
> >> >
> >> > I've not yet watched memory usage on the box as I run, but I'll try to
> >> > use a 'watch -n [#] free -m' later to see what happens. My NiFi instance
> >> > runs with JVM memory params in bootstrap.conf of -Xms4096m and -Xmx8192m.
> >> >
> >> > Jim
> >> >
> >> > On Thu, May 25, 2017 at 10:38 AM, Aldrin Piri <aldrinp...@gmail.com>
> >> > wrote:
> >> >>
> >> >> If you happen to remember, could you be more specific about your
> >> >> sequence of operations? Is NiFi installed as a service? If so, was it
> >> >> restarted? Did you just issue a nifi.sh restart?
> >> >>
> >> >> Do you have any CM tooling (Puppet, Chef, Salt, etc.) that is managing
> >> >> this process/system?
> >> >>
> >> >> Could you tell us what the bootstrap log says prior to those lines in
> >> >> terms of shutting down?
> >> >>
> >> >> Would you be able to describe the load exerted on the system by the
> >> >> flow? A bit of an amorphous question, but is/was the system heavily
> >> >> taxed running NiFi?
> >> >>
> >> >> The section you hit _should_ only be hit if NiFi (the flow process and
> >> >> not the bootstrap) terminates for some reason (e.g., hit an out of
> >> >> memory case). I have a few notions as to how the right confluence of
> >> >> events could have gotten you there otherwise, so any additional details
> >> >> would be great to vet their possible culpability.
> >> >>
> >> >> Thanks!
> >> >>
> >> >> On Thu, May 25, 2017 at 10:10 AM, James McMahon <jsmcmah...@gmail.com>
> >> >> wrote:
> >> >>>
> >> >>> I did inspect the log more closely. It offers little additional
> >> >>> insight. Here is what it says (unable to export, had to transcribe
> >> >>> myself):
> >> >>>
> >> >>> [date] [time],### INFO [main] org.apache.nifi.bootstrap.RunNiFi Status File no longer exists.
> >> >>> Will not restart NiFi
> >> >>> [date] [time],### INFO [main] o.a.n.b.NotificationServiceManager
> >> >>> Successfully loaded the following 0 services: [ ]
> >> >>> [date] [time],### INFO [main] org.apache.nifi.bootstrap.RunNiFi
> >> >>> Registered no Notification Services for Notification Type NIFI_STARTED
> >> >>> [date] [time],### INFO [main] org.apache.nifi.bootstrap.RunNiFi
> >> >>> Registered no Notification Services for Notification Type NIFI_STOPPED
> >> >>> [date] [time],### INFO [main] org.apache.nifi.bootstrap.RunNiFi
> >> >>> Registered no Notification Services for Notification Type NIFI_DIED
> >> >>> [date] [time],### INFO [main] org.apache.nifi.bootstrap.Command
> >> >>> Apache NiFi is not running
> >> >>>
> >> >>> My hope is that we can figure out what happens to this status file,
> >> >>> and how I can prevent it from disappearing.
> >> >>>
> >> >>> Jim
> >> >>>
> >> >>> On Thu, May 25, 2017 at 9:37 AM, Joe Witt <joe.w...@gmail.com> wrote:
> >> >>>>
> >> >>>> I don't think rebooting the system had anything to do with NiFi's
> >> >>>> ability to start up. But I'm not sure I understand that particular
> >> >>>> part of the logic in the code in terms of the case it was defending
> >> >>>> against.
> >> >>>>
> >> >>>> On Thu, May 25, 2017 at 9:34 AM, James McMahon <jsmcmah...@gmail.com>
> >> >>>> wrote:
> >> >>>> > Will do, Joe. I'll dig for that now.
> >> >>>> >
> >> >>>> > Infrastructure Group did reboot the box, which had been up and
> >> >>>> > running for nearly two months. NiFi did indeed come up following the
> >> >>>> > reboot. I still want to try and get you this log information so that
> >> >>>> > I can learn what triggers such a situation, and whether there is a
> >> >>>> > more refined way to solve it than a full system reboot.
> >> >>>> > There are other things running on the resource, and I should try to
> >> >>>> > minimize the impact on them of a full reboot.
> >> >>>> >
> >> >>>> > Let me see about that log content. Thank you again.
> >> >>>> >
> >> >>>> > On Thu, May 25, 2017 at 9:25 AM, Joe Witt <joe.w...@gmail.com>
> >> >>>> > wrote:
> >> >>>> >>
> >> >>>> >> Jim,
> >> >>>> >>
> >> >>>> >> The code relevant to that log output is here [1]. Can you share
> >> >>>> >> the bootstrap output before/after that output?
> >> >>>> >>
> >> >>>> >> [1]
> >> >>>> >> https://github.com/apache/nifi/blob/rel/nifi-0.7.1/nifi-bootstrap/src/main/java/org/apache/nifi/bootstrap/RunNiFi.java
> >> >>>> >>
> >> >>>> >> Thanks
> >> >>>> >> Joe
> >> >>>> >>
> >> >>>> >> On Thu, May 25, 2017 at 9:11 AM, James McMahon
> >> >>>> >> <jsmcmah...@gmail.com>
> >> >>>> >> wrote:
> >> >>>> >> > Am running NiFi 0.7.x. Have been running with great stability
> >> >>>> >> > for a long period of time. Tried this morning to make this
> >> >>>> >> > change in my nifi.properties conf file:
> >> >>>> >> >
> >> >>>> >> > nifi.content.repository.archive.max.retention.period=1 hour
> >> >>>> >> >
> >> >>>> >> > Reduced from the default of 12 hours. Relatively simple change,
> >> >>>> >> > requires a nifi restart to take effect.
> >> >>>> >> >
> >> >>>> >> > My restart attempt throws no errors to the nifi app log, but in
> >> >>>> >> > the bootstrap log I do see this:
> >> >>>> >> > org.apache.nifi.bootstrap.RunNiFi Status file no longer exists.
> >> >>>> >> > Will not restart NiFi
> >> >>>> >> >
> >> >>>> >> > I've done some digging, and the only suggestion I could find is
> >> >>>> >> > to reboot the box in hopes of resolving it. Am reaching out to
> >> >>>> >> > the infrastructure group that owns the server now, asking them
> >> >>>> >> > to do so.
> >> >>>> >> > I would also like, in parallel, to understand why this happened
> >> >>>> >> > and where, exactly, this status file should be.
> >> >>>> >> >
> >> >>>> >> > Can I resolve this by manually recreating such a status file
> >> >>>> >> > with certain permissions and ownership?
> >> >>>> >> >
> >> >>>> >> > Thanks in advance for your help. -Jim
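
For anyone reading this thread later: the settings discussed above all live in conf/nifi.properties. The fragment below is a minimal sketch, assuming the property names as documented in the NiFi Administration Guide. The values for index threads and shard size are illustrative assumptions, not recommendations made by anyone in the thread, so check them against the admin docs for your NiFi version.

```properties
# conf/nifi.properties (fragment; values marked "assumption" are illustrative)

# Archive retention for the content repository. The thread lowers this
# from the default of 12 hours.
nifi.content.repository.archive.max.retention.period=1 hour

# Threads used to index provenance events. Jim reports having this at 1;
# Joe suggests raising it. The value 4 is an assumption, not from the thread.
nifi.provenance.repository.index.threads=4

# Provenance index shard size (the "shard size" Joe mentions).
# 500 MB is an assumption; tune per the admin docs.
nifi.provenance.repository.index.shard.size=500 MB
```

As noted in the original message, a NiFi restart is needed for changes in this file to take effect.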