Thanks Koji, I have already updated the logback configuration to produce more verbose logs. I was trying to reply to you with the verbose nifi logs but since I switched to use the WriteAheadProvenanceRepository implementation, up till now I haven't seen the error again. I will continue to check when the error might occur and post the logs here if needed. Once again thanks very much for your help.
Regards, Ben 2017-12-25 15:37 GMT+08:00 Koji Kawamura <ijokaruma...@gmail.com>: > Hi Ben, > > You can make NiFi log more verbose by editing: > NIFI_HOME/conf/logback.xml > > For example, adding following entry will reveal how NiFi repositories run: > > <logger name="org.apache.nifi" level="INFO"/> > <!-- Example --> > <logger name="org.apache.nifi.controller.repository" level="DEBUG"/> > <logger name="org.apache.nifi.processors" level="WARN"/> > > Thanks, > Koji > > On Mon, Dec 25, 2017 at 4:30 PM, 尹文才 <batman...@gmail.com> wrote: > > Hi Koji, I also didn't find anything related to the unexpected shutdown > in > > my logs, is there anything I could do to make NIFI log more verbose > > information to the logs? > > > > Regards, > > Ben > > > > 2017-12-25 14:56 GMT+08:00 Koji Kawamura <ijokaruma...@gmail.com>: > > > >> Hi Ben, > >> > >> I looked at the log and I expected to see some indication for the > >> cause of shutdown, but couldn't find any. > >> The PersistentProvenanceRepository rate warning is just a warning, and > >> it shouldn't be the trigger of an unexpected shutdown. I suspect other > >> reasons such as OOM killer, but I can't do any further investigation > >> with only these logs. > >> > >> Thanks, > >> Koji > >> > >> On Mon, Dec 25, 2017 at 3:46 PM, 尹文才 <batman...@gmail.com> wrote: > >> > Hi Koji, one more thing, do you have any idea why my first issue > leads to > >> > the unexpected shutdown of NIFI? according to the words, it will just > >> slow > >> > down the flow. thanks. > >> > > >> > Regards, > >> > Ben > >> > > >> > 2017-12-25 14:31 GMT+08:00 尹文才 <batman...@gmail.com>: > >> > > >> >> Hi Koji, thanks for your help, for the first issue, I will switch to > use > >> >> the WriteAheadProvenanceReopsitory implementation. > >> >> > >> >> For the second issue, I have uploaded the relevant part of my log > file > >> >> onto my google drive, the link is: > >> >> https://drive.google.com/open?id=1oxAkSUyYZFy6IWZSeWqHI8e9Utnw1XAj > >> >> > >> >> You mean a custom processor could possibly process a flowfile twice > only > >> >> when it's trying to commit the session but it's interrupted so the > >> flowfile > >> >> still remains inside the original queue(like NIFI went down)? > >> >> > >> >> If you need to see the full log file, please let me know, thanks. > >> >> > >> >> Regards, > >> >> Ben > >> >> > >> >> 2017-12-25 13:51 GMT+08:00 Koji Kawamura <ijokaruma...@gmail.com>: > >> >> > >> >>> Hi Ben, > >> >>> > >> >>> For your 2nd issue, NiFi commits a process session in Processor > >> >>> onTrigger when it's executed by NiFi flow engine by calling > >> >>> session.commit(). > >> >>> https://github.com/apache/nifi/blob/master/nifi-api/src/main > >> >>> /java/org/apache/nifi/processor/AbstractProcessor.java#L28 > >> >>> Once a process session is committed, the FlowFile state (including > >> >>> which queue it is in) is persisted to disk. > >> >>> > >> >>> It's possible for a Processor to process the same FlowFile more than > >> >>> once, if it has done its job, but failed to commit the session. > >> >>> For example, if your custom processor created a temp table from a > >> >>> FlowFile. Then before the process session is committed, something > >> >>> happened and NiFi process session was rollback. In this case, the > >> >>> target database is already updated (the temp table is created), but > >> >>> NiFi FlowFile stays in the incoming queue. If the FlowFile is > >> >>> processed again, the processor will get an error indicating the > table > >> >>> already exists. > >> >>> > >> >>> I tried to look at the logs you attached, but attachments do not > seem > >> >>> to be delivered to this ML. I don't see anything attached. > >> >>> > >> >>> Thanks, > >> >>> Koji > >> >>> > >> >>> > >> >>> On Mon, Dec 25, 2017 at 1:43 PM, Koji Kawamura < > ijokaruma...@gmail.com > >> > > >> >>> wrote: > >> >>> > Hi Ben, > >> >>> > > >> >>> > Just a quick recommendation for your first issue, 'The rate of the > >> >>> > dataflow is exceeding the provenance recording rate' warning > message. > >> >>> > I'd recommend using WriteAheadProvenanceRepository instead of > >> >>> > PersistentProvenanceRepository. WriteAheadProvenanceRepository > >> >>> > provides better performance. > >> >>> > Please take a look at the documentation here. > >> >>> > https://nifi.apache.org/docs/nifi-docs/html/administration-g > >> >>> uide.html#provenance-repository > >> >>> > > >> >>> > Thanks, > >> >>> > Koji > >> >>> > > >> >>> > On Mon, Dec 25, 2017 at 12:56 PM, 尹文才 <batman...@gmail.com> > wrote: > >> >>> >> Hi guys, I'm using nifi 1.4.0 to do some ETL work in my team and > I > >> have > >> >>> >> encountered 2 problems during my testing. > >> >>> >> > >> >>> >> The first problem is I found the nifi bulletin board was showing > the > >> >>> >> following warning to me: > >> >>> >> > >> >>> >> 2017-12-25 01:31:00,460 WARN [Provenance Maintenance Thread-1] > >> >>> >> o.a.n.p.PersistentProvenanceRepository The rate of the dataflow > is > >> >>> exceeding > >> >>> >> the provenance recording rate. Slowing down flow to accommodate. > >> >>> Currently, > >> >>> >> there are 96 journal files (158278228 bytes) and threshold for > >> >>> blocking is > >> >>> >> 80 (1181116006 bytes) > >> >>> >> > >> >>> >> I don't quite understand what this means, and I found also inside > >> the > >> >>> >> bootstrap log that nifi restarted itself: > >> >>> >> > >> >>> >> 2017-12-25 01:31:19,249 WARN [main] org.apache.nifi.bootstrap. > >> RunNiFi > >> >>> Apache > >> >>> >> NiFi appears to have died. Restarting... > >> >>> >> > >> >>> >> Is there anything I could do so solve this problem? > >> >>> >> > >> >>> >> The second problem is about the FlowFiles inside my flow, I > actually > >> >>> >> implemented a few custom processors to do the ETL work. one is to > >> >>> extract > >> >>> >> multiple tables from sql server and for each flowfile out of it, > it > >> >>> contains > >> >>> >> an attribute > >> >>> >> specifying the name of the temp ods table to create, and the > second > >> >>> >> processor is to get all flowfiles from the first processor and > >> create > >> >>> all > >> >>> >> the temp ods tables specified in the flowfiles' attribute. > >> >>> >> I found inside the app log that one of the temp table name > already > >> >>> existed > >> >>> >> when trying to create the temp table, and it caused sql > exception. > >> >>> >> After taking some time investigating in the log, I found the sql > >> query > >> >>> was > >> >>> >> executed twice in the second processor, once before nifi restart, > >> the > >> >>> second > >> >>> >> execution was done right after nifi restart: > >> >>> >> > >> >>> >> 2017-12-25 01:32:35,639 ERROR [Timer-Driven Process Thread-7] > >> >>> >> c.z.nifi.processors.ExecuteSqlCommand > >> >>> >> ExecuteSqlCommand[id=3c97dfd8-aaa4-3a37-626e-fed5a4822d14] > >> >>> 执行sql语句失败:SELECT > >> >>> >> TOP 0 * INTO tmp.ods_bd_e_reason_20171225013007005_5567 FROM > >> >>> >> dbo.ods_bd_e_reason; > >> >>> >> > >> >>> >> > >> >>> >> I have read the document of nifi in depth but I'm still not very > >> aware > >> >>> of > >> >>> >> nifi's internal mechanism, my suspect is nifi didn't manage to > >> >>> checkpoint > >> >>> >> the flowfile's state(which queue it was in) in memory into > flowfile > >> >>> >> repository > >> >>> >> before it was dead and after restarting it recovered the > flowfile's > >> >>> state > >> >>> >> from flowfile repository and then the flowfile went through the > >> second > >> >>> >> processor again and thus the sql was executed twice. Is this > >> correct? > >> >>> >> > >> >>> >> I've attached the relevant part of app log, thanks. > >> >>> >> > >> >>> >> Regards, > >> >>> >> Ben > >> >>> > >> >> > >> >> > >> >