Hi Ben, You can make NiFi log more verbose by editing: NIFI_HOME/conf/logback.xml
For example, adding following entry will reveal how NiFi repositories run: <logger name="org.apache.nifi" level="INFO"/> <!-- Example --> <logger name="org.apache.nifi.controller.repository" level="DEBUG"/> <logger name="org.apache.nifi.processors" level="WARN"/> Thanks, Koji On Mon, Dec 25, 2017 at 4:30 PM, 尹文才 <batman...@gmail.com> wrote: > Hi Koji, I also didn't find anything related to the unexpected shutdown in > my logs, is there anything I could do to make NIFI log more verbose > information to the logs? > > Regards, > Ben > > 2017-12-25 14:56 GMT+08:00 Koji Kawamura <ijokaruma...@gmail.com>: > >> Hi Ben, >> >> I looked at the log and I expected to see some indication for the >> cause of shutdown, but couldn't find any. >> The PersistentProvenanceRepository rate warning is just a warning, and >> it shouldn't be the trigger of an unexpected shutdown. I suspect other >> reasons such as OOM killer, but I can't do any further investigation >> with only these logs. >> >> Thanks, >> Koji >> >> On Mon, Dec 25, 2017 at 3:46 PM, 尹文才 <batman...@gmail.com> wrote: >> > Hi Koji, one more thing, do you have any idea why my first issue leads to >> > the unexpected shutdown of NIFI? according to the words, it will just >> slow >> > down the flow. thanks. >> > >> > Regards, >> > Ben >> > >> > 2017-12-25 14:31 GMT+08:00 尹文才 <batman...@gmail.com>: >> > >> >> Hi Koji, thanks for your help, for the first issue, I will switch to use >> >> the WriteAheadProvenanceReopsitory implementation. >> >> >> >> For the second issue, I have uploaded the relevant part of my log file >> >> onto my google drive, the link is: >> >> https://drive.google.com/open?id=1oxAkSUyYZFy6IWZSeWqHI8e9Utnw1XAj >> >> >> >> You mean a custom processor could possibly process a flowfile twice only >> >> when it's trying to commit the session but it's interrupted so the >> flowfile >> >> still remains inside the original queue(like NIFI went down)? >> >> >> >> If you need to see the full log file, please let me know, thanks. >> >> >> >> Regards, >> >> Ben >> >> >> >> 2017-12-25 13:51 GMT+08:00 Koji Kawamura <ijokaruma...@gmail.com>: >> >> >> >>> Hi Ben, >> >>> >> >>> For your 2nd issue, NiFi commits a process session in Processor >> >>> onTrigger when it's executed by NiFi flow engine by calling >> >>> session.commit(). >> >>> https://github.com/apache/nifi/blob/master/nifi-api/src/main >> >>> /java/org/apache/nifi/processor/AbstractProcessor.java#L28 >> >>> Once a process session is committed, the FlowFile state (including >> >>> which queue it is in) is persisted to disk. >> >>> >> >>> It's possible for a Processor to process the same FlowFile more than >> >>> once, if it has done its job, but failed to commit the session. >> >>> For example, if your custom processor created a temp table from a >> >>> FlowFile. Then before the process session is committed, something >> >>> happened and NiFi process session was rollback. In this case, the >> >>> target database is already updated (the temp table is created), but >> >>> NiFi FlowFile stays in the incoming queue. If the FlowFile is >> >>> processed again, the processor will get an error indicating the table >> >>> already exists. >> >>> >> >>> I tried to look at the logs you attached, but attachments do not seem >> >>> to be delivered to this ML. I don't see anything attached. >> >>> >> >>> Thanks, >> >>> Koji >> >>> >> >>> >> >>> On Mon, Dec 25, 2017 at 1:43 PM, Koji Kawamura <ijokaruma...@gmail.com >> > >> >>> wrote: >> >>> > Hi Ben, >> >>> > >> >>> > Just a quick recommendation for your first issue, 'The rate of the >> >>> > dataflow is exceeding the provenance recording rate' warning message. >> >>> > I'd recommend using WriteAheadProvenanceRepository instead of >> >>> > PersistentProvenanceRepository. WriteAheadProvenanceRepository >> >>> > provides better performance. >> >>> > Please take a look at the documentation here. >> >>> > https://nifi.apache.org/docs/nifi-docs/html/administration-g >> >>> uide.html#provenance-repository >> >>> > >> >>> > Thanks, >> >>> > Koji >> >>> > >> >>> > On Mon, Dec 25, 2017 at 12:56 PM, 尹文才 <batman...@gmail.com> wrote: >> >>> >> Hi guys, I'm using nifi 1.4.0 to do some ETL work in my team and I >> have >> >>> >> encountered 2 problems during my testing. >> >>> >> >> >>> >> The first problem is I found the nifi bulletin board was showing the >> >>> >> following warning to me: >> >>> >> >> >>> >> 2017-12-25 01:31:00,460 WARN [Provenance Maintenance Thread-1] >> >>> >> o.a.n.p.PersistentProvenanceRepository The rate of the dataflow is >> >>> exceeding >> >>> >> the provenance recording rate. Slowing down flow to accommodate. >> >>> Currently, >> >>> >> there are 96 journal files (158278228 bytes) and threshold for >> >>> blocking is >> >>> >> 80 (1181116006 bytes) >> >>> >> >> >>> >> I don't quite understand what this means, and I found also inside >> the >> >>> >> bootstrap log that nifi restarted itself: >> >>> >> >> >>> >> 2017-12-25 01:31:19,249 WARN [main] org.apache.nifi.bootstrap. >> RunNiFi >> >>> Apache >> >>> >> NiFi appears to have died. Restarting... >> >>> >> >> >>> >> Is there anything I could do so solve this problem? >> >>> >> >> >>> >> The second problem is about the FlowFiles inside my flow, I actually >> >>> >> implemented a few custom processors to do the ETL work. one is to >> >>> extract >> >>> >> multiple tables from sql server and for each flowfile out of it, it >> >>> contains >> >>> >> an attribute >> >>> >> specifying the name of the temp ods table to create, and the second >> >>> >> processor is to get all flowfiles from the first processor and >> create >> >>> all >> >>> >> the temp ods tables specified in the flowfiles' attribute. >> >>> >> I found inside the app log that one of the temp table name already >> >>> existed >> >>> >> when trying to create the temp table, and it caused sql exception. >> >>> >> After taking some time investigating in the log, I found the sql >> query >> >>> was >> >>> >> executed twice in the second processor, once before nifi restart, >> the >> >>> second >> >>> >> execution was done right after nifi restart: >> >>> >> >> >>> >> 2017-12-25 01:32:35,639 ERROR [Timer-Driven Process Thread-7] >> >>> >> c.z.nifi.processors.ExecuteSqlCommand >> >>> >> ExecuteSqlCommand[id=3c97dfd8-aaa4-3a37-626e-fed5a4822d14] >> >>> 执行sql语句失败:SELECT >> >>> >> TOP 0 * INTO tmp.ods_bd_e_reason_20171225013007005_5567 FROM >> >>> >> dbo.ods_bd_e_reason; >> >>> >> >> >>> >> >> >>> >> I have read the document of nifi in depth but I'm still not very >> aware >> >>> of >> >>> >> nifi's internal mechanism, my suspect is nifi didn't manage to >> >>> checkpoint >> >>> >> the flowfile's state(which queue it was in) in memory into flowfile >> >>> >> repository >> >>> >> before it was dead and after restarting it recovered the flowfile's >> >>> state >> >>> >> from flowfile repository and then the flowfile went through the >> second >> >>> >> processor again and thus the sql was executed twice. Is this >> correct? >> >>> >> >> >>> >> I've attached the relevant part of app log, thanks. >> >>> >> >> >>> >> Regards, >> >>> >> Ben >> >>> >> >> >> >> >>