Hi,

I recently upgraded our cluster to 1.9 from 1.7.1 and I have seen some weird 
behaviors relating to the flowfile repo.

Since upgrading i have had at-least 3 occurrences of the following error .

nifi-app_2019-03-07_15.0.log:Caused by: 
java.nio.file.FileAlreadyExistsException: 
/flowfile_repo/nifi-1.9.0/journals/overflow-6297272
nifi-app_2019-03-07_21.0.log:Caused by: 
java.nio.file.FileAlreadyExistsException: 
/flowfile_repo/nifi-1.9.0/journals/overflow-8385089
nifi-app_2019-03-08_00.0.log:Caused by: 
java.nio.file.FileAlreadyExistsException: 
/flowfile_repo/nifi-1.9.0/journals/overflow-9081799

This seems to result in the total loss of those flow files.
The flowfile_repo is located on 1+1 SSD drives.

I have however not found any trances of either "To many open files" or "That 
the SSD went full (out of disk) during the process".
The error also happened during a slow time of the day (1000 flowfiles /second)


The error starts like this, and then i got 4.5GB of errors in my logfiles (47 
logfiles @ 100mb).

2019-03-06 14:33:20,305 ERROR [Timer-Driven Process Thread-47] 
o.a.n.c.r.StandardProcessSession Failed to commit session 
org.apache.nifi.processor.exception.ProcessException: FlowFile Repository 
failed to update. Will roll back.
2019-03-06 14:33:20,306 ERROR [Timer-Driven Process Thread-47] 
o.a.n.processors.standard.ConvertRecord 
ConvertRecord[id=32781c92-fab3-34de-b1ff-6138f8788701] Failed to commit session 
StandardProcessSession[id=244416156] due to 
org.apache.nifi.processor.exception.ProcessException: FlowFile Repository 
failed to update; rolling back: 
org.apache.nifi.processor.exception.ProcessException: FlowFile Repository 
failed to update
org.apache.nifi.processor.exception.ProcessException: FlowFile Repository 
failed to update
at 
org.apache.nifi.controller.repository.StandardProcessSession.commit(StandardProcessSession.java:413)
at 
org.apache.nifi.controller.repository.StandardProcessSession.commit(StandardProcessSession.java:343)
at 
org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:256)
at 
org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.file.FileAlreadyExistsException: 
/flowfile_repo/nifi-1.9.0/journals/overflow-874061
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:88)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at 
sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
at java.nio.file.Files.createDirectory(Files.java:674)
at 
org.apache.nifi.wali.LengthDelimitedJournal.update(LengthDelimitedJournal.java:249)
at 
org.apache.nifi.wali.SequentialAccessWriteAheadLog.update(SequentialAccessWriteAheadLog.java:123)
at 
org.apache.nifi.controller.repository.WriteAheadFlowFileRepository.updateRepository(WriteAheadFlowFileRepository.java:309)
at 
org.apache.nifi.controller.repository.WriteAheadFlowFileRepository.updateRepository(WriteAheadFlowFileRepository.java:259)
at 
org.apache.nifi.controller.repository.StandardProcessSession.commit(StandardProcessSession.java:407)
... 10 common frames omitted
2019-03-06 14:33:20,337 ERROR [Timer-Driven Process Thread-53] 
o.a.n.c.r.StandardProcessSession Failed to commit session 
org.apache.nifi.processor.exception.ProcessException: FlowFile Repository 
failed to update. Will roll back.
2019-03-06 14:33:20,337 ERROR [Timer-Driven Process Thread-53] 
o.a.n.processors.standard.ConvertRecord 
ConvertRecord[id=32781c92-fab3-34de-b1ff-6138f8788701] Failed to commit session 
StandardProcessSession[id=244415981] due to 
org.apache.nifi.processor.exception.ProcessException: FlowFile Repository 
failed to update; rolling back: 
org.apache.nifi.processor.exception.ProcessException: FlowFile Repository 
failed to update
org.apache.nifi.processor.exception.ProcessException: FlowFile Repository 
failed to update
at 
org.apache.nifi.controller.repository.StandardProcessSession.commit(StandardProcessSession.java:413)
at 
org.apache.nifi.controller.repository.StandardProcessSession.commit(StandardProcessSession.java:343)
at 
org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:256)
at 
org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Stream Closed
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at 
org.apache.nifi.wali.LengthDelimitedJournal.update(LengthDelimitedJournal.java:308)
at 
org.apache.nifi.wali.SequentialAccessWriteAheadLog.update(SequentialAccessWriteAheadLog.java:123)
at 
org.apache.nifi.controller.repository.WriteAheadFlowFileRepository.updateRepository(WriteAheadFlowFileRepository.java:309)
at 
org.apache.nifi.controller.repository.WriteAheadFlowFileRepository.updateRepository(WriteAheadFlowFileRepository.java:259)
at 
org.apache.nifi.controller.repository.StandardProcessSession.commit(StandardProcessSession.java:407)
... 10 common frames omitted
.......MANY errors

After 26 rolls of the logfiles the checkpoint arrives
2019-03-06 14:34:53,671 INFO [pool-13-thread-1] 
o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile 
Repository with 22744 records in 339 milliseconds

And then a new repository file fails
2019-03-06 14:34:54,287 ERROR [Timer-Driven Process Thread-93] 
o.a.n.processors.standard.ConvertRecord 
ConvertRecord[id=32781c92-fab3-34de-b1ff-6138f8788701] Failed to commit session 
StandardProcessSession[id=246475219] due to 
org.apache.nifi.processor.exception.ProcessException: FlowFile Repository 
failed to update; rolling back: 
org.apache.nifi.processor.exception.ProcessException: FlowFile Repository 
failed to update
org.apache.nifi.processor.exception.ProcessException: FlowFile Repository 
failed to update
at 
org.apache.nifi.controller.repository.StandardProcessSession.commit(StandardProcessSession.java:413)
at 
org.apache.nifi.controller.repository.StandardProcessSession.commit(StandardProcessSession.java:343)
at 
org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:256)
at 
org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.file.FileAlreadyExistsException: 
/flowfile_repo/nifi-1.9.0/journals/overflow-876686


The following also start showing up after a while.
2019-03-06 14:33:22,363 ERROR [Timer-Driven Process Thread-19] 
o.a.n.c.r.c.StandardResourceClaimManager Decremented claimant count for 
StandardResourceClaim[id=1551879190172-8371, container=default, section=179] to 
-3333
2019-03-06 14:33:22,363 WARN [Timer-Driven Process Thread-53] 
o.a.n.c.r.c.StandardResourceClaimManager Decrementing claimant count for 
StandardResourceClaim[id=1551879161479-8345, container=default, section=153] 
but claimant count is not known. Returning -1
2019-03-06 14:33:22,363 WARN [Timer-Driven Process Thread-47] 
o.a.n.c.r.c.StandardResourceClaimManager Decrementing claimant count for 
StandardResourceClaim[id=1551879161479-8346, container=default, section=154] 
but claimant count is not known. Returning -1
2019-03-06 14:33:22,363 WARN [Timer-Driven Process Thread-47] 
o.a.n.c.r.c.StandardResourceClaimManager Decrementing claimant count for 
StandardResourceClaim[id=1551879161479-8345, container=default, section=153] 
but claimant count is not known. Returning -1
2019-03-06 14:33:22,363 ERROR [Timer-Driven Process Thread-94] 
o.a.n.c.r.c.StandardResourceClaimManager Decremented claimant count for 
StandardResourceClaim[id=1551879190172-8371, container=default, section=179] to 
-3334


Any takes on what I could look into more for details?
I have rolled back the cluster to 1.7.1 as of now, but I still have 1.9 on my 
test cluster so I will try to reproduce it in there.

Best regards
Viking

Reply via email to