[ https://issues.apache.org/jira/browse/NIFI-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15532676#comment-15532676 ]
David Hicks commented on NIFI-2841:
-----------------------------------

Ah, you're right. I shouldn't have included the bit about any and all files. I removed it from the description.

I'm dealing with a sensitive data set, so I can't just send the file. I'm trying to track down how my files are created so I can send you one.

> SplitAvro Processor is Broken
> -----------------------------
>
>                 Key: NIFI-2841
>                 URL: https://issues.apache.org/jira/browse/NIFI-2841
>             Project: Apache NiFi
>          Issue Type: Bug
>            Reporter: David Hicks
>            Priority: Critical
>
> This is largely the fault of the Avro DataFileStream reader, but it's making
> the processor unusable. The problem appears to occur when you make the
> following series of calls (which happens because of the splitSize comparison):
> reader.next() -> returns last element
> reader.hasNext() -> returns false
> reader.hasNext() -> returns true
> reader.next() -> EOFException
>
> org.apache.nifi.processor.exception.ProcessException: IOException thrown from SplitAvro[id=22e03ca4-0151-4474-92fc-040e1fe12ab9]: java.io.EOFException
>     at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2013) ~[na:na]
>     at org.apache.nifi.processors.avro.SplitAvro$RecordSplitter$1.process(SplitAvro.java:250) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
>     at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:1851) ~[na:na]
>     at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:1822) ~[na:na]
>     at org.apache.nifi.processors.avro.SplitAvro$RecordSplitter.split(SplitAvro.java:236) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
>     at org.apache.nifi.processors.avro.SplitAvro.onTrigger(SplitAvro.java:202) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
>     at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-0.7.0.jar:0.7.0]
>     at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1054) [nifi-framework-core-0.7.0.jar:0.7.0]
>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-0.7.0.jar:0.7.0]
>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-0.7.0.jar:0.7.0]
>     at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:127) [nifi-framework-core-0.7.0.jar:0.7.0]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_101]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
> Caused by: java.io.EOFException: null
>     at org.apache.avro.io.BinaryDecoder.ensureBounds(BinaryDecoder.java:473) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:128) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:259) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.io.ResolvingDecoder.readString(ResolvingDecoder.java:201) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:363) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:355) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:157) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:193) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.nifi.processors.avro.SplitAvro$RecordSplitter$1$1.process(SplitAvro.java:259) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
>     at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:1998) ~[na:na]
>     ... 17 common frames omitted
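
For anyone trying to reason about the call sequence in the description without access to the data, here is a minimal sketch that imitates it in isolation. Everything in it is hypothetical: `SplitSketch`, `FlakyReader`, and `split` are illustrative names, not NiFi or Avro classes, and the reader only simulates the reported behaviour (it throws `NoSuchElementException` where the real reader throws `EOFException`). It is not the actual SplitAvro code or a proposed patch. It only shows one defensive pattern, assuming the reader can be trusted up to the first `false` from `hasNext()`: check `hasNext()` once per record and never call it again after it has returned `false`.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch: none of these names come from NiFi or Avro.
class SplitSketch {

    // Simulates the reported behaviour: once the data is exhausted, the first
    // hasNext() returns false, but a later hasNext() returns true again and
    // the following next() fails.
    static class FlakyReader implements Iterator<String> {
        private final List<String> records;
        private int position = 0;
        private int hasNextCallsAfterEnd = 0;

        FlakyReader(List<String> records) {
            this.records = records;
        }

        @Override
        public boolean hasNext() {
            if (position < records.size()) {
                return true;
            }
            hasNextCallsAfterEnd++;
            return hasNextCallsAfterEnd > 1; // false once, then (wrongly) true
        }

        @Override
        public String next() {
            if (position >= records.size()) {
                // Stand-in for the EOFException in the stack trace.
                throw new NoSuchElementException("read past end of data");
            }
            return records.get(position++);
        }
    }

    // Groups records into splits of at most splitSize. hasNext() is called
    // exactly once per record, and never again after it has returned false.
    static List<List<String>> split(Iterator<String> reader, int splitSize) {
        List<List<String>> splits = new ArrayList<>();
        List<String> current = new ArrayList<>();
        while (reader.hasNext()) {
            current.add(reader.next());
            if (current.size() == splitSize) {
                splits.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            splits.add(current); // final, possibly short, split
        }
        return splits;
    }
}
```

Calling `SplitSketch.split(new SplitSketch.FlakyReader(records), 2)` on five records yields two full splits and one short one, and the simulated reader is never asked `hasNext()` a second time after the end of the data. Whether the real processor can be restructured this way depends on code not shown in this issue.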