carlpayne commented on issue #25794:
URL: https://github.com/apache/beam/issues/25794#issuecomment-1472197118
> I compressed the .avro and the problem still exists:
>
> ```
> Exception in thread "main" org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.ClassCastException: class java.nio.channels.Channels$ReadableByteChannelImpl cannot be cast to class java.nio.channels.SeekableByteChannel (java.nio.channels.Channels$ReadableByteChannelImpl and java.nio.channels.SeekableByteChannel are in module java.base of loader 'bootstrap')
> at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:374)
> at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:342)
> at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:218)
> at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:67)
> at org.apache.beam.sdk.Pipeline.run(Pipeline.java:323)
> at org.apache.beam.sdk.Pipeline.run(Pipeline.java:309)
> at beam.demo.Demo.main(Demo.java:59)
> Caused by: java.lang.ClassCastException: class java.nio.channels.Channels$ReadableByteChannelImpl cannot be cast to class java.nio.channels.SeekableByteChannel (java.nio.channels.Channels$ReadableByteChannelImpl and java.nio.channels.SeekableByteChannel are in module java.base of loader 'bootstrap')
> at org.apache.beam.sdk.io.AvroSource$AvroReader.startReading(AvroSource.java:743)
> at org.apache.beam.sdk.io.CompressedSource$CompressedReader.startReading(CompressedSource.java:449)
> at org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.startImpl(FileBasedSource.java:479)
> at org.apache.beam.sdk.io.OffsetBasedSource$OffsetBasedReader.start(OffsetBasedSource.java:252)
> at org.apache.beam.sdk.io.ReadAllViaFileBasedSource$ReadFileRangesFn.process(ReadAllViaFileBasedSource.java:140)
> ```
Correct, but note that this didn't work with older versions either, so it isn't
a regression. If I run the same demo against `2.34.0` (after correctly gzipping
the files), I get:
```
Exception in thread "main" org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.RuntimeException: Error reading metadata from file Metadata{resourceId=/Users/cpayne/IdeaProjects/beam-demo/src/main/resources/twitter2.avro.gz, sizeBytes=381, isReadSeekEfficient=false, checksum=null, lastModifiedMillis=0}
at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:373)
at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:341)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:218)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:67)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:323)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:309)
at beam.demo.Demo.main(Demo.java:61)
Caused by: java.lang.RuntimeException: Error reading metadata from file Metadata{resourceId=/Users/cpayne/IdeaProjects/beam-demo/src/main/resources/twitter2.avro.gz, sizeBytes=381, isReadSeekEfficient=false, checksum=null, lastModifiedMillis=0}
at org.apache.beam.sdk.io.AvroSource$AvroReader.startReading(AvroSource.java:841)
at org.apache.beam.sdk.io.CompressedSource$CompressedReader.startReading(CompressedSource.java:449)
at org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.startImpl(FileBasedSource.java:479)
at org.apache.beam.sdk.io.OffsetBasedSource$OffsetBasedReader.start(OffsetBasedSource.java:252)
at org.apache.beam.sdk.io.ReadAllViaFileBasedSource$ReadFileRangesFn.process(ReadAllViaFileBasedSource.java:143)
Caused by: java.io.IOException: Missing Avro file signature: /Users/cpayne/IdeaProjects/beam-demo/src/main/resources/twitter2.avro.gz
at org.apache.beam.sdk.io.AvroSource.readMetadataFromFile(AvroSource.java:477)
at org.apache.beam.sdk.io.AvroSource$AvroReader.startReading(AvroSource.java:838)
at org.apache.beam.sdk.io.CompressedSource$CompressedReader.startReading(CompressedSource.java:449)
at org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.startImpl(FileBasedSource.java:479)
at org.apache.beam.sdk.io.OffsetBasedSource$OffsetBasedReader.start(OffsetBasedSource.java:252)
at org.apache.beam.sdk.io.ReadAllViaFileBasedSource$ReadFileRangesFn.process(ReadAllViaFileBasedSource.java:143)
at org.apache.beam.sdk.io.ReadAllViaFileBasedSource$ReadFileRangesFn$DoFnInvoker.invokeProcessElement(Unknown Source)
at org.apache.beam.repackaged.direct_java.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:232)
at org.apache.beam.repackaged.direct_java.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:191)
at org.apache.beam.repackaged.direct_java.runners.core.SimplePushbackSideInputDoFnRunner.processElementInReadyWindows(SimplePushbackSideInputDoFnRunner.java:79)
at org.apache.beam.runners.direct.ParDoEvaluator.processElement(ParDoEvaluator.java:244)
at org.apache.beam.runners.direct.DoFnLifecycleManagerRemovingTransformEvaluator.processElement(DoFnLifecycleManagerRemovingTransformEvaluator.java:54)
at org.apache.beam.runners.direct.DirectTransformExecutor.processElements(DirectTransformExecutor.java:165)
at org.apache.beam.runners.direct.DirectTransformExecutor.run(DirectTransformExecutor.java:129)
```
Given all this, I don't think there's anything to fix here, so
@aromanenko-dev I think we can close this ticket.
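
For anyone landing here later, here is a minimal JDK-only sketch of the two symptoms in the traces above. It does not use Beam; the class `GzipChannelSketch` and its helpers are hypothetical names made up for illustration. It shows that a channel wrapping a `GZIPInputStream` is not a `SeekableByteChannel` (the same cast that fails in the first trace), and that a gzipped file begins with the gzip header rather than the Avro container magic `Obj\1`. Tying the second point to the "Missing Avro file signature" error is my assumption about what the `2.34.0` code path is reading, not something I have confirmed in the Beam source.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SeekableByteChannel;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class for illustration only; not part of Beam.
class GzipChannelSketch {
  // Avro object container files start with the four bytes 'O', 'b', 'j', 1.
  static final byte[] AVRO_MAGIC = {'O', 'b', 'j', 1};

  // Gzip-compress a byte array in memory.
  static byte[] gzip(byte[] raw) {
    try {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
        gz.write(raw);
      }
      return out.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // True if the data begins with the Avro container magic bytes.
  static boolean startsWithAvroMagic(byte[] data) {
    if (data.length < AVRO_MAGIC.length) {
      return false;
    }
    for (int i = 0; i < AVRO_MAGIC.length; i++) {
      if (data[i] != AVRO_MAGIC[i]) {
        return false;
      }
    }
    return true;
  }

  // Wrap a decompressing stream in a channel and report whether it can seek.
  static boolean decompressedChannelIsSeekable(byte[] gzipped) {
    try (ReadableByteChannel ch =
        Channels.newChannel(new GZIPInputStream(new ByteArrayInputStream(gzipped)))) {
      return ch instanceof SeekableByteChannel;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Stand-in payload: only the magic bytes matter here, not a valid Avro file.
    byte[] fakeAvro = {'O', 'b', 'j', 1, 0, 0, 0, 0};
    byte[] gzipped = gzip(fakeAvro);
    System.out.println("raw starts with Avro magic: " + startsWithAvroMagic(fakeAvro));
    System.out.println("gzipped starts with Avro magic: " + startsWithAvroMagic(gzipped));
    System.out.println("decompressed channel seekable: " + decompressedChannelIsSeekable(gzipped));
  }
}
```

Either way, a reader that needs to seek within the file, or that checks the magic bytes on the raw file, cannot work on a gzip-wrapped Avro file. That is consistent with both versions failing.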