Re: GSOC - Implement an S3 filesystem for Python SDK
Hi Pasan!
I answered with some links and tips in your previous post. You can find them here:
https://lists.apache.org/thread.html/c6637178a0fa5e4f0b2f1b3fe8991b79863f384d2573b6f22cb5f3b2@%3Cuser.beam.apache.org%3E

Best
-P.

On Tue, Mar 12, 2019 at 8:26 PM Pasan Kamburugamuwa <pasankamburugamu...@gmail.com> wrote:

> Hi,
>
> I am a third-year software engineering undergraduate at the Sri Lanka Institute
> of Information Technology (SLIIT), Sri Lanka. I am interested in this
> project for GSoC 2019. I have gone through the document and would like to
> dive deeper into the codebase. Could you please point me to any relevant
> issues so I can get more familiar with the codebase? (I am really interested
> in this project.)
>
> Thanks and best regards,
> Pasan Kamburugamuwa
Re: GSOC - Implement an S3 filesystem for Python SDK
Thanks for the help. I am now going through the code and studying the flow.

On Thu, Mar 14, 2019 at 3:49 AM Pablo Estrada wrote:

> Hi Pasan!
> I answered with some links and tips in your previous post. You can find
> them here:
> https://lists.apache.org/thread.html/c6637178a0fa5e4f0b2f1b3fe8991b79863f384d2573b6f22cb5f3b2@%3Cuser.beam.apache.org%3E
>
> Best
> -P.
java.lang.NoSuchFieldError: NULL_VALUE while reading parquet files
Hi All,

I am trying to read a Snappy-compressed Parquet file in Apache Beam using the Flink runner on an AWS EMR cluster, but I am getting the error below:

Caused by: org.apache.beam.sdk.util.UserCodeException: java.lang.NoSuchFieldError: NULL_VALUE
    at org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:34)
    at org.apache.beam.sdk.io.parquet.ParquetIO$ReadFiles$ReadFn$DoFnInvoker.invokeProcessElement(Unknown Source)
    at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:275)
    at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:240)
    at org.apache.beam.runners.flink.metrics.DoFnRunnerWithMetricsUpdate.processElement(DoFnRunnerWithMetricsUpdate.java:63)
    at org.apache.beam.runners.flink.translation.functions.FlinkDoFnFunction.mapPartition(FlinkDoFnFunction.java:128)
    at org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:103)
    at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:503)
    at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoSuchFieldError: NULL_VALUE
    at org.apache.parquet.avro.AvroSchemaConverter.convertFields(AvroSchemaConverter.java:246)
    at org.apache.parquet.avro.AvroSchemaConverter.convert(AvroSchemaConverter.java:231)
    at org.apache.parquet.avro.AvroReadSupport.prepareForRead(AvroReadSupport.java:130)
    at org.apache.parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:183)
    at org.apache.parquet.hadoop.ParquetReader.initReader(ParquetReader.java:156)
    at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:135)
    at org.apache.beam.sdk.io.parquet.ParquetIO$ReadFiles$ReadFn.processElement(ParquetIO.java:215)

Any help or suggestions would be appreciated.

Regards,
Jitendra Sharma
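
A NoSuchFieldError generally means that a class was compiled against one version of a library while a different version was loaded at run time. A minimal diagnostic sketch for narrowing that down is below: it prints which jar a class was loaded from and whether it exposes a given public static field. The class name ClasspathCheck and its methods are hypothetical, and the default class name org.apache.avro.JsonProperties is only a guess at where NULL_VALUE is expected to live; the trace itself does not confirm this, so pass the class and field you want to check as arguments.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.security.CodeSource;

// Hypothetical helper, not part of Beam, Flink, Parquet, or Avro.
public class ClasspathCheck {

    // Loads the class without running its static initializers; null if it is not on the classpath.
    private static Class<?> load(String className) {
        try {
            return Class.forName(className, false, ClasspathCheck.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    // Returns the jar or directory the class was loaded from, or a placeholder string.
    public static String locationOf(String className) {
        Class<?> cls = load(className);
        if (cls == null) {
            return "(class not found)";
        }
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        return source == null ? "(bootstrap or unknown)" : String.valueOf(source.getLocation());
    }

    // Returns true only if the class is present and has a public static field with this name.
    public static boolean hasStaticField(String className, String fieldName) {
        Class<?> cls = load(className);
        if (cls == null) {
            return false;
        }
        try {
            Field field = cls.getField(fieldName);
            return Modifier.isStatic(field.getModifiers());
        } catch (NoSuchFieldException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are assumptions; override them on the command line.
        String className = args.length > 0 ? args[0] : "org.apache.avro.JsonProperties";
        String fieldName = args.length > 1 ? args[1] : "NULL_VALUE";
        System.out.println(className + " loaded from: " + locationOf(className));
        System.out.println("has static field " + fieldName + ": " + hasStaticField(className, fieldName));
    }
}
```

To be meaningful, this has to run with the same classpath as the failing job (for example, from inside the pipeline jar on the cluster). If the reported location is a jar supplied by the cluster rather than the one bundled with the pipeline, a version conflict is a likely candidate, and shading or aligning that dependency would be the usual next step.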