Re: GSOC - Implement an S3 filesystem for Python SDK

2019-03-13 Thread Pablo Estrada
Hi Pasan!
I answered with some links and tips in your previous post. You can find them
here:
https://lists.apache.org/thread.html/c6637178a0fa5e4f0b2f1b3fe8991b79863f384d2573b6f22cb5f3b2@%3Cuser.beam.apache.org%3E

Best
-P.

On Tue, Mar 12, 2019 at 8:26 PM Pasan Kamburugamuwa <
pasankamburugamu...@gmail.com> wrote:

> Hi,
>
> I am a 3rd-year software engineering undergraduate at Sri Lanka Institute
> of Information Technology (SLIIT), Sri Lanka. I am interested in this
> project for GSOC 2019. I have gone through the document and I would like to
> dive deeper into the codebase. Could you please point me to any relevant
> issues so I can get more familiar with the codebase? (I am really
> interested in this project.)
>
> Thanks and best regards,
>
> Pasan Kamburugamuwa
>


Re: GSOC - Implement an S3 filesystem for Python SDK

2019-03-13 Thread Pasan Kamburugamuwa
Thanks for the help. I am now going through the code and studying the
flow.


java.lang.NoSuchFieldError: NULL_VALUE while reading parquet files

2019-03-13 Thread jitendra sharma
Hi All,

I am trying to read a Snappy-compressed Parquet file in Apache Beam using
the Flink runner on an AWS EMR cluster, but I am getting the error below:

Caused by: org.apache.beam.sdk.util.UserCodeException: java.lang.NoSuchFieldError: NULL_VALUE
    at org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:34)
    at org.apache.beam.sdk.io.parquet.ParquetIO$ReadFiles$ReadFn$DoFnInvoker.invokeProcessElement(Unknown Source)
    at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:275)
    at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:240)
    at org.apache.beam.runners.flink.metrics.DoFnRunnerWithMetricsUpdate.processElement(DoFnRunnerWithMetricsUpdate.java:63)
    at org.apache.beam.runners.flink.translation.functions.FlinkDoFnFunction.mapPartition(FlinkDoFnFunction.java:128)
    at org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:103)
    at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:503)
    at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoSuchFieldError: NULL_VALUE
    at org.apache.parquet.avro.AvroSchemaConverter.convertFields(AvroSchemaConverter.java:246)
    at org.apache.parquet.avro.AvroSchemaConverter.convert(AvroSchemaConverter.java:231)
    at org.apache.parquet.avro.AvroReadSupport.prepareForRead(AvroReadSupport.java:130)
    at org.apache.parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:183)
    at org.apache.parquet.hadoop.ParquetReader.initReader(ParquetReader.java:156)
    at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:135)
    at org.apache.beam.sdk.io.parquet.ParquetIO$ReadFiles$ReadFn.processElement(ParquetIO.java:215)


Any help/suggestion is appreciated.

Regards,
Jitendra
Jitendra Sharma
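
A NoSuchFieldError at runtime usually means a class was compiled against one
version of a library while a different version was loaded. The trace in this
message fails inside parquet-avro's AvroSchemaConverter, so one thing that may
be worth checking is which Avro jar the cluster actually loads. The sketch
below is a hypothetical helper: the class name ClasspathCheck and its methods
are not part of Beam, Parquet, or Avro. It assumes the missing field is
org.apache.avro.JsonProperties.NULL_VALUE, which this thread does not confirm.
It reports where a class was loaded from and whether a public field exists.

```java
import java.security.CodeSource;

public class ClasspathCheck {

    /** Returns the jar or directory a class was loaded from, if the JVM reports one. */
    public static String locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // JDK classes loaded by the bootstrap loader have no code source.
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    /** Returns true if the class exposes a public field with the given name. */
    public static boolean hasPublicField(Class<?> cls, String fieldName) {
        try {
            cls.getField(fieldName);
            return true;
        } catch (NoSuchFieldException e) {
            return false;
        }
    }

    /** Builds a one-line report for a class name and a field expected on it. */
    public static String describe(String className, String fieldName) {
        try {
            // Load without running static initializers; we only inspect metadata.
            Class<?> cls = Class.forName(
                className, false, ClasspathCheck.class.getClassLoader());
            String state = hasPublicField(cls, fieldName) ? "present" : "MISSING";
            return className + " loaded from " + locationOf(cls)
                + "; field " + fieldName + " " + state;
        } catch (ClassNotFoundException e) {
            return className + " not found on classpath";
        }
    }

    public static void main(String[] args) {
        // Assumption: NULL_VALUE is expected on Avro's JsonProperties class.
        System.out.println(describe("org.apache.avro.JsonProperties", "NULL_VALUE"));
    }
}
```

Run with the same classpath as the job, a report of "MISSING" or an
unexpected jar location would suggest that an older Avro on the cluster is
taking precedence over the one the pipeline was built against.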