Re: Spark 3.3.0/3.2.2: java.io.IOException: can not read class org.apache.parquet.format.PageHeader: don't know what type: 15

2022-09-01 Thread FengYu Cao
I will open a JIRA, but since it's our production event log, I can't attach it. I'll try to set up a debugger to provide more information. On Thu, Sep 1, 2022 at 23:06, Chao Sun wrote: > Hi Fengyu, > > Do you still have the Parquet file that caused the error? Could you > open a JIRA and attach the file to it? I

Re: Spark 3.3.0/3.2.2: java.io.IOException: can not read class org.apache.parquet.format.PageHeader: don't know what type: 15

2022-09-01 Thread Chao Sun
Hi Fengyu, Do you still have the Parquet file that caused the error? Could you open a JIRA and attach the file to it? I can take a look. Chao On Thu, Sep 1, 2022 at 4:03 AM FengYu Cao wrote: > > I'm trying to upgrade our Spark (3.2.1 now) > > but with Spark 3.3.0 and Spark 3.2.2, we had an error

Spark 3.3.0/3.2.2: java.io.IOException: can not read class org.apache.parquet.format.PageHeader: don't know what type: 15

2022-09-01 Thread FengYu Cao
I'm trying to upgrade our Spark (currently 3.2.1), but with Spark 3.3.0 and Spark 3.2.2 we had an error with a specific Parquet file. Is anyone else having the same problem? Or do I need to provide any information to the devs? ``` org.apache.spark.SparkException: Job aborted due to stage failure: