
Re: [Spark SQL]: unpredictable errors: java.io.IOException: can not read class org.apache.parquet.format.PageHeader

2022-12-19 Thread Eric Hanchrow

We’ve discovered a workaround for this; it’s described here <https://issues.apache.org/jira/browse/HADOOP-18521>.

From: Eric Hanchrow
Date: Thursday, December 8, 2022 at 17:03
To: user@spark.apache.org
Subject: [Spark SQL]: unpredictable errors: java.io.IOException: can not read

[Spark SQL]: unpredictable errors: java.io.IOException: can not read class org.apache.parquet.format.PageHeader

2022-12-08 Thread Eric Hanchrow

My company runs Java code that uses Spark to read from, and write to, Azure Blob Storage. This code runs more or less 24x7. Recently we've noticed a few failures that leave stack traces in our logs; what the failures have in common is exceptions that look variously like Caused by: