Hi All,
My reports worked fine when the data was stored as JSON. With the Parquet
format I can still see the schema, the counts, and even the data via show().
However, the reports crash with the stack trace below, especially when the
data contains text in languages such as Japanese.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 10 in
stage 2023.0 failed 4 times, most recent failure: Lost task 10.3 in stage
2023.0 (TID 17158, ip-172-31-12-157.us-west-2.compute.internal):
java.lang.ClassCastException: optional binary element (UTF8) is not a group
at org.apache.parquet.schema.Type.asGroupType(Type.java:202)
at org.apache.spark.sql.execution.datasources.parquet.ParquetReadSupport$.org$apache$spark$sql$execution$datasources$parquet$ParquetReadSupport$$clipParquetType(ParquetReadSupport.scala:131)
Regards,
Naveen