Hi all.
When I specify the number of partitions and save this RDD in Parquet
format, my app fails. For example:

selectTest.coalesce(28).saveAsParquetFile("hdfs://vm-clusterOutput")

However, it works well if I store the data as text:

selectTest.coalesce(28).saveAsTextFile("hdfs://vm-clusterOutput")
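For context, a minimal sketch of the scenario (Spark 1.2-era API; the table name, column names, and output paths are hypothetical, and the query is assumed to return a decimal column):

```scala
// Hypothetical reproduction sketch; "items" and "price" are made-up names.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setAppName("parquet-decimal-repro"))
val sqlContext = new SQLContext(sc)

// A query whose result contains a decimal column.
val selectTest = sqlContext.sql("SELECT id, price FROM items") // price: DECIMAL

// Writing the coalesced result as text succeeds...
selectTest.coalesce(28).saveAsTextFile("hdfs://vm-cluster/textOutput")

// ...but writing the same coalesced result as Parquet fails with
// ClassCastException: scala.math.BigDecimal cannot be cast to
// org.apache.spark.sql.catalyst.types.decimal.Decimal
selectTest.coalesce(28).saveAsParquetFile("hdfs://vm-cluster/parquetOutput")
```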
Thanks Sean, I forgot it
The output error is the following:
java.lang.ClassCastException: scala.math.BigDecimal cannot be cast to org.apache.spark.sql.catalyst.types.decimal.Decimal
    at org.apache.spark.sql.parquet.MutableRowWriteSupport.consumeType(ParquetTableSupport.scala:359)
    at
Hey Masf,
I’ve created SPARK-6360
https://issues.apache.org/jira/browse/SPARK-6360 to track this issue.
Detailed analysis is provided there. The TL;DR is that, for Spark 1.1 and
1.2, if a SchemaRDD contains decimal or UDT column(s), then after applying
any traditional RDD transformations (e.g.
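The message cuts off here, but until the bug tracked in SPARK-6360 is fixed, one possible workaround is to keep decimal values out of the Parquet writer entirely by casting the column in SQL before the RDD-level transformation. A hedged sketch (the table and column names are assumptions, not from the original thread):

```scala
// Hypothetical workaround sketch: cast the DECIMAL column to DOUBLE (or
// STRING) in SQL, so no scala.math.BigDecimal values reach the Parquet
// write path. Note this changes the stored type and may lose precision.
val casted = sqlContext.sql(
  "SELECT id, CAST(price AS DOUBLE) AS price FROM items")

// coalesce + Parquet write now sees doubles instead of decimals.
casted.coalesce(28).saveAsParquetFile("hdfs://vm-cluster/parquetOutput")
```

Whether the precision loss is acceptable depends on the data; casting to STRING preserves the exact value at the cost of the numeric type.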