Problem with Spark SQL UserDefinedType and sbt assembly

Jaonary Rabarisoa Thu, 16 Apr 2015 08:05:23 -0700

Dear all,

Here is an issue that is driving me mad. I wrote a UserDefinedType in order to
be able to store a custom type in a Parquet file. In my code I just create a
DataFrame with my custom data type and write it to a Parquet file. When I
run my code directly inside IntelliJ IDEA, everything works like a charm. But
when I create the assembly jar with sbt assembly and run the same code with
spark-submit, I get the following error:


15/04/16 17:02:17 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.IllegalArgumentException: Unsupported dataType: {"type":"struct","fields":[{"name":"metadata","type":{"type":"udt","class":"org.apache.spark.vision.types.ImageMetadataUDT","pyClass":null,"sqlType":{"type":"struct","fields":[{"name":"name","type":"string","nullable":true,"metadata":{}},{"name":"encoding","type":"string","nullable":true,"metadata":{}},{"name":"cameraId","type":"string","nullable":true,"metadata":{}},{"name":"timestamp","type":"string","nullable":true,"metadata":{}},{"name":"frameId","type":"string","nullable":true,"metadata":{}}]}},"nullable":true,"metadata":{}}]}, [1.1] failure: `TimestampType' expected but `{' found

{"type":"struct","fields":[{"name":"metadata","type":{"type":"udt","class":"org.apache.spark.vision.types.ImageMetadataUDT","pyClass":null,"sqlType":{"type":"struct","fields":[{"name":"name","type":"string","nullable":true,"metadata":{}},{"name":"encoding","type":"string","nullable":true,"metadata":{}},{"name":"cameraId","type":"string","nullable":true,"metadata":{}},{"name":"timestamp","type":"string","nullable":true,"metadata":{}},{"name":"frameId","type":"string","nullable":true,"metadata":{}}]}},"nullable":true,"metadata":{}}]}
^
        at org.apache.spark.sql.types.DataType$CaseClassStringParser$.apply(dataTypes.scala:163)
        at org.apache.spark.sql.types.DataType$.fromCaseClassString(dataTypes.scala:98)
        at org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$6.apply(ParquetTypes.scala:402)
        at org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$6.apply(ParquetTypes.scala:402)
        at scala.util.Try.getOrElse(Try.scala:77)
        at org.apache.spark.sql.parquet.ParquetTypesConverter$.convertFromString(ParquetTypes.scala:402)
        at org.apache.spark.sql.parquet.RowWriteSupport.init(ParquetTableSupport.scala:145)
        at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:278)
        at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:252)
        at org.apache.spark.sql.parquet.ParquetRelation2.org$apache$spark$sql$parquet$ParquetRelation2$$writeShard$1(newParquet.scala:691)
        at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:713)
        at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:713)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
        at org.apache.spark.scheduler.Task.run(Task.scala:64)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:210)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
