Hi. I have a spark application where I store the results into table (with HiveContext). Some of these columns allow nulls. In Scala, this columns are represented through Option[Int] or Option[Double].. Depend on the data type.
For example: *val hc = new HiveContext(sc)* *var col1: Option[Ingeger] = None* *...* *val myRow = org.apache.spark.sql.Row(col1, ...)* *val mySchema = StructType(Array(StructField("Column1", IntegerType, true)))* *val TableOutputSchemaRDD = hc.applySchema(myRow, mySchema)* *hc.registerRDDAsTable(TableOutputSchemaRDD, "result_intermediate")* *hc.sql("CREATE TABLE table_output STORED AS PARQUET AS SELECT * FROM result_intermediate")* Produce: java.lang.ClassCastException: scala.Some cannot be cast to java.lang.Integer at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaIntObjectInspector.get(JavaIntObjectInspector.java:40) at org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe.createPrimitive(ParquetHiveSerDe.java:247) at org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe.createObject(ParquetHiveSerDe.java:301) at org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe.createStruct(ParquetHiveSerDe.java:178) at org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe.serialize(ParquetHiveSerDe.java:164) at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1$1.apply(InsertIntoHiveTable.scala:123) at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1$1.apply(InsertIntoHiveTable.scala:114) at scala.collection.Iterator$class.foreach(Iterator.scala:727) at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.org $apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1(InsertIntoHiveTable.scala:114) at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:93) at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:93) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61) at org.apache.spark.scheduler.Task.run(Task.scala:56) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Thanks!!!!! -- Regards. Miguel Ángel