Inserting Nulls

Masf Tue, 05 May 2015 09:03:08 -0700

Hi.

I have a spark application where I store the results into table (with
HiveContext). Some of these columns allow nulls. In Scala, this columns are
represented through Option[Int] or Option[Double].. Depend on the data type.


For example:

*val hc = new HiveContext(sc)*
*var col1: Option[Ingeger] = None*
*...*

*val myRow = org.apache.spark.sql.Row(col1, ...)*

*val mySchema = StructType(Array(StructField("Column1", IntegerType,
true)))*

*val TableOutputSchemaRDD = hc.applySchema(myRow, mySchema)*
*hc.registerRDDAsTable(TableOutputSchemaRDD, "result_intermediate")*
*hc.sql("CREATE TABLE table_output STORED AS PARQUET AS SELECT * FROM
result_intermediate")*

Produce:

java.lang.ClassCastException: scala.Some cannot be cast to java.lang.Integer
at
org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaIntObjectInspector.get(JavaIntObjectInspector.java:40)
at
org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe.createPrimitive(ParquetHiveSerDe.java:247)
at
org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe.createObject(ParquetHiveSerDe.java:301)
at
org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe.createStruct(ParquetHiveSerDe.java:178)
at
org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe.serialize(ParquetHiveSerDe.java:164)
at
org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1$1.apply(InsertIntoHiveTable.scala:123)
at
org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1$1.apply(InsertIntoHiveTable.scala:114)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.org
$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1(InsertIntoHiveTable.scala:114)
at
org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:93)
at
org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:93)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)




Thanks!!!!!
--

Regards.
Miguel Ángel

Inserting Nulls

Reply via email to