Option only works when you are going from case classes, where Spark can infer
the schema via reflection. When you construct Rows manually, just put null
into the Row wherever you want the value to be null.
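
For example, here is a minimal sketch of both approaches (assuming Spark
1.3's createDataFrame/toDF API; MyRecord is a hypothetical name):

    import org.apache.spark.sql.Row
    import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

    // Approach 1: unwrap the Option yourself and put null into the Row.
    // Row fields are typed as Any, so getOrElse(null) compiles even for
    // Option[Int].
    val col1: Option[Int] = None
    val row = Row(col1.getOrElse(null))
    val schema = StructType(Array(StructField("Column1", IntegerType, true)))
    val df = hc.createDataFrame(sc.parallelize(Seq(row)), schema)

    // Approach 2: go through a case class; Spark's reflection-based schema
    // inference understands Option and writes NULL for None.
    case class MyRecord(col1: Option[Int])
    import hc.implicits._
    val df2 = sc.parallelize(Seq(MyRecord(None), MyRecord(Some(1)))).toDF()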

On Tue, May 5, 2015 at 9:00 AM, Masf <masfwo...@gmail.com> wrote:

> Hi.
>
> I have a Spark application where I store the results into a table (with
> HiveContext). Some of these columns allow nulls. In Scala, these columns are
> represented as Option[Int] or Option[Double], depending on the data type.
>
> For example:
>
> val hc = new HiveContext(sc)
> var col1: Option[Int] = None
> ...
>
> val myRow = org.apache.spark.sql.Row(col1, ...)
> val myRowRDD = sc.parallelize(Seq(myRow))
>
> val mySchema = StructType(Array(StructField("Column1", IntegerType, true)))
>
> val TableOutputSchemaRDD = hc.applySchema(myRowRDD, mySchema)
> hc.registerRDDAsTable(TableOutputSchemaRDD, "result_intermediate")
> hc.sql("CREATE TABLE table_output STORED AS PARQUET AS SELECT * FROM
> result_intermediate")
>
> Produces:
>
> java.lang.ClassCastException: scala.Some cannot be cast to
> java.lang.Integer
> at
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaIntObjectInspector.get(JavaIntObjectInspector.java:40)
> at
> org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe.createPrimitive(ParquetHiveSerDe.java:247)
> at
> org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe.createObject(ParquetHiveSerDe.java:301)
> at
> org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe.createStruct(ParquetHiveSerDe.java:178)
> at
> org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe.serialize(ParquetHiveSerDe.java:164)
> at
> org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1$1.apply(InsertIntoHiveTable.scala:123)
> at
> org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1$1.apply(InsertIntoHiveTable.scala:114)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at
> org.apache.spark.sql.hive.execution.InsertIntoHiveTable.org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1(InsertIntoHiveTable.scala:114)
> at
> org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:93)
> at
> org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:93)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
> at org.apache.spark.scheduler.Task.run(Task.scala:56)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
>
>
>
>
> Thanks!!!!!
> --
>
> Regards.
> Miguel Ángel
>
