Option only works when you are converting from case classes. Just put null into the Row when you want the value to be null.
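A minimal sketch of that approach in plain Scala (the Row construction itself is left as a comment, since it needs a Spark dependency on the classpath): unwrap each Option to a nullable boxed value with orNull before building the Row, instead of passing the Option itself.

```scala
// Unwrap Option[Int] to a nullable java.lang.Integer before putting it
// into the Row. Option#orNull requires a nullable element type, so box
// the primitive first with Int.box.
val col1: Option[Int] = None
val col2: Option[Int] = Some(42)

val v1: Integer = col1.map(Int.box).orNull // null when the Option is None
val v2: Integer = col2.map(Int.box).orNull // the boxed value otherwise

// With Spark SQL this would then be:
//   val myRow = org.apache.spark.sql.Row(v1, v2)
```

This keeps the schema's nullable = true columns working, because the Hive SerDe sees a plain null or java.lang.Integer rather than a scala.Some it cannot cast.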
On Tue, May 5, 2015 at 9:00 AM, Masf <masfwo...@gmail.com> wrote:

> Hi.
>
> I have a Spark application where I store the results into a table (with
> HiveContext). Some of these columns allow nulls. In Scala, these columns
> are represented as Option[Int] or Option[Double], depending on the data
> type.
>
> For example:
>
>     val hc = new HiveContext(sc)
>     var col1: Option[Integer] = None
>     ...
>
>     val myRow = org.apache.spark.sql.Row(col1, ...)
>
>     val mySchema = StructType(Array(StructField("Column1", IntegerType, true)))
>
>     val TableOutputSchemaRDD = hc.applySchema(myRow, mySchema)
>     hc.registerRDDAsTable(TableOutputSchemaRDD, "result_intermediate")
>     hc.sql("CREATE TABLE table_output STORED AS PARQUET AS SELECT * FROM result_intermediate")
>
> This produces:
>
>     java.lang.ClassCastException: scala.Some cannot be cast to java.lang.Integer
>         at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaIntObjectInspector.get(JavaIntObjectInspector.java:40)
>         at org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe.createPrimitive(ParquetHiveSerDe.java:247)
>         at org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe.createObject(ParquetHiveSerDe.java:301)
>         at org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe.createStruct(ParquetHiveSerDe.java:178)
>         at org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe.serialize(ParquetHiveSerDe.java:164)
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1$1.apply(InsertIntoHiveTable.scala:123)
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1$1.apply(InsertIntoHiveTable.scala:114)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1(InsertIntoHiveTable.scala:114)
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:93)
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:93)
>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>         at org.apache.spark.scheduler.Task.run(Task.scala:56)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
>
> Thanks!!!!!
>
> --
> Regards.
> Miguel Ángel