Hi,

This is expected behavior. HiveContext.sql (and DataFrame.registerTempTable) may only be invoked on the driver side. The closure passed to RDD.foreach, however, is executed on the executor side, where no viable HiveContext instance exists: the hiveContext captured in the closure loses its driver-side state when shipped to executors, hence the NullPointerException.
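One way around this is to do the insert once, on the driver, and let Hive dynamic partitioning route the rows, instead of looping over partitions inside foreach. Below is a minimal sketch, not your actual job: Row4Dim and its trimmed column list are hypothetical stand-ins for your real schema, and it assumes the partition columns (zone, z, year, month) are carried as ordinary fields of each row.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object DynamicPartitionInsert {
  // Hypothetical record type; the real table has more columns. The
  // key point is that the partition columns are ordinary row fields.
  case class Row4Dim(date: String, hh: Int, height: Double,
                     zone: Int, z: Int, year: Int, month: Int)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("4dim-insert"))
    val hiveContext = new HiveContext(sc)
    import hiveContext.implicits._

    // Let Hive route each row to its partition, so one driver-side
    // INSERT covers all (zone, z, year, month) combinations.
    hiveContext.sql("SET hive.exec.dynamic.partition = true")
    hiveContext.sql("SET hive.exec.dynamic.partition.mode = nonstrict")

    // Stand-in data; in your job this would be the output of the
    // flatMap, with no groupByKey or foreach needed.
    val rows = sc.parallelize(Seq(
      Row4Dim("2015-06-07", 0, 120.5, zone = 1, z = 2, year = 2015, month = 6)))

    rows.toDF().registerTempTable("table_4Dim")

    // Executed once on the driver, where hiveContext is valid.
    // Partition columns must come last in the SELECT list.
    hiveContext.sql(
      "INSERT OVERWRITE TABLE 4dim PARTITION (zone, z, year, month) " +
      "SELECT date, hh, height, zone, z, year, month FROM table_4Dim")
  }
}

If you really do need a separate statement per partition, collect the distinct partition keys to the driver first and loop over them there; only the data processing itself should run inside executor-side closures.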

Cheng

On 6/7/15 10:06 AM, patcharee wrote:
Hi,

I am trying to insert data into a partitioned Hive table. The groupByKey is meant to combine the dataset into one partition of the Hive table. After the groupByKey, I convert the Iterable[X] to a DataFrame with X.toList.toDF(). But hiveContext.sql throws a NullPointerException, see below. Any suggestions? What could be wrong? Thanks!

val varWHeightFlatRDD = varWHeightRDD
  .flatMap(FlatMapUtilClass().flatKeyFromWrf)
  .groupByKey()
  .foreach { x =>
    val zone  = x._1._1
    val z     = x._1._2
    val year  = x._1._3
    val month = x._1._4
    val df_table_4dim = x._2.toList.toDF()
    df_table_4dim.registerTempTable("table_4Dim")
    hiveContext.sql(
      "INSERT OVERWRITE table 4dim partition (zone=" + zone + ",z=" + z +
      ",year=" + year + ",month=" + month + ") " +
      "select date, hh, x, y, height, u, v, w, ph, phb, t, p, pb, qvapor, " +
      "qgraup, qnice, qnrain, tke_pbl, el_pbl from table_4Dim")
  }


java.lang.NullPointerException
    at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:100)
    at no.uni.computing.etl.LoadWrfIntoHiveOptReduce1$$anonfun$7.apply(LoadWrfIntoHiveOptReduce1.scala:113)
    at no.uni.computing.etl.LoadWrfIntoHiveOptReduce1$$anonfun$7.apply(LoadWrfIntoHiveOptReduce1.scala:103)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
    at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:798)
    at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:798)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1511)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1511)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
    at org.apache.spark.scheduler.Task.run(Task.scala:64)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org