In your sample code you can use hiveContext inside that foreach, because
deDF.collect() returns a local Scala collection (an Array[Row]), so the
foreach runs on the driver. But you cannot use hiveContext inside
RDD.foreach, because that closure is shipped to and executed on the
executors, where the context is not usable. Even though SQLContext is
marked Serializable, the context only really works in the driver JVM.
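
For example, a quick sketch against the deDF from your code (the SQL is
elided here):

    // Fine: collect() brings the rows back to the driver, so this
    // foreach is a plain Scala collection operation.
    deDF.collect().foreach { row =>
      hiveContext.sql("...")  // runs on the driver
    }

    // Not fine: RDD.foreach serializes the closure and runs it on the
    // executors, where hiveContext is not usable (you typically get a
    // NullPointerException or a "Task not serializable" error).
    deDF.rdd.foreach { row =>
      hiveContext.sql("...")  // runs on the executors, fails
    }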



Ajay Chander <itsche...@gmail.com> wrote on Wed, Oct 26, 2016 at 11:28 AM:

> Hi Everyone,
>
> I was thinking if I can use hiveContext inside foreach like below,
>
> object Test {
>   def main(args: Array[String]): Unit = {
>
>     val conf = new SparkConf()
>     val sc = new SparkContext(conf)
>     val hiveContext = new HiveContext(sc)
>
>     val dataElementsFile = args(0)
>     val deDF = hiveContext.read.text(dataElementsFile)
>       .toDF("DataElement").coalesce(1).distinct().cache()
>
>     def calculate(de: Row) {
>       val dataElement = de.getAs[String]("DataElement").trim
>       val df1 = hiveContext.sql("SELECT cyc_dt, supplier_proc_i, '" + dataElement +
>         "' as data_elm, " + dataElement + " as data_elm_val FROM TEST_DB.TEST_TABLE1")
>       df1.write.insertInto("TEST_DB.TEST_TABLE1")
>     }
>
>     deDF.collect().foreach(calculate)
>   }
> }
>
>
> I looked at
> https://spark.apache.org/docs/1.6.0/api/scala/index.html#org.apache.spark.sql.hive.HiveContext
> and I see that it extends SQLContext, which extends Logging with Serializable.
>
> Can anyone tell me if this is the right way to use it ? Thanks for your time.
>
> Regards,
>
> Ajay
>
