Thanks for the response, Sean. I have seen the NPE on similar issues very consistently and assumed that could be the reason :) Thanks for clarifying.

Regards,
Sunita
On Tue, Oct 25, 2016 at 10:11 PM, Sean Owen <so...@cloudera.com> wrote:
> This usage is fine, because you are only using the HiveContext locally on
> the driver. It's applied in a function that's used on a Scala collection.
>
> You can't use the HiveContext or SparkContext in a distributed operation.
> It has nothing to do with for loops.
>
> The fact that they're serializable is misleading. It's there, I believe,
> because these objects may be inadvertently referenced in the closure of a
> function that executes remotely, yet doesn't use the context. The closure
> cleaner can't always remove this reference. The task would fail to
> serialize even though it doesn't use the context. You will find these
> objects serialize but then don't work if used remotely.
>
> The NPE you see is an unrelated cosmetic problem that was fixed in 2.0.1
> IIRC.
>
>
> On Wed, Oct 26, 2016 at 4:28 AM Ajay Chander <itsche...@gmail.com> wrote:
>
>> Hi Everyone,
>>
>> I was thinking if I can use hiveContext inside foreach like below,
>>
>> object Test {
>>   def main(args: Array[String]): Unit = {
>>
>>     val conf = new SparkConf()
>>     val sc = new SparkContext(conf)
>>     val hiveContext = new HiveContext(sc)
>>
>>     val dataElementsFile = args(0)
>>     val deDF = hiveContext.read.text(dataElementsFile)
>>       .toDF("DataElement").coalesce(1).distinct().cache()
>>
>>     def calculate(de: Row) {
>>       val dataElement = de.getAs[String]("DataElement").trim
>>       val df1 = hiveContext.sql("SELECT cyc_dt, supplier_proc_i, '" +
>>         dataElement + "' as data_elm, " + dataElement +
>>         " as data_elm_val FROM TEST_DB.TEST_TABLE1 ")
>>       df1.write.insertInto("TEST_DB.TEST_TABLE1")
>>     }
>>
>>     deDF.collect().foreach(calculate)
>>   }
>> }
>>
>> I looked at
>> https://spark.apache.org/docs/1.6.0/api/scala/index.html#org.apache.spark.sql.hive.HiveContext
>> and I see it extends SQLContext, which extends Logging with
>> Serializable.
>>
>> Can anyone tell me if this is the right way to use it? Thanks for your time.
>>
>> Regards,
>>
>> Ajay
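
To make Sean's distinction concrete, here is a minimal sketch (database, table, and object names are hypothetical, not from the thread): collecting the rows to the driver and then looping over the resulting plain Scala collection keeps all HiveContext use on the driver, whereas referencing the context inside an RDD foreach ships it in the closure to the executors, where it does not work even though it serializes.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object ContextUsageSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("sketch"))
    val hiveContext = new HiveContext(sc)

    val deDF = hiveContext.read.text(args(0)).toDF("DataElement")

    // OK: collect() brings the Rows back to the driver, so the foreach
    // below is an ordinary Scala loop and hiveContext is only ever
    // used driver-side -- the pattern in Ajay's code.
    deDF.collect().foreach { row =>
      val elem = row.getAs[String]("DataElement").trim
      hiveContext
        .sql(s"SELECT '$elem' AS data_elm FROM some_db.some_table")
        .write.insertInto("some_db.target_table")
    }

    // NOT OK: this foreach is a distributed operation that runs on the
    // executors. The closure captures hiveContext, which serializes
    // (SQLContext extends Serializable) but fails when used remotely.
    // deDF.rdd.foreach { row =>
    //   hiveContext.sql("...")  // breaks at runtime on the executor
    // }
  }
}
```

The serializability of the context exists, as Sean notes, so that tasks which merely capture it by accident can still serialize; it is not an invitation to call it from executors.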