Running this gave:

16/01/12 04:06:54 INFO BlockManagerMaster: Registered BlockManager
Error in writeJobj(con, object) : invalid jobj 3

How does it know which hive schema to connect to?

--
Architect
Infoworks.io
http://Infoworks.io
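For reference, the HiveContext resolves the metastore (and therefore the schema) from hive-site.xml in $SPARK_HOME/conf, typically via hive.metastore.uris; without that file, Spark falls back to a local Derby metastore in the working directory. A minimal sketch against the Spark 1.x SparkR API, with "mydb" as a placeholder database name:

# The HiveContext picks up the metastore from hive-site.xml in $SPARK_HOME/conf;
# without it, Spark creates a local Derby metastore instead.
hivecontext <- sparkRHive.init(sc)

# Select the schema/database explicitly ("mydb" is a placeholder):
sql(hivecontext, "USE mydb")
showDF(sql(hivecontext, "SHOW TABLES"))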
On Tue, Jan 12, 2016 at 2:34 PM, Felix Cheung <felixcheun...@hotmail.com> wrote:

It looks like you have overwritten sc. Could you try this:

Sys.setenv(SPARK_HOME="/usr/hdp/current/spark-client")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)

sc <- sparkR.init()
hivecontext <- sparkRHive.init(sc)
df <- loadDF(hivecontext, "/data/ingest/sparktest1/", "orc")


On Tue, 12 Jan 2016 at 14:28 (+0530), Sandeep Khurana <sand...@infoworks.io> wrote (Re: sparkR ORC support):

The code is very simple; it is pasted below. hive-site.xml is already in the Spark conf directory. I still see this error after running the script below:

Error in writeJobj(con, object) : invalid jobj 3

script
=======
Sys.setenv(SPARK_HOME="/usr/hdp/current/spark-client")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)

sc <<- sparkR.init()
sc <<- sparkRHive.init()
hivecontext <<- sparkRHive.init(sc)
df <- loadDF(hivecontext, "/data/ingest/sparktest1/", "orc")
#View(df)


On Wed, Jan 6, 2016 at 11:08 PM, Felix Cheung <felixcheun...@hotmail.com> wrote:

Yes, as Yanbo suggested, it looks like there is something wrong with the sqlContext.

Could you forward us your code, please?


On Wed, Jan 6, 2016 at 5:52 AM, Yanbo Liang <yblia...@gmail.com> wrote:

You should ensure your sqlContext is a HiveContext:

sc <- sparkR.init()
sqlContext <- sparkRHive.init(sc)


On 2016-01-06 at 20:35 (GMT+08:00), Sandeep Khurana <sand...@infoworks.io> wrote:

Felix

I tried the option you suggested. It gave the error below. I am going to try the option suggested by Prem.

Error in writeJobj(con, object) : invalid jobj 1
8: stop("invalid jobj ", value$id)
7: writeJobj(con, object)
6: writeObject(con, a)
5: writeArgs(rc, args)
4: invokeJava(isStatic = TRUE, className, methodName, ...)
3: callJStatic("org.apache.spark.sql.api.r.SQLUtils", "loadDF", sqlContext, source, options)
2: read.df(sqlContext, filepath, "orc") at spark_api.R#108


On Wed, Jan 6, 2016 at 10:30 AM, Felix Cheung <felixcheun...@hotmail.com> wrote:

Firstly, I don't have ORC data to verify, but this should work:

df <- loadDF(sqlContext, "data/path", "orc")

Secondly, could you check whether sparkR.stop() was called? sparkRHive.init() should be called after sparkR.init() - please check if there is any error message there.


On Tuesday, January 5, 2016 at 8:12 AM, Prem Sure <premsure...@gmail.com> wrote (Re: sparkR ORC support):

Yes Sandeep, and also copy hive-site.xml to the Spark conf directory.


On Tue, Jan 5, 2016 at 10:07 AM, Sandeep Khurana <sand...@infoworks.io> wrote:

Also, do I need to set up Hive in Spark as per http://stackoverflow.com/questions/26360725/accesing-hive-tables-in-spark ?

We might need to copy the hdfs-site.xml file to the Spark conf directory?
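As an aside on the two config files mentioned above: both can be copied into Spark's conf directory from R itself. A hedged sketch; the /etc/hive/conf and /etc/hadoop/conf source paths are assumptions (typical HDP locations), so adjust them to your cluster layout:

# Copy the Hive and HDFS client configs into Spark's conf directory.
# The source paths below are assumptions (typical on HDP); adjust as needed.
spark_conf <- file.path(Sys.getenv("SPARK_HOME"), "conf")
file.copy("/etc/hive/conf/hive-site.xml", spark_conf, overwrite = TRUE)
file.copy("/etc/hadoop/conf/hdfs-site.xml", spark_conf, overwrite = TRUE)

The Spark context then has to be recreated for the new config to be picked up.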
On Tue, Jan 5, 2016 at 8:28 PM, Sandeep Khurana <sand...@infoworks.io> wrote:

Deepak

Tried this. Getting this error now:

Error in sql(hivecontext, "FROM CATEGORIES SELECT category_id", "") : unused argument ("")


On Tue, Jan 5, 2016 at 6:48 PM, Deepak Sharma <deepakmc...@gmail.com> wrote:

Hi Sandeep

Can you try this?

results <- sql(hivecontext, "FROM test SELECT id", "")

Thanks
Deepak


On Tue, Jan 5, 2016 at 5:49 PM, Sandeep Khurana <sand...@infoworks.io> wrote:

Thanks Deepak.

I tried this as well. I created a hivecontext with "hivecontext <<- sparkRHive.init(sc)".

When I tried to read a hive table from it with

results <- sql(hivecontext, "FROM test SELECT id")

I got the error below:

Error in callJMethod(sqlContext, "sql", sqlQuery) : Invalid jobj 2. If SparkR was restarted, Spark operations need to be re-executed.

Not sure what is causing this. Any leads or ideas? I am using RStudio.


On Tue, Jan 5, 2016 at 5:35 PM, Deepak Sharma <deepakmc...@gmail.com> wrote:

Hi Sandeep

I am not sure if ORC can be read directly in R, but there is a workaround: first create a hive table on top of the ORC files, and then access that hive table in R.

Thanks
Deepak


On Tue, Jan 5, 2016 at 4:57 PM, Sandeep Khurana <sand...@infoworks.io> wrote:

Hello

I need to read ORC files in HDFS in R using Spark. I am not able to find a package to do that.

Can anyone help with documentation or an example for this purpose?
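Pulling the thread together: the repeated "invalid jobj" errors are what SparkR reports when an R-side handle to a Java object has gone stale, e.g. after sparkR.stop() or, as Felix suggested above, after sc was overwritten by a second init call. A minimal end-to-end sketch, assuming the Spark 1.x SparkR API, hive-site.xml already in $SPARK_HOME/conf, and the ORC path from the thread:

Sys.setenv(SPARK_HOME = "/usr/hdp/current/spark-client")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)

sc <- sparkR.init()                 # create the SparkContext exactly once
hivecontext <- sparkRHive.init(sc)  # derive the HiveContext from it; do not reassign sc

# loadDF(sqlContext, path, source) reads the ORC files directly:
df <- loadDF(hivecontext, "/data/ingest/sparktest1/", "orc")
head(df)

If the R session is restarted (or sparkR.stop() is called), every context and DataFrame handle must be recreated before reuse.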