Hello, I developed a Spark application for executing ad-hoc queries on in-memory Hive tables, so I am using both a SparkContext and a HiveContext. During application startup, some Hive tables are automatically loaded into memory within the HiveContext. Using the sql() method of the DataFrame API, I can execute SQL queries against the in-memory tables and show the result set in a UI.
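For context, here is a minimal sketch of what that setup looks like (the table name and query are hypothetical, assuming Spark 1.x with a HiveContext created inside the application driver):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.DataFrame;
    import org.apache.spark.sql.hive.HiveContext;

    public class AdhocQueryApp {
        public static void main(String[] args) {
            // Create the Spark and Hive contexts shared by the application
            SparkConf conf = new SparkConf().setAppName("AdhocQueryApp");
            JavaSparkContext sc = new JavaSparkContext(conf);
            HiveContext hiveContext = new HiveContext(sc.sc());

            // During startup, cache a Hive table in memory ("my_table" is a placeholder)
            hiveContext.sql("CACHE TABLE my_table");

            // Execute an ad-hoc SQL query against the cached table via the DataFrame API
            DataFrame result = hiveContext.sql("SELECT * FROM my_table LIMIT 10");
            result.show();
        }
    }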
Now I want to execute R queries on the same in-memory dataset. The SparkR code I use looks like this:

    // start Spark Context
    re.eval("sc <- sparkR.init('yarn-client', 'SparkRServer')");
    // start Hive Context
    re.eval("hiveContext <- sparkRHive.init(sc)");

The code works, but it starts a new Spark driver with a new SparkContext and HiveContext. Because of that, it is not possible to execute the R queries against my in-memory dataset. One solution would be to load the Hive tables into memory again, but I don't want to waste RAM by keeping two identical in-memory copies of the same tables.

So my question is: is it possible to start SparkR with the same SparkContext and HiveContext that my ad-hoc application is already using?

Kind regards,
Tobias