Hello,

I developed a Spark application for executing ad-hoc queries on in-memory
Hive tables, so I am using both a SparkContext and a HiveContext. During
application startup, some Hive tables are automatically loaded into memory
within the HiveContext. Using the sql() method of the DataFrame API, I can
execute SQL queries against the in-memory tables and show the result set
in a UI.
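
For context, the startup and query flow described above might look roughly
like the following sketch on the driver side (the app name, table name
"sales", and the example query are illustrative assumptions, not my actual
setup):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// One SparkContext and one HiveContext for the whole application.
val conf = new SparkConf().setAppName("AdhocQueryApp")
val sc = new SparkContext(conf)
val hiveContext = new HiveContext(sc)

// At startup: pin a Hive table into memory (hypothetical table name).
hiveContext.cacheTable("sales")

// Later: run ad-hoc SQL against the cached table and hand the
// result set to the UI layer.
val df = hiveContext.sql(
  "SELECT region, SUM(amount) FROM sales GROUP BY region")
df.collect().foreach(println)
```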

Now I want to execute R queries on the same in-memory dataset. The SparkR
code I use looks like this:

// start the Spark context
re.eval("sc <- sparkR.init('yarn-client', 'SparkRServer')");

// start the Hive context
re.eval("hiveContext <- sparkRHive.init(sc)");

The code works, but it starts a new Spark driver with its own Spark and
Hive contexts. Because of that, it is not possible to execute the R
queries against my in-memory dataset. One solution would be to load the
Hive tables into memory again, but I don't want to waste RAM by keeping
two identical in-memory copies of the same tables.

So my question is: is it possible to start SparkR with the same Spark and
Hive contexts that I am already using for my ad-hoc application?

Kind regards
Tobias
