[ 
https://issues.apache.org/jira/browse/SPARK-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184806#comment-15184806
 ] 

Sun Rui commented on SPARK-13573:
---------------------------------

[~chipsenkbeil] It seems that Toree runs the SparkR kernel and the RBackend in the 
same JVM, an interesting architecture that the SparkR design has not 
considered. I am not sure the SparkR community can be convinced to adopt the 
proposed changes, as the R-JVM bridge and RBackend APIs are intended for 
internal use. Making them public would oblige us to keep those APIs 
stable, which may limit our future evolution.

One possible solution for Toree could be:
1. Use the SparkR::: prefix to access all private methods;
2. Move SparkR.connect() to your sparkr_runner.R;
3. Keep using the existing ReflectiveRBackend to access RBackend.

Then, in general, you would not need to maintain a fork of SparkR.
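
Step 3 relies on JVM reflection to reach a class that is not part of the public API. The following is only a minimal sketch of that technique, not Toree's actual ReflectiveRBackend: HiddenBackend is a hypothetical stand-in so the example runs without Spark on the classpath, and invokeHidden is an illustrative helper name. In a real setup the class name would presumably be org.apache.spark.api.r.RBackend, and the sketch assumes the target has a no-argument constructor and a no-argument method to call, which may not hold for every Spark version.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;

public class ReflectiveBackendSketch {

    // Hypothetical stand-in for a non-public backend class.
    // It is NOT Spark's RBackend; it only mimics a hidden API.
    static final class HiddenBackend {
        private HiddenBackend() {}

        // Pretends to start a backend and return the port it listens on.
        private int init() {
            return 12345;
        }
    }

    // Looks up a class by name, creates an instance through its
    // no-argument constructor, and calls a no-argument method on it,
    // bypassing Java access checks in both steps.
    static Object invokeHidden(String className, String methodName) throws Exception {
        Class<?> clazz = Class.forName(className);

        Constructor<?> ctor = clazz.getDeclaredConstructor();
        ctor.setAccessible(true);
        Object instance = ctor.newInstance();

        Method method = clazz.getDeclaredMethod(methodName);
        method.setAccessible(true);
        return method.invoke(instance);
    }

    public static void main(String[] args) throws Exception {
        // With Spark available, the first argument would be the name of
        // the real backend class (an assumption, see the note above).
        Object port = invokeHidden(HiddenBackend.class.getName(), "init");
        System.out.println("backend port: " + port);
    }
}
```

The trade-off is the one raised above: because the target is internal, a renamed class or changed method signature only shows up at run time, as a ClassNotFoundException or NoSuchMethodException, rather than at compile time.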

[~shivaram] any comments?


> Open SparkR APIs (R package) to allow better 3rd party usage
> ------------------------------------------------------------
>
>                 Key: SPARK-13573
>                 URL: https://issues.apache.org/jira/browse/SPARK-13573
>             Project: Spark
>          Issue Type: Improvement
>          Components: SparkR
>            Reporter: Chip Senkbeil
>
> Currently, SparkR's R package does not expose enough of its APIs to be used 
> flexibly. As far as I am aware, SparkR still requires you to create a new 
> SparkContext by invoking the sparkR.init method (so you cannot connect to a 
> running one), and there is no way to invoke custom Java methods using the 
> exposed SparkR API (unlike PySpark).
> We currently maintain a fork of SparkR that is used to power the R 
> implementation of Apache Toree, which is a gateway to use Apache Spark. This 
> fork provides a connect method (to use an existing Spark Context), exposes 
> needed methods like invokeJava (to be able to communicate with our JVM to 
> retrieve code to run, etc), and uses reflection to access 
> org.apache.spark.api.r.RBackend.
> Here is the documentation I recorded regarding changes we need to enable 
> SparkR as an option for Apache Toree: 
> https://github.com/apache/incubator-toree/tree/master/sparkr-interpreter/src/main/resources



