[ https://issues.apache.org/jira/browse/SPARK-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15429942#comment-15429942 ]
Xiangrui Meng commented on SPARK-16578:
---------------------------------------

[~shivaram] I had an offline discussion with [~junyangq], and I think we may have a misunderstanding about the user scenarios. The old workflow for SparkR is the following:

1. Users download and install a Spark distribution themselves.
2. Users tell R where to find the local SparkR package.
3. `library(SparkR)`
4. Launch the driver/SparkContext (in client mode) and connect to a local or remote cluster.

The ideal workflow is the following:

1. `install.packages("SparkR")`
2. Optionally `install.spark()`
3. Launch the driver/SparkContext (in client mode) and connect to a local or remote cluster.

So the way we run spark-submit, RBackend, and the R process, and the way we create the SparkContext, doesn't really change: they still run on the same machine (e.g., the user's laptop). It is therefore not necessary to make RBackend run remotely for this scenario. Running RBackend remotely would be a new Spark deployment mode, and I think it requires more design and discussion.

> Configurable hostname for RBackend
> ----------------------------------
>
> Key: SPARK-16578
> URL: https://issues.apache.org/jira/browse/SPARK-16578
> Project: Spark
> Issue Type: Sub-task
> Components: SparkR
> Reporter: Shivaram Venkataraman
> Assignee: Junyang Qian
>
> One of the requirements that comes up with SparkR being a standalone package is that users can now install just the R package on the client side and connect to a remote machine which runs the RBackend class.
> We should check if we can support this mode of execution and what the pros / cons of it are.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
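The "ideal" client-mode workflow described in the comment above can be sketched roughly as follows in an R session. This is an illustrative sketch only: it assumes SparkR is available from CRAN and that `install.spark()` can download a matching Spark distribution; the master URL and app name are placeholders.

```r
# Hypothetical sketch of the ideal SparkR workflow (client mode).
# Step 1: install just the R package (no Spark distribution needed yet).
install.packages("SparkR")
library(SparkR)

# Step 2 (optional): download a Spark distribution that matches the
# SparkR package version; this replaces the old manual download step.
install.spark()

# Step 3: launch the driver/SparkContext on this machine (client mode).
# The master URL could also point at a remote cluster, e.g. a
# standalone master or YARN; "local[*]" is just an illustrative default.
sparkR.session(master = "local[*]", appName = "sketch")

# Minimal sanity check: create a Spark DataFrame from a local data set.
df <- as.DataFrame(faithful)
head(df)

sparkR.session.stop()
```

Note that spark-submit, RBackend, and the R process all still run on the same machine in this flow; only the installation path changes.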