Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14639#discussion_r75347400
  
    --- Diff: R/pkg/R/sparkR.R ---
    @@ -344,6 +344,7 @@ sparkRHive.init <- function(jsc = NULL) {
     #' @note sparkR.session since 2.0.0
     sparkR.session <- function(
       master = "",
    +  deployMode = "",
    --- End diff --
    
    Hmm, I think the standard way to do this in Spark is via the deploy
mode config:

    YARN cluster:
      master=yarn
      deploy-mode=cluster

    YARN client:
      master=yarn
      deploy-mode=client

    deploy-mode defaults to client.
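    From SparkR, the equivalent (assuming we keep the existing named-list
config approach rather than a new `deployMode` parameter) would look roughly
like the sketch below. Note this is a hedged illustration, not runnable
without a Spark installation and a YARN cluster, and `spark.submit.deployMode`
is the standard Spark property name for the deploy mode:

    ```r
    library(SparkR)

    # YARN client mode: the R process hosts the driver, so this can work
    # from an interactive R session.
    sparkR.session(
      master = "yarn",
      sparkConfig = list(spark.submit.deployMode = "client")
    )
    ```

    The cluster-mode equivalent (deployMode = "cluster") is exactly the case
questioned below: by the time this R code runs on the driver, the mode is
already decided.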
    
    In this case, as with the SparkSession API, does it make sense to
support deploy-mode cluster though? When running in YARN cluster mode, the
driver JVM starts first and then launches its companion R process. The R
SparkSession API would not be able to change the master or the deploy-mode
at the point the R code is running. (Unless we want to support a remote R
process connecting to the Spark driver running in the YARN cluster; that
would be a separate JIRA.)

    In all other cases (standalone, Mesos) only master is used.
    
    I think one could still set spark.deployMode in the named list (though
again it won't be effective).
    
    Perhaps we should just document in the programming guide how to use
the SparkSession R API with YARN?
    