Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14639#discussion_r75347400

    --- Diff: R/pkg/R/sparkR.R ---
    @@ -344,6 +344,7 @@ sparkRHive.init <- function(jsc = NULL) {
     #' @note sparkR.session since 2.0.0
     sparkR.session <- function(
       master = "",
    +  deployMode = "",
    --- End diff --

    Hmm, I think the standard way to do this in Spark is to use the deploy mode config:

        YARN cluster: master=yarn, deploy-mode=cluster
        YARN client:  master=yarn, deploy-mode=client

    deploy-mode defaults to client.

    That said, does it make sense to support deploy-mode cluster on the SparkSession API? When running in YARN cluster mode, the driver JVM starts first and then launches its companion R process, so the R SparkSession API would not be able to change the master or the deploy mode by the time the R code is running. (Unless we want to support a remote R process connecting to a Spark driver running in the YARN cluster - that would be another JIRA.)

    In every other case (standalone, mesos), only master is used.

    I think one could still set spark.submit.deployMode in the named list today (though, again, it won't be effective). Perhaps we should just document in the programming guide how to use the SparkSession R API with YARN?
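    To make the client-vs-cluster distinction concrete, here is a minimal sketch of what this could look like from R. The spark.submit.deployMode config key is the existing Spark property; the deployMode parameter is the one proposed in this diff and is hypothetical until merged:

        library(SparkR)

        # YARN client mode: the R process launches the driver locally, so the
        # session can still choose the master and deploy mode at this point.
        sparkR.session(master = "yarn",
                       sparkConfig = list(spark.submit.deployMode = "client"))

        # With the parameter proposed in this diff (hypothetical until merged):
        # sparkR.session(master = "yarn", deployMode = "client")

        # YARN cluster mode would not work the same way: by the time this R
        # code runs on the cluster, the driver JVM is already up, so setting
        # the master or deploy mode here can no longer take effect.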