[GitHub] spark pull request #16312: [SPARK-18862][SPARKR][ML] Split SparkR mllib.R in...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16312 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #16312: [SPARK-18862][SPARKR][ML] Split SparkR mllib.R in...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/16312#discussion_r95074414 --- Diff: R/pkg/R/mllib_utils.R --- @@ -81,9 +80,11 @@ predict_internal <- function(object, newData) { #' model <- read.ml(path) #' } #' @note read.ml since 2.0.0 -read.ml <- function(path, sparkSession = NULL) { +read.ml <- function(path) { path <- suppressWarnings(normalizePath(path)) - jobj <- callJStatic("org.apache.spark.ml.r.RWrappers", "load", path, sparkSession) + sparkSession <- getSparkSession() + callJStatic("org.apache.spark.ml.r.RWrappers", "session", sparkSession) --- End diff -- Here we call ```session``` method to set spark session, the function name is really confusing, may be named as ```setSession``` should be better, but we use the current name from the very beginning. There is no other places need to be updated, since here is the only place calling ```RWrappers```. Thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #16312: [SPARK-18862][SPARKR][ML] Split SparkR mllib.R in...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16312#discussion_r95068898 --- Diff: R/pkg/R/mllib_utils.R --- @@ -81,9 +80,11 @@ predict_internal <- function(object, newData) { #' model <- read.ml(path) #' } #' @note read.ml since 2.0.0 -read.ml <- function(path, sparkSession = NULL) { +read.ml <- function(path) { path <- suppressWarnings(normalizePath(path)) - jobj <- callJStatic("org.apache.spark.ml.r.RWrappers", "load", path, sparkSession) + sparkSession <- getSparkSession() + callJStatic("org.apache.spark.ml.r.RWrappers", "session", sparkSession) --- End diff -- I'm a bit confused by this - what does this do? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #16312: [SPARK-18862][SPARKR][ML] Split SparkR mllib.R in...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/16312#discussion_r95059202 --- Diff: R/pkg/R/mllib_utils.R --- @@ -80,9 +81,9 @@ predict_internal <- function(object, newData) { #' model <- read.ml(path) #' } #' @note read.ml since 2.0.0 -read.ml <- function(path) { +read.ml <- function(path, sparkSession = NULL) { --- End diff -- Sounds good, updated. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #16312: [SPARK-18862][SPARKR][ML] Split SparkR mllib.R in...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16312#discussion_r95034128 --- Diff: R/pkg/R/mllib_utils.R --- @@ -80,9 +81,9 @@ predict_internal <- function(object, newData) { #' model <- read.ml(path) #' } #' @note read.ml since 2.0.0 -read.ml <- function(path) { +read.ml <- function(path, sparkSession = NULL) { --- End diff -- we should get this from the current default session instead of making it a parameter, like here https://github.com/apache/spark/blob/master/R/pkg/R/SQLContext.R#L136 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #16312: [SPARK-18862][SPARKR][ML] Split SparkR mllib.R in...
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/16312 [SPARK-18862][SPARKR][ML] Split SparkR mllib.R into multiple files ## What changes were proposed in this pull request? SparkR ```mllib.R``` is getting bigger as we add more ML wrappers, I'd like to split it into multiple files to make us easy to maintain: * mllib_classification.R * mllib_clustering.R * mllib_recommendation.R * mllib_regression.R * mllib_stat.R * mllib_tree.R * mllib_utils.R Note: Only reorg, no actual code change. ## How was this patch tested? Existing tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/yanboliang/spark spark-18862 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16312.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16312 commit 319d1ed7a56ea4e307a860fccffe943f9f20051b Author: Yanbo Liang Date: 2016-12-16T16:24:08Z Split SparkR mllib.R into multiple files --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org