[GitHub] spark pull request #16312: [SPARK-18862][SPARKR][ML] Split SparkR mllib.R in...

2017-01-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16312


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16312: [SPARK-18862][SPARKR][ML] Split SparkR mllib.R in...

2017-01-08 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/16312#discussion_r95074414
  
--- Diff: R/pkg/R/mllib_utils.R ---
@@ -81,9 +80,11 @@ predict_internal <- function(object, newData) {
 #' model <- read.ml(path)
 #' }
 #' @note read.ml since 2.0.0
-read.ml <- function(path, sparkSession = NULL) {
+read.ml <- function(path) {
   path <- suppressWarnings(normalizePath(path))
-  jobj <- callJStatic("org.apache.spark.ml.r.RWrappers", "load", path, 
sparkSession)
+  sparkSession <- getSparkSession()
+  callJStatic("org.apache.spark.ml.r.RWrappers", "session", sparkSession)
--- End diff --

Here we call ```session``` method to set spark session, the function name 
is really confusing, may be named as ```setSession``` should be better, but we 
use the current name from the very beginning.
There is no other places need to be updated, since here is the only place 
calling ```RWrappers```. Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16312: [SPARK-18862][SPARKR][ML] Split SparkR mllib.R in...

2017-01-07 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/16312#discussion_r95068898
  
--- Diff: R/pkg/R/mllib_utils.R ---
@@ -81,9 +80,11 @@ predict_internal <- function(object, newData) {
 #' model <- read.ml(path)
 #' }
 #' @note read.ml since 2.0.0
-read.ml <- function(path, sparkSession = NULL) {
+read.ml <- function(path) {
   path <- suppressWarnings(normalizePath(path))
-  jobj <- callJStatic("org.apache.spark.ml.r.RWrappers", "load", path, 
sparkSession)
+  sparkSession <- getSparkSession()
+  callJStatic("org.apache.spark.ml.r.RWrappers", "session", sparkSession)
--- End diff --

I'm a bit confused by this - what does this do?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16312: [SPARK-18862][SPARKR][ML] Split SparkR mllib.R in...

2017-01-07 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/16312#discussion_r95059202
  
--- Diff: R/pkg/R/mllib_utils.R ---
@@ -80,9 +81,9 @@ predict_internal <- function(object, newData) {
 #' model <- read.ml(path)
 #' }
 #' @note read.ml since 2.0.0
-read.ml <- function(path) {
+read.ml <- function(path, sparkSession = NULL) {
--- End diff --

Sounds good, updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16312: [SPARK-18862][SPARKR][ML] Split SparkR mllib.R in...

2017-01-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/16312#discussion_r95034128
  
--- Diff: R/pkg/R/mllib_utils.R ---
@@ -80,9 +81,9 @@ predict_internal <- function(object, newData) {
 #' model <- read.ml(path)
 #' }
 #' @note read.ml since 2.0.0
-read.ml <- function(path) {
+read.ml <- function(path, sparkSession = NULL) {
--- End diff --

we should get this from the current default session instead of making it a 
parameter, like here
https://github.com/apache/spark/blob/master/R/pkg/R/SQLContext.R#L136


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16312: [SPARK-18862][SPARKR][ML] Split SparkR mllib.R in...

2016-12-16 Thread yanboliang
GitHub user yanboliang opened a pull request:

https://github.com/apache/spark/pull/16312

[SPARK-18862][SPARKR][ML] Split SparkR mllib.R into multiple files

## What changes were proposed in this pull request?
SparkR ```mllib.R``` is getting bigger as we add more ML wrappers, I'd like 
to split it into multiple files to make us easy to maintain:
* mllib_classification.R
* mllib_clustering.R
* mllib_recommendation.R
* mllib_regression.R
* mllib_stat.R
* mllib_tree.R
* mllib_utils.R

Note: Only reorg, no actual code change.

## How was this patch tested?
Existing tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yanboliang/spark spark-18862

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16312.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16312


commit 319d1ed7a56ea4e307a860fccffe943f9f20051b
Author: Yanbo Liang 
Date:   2016-12-16T16:24:08Z

Split SparkR mllib.R into multiple files




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org