[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/15883


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-15 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/15883#discussion_r88126472
  
--- Diff: R/pkg/R/mllib.R ---
@@ -870,7 +872,7 @@ setMethod("summary", signature(object = "LogisticRegressionModel"),
 #' @param ... additional arguments passed to the method.
 #' @return \code{spark.mlp} returns a fitted Multilayer Perceptron Classification Model.
 #' @rdname spark.mlp
-#' @aliases spark.mlp,SparkDataFrame-method
+#' @aliases spark.mlp,SparkDataFrame-method,formula-method
--- End diff --

This should be `#' @aliases spark.mlp,SparkDataFrame,formula-method`, with one `-method` at the end.





[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-15 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/15883#discussion_r88126345
  
--- Diff: R/pkg/R/mllib.R ---
@@ -896,9 +898,10 @@ setMethod("summary", signature(object = "LogisticRegressionModel"),
 #' summary(savedModel)
 #' }
 #' @note spark.mlp since 2.1.0
-setMethod("spark.mlp", signature(data = "SparkDataFrame"),
-  function(data, layers, blockSize = 128, solver = "l-bfgs", maxIter = 100,
+setMethod("spark.mlp", signature(data = "SparkDataFrame", formula = "formula"),
+  function(data, formula, layers, blockSize = 128, solver = "l-bfgs", maxIter = 100,
--- End diff --

Right, I can see that. Basically it makes assumptions about the names of the feature & label columns.





[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-15 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/15883#discussion_r87965522
  
--- Diff: R/pkg/R/mllib.R ---
@@ -936,20 +939,23 @@ setMethod("predict", signature(object = "MultilayerPerceptronClassificationModel
 # Returns the summary of a Multilayer Perceptron Classification Model produced by \code{spark.mlp}
 
 #' @param object a Multilayer Perceptron Classification Model fitted by \code{spark.mlp}
-#' @return \code{summary} returns a list containing \code{labelCount}, \code{layers}, and
-#' \code{weights}. For \code{weights}, it is a numeric vector with length equal to
-#' the expected given the architecture (i.e., for 8-10-2 network, 100 connection weights).
+#' @return \code{summary} returns a list containing \code{numOfInputs}, \code{numOfOutputs},
+#' \code{layers}, and \code{weights}. For \code{weights}, it is a numeric vector with
+#' length equal to the expected given the architecture (i.e., for 8-10-2 network,
+#' 112 connection weights).
--- End diff --

This should be 112 rather than 100; it should be calculated as 8 * 10 + 10 * 2 + 10 + 2. You can refer to the test case to verify it.
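The arithmetic behind that count (one weight per connection between adjacent layers, plus one bias per non-input unit) can be sketched as follows; the helper name is mine, not from the PR:

```python
def mlp_weight_count(layers):
    # For each adjacent layer pair (a, b): a*b connection weights plus b bias terms.
    return sum(a * b + b for a, b in zip(layers, layers[1:]))

print(mlp_weight_count([8, 10, 2]))    # 8*10 + 10 + 10*2 + 2 = 112
print(mlp_weight_count([4, 5, 4, 3]))  # 64, matching the test suite's expected weight length
```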





[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-15 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/15883#discussion_r87965291
  
--- Diff: R/pkg/inst/tests/testthat/test_mllib.R ---
@@ -395,46 +396,56 @@ test_that("spark.mlp", {
   model2 <- read.ml(modelPath)
   summary2 <- summary(model2)
 
-  expect_equal(summary2$labelCount, 3)
+  expect_equal(summary2$numOfInputs, 4)
+  expect_equal(summary2$numOfOutputs, 3)
   expect_equal(summary2$layers, c(4, 5, 4, 3))
   expect_equal(length(summary2$weights), 64)
 
   unlink(modelPath)
 
   # Test default parameter
-  model <- spark.mlp(df, layers = c(4, 5, 4, 3))
+  model <- spark.mlp(df, label ~ features, layers = c(4, 5, 4, 3))
   mlpPredictions <- collect(select(predict(model, mlpTestDF), "prediction"))
-  expect_equal(head(mlpPredictions$prediction, 10), c(1, 1, 1, 1, 0, 1, 2, 2, 1, 0))
+  expect_equal(head(mlpPredictions$prediction, 10),
+               c("1.0", "1.0", "1.0", "1.0", "0.0", "1.0", "2.0", "2.0", "1.0", "0.0"))
 
   # Test illegal parameter
-  expect_error(spark.mlp(df, layers = NULL), "layers must be a integer vector with length > 1.")
-  expect_error(spark.mlp(df, layers = c()), "layers must be a integer vector with length > 1.")
-  expect_error(spark.mlp(df, layers = c(3)), "layers must be a integer vector with length > 1.")
+  expect_error(spark.mlp(df, label ~ features, layers = NULL),
+               "layers must be a integer vector with length > 1.")
+  expect_error(spark.mlp(df, label ~ features, layers = c()),
+               "layers must be a integer vector with length > 1.")
+  expect_error(spark.mlp(df, label ~ features, layers = c(3)),
--- End diff --

Added unit test for ```spark.mlp``` with R formula.





[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-15 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/15883#discussion_r87965128
  
--- Diff: R/pkg/R/mllib.R ---
@@ -896,9 +898,10 @@ setMethod("summary", signature(object = "LogisticRegressionModel"),
 #' summary(savedModel)
 #' }
 #' @note spark.mlp since 2.1.0
--- End diff --

Yeah, I think so.





[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-15 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/15883#discussion_r87965102
  
--- Diff: R/pkg/R/mllib.R ---
@@ -896,9 +898,10 @@ setMethod("summary", signature(object = "LogisticRegressionModel"),
 #' summary(savedModel)
 #' }
 #' @note spark.mlp since 2.1.0
-setMethod("spark.mlp", signature(data = "SparkDataFrame"),
-  function(data, layers, blockSize = 128, solver = "l-bfgs", maxIter = 100,
+setMethod("spark.mlp", signature(data = "SparkDataFrame", formula = "formula"),
+  function(data, formula, layers, blockSize = 128, solver = "l-bfgs", maxIter = 100,
--- End diff --

No, we must have ```formula```. The original PR is very tricky and can even be regarded as a bug: it assumed the feature and label columns were named "features"/"label", but users should be able to specify their own feature and label columns via an R formula. Note that almost all ML wrappers support R formula.





[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-15 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/15883#discussion_r87964394
  
--- Diff: R/pkg/inst/tests/testthat/test_mllib.R ---
@@ -385,7 +386,7 @@ test_that("spark.mlp", {
   # Test predict method
   mlpTestDF <- df
   mlpPredictions <- collect(select(predict(model, mlpTestDF), "prediction"))
-  expect_equal(head(mlpPredictions$prediction, 6), c(0, 1, 1, 1, 1, 1))
+  expect_equal(head(mlpPredictions$prediction, 6), c("1.0", "0.0", "0.0", "0.0", "0.0", "0.0"))
--- End diff --

Yes, I think it is. But if the label of the training data is numeric, the output prediction will still be character. This may mislead some users, but I think it's OK if we clarify it in the documentation. This issue is not very relevant to this PR; we can also discuss whether the behavior is acceptable for SparkR users in a separate thread.
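The user-facing consequence is that predictions come back as label strings such as "1.0" even when the training labels were numeric, and users must cast explicitly (in R this would be `as.numeric`). A language-neutral sketch with hypothetical values:

```python
# Predictions are returned as character labels, one string per row.
predictions = ["1.0", "1.0", "0.0", "2.0"]

# A user who wants numbers casts them back explicitly.
numeric = [float(p) for p in predictions]
print(numeric)  # [1.0, 1.0, 0.0, 2.0]
```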





[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-14 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/15883#discussion_r87924839
  
--- Diff: R/pkg/inst/tests/testthat/test_mllib.R ---
@@ -395,46 +396,56 @@ test_that("spark.mlp", {
   model2 <- read.ml(modelPath)
   summary2 <- summary(model2)
 
-  expect_equal(summary2$labelCount, 3)
+  expect_equal(summary2$numOfInputs, 4)
+  expect_equal(summary2$numOfOutputs, 3)
   expect_equal(summary2$layers, c(4, 5, 4, 3))
   expect_equal(length(summary2$weights), 64)
 
   unlink(modelPath)
 
   # Test default parameter
-  model <- spark.mlp(df, layers = c(4, 5, 4, 3))
+  model <- spark.mlp(df, label ~ features, layers = c(4, 5, 4, 3))
   mlpPredictions <- collect(select(predict(model, mlpTestDF), "prediction"))
-  expect_equal(head(mlpPredictions$prediction, 10), c(1, 1, 1, 1, 0, 1, 2, 2, 1, 0))
+  expect_equal(head(mlpPredictions$prediction, 10),
+               c("1.0", "1.0", "1.0", "1.0", "0.0", "1.0", "2.0", "2.0", "1.0", "0.0"))
 
   # Test illegal parameter
-  expect_error(spark.mlp(df, layers = NULL), "layers must be a integer vector with length > 1.")
-  expect_error(spark.mlp(df, layers = c()), "layers must be a integer vector with length > 1.")
-  expect_error(spark.mlp(df, layers = c(3)), "layers must be a integer vector with length > 1.")
+  expect_error(spark.mlp(df, label ~ features, layers = NULL),
+               "layers must be a integer vector with length > 1.")
+  expect_error(spark.mlp(df, label ~ features, layers = c()),
+               "layers must be a integer vector with length > 1.")
+  expect_error(spark.mlp(df, label ~ features, layers = c(3)),
--- End diff --

Is there a case for a formula other than `label ~ features`?
Link to my comment above: https://github.com/apache/spark/pull/15883/files#r87923913





[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-14 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/15883#discussion_r87923954
  
--- Diff: R/pkg/R/mllib.R ---
@@ -896,9 +898,10 @@ setMethod("summary", signature(object = "LogisticRegressionModel"),
 #' summary(savedModel)
 #' }
 #' @note spark.mlp since 2.1.0
--- End diff --

We are targeting 2.1.0 for this change, yes? Otherwise it is a breaking signature change.





[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-14 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/15883#discussion_r87923913
  
--- Diff: R/pkg/R/mllib.R ---
@@ -896,9 +898,10 @@ setMethod("summary", signature(object = "LogisticRegressionModel"),
 #' summary(savedModel)
 #' }
 #' @note spark.mlp since 2.1.0
-setMethod("spark.mlp", signature(data = "SparkDataFrame"),
-  function(data, layers, blockSize = 128, solver = "l-bfgs", maxIter = 100,
+setMethod("spark.mlp", signature(data = "SparkDataFrame", formula = "formula"),
+  function(data, formula, layers, blockSize = 128, solver = "l-bfgs", maxIter = 100,
--- End diff --

If it worked without `formula` before, should `formula` be optional? With this change it will be required.





[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-14 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/15883#discussion_r87923822
  
--- Diff: R/pkg/R/mllib.R ---
@@ -896,9 +898,10 @@ setMethod("summary", signature(object = 
"LogisticRegressionModel"),
 #' summary(savedModel)
 #' }
 #' @note spark.mlp since 2.1.0
-setMethod("spark.mlp", signature(data = "SparkDataFrame"),
-  function(data, layers, blockSize = 128, solver = "l-bfgs", 
maxIter = 100,
+setMethod("spark.mlp", signature(data = "SparkDataFrame", formula = 
"formula"),
+  function(data, formula, layers, blockSize = 128, solver = 
"l-bfgs", maxIter = 100,
tol = 1E-6, stepSize = 0.03, seed = NULL, 
initialWeights = NULL) {
+formula <- paste(deparse(formula), collapse = "")
--- End diff --

Should this use `paste0`, i.e. `paste0(deparse(formula), collapse = "")`? Otherwise you get one space between each term.
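The concern here is what happens at the seam when `deparse` splits a long formula across multiple lines and the pieces are glued back together. A language-neutral sketch of the two joining behaviors (the formula text is hypothetical):

```python
# Hypothetical pieces of a long formula, as deparse might split it.
pieces = ["label ~ feature1 +", "feature2"]

# Joining with a space inserts one space at each seam.
with_space = " ".join(pieces)
print(with_space)  # label ~ feature1 + feature2

# Joining with no separator fuses the pieces directly.
no_space = "".join(pieces)
print(no_space)    # label ~ feature1 +feature2
```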





[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-14 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/15883#discussion_r87828863
  
--- Diff: R/pkg/inst/tests/testthat/test_mllib.R ---
@@ -385,7 +386,7 @@ test_that("spark.mlp", {
   # Test predict method
   mlpTestDF <- df
   mlpPredictions <- collect(select(predict(model, mlpTestDF), "prediction"))
-  expect_equal(head(mlpPredictions$prediction, 6), c(0, 1, 1, 1, 1, 1))
+  expect_equal(head(mlpPredictions$prediction, 6), c("1.0", "0.0", "0.0", "0.0", "0.0", "0.0"))
--- End diff --

Currently, SparkR ML classification algorithm wrappers assume all labels are character, so the prediction is character as well. Users can cast it to numeric if applicable.





[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-14 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/15883#discussion_r87829586
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/r/MultilayerPerceptronClassifierWrapper.scala ---
@@ -73,25 +90,25 @@ private[r] object MultilayerPerceptronClassifierWrapper
       .setMaxIter(maxIter)
       .setTol(tol)
       .setStepSize(stepSize)
-      .setPredictionCol(PREDICTED_LABEL_COL)
+      .setFeaturesCol(rFormula.getFeaturesCol)
+      .setLabelCol(rFormula.getLabelCol)
+      .setPredictionCol(PREDICTED_LABEL_INDEX_COL)
     if (seed != null && seed.length > 0) mlp.setSeed(seed.toInt)
     if (initialWeights != null) {
       require(initialWeights.length > 0)
       mlp.setInitialWeights(Vectors.dense(initialWeights))
     }
 
+    val idxToStr = new IndexToString()
+      .setInputCol(PREDICTED_LABEL_INDEX_COL)
+      .setOutputCol(PREDICTED_LABEL_COL)
+      .setLabels(labels)
+
     val pipeline = new Pipeline()
-      .setStages(Array(mlp))
+      .setStages(Array(rFormulaModel, mlp, idxToStr))
       .fit(data)
 
-    val multilayerPerceptronClassificationModel: MultilayerPerceptronClassificationModel =
-      pipeline.stages.head.asInstanceOf[MultilayerPerceptronClassificationModel]
-
-    val weights = multilayerPerceptronClassificationModel.weights.toArray
-    val layersFromPipeline = multilayerPerceptronClassificationModel.layers
-    val labelCount = data.select("label").distinct().count()
--- End diff --

```weights``` and ```layers``` can be obtained from the MLP model, so it's not necessary to store them separately. It's also not necessary to calculate ```labelCount``` via ```distinct```, which is very expensive; we can get ```labelCount``` from the MLP model's ```layers``` member variable.
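The point about reading the label count off the architecture, rather than scanning the data with `distinct()`, can be sketched as follows (layer sizes taken from the R test suite; variable names are mine):

```python
# The MLP's layer sizes already encode the label count: the output layer
# has one unit per class, so the last element is the old labelCount.
layers = [4, 5, 4, 3]

num_of_inputs = layers[0]    # size of the input layer
num_of_outputs = layers[-1]  # number of classes, i.e. labelCount

print(num_of_inputs, num_of_outputs)  # 4 3
```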





[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-14 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/15883#discussion_r87827463
  
--- Diff: R/pkg/R/mllib.R ---
@@ -936,20 +939,23 @@ setMethod("predict", signature(object = "MultilayerPerceptronClassificationModel
 # Returns the summary of a Multilayer Perceptron Classification Model produced by \code{spark.mlp}
 
 #' @param object a Multilayer Perceptron Classification Model fitted by \code{spark.mlp}
-#' @return \code{summary} returns a list containing \code{labelCount}, \code{layers}, and
-#' \code{weights}. For \code{weights}, it is a numeric vector with length equal to
-#' the expected given the architecture (i.e., for 8-10-2 network, 100 connection weights).
+#' @return \code{summary} returns a list containing \code{numOfInputs}, \code{numOfOutputs},
+#' \code{layers}, and \code{weights}. For \code{weights}, it is a numeric vector with
+#' length equal to the expected given the architecture (i.e., for 8-10-2 network,
+#' 100 connection weights).
 #' @rdname spark.mlp
 #' @export
 #' @aliases summary,MultilayerPerceptronClassificationModel-method
 #' @note summary(MultilayerPerceptronClassificationModel) since 2.1.0
 setMethod("summary", signature(object = "MultilayerPerceptronClassificationModel"),
           function(object) {
             jobj <- object@jobj
-            labelCount <- callJMethod(jobj, "labelCount")
             layers <- unlist(callJMethod(jobj, "layers"))
+            numOfInputs <- head(layers, n = 1)
+            numOfOutputs <- tail(layers, n = 1)
             weights <- callJMethod(jobj, "weights")
-            list(labelCount = labelCount, layers = layers, weights = weights)
+            list(numOfInputs = numOfInputs, numOfOutputs = numOfOutputs,
--- End diff --

I changed the summary output to ```numOfInputs``` and ```numOfOutputs```, which is consistent with ```mlp``` in the R ```RSNNS``` package. The original ```labelCount``` is actually ```numOfOutputs```, but the latter name is more descriptive.





[GitHub] spark pull request #15883: [SPARK-18438][SPARKR][ML] spark.mlp should suppor...

2016-11-14 Thread yanboliang
GitHub user yanboliang opened a pull request:

https://github.com/apache/spark/pull/15883

[SPARK-18438][SPARKR][ML] spark.mlp should support RFormula.

## What changes were proposed in this pull request?
```spark.mlp``` should support ```RFormula``` like the other ML algorithm wrappers.
BTW, I did some cleanup and improvements for ```spark.mlp```.

## How was this patch tested?
Unit tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yanboliang/spark spark-18438

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15883.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15883


commit 56a58fa22fd2243c536f71bca78ec65a15a44ecc
Author: Yanbo Liang 
Date:   2016-11-14T15:37:27Z

spark.mlp should support RFormula.



