[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-04-02 Thread BenFradet
Github user BenFradet commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-204676362
  
No problem, thanks for reviewing.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-04-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/10355


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-04-01 Thread jkbradley
Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-204623657
  
LGTM
I'll go ahead and merge this with master before a conflict happens
Thanks for the PR!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-04-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-204478494
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-04-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-204478499
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54707/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-04-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-204478315
  
**[Test build #54707 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54707/consoleFull)**
 for PR 10355 at commit 
[`718774b`](https://github.com/apache/spark/commit/718774b7401b86a99c2c8ae02014470b0d3a77a3).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-04-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-204465968
  
**[Test build #54707 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54707/consoleFull)**
 for PR 10355 at commit 
[`718774b`](https://github.com/apache/spark/commit/718774b7401b86a99c2c8ae02014470b0d3a77a3).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-04-01 Thread BenFradet
Github user BenFradet commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-204279426
  
Will fix that, thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-31 Thread MLnick
Github user MLnick commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-204275787
  
One very minor comment on imports, and think there are now merge conflicts 
with latest master that need fixing, subject to those LGTM.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-31 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/10355#discussion_r58168089
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/Predictor.scala ---
@@ -26,7 +26,7 @@ import org.apache.spark.mllib.regression.LabeledPoint
 import org.apache.spark.rdd.RDD
 import org.apache.spark.sql.{DataFrame, Row}
 import org.apache.spark.sql.functions._
-import org.apache.spark.sql.types.{DataType, DoubleType, StructType}
+import org.apache.spark.sql.types.{DataType, DoubleType, NumericType, 
StructType}
--- End diff --

minor, but I believe the `NumericType` import is no longer necessary


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-30 Thread jkbradley
Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-203660671
  
LGTM  @MLnick ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-203641373
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54549/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-203641371
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-203641203
  
**[Test build #54549 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54549/consoleFull)**
 for PR 10355 at commit 
[`58290af`](https://github.com/apache/spark/commit/58290af3c888f8d6a600ca4676488d4b542e04e7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-203627025
  
**[Test build #54549 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54549/consoleFull)**
 for PR 10355 at commit 
[`58290af`](https://github.com/apache/spark/commit/58290af3c888f8d6a600ca4676488d4b542e04e7).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-203617375
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-203617385
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54547/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-203617194
  
**[Test build #54547 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54547/consoleFull)**
 for PR 10355 at commit 
[`788eb58`](https://github.com/apache/spark/commit/788eb585a69c0997428f5da1d70630fae475ff28).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-203606312
  
**[Test build #54547 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54547/consoleFull)**
 for PR 10355 at commit 
[`788eb58`](https://github.com/apache/spark/commit/788eb585a69c0997428f5da1d70630fae475ff28).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-203522117
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54535/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-203522113
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-203522095
  
**[Test build #54535 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54535/consoleFull)**
 for PR 10355 at commit 
[`21e4afb`](https://github.com/apache/spark/commit/21e4afb9d734f81e47c73b16e1f29f76c46df6f5).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-203521461
  
**[Test build #54535 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54535/consoleFull)**
 for PR 10355 at commit 
[`21e4afb`](https://github.com/apache/spark/commit/21e4afb9d734f81e47c73b16e1f29f76c46df6f5).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-29 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-203103276
  
**[Test build #54458 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54458/consoleFull)**
 for PR 10355 at commit 
[`cf005af`](https://github.com/apache/spark/commit/cf005af19eb52037bc32905c5d30595681f68152).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-203103291
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54458/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-203103287
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-29 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-203102318
  
**[Test build #54458 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54458/consoleFull)**
 for PR 10355 at commit 
[`cf005af`](https://github.com/apache/spark/commit/cf005af19eb52037bc32905c5d30595681f68152).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-29 Thread BenFradet
Github user BenFradet commented on a diff in the pull request:

https://github.com/apache/spark/pull/10355#discussion_r57796113
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/util/MLTestingUtils.scala ---
@@ -17,14 +17,116 @@
 
 package org.apache.spark.ml.util
 
-import org.apache.spark.ml.Model
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.ml.{Estimator, Model, PredictionModel, Predictor}
 import org.apache.spark.ml.param.ParamMap
+import org.apache.spark.ml.regression.Regressor
+import org.apache.spark.ml.tree.impl.TreeTests
+import org.apache.spark.mllib.linalg.Vectors
+import org.apache.spark.sql.{DataFrame, SQLContext}
+import org.apache.spark.sql.functions._
+import org.apache.spark.sql.types._
 
-object MLTestingUtils {
+object MLTestingUtils extends SparkFunSuite {
   def checkCopy(model: Model[_]): Unit = {
 val copied = model.copy(ParamMap.empty)
   .asInstanceOf[Model[_]]
 assert(copied.parent.uid == model.parent.uid)
 assert(copied.parent == model.parent)
   }
+
+  def checkPredictorAcceptAllNumericTypes[M <: PredictionModel[_, M], T <: 
Predictor[_, _, M]](
--- End diff --

good point, will update


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-29 Thread sethah
Github user sethah commented on a diff in the pull request:

https://github.com/apache/spark/pull/10355#discussion_r57768696
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/util/MLTestingUtils.scala ---
@@ -17,14 +17,116 @@
 
 package org.apache.spark.ml.util
 
-import org.apache.spark.ml.Model
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.ml.{Estimator, Model, PredictionModel, Predictor}
 import org.apache.spark.ml.param.ParamMap
+import org.apache.spark.ml.regression.Regressor
+import org.apache.spark.ml.tree.impl.TreeTests
+import org.apache.spark.mllib.linalg.Vectors
+import org.apache.spark.sql.{DataFrame, SQLContext}
+import org.apache.spark.sql.functions._
+import org.apache.spark.sql.types._
 
-object MLTestingUtils {
+object MLTestingUtils extends SparkFunSuite {
   def checkCopy(model: Model[_]): Unit = {
 val copied = model.copy(ParamMap.empty)
   .asInstanceOf[Model[_]]
 assert(copied.parent.uid == model.parent.uid)
 assert(copied.parent == model.parent)
   }
+
+  def checkPredictorAcceptAllNumericTypes[M <: PredictionModel[_, M], T <: 
Predictor[_, _, M]](
--- End diff --

Is there a reason to have two separate methods for predictor and estimator? 
I also tend to prefer _not_ separating the "accept" and "reject" checks, and 
instead have them in a single function call. This will reduce the complexity of 
having to decide which one to use (and why) and also reduce the number of 
function calls in future tests. I was able to get the tests working having a 
single method for accept/reject and estimator/predictor, like this:

```scala
  def checkEstimatorNumericTypes[M <: Model[M], T <: Estimator[M]](
  predictor: T,
  isClassification: Boolean,
  sqlContext: SQLContext)(check: (M, M) => Unit): Unit = {
val dfs = if (isClassification) {
  genClassifDFWithNumericLabelCol(sqlContext)
} else {
  genRegressionDFWithNumericLabelCol(sqlContext)
}
val expected = predictor.fit(dfs(DoubleType))
val actuals = dfs.keys.filter(_ != DoubleType).map(t => 
predictor.fit(dfs(t)))
actuals.foreach(actual => check(expected, actual))

val dfWithStringLabels = generateDFWithStringLabelCol(sqlContext)
val thrown = intercept[IllegalArgumentException] {
  predictor.fit(dfWithStringLabels)
}
assert(thrown.getMessage contains
  "Column label must be of type NumericType but was actually of type 
StringType")
  }
```

I had to change the fit method of OneVsRest to include a call to 
`transformSchema(dataset.schema)` so that it would fail properly on non-numeric 
label column. I think it's better to change the fit method so that all ML 
estimators provide consistent errors. Also note that the `isClassification` 
flag is needed because MLPClassifier does not inherit from Classifier, and so 
it is not easy to infer that it is a classification estimator.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-28 Thread BenFradet
Github user BenFradet commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-202744058
  
Sure, thanks for your review.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-28 Thread jkbradley
Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-202603261
  
This looks to be in very good shape.  My main comment is about unit test 
time: the tests take a significant amount of time.  Wherever it is easy to do, 
could you set the algorithm parameters to make for faster tests?  E.g., use 1 
iteration for LogisticRegression, use maxDepth 0 for trees, etc.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-28 Thread jkbradley
Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/10355#discussion_r57643358
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/util/MLTestingUtils.scala ---
@@ -17,14 +17,116 @@
 
 package org.apache.spark.ml.util
 
-import org.apache.spark.ml.Model
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.ml.{Estimator, Model, PredictionModel, Predictor}
 import org.apache.spark.ml.param.ParamMap
+import org.apache.spark.ml.regression.Regressor
+import org.apache.spark.ml.tree.impl.TreeTests
+import org.apache.spark.mllib.linalg.Vectors
+import org.apache.spark.sql.{DataFrame, SQLContext}
+import org.apache.spark.sql.functions._
+import org.apache.spark.sql.types._
 
-object MLTestingUtils {
+object MLTestingUtils extends SparkFunSuite {
   def checkCopy(model: Model[_]): Unit = {
 val copied = model.copy(ParamMap.empty)
   .asInstanceOf[Model[_]]
 assert(copied.parent.uid == model.parent.uid)
 assert(copied.parent == model.parent)
   }
+
+  def checkPredictorAcceptAllNumericTypes[M <: PredictionModel[_, M], T <: 
Predictor[_, _, M]](
+  predictor: T, sqlContext: SQLContext)(check: (M, M) => Unit): Unit = 
{
+val dfs = if (predictor.isInstanceOf[Regressor[_, _, _]]) {
+  genRegressionDFWithNumericLabelCol(sqlContext)
+} else {
+  genClassifDFWithNumericLabelCol(sqlContext)
+}
+val expected = predictor.fit(dfs(DoubleType))
+val actuals = dfs.keys.filter(_ != DoubleType).map(t => 
predictor.fit(dfs(t)))
+actuals.foreach(actual => check(expected, actual))
+  }
+
+  def checkPredictorRejectNotNumericTypes(
+  predictor: Predictor[_, _, _], sqlContext: SQLContext): Unit = {
+val dfWithStringLabels = generateDFWithStringLabelCol(sqlContext)
+val thrown = intercept[IllegalArgumentException] {
+  predictor.fit(dfWithStringLabels)
+}
+assert(thrown.getMessage contains
+  "Column label must be of type NumericType but was actually of type 
StringType")
+  }
+
+  def checkEstimatorAcceptAllNumericTypes[M <: Model[M], T <: 
Estimator[M]](
+  estimator: T, sqlContext: SQLContext)(check: (M, M) => Unit): Unit = 
{
+val dfs = genRegressionDFWithNumericLabelCol(sqlContext)
+val expected = estimator.fit(dfs(DoubleType))
+val actuals = dfs.keys.filter(_ != DoubleType).map(t => 
estimator.fit(dfs(t)))
+actuals.foreach(actual => check(expected, actual))
+  }
+
+  def checkEstimatorRejectNotNumericTypes(
+  predictor: Estimator[_], sqlContext: SQLContext): Unit = {
+val dfWithStringLabels = generateDFWithStringLabelCol(sqlContext)
+val thrown = intercept[IllegalArgumentException] {
+  predictor.fit(dfWithStringLabels)
+}
+assert(thrown.getMessage contains
+  "Column label must be of type NumericType but was actually of type 
StringType")
+  }
+
+  def genClassifDFWithNumericLabelCol(
+  sqlContext: SQLContext,
+  labelColName: String = "label",
+  featuresColName: String = "features"): Map[NumericType, DataFrame] = 
{
+val df = sqlContext.createDataFrame(Seq(
+  (0, Vectors.dense(0, 2, 3)),
+  (1, Vectors.dense(0, 3, 1)),
+  (0, Vectors.dense(0, 2, 2)),
+  (1, Vectors.dense(0, 3, 9)),
+  (0, Vectors.dense(0, 2, 6))
+)).toDF(labelColName, featuresColName)
+
+val types =
+  Seq(ShortType, LongType, IntegerType, FloatType, ByteType, 
DoubleType, DecimalType(10, 0))
+types.map(t => t -> df.select(col(labelColName).cast(t), 
col(featuresColName)))
+  .map { case (t, d) => t -> TreeTests.setMetadata(d, 2, labelColName) 
}
+  .toMap
+  }
+
+  def genRegressionDFWithNumericLabelCol(
+  sqlContext: SQLContext,
+  labelColName: String = "label",
+  featuresColName: String = "features",
+  censorColName: String = "censor"): Map[NumericType, DataFrame] = {
+val df = sqlContext.createDataFrame(Seq(
+  (0, Vectors.dense(0)),
+  (1, Vectors.dense(1)),
+  (2, Vectors.dense(2)),
+  (3, Vectors.dense(3)),
+  (4, Vectors.dense(4))
+)).toDF(labelColName, featuresColName)
+
+val types =
+  Seq(ShortType, LongType, IntegerType, FloatType, ByteType, 
DoubleType, DecimalType(10, 0))
+types
+  .map(t => t -> df.select(col(labelColName).cast(t), 
col(featuresColName)))
+  .map { case (t, d) =>
+t -> TreeTests.setMetadata(d, 2, 
labelColName).withColumn(censorColName, lit(0.0))
--- End diff --

setMetadata: classes should be 0, not 2


---
If your project is set up for it, you can reply to this email and have your
rep

[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-28 Thread jkbradley
Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/10355#discussion_r57643353
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/util/SchemaUtils.scala 
---
@@ -61,6 +61,20 @@ private[spark] object SchemaUtils {
   }
 
   /**
+* Check whether the given schema contains a column of the numeric data 
type.
--- End diff --

indent 1 space less (can update IntelliJ code style settings: Editor -> 
Code Style -> Scala -> ScalaDoc tab -> uncheck Use scaladoc indent for leading 
asterisk


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-28 Thread jkbradley
Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-202579805
  
I'll take a look


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-26 Thread BenFradet
Github user BenFradet commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-201750481
  
@jkbradley @sethah @MLnick I think it's in a pretty good state now, what do 
you guys think?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-201582826
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54229/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-201582825
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-201582376
  
**[Test build #54229 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54229/consoleFull)**
 for PR 10355 at commit 
[`f8b7823`](https://github.com/apache/spark/commit/f8b7823e246c1ea9ee7729971fdc1d3a64ac8208).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-201560326
  
**[Test build #54229 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54229/consoleFull)**
 for PR 10355 at commit 
[`f8b7823`](https://github.com/apache/spark/commit/f8b7823e246c1ea9ee7729971fdc1d3a64ac8208).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-24 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-201037149
  
**[Test build #54088 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54088/consoleFull)**
 for PR 10355 at commit 
[`42d5e0f`](https://github.com/apache/spark/commit/42d5e0f28ac09c15cddb2af11c01df4b04503d4e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-201037265
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-201037272
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54088/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-24 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-201021316
  
**[Test build #54088 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54088/consoleFull)**
 for PR 10355 at commit 
[`42d5e0f`](https://github.com/apache/spark/commit/42d5e0f28ac09c15cddb2af11c01df4b04503d4e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-201018209
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54085/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-24 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-201018203
  
**[Test build #54085 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54085/consoleFull)**
 for PR 10355 at commit 
[`18b3308`](https://github.com/apache/spark/commit/18b3308d4042c4eb64dedf77f59b659fe2a041e3).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-201018208
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-24 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-201017850
  
**[Test build #54085 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54085/consoleFull)**
 for PR 10355 at commit 
[`18b3308`](https://github.com/apache/spark/commit/18b3308d4042c4eb64dedf77f59b659fe2a041e3).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-24 Thread BenFradet
Github user BenFradet commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-200726627
  
@jkbradley yes, I'm currently trying to incorporate @sethah's comments and 
keep code duplication to a strict minimum.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-200564243
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53974/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-200564242
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-200564217
  
**[Test build #53974 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53974/consoleFull)**
 for PR 10355 at commit 
[`a88c4a3`](https://github.com/apache/spark/commit/a88c4a33f361c0967fa7e4f0ebf086f62a6460cb).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-23 Thread jkbradley
Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-200561614
  
@BenFradet Are you actively working on this?  if so, I'll wait to review it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-23 Thread jkbradley
Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/10355#discussion_r57245082
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala 
---
@@ -183,12 +183,12 @@ class AFTSurvivalRegression @Since("1.6.0") 
(@Since("1.6.0") override val uid: S
* Extract [[featuresCol]], [[labelCol]] and [[censorCol]] from input 
dataset,
* and put it in an RDD with strong types.
*/
-  protected[ml] def extractAFTPoints(dataset: DataFrame): RDD[AFTPoint] = {
-dataset.select($(featuresCol), $(labelCol), $(censorCol)).rdd.map {
-  case Row(features: Vector, label: Double, censor: Double) =>
-AFTPoint(features, label, censor)
-}
-  }
+  protected[ml] def extractAFTPoints(dataset: DataFrame): RDD[AFTPoint] =
--- End diff --

I prefer to keep braces around multiline methods like this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-200559906
  
**[Test build #53974 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53974/consoleFull)**
 for PR 10355 at commit 
[`a88c4a3`](https://github.com/apache/spark/commit/a88c4a33f361c0967fa7e4f0ebf086f62a6460cb).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-23 Thread jkbradley
Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-200556843
  
Sorry for the long delay!  I'll take a look now


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-23 Thread BenFradet
Github user BenFradet commented on a diff in the pull request:

https://github.com/apache/spark/pull/10355#discussion_r57241509
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/classification/RandomForestClassifierSuite.scala
 ---
@@ -178,6 +179,34 @@ class RandomForestClassifierSuite extends 
SparkFunSuite with MLlibTestSparkConte
 assert(importances.toArray.forall(_ >= 0.0))
   }
 
+  test("should support all NumericType labels") {
+val dfs = MLTestingUtils.genClassifDFWithNumericLabelCol(sqlContext, 
"label", "features")
+
+val rf = new RandomForestClassifier().setFeaturesCol("features")
+
+val expected = rf.setLabelCol("label")
+  .fit(TreeTests.setMetadata(dfs(DoubleType), 2, "label"))
+dfs.keys.filter(_ != DoubleType).foreach { t =>
+  TreeTests.checkEqual(expected,
+rf.setLabelCol("label").fit(TreeTests.setMetadata(dfs(t), 2, 
"label")))
+}
+  }
+
+  test("shouldn't support non NumericType labels") {
--- End diff --

Here's what I managed to make work:

```scala
  def checkNumericTypes[M <: RegressionModel[_, M], T <: Regressor[_, _, 
M]](
  regressor: T, sqlContext: SQLContext)(check: (M, M) => Unit): Unit = {
val dfs = MLTestingUtils.genRegressionDFWithNumericLabelCol(sqlContext, 
"label", "features")

val expected = regressor.fit(dfs(DoubleType))
val actuals = dfs.keys.filter(_ != DoubleType).map(t => 
regressor.fit(dfs(t)))
actuals.foreach(actual => check(expected, actual))
  }
```

which could be used like:

```scala
MLTestingUtils.checkNumericTypes[LinearRegressionModel, LinearRegression](
new LinearRegression(), sqlContext) { (expected, actual) =>
  assert(expected.intercept === actual.intercept)
  assert(expected.coefficients === actual.coefficients)
}
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-23 Thread BenFradet
Github user BenFradet commented on a diff in the pull request:

https://github.com/apache/spark/pull/10355#discussion_r57219723
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/classification/RandomForestClassifierSuite.scala
 ---
@@ -178,6 +179,34 @@ class RandomForestClassifierSuite extends 
SparkFunSuite with MLlibTestSparkConte
 assert(importances.toArray.forall(_ >= 0.0))
   }
 
+  test("should support all NumericType labels") {
+val dfs = MLTestingUtils.genClassifDFWithNumericLabelCol(sqlContext, 
"label", "features")
+
+val rf = new RandomForestClassifier().setFeaturesCol("features")
+
+val expected = rf.setLabelCol("label")
+  .fit(TreeTests.setMetadata(dfs(DoubleType), 2, "label"))
+dfs.keys.filter(_ != DoubleType).foreach { t =>
+  TreeTests.checkEqual(expected,
+rf.setLabelCol("label").fit(TreeTests.setMetadata(dfs(t), 2, 
"label")))
+}
+  }
+
+  test("shouldn't support non NumericType labels") {
--- End diff --

I wanted to be able to do something like:

```scala
def checkNumericTypes[T <: Regressor, M <: Model](
regressor: T, sqlContext: SQLContext)(asserts: Seq[(M, M) => Unit]): 
Unit = {
  val dfs = MLTestingUtils.genRegressionDFWithNumericLabelCol(sqlContext, 
"label", "features")

  val expected = regressor.fit(dfs(DoubleType))
  dfs.keys.filter(_ != DoubleType).foreach { t =>
val actual = regressor.fit(dfs(t))
asserts.foreach(f => f(expected, actual))
  }
}
```

But unfortunately that doesn't work since I'm not able to capture which 
model is being created by the regressor's `fit` method.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-22 Thread BenFradet
Github user BenFradet commented on a diff in the pull request:

https://github.com/apache/spark/pull/10355#discussion_r57057748
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/classification/RandomForestClassifierSuite.scala
 ---
@@ -178,6 +179,34 @@ class RandomForestClassifierSuite extends 
SparkFunSuite with MLlibTestSparkConte
 assert(importances.toArray.forall(_ >= 0.0))
   }
 
+  test("should support all NumericType labels") {
+val dfs = MLTestingUtils.genClassifDFWithNumericLabelCol(sqlContext, 
"label", "features")
+
+val rf = new RandomForestClassifier().setFeaturesCol("features")
+
+val expected = rf.setLabelCol("label")
+  .fit(TreeTests.setMetadata(dfs(DoubleType), 2, "label"))
+dfs.keys.filter(_ != DoubleType).foreach { t =>
+  TreeTests.checkEqual(expected,
+rf.setLabelCol("label").fit(TreeTests.setMetadata(dfs(t), 2, 
"label")))
+}
+  }
+
+  test("shouldn't support non NumericType labels") {
--- End diff --

Yup, good idea, I'll investigate.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-22 Thread BenFradet
Github user BenFradet commented on a diff in the pull request:

https://github.com/apache/spark/pull/10355#discussion_r57057621
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/util/MLTestingUtils.scala ---
@@ -27,4 +31,56 @@ object MLTestingUtils {
 assert(copied.parent.uid == model.parent.uid)
 assert(copied.parent == model.parent)
   }
+
+  def genClassifDFWithNumericLabelCol(
+sqlContext: SQLContext,
+labelColName: String,
+featuresColName: String
+  ): Map[NumericType, DataFrame] = {
--- End diff --

I'll modify the whole file then.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-199967566
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-199967576
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53797/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-199967218
  
**[Test build #53797 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53797/consoleFull)**
 for PR 10355 at commit 
[`b2f4f51`](https://github.com/apache/spark/commit/b2f4f518e41e463d679dd7f8266b150674a2e138).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-22 Thread sethah
Github user sethah commented on a diff in the pull request:

https://github.com/apache/spark/pull/10355#discussion_r57047541
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/util/MLTestingUtils.scala ---
@@ -27,4 +31,56 @@ object MLTestingUtils {
 assert(copied.parent.uid == model.parent.uid)
 assert(copied.parent == model.parent)
   }
+
+  def genClassifDFWithNumericLabelCol(
+sqlContext: SQLContext,
+labelColName: String,
+featuresColName: String
+  ): Map[NumericType, DataFrame] = {
--- End diff --

nit: Spark style seems to be to put the return type on the line above, if 
it will fit. So:

```scala
def genClassifDFWithNumericLabelCol(
sqlContext: SQLContext,
labelColName: String,
featuresColName: String): Map[NumericType, DataFrame] = {
```

[Example](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/tree/GradientBoostedTrees.scala#L168)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-22 Thread sethah
Github user sethah commented on a diff in the pull request:

https://github.com/apache/spark/pull/10355#discussion_r57045359
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/classification/RandomForestClassifierSuite.scala
 ---
@@ -178,6 +179,34 @@ class RandomForestClassifierSuite extends 
SparkFunSuite with MLlibTestSparkConte
 assert(importances.toArray.forall(_ >= 0.0))
   }
 
+  test("should support all NumericType labels") {
+val dfs = MLTestingUtils.genClassifDFWithNumericLabelCol(sqlContext, 
"label", "features")
+
+val rf = new RandomForestClassifier().setFeaturesCol("features")
+
+val expected = rf.setLabelCol("label")
+  .fit(TreeTests.setMetadata(dfs(DoubleType), 2, "label"))
+dfs.keys.filter(_ != DoubleType).foreach { t =>
+  TreeTests.checkEqual(expected,
+rf.setLabelCol("label").fit(TreeTests.setMetadata(dfs(t), 2, 
"label")))
+}
+  }
+
+  test("shouldn't support non NumericType labels") {
--- End diff --

I still think the code duplication is a big concern. It might not be that 
hard to add some utility function that can clean this up. For example I was 
able to get this working:

In _MLTestingUtils.scala_
```scala
def checkRejectsNonNumericType(est: Predictor[_, _, _], sqlContext: 
SQLContext) = {
val dfWithStringLabels =
  MLTestingUtils.generateDFWithStringLabelCol(sqlContext, 
est.getLabelCol, est.getFeaturesCol)

val thrown = intercept[IllegalArgumentException] {
  est.fit(dfWithStringLabels)
}
assert(thrown.getMessage contains
  "Column label must be of type NumericType but was actually of type 
StringType")
  }
```

Then in the actual test suites:
```scala
test("shouldn't support non NumericType labels") {
MLTestingUtils.checkRejectsNonNumericType(new RandomForestClassifier, 
sqlContext)
}
```
There will be some complication for some of the estimators like OneVsRest 
and IsotonicRegression (why doesn't it extend Predictor?) but I think it will 
be worth it to get this right since future estimators will have to implement 
this as well. It would be really nice to have something like we do for params 
where there is just a one-liner `ParamsSuite.checkParams(new 
RandomForestClassifier)`. For testing that all numeric types are supported, we 
could have a utility method that produces all actual and expected results, then 
check for equality inside the individual test suites, like:

```scala
val models = MLTestingUtils.fitAllNumericTypes(new NaiveBayes, sqlContext)
models.foreach { case (expected, actual) =>
  assert(expected.pi === actual.pi)
  assert(expected.theta === actual.theta)
}
```

I'm open to feedback on this, realizing it could be hard to generalize this 
for every case. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-199948244
  
**[Test build #53797 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53797/consoleFull)**
 for PR 10355 at commit 
[`b2f4f51`](https://github.com/apache/spark/commit/b2f4f518e41e463d679dd7f8266b150674a2e138).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-22 Thread BenFradet
Github user BenFradet commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-199709409
  
Yup, I'll rebase tonight and start working on the related jiras.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-22 Thread MLnick
Github user MLnick commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-199708780
  
Looks fine to me - latest test result indicates there may be merge 
conflicts?

I'd like @jkbradley to take a quick pass too.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-22 Thread MLnick
Github user MLnick commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-199709002
  
@sethah could you also take a final pass?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-198035588
  
**[Test build #53454 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53454/consoleFull)**
 for PR 10355 at commit 
[`6b8b0cf`](https://github.com/apache/spark/commit/6b8b0cfa44e0c3050c17a4bcb7be9faeaae53748).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-198039870
  
**[Test build #53458 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53458/consoleFull)**
 for PR 10355 at commit 
[`e8a56d5`](https://github.com/apache/spark/commit/e8a56d5927b4652ac0b89fb3783a495a239d0eae).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-198057662
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-198113442
  
Build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197537016
  
**[Test build #53343 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53343/consoleFull)**
 for PR 10355 at commit 
[`9feca44`](https://github.com/apache/spark/commit/9feca44e2796baaa9ccf5dfb03e2b9a0eabb731b).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-198057482
  
**[Test build #53458 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53458/consoleFull)**
 for PR 10355 at commit 
[`e8a56d5`](https://github.com/apache/spark/commit/e8a56d5927b4652ac0b89fb3783a495a239d0eae).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197586274
  
**[Test build #53368 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53368/consoleFull)**
 for PR 10355 at commit 
[`3b424dc`](https://github.com/apache/spark/commit/3b424dc88211c83615ef8f16402879cc9eb45c2e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197623813
  
Build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197623814
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53366/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread MLnick
Github user MLnick commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197744650
  
@BenFradet I linked 2 additional sub-tasks to the [umbrella 
JIRA](https://issues.apache.org/jira/browse/SPARK-11107)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197521550
  
**[Test build #53343 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53343/consoleFull)**
 for PR 10355 at commit 
[`9feca44`](https://github.com/apache/spark/commit/9feca44e2796baaa9ccf5dfb03e2b9a0eabb731b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/10355#discussion_r56465970
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/regression/AFTSurvivalRegressionSuite.scala
 ---
@@ -347,6 +348,34 @@ class AFTSurvivalRegressionSuite
 }
   }
 
+  test("should support all NumericType labels") {
+val dfs = 
MLTestingUtils.genRegressionDFWithNumericLabelCol(sqlContext, "label", 
"features")
--- End diff --

I think you'll need to adjust the generated data here as it requires an 
additional 'censor' col.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-198113445
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53454/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197596347
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53368/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/10355#discussion_r56465357
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/Predictor.scala ---
@@ -49,10 +50,7 @@ private[ml] trait PredictorParams extends Params
 validateParams()
 // TODO: Support casting Array[Double] and Array[Float] to Vector when 
FeaturesType = Vector
 SchemaUtils.checkColumnType(schema, $(featuresCol), featuresDataType)
-if (fitting) {
-  // TODO: Allow other numeric types
-  SchemaUtils.checkColumnType(schema, $(labelCol), DoubleType)
-}
+if (fitting) SchemaUtils.checkNumericType(schema, $(labelCol))
--- End diff --

Spark style prefers to keep the braces format for if statements. Can you 
change it back please?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread BenFradet
Github user BenFradet commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197755584
  
@MLnick yup thanks, I'll get to those once I'm done with this one.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197537090
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197596345
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-198113198
  
**[Test build #53454 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53454/consoleFull)**
 for PR 10355 at commit 
[`6b8b0cf`](https://github.com/apache/spark/commit/6b8b0cfa44e0c3050c17a4bcb7be9faeaae53748).
 * This patch passes all tests.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-198057663
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53458/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197596308
  
**[Test build #53368 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53368/consoleFull)**
 for PR 10355 at commit 
[`3b424dc`](https://github.com/apache/spark/commit/3b424dc88211c83615ef8f16402879cc9eb45c2e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread BenFradet
Github user BenFradet commented on a diff in the pull request:

https://github.com/apache/spark/pull/10355#discussion_r56468850
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/regression/AFTSurvivalRegressionSuite.scala
 ---
@@ -347,6 +348,34 @@ class AFTSurvivalRegressionSuite
 }
   }
 
+  test("should support all NumericType labels") {
+val dfs = 
MLTestingUtils.genRegressionDFWithNumericLabelCol(sqlContext, "label", 
"features")
--- End diff --

Thanks for the insight, I'll fix this later today.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197537092
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53343/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197581060
  
**[Test build #53366 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53366/consoleFull)**
 for PR 10355 at commit 
[`2690512`](https://github.com/apache/spark/commit/269051247d92ae2a58fb2749a3ace062ad28f9cb).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197623583
  
**[Test build #53366 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53366/consoleFull)**
 for PR 10355 at commit 
[`2690512`](https://github.com/apache/spark/commit/269051247d92ae2a58fb2749a3ace062ad28f9cb).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197039251
  
**[Test build #53227 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53227/consoleFull)**
 for PR 10355 at commit 
[`7d9835a`](https://github.com/apache/spark/commit/7d9835abd0f3bfac2ef66db01a4de4fdc7dceb48).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197039263
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53227/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197039258
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-15 Thread BenFradet
Github user BenFradet commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197037079
  
Here's what's left to do:
- IsotonicRegression
- GeneralizedLinearRegression
- AFTSurvival

Otherwise, I'd prefer doing the evaluators and other estimators (like 
rformula and chiSqSelector) in other prs


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197035874
  
**[Test build #53227 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53227/consoleFull)**
 for PR 10355 at commit 
[`7d9835a`](https://github.com/apache/spark/commit/7d9835abd0f3bfac2ef66db01a4de4fdc7dceb48).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-15 Thread BenFradet
Github user BenFradet commented on a diff in the pull request:

https://github.com/apache/spark/pull/10355#discussion_r56203460
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/classification/DecisionTreeClassifierSuite.scala
 ---
@@ -334,6 +334,51 @@ class DecisionTreeClassifierSuite extends 
SparkFunSuite with MLlibTestSparkConte
 assert(importances.toArray.forall(_ >= 0.0))
   }
 
+  test("should support all NumericType labels") {
--- End diff --

OK, will do.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-15 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/10355#discussion_r56200539
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/classification/DecisionTreeClassifierSuite.scala
 ---
@@ -334,6 +334,51 @@ class DecisionTreeClassifierSuite extends 
SparkFunSuite with MLlibTestSparkConte
 assert(importances.toArray.forall(_ >= 0.0))
   }
 
+  test("should support all NumericType labels") {
--- End diff --

I was thinking more like create a util method (or use an existing one if 
there is one) to generate the required test datasets. It's not a big deal 
though, just a "nice to have"


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



  1   2   >