[GitHub] spark pull request #15881: [SPARK-18434][ML] Add missing ParamValidations fo...

2016-11-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/15881


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15881: [SPARK-18434][ML] Add missing ParamValidations fo...

2016-11-14 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request:

https://github.com/apache/spark/pull/15881#discussion_r87939529
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala ---
@@ -73,11 +73,13 @@ private[ml] trait DecisionTreeParams extends 
PredictorParams
 
   /**
* Minimum information gain for a split to be considered at a tree node.
+   * Should be >= 0.0.
* (default = 0.0)
* @group param
*/
   final val minInfoGain: DoubleParam = new DoubleParam(this, "minInfoGain",
-"Minimum information gain for a split to be considered at a tree 
node.")
+"Minimum information gain for a split to be considered at a tree 
node.",
--- End diff --

Oh, I think it's better to not add `>=0` here, because all params in 
`treeParams.scala` don't add it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15881: [SPARK-18434][ML] Add missing ParamValidations fo...

2016-11-14 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request:

https://github.com/apache/spark/pull/15881#discussion_r87939306
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala ---
@@ -73,11 +73,13 @@ private[ml] trait DecisionTreeParams extends 
PredictorParams
 
   /**
* Minimum information gain for a split to be considered at a tree node.
+   * Should be >= 0.0.
* (default = 0.0)
* @group param
*/
   final val minInfoGain: DoubleParam = new DoubleParam(this, "minInfoGain",
-"Minimum information gain for a split to be considered at a tree 
node.")
+"Minimum information gain for a split to be considered at a tree 
node.",
--- End diff --

👌, I will add it


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15881: [SPARK-18434][ML] Add missing ParamValidations fo...

2016-11-14 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request:

https://github.com/apache/spark/pull/15881#discussion_r87939017
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala ---
@@ -73,11 +73,13 @@ private[ml] trait DecisionTreeParams extends 
PredictorParams
 
   /**
* Minimum information gain for a split to be considered at a tree node.
+   * Should be >= 0.0.
* (default = 0.0)
* @group param
*/
   final val minInfoGain: DoubleParam = new DoubleParam(this, "minInfoGain",
-"Minimum information gain for a split to be considered at a tree 
node.")
+"Minimum information gain for a split to be considered at a tree 
node.",
--- End diff --

Sorry to miss this. Perhaps add >=0 at the end to be consistent with others.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15881: [SPARK-18434][ML] Add missing ParamValidations fo...

2016-11-14 Thread sethah
Github user sethah commented on a diff in the pull request:

https://github.com/apache/spark/pull/15881#discussion_r87887739
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala ---
@@ -171,7 +171,10 @@ class LinearRegression @Since("1.3.0") 
(@Since("1.3.0") override val uid: String
* @group setParam
*/
   @Since("1.6.0")
-  def setSolver(value: String): this.type = set(solver, value)
+  def setSolver(value: String): this.type = {
+require(Array("auto", "l-bfgs", "normal").contains(value), s"Solver 
$value was not supported.")
--- End diff --

Yes, good point about the param definition being inherited, I had 
overlooked that. I think it's ok to use a require here then. I would prefer 
`Set("auto", ...).contains` but it's a very minor issue.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15881: [SPARK-18434][ML] Add missing ParamValidations fo...

2016-11-14 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request:

https://github.com/apache/spark/pull/15881#discussion_r87883345
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala ---
@@ -171,7 +171,10 @@ class LinearRegression @Since("1.3.0") 
(@Since("1.3.0") override val uid: String
* @group setParam
*/
   @Since("1.6.0")
-  def setSolver(value: String): this.type = set(solver, value)
+  def setSolver(value: String): this.type = {
+require(Array("auto", "l-bfgs", "normal").contains(value), s"Solver 
$value was not supported.")
--- End diff --

Thanks for the reply. I assume `solver` is different from other params as 
it's inherited from a trait where we cannot define valid options. I don't mean 
to change any convention here. For this case, I simply mean we can add the 
available options in the error message in the `require`.

And surely we can discuss how to improve ParamValidator elsewhere.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15881: [SPARK-18434][ML] Add missing ParamValidations fo...

2016-11-14 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request:

https://github.com/apache/spark/pull/15881#discussion_r87873242
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala ---
@@ -171,7 +171,10 @@ class LinearRegression @Since("1.3.0") 
(@Since("1.3.0") override val uid: String
* @group setParam
*/
   @Since("1.6.0")
-  def setSolver(value: String): this.type = set(solver, value)
+  def setSolver(value: String): this.type = {
+require(Array("auto", "l-bfgs", "normal").contains(value), s"Solver 
$value was not supported.")
--- End diff --

I'm not sure if we can get an error message by using ParamValidator. 
Anyway, it can be helpful to inform users about the available options in the 
error message.  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15881: [SPARK-18434][ML] Add missing ParamValidations fo...

2016-11-14 Thread sethah
Github user sethah commented on a diff in the pull request:

https://github.com/apache/spark/pull/15881#discussion_r87826206
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala ---
@@ -171,7 +171,10 @@ class LinearRegression @Since("1.3.0") 
(@Since("1.3.0") override val uid: String
* @group setParam
*/
   @Since("1.6.0")
-  def setSolver(value: String): this.type = set(solver, value)
+  def setSolver(value: String): this.type = {
+require(Array("auto", "l-bfgs", "normal").contains(value), s"Solver 
$value was not supported.")
--- End diff --

Why would we not use a param validator here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15881: [SPARK-18434][ML] Add missing ParamValidations fo...

2016-11-14 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/15881#discussion_r87784943
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala ---
@@ -171,7 +171,10 @@ class LinearRegression @Since("1.3.0") 
(@Since("1.3.0") override val uid: String
* @group setParam
*/
   @Since("1.6.0")
-  def setSolver(value: String): this.type = set(solver, value)
+  def setSolver(value: String): this.type = {
+require(Array("auto", "l-bfgs", "normal").contains(value), s"Solver 
$value was not supported.")
--- End diff --

Maybe just a switch/case is simpler? I don't feel strongly though


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15881: [SPARK-18434][ML] Add missing ParamValidations fo...

2016-11-14 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request:

https://github.com/apache/spark/pull/15881

[SPARK-18434][ML] Add missing ParamValidations for ML algos

## What changes were proposed in this pull request?
Add missing ParamValidations for ML algos
## How was this patch tested?
existing tests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhengruifeng/spark arg_checking

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15881.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15881


commit aefb2623c6981deff0be54505bcd1e58f71cef55
Author: Zheng RuiFeng 
Date:   2016-11-14T10:13:58Z

create pr

commit 465356fdfbfe1050537866aa86ab1506d09a1194
Author: Zheng RuiFeng 
Date:   2016-11-14T11:24:36Z

create pr

commit 77658b715d1098259aeea660d0926bf3cc0a59db
Author: Zheng RuiFeng 
Date:   2016-11-14T11:34:54Z

update




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org