[GitHub] spark pull request #18764: [SPARK-21306][ML] For branch 2.0, OneVsRest shoul...

2017-08-08 Thread facaiy
Github user facaiy closed the pull request at:

https://github.com/apache/spark/pull/18764


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18764: [SPARK-21306][ML] For branch 2.0, OneVsRest shoul...

2017-08-05 Thread facaiy
Github user facaiy commented on a diff in the pull request:

https://github.com/apache/spark/pull/18764#discussion_r131529693
  
--- Diff: python/pyspark/ml/classification.py ---
@@ -1344,7 +1346,19 @@ def _fit(self, dataset):
 
         numClasses = int(dataset.agg({labelCol: "max"}).head()["max("+labelCol+")"]) + 1
 
-        multiclassLabeled = dataset.select(labelCol, featuresCol)
+        weightCol = None
+        if (self.isDefined(self.weightCol) and self.getWeightCol()):
+            if isinstance(classifier, HasWeightCol):
+                weightCol = self.getWeightCol()
+            else:
+                warnings.warn("weightCol is ignored, "
+                              "as it is not supported by {} now.".format(
--- End diff --

Thank you very much for your help, @yanboliang!





[GitHub] spark pull request #18764: [SPARK-21306][ML] For branch 2.0, OneVsRest shoul...

2017-08-05 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/18764#discussion_r131523236
  
--- Diff: python/pyspark/ml/classification.py ---
@@ -1344,7 +1346,19 @@ def _fit(self, dataset):
 
         numClasses = int(dataset.agg({labelCol: "max"}).head()["max("+labelCol+")"]) + 1
 
-        multiclassLabeled = dataset.select(labelCol, featuresCol)
+        weightCol = None
+        if (self.isDefined(self.weightCol) and self.getWeightCol()):
+            if isinstance(classifier, HasWeightCol):
+                weightCol = self.getWeightCol()
+            else:
+                warnings.warn("weightCol is ignored, "
+                              "as it is not supported by {} now.".format(
--- End diff --

See my comment [here](https://github.com/apache/spark/pull/18763#pullrequestreview-54516447).





[GitHub] spark pull request #18764: [SPARK-21306][ML] For branch 2.0, OneVsRest shoul...

2017-07-28 Thread facaiy
Github user facaiy commented on a diff in the pull request:

https://github.com/apache/spark/pull/18764#discussion_r130200379
  
--- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -143,6 +144,16 @@ class OneVsRestSuite extends SparkFunSuite with MLlibTestSparkContext with Defau
     assert(output.schema.fieldNames.toSet === Set("label", "features", "prediction"))
   }
 
+  test("SPARK-21306: OneVsRest should support setWeightCol") {
+    val dataset2 = dataset.withColumn("weight", lit(1.0))
--- End diff --

Use a double value: `lit(1.0)`.





[GitHub] spark pull request #18764: [SPARK-21306][ML] For branch 2.0, OneVsRest shoul...

2017-07-28 Thread facaiy
Github user facaiy commented on a diff in the pull request:

https://github.com/apache/spark/pull/18764#discussion_r130200288
  
--- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -33,6 +33,7 @@ import org.apache.spark.mllib.util.MLlibTestSparkContext
 import org.apache.spark.mllib.util.TestingUtils._
 import org.apache.spark.rdd.RDD
 import org.apache.spark.sql.Dataset
+import org.apache.spark.sql.functions._
--- End diff --

Add the missing import for branch 2.0.





[GitHub] spark pull request #18764: [SPARK-21306][ML] For branch 2.0, OneVsRest shoul...

2017-07-28 Thread facaiy
GitHub user facaiy opened a pull request:

https://github.com/apache/spark/pull/18764

[SPARK-21306][ML] For branch 2.0, OneVsRest should support setWeightCol

This PR is related to #18554 and is adapted for branch 2.0.

## What changes were proposed in this pull request?

Add a `setWeightCol` method to OneVsRest.

`weightCol` is ignored if the classifier doesn't inherit the `HasWeightCol` trait.
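
The fallback rule described above can be illustrated with a minimal, Spark-free sketch. It is not the code from this patch: `HasWeightCol` here is a hypothetical stand-in for the real mixin, the two classifier classes and `resolve_weight_col` are invented for illustration, and filling the warning with the classifier's class name is an assumption.

```python
import warnings


class HasWeightCol:
    """Hypothetical stand-in for the HasWeightCol mixin (not the pyspark class)."""


class WeightedClassifier(HasWeightCol):
    """Hypothetical classifier that supports a weight column."""


class UnweightedClassifier:
    """Hypothetical classifier without weight-column support."""


def resolve_weight_col(classifier, requested_weight_col):
    """Return the weight column to forward to `classifier`, or None.

    The column is forwarded only when one was requested and the
    classifier inherits HasWeightCol; otherwise a warning is emitted
    and the weights are ignored.
    """
    if not requested_weight_col:
        # No weight column set (None or empty string): nothing to forward.
        return None
    if isinstance(classifier, HasWeightCol):
        return requested_weight_col
    # Assumption: the message is filled with the classifier's class name.
    warnings.warn(
        "weightCol is ignored, as it is not supported by {} now.".format(
            type(classifier).__name__
        )
    )
    return None
```

With these stand-ins, `resolve_weight_col(WeightedClassifier(), "weight")` returns the column name, while passing an `UnweightedClassifier()` returns `None` and emits a warning instead of failing.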

## How was this patch tested?

+ [x] Add a unit test.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/facaiy/spark BUG/branch-2.0_OneVsRest_support_setWeightCol

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18764.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18764


commit 0c60c28ca6a3da84401f4947a21ace6980be08fa
Author: Yan Facai (颜发才) 
Date:   2017-07-28T02:10:35Z

[SPARK-21306][ML] For branch 2.0, OneVsRest should support setWeightCol



