[GitHub] spark pull request #20164: [SPARK-22971][ML] OneVsRestModel should use tempo...

2018-06-14 Thread zhengruifeng
GitHub user zhengruifeng closed the pull request at:

https://github.com/apache/spark/pull/20164


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20164: [SPARK-22971][ML] OneVsRestModel should use tempo...

2018-01-15 Thread MLnick
GitHub user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/20164#discussion_r161535696
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -170,21 +170,24 @@ final class OneVsRestModel private[ml] (
   newDataset.persist(StorageLevel.MEMORY_AND_DISK)
 }
 
+// temporary column to store intermediate raw prediction
+val tmpRawPredictionColName = "rawPrediction_" + UUID.randomUUID().toString
+
 // update the accumulator column with the result of prediction of models
 val aggregatedDataset = models.zipWithIndex.foldLeft[DataFrame](newDataset) {
   case (df, (model, index)) =>
-val rawPredictionCol = model.getRawPredictionCol
-val columns = origCols ++ List(col(rawPredictionCol), col(accColName))
+val columns = origCols ++ List(col(tmpRawPredictionColName), col(accColName))
--- End diff --

This line doesn't need to be in the `foldLeft` block any longer?
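
A minimal sketch of the point above, in plain Scala without Spark: once the column list no longer depends on `model` or `index`, it can be computed once before the fold. The object name, the `selectedColumns` helper, and the string column names are hypothetical stand-ins for the `Column` values in the diff.

```scala
object HoistInvariantSketch {
  // Hypothetical stand-in for `origCols ++ List(col(tmp), col(acc))` in the diff.
  def selectedColumns(origCols: List[String], tmpCol: String, accCol: String): List[String] =
    origCols ++ List(tmpCol, accCol)

  def run(models: Seq[String]): Map[Int, List[String]] = {
    // Computed once: nothing here depends on the fold's (model, index) pair.
    val columns = selectedColumns(List("features", "label"), "rawPrediction_tmp", "acc")

    models.zipWithIndex.foldLeft(Map.empty[Int, List[String]]) {
      case (acc, (_, index)) => acc + (index -> columns)
    }
  }

  def main(args: Array[String]): Unit = {
    val result = run(Seq("m0", "m1", "m2"))
    // Every iteration sees the same precomputed column list.
    assert(result.values.forall(_ == List("features", "label", "rawPrediction_tmp", "acc")))
    println(result.size)
  }
}
```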





[GitHub] spark pull request #20164: [SPARK-22971][ML] OneVsRestModel should use tempo...

2018-01-06 Thread felixcheung
GitHub user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/20164#discussion_r160020496
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -170,21 +170,24 @@ final class OneVsRestModel private[ml] (
   newDataset.persist(StorageLevel.MEMORY_AND_DISK)
 }
 
+// temporary column to store intermediate raw prediction
+val tmpRawPredictionColName = "mbc$tmpraw" + UUID.randomUUID().toString
--- End diff --

In other ML cases we are slightly more descriptive with the prefix text:


https://github.com/apache/spark/blob/576c43fb4226e4efa12189b41c3bc862019862c6/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala#L1050
  





[GitHub] spark pull request #20164: [SPARK-22971][ML] OneVsRestModel should use tempo...

2018-01-05 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request:

https://github.com/apache/spark/pull/20164

[SPARK-22971][ML] OneVsRestModel should use temporary RawPredictionCol

## What changes were proposed in this pull request?
Use a temporary RawPredictionCol in `OneVsRestModel#transform`.
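
A minimal sketch of the naming idea, in plain Scala without Spark. It assumes the motivation is avoiding a clash between the sub-model's raw prediction column and a column already present in the input; the PR text does not state this explicitly. The object name, the `tempColumnName` helper, and the sample column names are hypothetical; the `"rawPrediction_" + UUID` pattern follows the later revision of the diff.

```scala
import java.util.UUID

object TempColumnSketch {
  // Hypothetical helper: appends a random UUID so the name is very unlikely
  // to match any column already present in the input dataset.
  def tempColumnName(prefix: String): String =
    prefix + "_" + UUID.randomUUID().toString

  def main(args: Array[String]): Unit = {
    // Assumed input schema that already contains a column named "rawPrediction".
    val inputColumns = Seq("features", "label", "rawPrediction")

    val tmpRawPredictionColName = tempColumnName("rawPrediction")

    // The temporary name does not collide with the existing columns,
    // so appending it keeps all column names distinct.
    assert(!inputColumns.contains(tmpRawPredictionColName))
    val outputColumns = inputColumns :+ tmpRawPredictionColName
    assert(outputColumns.distinct.size == outputColumns.size)

    println(s"temporary column: $tmpRawPredictionColName")
  }
}
```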

## How was this patch tested?
Existing tests and added tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhengruifeng/spark ovr_not_use_getRawPredictionCol

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20164.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20164


commit f155e1cc6b175ac06a5f2ab710d4c053b0776507
Author: Zheng RuiFeng 
Date:   2018-01-05T09:29:25Z

create pr

commit 9b0dcc69535b6731c9b6cdc0030c846c3352a5de
Author: Zheng RuiFeng 
Date:   2018-01-05T10:19:59Z

create pr

commit 6c567ffb02738346fc83e467752add0d00a42e07
Author: Zheng RuiFeng 
Date:   2018-01-05T10:26:16Z

add test



