[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

2018-06-22 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/18826 @holdenk @HyukjinKwon I updated `__repr__` by calling `toString`. I also added class name `LogisticRegressionModel: ` to `toString`. Otherwise `uid` alone is a bit confusing
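The change described in this comment can be sketched in plain Python, without Spark. `SketchModel`, its fields, and the summary format are hypothetical stand-ins, not the PySpark wrapper or the string the PR actually produces; the sketch only shows `__repr__` delegating to a `toString`-style method whose output starts with the class name.

```python
class SketchModel:
    """Hypothetical stand-in for a fitted model; not the PySpark class."""

    def __init__(self, uid, num_classes, num_features):
        self.uid = uid
        self.num_classes = num_classes
        self.num_features = num_features

    def to_string(self):
        # Prefix the class name so the uid is never shown on its own.
        return "%s: uid=%s, numClasses=%d, numFeatures=%d" % (
            type(self).__name__, self.uid, self.num_classes, self.num_features)

    def __repr__(self):
        # Delegate so repr() and the summary string cannot drift apart.
        return self.to_string()


model = SketchModel("logreg_1234", 2, 3)
summary = repr(model)
print(summary)
```

Delegating rather than duplicating the format string keeps a single source of truth for the summary, which is presumably why the comment routes `__repr__` through `toString`.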

[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

2018-06-18 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/18826 @HyukjinKwon Can you recommend someone to take a look at this PR or maybe you can take a look? --- - To unsubscribe, e-mail

[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

2018-06-07 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/18826 @HyukjinKwon It's ready to test.

[GitHub] spark issue #18826: LogisticRegressionModel.toString should summarize model

2018-05-31 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/18826 This PR was tested recently, so it caught my attention. Is this something we want to proceed with? @holdenk @yanboliang @jkbradley @dbtsai I don't see how the test failures relate

[GitHub] spark issue #18898: [SPARK-21245][ML] Resolve code duplication for classific...

2017-08-24 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/18898 Hi @sethah and @yanboliang , can someone take a look? Thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project

[GitHub] spark pull request #18898: [SPARK-21245][ML] Resolve code duplication for cl...

2017-08-09 Thread bravo-zhang
GitHub user bravo-zhang opened a pull request: https://github.com/apache/spark/pull/18898 [SPARK-21245][ML] Resolve code duplication for classification/regression summarizers ## Why the change? In several places (LogReg, LinReg, SVC) in Spark ML, we collect summary

[GitHub] spark issue #18820: [SPARK-14932][SQL] Allow DataFrame.replace() to replace ...

2017-08-07 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/18820 Hi @HyukjinKwon @gatorsmile @viirya I addressed your comments, added more test coverage, and provided more info in the PR description. One thing that is not clear to users is that they can still

[GitHub] spark pull request #18820: [SPARK-14932][SQL] Allow DataFrame.replace() to r...

2017-08-07 Thread bravo-zhang
Github user bravo-zhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18820#discussion_r131817518 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameNaFunctions.scala --- @@ -366,11 +370,15 @@ final class DataFrameNaFunctions private

[GitHub] spark pull request #18820: [SPARK-14932][SQL] Allow DataFrame.replace() to r...

2017-08-07 Thread bravo-zhang
Github user bravo-zhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18820#discussion_r131814746 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameNaFunctions.scala --- @@ -314,6 +316,7 @@ final class DataFrameNaFunctions private[sql

[GitHub] spark pull request #18820: [SPARK-14932][SQL] Allow DataFrame.replace() to r...

2017-08-07 Thread bravo-zhang
Github user bravo-zhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18820#discussion_r131814622 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala --- @@ -145,8 +145,8 @@ class DataTypeSuite extends SparkFunSuite

[GitHub] spark issue #18820: [SPARK-14932][SQL] Allow DataFrame.replace() to replace ...

2017-08-06 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/18820 @HyukjinKwon Thanks for the review. Updated to address your comments.

[GitHub] spark issue #18820: [SPARK-14932][SQL] Allow DataFrame.replace() to replace ...

2017-08-04 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/18820 Hey @nchammas I don't have a strong opinion on this and changed it back to what it was.

[GitHub] spark issue #18820: [SPARK-14932][SQL] Allow DataFrame.replace() to replace ...

2017-08-03 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/18820 What if the field is not nullable? I did a test: ``` val rows = spark.sparkContext.parallelize(Seq( Row("Bravo", 28, 183.5), Row("Jessie", 18,

[GitHub] spark issue #18820: [SPARK-14932][SQL] Allow DataFrame.replace() to replace ...

2017-08-03 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/18820 Hey @nchammas I made the logic much simpler.

[GitHub] spark issue #18826: LogisticRegressionModel.toString should summarize model

2017-08-02 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/18826 Hi @holdenk , I'm opening this PR to continue the effort in https://github.com/apache/spark/pull/12491 When adding doctest, I noticed that the `uid` part of the string is always a random

[GitHub] spark issue #12491: [SPARK-14712][ML]spark.ml.LogisticRegressionModel.toStri...

2017-08-02 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/12491 Hi @holdenk , to continue this PR I opened https://github.com/apache/spark/pull/18826

[GitHub] spark pull request #18826: LogisticRegressionModel.toString should summarize...

2017-08-02 Thread bravo-zhang
GitHub user bravo-zhang opened a pull request: https://github.com/apache/spark/pull/18826 LogisticRegressionModel.toString should summarize model ## What changes were proposed in this pull request? [SPARK-14712](https://issues.apache.org/jira/browse/SPARK-14712

[GitHub] spark issue #18820: [SPARK-14932][SQL] Allow DataFrame.replace() to replace ...

2017-08-02 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/18820 This PR reopens https://github.com/apache/spark/pull/16225 Please take a look @gatorsmile @holdenk @HyukjinKwon Thanks!

[GitHub] spark issue #16225: [SPARK-14932][SQL] Allow DataFrame.replace() to replace ...

2017-08-02 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/16225 @holdenk Thanks for the review. I'll combine type(None) into the `isinstance` check. I also made Scala and Python accept null more generally and in the same way. PR is reopened at: https://github.com
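A minimal sketch of folding `type(None)` into an `isinstance` check, in plain Python. The helper name `is_valid_replacement` and the set of accepted types are assumptions for illustration, not the actual validation code in PySpark's `DataFrame.replace`.

```python
# Assumed set of accepted scalar types; type(None) sits in the same tuple
# instead of being handled by a separate "is None" branch.
ACCEPTED_TYPES = (bool, int, float, str, type(None))


def is_valid_replacement(value):
    """Return True if value is an accepted scalar or None (hypothetical helper)."""
    return isinstance(value, ACCEPTED_TYPES)


checks = [is_valid_replacement(v) for v in (None, 3, "x", [1, 2])]
print(checks)
```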

[GitHub] spark pull request #18820: [SPARK-14932][SQL] Allow DataFrame.replace() to r...

2017-08-02 Thread bravo-zhang
GitHub user bravo-zhang opened a pull request: https://github.com/apache/spark/pull/18820 [SPARK-14932][SQL] Allow DataFrame.replace() to replace values with None ## What changes were proposed in this pull request? Allow DataFrame.replace() to replace with None/null values
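The behavior this PR asks for can be illustrated without Spark. The sketch below works on a plain list of dicts; `replace_values` is a hypothetical helper, not the `DataFrame.replace()` implementation, and the rows are illustrative data. It only shows `None` being accepted as a replacement target.

```python
def replace_values(rows, column, mapping):
    """Return new rows with rows[i][column] replaced per mapping.

    None is allowed as a replacement value (hypothetical helper).
    """
    out = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        value = new_row[column]
        if value in mapping:
            new_row[column] = mapping[value]
        out.append(new_row)
    return out


rows = [{"name": "Bravo", "age": 28}, {"name": "Jessie", "age": 18}]
result = replace_values(rows, "name", {"Jessie": None})
print(result)
```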

[GitHub] spark pull request #16365: [SPARK-18950][SQL] Report conflicting fields when...

2017-07-31 Thread bravo-zhang
Github user bravo-zhang commented on a diff in the pull request: https://github.com/apache/spark/pull/16365#discussion_r130432806 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructType.scala --- @@ -469,9 +469,16 @@ object StructType extends AbstractDataType

[GitHub] spark issue #16365: [SPARK-18950][SQL] Report conflicting fields when mergin...

2017-07-25 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/16365 @HyukjinKwon @gatorsmile Test added.

[GitHub] spark issue #16225: [SPARK-14932][SQL] Allow DataFrame.replace() to replace ...

2017-05-23 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/16225 Thanks for taking a look, @gatorsmile. The conflicts have been resolved. I'd appreciate it if @zero323 could take a look as well, since you made improvements to this function recently.

[GitHub] spark issue #16365: [SPARK-18950][SQL] Report conflicting fields when mergin...

2016-12-20 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/16365 Thanks for the review @HyukjinKwon ! Are your Before and After stack traces swapped? Do you mean the message is confusing? How about correcting it to `Failed to merge field longcol

[GitHub] spark pull request #16365: [SPARK-18950][SQL] Report conflicting fields when...

2016-12-20 Thread bravo-zhang
GitHub user bravo-zhang opened a pull request: https://github.com/apache/spark/pull/16365 [SPARK-18950][SQL] Report conflicting fields when merging two StructTypes ## What changes were proposed in this pull request? Currently, StructType.merge() only reports data types
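The idea of naming the conflicting field in a merge error can be sketched on a toy schema. Here a schema is a plain `{field_name: type_name}` dict; `merge_schemas` and the error wording are hypothetical, not Spark's `StructType.merge()` or its actual message.

```python
def merge_schemas(left, right):
    """Merge two {field_name: type_name} dicts (toy stand-in for a struct schema)."""
    merged = dict(left)
    for name, right_type in right.items():
        left_type = merged.get(name)
        if left_type is not None and left_type != right_type:
            # Report the field name as well as the two incompatible types.
            raise ValueError(
                "Failed to merge field '%s': incompatible types %s and %s"
                % (name, left_type, right_type))
        merged[name] = right_type
    return merged


ok = merge_schemas({"id": "long"}, {"name": "string"})
try:
    merge_schemas({"longcol": "long"}, {"longcol": "int"})
    error = None
except ValueError as exc:
    error = str(exc)
print(ok, error)
```

Including the field name matters most for wide schemas, where a message listing only the two types gives no hint about which column to inspect.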

[GitHub] spark pull request #16225: [SPARK-14932][SQL] Allow DataFrame.replace() to r...

2016-12-09 Thread bravo-zhang
Github user bravo-zhang commented on a diff in the pull request: https://github.com/apache/spark/pull/16225#discussion_r91761099 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameNaFunctions.scala --- @@ -342,11 +342,14 @@ final class DataFrameNaFunctions private[sql

[GitHub] spark pull request #16225: [SPARK-14932][SQL] Allow DataFrame.replace() to r...

2016-12-08 Thread bravo-zhang
GitHub user bravo-zhang opened a pull request: https://github.com/apache/spark/pull/16225 [SPARK-14932][SQL] Allow DataFrame.replace() to replace values with None ## What changes were proposed in this pull request? Allow DataFrame.replace() to replace with None/null values

[GitHub] spark pull request #15787: [SPARK-18286][ML] Add Scala/Java examples for Min...

2016-12-04 Thread bravo-zhang
Github user bravo-zhang closed the pull request at: https://github.com/apache/spark/pull/15787

[GitHub] spark issue #15787: [SPARK-18286][ML] Add Scala/Java examples for MinHash an...

2016-12-04 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/15787 This PR can be closed since https://github.com/apache/spark/pull/15795 has been merged

[GitHub] spark issue #15795: [SPARK-18081] Add user guide for Locality Sensitive Hash...

2016-11-07 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/15795 Examples are usually class/algorithm based, not interface/use case based. Maybe we can summarize the 5 classes into 2? Would you mind modifying the examples in #15787 after it is merged instead

[GitHub] spark pull request #15795: [SPARK-18081] Add user guide for Locality Sensiti...

2016-11-07 Thread bravo-zhang
Github user bravo-zhang commented on a diff in the pull request: https://github.com/apache/spark/pull/15795#discussion_r86795061 --- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaRandomProjectionExample.java --- @@ -0,0 +1,72 @@ +/* + * Licensed

[GitHub] spark pull request #15795: [SPARK-18081] Add user guide for Locality Sensiti...

2016-11-07 Thread bravo-zhang
Github user bravo-zhang commented on a diff in the pull request: https://github.com/apache/spark/pull/15795#discussion_r86795179 --- Diff: examples/src/main/scala/org/apache/spark/examples/ml/RandomProjectionExample.scala --- @@ -0,0 +1,56 @@ +/* + * Licensed