[GitHub] spark pull request #17910: [SPARK-20669][ML] LogisticRegression family shoul...
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/17910#discussion_r116408136 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LogisticRegressionSuite.scala --- @@ -2318,8 +2319,8 @@ class LogisticRegressionSuite assert(m1.interceptVector ~== m2.interceptVector absTol 0.05) } val testParams = Seq( - ("binomial", smallBinaryDataset, 2), - ("multinomial", smallMultinomialDataset, 3) + ("Binomial", smallBinaryDataset, 2), --- End diff -- The changes you made don't address this comment at all, and there are not tests for the suggestion from Yanbo either. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17910: [SPARK-20669][ML] LogisticRegression family shoul...
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/17910#discussion_r115896918 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LogisticRegressionSuite.scala --- @@ -2318,8 +2319,8 @@ class LogisticRegressionSuite assert(m1.interceptVector ~== m2.interceptVector absTol 0.05) } val testParams = Seq( - ("binomial", smallBinaryDataset, 2), - ("multinomial", smallMultinomialDataset, 3) + ("Binomial", smallBinaryDataset, 2), --- End diff -- @sethah OK, I will add dedicated tests for this. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17910: [SPARK-20669][ML] LogisticRegression family shoul...
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/17910#discussion_r115813053 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LogisticRegressionSuite.scala --- @@ -2318,8 +2319,8 @@ class LogisticRegressionSuite assert(m1.interceptVector ~== m2.interceptVector absTol 0.05) } val testParams = Seq( - ("binomial", smallBinaryDataset, 2), - ("multinomial", smallMultinomialDataset, 3) + ("Binomial", smallBinaryDataset, 2), --- End diff -- What about "binomial" and "BiNoMiaL"? Also, what about doing: scala lr.setFamily("BiNomial") assert(lr.getFamily === "binomial") Also, I'm not a big fan of sprinkling these very subtle "tests" around the test suite. We should have dedicated tests of this functionality. Otherwise, how should future developers know that the capital "B" here is supposed to be there and is actually testing some desired functionality? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17910: [SPARK-20669][ML] LogisticRegression family shoul...
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/17910#discussion_r115669376 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -526,7 +526,7 @@ class LogisticRegression @Since("1.2.0") ( case None => histogram.length } -val isMultinomial = $(family) match { +val isMultinomial = $(family).toLowerCase(Locale.ROOT) match { --- End diff -- good point. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17910: [SPARK-20669][ML] LogisticRegression family shoul...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/17910#discussion_r115453085 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -890,7 +890,7 @@ object LogisticRegression extends DefaultParamsReadable[LogisticRegression] { override def load(path: String): LogisticRegression = super.load(path) private[classification] val supportedFamilyNames = -Array("auto", "binomial", "multinomial").map(_.toLowerCase(Locale.ROOT)) +Array("auto", "binomial", "multinomial") --- End diff -- +1 @hhbyyh Let's keep it to handle some special locale. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17910: [SPARK-20669][ML] LogisticRegression family shoul...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/17910#discussion_r115451397 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -526,7 +526,7 @@ class LogisticRegression @Since("1.2.0") ( case None => histogram.length } -val isMultinomial = $(family) match { +val isMultinomial = $(family).toLowerCase(Locale.ROOT) match { --- End diff -- +1 @zhengruifeng. I'd like to put the original value in param, since users may compare the param value with the original input as following: ``` val family = "Binomial" val lr = new LogisticRegression().setFamily(family) val model = lr.fit(dataset) ... if (family == lr.getFamily) println("A") else println("B") ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17910: [SPARK-20669][ML] LogisticRegression family shoul...
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/17910#discussion_r115418289 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -526,7 +526,7 @@ class LogisticRegression @Since("1.2.0") ( case None => histogram.length } -val isMultinomial = $(family) match { +val isMultinomial = $(family).toLowerCase(Locale.ROOT) match { --- End diff -- I follow the style in `GeneralizedLinearRegression`. Lower the param in setter can simplify the codes, but it also change the output of coresponding getter. What is your opinion? @yanboliang --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17910: [SPARK-20669][ML] LogisticRegression family shoul...
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/17910#discussion_r115418315 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -890,7 +890,7 @@ object LogisticRegression extends DefaultParamsReadable[LogisticRegression] { override def load(path: String): LogisticRegression = super.load(path) private[classification] val supportedFamilyNames = -Array("auto", "binomial", "multinomial").map(_.toLowerCase(Locale.ROOT)) +Array("auto", "binomial", "multinomial") --- End diff -- I am not sure about this. If we should keep `toLowerCase` here, we may also do this in `GeneralizedLinearRegression` and others --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17910: [SPARK-20669][ML] LogisticRegression family shoul...
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/17910#discussion_r115416085 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -526,7 +526,7 @@ class LogisticRegression @Since("1.2.0") ( case None => histogram.length } -val isMultinomial = $(family) match { +val isMultinomial = $(family).toLowerCase(Locale.ROOT) match { --- End diff -- As a general practice, I would recommend moving the `.toLowerCase(Locale.ROOT)` into the setter. Then we don't need to invoke the `.toLowerCase(Locale.ROOT)` multiple times in the code. (here it happens to be once). And we can always assume the $(family) has predictable values in the code. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17910: [SPARK-20669][ML] LogisticRegression family shoul...
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/17910#discussion_r115416204 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -890,7 +890,7 @@ object LogisticRegression extends DefaultParamsReadable[LogisticRegression] { override def load(path: String): LogisticRegression = super.load(path) private[classification] val supportedFamilyNames = -Array("auto", "binomial", "multinomial").map(_.toLowerCase(Locale.ROOT)) +Array("auto", "binomial", "multinomial") --- End diff -- We may need to be careful to remove the map. Since Locale.Root can be some special case. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17910: [SPARK-20669][ML] LogisticRegression family shoul...
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/17910 [SPARK-20669][ML] LogisticRegression family should be case insensitive ## What changes were proposed in this pull request? make param `family` case insensitive ## How was this patch tested? updated tests @yanboliang You can merge this pull request into a Git repository by running: $ git pull https://github.com/zhengruifeng/spark lr_family_lowercase Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17910.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17910 commit 33c0f9e52c239a6067a535be9c0ce19772d32aef Author: Zheng RuiFeng Date: 2017-05-09T05:43:13Z create pr --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org