[GitHub] [spark] zhengruifeng commented on pull request #31693: [SPARK-34448][ML] Binary logistic regression incorrectly computes the intercept and coefficients with small var features

2021-03-16 Thread GitBox
zhengruifeng commented on pull request #31693: URL: https://github.com/apache/spark/pull/31693#issuecomment-800713459 This PR may be too big. After an offline discussion with weichen, I will split it into several PRs.

[GitHub] [spark] zhengruifeng commented on pull request #31693: [SPARK-34448][ML] Binary logistic regression incorrectly computes the intercept and coefficients with small var features

2021-03-11 Thread GitBox
zhengruifeng commented on pull request #31693: URL: https://github.com/apache/spark/pull/31693#issuecomment-797195471 @WeichenXu123 avg(abs(diff_raw_prediction)/raw_prediction_from_center_model) = 0.0021191517840809986 max(abs(diff_raw_prediction)/raw_prediction_from_center_model

[GitHub] [spark] zhengruifeng commented on pull request #31693: [SPARK-34448][ML] Binary logistic regression incorrectly computes the intercept and coefficients with small var features

2021-03-10 Thread GitBox
zhengruifeng commented on pull request #31693: URL: https://github.com/apache/spark/pull/31693#issuecomment-796504039 I used the Scala test case from the ticket: ``` // scalastyle:off println test("BLR") { import org.apache.spark.ml.feature.VectorAssembler val centered

[GitHub] [spark] zhengruifeng commented on pull request #31693: [SPARK-34448][ML] Binary logistic regression incorrectly computes the intercept and coefficients with small var features

2021-03-09 Thread GitBox
zhengruifeng commented on pull request #31693: URL: https://github.com/apache/spark/pull/31693#issuecomment-794681499 @srowen We need a separate PR for branches before 3.1, because of the blockification introduced in 3.1.1. We should backport it to 3.0. As for 2.4, if 2.4.8 is the EOL

[GitHub] [spark] zhengruifeng commented on pull request #31693: [SPARK-34448][ML] Binary logistic regression incorrectly computes the intercept and coefficients with small var features

2021-03-08 Thread GitBox
zhengruifeng commented on pull request #31693: URL: https://github.com/apache/spark/pull/31693#issuecomment-793230111 retest this please

[GitHub] [spark] zhengruifeng commented on pull request #31693: [SPARK-34448][ML] Binary logistic regression incorrectly computes the intercept and coefficients with small var features

2021-03-08 Thread GitBox
zhengruifeng commented on pull request #31693: URL: https://github.com/apache/spark/pull/31693#issuecomment-792677693 glmnet result for the updated test suite: ![image](https://user-images.githubusercontent.com/7322292/110313211-16485200-8041-11eb-9d6d-9428a85618d7.png) -

[GitHub] [spark] zhengruifeng commented on pull request #31693: [SPARK-34448][ML] Binary logistic regression incorrectly computes the intercept and coefficients with small var features

2021-03-07 Thread GitBox
zhengruifeng commented on pull request #31693: URL: https://github.com/apache/spark/pull/31693#issuecomment-792496074 @srowen I think it is OK. @dbtsai @mengxr Could I ping you here? Since it seems that the existing behavior of standardization without removing centers originated from th