[jira] [Updated] (SPARK-13029) Logistic regression returns inaccurate results when there is a column with identical value, and fit_intercept=false

2016-02-26 Thread Josh Rosen (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-13029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Rosen updated SPARK-13029:
---
Target Version/s: 1.5.3, 2.0.0, 1.6.2  (was: 1.5.3, 1.6.1, 2.0.0)

> Logistic regression returns inaccurate results when there is a column with 
> identical value, and fit_intercept=false
> ---
>
> Key: SPARK-13029
> URL: https://issues.apache.org/jira/browse/SPARK-13029
> Project: Spark
>  Issue Type: Bug
>  Components: ML
>Affects Versions: 1.5.2, 1.6.0
>Reporter: Shuo Xiang
>Assignee: Shuo Xiang
>
> This bug appears when fitting a Logistic Regression model with 
> `.setStandardization(false)` and `.setFitIntercept(false)`. If the data matrix 
> has a constant column (every entry has the same value), the resulting model is 
> incorrect: that column always gets a weight of 0 because of a special check 
> inside the code. However, the correct solution, which is unique for 
> L2-regularized logistic regression, usually gives it a non-zero weight.
> I used the heart_scale data 
> (https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) and 
> manually augmented the data matrix with a column of ones (available in the 
> PR). The resulting data was run with reg=1.0, max_iter=1000, tol=1e-9 on the 
> following tools:
>  - libsvm
>  - scikit-learn
>  - sparkml
> (Notice libsvm and scikit-learn use a slightly different formulation, so 
> their regularizer is equivalently set to 1/270).
> The first two will have an objective value 0.7275 and give a solution vector:
> [0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
> 0.02436266296315414, 0.01739437315700921, -0.0006404006623321454, 
> 0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
> 0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
> 0.1801661775839843, -0.01248615347419409].
> Spark will produce an objective value 0.7278 and give a solution vector:
> [0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]
> Notice the last element of the weight vector is 0.
> An even simpler example is:
> {code:title=benchmark.py|borderStyle=solid}
> import numpy as np
> from sklearn.linear_model import LogisticRegression
> # Two samples; the second column is constant.
> x_train = np.array([[1, 1], [0, 1]])
> y_train = np.array([1, 0])
> model = LogisticRegression(tol=1e-9, C=0.5, max_iter=1000,
>                            fit_intercept=False).fit(x_train, y_train)
> print(model.coef_)
> # Output reported in this issue: [[ 0.22478867 -0.02241016]]
> {code}
> Training the current Spark solver on the same data gives a different result; 
> see the unit test in the PR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Updated] (SPARK-13029) Logistic regression returns inaccurate results when there is a column with identical value, and fit_intercept=false

2016-01-27 Thread Shuo Xiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-13029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuo Xiang updated SPARK-13029:
---
Description: 
This is a bug that appears while fitting a Logistic Regression model with 
`.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
has one column with identical value, the resulting model is not correct. 
Specifically, the special column will always get a weight of 0, due to the 
special check inside the code. However, the correct solution, which is unique 
for L2 logistic regression, usually has non-zero weight.
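
To illustrate why pinning the constant column's weight to 0 gives a worse solution, here is a minimal sketch on a toy two-sample dataset. The data, the helper names `objective`, `gradient`, and `minimize`, and the use of plain gradient descent are assumptions for illustration; this is not Spark's solver. It also assumes the objective is the mean logistic loss plus (reg/2) * ||w||^2, with no intercept and no standardization.

```python
import numpy as np

# Toy data (an assumption for illustration): the second column is constant.
X = np.array([[1.0, 1.0], [0.0, 1.0]])
t = np.array([1.0, 0.0])  # labels in {0, 1}
reg = 1.0                 # L2 strength in: mean(loss) + (reg / 2) * ||w||^2


def objective(w):
    """Mean logistic loss plus L2 penalty, no intercept."""
    z = X @ w
    # log(1 + exp(z)) - t * z is the logistic loss for labels in {0, 1}.
    loss = np.mean(np.logaddexp(0.0, z) - t * z)
    return loss + 0.5 * reg * np.dot(w, w)


def gradient(w):
    """Gradient of the objective above."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X.T @ (p - t) / len(t) + reg * w


def minimize(free_mask, steps=5000, lr=0.1):
    """Plain gradient descent; coordinates where free_mask is False stay at 0."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * gradient(w) * free_mask
    return w


w_free = minimize(np.array([True, True]))
w_pinned = minimize(np.array([True, False]))  # constant column forced to 0

print("free:  ", w_free, objective(w_free))
print("pinned:", w_pinned, objective(w_pinned))
```

Because the penalized objective is strictly convex, its minimizer is unique, so any solution that forces a coordinate to 0 when the true minimizer has it non-zero must have a strictly larger objective value.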

I used the heart_scale data 
(https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) and 
manually augmented the data matrix with a column of ones (available in the PR). 
The resulting data was run with reg=1.0, max_iter=1000, tol=1e-9 on the 
following tools:
 - libsvm
 - scikit-learn
 - sparkml

(Notice libsvm and scikit-learn use a slightly different formulation, so their 
regularizer is equivalently set to 1/270).
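
The 1/270 figure can be reproduced with a small conversion, sketched here under two assumptions: that Spark minimizes mean(loss) + (reg/2) * ||w||^2 for pure L2 regularization, and that scikit-learn and liblinear minimize C * sum(loss) + (1/2) * ||w||^2. Dividing the first form by reg shows both share a minimizer when C = 1 / (reg * n). The helper name `sklearn_C_from_reg` is hypothetical; heart_scale has 270 rows.

```python
def sklearn_C_from_reg(reg, n_samples):
    """Return C such that C * sum(loss) + 0.5 * ||w||^2 has the same
    minimizer as mean(loss) + 0.5 * reg * ||w||^2 (assumed formulations)."""
    return 1.0 / (reg * n_samples)


# heart_scale: 270 samples, reg = 1.0
print(sklearn_C_from_reg(1.0, 270))  # 1/270

# Two-sample toy problem with reg = 1.0
print(sklearn_C_from_reg(1.0, 2))    # 0.5
```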

The first two will have an objective value 0.7275 and give a solution vector:
[0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
0.02436266296315414, 0.01739437315700921, -0.0006404006623321454, 
0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
0.1801661775839843, -0.01248615347419409].

Spark will produce an objective value 0.7278 and give a solution vector:
[0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]

Notice the last element of the weight vector is 0.


An even simpler example is:
import numpy as np
from sklearn.linear_model import LogisticRegression
# Two samples; the second column is constant.
x_train = np.array([[1, 1], [0, 1]])
y_train = np.array([1, 0])
model = LogisticRegression(tol=1e-9, C=0.5, max_iter=1000,
                           fit_intercept=False).fit(x_train, y_train)
print(model.coef_)

   [[ 0.22478867 -0.02241016]]

Training the current Spark solver on the same data gives a different result; see 
the unit test in the PR.
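
The scikit-learn coefficients can be cross-checked without scikit-learn. The sketch below is an independent check, not the code from the PR: it applies Newton's method in NumPy to the objective C * sum(logistic loss) + (1/2) * ||w||^2 with no intercept, which is assumed to be the formulation scikit-learn uses here.

```python
import numpy as np

X = np.array([[1.0, 1.0], [0.0, 1.0]])  # second column is constant
t = np.array([1.0, 0.0])                # labels in {0, 1}
C = 0.5

w = np.zeros(2)
for _ in range(50):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))      # predicted probabilities
    grad = C * (X.T @ (p - t)) + w          # gradient of the objective
    # Hessian: C * X^T diag(p * (1 - p)) X + I
    hess = C * ((X.T * (p * (1.0 - p))) @ X) + np.eye(2)
    w -= np.linalg.solve(hess, grad)        # Newton step

print(w)  # compare with the coefficients 0.22478867 and -0.02241016 reported in the issue
```

The weight on the constant column converges to a small but clearly non-zero value, so a solver that returns exactly 0 for it is not at the minimizer.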


  was:
This is a bug that appears while fitting a Logistic Regression model with 
`.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
has one column with identical value, the resulting model is not correct. 
Specifically, the special column will always get a weight of 0, due to the 
special check inside the code. However, the correct solution, which is unique 
for L2 logistic regression, usually has non-zero weight.

I used the heart_scale data 
(https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) and 
manually augmented the data matrix with a column of ones (available in the PR). 
The resulting data was run with reg=1.0, max_iter=1000, tol=1e-9 on the 
following tools:
 - libsvm
 - scikit-learn
 - sparkml

(Notice libsvm and scikit-learn use a slightly different formulation, so their 
regularizer is equivalently set to 1/270).

The first two will have an objective value 0.7275 and give a solution vector:
[0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
0.02436266296315414, 0.01739437315700921, -0.0006404006623321454, 
0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
0.1801661775839843, -0.01248615347419409].

Spark will produce an objective value 0.7278 and give a solution vector:
[0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]

Notice the last element of the weight vector is 0.



> Logistic regression returns inaccurate results when there is a column with 
> identical value, and fit_intercept=false
> ---
>
> Key: SPARK-13029
> URL: https://issues.apache.org/jira/browse/SPARK-13029
> Project: Spark
>  Issue Type: Bug
>  Components: ML
>Affects Versions: 1.5.2, 1.6.0
>Reporter: Shuo Xiang
>Assignee: Shuo Xiang
>
> This is a bug that appears while fitting a Logistic Regression model with 
> `.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
> has one column with identical value, the resulting model is not correct. 
> Specifically, the special column will always get a weight of 0, due to the 
> special check inside the code. However, the correct solution, which is unique 
> for L2 logistic

[jira] [Updated] (SPARK-13029) Logistic regression returns inaccurate results when there is a column with identical value, and fit_intercept=false

2016-01-27 Thread Shuo Xiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-13029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuo Xiang updated SPARK-13029:
---
Description: 
This is a bug that appears while fitting a Logistic Regression model with 
`.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
has one column with identical value, the resulting model is not correct. 
Specifically, the special column will always get a weight of 0, due to the 
special check inside the code. However, the correct solution, which is unique 
for L2 logistic regression, usually has non-zero weight.

I used the heart_scale data 
(https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) and 
manually augmented the data matrix with a column of ones (available in the PR). 
The resulting data was run with reg=1.0, max_iter=1000, tol=1e-9 on the 
following tools:
 - libsvm
 - scikit-learn
 - sparkml

(Notice libsvm and scikit-learn use a slightly different formulation, so their 
regularizer is equivalently set to 1/270).

The first two will have an objective value 0.7275 and give a solution vector:
[0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
0.02436266296315414, 0.01739437315700921, -0.0006404006623321454, 
0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
0.1801661775839843, -0.01248615347419409].

Spark will produce an objective value 0.7278 and give a solution vector:
[0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]

Notice the last element of the weight vector is 0.


An even simpler example is:
{code:title=benchmark.py|borderStyle=solid}
import numpy as np
from sklearn.linear_model import LogisticRegression
# Two samples; the second column is constant.
x_train = np.array([[1, 1], [0, 1]])
y_train = np.array([1, 0])
model = LogisticRegression(tol=1e-9, C=0.5, max_iter=1000,
                           fit_intercept=False).fit(x_train, y_train)
print(model.coef_)
# Output reported in this issue: [[ 0.22478867 -0.02241016]]
{code}

Training the current Spark solver on the same data gives a different result; see 
the unit test in the PR.


  was:
This is a bug that appears while fitting a Logistic Regression model with 
`.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
has one column with identical value, the resulting model is not correct. 
Specifically, the special column will always get a weight of 0, due to the 
special check inside the code. However, the correct solution, which is unique 
for L2 logistic regression, usually has non-zero weight.

I used the heart_scale data 
(https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) and 
manually augmented the data matrix with a column of ones (available in the PR). 
The resulting data was run with reg=1.0, max_iter=1000, tol=1e-9 on the 
following tools:
 - libsvm
 - scikit-learn
 - sparkml

(Notice libsvm and scikit-learn use a slightly different formulation, so their 
regularizer is equivalently set to 1/270).

The first two will have an objective value 0.7275 and give a solution vector:
[0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
0.02436266296315414, 0.01739437315700921, -0.0006404006623321454, 
0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
0.1801661775839843, -0.01248615347419409].

Spark will produce an objective value 0.7278 and give a solution vector:
[0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]

Notice the last element of the weight vector is 0.


An even simpler example is:
{code:title=benchmark.py|borderStyle=solid}
import numpy as np
from sklearn.linear_model import LogisticRegression
# Two samples; the second column is constant.
x_train = np.array([[1, 1], [0, 1]])
y_train = np.array([1, 0])
model = LogisticRegression(tol=1e-9, C=0.5, max_iter=1000,
                           fit_intercept=False).fit(x_train, y_train)
print(model.coef_)
# Output reported in this issue: [[ 0.22478867 -0.02241016]]
{code}

Training the current Spark solver on the same data gives a different result; see 
the unit test in the PR.



> Logistic regression returns inaccurate results when there is a column with 
> identical value, and fit_intercept=false
> ---
>
> Key: SPARK-13029
> URL: https://issues.apache.org/jira/browse/SPARK-13029
> Project: Spark

[jira] [Updated] (SPARK-13029) Logistic regression returns inaccurate results when there is a column with identical value, and fit_intercept=false

2016-01-27 Thread Shuo Xiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-13029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuo Xiang updated SPARK-13029:
---
Description: 
This is a bug that appears while fitting a Logistic Regression model with 
`.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
has one column with identical value, the resulting model is not correct. 
Specifically, the special column will always get a weight of 0, due to the 
special check inside the code. However, the correct solution, which is unique 
for L2 logistic regression, usually has non-zero weight.

I used the heart_scale data 
(https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) and 
manually augmented the data matrix with a column of ones (available in the PR). 
The resulting data was run with reg=1.0, max_iter=1000, tol=1e-9 on the 
following tools:
 - libsvm
 - scikit-learn
 - sparkml

(Notice libsvm and scikit-learn use a slightly different formulation, so their 
regularizer is equivalently set to 1/270).

The first two will have an objective value 0.7275 and give a solution vector:
[0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
0.02436266296315414, 0.01739437315700921, -0.0006404006623321454, 
0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
0.1801661775839843, -0.01248615347419409].

Spark will produce an objective value 0.7278 and give a solution vector:
[0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]

Notice the last element of the weight vector is 0.


An even simpler example is:
{code:title=benchmark.py|borderStyle=solid}
import numpy as np
from sklearn.linear_model import LogisticRegression
# Two samples; the second column is constant.
x_train = np.array([[1, 1], [0, 1]])
y_train = np.array([1, 0])
model = LogisticRegression(tol=1e-9, C=0.5, max_iter=1000,
                           fit_intercept=False).fit(x_train, y_train)
print(model.coef_)
# Output reported in this issue: [[ 0.22478867 -0.02241016]]
{code}

Training the current Spark solver on the same data gives a different result; see 
the unit test in the PR.


  was:
This is a bug that appears while fitting a Logistic Regression model with 
`.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
has one column with identical value, the resulting model is not correct. 
Specifically, the special column will always get a weight of 0, due to the 
special check inside the code. However, the correct solution, which is unique 
for L2 logistic regression, usually has non-zero weight.

I used the heart_scale data 
(https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) and 
manually augmented the data matrix with a column of ones (available in the PR). 
The resulting data was run with reg=1.0, max_iter=1000, tol=1e-9 on the 
following tools:
 - libsvm
 - scikit-learn
 - sparkml

(Notice libsvm and scikit-learn use a slightly different formulation, so their 
regularizer is equivalently set to 1/270).

The first two will have an objective value 0.7275 and give a solution vector:
[0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
0.02436266296315414, 0.01739437315700921, -0.0006404006623321454, 
0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
0.1801661775839843, -0.01248615347419409].

Spark will produce an objective value 0.7278 and give a solution vector:
[0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]

Notice the last element of the weight vector is 0.


An even simpler example is:
import numpy as np
from sklearn.linear_model import LogisticRegression
# Two samples; the second column is constant.
x_train = np.array([[1, 1], [0, 1]])
y_train = np.array([1, 0])
model = LogisticRegression(tol=1e-9, C=0.5, max_iter=1000,
                           fit_intercept=False).fit(x_train, y_train)
print(model.coef_)

   [[ 0.22478867 -0.02241016]]

Training the current Spark solver on the same data gives a different result; see 
the unit test in the PR.



> Logistic regression returns inaccurate results when there is a column with 
> identical value, and fit_intercept=false
> ---
>
> Key: SPARK-13029
> URL: https://issues.apache.org/jira/browse/SPARK-13029
>

[jira] [Updated] (SPARK-13029) Logistic regression returns inaccurate results when there is a column with identical value, and fit_intercept=false

2016-01-26 Thread Shuo Xiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-13029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuo Xiang updated SPARK-13029:
---
Description: 
This is a bug that appears while fitting a Logistic Regression model with 
`.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
has one column with identical value, the resulting model is not correct. 
Specifically, the special column will always get a weight of 0, due to the 
special check inside the code. However, the correct solution, which is unique 
for L2 logistic regression, usually has non-zero weight.

I used the heart_scale data 
(https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) and 
manually augmented the data matrix with a column of ones. The resulting data was 
run with reg=1.0, max_iter=1000, tol=1e-9 on the following tools:
 - libsvm
 - scikit-learn
 - sparkml

(Notice libsvm and scikit-learn use a slightly different formulation, so their 
regularizer is equivalently set to 1/270).

The first two will have an objective value 0.7275 and give a solution vector:
[0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
0.02436266296315414, 0.01739437315700921, -0.0006404006623321454, 
0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
0.1801661775839843, -0.01248615347419409].

Spark will produce an objective value 0.7278 and give a solution vector:
[0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]

Notice the last element of the weight vector is 0.


  was:
This is a bug that appears while fitting a Logistic Regression model with 
`.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
has one column with identical value, the resulting model is not correct. 
Specifically, the special column will always get a weight of 0, due to the 
special check inside the code. However, the correct solution, which is unique 
for L2 logistic regression, usually has non-zero weight.

I used the [heart_scale 
data](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) and 
manually augmented the data matrix with a column of ones. The resulting data was 
run with reg=1.0, max_iter=1000, tol=1e-9 on the following tools:
 - libsvm
 - scikit-learn
 - sparkml

(Notice libsvm and scikit-learn use a slightly different formulation, so their 
regularizer is equivalently set to 1/270).

The first two will have an objective value 0.7275 and give a solution vector:
[0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
0.02436266296315414, 0.01739437315700921, -0.0006404006623321454, 
0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
0.1801661775839843, -0.01248615347419409].

Spark will produce an objective value 0.7278 and give a solution vector:
[0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]

Notice the last element of the weight vector is 0.



> Logistic regression returns inaccurate results when there is a column with 
> identical value, and fit_intercept=false
> ---
>
> Key: SPARK-13029
> URL: https://issues.apache.org/jira/browse/SPARK-13029
> Project: Spark
>  Issue Type: Bug
>  Components: ML
>Affects Versions: 1.5.2, 1.6.0
>Reporter: Shuo Xiang
>Assignee: Shuo Xiang
>
> This is a bug that appears while fitting a Logistic Regression model with 
> `.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
> has one column with identical value, the resulting model is not correct. 
> Specifically, the special column will always get a weight of 0, due to the 
> special check inside the code. However, the correct solution, which is unique 
> for L2 logistic regression, usually has non-zero weight.
> I used the heart_scale data 
> (https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) and 
> manually augmented the data matrix with a column of ones. The resulting data 
> was run with reg=1.0, max_iter=1000, tol=1e-9 on the following tools:
>  - libsvm
>  - scikit-learn
>  - sparkml
> (Notice libsvm and scikit-learn use a slightly different formulation, so 
> their regularizer is equivalently set to 1/270).
> The first two will have an objective value 0.7275 and give a solution vector:
> [0.03007516959304916,

[jira] [Updated] (SPARK-13029) Logistic regression returns inaccurate results when there is a column with identical value, and fit_intercept=false

2016-01-26 Thread Shuo Xiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-13029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuo Xiang updated SPARK-13029:
---
Description: 
This is a bug that appears while fitting a Logistic Regression model with 
`.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
has one column with identical value, the resulting model is not correct. 
Specifically, the special column will always get a weight of 0, due to the 
special check inside the code. However, the correct solution, which is unique 
for L2 logistic regression, usually has non-zero weight.

I used the [heart_scale 
data](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) and 
manually augmented the data matrix with a column of ones. The resulting data was 
run with reg=1.0, max_iter=1000, tol=1e-9 on the following tools:
 - libsvm
 - scikit-learn
 - sparkml

(Notice libsvm and scikit-learn use a slightly different formulation, so their 
regularizer is equivalently set to 1/270).

The first two will have an objective value 0.7275 and give a solution vector:
[0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
0.02436266296315414, 0.01739437315700921, -0.0006404006623321454, 
0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
0.1801661775839843, -0.01248615347419409].

Spark will produce an objective value 0.7278 and give a solution vector:
[0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]

Notice the last element of the weight vector is 0.


  was:
This is a bug that appears while fitting a Logistic Regression model with 
`.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
has one column with identical value, the resulting model is not correct. 
Specifically, the special column will always get a weight of 0, due to the 
special check inside the code. However, the correct solution, which is unique 
for L2 logistic regression, usually has non-zero weight.

I used the [heart_scale 
data](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) and 
manually augmented the data matrix with a column of ones. The resulting data was 
run with reg=1.0, max_iter=1000, tol=1e-9 on the following tools:
 - libsvm
 - scikit-learn
 - sparkml

(Notice libsvm and scikit-learn use a slightly different formulation, so their 
regularizer is equivalently set to 1/270).

The first two will have an objective value 0.7275 and give a solution vector:
[0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
0.02436266296315414, 0.01739437315700921, -0.0006404006623321454, 
0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
0.1801661775839843, -0.01248615347419409].

Spark will produce an objective value 0.7278 and give a solution vector:
[0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]

Notice the last element of the weight vector is 0.

I have a fix for it that passes the above test.



> Logistic regression returns inaccurate results when there is a column with 
> identical value, and fit_intercept=false
> ---
>
> Key: SPARK-13029
> URL: https://issues.apache.org/jira/browse/SPARK-13029
> Project: Spark
>  Issue Type: Bug
>  Components: ML
>Affects Versions: 1.5.2, 1.6.0
>Reporter: Shuo Xiang
>Assignee: Shuo Xiang
>
> This is a bug that appears while fitting a Logistic Regression model with 
> `.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
> has one column with identical value, the resulting model is not correct. 
> Specifically, the special column will always get a weight of 0, due to the 
> special check inside the code. However, the correct solution, which is unique 
> for L2 logistic regression, usually has non-zero weight.
> I used the [heart_scale 
> data](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) 
> and manually augmented the data matrix with a column of ones. The resulting 
> data was run with reg=1.0, max_iter=1000, tol=1e-9 on the following tools:
>  - libsvm
>  - scikit-learn
>  - sparkml
> (Notice libsvm and scikit-learn use a slightly different formulation, so 
> their regularizer is equivalently set to 1/270).
> The first two will have an objective value 0.7275 and give

[jira] [Updated] (SPARK-13029) Logistic regression returns inaccurate results when there is a column with identical value, and fit_intercept=false

2016-01-26 Thread Shuo Xiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-13029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuo Xiang updated SPARK-13029:
---
Description: 
This is a bug that appears while fitting a Logistic Regression model with 
`.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
has one column with identical value, the resulting model is not correct. 
Specifically, the special column will always get a weight of 0, due to the 
special check inside the code. However, the correct solution, which is unique 
for L2 logistic regression, usually has non-zero weight.

I used the [heart_scale 
data](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) and 
manually augmented the data matrix with a column of ones. The resulting data was 
run with reg=1.0, max_iter=1000, tol=1e-9 on the following tools:
 - libsvm
 - scikit-learn
 - sparkml

(Notice libsvm and scikit-learn use a slightly different formulation, so their 
regularizer is equivalently set to 1/270).

The first two will have an objective value 0.7275 and give a solution vector:
[0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
0.02436266296315414, 0.01739437315700921, -0.0006404006623321454, 
0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
0.1801661775839843, -0.01248615347419409].

Spark will produce an objective value 0.7278 and give a solution vector:
[0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]

Notice the last element of the weight vector is 0.

I have a fix for it that passes the above test.


  was:
This is a bug that appears while fitting a Logistic Regression model with 
`.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
has one column with identical value, the resulting model is not correct. 
Specifically, the special column will always get a weight of 0, due to the 
special check inside the code. However, the correct solution, which is unique 
for L2 logistic regression, usually has non-zero weight.

I used the [heart_scale 
data](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) and 
manually augmented the data matrix with a column of ones. The resulting data was 
run with reg=1.0, max_iter=1000, tol=1e-9 on the following tools:
 - libsvm
 - scikit-learn
 - sparkml

(Notice libsvm and scikit-learn use a slightly different formulation, so their 
regularizer is equivalently set to 1/270).

The first two will have an objective value 0.7275 and give a solution vector:
[0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
0.02436266296315414, 0.01739437315700921, -0.0006404006623321454, 
0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
0.1801661775839843, -0.01248615347419409].

Spark will produce an objective value 0.7278 and give a solution vector:
[0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]

I have a fix for it that passes the above test.



> Logistic regression returns inaccurate results when there is a column with 
> identical value, and fit_intercept=false
> ---
>
> Key: SPARK-13029
> URL: https://issues.apache.org/jira/browse/SPARK-13029
> Project: Spark
>  Issue Type: Bug
>  Components: ML
>Affects Versions: 1.5.2, 1.6.0
>Reporter: Shuo Xiang
>
> This is a bug that appears while fitting a Logistic Regression model with 
> `.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
> has one column with identical value, the resulting model is not correct. 
> Specifically, the special column will always get a weight of 0, due to the 
> special check inside the code. However, the correct solution, which is unique 
> for L2 logistic regression, usually has non-zero weight.
> I use the [heart_scale 
> data](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) 
> and manually augmented the data matrix with a column of one. The resulting 
> data is run with reg=1.0, max_iter=1000, tol=1e-9 on the following tools:
>  - libsvm
>  - scikit-learn
>  - sparkml
> (Notice libsvm and scikit-learn use a slightly different formulation, so 
> their regularizer is equivalently set to 1/270).
> The first two will have an objective value 0.7275 and give a solution vector:
>

[jira] [Updated] (SPARK-13029) Logistic regression returns inaccurate results when there is a column with identical value, and fit_intercept=false

2016-01-26 Thread Xiangrui Meng (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-13029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiangrui Meng updated SPARK-13029:
--
Shepherd: Xiangrui Meng
Target Version/s: 1.5.3, 1.6.1, 2.0.0

> Logistic regression returns inaccurate results when there is a column with 
> identical value, and fit_intercept=false
> ---
>
> Key: SPARK-13029
> URL: https://issues.apache.org/jira/browse/SPARK-13029
> Project: Spark
> Issue Type: Bug
> Components: ML
> Affects Versions: 1.5.2, 1.6.0
> Reporter: Shuo Xiang
> Assignee: Shuo Xiang
>
> This is a bug that appears while fitting a Logistic Regression model with 
> `.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
> has one column with identical value, the resulting model is not correct. 
> Specifically, the special column will always get a weight of 0, due to the 
> special check inside the code. However, the correct solution, which is unique 
> for L2 logistic regression, usually has non-zero weight.
> I use the [heart_scale 
> data](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) 
> and manually augmented the data matrix with a column of one. The resulting 
> data is run with reg=1.0, max_iter=1000, tol=1e-9 on the following tools:
>  - libsvm
>  - scikit-learn
>  - sparkml
> (Notice libsvm and scikit-learn use a slightly different formulation, so 
> their regularizer is equivalently set to 1/270).
> The first two will have an objective value 0.7275 and give a solution vector:
> [0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
> 0.02436266296315414, 0.01739437315700921, -0.0006404006623321454,
> 0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
> 0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
> 0.1801661775839843, -0.01248615347419409].
> Spark will produce an objective value 0.7278 and give a solution vector:
> [0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]
> Notice the last element of the weight vector is 0.
> I have a fix for it and passed the above test.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Updated] (SPARK-13029) Logistic regression returns inaccurate results when there is a column with identical value, and fit_intercept=false

2016-01-26 Thread Xiangrui Meng (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-13029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiangrui Meng updated SPARK-13029:
--
Assignee: Shuo Xiang

> Logistic regression returns inaccurate results when there is a column with 
> identical value, and fit_intercept=false
> ---
>
> Key: SPARK-13029
> URL: https://issues.apache.org/jira/browse/SPARK-13029
> Project: Spark
> Issue Type: Bug
> Components: ML
> Affects Versions: 1.5.2, 1.6.0
> Reporter: Shuo Xiang
> Assignee: Shuo Xiang
>
> This is a bug that appears while fitting a Logistic Regression model with 
> `.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
> has one column with identical value, the resulting model is not correct. 
> Specifically, the special column will always get a weight of 0, due to the 
> special check inside the code. However, the correct solution, which is unique 
> for L2 logistic regression, usually has non-zero weight.
> I use the [heart_scale 
> data](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) 
> and manually augmented the data matrix with a column of one. The resulting 
> data is run with reg=1.0, max_iter=1000, tol=1e-9 on the following tools:
>  - libsvm
>  - scikit-learn
>  - sparkml
> (Notice libsvm and scikit-learn use a slightly different formulation, so 
> their regularizer is equivalently set to 1/270).
> The first two will have an objective value 0.7275 and give a solution vector:
> [0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
> 0.02436266296315414, 0.01739437315700921, -0.0006404006623321454,
> 0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
> 0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
> 0.1801661775839843, -0.01248615347419409].
> Spark will produce an objective value 0.7278 and give a solution vector:
> [0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]
> Notice the last element of the weight vector is 0.
> I have a fix for it and passed the above test.




[jira] [Updated] (SPARK-13029) Logistic regression returns inaccurate results when there is a column with identical value, and fit_intercept=false

2016-01-26 Thread Shuo Xiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-13029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuo Xiang updated SPARK-13029:
---
Description: 
This is a bug that appears while fitting a Logistic Regression model with 
`.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
has one column with identical value, the resulting model is not correct. 
Specifically, the special column will always get a weight of 0, due to the 
special check inside the code. However, the correct solution, which is unique 
for L2 logistic regression, usually has non-zero weight.

I use the heart_scale data 
(https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) and 
manually augmented the data matrix with a column of one (available in the PR). 
The resulting data is run with reg=1.0, max_iter=1000, tol=1e-9 on the 
following tools:
 - libsvm
 - scikit-learn
 - sparkml

(Notice libsvm and scikit-learn use a slightly different formulation, so their 
regularizer is equivalently set to 1/270).

The first two will have an objective value 0.7275 and give a solution vector:
[0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
0.02436266296315414, 0.01739437315700921, -0.0006404006623321454,
0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
0.1801661775839843, -0.01248615347419409].

Spark will produce an objective value 0.7278 and give a solution vector:
[0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]

Notice the last element of the weight vector is 0.


  was:
This is a bug that appears while fitting a Logistic Regression model with 
`.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
has one column with identical value, the resulting model is not correct. 
Specifically, the special column will always get a weight of 0, due to the 
special check inside the code. However, the correct solution, which is unique 
for L2 logistic regression, usually has non-zero weight.

I use the heart_scale data 
(https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html, also 
available in the PR) and manually augmented the data matrix with a column of 
one. The resulting data is run with reg=1.0, max_iter=1000, tol=1e-9 on the 
following tools:
 - libsvm
 - scikit-learn
 - sparkml

(Notice libsvm and scikit-learn use a slightly different formulation, so their 
regularizer is equivalently set to 1/270).

The first two will have an objective value 0.7275 and give a solution vector:
[0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
0.02436266296315414, 0.01739437315700921, -0.0006404006623321454,
0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
0.1801661775839843, -0.01248615347419409].

Spark will produce an objective value 0.7278 and give a solution vector:
[0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]

Notice the last element of the weight vector is 0.



> Logistic regression returns inaccurate results when there is a column with 
> identical value, and fit_intercept=false
> ---
>
> Key: SPARK-13029
> URL: https://issues.apache.org/jira/browse/SPARK-13029
> Project: Spark
> Issue Type: Bug
> Components: ML
> Affects Versions: 1.5.2, 1.6.0
> Reporter: Shuo Xiang
> Assignee: Shuo Xiang
>
> This is a bug that appears while fitting a Logistic Regression model with 
> `.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
> has one column with identical value, the resulting model is not correct. 
> Specifically, the special column will always get a weight of 0, due to the 
> special check inside the code. However, the correct solution, which is unique 
> for L2 logistic regression, usually has non-zero weight.
> I use the heart_scale data 
> (https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) and 
> manually augmented the data matrix with a column of one (available in the 
> PR). The resulting data is run with reg=1.0, max_iter=1000, tol=1e-9 on the 
> following tools:
>  - libsvm
>  - scikit-learn
>  - sparkml
> (Notice libsvm and scikit-learn use a slightly different formulation, so 
> their regularizer is equivalently set to 1/270).
> The first two will have an

[jira] [Updated] (SPARK-13029) Logistic regression returns inaccurate results when there is a column with identical value, and fit_intercept=false

2016-01-26 Thread Shuo Xiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-13029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuo Xiang updated SPARK-13029:
---
Description: 
This is a bug that appears while fitting a Logistic Regression model with 
`.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
has one column with identical value, the resulting model is not correct. 
Specifically, the special column will always get a weight of 0, due to the 
special check inside the code. However, the correct solution, which is unique 
for L2 logistic regression, usually has non-zero weight.

I use the heart_scale data 
(https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html, also 
available in the PR) and manually augmented the data matrix with a column of 
one. The resulting data is run with reg=1.0, max_iter=1000, tol=1e-9 on the 
following tools:
 - libsvm
 - scikit-learn
 - sparkml

(Notice libsvm and scikit-learn use a slightly different formulation, so their 
regularizer is equivalently set to 1/270).

The first two will have an objective value 0.7275 and give a solution vector:
[0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
0.02436266296315414, 0.01739437315700921, -0.0006404006623321454,
0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
0.1801661775839843, -0.01248615347419409].

Spark will produce an objective value 0.7278 and give a solution vector:
[0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]

Notice the last element of the weight vector is 0.


  was:
This is a bug that appears while fitting a Logistic Regression model with 
`.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
has one column with identical value, the resulting model is not correct. 
Specifically, the special column will always get a weight of 0, due to the 
special check inside the code. However, the correct solution, which is unique 
for L2 logistic regression, usually has non-zero weight.

I use the heart_scale data 
(https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) and 
manually augmented the data matrix with a column of one. The resulting data is 
run with reg=1.0, max_iter=1000, tol=1e-9 on the following tools:
 - libsvm
 - scikit-learn
 - sparkml

(Notice libsvm and scikit-learn use a slightly different formulation, so their 
regularizer is equivalently set to 1/270).

The first two will have an objective value 0.7275 and give a solution vector:
[0.03007516959304916, 0.09054186091216457, 0.09540306114820495, 
0.02436266296315414, 0.01739437315700921, -0.0006404006623321454,
0.06367837291956932, -0.0589096636263823, 0.1382458934368336, 
0.06653302996539669, 0.07988499067852513, 0.1197789052423401, 
0.1801661775839843, -0.01248615347419409].

Spark will produce an objective value 0.7278 and give a solution vector:
[0.029917351003921247,0.08993936770232434,0.09458507615360119,0.024920710363734895,0.018259589234194296,5.929247527202199E-4,0.06362198973221662,-0.059307008587031494,0.13886738997128056,0.0678246717525043,0.08062880450385658,0.12084979858539521,0.180460850026883,0.0]

Notice the last element of the weight vector is 0.



> Logistic regression returns inaccurate results when there is a column with 
> identical value, and fit_intercept=false
> ---
>
> Key: SPARK-13029
> URL: https://issues.apache.org/jira/browse/SPARK-13029
> Project: Spark
> Issue Type: Bug
> Components: ML
> Affects Versions: 1.5.2, 1.6.0
> Reporter: Shuo Xiang
> Assignee: Shuo Xiang
>
> This is a bug that appears while fitting a Logistic Regression model with 
> `.setStandardization(false)` and `setFitIntercept(false)`. If the data matrix 
> has one column with identical value, the resulting model is not correct. 
> Specifically, the special column will always get a weight of 0, due to the 
> special check inside the code. However, the correct solution, which is unique 
> for L2 logistic regression, usually has non-zero weight.
> I use the heart_scale data 
> (https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html, also 
> available in the PR) and manually augmented the data matrix with a column of 
> one. The resulting data is run with reg=1.0, max_iter=1000, tol=1e-9 on the 
> following tools:
>  - libsvm
>  - scikit-learn
>  - sparkml
> (Notice libsvm and scikit-learn use a slightly different formulation, so 
> their regularizer is equivalently set to 1/270).
> The first two will have an objective value 0.7275 and
