[jira] [Commented] (MATH-1428) OLSMultipleLinearRegression estimates different residuals with different order of input

2020-07-03 Thread Gilles Sadowski (Jira)


[ 
https://issues.apache.org/jira/browse/MATH-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151110#comment-17151110
 ] 

Gilles Sadowski commented on MATH-1428:
---

Thanks for the additional information.
Hopefully someone will delve into the code and make it more robust. ;-)

> OLSMultipleLinearRegression estimates different residuals with different
> order of input
>
>
>              Key: MATH-1428
>              URL: https://issues.apache.org/jira/browse/MATH-1428
>          Project: Commons Math
>       Issue Type: Bug
> Affects Versions: 3.4.1
>      Environment: Windows 7 64-bit, JDK 1.8, IntelliJ IDEA
>         Reporter: butchild
>         Priority: Major
>           Labels: ols, regression, residuals
>
> I have a regression job with 31 X variables, 30 of which are dummy variables.
> The data set has 800+ rows.
> I'm using OLSMultipleLinearRegression to do the regression.
> I found that if I change the order of the 800+ rows, the residuals I get from
> ols.estimateResiduals()
> are different, and the correlation between the two sets of residuals is near
> 100%, about 99.8%.
> My data is below, in the Docs Text area.
> The fields of each column are:
> sig,y,x1,x2,xn



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (MATH-1428) OLSMultipleLinearRegression estimates different residuals with different order of input

2020-07-03 Thread David Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/MATH-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150921#comment-17150921
 ] 

David Hudson commented on MATH-1428:


I encountered this issue recently, also with a dataset having multiple dummy 
variables.

It turned out that the columns were not linearly independent. After I removed
one of the dummy variables, different column orders produced a stable output,
as expected.
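
The linear dependence described above can be checked before fitting. The
following is a minimal sketch in plain Java, not code from Commons Math: the
class name DummyTrapRankCheck and the helpers design() and rank() are
hypothetical, and the six-row, three-category data set is made up for
illustration. It assumes the model includes an intercept (which
OLSMultipleLinearRegression adds by default unless setNoIntercept(true) is
called) and that every category has its own dummy column. Under those
assumptions the dummy columns sum to the intercept column, so the design
matrix has more columns than its rank.

```java
class DummyTrapRankCheck {

    /** Numerical rank via Gaussian elimination with partial pivoting. */
    static int rank(double[][] matrix, double tol) {
        int rows = matrix.length;
        int cols = matrix[0].length;
        // Work on a copy so the caller's matrix is left untouched.
        double[][] a = new double[rows][];
        for (int i = 0; i < rows; i++) {
            a[i] = matrix[i].clone();
        }
        int rank = 0;
        for (int col = 0; col < cols && rank < rows; col++) {
            // Pick the largest remaining entry in this column as the pivot.
            int pivot = rank;
            for (int r = rank + 1; r < rows; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) {
                    pivot = r;
                }
            }
            if (Math.abs(a[pivot][col]) <= tol) {
                continue; // column depends on the earlier ones
            }
            double[] tmp = a[pivot];
            a[pivot] = a[rank];
            a[rank] = tmp;
            // Eliminate this column from all rows below the pivot row.
            for (int r = rank + 1; r < rows; r++) {
                double factor = a[r][col] / a[rank][col];
                for (int c = col; c < cols; c++) {
                    a[r][c] -= factor * a[rank][c];
                }
            }
            rank++;
        }
        return rank;
    }

    /**
     * Builds an intercept column followed by one dummy column per category.
     * With dropFirst set, category 0 becomes the baseline and gets no column.
     */
    static double[][] design(int[] categories, int numCategories, boolean dropFirst) {
        int offset = dropFirst ? 1 : 0;
        double[][] x = new double[categories.length][1 + numCategories - offset];
        for (int i = 0; i < categories.length; i++) {
            x[i][0] = 1.0; // intercept
            int j = categories[i] - offset;
            if (j >= 0) {
                x[i][1 + j] = 1.0;
            }
        }
        return x;
    }

    public static void main(String[] args) {
        int[] categories = {0, 1, 2, 0, 1, 2}; // made-up example data
        double[][] full = design(categories, 3, false);
        double[][] reduced = design(categories, 3, true);
        // prints "all dummies: 4 columns, rank 3"
        System.out.println("all dummies: " + full[0].length
                + " columns, rank " + rank(full, 1e-10));
        // prints "one dropped: 3 columns, rank 3"
        System.out.println("one dropped: " + reduced[0].length
                + " columns, rank " + rank(reduced, 1e-10));
    }
}
```

When the rank is smaller than the number of columns, the least-squares
coefficients are not unique, which is consistent with the output changing
when the input is reordered. Dropping one dummy column restores full column
rank in this example.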

It's worth noting that I compared the results against some Python libraries
(sklearn/statsmodels), and these gave the correct results for things like the
intercept and regular variables even with the dependent columns.



[jira] [Commented] (MATH-1428) OLSMultipleLinearRegression estimates different residuals with different order of input

2017-08-11 Thread Gilles (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16123215#comment-16123215
 ] 

Gilles commented on MATH-1428:
--

What result did you expect?
What do other libraries produce?

Also, please provide a _minimal_ working code example (preferably a JUnit test).

