[R] Regression Error: Otherwise good variable causes singularity. Why?

2010-08-12 Thread asdir

This command


cdmoutcome <- glm(log(value) ~ factor(year)
   + log(gdppcpppconst) + log(gdppcpppconstAII)
   + log(co2eemisspc) + log(co2eemisspcAII)
   + log(dist)
   + fdiboth
   + odapartnertohost
   + corrupt
   + log(infraindex)
   + litrate
   + africa
   + imr
  , data = cdmdata2, subset = zero == 1,
  family = gaussian(link = "identity"))

results in this table


Coefficients: (1 not defined because of singularities)
                        Estimate Std. Error t value Pr(>|t|)
(Intercept)            1.216e+01  5.771e+01   0.211   0.8332
factor(year)2006      -1.403e+00  5.777e-01  -2.429   0.0157 *
factor(year)2007      -2.799e-01  7.901e-01  -0.354   0.7234
log(gdppcpppconst)     2.762e-01  5.517e+00   0.050   0.9601
log(gdppcpppconstAII) -1.344e-01  9.025e-01  -0.149   0.8817
log(co2eemisspc)       5.655e+00  2.903e+00   1.948   0.0523 .
log(co2eemisspcAII)   -1.411e-01  4.245e-01  -0.332   0.7399
log(dist)             -2.938e-01  4.023e-01  -0.730   0.4658
fdiboth                1.326e-04  1.133e-04   1.171   0.2425
odapartnertohost       2.319e-03  1.437e-03   1.613   0.1078
corrupt                1.875e+00  3.313e+00   0.566   0.5718
log(infraindex)        4.783e+00  1.091e+01   0.438   0.6615
litrate0.47           -2.485e+01  3.190e+01  -0.779   0.4365
litrate0.499          -1.657e+01  2.591e+01  -0.639   0.5230
litrate0.523          -2.440e+01  3.427e+01  -0.712   0.4769
litrate0.528          -9.184e+00  1.379e+01  -0.666   0.5060
litrate0.595          -2.309e+01  2.776e+01  -0.832   0.4062
litrate0.66           -1.451e+01  2.734e+01  -0.531   0.5961
litrate0.675          -1.707e+01  2.813e+01  -0.607   0.5444
litrate0.68           -6.346e+00  1.063e+01  -0.597   0.5509
litrate0.699           2.717e+00  3.541e+00   0.768   0.4434
litrate0.706          -1.960e+01  2.933e+01  -0.668   0.5046
litrate0.714          -2.586e+01  4.002e+01  -0.646   0.5186
litrate0.736           5.641e+00  1.561e+01   0.361   0.7181
litrate0.743          -2.692e+01  4.253e+01  -0.633   0.5273
litrate0.762          -2.208e+01  3.100e+01  -0.712   0.4767
litrate0.802          -2.325e+01  3.766e+01  -0.617   0.5375
litrate0.847          -2.620e+01  3.948e+01  -0.664   0.5075
litrate0.86           -3.576e+01  4.950e+01  -0.722   0.4707
litrate0.864          -4.482e+01  6.274e+01  -0.714   0.4755
litrate0.872          -1.946e+01  2.715e+01  -0.717   0.4739
litrate0.877          -2.710e+01  3.702e+01  -0.732   0.4646
litrate0.879          -3.460e+01  5.147e+01  -0.672   0.5020
litrate0.886          -3.276e+01  4.860e+01  -0.674   0.5008
litrate0.889          -4.120e+01  5.755e+01  -0.716   0.4746
litrate0.904          -2.282e+01  2.985e+01  -0.764   0.4453
litrate0.91           -3.478e+01  5.037e+01  -0.691   0.4904
litrate0.923          -1.762e+01  2.551e+01  -0.691   0.4902
litrate0.925          -2.445e+01  3.611e+01  -0.677   0.4990
litrate0.926          -2.995e+01  4.565e+01  -0.656   0.5123
litrate0.928          -2.839e+01  3.933e+01  -0.722   0.4710
litrate0.937          -2.571e+01  3.795e+01  -0.677   0.4986
litrate0.94           -2.109e+01  3.051e+01  -0.691   0.4900
litrate0.959          -2.078e+01  2.895e+01  -0.718   0.4735
litrate0.96           -3.403e+01  4.798e+01  -0.709   0.4787
litrate0.962          -4.084e+01  5.755e+01  -0.710   0.4785
litrate0.971          -3.743e+01  5.247e+01  -0.713   0.4761
litrate0.98           -3.709e+01  5.170e+01  -0.717   0.4737
litrate0.986          -2.663e+01  4.437e+01  -0.600   0.5488
litrate0.991          -3.045e+01  4.166e+01  -0.731   0.4654
litrate1              -2.732e+01  4.459e+01  -0.613   0.5405
africa                        NA         NA      NA       NA
imr                    2.160e+00  9.357e-01   2.309   0.0216 *

although it should result in something similar to this:


Coefficients: (1 not defined because of singularities)
                        Estimate Std. Error t value Pr(>|t|)
(Intercept)            1.216e+01  5.771e+01   0.211   0.8332
factor(year)2006      -1.403e+00  5.777e-01  -2.429   0.0157 *
factor(year)2007      -2.799e-01  7.901e-01  -0.354   0.7234
log(gdppcpppconst)     2.762e-01  5.517e+00   0.050   0.9601
log(gdppcpppconstAII) -1.344e-01  9.025e-01  -0.149   0.8817
log(co2eemisspc)       5.655e+00  2.903e+00   1.948   0.0523 .
log(co2eemisspcAII)   -1.411e-01  4.245e-01  -0.332   0.7399
log(dist)             -2.938e-01  4.023e-01  -0.730   0.4658
fdiboth                1.326e-04  1.133e-04   1.171   0.2425
odapartnertohost       2.319e-03  1.437e-03   1.613   0.1078
corrupt                1.875e+00  3.313e+00   0.566   0.5718
log(infraindex)        4.783e+00  1.091e+01   0.438   0.6615
litrate               -2.485e+01  3.190e+01  -0.779   0.4365
 

Re: [R] Regression Error: Otherwise good variable causes singularity. Why?

2010-08-12 Thread JLucke
There appears to be a problem in both regressions, since a singularity is
also reported in the second regression analysis.  It appears that the
litrate variable is treated as a factor in the first analysis and as
continuous in the second.  There also appears to be collinearity between
the litrate variable and the africa variable.  Look at the function
lm.influence for regression diagnostics.
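
Both points can be reproduced on made-up data. The following is only a sketch: the data frame d and its columns y, rate, region and rate_num are hypothetical stand-ins (rate for litrate, region for africa), not the poster's cdmdata2. It shows a factor being expanded into one coefficient per level, a second variable that is constant within each level being dropped as aliased, and the same model behaving normally once the variable is numeric.

```r
# Hypothetical toy data, not the original cdmdata2.
set.seed(1)
d <- data.frame(
  y      = rnorm(12),
  rate   = factor(rep(c("0.47", "0.66", "0.91", "1"), each = 3)),
  region = rep(c(1, 1, 0, 0), each = 3)   # constant within each rate level
)

# As a factor, rate becomes one dummy per non-reference level, and region
# is a linear combination of those dummies, so its coefficient is NA.
fit_factor <- lm(y ~ rate + region, data = d)
coef(fit_factor)
alias(fit_factor)        # reports which term is linearly dependent

# As a numeric variable, rate gets a single slope and region is estimable.
d$rate_num <- as.numeric(as.character(d$rate))
fit_numeric <- lm(y ~ rate_num + region, data = d)
coef(fit_numeric)

# lm.influence() is a function in the stats package; $hat holds the leverages.
hat_values <- lm.influence(fit_numeric)$hat
```

Running class() or str() on the suspect column before fitting is usually the quickest way to tell which of the two cases applies.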






Re: [R] Regression Error: Otherwise good variable causes singularity. Why?

2010-08-12 Thread David Winsemius


On Aug 12, 2010, at 10:35 AM, asdir wrote:



> This command [...] results in this table [...]


You have probably created litrate as a factor without realizing it.
That can easily happen if you just use read.table and one of the
values cannot be gracefully interpreted as numeric. Either read the
data in with stringsAsFactors=FALSE or as.is=TRUE and then coerce the
column to numeric, or, if you want to fix an existing factor f%^-up,
the FAQ tells you to use something like:

cdmdata2$f_ed_variable <-
    as.numeric(as.character(cdmdata2$f_ed_variable))
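
A minimal sketch of that failure mode and the FAQ-style fix, on invented input: the text in txt, the stray "n/a" entry and the object names bad, fixed and good are all assumptions for illustration. Since R 4.0.0 read.table defaults to stringsAsFactors = FALSE, so the sketch sets it to TRUE explicitly to reproduce the behaviour described above.

```r
# Hypothetical input: one entry ("n/a") that cannot be read as a number.
txt <- "country litrate
A 0.47
B 0.66
C n/a
D 0.91"

# The stray entry makes the whole column non-numeric; with
# stringsAsFactors = TRUE it is stored as a factor.
bad <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)
class(bad$litrate)

# Pitfall: as.numeric() on a factor returns the internal level codes.
as.numeric(bad$litrate)

# FAQ-style fix: go through character first; "n/a" becomes NA
# (the coercion warning is expected and suppressed here).
fixed <- suppressWarnings(as.numeric(as.character(bad$litrate)))

# Alternative: declare the missing-value marker when reading.
good <- read.table(text = txt, header = TRUE, na.strings = c("NA", "n/a"))
class(good$litrate)
```

Declaring the marker at read time avoids the factor altogether, which is usually preferable when the offending value is known in advance.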





Re: [R] Regression Error: Otherwise good variable causes singularity. Why?

2010-08-12 Thread asdir

@JLucke:
As for the africa variable: I took it out of the model, so that we can
rule out both the variable itself and collinearity between africa and
litrate as causes of the litrate problem.  This also removed the
singularity remark at the top. However, the problem of litrate being
split into many factor levels remains.

Just to clarify: The second results table is fictional; it only shows
where I was headed with my regression.

Anyway, thanks for the quick answer.

@David:
Thanks for the pointer. It was in fact a bad variable, but one I created
myself. I changed the set halfway through my calculations and thought I
had adjusted everything. It turns out that I forgot to adjust the
set length, which is re-set between the two steps of my Heckman procedure.
In any case: Thanks for the quick and helpful reply. :-)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.