Thomas Lumley wrote:
> When you fit logistic regression models to fairly sparse data you can
> often have a situation where for some combination of variables the
> response variable is either all 0 or all 1.  In that case the maximum
> likelihood estimates for at least some of the coefficients will be
> infinite.  That's what R is telling you.
> You should be able to tell which coefficients are infinite -- the
> coefficients and their standard errors will be large.
> When this happens the standard errors and the p-values reported by
> summary.glm() for those variables are useless.
>       -thomas

Hi Thomas,

I am trying to understand this problem. It is very common in ecological data
modelled with binomial or Poisson errors.

> summary(abundmod)
       x1            y                x2               n         
 Sp1    : 12   Min.   : 0.000   Min.   : 3.210   Min.   : 13.00  
 Sp10   : 12   1st Qu.: 0.000   1st Qu.: 6.572   1st Qu.: 29.25  
 Sp11   : 12   Median : 1.000   Median : 8.845   Median : 44.50  
 Sp12   : 12   Mean   : 4.011   Mean   :19.417   Mean   : 92.25  
 Sp13   : 12   3rd Qu.: 3.000   3rd Qu.:28.988   3rd Qu.:119.25  
 Sp14   : 12   Max.   :92.000   Max.   :60.530   Max.   :338.00  

> m <- glm(y/n~x1*x2,family=binomial,weights=n,maxit=1000)
Warning message: 
fitted probabilities numerically 0 or 1 occurred in: (if (is.empty.model(mt)) 
glm.fit.null else glm.fit)(x = X, y = Y,  

I can tell which levels have infinite coefficients:

x1Sp22      18.9024041 44.4228068   0.426 0.670464    
x1Sp5       22.0655076 42.1371974   0.524 0.600516    
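
A minimal sketch of how such rows could be picked out of the summary table
automatically instead of by eye. The toy data frame `d`, the model `m` and the
cut-off of 10 are assumptions made up for illustration; they are not the
dataset or the model from this post.

```r
# Hypothetical toy data (not the data from this post):
# group "b" has no successes at all, so its coefficient has no finite MLE.
d <- data.frame(
  g = factor(rep(c("a", "b"), each = 4)),
  y = c(3, 5, 4, 6, 0, 0, 0, 0),
  n = rep(10, 8)
)

# Same kind of binomial fit, written with a two-column response.
m <- suppressWarnings(
  glm(cbind(y, n - y) ~ g, family = binomial, data = d)
)

# Coefficient table as printed by summary(m).
co <- coef(summary(m))

# Flag rows where both the estimate and its standard error are large.
# The cut-off 10 is an arbitrary assumption, not a standard value.
big <- abs(co[, "Estimate"]) > 10 & co[, "Std. Error"] > 10
co[big, , drop = FALSE]
```

On a logit scale an estimate beyond about 10 in absolute value already
corresponds to a fitted probability very close to 0 or 1, which is why a crude
cut-off like this can serve as a first screen.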

I looked at the dataset to understand why these two levels are "wrong".

Each of them appears at only one value of x2.

Some other levels also appear at only one value of x2. Look:

> tapply(y,list(x1,x2),c)
     3.21 4.05 5.56 6.91 7.97 8.56 9.13 16.13 25.58 39.21 46.16 60.53
Sp1     2    0    9    6    6    4    8    21    20    17    60    18
Sp10    4    1    3    1    0    2    2    19    13     7    12     5
Sp11    2    0    0    1    4    0    1     4     8     5    19     6
Sp12    0    1    1    1    5    0    0     1    13     3     7     6
Sp13    0    0    0    1    0    0    0     0     0     1     1     0
Sp14    0    0    0    2    1    0    1     1    13     0     3     3
Sp15    0    0    1    0    0    0    0     0     1     0     0     0
Sp16    2    4    3    6    8    2    8     0    14     4    21     5
Sp17    0    0    0    0    0    0    0     0     0     0     5     0
Sp18    5    4   12    1    9    3    5    36    40    27    92    52
Sp19    0    0    1    1    0    0    2     0     2     0     0     0
Sp2     2    0    1    0    0    0    0     3     1     0     2     1
Sp20    2    2    2    2    5    4    3     2    13    37    77    29
Sp21    0    0    0    0    0    0    1     0     0     0     0     0
Sp22    1    0    0    0    0    0    0     0     0     0     0     0
Sp23    0    0    0    1    0    0    0     0     0     0     0     0
Sp3     2    0    5    6    3    2    6    16    22     7    21     9
Sp4     0    1    0    1    0    0    0     1     3     0     6     1
Sp5     2    0    0    0    0    0    0     0     0     0     0     0
Sp6     0    0    0    0    3    1    2     6    12     0     7     0
Sp7     0    0    0    0    0    0    0     0     6     0     1     0
Sp8     0    0    0    1    0    0    0     1     3     2     3     2
Sp9     0    0    0    0    4    0    2     2     8     3     1     1

Sp17, Sp21 and Sp23 also appear at only one value of x2. Why is the problem only
with Sp22 and Sp5?
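
One way to compare these five species is to list the x2 value at which each
has its single non-zero count, and to check whether that value lies at an end
of the x2 range. The sketch below uses only the counts copied from the table
above; the object names (`counts`, `where`, `at_edge`) are made up for
illustration. Whether this difference is what drives the warning is an
assumption and not something established in this thread, but it seems
plausible: with x1*x2 and a numeric x2, each species gets its own slope in x2,
and a species whose only successes are at the lowest x2 can be fitted by an
ever steeper slope.

```r
# x2 values, in the column order of the tapply() table above.
x2 <- c(3.21, 4.05, 5.56, 6.91, 7.97, 8.56, 9.13,
        16.13, 25.58, 39.21, 46.16, 60.53)

# Counts copied from the table for the five species with a single
# non-zero cell.
counts <- rbind(
  Sp17 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 0),
  Sp21 = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0),
  Sp22 = c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  Sp23 = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0),
  Sp5  = c(2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
)
colnames(counts) <- x2

# x2 value at which each species has its only non-zero count.
where <- apply(counts, 1, function(r) x2[r > 0])

# TRUE when that value is the smallest or the largest x2 in the design.
at_edge <- where %in% range(x2)

data.frame(x2 = where, at_edge = at_edge)
```

In this table Sp22 and Sp5 are the only two of the five whose single non-zero
count is at the lowest x2 value (3.21); Sp17, Sp21 and Sp23 have theirs at
interior values, with zero counts on both sides.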

Is this a problem with my dataset? Do I need to remove these levels? What is the
correct way to handle this? How can I resolve it?


