[R] glm with variance = mu+theta*mu^2?

2005-06-12 Thread Mwalili, S. M.
You can fit negative binomial using the 'zicounts' package
library(zicounts)
 
 data(teeth)
 names(teeth)

   ## c) fit negative binomial regression model
nb.zc <- zicounts(resp = dmft~.,x =~gender + age,data=teeth, distr = "NB")
 nb.zc

Even,
 
library(zicounts)
 library(Fahrmeir) # use cells data
data(cells)
nb.cells <- zicounts(parm=c(2,0,0,0,1),resp = y~.,x 
=~TNF+IFN+TNF:IFN,data=cells, distr = "NB")
 nb.cells
 
 
Samuel.


Kjetil Brinchmann Halvorsen <[EMAIL PROTECTED]> wrote:
Spencer Graves wrote:

> How might you fit a generalized linear model (glm) with variance 
> = mu+theta*mu^2 (where mu = mean of the exponential family random 
> variable and theta is a parameter to be estimated)?
>
> This appears in Table 2.7 of Fahrmeir and Tutz (2001) 
> Multivariate Statisticial Modeling Based on Generalized Linear Models, 
> 2nd ed. (Springer, p. 60), where they compare "log-linear model fits 
> to cellular differentiation data based on quasi-likelihoods" between 
> variance = phi*mu (quasi-Poisson), variance = phi*mu^2 
> (quasi-exponential), and variance = mu+theta*mu^2. The "quasi" 
> function accepted for the family argument in "glm" generates functions 
> "variance", "validmu", and "dev.resids". I can probably write 
> functions to mimic the "quasi" function. However, I have two 
> questions in regard to this:
>
> (1) I don't know what to use for "dev.resids". This may not 
> matter for fitting. I can try a couple of different things to see if 
> it matters.
>
> (2) Might someone else suggest something different, e.g., using 
> something like optim to solve an appropriate quasi-score function?
>
> Thanks,
> spencer graves
>
Since nobody has answerd this I will try. The variance function 
mu+theta*mu^2 is the variance function
of the negative binomial family. If this variance function is used to 
construct a quasi-likelihood, the resulting quasi-
likelihood is identical to the negative binomial likelihood, so for 
fitting we can simly use glm.nb from MASS, which
will give the correct estimated values. However, in a quasi-likelihood 
setting the (co)varince estimation from
glm.nb is not appropriate, and from the book (fahrmeir ..) it seems that 
the estimation method used is a
sandwich estimator, so we can try the sandwich package. This works but 
the numerical results are somewhat different from the book. Any 
comments on this?

my code follows:

> library(Fahrmeir)
> library(help=Fahrmeir)
> library(MASS)
> cells.negbin <- glm(y~TNF+IFN+TNF:IFN, data=cells,
family=negative.binomial(1/0.215))
> summary(cells.negbin)

Call:
glm(formula = y ~ TNF + IFN + TNF:IFN, family = negative.binomial(1/0.215),
data = cells)

Deviance Residuals:
Min 1Q Median 3Q Max 
-1.6714 -0.8301 -0.2153 0.4802 1.4282 

Coefficients:
Estimate Std. Error t value Pr(>|t|) 
(Intercept) 3.39874495 0.18791125 18.087 4.5e-10 ***
TNF 0.01616136 0.00360569 4.482 0.00075 ***
IFN 0.00935690 0.00359010 2.606 0.02296 * 
TNF:IFN -0.5910 0.7002 -0.844 0.41515 
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Negative Binomial(4.6512) family taken to be 
1.012271)

Null deviance: 46.156 on 15 degrees of freedom
Residual deviance: 12.661 on 12 degrees of freedom
AIC: 155.49

Number of Fisher Scoring iterations: 5

> confint(cells.negbin)
Waiting for profiling to be done...
2.5 % 97.5 %
(Intercept) 3.0383197319 3.7890206510
TNF 0.0091335087 0.0238915483
IFN 0.0023292566 0.0170195707
TNF:IFN -0.0001996824 0.960427
> library(sandwich)
Loading required package: zoo
> vcovHC( cells.negbin )
(Intercept) TNF IFN 
TNF:IFN
(Intercept) 0.01176249372 -0.0001279740135 -0.0001488223001 
0.0212541999
TNF -0.00012797401 0.039017282 0.021242875 
-0.0019793137
IFN -0.00014882230 0.021242875 0.054314079 
-0.0013277626
TNF:IFN 0.0212542 -0.001979314 -0.001327763 
0.0002370104
> cov2cor(vcovHC( cells.negbin ))
(Intercept) TNF IFN TNF:IFN
(Intercept) 1.000 -0.5973702 -0.5887923 0.1272950
TNF -0.5973702 1.000 0.4614542 -0.6508822
IFN -0.5887923 0.4614542 1.000 -0.3700671
TNF:IFN 0.1272950 -0.6508822 -0.3700671 1.000
> cells.negbin2 <- glm.nb( y~TNF+IFN+TNF:IFN, data=cells)
> summary(cells.negbin)

Call:
glm(formula = y ~ TNF + IFN + TNF:IFN, family = negative.binomial(1/0.215),
data = cells)

Deviance Residuals:
Min 1Q Median 3Q Max 
-1.6714 -0.8301 -0.2153 0.4802 1.4282 

Coefficients:
Estimate Std. Error t value Pr(>|t|) 
(Intercept) 3.39874495 0.18791125 18.087 4.5e-10 ***
TNF 0.01616136 0.00360569 4.482 0.00075 ***
IFN 0.00935690 0.00359010 2.606 0.02296 * 
TNF:IFN -0.5910 0.7002 -0.844 0.41515 
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Negative Binomial(4.6512) family taken to be 
1.012271)

Null deviance: 46.156 on 15 degrees of freedom
Residual deviance: 12.661 on 12 degrees of freedom
AIC: 155.49

Number of Fisher Scoring iterations: 5

> confint( cells.negbin2 )
Waiting for prof

Re: [R] glm with variance = mu+theta*mu^2?

2005-06-11 Thread Kjetil Brinchmann Halvorsen
Spencer Graves wrote:

>   How might you fit a generalized linear model (glm) with variance 
> = mu+theta*mu^2 (where mu = mean of the exponential family random 
> variable and theta is a parameter to be estimated)?
>
>   This appears in Table 2.7 of Fahrmeir and Tutz (2001) 
> Multivariate Statisticial Modeling Based on Generalized Linear Models, 
> 2nd ed. (Springer, p. 60), where they compare "log-linear model fits 
> to cellular differentiation data based on quasi-likelihoods" between 
> variance = phi*mu (quasi-Poisson), variance = phi*mu^2 
> (quasi-exponential), and variance = mu+theta*mu^2.  The "quasi" 
> function accepted for the family argument in "glm" generates functions 
> "variance", "validmu", and "dev.resids".  I can probably write 
> functions  to mimic the "quasi" function.  However, I have two 
> questions in regard to this:
>
>   (1) I don't know what to use for "dev.resids".  This may not 
> matter for fitting.  I can try a couple of different things to see if 
> it matters.
>
>   (2) Might someone else suggest something different, e.g., using 
> something like optim to solve an appropriate quasi-score function?
>
>   Thanks,
>   spencer graves
>
Since nobody has answerd this I will try. The variance function 
mu+theta*mu^2 is the variance function
of the negative binomial family. If this variance function is used to 
construct a quasi-likelihood, the resulting quasi-
likelihood is identical to the negative binomial likelihood, so for 
fitting we can simly use glm.nb from MASS, which
will give the correct estimated values. However, in a quasi-likelihood 
setting the (co)varince estimation from
glm.nb is not appropriate, and from the book (fahrmeir ..) it seems that 
the estimation method used is a
sandwich estimator, so we can try the sandwich package.  This works but 
the numerical results are somewhat different from  the book. Any 
comments on this?

my code follows:

 > library(Fahrmeir)
 > library(help=Fahrmeir)
 > library(MASS)
 >  cells.negbin <- glm(y~TNF+IFN+TNF:IFN, data=cells,
 family=negative.binomial(1/0.215))
 > summary(cells.negbin)

Call:
glm(formula = y ~ TNF + IFN + TNF:IFN, family = negative.binomial(1/0.215),
data = cells)

Deviance Residuals:
Min   1Q   Median   3Q  Max 
-1.6714  -0.8301  -0.2153   0.4802   1.4282 

Coefficients:
   Estimate  Std. Error t value Pr(>|t|)   
(Intercept)  3.39874495  0.18791125  18.087  4.5e-10 ***
TNF  0.01616136  0.00360569   4.482  0.00075 ***
IFN  0.00935690  0.00359010   2.606  0.02296 * 
TNF:IFN -0.5910  0.7002  -0.844  0.41515   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Negative Binomial(4.6512) family taken to be 
1.012271)

Null deviance: 46.156  on 15  degrees of freedom
Residual deviance: 12.661  on 12  degrees of freedom
AIC: 155.49

Number of Fisher Scoring iterations: 5

 > confint(cells.negbin)
Waiting for profiling to be done...
2.5 %   97.5 %
(Intercept)  3.0383197319 3.7890206510
TNF  0.0091335087 0.0238915483
IFN  0.0023292566 0.0170195707
TNF:IFN -0.0001996824 0.960427
 > library(sandwich)
Loading required package: zoo
 > vcovHC( cells.negbin )
   (Intercept)  TNF  IFN   
TNF:IFN
(Intercept)  0.01176249372 -0.0001279740135 -0.0001488223001  
0.0212541999
TNF -0.00012797401  0.039017282  0.021242875 
-0.0019793137
IFN -0.00014882230  0.021242875  0.054314079 
-0.0013277626
TNF:IFN  0.0212542 -0.001979314 -0.001327763  
0.0002370104
 > cov2cor(vcovHC( cells.negbin ))
(Intercept)TNFIFNTNF:IFN
(Intercept)   1.000 -0.5973702 -0.5887923  0.1272950
TNF  -0.5973702  1.000  0.4614542 -0.6508822
IFN  -0.5887923  0.4614542  1.000 -0.3700671
TNF:IFN   0.1272950 -0.6508822 -0.3700671  1.000
 > cells.negbin2 <- glm.nb( y~TNF+IFN+TNF:IFN, data=cells)
 > summary(cells.negbin)

Call:
glm(formula = y ~ TNF + IFN + TNF:IFN, family = negative.binomial(1/0.215),
data = cells)

Deviance Residuals:
Min   1Q   Median   3Q  Max 
-1.6714  -0.8301  -0.2153   0.4802   1.4282 

Coefficients:
   Estimate  Std. Error t value Pr(>|t|)   
(Intercept)  3.39874495  0.18791125  18.087  4.5e-10 ***
TNF  0.01616136  0.00360569   4.482  0.00075 ***
IFN  0.00935690  0.00359010   2.606  0.02296 * 
TNF:IFN -0.5910  0.7002  -0.844  0.41515   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Negative Binomial(4.6512) family taken to be 
1.012271)

Null deviance: 46.156  on 15  degrees of freedom
Residual deviance: 12.661  on 12  degrees of freedom
AIC: 155.49

Number of Fisher Scoring iterations: 5

 > confint( cells.negbin2 )
Waiting for profiling to be done...
   

[R] glm with variance = mu+theta*mu^2?

2005-06-02 Thread Spencer Graves
	  How might you fit a generalized linear model (glm) with variance = 
mu+theta*mu^2 (where mu = mean of the exponential family random variable 
and theta is a parameter to be estimated)?


	  This appears in Table 2.7 of Fahrmeir and Tutz (2001) Multivariate 
Statisticial Modeling Based on Generalized Linear Models, 2nd ed. 
(Springer, p. 60), where they compare "log-linear model fits to cellular 
differentiation data based on quasi-likelihoods" between variance = 
phi*mu (quasi-Poisson), variance = phi*mu^2 (quasi-exponential), and 
variance = mu+theta*mu^2.  The "quasi" function accepted for the family 
argument in "glm" generates functions "variance", "validmu", and 
"dev.resids".  I can probably write functions  to mimic the "quasi" 
function.  However, I have two questions in regard to this:


	  (1) I don't know what to use for "dev.resids".  This may not matter 
for fitting.  I can try a couple of different things to see if it matters.


	  (2) Might someone else suggest something different, e.g., using 
something like optim to solve an appropriate quasi-score function?


  Thanks,
  spencer graves

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html