On Tue, 29 Jul 2008, Ben Bolker wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Prof Brian Ripley wrote:
| On Tue, 29 Jul 2008, Ben Bolker wrote:
|
|> jcarmichael <jcarmichael314 <at> gmail.com> writes:
|>>
|>> Hello.
|>>
|>> I am attempting to duplicate a negative binomial regression in R.
|>> SAS uses
|>> generalized estimating equations for model fitting in the GENMOD
|>> procedure.
|>>
|>> proc genmod data=mydata (where=(gender='F'));
|>> by agegroup;
|>> class id gender type;
|>> model count = var1 var2 var3 /dist=NB link=log offset=lregtm;
|>> repeated subject=id /type=exch;
|>> run;
|>>
|>> Since my dataset has several observations for each subject, I need the
|>> REPEATED statement in order to indicate dependence among observations
|>> with
|>> the same subject ID and independence amongst those with distinct subject
|>> IDs. The TYPE statement goes on to specify the structure of the
|>> correlation
|>> matrix to be used (exchangeable in this case).
|>
|> I would try glmmPQL in the MASS package. I don't think you
|> can *quite* get negative binomial regression this way, but
|> you can definitely get a quasipoisson model. I think exchangeable
|> correlation corresponds to correlation=corCompSymm() in your
|> glmmPQL command.
|
| The problem here is that GLMM and GEE are not fitting the same model --
| in one the coefficients are subject-specific and in the other
| population-average (see MASS4 or Diggle, Liang, Zeger +/- Heagarty).
|
| There are several R packages for GEE, including gee, yags, geepack. The
| documentation of geeglm (geepack) claims it can be used with families as
| in glm(), so you could try it with MASS's negative.binomial family.
|
~ Point taken (although I guess I was pointing the original poster
to a way to do a reasonable analysis, not necessarily to duplicate
the SAS analysis as requested). Will the negative.binomial family
really work for this, since it seems to require a fixed theta
(overdispersion) parameter?
I was answering 'I am trying to duplicate': I don't know how SAS estimates
the parameter by GEE. The best guess I have is that theta is estimated
from the initial glm fit and fixed in the GEE phase, but that is only
interpolation from very vague descriptions.
~ If I very naively do the following:
library(geepack)
data(dietox)
mf2 <- formula(Weight~Cu*Time+I(Time^2)+I(Time^3))
gee2 <- geeglm(mf2, data=dietox, id=Pig,
~ family=poisson("identity"),corstr="ar1")
library(MASS)
gee2 <-
geeglm(mf2,data=dietox,id=Pig,family=negative.binomial(theta=100),corstr="ar1")
~ gives an error "variance invalid" --
~ so the whole thing would seem to take a bit of troubleshooting
I wasn't placing much faith on geeglm actually being as general as it says
it is ('claims' ... 'could try').
~ (geeglm also gives warnings about non-integer Poisson values -- I
don't know why a Poisson link is being used in this example for
a non-integer Weight value ... ?)
'poisson' _family_, I presume?
--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.