I have been able to isolate the problem, though I still do not understand why 
this is occurring.  Consider the following code:

#####
library(foreign)
library(geepack)   # provides geeglm() and ordgee()
muscatine <- 
read.dta('http://www.hsph.harvard.edu/fitzmaur/ala2e/muscatine.dta')
  muscatine$gender <- as.factor(muscatine$gender)
  muscatine$y.fac  <- as.factor(muscatine$y)     # Make the response a factor
  muscatine$cage   <- muscatine$age - 12
  muscatine$cage2  <- muscatine$cage^2
muscatine2 <- na.omit(muscatine)                 # Remove missing data

> str(muscatine2)
'data.frame':   9856 obs. of  9 variables:
 $ id      : num  1 1 1 2 2 2 3 3 3 4 ...
 $ gender  : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
 $ baseage : num  6 6 6 6 6 6 6 6 6 6 ...
 $ age     : num  6 8 10 6 8 10 6 8 10 6 ...
 $ occasion: num  1 2 3 1 2 3 1 2 3 1 ...
 $ y       : num  1 1 1 1 1 1 1 1 1 1 ...
 $ y.fac   : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
 $ cage    : num  -6 -4 -2 -6 -4 -2 -6 -4 -2 -6 ...
 $ cage2   : num  36 16 4 36 16 4 36 16 4 36 ...


# This model works and is fairly close to Fitzmaurice book results
f1 <- geeglm(y ~ gender*cage + gender*cage2, id=id, data=muscatine2, 
              family=binomial(link=logit), 
              waves=muscatine2$occasion, corstr='unstructured')

# This model does not work, only difference is response is a factor
f2 <- geeglm(y.fac ~ gender*cage + gender*cage2, id=id, data=muscatine2, 
             family=binomial(link=logit), 
             waves=muscatine2$occasion, corstr='unstructured')

> ...
Error in lm.fit(zsca, qlf(pr2), offset = soffset) : NA/NaN/Inf in 'y'
In addition: Warning messages:
1: In model.response(mf, "numeric") :
  using type = "numeric" with a factor response will be ignored
2: In Ops.factor(y, mu) : - not meaningful for factors

# These models do not fix problem and give a new error message
f3 <- geeglm(as.numeric(y.fac) ~ gender*cage + gender*cage2, id=id, 
             data=muscatine2, 
             family=binomial(link=logit), 
             waves=muscatine2$occasion, corstr='unstructured')

f4 <- geeglm(as.integer(y.fac) ~ gender*cage + gender*cage2, id=id, 
             data=muscatine2, 
             family=binomial(link=logit), 
             waves=muscatine2$occasion, corstr='unstructured')

> ...
Error in eval(expr, envir, enclos) : y values must be 0 <= y <= 1

# This model works and is really really close to Fitzmaurice book
f5 <- ordgee(ordered(y.fac) ~ gender*cage + gender*cage2, id=id, 
             data=muscatine2, 
             mean.link='logit', 
             waves=muscatine2$occasion, corstr='unstructured')

###
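
A small base-R sketch, independent of geepack, may help explain the messages
reported for the factor-response model. It assumes only that the fitting code
at some point subtracts fitted values from the response, which is what the
"Ops.factor(y, mu)" warning suggests. The vectors y_fac, y_num and mu are
made-up illustrations, not values from the muscatine data.

```r
# Hypothetical toy data: a 0/1 response stored as a factor, plus fitted values
y_fac <- factor(c(0, 1, 1, 0))
mu    <- c(0.2, 0.7, 0.6, 0.1)

# Arithmetic is not defined for factors: R issues the
# "not meaningful for factors" warning and returns NA for every element
resid_fac <- suppressWarnings(y_fac - mu)
print(resid_fac)

# The same operation on a numeric 0/1 response gives ordinary residuals
y_num     <- c(0, 1, 1, 0)
resid_num <- y_num - mu
print(resid_num)
```

If something similar happens inside geeglm, the NA values passed on to lm.fit
would be consistent with the "NA/NaN/Inf in 'y'" error even though the data
frame itself contains no missing values.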


Bottom line:

Something goes wrong when the response variable "y" is converted into the 
factor "y.fac": geeglm then stops with an error and can occasionally even crash 
R (according to some respondents who were trying to help me).  Wrapping y.fac 
in as.numeric() or as.integer() does not fix this; it only produces a different 
error.  Interestingly, the ordgee function from the same geepack package 
handles the factor response variable without issue and appears to give results 
that best mimic the textbook example.
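
One possible reason the as.numeric() and as.integer() conversions fail is that
these functions return a factor's internal level codes rather than its labels.
The sketch below uses a made-up vector (y, y_fac, codes and y_back are
illustrative names only) and does not call geepack.

```r
# Hypothetical toy response, turned into a factor the same way y.fac was
y     <- c(0, 1, 1, 0, 1)
y_fac <- as.factor(y)

# as.numeric() on a factor returns the level codes, which start at 1
codes <- as.numeric(y_fac)
print(codes)    # 1 2 2 1 2

# Converting via the level labels recovers the original 0/1 values
y_back <- as.numeric(as.character(y_fac))
print(y_back)   # 0 1 1 0 1
```

Codes of 1 and 2 would fall outside the range required by the 
"y values must be 0 <= y <= 1" check.  Using as.numeric(as.character(y.fac)) 
as the response may therefore avoid that error, although this has not been 
verified against geeglm here.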

Thanks to all who helped.  I hope this summary helps debug the geeglm function 
and helps others.  I have cc'd the geepack maintainer, as some of you suggested.

Brant



On Mar 1, 2014, at 8:31 PM, Brant Inman <brant.in...@me.com> wrote:

> Duncan,
> 
> Thank you for your reply.  The example is in fact not ordinal (the response 
> variable Y is an indicator of the presence or absence of obesity).  I too saw 
> their code snippet online where they use an ordinal GEE, but the outcome 
> variable is binary, as can be seen from the data imported from the link I 
> provided.  I thought that, since Y is a dichotomous outcome, the model I 
> proposed would be appropriate, but somehow the geeglm function thinks there 
> is missing data and I don't see how that can be.
> 
> Any other ideas?
> 
> 
> Brant
> 
> On Mar 1, 2014, at 8:13 PM, Duncan Mackay <dulca...@bigpond.com> wrote:
> 
>> Hi Brant
>> 
>> I have not got Fitzmaurice et al., but from their web site it seems that you
>> are trying to do ordinal GEE.
>> 
>> With GEE models, particularly ordinal models, you MUST get your data structure
>> correct; otherwise the fit can fail or R can even crash.
>> 
>> try
>> 
>> f1 = 
>> ordgee(ordered(y) ~ factor(gender) + cage + cage2 +
>>      factor(gender):cage + factor(gender):cage2, id = id, data =
>> muscatine2,
>>      waves=muscatine2$occasion, mean.link="logit",
>> corstr=("unstructured"))
>> 
>>> summary(f1)
>> 
>> Call:
>> ordgee(formula = ordered(y) ~ factor(gender) + cage + cage2 + 
>>   factor(gender):cage + factor(gender):cage2, id = id, waves =
>> muscatine2$occasion, 
>>   data = muscatine2, mean.link = "logit", corstr = ("unstructured"))
>> 
>> Mean Model:
>> Mean Link:                 logit 
>> Variance to Mean Relation: binomial 
>> 
>> Coefficients:
>>                         estimate      san.se        wald            p
>> Inter:0               -1.214613103 0.050571150 576.8597850 0.000000e+00
>> factor(gender)1        0.115330450 0.071158497   2.6268450 1.050703e-01
>> cage                   0.037419375 0.013263832   7.9589357 4.785054e-03
>> cage2                 -0.017437692 0.003378786  26.6352422 2.457205e-07
>> factor(gender)1:cage   0.007510802 0.018268075   0.1690390 6.809673e-01
>> factor(gender)1:cage2  0.003860069 0.004632095   0.6944407 4.046580e-01
>> 
>> Scale is fixed.
>> 
>> Correlation Model:
>> Correlation Structure:     unstructured 
>> Correlation Link:          log 
>> 
>> Estimated Correlation Parameters:
>>       estimate    san.se     wald p
>> alpha.1 3.130702 0.1535950 415.4599 0
>> alpha.2 2.408103 0.1455606 273.6921 0
>> alpha.3 2.793549 0.1351264 427.3978 0
>> 
>> Returned Error Value:    0 
>> Number of clusters:   4856   Maximum cluster size: 3
>> 
>> I presume that you may have a dataset in mind to work on later
>> 
>> you may want to check out the repolr and multgee packages as well
>> 
>> Duncan
>> 
>> Duncan Mackay
>> Department of Agronomy and Soil Science
>> University of New England
>> Armidale NSW 2351
>> Email: home: mac...@northnet.com.au
>> 
>> 
>> 
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
>> Behalf Of Brant Inman
>> Sent: Sunday, 2 March 2014 03:52
>> To: r-help@r-project.org
>> Subject: [R] geeglm error NA/NaN/Inf in 'y'
>> 
>> R-helpers:
>> 
>> I am getting an error when trying to fit a GEE model.  Below is code
>> reproducing the error.
>> 
>> ###
>> library(foreign)
>> muscatine <-
>> read.dta('http://www.hsph.harvard.edu/fitzmaur/ala2e/muscatine.dta')
>> muscatine$gender <- as.factor(muscatine$gender)
>> muscatine$y      <- as.factor(muscatine$y)
>> muscatine$cage   <- muscatine$age - 12
>> muscatine$cage2  <- muscatine$cage^2
>> head(muscatine); summary(muscatine)
>> muscatine2 <- na.omit(muscatine);  summary(muscatine2)  # Remove missing data
>> 
>> # GEE model to reproduce example in Fitzmaurice, Laird, Ware book
>> library(geepack)
>> 
>> f1 <- geeglm(y ~ gender*cage + gender*cage2, id=id, data=muscatine2, 
>>         family=binomial(link=logit), 
>>         waves=occasion, corstr='unstructured')
>> ###
>> 
>> This gives me the following error
>> 
>>> f1 <- geeglm(y ~ gender*cage + gender*cage2, id=id, data=muscatine2, 
>> +           family=binomial(link=logit), 
>> +           waves=occasion, corstr='unstructured')
>> Error in lm.fit(zsca, qlf(pr2), offset = soffset) : NA/NaN/Inf in 'y'
>> In addition: Warning messages:
>> 1: In model.response(mf, "numeric") :
>> using type = "numeric" with a factor response will be ignored
>> 2: In Ops.factor(y, mu) : - not meaningful for factors
>> 
>> ###
>> 
>> I would greatly appreciate any help explaining why I am getting this error,
>> as I do not understand it.
>> 
>> Brant 
>> 
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 