Re: [R] A question on modeling brain growth using GAM

2017-04-07 Thread Leon Lee
Simon

I wonder whether I can take advantage of this thread and ask you another
related question. Now, I want to get the 95%CI of the fit and their
derivatives as well. For the original fitted curves, It is straightforward
as the option "type=terms" can be used to get the CI for the fixed effect.
Now, I want to get the CI for the fixed effect in the first derivative as
well. I tried to follow your example in the predict.gam() and got stuck
here:

%- predict.gam example-
dat <- gamSim(1,n=300,scale=sig)
b<-gam(y~s(x0)+s(x1)+s(x2)+s(x3),data=dat)
 plot(b,pages=1)

 ## now evaluate derivatives of smooths with associated standard
 ## errors, by finite differencing...
 x.mesh <- seq(0,1,length=200) ## where to evaluate derivatives
 newd <- data.frame(x0 = x.mesh,x1 = x.mesh, x2=x.mesh,x3=x.mesh)
 X0 <- predict(b,newd,type="lpmatrix")

 eps <- 1e-7 ## finite difference interval
 x.mesh <- x.mesh + eps ## shift the evaluation mesh
 newd <- data.frame(x0 = x.mesh,x1 = x.mesh, x2=x.mesh,x3=x.mesh)
 X1 <- predict(b,newd,type="lpmatrix")

 Xp <- (X1-X0)/eps ## maps coefficients to (fd approx.) derivatives
 colnames(Xp)  ## can check which cols relate to which smooth

 par(mfrow=c(2,2))
 Xi <- Xp*0
 Xi[,1:9+1] <- Xp[,1:9+1] ## Xi%*%coef(b) = smooth deriv i
 df <- Xi%*%coef(b)  ## ith smooth derivative
 df.sd <- rowSums(Xi%*%b$Vp*Xi)^.5 ## cheap diag(Xi%*%b$Vp%*%t(Xi))^.5
%-predict.gam example---

Am I right that df.sd is the standard error for the derivatives and I can
get the 95% CI by 1.96*df.sd? If so, is this the CI for the fixed effect or
fixed + random effects for the predictions that have random effects
modeled? as I mentioned earlier, my model includes both fixed and random
effects:
gamm1=gamm(BrainVolume~ s(correctedAge) + s(correctedAge, subjIndexF,
bs="fs", k=5), data=mydata)

For the newd in the predict(), I constructed a data frame with all the time
points for all subjects and fed it into the predict.gam() function. If I
only want to get the CI for the fixed effect only for the derivatives, what
should I change here?

Many thanks in advance!

L

On Thu, Apr 6, 2017 at 2:16 PM, Leon Lee <bhamlio...@gmail.com> wrote:

> Hi, Simon
>
> Thank you for your explanation! I followed the instructions and
> successfully get the predicted values with both fixed and random effects
> incorporated: pred.new=predict.gam(gamm1$gam,newdata,type="response").
>
> Also, what I meant to say was "plot(gamm1$gam, pages=1)" for left and
> right figures. I didn't attach any figures.
>
> Thank you very much for the help!
> L
>
> On Thu, Apr 6, 2017 at 10:44 AM, Simon Wood <simon.w...@bath.edu> wrote:
>
>>
>> gamObj=gam(brainVolume~ s(correctedAge) +  s(subjIndexF, bs="re") +
>> s(subjIndexF, correctedAge, bs="re"), method="REML", data=mydata), where
>> subjIndexF is a factor for each subject. I was thrown an error saying "more
>> coefficients than data".
>>
>> --- I'm not sure exactly  how many scans and subjects you have. The above
>> model will have 10 + 2*(number of subjects ) coeffs. If that is more than
>> the number of scans then gam will not handle it. Depending on the numbers
>> involved you could reduce the k parameter to s(correctedAge), to fix the
>> problem. (e.g. with 31 subjects and 70 scans s(correctedAge=8) should
>> work).
>>
>> However, when I tried to model similar (please correct me if they are not
>> similar) things using GAMM based on description in
>> ?factor.smooth.interaction:
>>  gamm1=gamm(BrainVolume~ s(correctedAge) + s(correctedAge, subjIndexF,
>> bs="fs", k=5), data=mydata)
>>
>> --- It's not the same model. You now have a random smooth curve per
>> subject. You can add random effects in gamm using the list form of the
>> syntax for specifying random effects in lme. see ?gamm. Random intercepts
>> and slopes can be added that way.
>>
>> The model ran.  When I plotted the data using plot(gamm1), I got two
>> figures: the left one is the group mean and 95%CI, which I assume is the
>> results by gamm1$gam model. The right one shows 30 lines (the number of
>> subjects in my data) fluctuating around 0, which I assume is the random
>> effects (gamm1$lme) modeled within each subject that can be added onto the
>> group mean for individual curves. Is my understanding correct? If so, how
>> can I extract these curves from gamm1$lme?
>>
>> --- I would extract the fitted curves using predict(gamm1$gam,...,
>> type="terms") supplying 

Re: [R] A question on modeling brain growth using GAM

2017-04-07 Thread Leon Lee
Hi, Simon

Thank you for your explanation! I followed the instructions and
successfully get the predicted values with both fixed and random effects
incorporated: pred.new=predict.gam(gamm1$gam,newdata,type="response").

Also, what I meant to say was "plot(gamm1$gam, pages=1)" for left and right
figures. I didn't attach any figures.

Thank you very much for the help!
L

On Thu, Apr 6, 2017 at 10:44 AM, Simon Wood <simon.w...@bath.edu> wrote:

>
> gamObj=gam(brainVolume~ s(correctedAge) +  s(subjIndexF, bs="re") +
> s(subjIndexF, correctedAge, bs="re"), method="REML", data=mydata), where
> subjIndexF is a factor for each subject. I was thrown an error saying "more
> coefficients than data".
>
> --- I'm not sure exactly  how many scans and subjects you have. The above
> model will have 10 + 2*(number of subjects ) coeffs. If that is more than
> the number of scans then gam will not handle it. Depending on the numbers
> involved you could reduce the k parameter to s(correctedAge), to fix the
> problem. (e.g. with 31 subjects and 70 scans s(correctedAge=8) should work
> ).
>
> However, when I tried to model similar (please correct me if they are not
> similar) things using GAMM based on description in
> ?factor.smooth.interaction:
>  gamm1=gamm(BrainVolume~ s(correctedAge) + s(correctedAge, subjIndexF,
> bs="fs", k=5), data=mydata)
>
> --- It's not the same model. You now have a random smooth curve per
> subject. You can add random effects in gamm using the list form of the
> syntax for specifying random effects in lme. see ?gamm. Random intercepts
> and slopes can be added that way.
>
> The model ran.  When I plotted the data using plot(gamm1), I got two
> figures: the left one is the group mean and 95%CI, which I assume is the
> results by gamm1$gam model. The right one shows 30 lines (the number of
> subjects in my data) fluctuating around 0, which I assume is the random
> effects (gamm1$lme) modeled within each subject that can be added onto the
> group mean for individual curves. Is my understanding correct? If so, how
> can I extract these curves from gamm1$lme?
>
> --- I would extract the fitted curves using predict(gamm1$gam,...,
> type="terms") supplying the factor levels and correctedAges at which you
> want to evaluate the curves.
>
> Many thanks!
> L
>
>
>
> On Thu, Apr 6, 2017 at 8:22 AM, Simon Wood <simon.w...@bath.edu> wrote:
>
>> If 'subjIndexF' is a factor for subject, then s(subjIndexF, bs="re") will
>> produce a random effect for subject. i.e. each subject will be given its
>> own random intercept term, which is a way that repeated measures data like
>> this are often handled.
>>
>> The reason for the s(subjIndexF, bs="re") syntax is that smooths can be
>> viewed as Gaussian random effects, so simple Gaussian random effects can
>> also be viewed as (0-dimensional) smooths. In general s(x,z,w,bs="re") just
>> appends the columns of model.matrix(~x:z:w-1) to the gam model matrix, and
>> treats the associated coefficients as i.i.d. Gaussian random effects with a
>> common variance (to be estimated). In principle this works with any number
>> of arguments to s(...,bs="re").
>>
>> See ?random.effects (and its linked help files) in mgcv for more.
>>
>> There are mechanisms for allowing random smooth curves for each subject,
>> (e.g. ?factor.smooth.interaction), but I would only use these if simpler
>> approaches really aren't adequate here.
>>
>> best,
>> Simon
>>
>>
>>
>>
>> On 30/03/17 17:06, David Winsemius wrote:
>>
>>> On Mar 30, 2017, at 6:56 AM, Leon Lee <bhamlio...@gmail.com> wrote:
>>>>
>>>> David
>>>>
>>>> Thank you for your reply. I apologize if I posted in the wrong forum,
>>>> as I really couldn't decide which forum is the best place for my question
>>>> and I saw similar questions asked before in this forum.
>>>>
>>>> I agree that a sample of ~30 subjects (70 scans in total), the model
>>>> can be too complicated. Based on that, I did the following:
>>>> (1) ignored the gender effect, as we have less females than males.
>>>> (2) corrected chronological age based on their gestational age, that
>>>> is, we subtracted an infant's chronological age by 2 weeks, if the infant's
>>>> gestational age is 38 weeks instead of 40weeks.
>>>>
>>>> When I ran the model with corrected age, gestational age and their
>>>> interactions modeled, I found the main effect of gestati

Re: [R] A question on modeling brain growth using GAM

2017-03-30 Thread Leon Lee
David

Thank you for your reply. I apologize if I posted in the wrong forum, as I
really couldn't decide which forum is the best place for my question and I
saw similar questions asked before in this forum.

I agree that a sample of ~30 subjects (70 scans in total), the model can be
too complicated. Based on that, I did the following:
(1) ignored the gender effect, as we have less females than males.
(2) corrected chronological age based on their gestational age, that is, we
subtracted an infant's chronological age by 2 weeks, if the infant's
gestational age is 38 weeks instead of 40weeks.

When I ran the model with corrected age, gestational age and their
interactions modeled, I found the main effect of gestational age and the
interaction between the two are gone.

So, my final model will look something like this:
gamObj=gam(brainVolume~ s(correctedAge) +  s(subjIndexF, bs="re"),
method="REML", data=mydata)

Does this look more reasonable? Yes, I am relatively new to the mixed
model. We originally applied functional data analysis (PACE) on the data,
but want to see the results using a different approach. Also, I couldn't
find the Mixed Models list, do you mind sending me a link?

Thank you!
Longchuan


On Tue, Mar 28, 2017 at 4:28 PM, David Winsemius <dwinsem...@comcast.net>
wrote:

>
> > On Mar 28, 2017, at 9:32 AM, Leon Lee <bhamlio...@gmail.com> wrote:
> >
> > Hi, R experts
> >
> > I am new to R & GAM toolbox and would like to get inputs from you all on
> my
> > models. The question I have is as follows:
> > I have 30 subjects with each subject being scanned from one to three
> times
> > in the first year of life. The brain volume from each scan was measured.
> > The scan time was randomly distributed from birth to 1 year.
> > Each subject has different gestational age ranging from 38 to 41 weeks
> > Each subject has chronological age from birth to 1 year old
> > Each subject has gender category.
> > Now, I want to look at how predictors, such as subject's chronological
> age,
> > gestational age and gender will explain the changes in brain volume. I
> also
> > want to include interactions between gender and age, gestational and
> > chronological age. Random effects are also included in the model to
> account
> > for subject variability. My model looks like the follows:
> >
> > gam=gam(brainVolume~ s(age) + ti(age, gestationalAge) + gestationalAge +
> > sex + s(age, by=sex) +  s(subjIndexF, bs="re"), method="REML",
> data=mydata)
> >
> > Are there any obvious mistakes in the model? Any suggestions will be
> > greatly appreciated!
>
> I'm not seeing mistakes in the syntax but I would question whether 30
> subjects is sufficient to adequately support estimates in a a model of this
> complexity. I would also think that the 's(age)' and 'sex' terms would get
> aliased out in a model with "+ s(age, by=sex)". Most R regression functions
> handle removal of over-parametrization automatically.
>
> You also have a variable number of measurements per subject. I am unable
> to comment on the effort to account for the implicit and variably measured
> correlation and auto-correlation of values within subjects using a "smooth"
> on subjIndexF, since that is not an approach I was familiar with.  But I am
> getting concerned whether you are also new to statistical modeling in
> addition to your use of R and GAM being "new to you"?
>
> (Perhaps Simon or one of the mixed-effects experts can correct the gaps in
> my understanding of how to model repeated measures in the context of small
> numbers of subjects and irregular emasurements.)
>
> Please read the Posting Guide and the pages of candidate mailing lists.
> Rhelp is not really the place to go when you need statistical advice. I'm
> not sure if this is really in the center of concerns that get discussed on
> the Mixed Models list, but to my eyes it would be a better fit there.
>
> --
> David.
> >
> > L
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] A question on modeling brain growth using GAM

2017-03-28 Thread Leon Lee
Hi, R experts

I am new to R & GAM toolbox and would like to get inputs from you all on my
models. The question I have is as follows:
I have 30 subjects with each subject being scanned from one to three times
in the first year of life. The brain volume from each scan was measured.
The scan time was randomly distributed from birth to 1 year.
Each subject has different gestational age ranging from 38 to 41 weeks
Each subject has chronological age from birth to 1 year old
Each subject has gender category.
Now, I want to look at how predictors, such as subject's chronological age,
gestational age and gender will explain the changes in brain volume. I also
want to include interactions between gender and age, gestational and
chronological age. Random effects are also included in the model to account
for subject variability. My model looks like the follows:

gam=gam(brainVolume~ s(age) + ti(age, gestationalAge) + gestationalAge +
sex + s(age, by=sex) +  s(subjIndexF, bs="re"), method="REML", data=mydata)

Are there any obvious mistakes in the model? Any suggestions will be
greatly appreciated!

L

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.