Re: [R] A question on modeling brain growth using GAM

2017-04-07 Thread Leon Lee
Simon

I wonder whether I can take advantage of this thread and ask you another
related question. Now, I want to get the 95%CI of the fit and their
derivatives as well. For the original fitted curves, It is straightforward
as the option "type=terms" can be used to get the CI for the fixed effect.
Now, I want to get the CI for the fixed effect in the first derivative as
well. I tried to follow your example in the predict.gam() and got stuck
here:

%- predict.gam example-
dat <- gamSim(1,n=300,scale=sig)
b<-gam(y~s(x0)+s(x1)+s(x2)+s(x3),data=dat)
 plot(b,pages=1)

 ## now evaluate derivatives of smooths with associated standard
 ## errors, by finite differencing...
 x.mesh <- seq(0,1,length=200) ## where to evaluate derivatives
 newd <- data.frame(x0 = x.mesh,x1 = x.mesh, x2=x.mesh,x3=x.mesh)
 X0 <- predict(b,newd,type="lpmatrix")

 eps <- 1e-7 ## finite difference interval
 x.mesh <- x.mesh + eps ## shift the evaluation mesh
 newd <- data.frame(x0 = x.mesh,x1 = x.mesh, x2=x.mesh,x3=x.mesh)
 X1 <- predict(b,newd,type="lpmatrix")

 Xp <- (X1-X0)/eps ## maps coefficients to (fd approx.) derivatives
 colnames(Xp)  ## can check which cols relate to which smooth

 par(mfrow=c(2,2))
 Xi <- Xp*0
 Xi[,1:9+1] <- Xp[,1:9+1] ## Xi%*%coef(b) = smooth deriv i
 df <- Xi%*%coef(b)  ## ith smooth derivative
 df.sd <- rowSums(Xi%*%b$Vp*Xi)^.5 ## cheap diag(Xi%*%b$Vp%*%t(Xi))^.5
%-predict.gam example---

Am I right that df.sd is the standard error for the derivatives and I can
get the 95% CI by 1.96*df.sd? If so, is this the CI for the fixed effect or
fixed + random effects for the predictions that have random effects
modeled? as I mentioned earlier, my model includes both fixed and random
effects:
gamm1=gamm(BrainVolume~ s(correctedAge) + s(correctedAge, subjIndexF,
bs="fs", k=5), data=mydata)

For the newd in the predict(), I constructed a data frame with all the time
points for all subjects and fed it into the predict.gam() function. If I
only want to get the CI for the fixed effect only for the derivatives, what
should I change here?

Many thanks in advance!

L

On Thu, Apr 6, 2017 at 2:16 PM, Leon Lee  wrote:

> Hi, Simon
>
> Thank you for your explanation! I followed the instructions and
> successfully get the predicted values with both fixed and random effects
> incorporated: pred.new=predict.gam(gamm1$gam,newdata,type="response").
>
> Also, what I meant to say was "plot(gamm1$gam, pages=1)" for left and
> right figures. I didn't attach any figures.
>
> Thank you very much for the help!
> L
>
> On Thu, Apr 6, 2017 at 10:44 AM, Simon Wood  wrote:
>
>>
>> gamObj=gam(brainVolume~ s(correctedAge) +  s(subjIndexF, bs="re") +
>> s(subjIndexF, correctedAge, bs="re"), method="REML", data=mydata), where
>> subjIndexF is a factor for each subject. I was thrown an error saying "more
>> coefficients than data".
>>
>> --- I'm not sure exactly  how many scans and subjects you have. The above
>> model will have 10 + 2*(number of subjects ) coeffs. If that is more than
>> the number of scans then gam will not handle it. Depending on the numbers
>> involved you could reduce the k parameter to s(correctedAge), to fix the
>> problem. (e.g. with 31 subjects and 70 scans s(correctedAge=8) should
>> work).
>>
>> However, when I tried to model similar (please correct me if they are not
>> similar) things using GAMM based on description in
>> ?factor.smooth.interaction:
>>  gamm1=gamm(BrainVolume~ s(correctedAge) + s(correctedAge, subjIndexF,
>> bs="fs", k=5), data=mydata)
>>
>> --- It's not the same model. You now have a random smooth curve per
>> subject. You can add random effects in gamm using the list form of the
>> syntax for specifying random effects in lme. see ?gamm. Random intercepts
>> and slopes can be added that way.
>>
>> The model ran.  When I plotted the data using plot(gamm1), I got two
>> figures: the left one is the group mean and 95%CI, which I assume is the
>> results by gamm1$gam model. The right one shows 30 lines (the number of
>> subjects in my data) fluctuating around 0, which I assume is the random
>> effects (gamm1$lme) modeled within each subject that can be added onto the
>> group mean for individual curves. Is my understanding correct? If so, how
>> can I extract these curves from gamm1$lme?
>>
>> --- I would extract the fitted curves using predict(gamm1$gam,...,
>> type="terms") supplying the factor levels and correctedAges at which you
>> want to evaluate the curves.
>>
>> Many thanks!
>> L
>>
>>
>>
>> On Thu, Apr 6, 2017 at 8:22 AM, Simon Wood  wrote:
>>
>>> If 'subjIndexF' is a factor for subject, then s(subjIndexF, bs="re")
>>> will produce a random effect for subject. i.e. each subject will be given
>>> its own random intercept term, which is a way that repeated measures data
>>> like this are often handled.
>>>
>>> The reason for the s(subjInde

Re: [R] A question on modeling brain growth using GAM

2017-04-07 Thread Leon Lee
Hi, Simon

Thank you for your explanation! I followed the instructions and
successfully get the predicted values with both fixed and random effects
incorporated: pred.new=predict.gam(gamm1$gam,newdata,type="response").

Also, what I meant to say was "plot(gamm1$gam, pages=1)" for left and right
figures. I didn't attach any figures.

Thank you very much for the help!
L

On Thu, Apr 6, 2017 at 10:44 AM, Simon Wood  wrote:

>
> gamObj=gam(brainVolume~ s(correctedAge) +  s(subjIndexF, bs="re") +
> s(subjIndexF, correctedAge, bs="re"), method="REML", data=mydata), where
> subjIndexF is a factor for each subject. I was thrown an error saying "more
> coefficients than data".
>
> --- I'm not sure exactly  how many scans and subjects you have. The above
> model will have 10 + 2*(number of subjects ) coeffs. If that is more than
> the number of scans then gam will not handle it. Depending on the numbers
> involved you could reduce the k parameter to s(correctedAge), to fix the
> problem. (e.g. with 31 subjects and 70 scans s(correctedAge=8) should work
> ).
>
> However, when I tried to model similar (please correct me if they are not
> similar) things using GAMM based on description in
> ?factor.smooth.interaction:
>  gamm1=gamm(BrainVolume~ s(correctedAge) + s(correctedAge, subjIndexF,
> bs="fs", k=5), data=mydata)
>
> --- It's not the same model. You now have a random smooth curve per
> subject. You can add random effects in gamm using the list form of the
> syntax for specifying random effects in lme. see ?gamm. Random intercepts
> and slopes can be added that way.
>
> The model ran.  When I plotted the data using plot(gamm1), I got two
> figures: the left one is the group mean and 95%CI, which I assume is the
> results by gamm1$gam model. The right one shows 30 lines (the number of
> subjects in my data) fluctuating around 0, which I assume is the random
> effects (gamm1$lme) modeled within each subject that can be added onto the
> group mean for individual curves. Is my understanding correct? If so, how
> can I extract these curves from gamm1$lme?
>
> --- I would extract the fitted curves using predict(gamm1$gam,...,
> type="terms") supplying the factor levels and correctedAges at which you
> want to evaluate the curves.
>
> Many thanks!
> L
>
>
>
> On Thu, Apr 6, 2017 at 8:22 AM, Simon Wood  wrote:
>
>> If 'subjIndexF' is a factor for subject, then s(subjIndexF, bs="re") will
>> produce a random effect for subject. i.e. each subject will be given its
>> own random intercept term, which is a way that repeated measures data like
>> this are often handled.
>>
>> The reason for the s(subjIndexF, bs="re") syntax is that smooths can be
>> viewed as Gaussian random effects, so simple Gaussian random effects can
>> also be viewed as (0-dimensional) smooths. In general s(x,z,w,bs="re") just
>> appends the columns of model.matrix(~x:z:w-1) to the gam model matrix, and
>> treats the associated coefficients as i.i.d. Gaussian random effects with a
>> common variance (to be estimated). In principle this works with any number
>> of arguments to s(...,bs="re").
>>
>> See ?random.effects (and its linked help files) in mgcv for more.
>>
>> There are mechanisms for allowing random smooth curves for each subject,
>> (e.g. ?factor.smooth.interaction), but I would only use these if simpler
>> approaches really aren't adequate here.
>>
>> best,
>> Simon
>>
>>
>>
>>
>> On 30/03/17 17:06, David Winsemius wrote:
>>
>>> On Mar 30, 2017, at 6:56 AM, Leon Lee  wrote:

 David

 Thank you for your reply. I apologize if I posted in the wrong forum,
 as I really couldn't decide which forum is the best place for my question
 and I saw similar questions asked before in this forum.

 I agree that a sample of ~30 subjects (70 scans in total), the model
 can be too complicated. Based on that, I did the following:
 (1) ignored the gender effect, as we have less females than males.
 (2) corrected chronological age based on their gestational age, that
 is, we subtracted an infant's chronological age by 2 weeks, if the infant's
 gestational age is 38 weeks instead of 40weeks.

 When I ran the model with corrected age, gestational age and their
 interactions modeled, I found the main effect of gestational age and the
 interaction between the two are gone.

 So, my final model will look something like this:
 gamObj=gam(brainVolume~ s(correctedAge) +  s(subjIndexF, bs="re"),
 method="REML", data=mydata)

 Does this look more reasonable?

>>> I'm still having difficulty understand how a "smoothing" function would
>>> be used to handle repeated measures without some sort of "group-within"
>>> indicator.
>>>
>>> I would have imagined (and this is because I have no experience with
>>> using this package for repeated measures) something along the lines of:
>>>
>>>   ...+s(correctedAg|subjIndexF)
>>>
>>> I see this statement in the docs:
>>>
>>>
>>> smooth.construc

Re: [R] A question on modeling brain growth using GAM

2017-04-06 Thread Simon Wood
>
> gamObj=gam(brainVolume~ s(correctedAge) +  s(subjIndexF, bs="re") + 
> s(subjIndexF, correctedAge, bs="re"), method="REML", data=mydata), 
> where subjIndexF is a factor for each subject. I was thrown an error 
> saying "more coefficients than data".
>
--- I'm not sure exactly  how many scans and subjects you have. The 
above model will have 10 + 2*(number of subjects ) coeffs. If that is 
more than the number of scans then gam will not handle it. Depending on 
the numbers involved you could reduce the k parameter to 
s(correctedAge), to fix the problem. (e.g. with 31 subjects and 70 scans 
s(correctedAge=8) should work).

> However, when I tried to model similar (please correct me if they are 
> not similar) things using GAMM based on description in 
> ?factor.smooth.interaction:
>  gamm1=gamm(BrainVolume~ s(correctedAge) + s(correctedAge, subjIndexF, 
> bs="fs", k=5), data=mydata)
--- It's not the same model. You now have a random smooth curve per 
subject. You can add random effects in gamm using the list form of the 
syntax for specifying random effects in lme. see ?gamm. Random 
intercepts and slopes can be added that way.

> The model ran.  When I plotted the data using plot(gamm1), I got two 
> figures: the left one is the group mean and 95%CI, which I assume is 
> the results by gamm1$gam model. The right one shows 30 lines (the 
> number of subjects in my data) fluctuating around 0, which I assume is 
> the random effects (gamm1$lme) modeled within each subject that can be 
> added onto the group mean for individual curves. Is my understanding 
> correct? If so, how can I extract these curves from gamm1$lme?
>
--- I would extract the fitted curves using predict(gamm1$gam,..., 
type="terms") supplying the factor levels and correctedAges at which you 
want to evaluate the curves.

> Many thanks!
> L
>
>
>
> On Thu, Apr 6, 2017 at 8:22 AM, Simon Wood  > wrote:
>
> If 'subjIndexF' is a factor for subject, then s(subjIndexF,
> bs="re") will produce a random effect for subject. i.e. each
> subject will be given its own random intercept term, which is a
> way that repeated measures data like this are often handled.
>
> The reason for the s(subjIndexF, bs="re") syntax is that smooths
> can be viewed as Gaussian random effects, so simple Gaussian
> random effects can also be viewed as (0-dimensional) smooths. In
> general s(x,z,w,bs="re") just appends the columns of
> model.matrix(~x:z:w-1) to the gam model matrix, and treats the
> associated coefficients as i.i.d. Gaussian random effects with a
> common variance (to be estimated). In principle this works with
> any number of arguments to s(...,bs="re").
>
> See ?random.effects (and its linked help files) in mgcv for more.
>
> There are mechanisms for allowing random smooth curves for each
> subject, (e.g. ?factor.smooth.interaction), but I would only use
> these if simpler approaches really aren't adequate here.
>
> best,
> Simon
>
>
>
>
> On 30/03/17 17:06, David Winsemius wrote:
>
> On Mar 30, 2017, at 6:56 AM, Leon Lee
> mailto:bhamlio...@gmail.com>> wrote:
>
> David
>
> Thank you for your reply. I apologize if I posted in the
> wrong forum, as I really couldn't decide which forum is
> the best place for my question and I saw similar questions
> asked before in this forum.
>
> I agree that a sample of ~30 subjects (70 scans in total),
> the model can be too complicated. Based on that, I did the
> following:
> (1) ignored the gender effect, as we have less females
> than males.
> (2) corrected chronological age based on their gestational
> age, that is, we subtracted an infant's chronological age
> by 2 weeks, if the infant's gestational age is 38 weeks
> instead of 40weeks.
>
> When I ran the model with corrected age, gestational age
> and their interactions modeled, I found the main effect of
> gestational age and the interaction between the two are gone.
>
> So, my final model will look something like this:
> gamObj=gam(brainVolume~ s(correctedAge) + s(subjIndexF,
> bs="re"), method="REML", data=mydata)
>
> Does this look more reasonable?
>
> I'm still having difficulty understand how a "smoothing"
> function would be used to handle repeated measures without
> some sort of "group-within" indicator.
>
> I would have imagined (and this is because I have no
> experience with using this package for repeated measures)
> something along the lines of:
>
>   ...+s(correctedAg|subjIndexF)
>
> I see this statement in the docs:
>
>
> smooth.construct.re.smooth.spec {mgcv}
>
> "gam can deal with simple inde

Re: [R] A question on modeling brain growth using GAM

2017-04-06 Thread Simon Wood
If 'subjIndexF' is a factor for subject, then s(subjIndexF, bs="re") 
will produce a random effect for subject. i.e. each subject will be 
given its own random intercept term, which is a way that repeated 
measures data like this are often handled.


The reason for the s(subjIndexF, bs="re") syntax is that smooths can be 
viewed as Gaussian random effects, so simple Gaussian random effects can 
also be viewed as (0-dimensional) smooths. In general s(x,z,w,bs="re") 
just appends the columns of model.matrix(~x:z:w-1) to the gam model 
matrix, and treats the associated coefficients as i.i.d. Gaussian random 
effects with a common variance (to be estimated). In principle this 
works with any number of arguments to s(...,bs="re").


See ?random.effects (and its linked help files) in mgcv for more.

There are mechanisms for allowing random smooth curves for each subject, 
(e.g. ?factor.smooth.interaction), but I would only use these if simpler 
approaches really aren't adequate here.


best,
Simon



On 30/03/17 17:06, David Winsemius wrote:

On Mar 30, 2017, at 6:56 AM, Leon Lee  wrote:

David

Thank you for your reply. I apologize if I posted in the wrong forum, as I 
really couldn't decide which forum is the best place for my question and I saw 
similar questions asked before in this forum.

I agree that a sample of ~30 subjects (70 scans in total), the model can be too 
complicated. Based on that, I did the following:
(1) ignored the gender effect, as we have less females than males.
(2) corrected chronological age based on their gestational age, that is, we 
subtracted an infant's chronological age by 2 weeks, if the infant's 
gestational age is 38 weeks instead of 40weeks.

When I ran the model with corrected age, gestational age and their interactions 
modeled, I found the main effect of gestational age and the interaction between 
the two are gone.

So, my final model will look something like this:
gamObj=gam(brainVolume~ s(correctedAge) +  s(subjIndexF, bs="re"), 
method="REML", data=mydata)

Does this look more reasonable?

I'm still having difficulty understand how a "smoothing" function would be used to handle 
repeated measures without some sort of "group-within" indicator.

I would have imagined (and this is because I have no experience with using this 
package for repeated measures) something along the lines of:

  ...+s(correctedAg|subjIndexF)

I see this statement in the docs:


smooth.construct.re.smooth.spec {mgcv}  

"gam can deal with simple independent random effects, by exploiting the link between smooths 
and random effects to treat random effects as smooths. s(x,bs="re") implements this."

But I don't see that as applying to the dependency between individuals measured 
repeatedly. I find no examples of repeated measures problems being solve by 
gam(). There is also a note on the same page:

"Note that smooth ids are not supported for random effect terms. Unlike most smooth 
terms, side conditions are never applied to random effect terms in the event of nesting 
(since they are identifiable without side conditions)."

When I do a search on "using gam mgcv formula mixed effects" I am referred to 
packages 'gamm' and 'gamm4' produced by the same author (Simon Wood) as pkg 'mgcv', or to 
package `nlme`.



Yes, I am relatively new to the mixed model. We originally applied functional 
data analysis (PACE) on the data, but want to see the results using a different 
approach. Also, I couldn't find the Mixed Models list, do you mind sending me a 
link?

This is a link to the main mailing lists page:

https://www.r-project.org/mail.html

Found with a search on Google with "R mailing lists"



Thank you!
Longchuan


On Tue, Mar 28, 2017 at 4:28 PM, David Winsemius  wrote:


On Mar 28, 2017, at 9:32 AM, Leon Lee  wrote:

Hi, R experts

I am new to R & GAM toolbox and would like to get inputs from you all on my
models. The question I have is as follows:
I have 30 subjects with each subject being scanned from one to three times
in the first year of life. The brain volume from each scan was measured.
The scan time was randomly distributed from birth to 1 year.
Each subject has different gestational age ranging from 38 to 41 weeks
Each subject has chronological age from birth to 1 year old
Each subject has gender category.
Now, I want to look at how predictors, such as subject's chronological age,
gestational age and gender will explain the changes in brain volume. I also
want to include interactions between gender and age, gestational and
chronological age. Random effects are also included in the model to account
for subject variability. My model looks like the follows:

gam=gam(brainVolume~ s(age) + ti(age, gestationalAge) + gestationalAge +
sex + s(age, by=sex) +  s(subjIndexF, bs="re"), method="REML", data=mydata)

Are there any obvious mistakes in the model? Any suggestions will be
greatly appreciated!

I'm not seeing mistakes in the syntax but I would question whether 30 subjects is 
sufficient t

Re: [R] A question on modeling brain growth using GAM

2017-03-30 Thread David Winsemius

> On Mar 30, 2017, at 6:56 AM, Leon Lee  wrote:
> 
> David
> 
> Thank you for your reply. I apologize if I posted in the wrong forum, as I 
> really couldn't decide which forum is the best place for my question and I 
> saw similar questions asked before in this forum. 
> 
> I agree that a sample of ~30 subjects (70 scans in total), the model can be 
> too complicated. Based on that, I did the following:
> (1) ignored the gender effect, as we have less females than males. 
> (2) corrected chronological age based on their gestational age, that is, we 
> subtracted an infant's chronological age by 2 weeks, if the infant's 
> gestational age is 38 weeks instead of 40weeks. 
> 
> When I ran the model with corrected age, gestational age and their 
> interactions modeled, I found the main effect of gestational age and the 
> interaction between the two are gone. 
> 
> So, my final model will look something like this:
> gamObj=gam(brainVolume~ s(correctedAge) +  s(subjIndexF, bs="re"), 
> method="REML", data=mydata)
> 
> Does this look more reasonable?

I'm still having difficulty understand how a "smoothing" function would be used 
to handle repeated measures without some sort of "group-within" indicator. 

I would have imagined (and this is because I have no experience with using this 
package for repeated measures) something along the lines of:

 ...+s(correctedAg|subjIndexF)

I see this statement in the docs:


smooth.construct.re.smooth.spec {mgcv}  

"gam can deal with simple independent random effects, by exploiting the link 
between smooths and random effects to treat random effects as smooths. 
s(x,bs="re") implements this."

But I don't see that as applying to the dependency between individuals measured 
repeatedly. I find no examples of repeated measures problems being solve by 
gam(). There is also a note on the same page:

"Note that smooth ids are not supported for random effect terms. Unlike most 
smooth terms, side conditions are never applied to random effect terms in the 
event of nesting (since they are identifiable without side conditions)."

When I do a search on "using gam mgcv formula mixed effects" I am referred to 
packages 'gamm' and 'gamm4' produced by the same author (Simon Wood) as pkg 
'mgcv', or to package `nlme`.


> Yes, I am relatively new to the mixed model. We originally applied functional 
> data analysis (PACE) on the data, but want to see the results using a 
> different approach. Also, I couldn't find the Mixed Models list, do you mind 
> sending me a link?   

This is a link to the main mailing lists page:

https://www.r-project.org/mail.html

Found with a search on Google with "R mailing lists"


> 
> Thank you!
> Longchuan
> 
> 
> On Tue, Mar 28, 2017 at 4:28 PM, David Winsemius  
> wrote:
> 
> > On Mar 28, 2017, at 9:32 AM, Leon Lee  wrote:
> >
> > Hi, R experts
> >
> > I am new to R & GAM toolbox and would like to get inputs from you all on my
> > models. The question I have is as follows:
> > I have 30 subjects with each subject being scanned from one to three times
> > in the first year of life. The brain volume from each scan was measured.
> > The scan time was randomly distributed from birth to 1 year.
> > Each subject has different gestational age ranging from 38 to 41 weeks
> > Each subject has chronological age from birth to 1 year old
> > Each subject has gender category.
> > Now, I want to look at how predictors, such as subject's chronological age,
> > gestational age and gender will explain the changes in brain volume. I also
> > want to include interactions between gender and age, gestational and
> > chronological age. Random effects are also included in the model to account
> > for subject variability. My model looks like the follows:
> >
> > gam=gam(brainVolume~ s(age) + ti(age, gestationalAge) + gestationalAge +
> > sex + s(age, by=sex) +  s(subjIndexF, bs="re"), method="REML", data=mydata)
> >
> > Are there any obvious mistakes in the model? Any suggestions will be
> > greatly appreciated!
> 
> I'm not seeing mistakes in the syntax but I would question whether 30 
> subjects is sufficient to adequately support estimates in a a model of this 
> complexity. I would also think that the 's(age)' and 'sex' terms would get 
> aliased out in a model with "+ s(age, by=sex)". Most R regression functions 
> handle removal of over-parametrization automatically.
> 
> You also have a variable number of measurements per subject. I am unable to 
> comment on the effort to account for the implicit and variably measured 
> correlation and auto-correlation of values within subjects using a "smooth" 
> on subjIndexF, since that is not an approach I was familiar with.  But I am 
> getting concerned whether you are also new to statistical modeling in 
> addition to your use of R and GAM being "new to you"?
> 
> (Perhaps Simon or one of the mixed-effects experts can correct the gaps in my 
> understanding of how to model repeated measures in the context of small 
> numbers 

Re: [R] A question on modeling brain growth using GAM

2017-03-30 Thread Leon Lee
David

Thank you for your reply. I apologize if I posted in the wrong forum, as I
really couldn't decide which forum is the best place for my question and I
saw similar questions asked before in this forum.

I agree that a sample of ~30 subjects (70 scans in total), the model can be
too complicated. Based on that, I did the following:
(1) ignored the gender effect, as we have less females than males.
(2) corrected chronological age based on their gestational age, that is, we
subtracted an infant's chronological age by 2 weeks, if the infant's
gestational age is 38 weeks instead of 40weeks.

When I ran the model with corrected age, gestational age and their
interactions modeled, I found the main effect of gestational age and the
interaction between the two are gone.

So, my final model will look something like this:
gamObj=gam(brainVolume~ s(correctedAge) +  s(subjIndexF, bs="re"),
method="REML", data=mydata)

Does this look more reasonable? Yes, I am relatively new to the mixed
model. We originally applied functional data analysis (PACE) on the data,
but want to see the results using a different approach. Also, I couldn't
find the Mixed Models list, do you mind sending me a link?

Thank you!
Longchuan


On Tue, Mar 28, 2017 at 4:28 PM, David Winsemius 
wrote:

>
> > On Mar 28, 2017, at 9:32 AM, Leon Lee  wrote:
> >
> > Hi, R experts
> >
> > I am new to R & GAM toolbox and would like to get inputs from you all on
> my
> > models. The question I have is as follows:
> > I have 30 subjects with each subject being scanned from one to three
> times
> > in the first year of life. The brain volume from each scan was measured.
> > The scan time was randomly distributed from birth to 1 year.
> > Each subject has different gestational age ranging from 38 to 41 weeks
> > Each subject has chronological age from birth to 1 year old
> > Each subject has gender category.
> > Now, I want to look at how predictors, such as subject's chronological
> age,
> > gestational age and gender will explain the changes in brain volume. I
> also
> > want to include interactions between gender and age, gestational and
> > chronological age. Random effects are also included in the model to
> account
> > for subject variability. My model looks like the follows:
> >
> > gam=gam(brainVolume~ s(age) + ti(age, gestationalAge) + gestationalAge +
> > sex + s(age, by=sex) +  s(subjIndexF, bs="re"), method="REML",
> data=mydata)
> >
> > Are there any obvious mistakes in the model? Any suggestions will be
> > greatly appreciated!
>
> I'm not seeing mistakes in the syntax but I would question whether 30
> subjects is sufficient to adequately support estimates in a a model of this
> complexity. I would also think that the 's(age)' and 'sex' terms would get
> aliased out in a model with "+ s(age, by=sex)". Most R regression functions
> handle removal of over-parametrization automatically.
>
> You also have a variable number of measurements per subject. I am unable
> to comment on the effort to account for the implicit and variably measured
> correlation and auto-correlation of values within subjects using a "smooth"
> on subjIndexF, since that is not an approach I was familiar with.  But I am
> getting concerned whether you are also new to statistical modeling in
> addition to your use of R and GAM being "new to you"?
>
> (Perhaps Simon or one of the mixed-effects experts can correct the gaps in
> my understanding of how to model repeated measures in the context of small
> numbers of subjects and irregular emasurements.)
>
> Please read the Posting Guide and the pages of candidate mailing lists.
> Rhelp is not really the place to go when you need statistical advice. I'm
> not sure if this is really in the center of concerns that get discussed on
> the Mixed Models list, but to my eyes it would be a better fit there.
>
> --
> David.
> >
> > L
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A question on modeling brain growth using GAM

2017-03-28 Thread David Winsemius

> On Mar 28, 2017, at 9:32 AM, Leon Lee  wrote:
> 
> Hi, R experts
> 
> I am new to R & GAM toolbox and would like to get inputs from you all on my
> models. The question I have is as follows:
> I have 30 subjects with each subject being scanned from one to three times
> in the first year of life. The brain volume from each scan was measured.
> The scan time was randomly distributed from birth to 1 year.
> Each subject has different gestational age ranging from 38 to 41 weeks
> Each subject has chronological age from birth to 1 year old
> Each subject has gender category.
> Now, I want to look at how predictors, such as subject's chronological age,
> gestational age and gender will explain the changes in brain volume. I also
> want to include interactions between gender and age, gestational and
> chronological age. Random effects are also included in the model to account
> for subject variability. My model looks like the follows:
> 
> gam=gam(brainVolume~ s(age) + ti(age, gestationalAge) + gestationalAge +
> sex + s(age, by=sex) +  s(subjIndexF, bs="re"), method="REML", data=mydata)
> 
> Are there any obvious mistakes in the model? Any suggestions will be
> greatly appreciated!

I'm not seeing mistakes in the syntax but I would question whether 30 subjects 
is sufficient to adequately support estimates in a a model of this complexity. 
I would also think that the 's(age)' and 'sex' terms would get aliased out in a 
model with "+ s(age, by=sex)". Most R regression functions handle removal of 
over-parametrization automatically.

You also have a variable number of measurements per subject. I am unable to 
comment on the effort to account for the implicit and variably measured 
correlation and auto-correlation of values within subjects using a "smooth" on 
subjIndexF, since that is not an approach I was familiar with.  But I am 
getting concerned whether you are also new to statistical modeling in addition 
to your use of R and GAM being "new to you"?

(Perhaps Simon or one of the mixed-effects experts can correct the gaps in my 
understanding of how to model repeated measures in the context of small numbers 
of subjects and irregular emasurements.)

Please read the Posting Guide and the pages of candidate mailing lists. Rhelp 
is not really the place to go when you need statistical advice. I'm not sure if 
this is really in the center of concerns that get discussed on the Mixed Models 
list, but to my eyes it would be a better fit there.

-- 
David.
> 
> L
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] A question on modeling brain growth using GAM

2017-03-28 Thread Leon Lee
Hi, R experts

I am new to R & GAM toolbox and would like to get inputs from you all on my
models. The question I have is as follows:
I have 30 subjects with each subject being scanned from one to three times
in the first year of life. The brain volume from each scan was measured.
The scan time was randomly distributed from birth to 1 year.
Each subject has different gestational age ranging from 38 to 41 weeks
Each subject has chronological age from birth to 1 year old
Each subject has gender category.
Now, I want to look at how predictors, such as subject's chronological age,
gestational age and gender will explain the changes in brain volume. I also
want to include interactions between gender and age, gestational and
chronological age. Random effects are also included in the model to account
for subject variability. My model looks like the follows:

gam=gam(brainVolume~ s(age) + ti(age, gestationalAge) + gestationalAge +
sex + s(age, by=sex) +  s(subjIndexF, bs="re"), method="REML", data=mydata)

Are there any obvious mistakes in the model? Any suggestions will be
greatly appreciated!

L

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.