Re: [R-sig-eco] GLS, GEE or LMM ??

Jens Oldeland Fri, 16 Apr 2010 05:59:23 -0700

Dear Ben,

How many levels of "bank" are there?


There are four banks, thus four levels, and ...

How many observations do you have overall?

Forty values overall, roughly equally distributed across the banks (10, 7, 10, 13).

I take it that there are not trends in date?

There is a clear trend in date. That's why I used the AR1 correlation structure.

What is the nature of aWert?  Counts, continuous measurements?

aWert is a continuous, unitless measure of the "flesh weight" of a mussel.
Originally we had many individual values for aWert, but since pH etc. always had
the same value for each bank at each date, we averaged aWert to a population
average per bank at each date.

The usual point of random-effects models is to analyze data where
there are a *large* number of groups, possibly with relatively small
numbers of samples per group.


Hmmm, I see that this might not be the case in our data. But what is "large"
and what is "relatively small"? Is 4 groups with about 10 samples each a large
number of groups, or not? Sorry, I lack experience with this question :(

best
jens



Ben Bolker wrote:

Jens Oldeland wrote:
Dear Thierry,
Thank you very much for your help! However, I think I have not explained my approach very well.
I am using this formula
M1.1.lme <- lme(aWert ~ Salinity + pH + chl.a + NO3 + oyster_qm + meanspring,
                random = ~ 1 | bank,
                na.action = na.omit,
                method = "ML",
                data = mussels,
                correlation = corAR1(form = ~ datumszahl))
Hence six variables for the fixed effects, bank (station) as the location effect, and "datumszahl" for the time effect. Datumszahl is a numeric value that stands in for a date; for example, 35932 would be 17 May 1998. Hence I am not using year 2000 but day ... 35000? Oops :-)
  How many levels of "bank" are there?  That's the critical question.
  I take it that there are not trends in date?  If so, you should have
'datumszahl' in the fixed effects as well as in the correlation structure.
   What is the nature of aWert?  Counts, continuous measurements?  Are
the counts small numbers?

  How many observations do you have overall?
Do you still think that six variables are not enough to calculate an LMM or GEE? But then... what is the purpose of such models when they do not work with a small set of variables?
  The usual point of random-effects models is to analyze data where
there are a *large* number of groups, possibly with relatively small
numbers of samples per group.
thinking,
Jens



ONKELINX, Thierry wrote:
Dear Jens,

A random effect with only three levels is not a good idea. You are
estimating a variance on only three numbers. Have a look at the plot
below. It gives the confidence interval of the ratio between the
estimated variance and the true variance. Note that with three levels,
the estimated variance can be from 40 times smaller up to 3.7 times
larger than the true variance. If you have 30 (thirty) levels, this
range is reduced: from 1.8 times smaller up to about 1.6 times larger.

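# Confidence interval of (estimated variance / true variance)
# as a function of the number of levels n
n <- seq(2, 100)
low <- qchisq(p = 0.025, df = n - 1) / (n - 1)   # lower 2.5% bound
high <- qchisq(p = 0.975, df = n - 1) / (n - 1)  # upper 97.5% bound
plot(n, high, type = "l", ylim = c(0, 5),
     xlab = "number of levels", ylab = "estimated / true variance")
lines(n, low)
abline(h = 1, lty = 2)  # ratio of 1: estimate equals the true variance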
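
The same chi-square bounds can be read off numerically for a few specific numbers of levels. This is a minimal sketch in base R; the object names (levels_n, ci_table) are illustrative, and the choice of 3, 4 and 30 levels simply mirrors the cases mentioned in this thread.

```r
# Bounds of the 95% interval for (estimated variance / true variance),
# using the same chi-square argument as the plot above.
levels_n <- c(3, 4, 30)          # numbers of levels to inspect (illustrative)
df <- levels_n - 1               # degrees of freedom for a variance estimate

ratio_low  <- qchisq(0.025, df = df) / df
ratio_high <- qchisq(0.975, df = df) / df

ci_table <- data.frame(
  levels        = levels_n,
  times_smaller = 1 / ratio_low,  # how many times too small the estimate can be
  times_larger  = ratio_high      # how many times too large the estimate can be
)
print(round(ci_table, 1))
```

With four levels the interval is still very wide (roughly 14 times smaller to about 3 times larger), so moving from three to four banks does not change the picture much.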

Therefore I recommend that you add the site variable to the fixed
effects and drop the random effects.
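
A minimal sketch of that recommendation, assuming simulated data because the original mussels data set is not part of this thread. The site enters as a fixed factor in nlme::gls(), the time trend is also in the fixed part, and an AR(1) structure is kept within each bank. The data frame mussels_sim, its integer column year (used here in place of datumszahl), and all effect sizes are made up for illustration only.

```r
library(nlme)

set.seed(1)
# Hypothetical stand-in for the mussel data: 4 banks x 10 yearly samples.
# expand.grid varies 'year' fastest, so rows are ordered by bank, then year.
mussels_sim <- expand.grid(year = 1:10, bank = factor(paste0("B", 1:4)))
mussels_sim$Salinity <- rnorm(nrow(mussels_sim), mean = 30, sd = 1)
mussels_sim$aWert <- 5 +
  0.1 * mussels_sim$year +                 # assumed time trend
  0.2 * (mussels_sim$Salinity - 30) +      # assumed salinity effect
  0.3 * as.integer(mussels_sim$bank) +     # assumed bank differences
  rnorm(nrow(mussels_sim), sd = 0.5)

# bank as a fixed effect, no random effects; AR(1) errors within bank
fit_gls <- gls(aWert ~ Salinity + year + bank,
               data = mussels_sim,
               correlation = corAR1(form = ~ year | bank),
               method = "ML")
print(summary(fit_gls)$tTable)
```

Whether an integer year index or the raw day number is the right time covariate for corAR1() depends on the sampling design; the sketch only assumes one sample per bank per year.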

A) Centering continuous data will mostly affect only the estimate of
the intercept. The intercept is the expected value of your response when
all variables are zero (or at their reference level). So if you have a
time series ranging from 2000 to 2010, then the intercept is the value in
the year 0. When you center year at 2000 (year = 2000 --> cyear = 0),
the intercept will be the expected value in the year 2000. The
first is nonsense given your time series; the latter has a practical
interpretation. Note that both models are mathematically identical
but use a different parametrisation.
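
The point about centering can be checked on a small simulated time series. This is a sketch in base R under made-up assumptions: the data frame ts_sim, the response y, and the trend of 0.5 per year are invented for illustration.

```r
set.seed(42)
# Hypothetical yearly series from 2000 to 2010 with a linear trend
ts_sim <- data.frame(year = 2000:2010)
ts_sim$cyear <- ts_sim$year - 2000          # centered: year 2000 -> cyear 0
ts_sim$y <- 3 + 0.5 * ts_sim$cyear + rnorm(nrow(ts_sim), sd = 0.3)

fit_raw      <- lm(y ~ year,  data = ts_sim)  # intercept = expected y in year 0
fit_centered <- lm(y ~ cyear, data = ts_sim)  # intercept = expected y in year 2000

print(coef(fit_raw))
print(coef(fit_centered))

# Same slope and same fitted values: only the parametrisation differs
same_slope  <- isTRUE(all.equal(unname(coef(fit_raw)["year"]),
                                unname(coef(fit_centered)["cyear"])))
same_fitted <- isTRUE(all.equal(fitted(fit_raw), fitted(fit_centered)))
print(c(same_slope = same_slope, same_fitted = same_fitted))
```

The two intercepts differ by 2000 times the slope, which is exactly the extrapolation from year 2000 back to year 0.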

B) Given that you have only three levels, neither an LMM nor a GEE will be
a good model. So comparing them is not a good idea.

C) Lower AIC is always better. So -10 is better than -5.
AIC = 2 k - 2 log(L), with k = number of parameters and L = likelihood.
Models with a higher likelihood will have a lower AIC (if the number of
parameters is equal).
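
The formula can be verified against R's built-in AIC() on a toy model, which also shows how a negative AIC arises. This is a sketch in base R; the data frame d and the small residual standard deviation (0.05) are assumptions chosen so that the log-likelihood is positive. BIC follows the same "lower is better" rule, with log(n) replacing the factor 2 on k.

```r
set.seed(7)
# Toy data with very small residual spread, so the Gaussian density exceeds 1
# and the log-likelihood becomes positive (hence a negative AIC).
d <- data.frame(x = rnorm(40))
d$y <- 1 + 0.01 * d$x + rnorm(40, sd = 0.05)

fit <- lm(y ~ x, data = d)
ll  <- logLik(fit)
k   <- attr(ll, "df")      # intercept, slope and residual variance

aic_manual <- 2 * k - 2 * as.numeric(ll)
bic_manual <- log(nobs(fit)) * k - 2 * as.numeric(ll)

print(c(manual = aic_manual, builtin = AIC(fit)))
print(c(manual = bic_manual, builtin = BIC(fit)))
```

A negative value is nothing special: it only reflects the scale of the response, and models are still ranked by which AIC is lower.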

HTH,

Thierry


----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey
-----Original message-----
From: r-sig-ecology-boun...@r-project.org [mailto:r-sig-ecology-boun...@r-project.org] On behalf of Jens Oldeland
Sent: Friday 16 April 2010 12:50
To: r-sig-ecology@r-project.org
Subject: [R-sig-eco] GLS, GEE or LMM ??
Dear All, I have run into a number of questions, and thus I hope you could help me out. I am modelling the effect of oyster density and nutrients on the body weight of mussels (population average). Data was sampled at three different stations over 8 years, with values measured in springtime once per year.
I was following Zuur et al. 2009, Mixed Effects Models (wonderful book!), but got lost at some points since different models lead to totally different results.
a) The first question is about "centring data". Zuur suggests centring parameters (p. 334) if they are highly correlated with the intercept. When I apply an lme (family=gaussian, random ~ 1 | bank, correlation = corAR1(form = ~ daycount)) I have to centre nearly all the values. When I apply a GEE then there is no correlation at all (r=0.14). Actually, centring the data leads to the same output at the end (for the lme).
b) Choosing GEE, the effect of one parameter (salinity) is highly significant, while using the LMM approach it is not, which would be better for our interpretation... But why? Is it because GEE should not be used on normally distributed data? I know that GEE uses a sandwich estimator and LMM uses ML. Which one would be more "trustworthy" or conservative?
c) One last question: negative AICs, which one is better, AIC -10 or -5? I have read contrasting statements. Is there any proof? Does it hold for BIC as well?
thank you in advance!
Jens

--
+++++++++++++++++++++++++++++++++++++++++
Dipl.Biol. Jens Oldeland
Biodiversity of Plants
Biocentre Klein Flottbek and Botanical Garden
University of Hamburg
Ohnhorststr. 18
22609 Hamburg,
Germany

Tel:    0049-(0)40-42816-407
Fax:    0049-(0)40-42816-543
Mail:   oldel...@botanik.uni-hamburg.de
        oldel...@gmx.de         (for attachments > 2mb!!)
Skype:  jens.oldeland
http://www.biologie.uni-hamburg.de/bzf/fbda005/fbda005.htm
+++++++++++++++++++++++++++++++++++++++++

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
Please do not print this message unnecessarily.
The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.



--
+++++++++++++++++++++++++++++++++++++++++
Dipl.Biol. Jens Oldeland
Biodiversity of Plants
Biocentre Klein Flottbek and Botanical Garden
University of Hamburg
Ohnhorststr. 18
22609 Hamburg,
Germany

Tel:    0049-(0)40-42816-407
Fax:    0049-(0)40-42816-543
Mail:   oldel...@botanik.uni-hamburg.de
       oldel...@gmx.de  (for attachments > 2mb!!)
Skype:  jens.oldeland
http://www.biologie.uni-hamburg.de/bzf/fbda005/fbda005.htm
http://jensoldeland.wordpress.com
+++++++++++++++++++++++++++++++++++++++++

