Re: [R] Urgent - I really need some help lme4 model avg Estimates

2012-03-28 Thread Mitchell Maltenfort
You said it right in your first letter.  You want to keep it at a
level that is comprehensible.  Not just to you, but your colleagues,
reviewers, readers...

Remember the sage wisdom that all models are wrong, but some of them
are useful.

Rather than try to forge the Excalibur of statistical models, consider
writing your paper around 2-4 different, relatively simple statistical
models, comparing and contrasting their interpretations.
How well do they agree with each other and with informed expectations?



Ersatzistician and Chutzpahthologist
I can answer any question.  "I don't know" is an answer. "I don't know
yet" is a better answer.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Urgent - I really need some help lme4 model avg Estimates

2012-03-28 Thread Dragonwalker
Thank you Mitchell,

I will try that. So I presume that the initial paper where they showed the
estimates AND the intercept from a model averaging procedure may have been
done using a different method?

Would it still be prudent to fit a global model and then show the top few
models, perhaps those with a delta < 2, along with their weights?
Would it also be okay to just do a model average and then show the
weights of each covariate and factor within these models to show their
relative importance? The paper presented the results of extremely similar
research using only models of the form A+B+C+(1|D) etc., followed by model
averaging, and was able to come up with an Intercept and then much smaller,
comparable estimates. That made me think this was probably the correct way to
present the results, and that getting these values must be something that I
just didn't know how to code. They were even able to compare the Estimate
differences among the variables, whereas when I used -1 to remove the
intercept, the distances between variables changed (although the distances
within each variable stayed the same).

Thank you again for your kind reply.

Rachel



--
View this message in context: 
http://r.789695.n4.nabble.com/Urgent-I-really-need-some-help-lme4-model-avg-Estimates-tp4511178p4512504.html
Sent from the R help mailing list archive at Nabble.com.



[R] Urgent - I really need some help lme4 model avg Estimates

2012-03-27 Thread Dragonwalker
Hello all,
If someone could take a little time to help me then I would be very
grateful.
I studied piping plovers last summer. I watched each chick within a brood
for 5 minutes and recorded behaviour, habitat use and foraging rate.
There were two Sites, the first with 4 broods and the second with 3 broods.
As the data within a brood are non-independent and there were so few broods,
conventional statistical tests were of little use. I therefore spent a
couple of months looking at mixed models to allow me to use all the data for
each day, with (1|Brood) as a random effect.
Attached table: http://r.789695.n4.nabble.com/file/n4511178/Table_PP_Maslo_et_al.png

At first I struggled with what the models meant, but last week they 'sort of'
clicked and I realised how to run them and how to weigh which models were the
best (using AICc).
As I had a number of factors/covariates that I wanted to look at, I learned
to use the dredge command in the MuMIn package on an a priori global model
and decided to model average the models with a delta < 2.
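
A minimal sketch of the AICc ranking described above, in base R. To stay
free of package dependencies it uses ordinary lm() fits on simulated data
rather than lmer() fits with (1|Brood), and a small hand-written candidate
set rather than dredge(); the variable names (Rate, Tide, Habitat, Wind) and
all numbers are invented for illustration. In MuMIn, dredge() and model.avg()
handle the same bookkeeping.

```r
# Simulated stand-in data; none of these values come from the plover study.
set.seed(42)
n <- 140
dat <- data.frame(
  Tide    = factor(sample(c("high", "low"), n, replace = TRUE)),
  Habitat = factor(sample(c("bay", "intertidal", "pool"), n, replace = TRUE)),
  Wind    = runif(n, 0, 10)
)
dat$Rate <- 5 + 2 * (dat$Tide == "low") +
  1.5 * (dat$Habitat == "intertidal") + rnorm(n)

# A small candidate set, all nested in the global model.
forms <- list(
  null     = Rate ~ 1,
  tide     = Rate ~ Tide,
  tide_hab = Rate ~ Tide + Habitat,
  global   = Rate ~ Tide + Habitat + Wind
)
fits <- lapply(forms, lm, data = dat)

# AICc = AIC + 2k(k + 1) / (n - k - 1), with k the number of estimated
# parameters (including the residual variance).
aicc <- function(fit) {
  k <- attr(logLik(fit), "df")
  n_obs <- nobs(fit)
  AIC(fit) + 2 * k * (k + 1) / (n_obs - k - 1)
}

tab <- data.frame(model = names(fits), AICc = sapply(fits, aicc))
tab$delta  <- tab$AICc - min(tab$AICc)                         # distance from best model
tab$weight <- exp(-tab$delta / 2) / sum(exp(-tab$delta / 2))   # Akaike weights
tab <- tab[order(tab$delta), ]

top <- tab[tab$delta < 2, ]   # the "delta < 2" set one would model average
print(tab)
```

The weights sum to 1 over the whole candidate set; if only the delta < 2
models are averaged, the weights are usually rescaled to sum to 1 within
that subset.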

I have two main questions:
I was looking at similar research that also used model selection, and they
came up with model-averaged estimates and CIs for each variable and factor.
They ended up with one table showing the top few models with their AICc,
delta and weights, and another table showing the model-averaged Estimates
and CIs for each factor and covariate and also the Intercept. Each
category within each variable was shown (I have attached an image of the
table - the heading does not seem to match what is shown, however).
Their explanation of the variables was as follows:
"A second model including these variables and wind speed reported a ΔAICc
score < 2; therefore, we model-averaged the parameter estimates included in
these 2 best models (Table 3). Of the 5 habitats in which we observed
plovers feeding, effect size was highest at artificial tidal ponds (5.52),
followed by the intertidal zone (3.97). Positive effects of ephemeral pools
(2.65) and bay shores (2.32) on adult foraging rates were 48% and 42% lower
than artificial ponds, respectively. Conversely, sand flats (-2.30) had an
equal but opposite effect on foraging rate, when compared to bay shores.
The results also indicated that foraging rate was highest for adults
during the post-breeding stage. In addition, vehicles had a 2.3 times
larger effect on foraging adults than people. Finally, foraging rates
during low tide were higher than at high tide by a factor of 2.5, as would
be expected."

As you can see, their explanation seems to suggest that all values are
comparable, e.g. vehicles and people.

When I ran the model average I also got an Intercept estimate, but only the
estimates for the second and subsequent categories were shown (e.g. if one
factor was high tide, low tide, then only the estimate for low tide was shown,
obviously an estimate of the difference between the two).
I asked on stats.stackexchange and they suggested just adding -1 to the end
of the model. This worked, but the estimates became much bigger to
compensate for there being no intercept, and although the differences between
the Estimates stayed the same 'within factor', the 'among factor'
differences seemed to change (they became bigger), along with the
p-values for each group. In addition there was, of course, no intercept.
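
The behaviour described above can be reproduced with a small base R sketch.
It uses lm() on simulated data rather than the mixed models in question, and
the names Rate, Tide and Disturb are made up for illustration; the coding of
the fixed effects works the same way in a formula that also contains
(1|Brood).

```r
# Simulated stand-in data with two factors.
set.seed(1)
n <- 120
dat <- data.frame(
  Tide    = factor(sample(c("high", "low"), n, replace = TRUE)),
  Disturb = factor(sample(c("none", "people", "vehicles"), n, replace = TRUE))
)
dat$Rate <- 4 + 2 * (dat$Tide == "low") -
  1 * (dat$Disturb == "people") -
  2.5 * (dat$Disturb == "vehicles") + rnorm(n)

# Default (treatment) coding: the first level of each factor is the
# reference and is absorbed into the intercept.
with_int <- lm(Rate ~ Tide + Disturb, data = dat)

# Dropping the intercept: only the FIRST factor in the formula gets one
# coefficient per level; later factors are still coded as differences
# from their reference level.
no_int <- lm(Rate ~ Tide + Disturb - 1, data = dat)

print(coef(with_int))  # (Intercept), Tidelow, Disturbpeople, Disturbvehicles
print(coef(no_int))    # Tidehigh, Tidelow, Disturbpeople, Disturbvehicles

# Within-factor difference is unchanged by removing the intercept.
tide_diff_no_int <- unname(coef(no_int)["Tidelow"] - coef(no_int)["Tidehigh"])
```

Both fits describe exactly the same model and give the same fitted values.
With -1, the Tide coefficients become absolute levels (Tidehigh equals the
old intercept) while the Disturb coefficients remain differences, so the two
sets are no longer on the same footing. The p-values change because the Tide
coefficients are now tested against zero rather than against the reference
level.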

I am therefore wondering whether anyone knows how I may be able to preserve
the initial Estimates but still get the missing values (obviously the other
researchers seemed to have done this as they still have an intercept and
comparable estimates).
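
One way to keep an intercept and still report a value for every category is
sum-to-zero coding, sketched below in base R with lm() and simulated data
(the habitat names and means are invented). This is only a possibility:
nothing in the quoted passage says how the paper's table was produced, and
the five habitat values quoted above do not sum to zero, so the authors may
well have done something else. Under the default coding, the "missing"
category is simply the reference level, whose effect is 0 by definition.

```r
# Simulated stand-in data: one factor with five levels.
set.seed(2)
n <- 150
dat <- data.frame(
  Habitat = factor(sample(c("bay", "intertidal", "pond", "pool", "sandflat"),
                          n, replace = TRUE))
)
true_mean <- c(bay = 6, intertidal = 8, pond = 9, pool = 7, sandflat = 3)
dat$Rate <- true_mean[as.character(dat$Habitat)] + rnorm(n)

# Sum-to-zero contrasts: the intercept is the unweighted mean of the level
# means, and each coefficient is a level's deviation from that mean.
fit <- lm(Rate ~ Habitat, data = dat,
          contrasts = list(Habitat = "contr.sum"))
b <- coef(fit)   # (Intercept), Habitat1, ..., Habitat4

# The last level's deviation is not printed, but it is minus the sum of
# the others because the deviations are constrained to sum to zero.
eff <- c(b[-1], -sum(b[-1]))
names(eff) <- levels(dat$Habitat)

print(b["(Intercept)"])
print(round(eff, 2))
```

Here intercept + deviation reproduces each habitat's mean, so every category
gets a value alongside an intercept. Standard errors or CIs for the last
level would need extra work and are not shown.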

This is my most important issue right now, but if someone has a moment,
could you also tell me whether I should use the p-values as well, or should
I just stick with explaining the magnitude of the effects, their direction
and their Relative Importance? I want to keep it at a level that I can
understand.

Thank you in advance. I know everyone is busy but I would be very grateful
for a prompt response if at all possible.

Sincerely.



Re: [R] Urgent - I really need some help lme4 model avg Estimates

2012-03-27 Thread Bert Gunter
You've got to be kidding!

You are requesting extensive statistical consulting from the R-Help
list. That is not the purpose of this list, nor is it reasonable to
expect remote statisticians unfamiliar with your work or state of
understanding (which appears to be rather sketchy) to provide reliable
or perhaps even relevant advice. Instead, I suggest that you spend
some serious time with local statistical experts (who may or may not
be statisticians). Or take a much less complicated approach (perhaps
with graphics) to your analysis -- this might actually be better
because you will have a better understanding of the results and what
they mean about the underlying scientific issues. Although I
understand that, alas and alack, embellishing your work with dazzling
statistical ornamentation may be a prerequisite to publication, so
you're stuck.

Cheers,
Bert


Re: [R] Urgent - I really need some help lme4 model avg Estimates

2012-03-27 Thread Bert Gunter
... perhaps also worth mentioning:

The combination of some data and an aching desire for an answer does
not ensure that a reasonable answer can be extracted from a given body
of data. 
-- John Tukey

-- Bert




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Re: [R] Urgent - I really need some help lme4 model avg Estimates

2012-03-27 Thread Dragonwalker
I understand where you are coming from, but the issue is that some
exploration of the data through graphs and the like showed that patterns
could be seen. However, with only 7 means it is extremely difficult to get
any kind of statistical evidence, and as some mean values are the same, some
of the tests that I wanted to use, such as a Mann-Whitney, would not even run,
so I had to resort to a one-sample Wilcoxon with a set mu value (the minimum
p-value that was even possible was p=0.280).

I asked a couple of forums in December about the issues at hand and they
suggested that I look into mixed-effect models, so I read some chapters on
them and got very excited, but at the time still thought of them as some
test that could give me means. However, it has since all clicked and I
realise that they can be more useful as a tool to illustrate which factors
and covariates best explain the response variable.

I understand the concepts of fitting an intercept and slope somewhat, but the
literature on it can be a little confusing. However, the way they were used in
the paper (of which I attached one of the tables) seemed a very
straightforward method of teasing apart the intricate factors of habitat, age
and other factors that could be affecting behaviour such as time feeding and
foraging rate. Believe me, if I could have survived with Kruskal-Wallis then
I would have had my thesis written up three months ago with a lot less
stress. I am not looking for pretty as I don't even want it published, but I
did hope to be able to do justice to the time that I spent collecting data.

I have come really far, thanks to some great people, but I do not have
anyone near me who can help, and my adviser is 3000 miles away too and is not
a statistician either.
All I would like to know is how Maslo et al. could have calculated estimates
for all categories AND an intercept, and whether there is a method to do this
in R. I have spent months trying to find these answers and so I would greatly
appreciate an answer to this question.

Thank you again.


