Re: [R] distributions and glm

Rubén Roa-Ureta Tue, 21 Oct 2008 04:51:25 -0700

drbn wrote:

Hello,
I have seen that some papers do this:

1.) Group data by year (e.g. 35 years)

2.) Estimate the mean of the key variable for each year using the
distribution that fits best (in some years a normal distribution, in others
a more skewed distribution such as a gamma, etc.)

3.) Fit a GLM to these estimated yearly means.

I'd like to know whether it is valid to use these means in a GLM, or whether
it is a bad idea.

Thanks in advance

David

David,

You can model functions of data, such as means, but you must be careful
to carry over most of the uncertainty in the original data into the
model. If you don't, for example if you let the model know only the
values of the means, then you are actually assuming that these means
were observed with absolute certainty instead of being estimated from
the data.

To carry over the uncertainty in the original data to your modeling you
can use a Bayesian approach or a marginal likelihood approach. A
marginal likelihood is a true likelihood function, not of the data, but
of functions of the data, such as maximum likelihood estimates. If your
means per year were estimated using maximum likelihood (for example
with fitdistr in package MASS) and your sample size is not too small,
then you can use a normal marginal likelihood model for the means.

Note, however, that each mean may come from a different distribution, so
the full likelihood model for your data would be a mixture of normal
distributions. You may not be able to use the pre-built glm function, so
you may have to write your own code.
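
As an illustration only (not code from the original reply), here is a
minimal sketch of the normal marginal likelihood idea. It rests on several
assumptions: the data are simulated, every year is fitted with a gamma
distribution, the trend uses a log link, and the names dat, fit_year_mean,
per_year and nll are all hypothetical. The standard error of each yearly
mean is obtained from the fitdistr covariance matrix with the delta method
and then treated as known in the second stage.

```r
library(MASS)  # provides fitdistr

set.seed(1)

# Simulated data (assumption): 35 years, 50 gamma observations per year,
# with a yearly mean that follows exp(1 + 0.02 * year).
years <- 1:35
n_per_year <- 50
true_mean <- exp(1 + 0.02 * years)
dat <- data.frame(
  year = rep(years, each = n_per_year),
  y = rgamma(length(years) * n_per_year, shape = 4,
             rate = 4 / rep(true_mean, each = n_per_year))
)

# Stage 1: maximum likelihood fit per year. For a gamma distribution the
# mean is shape / rate; its standard error follows from the delta method.
fit_year_mean <- function(y) {
  f <- suppressWarnings(fitdistr(y, "gamma"))
  shape <- f$estimate[["shape"]]
  rate <- f$estimate[["rate"]]
  grad <- c(1 / rate, -shape / rate^2)  # d(mean)/d(shape), d(mean)/d(rate)
  c(mean = shape / rate,
    se = sqrt(drop(t(grad) %*% f$vcov %*% grad)))
}
per_year <- t(sapply(split(dat$y, dat$year), fit_year_mean))

yr <- as.numeric(rownames(per_year))
m_hat <- per_year[, "mean"]
se_hat <- per_year[, "se"]

# Stage 2: normal marginal likelihood for the estimated means, with each
# year's standard error treated as known. Log link: mu = exp(a + b * year).
nll <- function(par) {
  mu <- exp(par[1] + par[2] * yr)
  -sum(dnorm(m_hat, mean = mu, sd = se_hat, log = TRUE))
}
start <- coef(lm(log(m_hat) ~ yr))  # crude starting values
fit <- optim(start, nll, method = "BFGS", hessian = TRUE)

# Approximate standard errors of the trend parameters from the Hessian
par_se <- sqrt(diag(solve(fit$hessian)))
print(cbind(estimate = fit$par, se = par_se))
```

If different years call for different distributions, the stage 1 helper
would need one branch per distribution, each returning a mean and its
standard error; stage 2 stays the same as long as the normal approximation
for each estimated mean is reasonable.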

HTH
Rubén

