Folks: I think this and several other recent posts on ranking predictors are nice illustrations of a fundamental conundrum: Empirical models are fit to be good *predictors*; "meaningful" interpretation of separate parameters/components of the predictors may well be difficult or impossible, especially in complex models. All that the fitting process guarantees, if it works well, is a good overall predictor for data sampled from the same process. Unfortunately, much of the time, those who apply the procedures are interested in interpretation, not prediction.
Addendum: Interpretation is helped by well-designed studies and experiments, hindered by data mining of observational data.

I don't think any of this is profound, just sometimes forgotten; however, I would welcome public or private reaction to this comment, and especially refinement/corrections.

Bert Gunter
Genentech Nonclinical Statistics
South San Francisco, CA 94404

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Prof Brian Ripley
Sent: Wednesday, January 17, 2007 6:02 AM
To: Behnke Jerzy
Cc: Reader Tom; r-help@stat.math.ethz.ch
Subject: Re: [R] Effect size in GLIM models

On Wed, 17 Jan 2007, Behnke Jerzy wrote:

> Dear All,
> I wonder if anyone can advise me as to whether there is a consensus as
> to how the effect size should be calculated from GLIM models in R for
> any specified significant main effect or interaction.

I think there is consensus that effect sizes are not measured by significance tests. If you have a log link (you did not say), the model coefficients have a direct interpretation via multiplicative increases in rates.

> In investigating the causes of variation in infection in wild animals,
> we have fitted 4-way GLIM models in R with negative binomial errors.

What exactly do you mean by 'GLIM models in R with negative binomial errors'? Negative binomial regression is within the GLM framework only for fixed shape theta. Package MASS has glm.nb() which extends the framework and you may be using without telling us. (AFAIK GLIM is a software package, not a class of models.) I suspect you are using the code from MASS without reference to the book it supports, which has a worked example of model selection.

> These are then simplified using the STEP procedure, and finally each of
> the remaining terms is deleted in turn, and the model without that term
> compared to a model with that term to estimate probability

'probability' of what?
> An ANOVA of each model gives the deviance explained by each interaction
> and main effect, and the percentage deviance attributable to each factor
> can be calculated from NULL deviance.

If theta is not held fixed, anova() is probably not appropriate: see the help for anova.negbin.

> However, we estimate probabilities by subsequent deletion of terms, and
> this gives the LR statistic. Expressing the value of the LR statistic as
> a percentage of 2xlog-like in a model without any factors, gives lower
> values than the former procedure.

I don't know anything to suggest percentages of LR statistics are reasonable summary measures. There are extensions of R^2 to these models, but AFAIK they share the well-attested drawbacks of R^2.

> Are either of these appropriate? If so which is best, or alternatively
> how can % deviance be calculated. We require % deviance explained by
> each factor or interaction, because we need to compare individual
> factors (say host age) across a range of infections.
>
> Any advice will be most gratefully appreciated. I can send you a worked
> example if you require more information.

We do ask for more information in the posting guide and the footer of every message. I have had to guess uncomfortably much in formulating my answers.

> Jerzy. M. Behnke,
> The School of Biology,
> The University of Nottingham,
> University Park,
> NOTTINGHAM, NG7 2RD
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
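
A minimal sketch of two points from the reply above, on simulated data. With a log link, exp(coefficient) reads as a multiplicative change in the expected count, and a term can be assessed by refitting the model without it, so that theta is re-estimated in both fits. The data frame, the variable names (age, sex, count) and the simulation settings are all made up for illustration and are not taken from the original poster's analysis; glm.nb() and rnegbin() come from package MASS.

```r
library(MASS)  # recommended package shipped with R: glm.nb(), rnegbin()

set.seed(1)
n <- 400

# Hypothetical data: parasite counts by host age class and sex
d <- data.frame(
  age = factor(sample(c("juvenile", "adult"), n, replace = TRUE),
               levels = c("juvenile", "adult")),
  sex = factor(sample(c("F", "M"), n, replace = TRUE))
)
mu <- exp(0.5 + 0.7 * (d$age == "adult") + 0.2 * (d$sex == "M"))
d$count <- rnegbin(n, mu = mu, theta = 2)

# Negative binomial fits with a log link (the glm.nb default);
# theta is estimated separately in each fit
full    <- glm.nb(count ~ age + sex, data = d)
reduced <- glm.nb(count ~ sex, data = d)   # 'age' deleted

# Log link: exp(coefficient) is a multiplicative change in the expected count
rate_ratios <- exp(coef(full))

# LR statistic for deleting 'age' (one parameter, hence df = 1)
lr <- as.numeric(2 * (logLik(full) - logLik(reduced)))
p  <- pchisq(lr, df = 1, lower.tail = FALSE)

print(round(rate_ratios, 2))
print(c(LR = lr, p = p))

# The same comparison through anova.negbin, given both fitted models
print(anova(reduced, full))
```

The LR statistic and its p-value say whether dropping the term matters; the rate ratios are what describe how large the effect is.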