Dear Sam,

I hear your concern and I sympathise. The reason for the conflicting advice, in my opinion, is partly historical and partly due to academic heredity. When people first started doing statistical analyses, they didn't have computers and all calculations had to be done by hand. This, coupled with a statistical theory in its infancy, limited the choice of analysis methods. The result was the pragmatic approach of altering-your-data-to-fit-the-method. There are still, of course, some good reasons to do this, but only sometimes.

Now to answer your questions. Standardisation of covariates doesn't have inferential benefits. That is, the model you fit will be the same either way; the coefficients are simply expressed on a different scale. If you transform your covariates (by a non-linear transformation) then the model will change. The reason for standardising is to avoid computational issues (like numerical underflow and overflow), and some believe it makes it easier to place priors on the coefficients in a Bayesian analysis. The reason for transforming is quite different. It is done when you believe that the scale on which the covariate acts is different from the scale on which it was measured. When fitting smooths (GAM(M)s) the scale shouldn't matter so much anyway, but there will still be some dependence through the location of knots and the distance between points in covariate space.
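As a rough illustration of the distinction above, here is a minimal sketch in plain Python (standard library only) rather than R. It fits a straight line by least squares to the same response using the raw covariate, the standardised covariate, and the log-transformed covariate. The data and the helper name `fitted_values` are made up for illustration. The fitted values from the raw and standardised fits should agree up to rounding error, while the log fit gives a genuinely different model.

```python
import math
from statistics import mean, stdev


def fitted_values(x, y):
    """Fitted values from the least-squares line y = a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [intercept + slope * xi for xi in x]


# Made-up data, for illustration only.
x = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]

# Standardise: subtract the mean and divide by the standard deviation.
x_bar, x_sd = mean(x), stdev(x)
x_std = [(xi - x_bar) / x_sd for xi in x]

# Non-linear transformation: natural log.
x_log = [math.log(xi) for xi in x]

fit_raw = fitted_values(x, y)
fit_std = fitted_values(x_std, y)
fit_log = fitted_values(x_log, y)

# Largest absolute difference in fitted values between pairs of fits.
diff_std = max(abs(a - b) for a, b in zip(fit_raw, fit_std))
diff_log = max(abs(a - b) for a, b in zip(fit_raw, fit_log))

print(f"raw vs standardised: {diff_std:.2e}")
print(f"raw vs log:          {diff_log:.2e}")
```

The same comparison could be made in R by fitting `lm()` to the raw, scaled, and logged covariate and comparing the fitted values.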

Observations with outlying covariates are likely to have high leverage (that is, a disproportionate influence on the analysis result). Some would argue that you should transform these covariates to account for them. I would only transform if I thought the scale was wrong, or there were other (larger) issues with the data/analysis. In preference, I would try to do an analysis that reduced the influence of these covariate values. The extreme case is to remove that observation altogether (assuming that the observation actually comes from a different sampling frame than the one you are interested in). A less extreme approach would be to down-weight the observation, or to use the bootstrap or resistant/robust methods. These are just suggestions, and I'm not overly familiar with these methods (I have used them before, but I need to look them up each time).
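To make the leverage idea concrete, here is a small sketch in plain Python (standard library only), again with made-up data. For a straight-line fit, the leverage of observation i is 1/n + (x_i - mean(x))^2 / Sxx, so it depends only on how far the covariate value lies from the others. The second helper shows one way of down-weighting an observation through weighted least squares. The weight of 0.01 is an arbitrary choice for illustration, not a recommendation, and the function names are hypothetical.

```python
from statistics import mean


def leverages(x):
    """Leverage (hat value) of each observation in a straight-line fit."""
    n = len(x)
    x_bar = mean(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]


def weighted_slope(x, y, w):
    """Slope of the weighted least-squares line with weights w."""
    w_sum = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / w_sum
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / w_sum
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


# Made-up data: the last observation has an outlying covariate value.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 50.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 12.0]

h = leverages(x)

equal_weights = [1.0] * len(x)
down_weights = [1.0, 1.0, 1.0, 1.0, 1.0, 0.01]  # arbitrary illustrative weight

slope_ols = weighted_slope(x, y, equal_weights)
slope_dw = weighted_slope(x, y, down_weights)

for xi, hi in zip(x, h):
    print(f"x = {xi:5.1f}  leverage = {hi:.3f}")
print(f"slope, equal weights:       {slope_ols:.3f}")
print(f"slope, last point weighted: {slope_dw:.3f}")
```

In this toy data set the outlying point takes almost all of the available leverage (the leverages of a straight-line fit sum to 2, the number of coefficients), and down-weighting it moves the slope back towards the trend in the other five points. In R, `hatvalues()` on a fitted model and the `weights` argument of `lm()` cover the same ground.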

I hope that this helps,

Scott



On 04/09/14 03:34, Samantha Cox wrote:
Dear R-sig-ecology,

I have spent some time trawling the internet, and seem to come across slightly 
conflicting advice regarding the standardisation and transformation of 
variables prior to multiple regression analysis (e.g. LM, LME/GLS, GLM, GLMM, 
GAM, GAMM).  I searched the archives here and I don't think this is a repeat, 
but I apologise if it is.


1.       I understand that standardisation (subtract mean and divide by 
standard deviation) is important within a Bayesian environment and when using 
programs such as Rjags.  However within frequentist packages (e.g. lme4, MASS 
etc) under what (if any) circumstances is it necessary?



2.       Are transformations (e.g. log, sqrt etc) necessary for non-normal 
(highly skewed) explanatory variables or where extreme values/outliers are 
observed?  Some literature says this is necessary, others say it is not.  Is 
the current consensus that transformations are generally not required on 
predictor/explanatory variables?


Thank you

Sam

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


--
Scott Foster
CSIRO
E scott.fos...@csiro.au T +61 3 6232 5178
Postal address: CSIRO Marine Laboratories, GPO Box 1538, Hobart TAS 7001
Street Address: CSIRO, Castray Esplanade, Hobart Tas 7001, Australia
www.csiro.au

