Re: [R-sig-eco] AIC in R: back-transforming standardized model parameters (slopes)

Matt Perkins Tue, 12 Jan 2016 01:55:46 -0800

Hi Drew and R-help,

Many thanks for your email, and your explanations and suggestions have been 
very helpful in aiding me to understand this problem. Should you, or anyone 
else on the helplist have time, perhaps I may elaborate on my problem with a 
little more detail.


I appreciate the pragmatic suggestion to use a non-standardised model to 
extract ‘real’ slope values that I can report, which is great to hear as I had 
thought to do this, but wasn’t certain of its legitimacy, so extra support in 
this direction gives me confidence. However, as I am actually running a number 
of analyses this approach only works when AIC identifies a clear single best 
model; in some of these analyses AIC finds good evidence for multiple models in 
the top model set, such that I use model averaging to produce averaged 
parameter estimates. Obviously with model averaged parameters, it is not 
possible to run an alternate non-standardised model. So I need to be able to 
take a model averaged slope parameter estimate from the averaged model summary 
and back-transform it to a real slope value.

The provided equation is useful:
z.time = (time – mean(time))/2*sd(time)
However, I’m still a little uncertain how best I should employ it, or how I 
might relate it to a model averaged slope estimate.

Here is a brief worked example of my confusion, (data at the bottom of the 
page):

My analysis
I would like to test if treatment (kept in air or ice) affects nitrogen (N) 
within shrimps over time. I have repeated measures per shrimp (unique.id) that 
I use as a random factor to account for non-independence within an individual.

My global model is a linear mixed model:
model1<-lmer( N ~ time* air.ice + (1|unique.id), data=shrimp, REML=FALSE)

Standardisation:
So presumably, using the above equation I can standardise my “time” variable to 
create a new standardised time variable for use in the analysis.
“time” is a continuous variable but with 5 different values (2, 30.5, 58.5, 93, 
120 hr) where the mean = 60.8 and SD  = 43.34 (time data below). So for each 
value of time: 
new standardized time value = (old time value – 60.8)/(2*43.34). 

This produces a new standardised time variable (also below). 

I’m not sure how air.ice (2 level factor air or ice) is being standardised when 
I use the deafult R code with package "arm": 
stdz.model1<-standardize(model1, standardize.y=FALSE). 
I believe the default, which I use, leaves this binomial factor unscaled (as a 
binomial is already scaled between 0 and 1?). 

Even if this manual standardisation for time is correct, then I'm still unsure 
how I would back-transform my slope value. Slopes are supposed to be the value 
of increase in y per unit of x, so I assume standardised slope estimates 
represent change in y per standardised 1 unit on x axis. 


So with all this uncertainty in mind:

AIC model selection using standardised models identifies two top models; the 
interaction model and the main effects model. I model average across these two 
standardised models.
I present the model summaries for these at the bottom of the page, and that of 
the model averaged model. 
Strangely, and which I don’t understand, unstandardised model1 and 
unstandardised model2 slopes differ, while both stdz.model 1 and stdz.model2 
have the same slope values. The model averaged summary (based on average of 
stdz model1 and stdz mode2) has a slope value the same as the stdz models.


For simplicity of understanding my problem, I focus on model 1, rather than the 
model-averaged model.

Here are the slopes for different versions of Model 1:
Model 1 unstandardised  (0.008156)  
model 1 standardised   (0.50782) 
model 1 with my manually standardised  time variable (0.3979)

If I take the slope value from model 1 unstandardised (0.008156) I can 
obviously accurately predict the value of dN for any given time value. This is 
not the case for the bigger values in the two standardised models.

Clearly, as the slope of the model using my manually standardised time variable 
is different from the standardised model, my use of the above equation is not 
correct. Or my use is correct, but I am missing something else in the 
standardisation process.

Secondly, I am still unsure how I would apply the above equation to 
back-transform the standardised slope value (0.50782) so that I may get a value 
equivalent to my unstandardised ‘real’ slope estimate (0.008156).

Any suggestions for how I can back-transform a slope estimate from my model 
summary, or properly apply the equation above to standardise my model, would be 
most appreciated.

thanks,

Matt




### Models, data and summary tables below

model1<-lmer( N ~ time* air.ice + (1|unique.id), data=shrimp, REML=FALSE)
model2<-lmer( N ~ time + air.ice + (1|unique.id), data=shrimp, REML=FALSE)
model3<-lmer( N ~ std.Time*air.ice + (1|unique.id), data=shrimp, REML=FALSE)
stdz.model1<-standardize(model1, standardize.y=FALSE)
stdz.model2<-standardize(model2, standardize.y=FALSE)


Data (where manually standardised time = std.Time). 
std.Time  =  (time - mean of time) / (sd of time*2)

time    air.ice unique.id       dN      std.Time
2       air     37      12.41   -0.68
2       air     38      11.97   -0.68
2       air     39      12.00   -0.68
2       air     40      12.71   -0.68
30.5    air     37      12.96   -0.35
30.5    air     38      12.88   -0.35
30.5    air     39      12.56   -0.35
30.5    air     40      13.86   -0.35
58.5    air     37      13.11   -0.03
58.5    air     38      12.71   -0.03
58.5    air     39      12.63   -0.03
58.5    air     40      13.75   -0.03
93      air     37      13.18   0.37
93      air     38      13.15   0.37
93      air     39      12.84   0.37
93      air     40      14.29   0.37
120     air     37      13.24   0.68
120     air     38      13.12   0.68
120     air     39      12.75   0.68
120     air     40      14.26   0.68
2       ice     33      11.71   -0.68
2       ice     34      11.23   -0.68
2       ice     35      11.85   -0.68
2       ice     36      11.55   -0.68
30.5    ice     33      11.86   -0.35
30.5    ice     34      11.76   -0.35
30.5    ice     35      12.51   -0.35
30.5    ice     36      11.94   -0.35
58.5    ice     33      12.02   -0.03
58.5    ice     34      11.63   -0.03
58.5    ice     35      12.22   -0.03
58.5    ice     36      11.99   -0.03
93      ice     33      11.97   0.37
93      ice     34      11.80   0.37
93      ice     35      12.45   0.37
93      ice     36      11.89   0.37
120     ice     33      12.08   0.68
120     ice     34      11.68   0.68
120     ice     35      12.55   0.68
120     ice     36      12.25   0.68




model1 –  interaction (global) model, unstandardised
Fixed effects:
                                                   Estimate     Std. Error    t 
value
(Intercept)                                12.522535   0.197024   63.56
time                                             0.008156   0.001134    7.19
air.iceice                                     -0.801936   0.278634   -2.88
time:air.iceice                           -0.004442   0.001604   -2.77


stdz.model1 – interaction (global) model, standardised
Fixed effects:
                                                  Estimate     Std. Error    t 
value
(Intercept)                                12.48242    0.13051   95.65
z.time                                          0.50782    0.06861    7.40
c.air.ice                                      -1.07202    0.26102   -4.11
z.time:c.air.ice                          -0.38008    0.13722   -2.77


model3 – interaction (global) model, with manually standardised “time” variable
Fixed effects:
                                                   Estimate     Std. Error    t 
value
(Intercept)                                  13.0184     0.1846   70.54
std.Time                                        0.3979     0.1469    2.71
air.iceice                                      -1.0720     0.2610   -4.11
std.Time:air.iceice                      -0.1460     0.2078   -0.70


stdz.model2 – main effects model, unstandardised
Fixed effects:
                                                    Estimate   Std. Error   t 
value
(Intercept)                             12.6575787  0.1923837   65.79
time                                          0.0059351  0.0008929    6.65
air.iceice                                 -1.0720225  0.2610155   -4.11


model2 – main effects model, standardised
Fixed effects:
                                      Estimate   Std. Error    t value
(Intercept)                 12.48242     0.13051      95.65
z.time                            0.50782     0.07639       6.65
c.air.ice                        -1.07202     0.26102     -4.11


Model-averaged coefficients: 
                                                    Estimate    Std. Error    z 
value    Pr(>|z|)    
(Intercept)                                 12.48242     0.13051     95.645   < 
2e-16 ***
c.air.ice                                       -1.07202     0.26102     4.107  
   4.01e-05 ***
z.time                                           0.50782      0.06954     7.302 
   < 2e-16 ***
c.air.ice:z.time                          -0.38008       0.13722     2.770     
0.00561 **


________________________________________
From: Drew Tyre <aty...@unl.edu>
Sent: 11 January 2016 23:12
To: Matt Perkins; r-sig-ecology@r-project.org
Subject: RE: [R-sig-eco] AIC in R: back-transforming standardized model 
parameters (slopes)

Hi Matt,

This isn't going to be a complete answer, but it might help.

I wasn't 100% sure what standardize() was doing, or how it was doing it, so I 
did

getMethod("standardize","glm")

to see the source code. That function calls standardize.default() which is a 
bit hard to get but

getFromNamespace("standardize.default","arm")

pulls it out. From that code you can see that standardize() extracts the data 
from the model object, centers and scales it, and then refits the model to the 
centered and scaled data. So the formula you're looking for is

z.time = (time - mean(time))/2*sd(time)

similar to a standard Z transformation but using 2 times the sd in the 
denominator.

> Therefore I wished to know (preferably) the calculation being made, and more
> importantly the function/code to back-transform my slope estimates to
> reportable 'real' slopes.

Hmmm, but the main reason to do the transformation in the first place is to 
make it easier to do comparisons between the effect sizes of different 
variables. If you want to report "real" slopes, just use the ones from your 
model1, which should be near identical to the backtransformed versions of the 
ones from model2.

Another reason for centering and scaling is to improve numerical stability of 
your estimates, but if you were able to fit model1 and it didn't complain, not 
sure that you need to bother with the standardization.

There are other reasons too, but I don't think any of them apply here. There's 
a great discussion of when to scale and why here:
http://stats.stackexchange.com/questions/29781/when-conducting-multiple-regression-when-should-you-center-your-predictor-varia

--
Drew Tyre

School of Natural Resources
University of Nebraska-Lincoln
416 Hardin Hall, East Campus
3310 Holdrege Street
Lincoln, NE 68583-0974

phone: +1 402 472 4054
fax: +1 402 472 2946
email: aty...@unl.edu
http://snr.unl.edu/tyre
http://aminpractice.blogspot.com
http://www.flickr.com/photos/atiretoo

> -----Original Message-----
> From: R-sig-ecology [mailto:r-sig-ecology-boun...@r-project.org] On Behalf Of
> Matt Perkins
> Sent: Monday, January 11, 2016 1:40 AM
> To: r-sig-ecology@r-project.org
> Subject: [R-sig-eco] AIC in R: back-transforming standardized model parameters
> (slopes)
>
>
> Hi All,
>
>
> I have a simple Q that I'm having some difficulty finding an answer for. I'm
> conducting AIC analyses in R, and I would like to be able to report 'real' 
> slope
> values from my model summary output (i.e. take the slopes and report them
> within simple y=mx+c linear equations in my paper). However, following Greuber
> etal 2011, I have standardised the explanatory variables in my model by
> centering them to a mean of zero and and an SD of 2, using the following code
> and the R package "arm". My model has a normal error distribution.
>
>
> stdz.model1<-standardize(model1, standardize.y=FALSE)
>
>
>
> I do not yet know the sums behind this code in order to know how and what
> change has been made to my explanatory variables, in order that I could
> manually make back-transformations.
>
>
> Therefore I wished to know (preferably) the calculation being made, and more
> importantly the function/code to back-transform my slope estimates to
> reportable 'real' slopes.
>
>
> Addtionally, is it correct (or does it even matter) that I should be focusing 
> my
> back-transformation on the slope estimate taken from the model summary, as
> opposed to instead using the model summary standardised slope estimate to
> calculate a y value in my linear equation (y=mx+c), and then back-transforming
> that y value?
>
>
>
> ##
>
>
> If it is useful, my model and summary tables are below.
>
>
> I would like to test if treatment (kept in air or ice) affects nitrogen (N) 
> within
> shrimps over time. I have repeated measures per shrimp (unique.id) that I use 
> as
> a random factor to account for non-independence within an individual.
>
>
> My model is a linear mixed model of the form:
>
>
> model1<-lmer( N ~ time* air.ice + (1|unique.id), data=shrimp, REML=FALSE)
>
>
>
> The two model summary tables below show the 1) un-standardised model with
> ready-to-use slope value ("time" = 0.008156)
>
> and 2) standardised model with much-larger slope value ("z.time" = 0.50782)
>
>
> 1)
>
> Fixed effects:
>                                                   Estimate       Std. Error   
>  t value
> (Intercept)                              12.522535   0.197024      63.56
> time                                           0.008156    0.001134       7.19
> air.ice                                       -0.801936   0.278634      -2.88
> time:air.ice                              -0.004442   0.001604     -2.77
>
>
> 2)
>
>                                                    Estimate      Std. Error   
>  t value
> (Intercept)                                12.48242      0.13051      95.65
> z.time                                          0.50782       0.06861       
> 7.40
> c.air.ice                                      -1.07202       0.26102      
> -4.11
> z.time:c.air.ice                          -0.38008       0.13722     -2.77
>
>
>       [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

Re: [R-sig-eco] AIC in R: back-transforming standardized model parameters (slopes)

Reply via email to