On 01/01/16 14:35, Andras Farkas wrote:

> Dear All,
>
> I wonder if you have a thought on the following: if I have a simple
> model like model <- lm(log(y) ~ log(x) + log(z), data = data), where
> both the dependent and the independent variables are log transformed,
> is it OK just to use ypred <- predict(model, type = "response") to
> get the predictions, then transform ypred with exp(ypred) back to
> y's original scale, so as to compare the observed or known data (y)
> with the model predictions (ypred) on the original scale?
>
> I appreciate your thoughts...

Is it OK? In what sense? It's a free country --- at least the country that I live in is still free (more or less), so you can do whatever you feel like.

However, it helps to do a little elementary mathematics rather than just hammering and hoping. When in doubt, do the maths.

You are fitting the model

  log(y) = beta_0 + beta_1 * log(x) + beta_2 * log(z) + E

where it is tacitly assumed that E ~ N(0,sigma^2) and that the E's corresponding to different observations are independent. Using predict() will get you

   log(y).hat = b_0 + b_1 * log(x) + b_2 * log(z)

where the b_i are the *estimates* of the beta_i. Under the given assumptions the b_i are unbiased, so the expected value of log(y).hat is equal to the expected value of log(y), and the universe is in harmony. OMMMMMMMMM!
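
Purely for illustration, here is a minimal sketch in R with simulated data (the coefficients, sigma and sample size are made up, not taken from your post):

    ## minimal sketch with simulated data; all numbers are made up
    set.seed(42)
    n <- 200
    x <- runif(n, 1, 10)
    z <- runif(n, 1, 10)
    y <- exp(1 + 0.7*log(x) - 0.3*log(z) + rnorm(n, 0, 0.5))
    dat   <- data.frame(y = y, x = x, z = z)
    model <- lm(log(y) ~ log(x) + log(z), data = dat)
    log.y.hat <- predict(model)   # predictions, still on the log scale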

However, when you exponentiate everything, things change; expected values are not preserved, since exponentiation is *not* a linear function!

If you set y.hat = exp(log(y).hat) you get

   y.hat = exp(b_0) * x^{b_1}* z^{b_2}

It is not obvious what the expected value of this expression is, but it is almost certainly not the same as the expected value of y.

Note that the expected value of y is *not* equal to:

    exp(beta_0) * x^{beta_1} * z^{beta_2}  (1)

Rather, it is equal to this expression multiplied by exp(sigma^2/2) (under the stated assumptions): y = exp(beta_0) * x^{beta_1} * z^{beta_2} * exp(E), and for E ~ N(0,sigma^2) the expected value of exp(E) is exp(sigma^2/2) --- the mean of a lognormal random variable.
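
If you want to convince yourself of that lognormal-mean fact numerically, a quick simulation (sigma = 0.5 chosen arbitrarily) does the trick:

    ## E[exp(E)] for E ~ N(0, 0.5^2): simulated vs. exp(sigma^2/2)
    mean(exp(rnorm(1e6, 0, 0.5)))   # close to 1.1331
    exp(0.5^2/2)                    # 1.133148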

I *think* (I have not checked this) that the expected value of y.hat will be equal "asymptotically" to (1) --- that is, to the expected value of y divided by exp(sigma^2/2).

Other younger and wiser heads who read this list might like to chime in here and possibly correct me.

So, bottom line, my guess is that if your sample size is "large" then you won't go too far wrong by predicting y by

    exp(log(y).hat)*exp(s^2/2)

i.e. by exp(predict(model))*exp(s^2/2)

where s^2 is the estimated variance from the model that you fitted on the log scale using lm().
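
In R, continuing from a fit like the sketch above, that would be (s^2 being the square of the "residual standard error"):

    s2    <- summary(model)$sigma^2           # estimated variance, log scale
    y.hat <- exp(predict(model)) * exp(s2/2)  # bias-adjusted predictions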

I would advise doing some fairly thorough simulations to get a feel for the size of the bias. (There will always be *some* bias.)
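
Something along the following lines would do (again just a sketch, with made-up parameter values); under these settings the naive predictor should come out biased low, and the adjusted one much closer to unbiased:

    ## compare naive and bias-adjusted back-transformed predictors
    set.seed(101)
    nsim <- 1000
    bias.naive <- bias.adj <- numeric(nsim)
    for (i in seq_len(nsim)) {
        x   <- runif(50, 1, 10)
        z   <- runif(50, 1, 10)
        y   <- exp(1 + 0.7*log(x) - 0.3*log(z) + rnorm(50, 0, 0.5))
        fit <- lm(log(y) ~ log(x) + log(z))
        s2  <- summary(fit)$sigma^2
        mu  <- exp(1 + 0.7*log(x) - 0.3*log(z)) * exp(0.5^2/2)  # true E[y]
        bias.naive[i] <- mean(exp(fitted(fit)) - mu)
        bias.adj[i]   <- mean(exp(fitted(fit)) * exp(s2/2) - mu)
    }
    mean(bias.naive)   # typically clearly negative
    mean(bias.adj)     # typically much closer to zero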

You might also think about fitting the non-linear model

    y = gamma_0 * x^{gamma_1} * z^{gamma_2} + E

if you think that additive errors make more sense than multiplicative errors for your setting.

You can do this "fairly easily" using the function nls(), if you choose to go this way.
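
A sketch, using the simulated dat from above and taking the starting values (which nls() needs) from the log-scale fit:

    ## start nls() from the coefficients of the log-scale lm() fit
    cf <- coef(model)
    fit.nls <- nls(y ~ g0 * x^g1 * z^g2, data = dat,
                   start = list(g0 = exp(unname(cf[1])),
                                g1 = unname(cf[2]),
                                g2 = unname(cf[3])))
    summary(fit.nls)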

cheers,

Rolf Turner

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
