Re: [R] GLM Gamma Family logLik formula?

2009-07-15 Thread Skipper Seabold
On Wed, Jul 15, 2009 at 8:45 PM, Ben Bolker wrote:
>
>
>
> jseabold wrote:
>>
>> Hello all,
>>
>> I was wondering if someone can enlighten me as to the difference
>> between the logLik in R vis-a-vis Stata for a GLM model with the gamma
>> family.
>>
>> Stata calculates the loglikelihood of the model as (in R notation)
>> some equivalent function of
>>
>> -1/scale *
>> sum(Y/mu+log(mu)+(scale-1)*log(Y)+log(scale)+scale*lgamma(1/scale))
>>
>> where scale (or dispersion) = 1, Y = the response variable, and mu is
>> the fitted values given by the fitted model.
>>
>> R seems to use a very similar approach, but scale is set equal to the
>> calculated dispersion for the gamma model.  However, when I calculate
>> the logLik by hand this way the answer differs slightly (about .5)
>> from the logLik(glm.m1).
>>
>> I haven't been able to figure out why looking at the help.  If anyone
>> has any ideas, the insight would be much appreciated.
>>
>>
>
> This is not a full answer, but the beginning of an answer about how to find
> out the answer.
>
> stats:::logLik.glm
>

So that's how you look at the source.  Very helpful to clear up
confusion.  I am rather new to R.

> shows that R computes the log-likelihood (a bit oddly) by
> extracting the AIC component of a model, dividing by 2, and subtracting
> it from the number of parameters (since AIC = -2 (logLik) + 2 p).
>

Yes, I picked this up from the documentation.  They use the IRLS
algorithm, so there is no need to directly compute a loglikelihood.

> Gamma()$aic in turn gives the formula for the AIC of a gamma GLM:
>
> function (y, n, mu, wt, dev)
> {
>    n <- sum(wt)
>    disp <- dev/n
>    -2 * sum(dgamma(y, 1/disp, scale = mu * disp, log = TRUE) *
>        wt) + 2
> }
>
> and ?dgamma gives the
>
>  The Gamma distribution with parameters 'shape' = a and 'scale' = s
>     has density
>
>               f(x)= 1/(s^a Gamma(a)) x^(a-1) e^-(x/s)
>
>
>
> By the way, I find it deeply confusing that Stata refers to what R
> calls the "shape" parameter (a, the thing that appears inside the
> Gamma or lgamma() function) as "shape" .
> Oh well.
>

What I wouldn't give for more notational conventions to help my
intuition.  Maybe one day I'll get there.

>  good luck
>

Thanks.  This sort of confirms my suspicions, and you've shown me how
to go about figuring these things out for myself.  Much appreciated.

>    Ben Bolker
>

Skipper

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] GLM Gamma Family logLik formula?

2009-07-15 Thread Ben Bolker



jseabold wrote:
> 
> Hello all,
> 
> I was wondering if someone can enlighten me as to the difference
> between the logLik in R vis-a-vis Stata for a GLM model with the gamma
> family.
> 
> Stata calculates the loglikelihood of the model as (in R notation)
> some equivalent function of
> 
> -1/scale *
> sum(Y/mu+log(mu)+(scale-1)*log(Y)+log(scale)+scale*lgamma(1/scale))
> 
> where scale (or dispersion) = 1, Y = the response variable, and mu is
> the fitted values given by the fitted model.
> 
> R seems to use a very similar approach, but scale is set equal to the
> calculated dispersion for the gamma model.  However, when I calculate
> the logLik by hand this way the answer differs slightly (about .5)
> from the logLik(glm.m1).
> 
> I haven't been able to figure out why looking at the help.  If anyone
> has any ideas, the insight would be much appreciated.
> 
> 

This is not a full answer, but the beginning of an answer about how to find
out the answer.

stats:::logLik.glm

shows that R computes the log-likelihood (a bit oddly) by
extracting the AIC component of a model, dividing by 2, and subtracting
it from the number of parameters (since AIC = -2 (logLik) + 2 p).

Gamma()$aic in turn gives the formula for the AIC of a gamma GLM:

function (y, n, mu, wt, dev) 
{
n <- sum(wt)
disp <- dev/n
-2 * sum(dgamma(y, 1/disp, scale = mu * disp, log = TRUE) * 
wt) + 2
}

and ?dgamma gives the 

 The Gamma distribution with parameters 'shape' = a and 'scale' = s
 has density

   f(x)= 1/(s^a Gamma(a)) x^(a-1) e^-(x/s)



By the way, I find it deeply confusing that Stata refers to what R 
calls the "shape" parameter (a, the thing that appears inside the 
Gamma or lgamma() function) as "shape" .  
Oh well.

  good luck

Ben Bolker

-- 
View this message in context: 
http://www.nabble.com/GLM-Gamma-Family-logLik-formula--tp24504030p24508346.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] GLM Gamma Family logLik formula?

2009-07-15 Thread Skipper Seabold
Hello all,

I was wondering if someone can enlighten me as to the difference
between the logLik in R vis-a-vis Stata for a GLM model with the gamma
family.

Stata calculates the loglikelihood of the model as (in R notation)
some equivalent function of

-1/scale * sum(Y/mu+log(mu)+(scale-1)*log(Y)+log(scale)+scale*lgamma(1/scale))

where scale (or dispersion) = 1, Y = the response variable, and mu is
the fitted values given by the fitted model.

R seems to use a very similar approach, but scale is set equal to the
calculated dispersion for the gamma model.  However, when I calculate
the logLik by hand this way the answer differs slightly (about .5)
from the logLik(glm.m1).

I haven't been able to figure out why looking at the help.  If anyone
has any ideas, the insight would be much appreciated.

Cheers,

Skipper

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.