Re: [R] Where is the SD in output of glm with Gaussian distribution

2019-12-09 Thread Fox, John
Dear Bert,

It's perhaps a bit pedantic to point it out, but the dispersion is estimated 
from the Pearson statistic (sum of squared residuals or weighted squared 
residuals) not from the residual deviance. You can see this in the code for 
summary.glm().

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http::/socserv.mcmaster.ca/jfox

> On Dec 9, 2019, at 10:45 AM, Bert Gunter  wrote:
> 
> In addition, as John's included output shows, only 1 parameter, the
> intercept, is fit. As he also said, the sd is estimated from the residual
> deviance -- it is not a model parameter.
> 
> Suggest you spend some time with a glm tutorial/text.
> 
> Bert
> 
> On Mon, Dec 9, 2019 at 7:17 AM Marc Girondot via R-help <
> r-help@r-project.org> wrote:
> 
>> Let do a simple glm:
>> 
>>> y=rnorm(100)
>>> gnul <- glm(y ~ 1)
>>> gnul$coefficients
>> (Intercept)
>>   0.1399966
>> 
>> The logLik shows the fit of two parameters (DF=2) (intercept) and sd
>> 
>>> logLik(gnul)
>> 'log Lik.' -138.7902 (df=2)
>> 
>> But where is the sd term in the glm object?
>> 
>> If I do the same with optim, I can have its value
>> 
>>> dnormx <- function(x, data) {1E9*-sum(dnorm(data, mean=x["mean"],
>> sd=x["sd"], log = TRUE))}
>>> parg <- c(mean=0, sd=1)
>>> o0 <- optim(par = parg, fn=dnormx, data=y, method="BFGS")
>>> o0$value/1E9
>> [1] 138.7902
>>> o0$par
>>  meansd
>> 
>> 0.1399966 0.9694405
>> 
>> But I would like have the value in the glm.
>> 
>> (and in the meantime, I don't understand why gnul$df.residual returned
>> 99... for me it should be 98=100 - number of observations) -1 (for mean)
>> - 1 (for sd); but it is statistical question... I have asked it in
>> crossvalidated [no answer still] !)
>> 
>> Thanks
>> 
>> Marc
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Where is the SD in output of glm with Gaussian distribution

2019-12-09 Thread Marc Girondot via R-help
Le 09/12/2019 à 16:45, Bert Gunter a écrit :
> In addition, as John's included output shows, only 1 parameter, the 
> intercept, is fit. As he also said, the sd is estimated from the 
> residual deviance -- it is not a model parameter.
>
> Suggest you spend some time with a glm tutorial/text.

I tried ! But I miss this point. I understand now this point. Thanks a  
lot... big progress for me.

But still I don't understand why AIC calculation uses 2 parameters if 
the SD is estimated from the residual deviance.

 > y=rnorm(100)
 > gnul <- glm(y ~ 1)
 > logLik(gnul)
'log Lik.' -136.4343 (df=2)
 > AIC(gnul)
[1] 276.8687
 > -2*logLik(gnul)+2*2
'log Lik.' 276.8687 (df=2)

This is not intuitive when to count SD as a parameter (in AIC) or not in 
df.resuidual !


>
> Bert
>
> On Mon, Dec 9, 2019 at 7:17 AM Marc Girondot via R-help 
> mailto:r-help@r-project.org>> wrote:
>
> Let do a simple glm:
>
>  > y=rnorm(100)
>  > gnul <- glm(y ~ 1)
>  > gnul$coefficients
> (Intercept)
>    0.1399966
>
> The logLik shows the fit of two parameters (DF=2) (intercept) and sd
>
>  > logLik(gnul)
> 'log Lik.' -138.7902 (df=2)
>
> But where is the sd term in the glm object?
>
> If I do the same with optim, I can have its value
>
>  > dnormx <- function(x, data) {1E9*-sum(dnorm(data, mean=x["mean"],
> sd=x["sd"], log = TRUE))}
>  > parg <- c(mean=0, sd=1)
>  > o0 <- optim(par = parg, fn=dnormx, data=y, method="BFGS")
>  > o0$value/1E9
> [1] 138.7902
>  > o0$par
>   mean    sd
>
> 0.1399966 0.9694405
>
> But I would like have the value in the glm.
>
> (and in the meantime, I don't understand why gnul$df.residual
> returned
> 99... for me it should be 98=100 - number of observations) -1 (for
> mean)
> - 1 (for sd); but it is statistical question... I have asked it in
> crossvalidated [no answer still] !)
>
> Thanks
>
> Marc
>
> __
> R-help@r-project.org  mailing list --
> To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Where is the SD in output of glm with Gaussian distribution

2019-12-09 Thread Bert Gunter
In addition, as John's included output shows, only 1 parameter, the
intercept, is fit. As he also said, the sd is estimated from the residual
deviance -- it is not a model parameter.

Suggest you spend some time with a glm tutorial/text.

Bert

On Mon, Dec 9, 2019 at 7:17 AM Marc Girondot via R-help <
r-help@r-project.org> wrote:

> Let do a simple glm:
>
>  > y=rnorm(100)
>  > gnul <- glm(y ~ 1)
>  > gnul$coefficients
> (Intercept)
>0.1399966
>
> The logLik shows the fit of two parameters (DF=2) (intercept) and sd
>
>  > logLik(gnul)
> 'log Lik.' -138.7902 (df=2)
>
> But where is the sd term in the glm object?
>
> If I do the same with optim, I can have its value
>
>  > dnormx <- function(x, data) {1E9*-sum(dnorm(data, mean=x["mean"],
> sd=x["sd"], log = TRUE))}
>  > parg <- c(mean=0, sd=1)
>  > o0 <- optim(par = parg, fn=dnormx, data=y, method="BFGS")
>  > o0$value/1E9
> [1] 138.7902
>  > o0$par
>   meansd
>
> 0.1399966 0.9694405
>
> But I would like have the value in the glm.
>
> (and in the meantime, I don't understand why gnul$df.residual returned
> 99... for me it should be 98=100 - number of observations) -1 (for mean)
> - 1 (for sd); but it is statistical question... I have asked it in
> crossvalidated [no answer still] !)
>
> Thanks
>
> Marc
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Where is the SD in output of glm with Gaussian distribution

2019-12-09 Thread Eric Berger
summary(gnul)

shows the std error of the coefficient estimate

On Mon, Dec 9, 2019 at 5:16 PM Marc Girondot via R-help
 wrote:
>
> Let do a simple glm:
>
>  > y=rnorm(100)
>  > gnul <- glm(y ~ 1)
>  > gnul$coefficients
> (Intercept)
>0.1399966
>
> The logLik shows the fit of two parameters (DF=2) (intercept) and sd
>
>  > logLik(gnul)
> 'log Lik.' -138.7902 (df=2)
>
> But where is the sd term in the glm object?
>
> If I do the same with optim, I can have its value
>
>  > dnormx <- function(x, data) {1E9*-sum(dnorm(data, mean=x["mean"],
> sd=x["sd"], log = TRUE))}
>  > parg <- c(mean=0, sd=1)
>  > o0 <- optim(par = parg, fn=dnormx, data=y, method="BFGS")
>  > o0$value/1E9
> [1] 138.7902
>  > o0$par
>   meansd
>
> 0.1399966 0.9694405
>
> But I would like have the value in the glm.
>
> (and in the meantime, I don't understand why gnul$df.residual returned
> 99... for me it should be 98=100 - number of observations) -1 (for mean)
> - 1 (for sd); but it is statistical question... I have asked it in
> crossvalidated [no answer still] !)
>
> Thanks
>
> Marc
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Where is the SD in output of glm with Gaussian distribution

2019-12-09 Thread Fox, John
Dear Marc,

For your simple model, the standard deviation of y is the square-root of the 
estimated dispersion parameter:

> set.seed(123)
> y <- rnorm(100)
> gnul <- glm(y ~ 1)
> summary(gnul)

Call:
glm(formula = y ~ 1)

Deviance Residuals: 
 Min1QMedian3Q   Max  
-2.39957  -0.58426  -0.02865   0.60141   2.09693  

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.090410.091280.990.324

(Dispersion parameter for gaussian family taken to be 0.8332328)

Null deviance: 82.49  on 99  degrees of freedom
Residual deviance: 82.49  on 99  degrees of freedom
AIC: 268.54

Number of Fisher Scoring iterations: 2

> sqrt(0.8332328)
[1] 0.9128159
> mean(y)
[1] 0.09040591
> sd(y)
[1] 0.9128159

I hope this helps,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http::/socserv.mcmaster.ca/jfox

> On Dec 9, 2019, at 10:16 AM, Marc Girondot via R-help  
> wrote:
> 
> Let do a simple glm:
> 
> > y=rnorm(100)
> > gnul <- glm(y ~ 1)
> > gnul$coefficients
> (Intercept)
>   0.1399966
> 
> The logLik shows the fit of two parameters (DF=2) (intercept) and sd
> 
> > logLik(gnul)
> 'log Lik.' -138.7902 (df=2)
> 
> But where is the sd term in the glm object?
> 
> If I do the same with optim, I can have its value
> 
> > dnormx <- function(x, data) {1E9*-sum(dnorm(data, mean=x["mean"], 
> > sd=x["sd"], log = TRUE))}
> > parg <- c(mean=0, sd=1)
> > o0 <- optim(par = parg, fn=dnormx, data=y, method="BFGS")
> > o0$value/1E9
> [1] 138.7902
> > o0$par
>  meansd
> 
> 0.1399966 0.9694405
> 
> But I would like have the value in the glm.
> 
> (and in the meantime, I don't understand why gnul$df.residual returned 99... 
> for me it should be 98=100 - number of observations) -1 (for mean) - 1 (for 
> sd); but it is statistical question... I have asked it in crossvalidated [no 
> answer still] !)
> 
> Thanks
> 
> Marc
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.