Re: [R] Question About lm()

2022-02-09 Thread PIKAL Petr
Hi

Do these two links explain it well enough?

https://stats.stackexchange.com/questions/26176/removal-of-statistically-significant-intercept-term-increases-r2-in-linear-mo

https://stackoverflow.com/questions/57415793/r-squared-in-lm-for-zero-intercept-model
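
The effect discussed at those links can be reproduced with a small simulation. This is only a sketch on made-up data: the variable names are borrowed from the question below, and the sample size, group effects, and the response mean of roughly 50 are all assumptions chosen to make the difference visible.

```r
# Simulated data; every number here is hypothetical.
set.seed(1)
n <- 60
xCat  <- factor(rep(c("a", "b", "c"), each = n / 3))
xCont <- runif(n)
yResp <- 50 + 2 * as.integer(xCat) + 3 * xCont + rnorm(n)

fit1 <- lm(yResp ~ -1 + xCat + xCont)  # Model 1: no intercept
fit2 <- lm(yResp ~ xCat + xCont)       # Model 2: with intercept

# The two parameterizations give the same residuals
# (up to floating-point error).
all.equal(unname(resid(fit1)), unname(resid(fit2)))

# summary() nevertheless reports different R-squared values ...
summary(fit1)$r.squared
summary(fit2)$r.squared

# ... and F statistics with different numerator degrees of freedom:
# 4 without the intercept, 3 with it.
summary(fit1)$fstatistic
summary(fit2)$fstatistic
```

With a response mean this far from zero, the no-intercept R-squared should come out very close to 1, while the intercept version reflects only the variation around the mean.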

Cheers
Petr
> -----Original Message-----
> From: R-help On Behalf Of Bromaghin, Jeffrey F via R-help
> Sent: Wednesday, February 9, 2022 11:01 PM
> To: r-help@r-project.org
> Subject: [R] Question About lm()
> 
> Hello,
> 
> I was constructing a simple linear model with one categorical (3-levels) and
> one quantitative predictor variable for a colleague. I estimated model
> parameters with and without an intercept, sometimes called reference cell
> coding and cell means coding.
> 
> Model 1: yResp ~ -1 + xCat + xCont
> Model 2: yResp ~ xCat + xCont
> 
> These models are equivalent and the estimated coefficients come out fine, but
> the R-squared and F statistics returned by summary() differ markedly. I spent
> some time looking at the code for both lm() and summary.lm() but did not find
> the source of the difference. aov() and anova() results also differ, so I
> suspect the issue involves how the sums of squares are being computed. I've
> also spent some time trying to search online for information on this, without
> success. I haven't used lm() for quite a while, but my memory is that these
> differences didn't occur in the distant past when I was teaching.
> 
> Thanks in advance for any insights you might have,
> Jeff
> 
> Jeffrey F. Bromaghin
> Research Statistician
> USGS Alaska Science Center
> 907-786-7086
> Jeffrey Bromaghin, Ph.D. | U.S. Geological Survey
> (usgs.gov)<https://www.usgs.gov/staff-profiles/jeffrey-bromaghin>
> Ecosystems Analytics | U.S. Geological Survey
> (usgs.gov)<https://www.usgs.gov/centers/alaska-science-center/science/ecosystems-analytics>
> 
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question About lm()

2022-02-09 Thread Ivan Krylov
On Wed, 9 Feb 2022 22:00:40 +
"Bromaghin, Jeffrey F via R-help"  wrote:

> These models are equivalent and the estimated coefficients come out
> fine, but the R-squared and F statistics returned by summary() differ
> markedly.

Is the mean of yResp far from zero? Here's what summary.lm says about
that:

>> r.squared: R^2, the ‘fraction of variance explained by the model’,
>>
>>     R^2 = 1 - Sum(R[i]^2) / Sum((y[i] - y*)^2),
>>
>>     where y* is the mean of y[i] if there is an intercept and
>>     zero otherwise.
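
The formula can be checked by hand. The following is a rough sketch on simulated data: the data frame `d`, the group effects, and the response mean of about 100 are invented for illustration, and only the variable names come from the original post.

```r
# Hypothetical data with a response mean far from zero.
set.seed(42)
d <- data.frame(
  xCat  = factor(rep(c("a", "b", "c"), times = 20)),
  xCont = rnorm(60)
)
d$yResp <- 100 + c(0, 5, -5)[as.integer(d$xCat)] + 2 * d$xCont + rnorm(60)

fit_noint <- lm(yResp ~ -1 + xCat + xCont, data = d)
fit_int   <- lm(yResp ~ xCat + xCont, data = d)

# Both fits have the same residuals, so one residual sum of squares
# serves for both.
rss <- sum(resid(fit_int)^2)

# y* = mean(y) when there is an intercept, y* = 0 otherwise.
r2_int   <- 1 - rss / sum((d$yResp - mean(d$yResp))^2)
r2_noint <- 1 - rss / sum(d$yResp^2)

# These should agree with what summary.lm() reports.
all.equal(r2_int,   summary(fit_int)$r.squared)
all.equal(r2_noint, summary(fit_noint)$r.squared)
```

Only the denominator changes between the two values, which is why a response mean far from zero pushes the no-intercept R^2 toward 1.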

-- 
Best regards,
Ivan



Re: [R] Question About lm()

2022-02-09 Thread David Winsemius
The models are NOT equivalent. Why would you think they were?

-- 
David

> On Feb 9, 2022, at 11:10 PM, Bromaghin, Jeffrey F via R-help 
>  wrote:
> 
> Hello,
> 
> I was constructing a simple linear model with one categorical (3-levels) and 
> one quantitative predictor variable for a colleague. I estimated model 
> parameters with and without an intercept, sometimes called reference cell 
> coding and cell means coding.
> 
> Model 1: yResp ~ -1 + xCat + xCont
> Model 2: yResp ~ xCat + xCont
> 
> These models are equivalent and the estimated coefficients come out fine, but 
> the R-squared and F statistics returned by summary() differ markedly. I spent 
> some time looking at the code for both lm() and summary.lm() but did not find 
> the source of the difference. aov() and anova() results also differ, so I 
> suspect the issue involves how the sums of squares are being computed. I've 
> also spent some time trying to search online for information on this, without 
> success. I haven't used lm() for quite a while, but my memory is that these 
> differences didn't occur in the distant past when I was teaching.
> 
> Thanks in advance for any insights you might have,
> Jeff
> 