Re: [R] Is this a valid syntax for lm()

Rui Barradas Wed, 12 Nov 2025 09:32:35 -0800

At 17:12 on 12/11/2025, Rui Barradas wrote:

At 16:30 on 12/11/2025, Brian Smith wrote:

Hi,


I have the code below:

ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
group1 <- head(gl(2, 10, 22, labels = c("Ctl1","Trt1")), 20)
weight <- c(ctl, trt)
dat = as.data.frame(cbind(weight, group, group1))
lm.D9 <- lm(weight ~ group * group1 - 1 - group1, dat)

I want to include the interaction between the two variables group and
group1, but I do not want to include level-0 for group1 nor the
intercept.

Therefore I used (-1 - group1) in the formula.

I would like to know if the above is a valid syntax for the stated model.

Thanks and regards,

______________________________________________
[email protected] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help

PLEASE do read the posting guide https://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.

Hello,

Yes, that syntax is valid. But isn't

lm.D9b <- lm(weight ~ 0 + group + group:group1, dat)


more readable?

You can check that the two models are the same with


summary(lm.D9)
summary(lm.D9b)

The following call will tell where the objects returned by those two calls to lm() are different, giving further arguments to prefer model lm.D9b.



all.equal(lm.D9, lm.D9b, check.attributes = FALSE)
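
As a minimal sketch (not part of the original exchange), the equivalence of the two formulas can also be checked without fitting anything, by comparing the terms R expands them to. The names f1, f2, t1, t2 and same_terms are only illustrative.

```r
# The two formulas from the thread
f1 <- weight ~ group * group1 - 1 - group1
f2 <- weight ~ 0 + group + group:group1

# terms() expands a formula symbolically; no data are needed
t1 <- terms(f1)
t2 <- terms(f2)

# Term labels kept after expansion, and the intercept flag (0 = no intercept)
attr(t1, "term.labels")
attr(t2, "term.labels")
attr(t1, "intercept")
attr(t2, "intercept")

# TRUE when both formulas describe the same set of terms
same_terms <- identical(attr(t1, "term.labels"), attr(t2, "term.labels")) &&
  attr(t1, "intercept") == attr(t2, "intercept")
same_terms
```

This only compares the symbolic expansion; how each term is coded in the design matrix still depends on the classes of the variables in the data.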


Hope this helps,

Rui Barradas


Hello,

Sorry for my hasty post; there is another problem with your code.
The dat creation code is wrong:


dat = as.data.frame(cbind(weight, group, group1))

first creates a matrix with cbind, then coerces the matrix to a data.frame. The error is in creating a matrix. Matrices can only have one data class, so all variables become numeric and the factors group and group1 are no longer factors.


This error will impact everything that follows.
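
A small stand-alone sketch of that coercion, using toy vectors x and f that are not from the original code (the names m, d_bad and d_ok are likewise only illustrative):

```r
# Toy data: one numeric vector and one factor
x <- c(1.5, 2.5, 3.5, 4.5)
f <- gl(2, 2, labels = c("a", "b"))

# cbind() builds a matrix, which holds a single type,
# so the factor is reduced to its integer codes
m <- cbind(x, f)
typeof(m)
m[, "f"]

# Converting that matrix to a data frame cannot bring the factor back
d_bad <- as.data.frame(m)
# data.frame() keeps each column's own class
d_ok <- data.frame(x, f)

sapply(d_bad, class)
sapply(d_ok, class)
```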

The correct way is to use data.frame(weight, group, group1). See the code below. The models' coefficients, standard errors, etc. are different, and so are the predictions from the models.



ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
group1 <- head(gl(2, 10, 22, labels = c("Ctl1","Trt1")), 20)
weight <- c(ctl, trt)

wrong_dat <- as.data.frame(cbind(weight, group, group1))
right_dat <- data.frame(weight, group, group1)
str(wrong_dat)
#> 'data.frame':    20 obs. of  3 variables:
#>  $ weight: num  4.17 5.58 5.18 6.11 4.5 4.61 5.17 4.53 5.33 5.14 ...
#>  $ group : num  1 1 1 1 1 1 1 1 1 1 ...
#>  $ group1: num  1 1 1 1 1 1 1 1 1 1 ...
str(right_dat)
#> 'data.frame':    20 obs. of  3 variables:
#>  $ weight: num  4.17 5.58 5.18 6.11 4.5 4.61 5.17 4.53 5.33 5.14 ...
#>  $ group : Factor w/ 2 levels "Ctl","Trt": 1 1 1 1 1 1 1 1 1 1 ...
#>  $ group1: Factor w/ 2 levels "Ctl1","Trt1": 1 1 1 1 1 1 1 1 1 1 ...

wrong_lm.D9 <- lm(weight ~ group * group1 - 1 - group1, wrong_dat)
right_lm.D9 <- lm(weight ~ group * group1 - 1 - group1, right_dat)
summary(wrong_lm.D9)
#>
#> Call:
#> lm(formula = weight ~ group * group1 - 1 - group1, data = wrong_dat)
#>
#> Residuals:
#>     Min      1Q  Median      3Q     Max
#> -1.0710 -0.4938  0.0685  0.2462  1.3690
#>
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)
#> group          7.7335     0.4540   17.04 1.51e-12 ***
#> group:group1  -2.7015     0.2462  -10.97 2.10e-09 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.6964 on 18 degrees of freedom
#> Multiple R-squared:  0.9818, Adjusted R-squared:  0.9798
#> F-statistic: 485.1 on 2 and 18 DF,  p-value: < 2.2e-16
summary(right_lm.D9)
#>
#> Call:
#> lm(formula = weight ~ group * group1 - 1 - group1, data = right_dat)
#>
#> Residuals:
#>     Min      1Q  Median      3Q     Max
#> -1.0710 -0.4938  0.0685  0.2462  1.3690
#>
#> Coefficients: (2 not defined because of singularities)
#>                     Estimate Std. Error t value Pr(>|t|)
#> groupCtl              5.0320     0.2202   22.85 9.55e-15 ***
#> groupTrt              4.6610     0.2202   21.16 3.62e-14 ***
#> groupCtl:group1Trt1       NA         NA      NA       NA
#> groupTrt:group1Trt1       NA         NA      NA       NA
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.6964 on 18 degrees of freedom
#> Multiple R-squared:  0.9818, Adjusted R-squared:  0.9798
#> F-statistic: 485.1 on 2 and 18 DF,  p-value: < 2.2e-16

# generate data for predict()
g <- gl(2, 1, labels = c("Ctl","Trt"))
g1 <- gl(2, 1, labels = c("Ctl1","Trt1"))
# wrong_new must be coerced to numeric
wrong_new <- expand.grid(group = g, group1 = g1)
wrong_new[] <- lapply(wrong_new, as.numeric)
# keep right_new as factors
right_new <- expand.grid(group = g, group1 = g1)

predict(wrong_lm.D9, newdata = wrong_new)
#>       1       2       3       4
#>  5.0320 10.0640  2.3305  4.6610
predict(right_lm.D9, newdata = right_new)
#> Warning in predict.lm(right_lm.D9, newdata = right_new): prediction from
#> rank-deficient fit; attr(*, "non-estim") has doubtful cases
#>     1     2     3     4
#> 5.032 4.661 5.032 4.661
#> attr(,"non-estim")
#> 2 3
#> 2 3
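
As a hedged aside that is not part of the reprex above: the two NA coefficients in the factor-based fit can be traced to the design matrix. In this data group1 has the same pattern as group, so one interaction column is expected to be all zeros and the other to duplicate groupTrt. The sketch below rebuilds only the two factors and inspects the matrix; X is just an illustrative name.

```r
# Same factors as in the example above
group  <- gl(2, 10, 20, labels = c("Ctl", "Trt"))
group1 <- head(gl(2, 10, 22, labels = c("Ctl1", "Trt1")), 20)

# Design matrix for the no-intercept model with the interaction
X <- model.matrix(~ 0 + group + group:group1, data.frame(group, group1))
colnames(X)

# Is the Ctl-by-Trt1 column empty (no such observations)?
all(X[, "groupCtl:group1Trt1"] == 0)

# Does the Trt-by-Trt1 column repeat the groupTrt column?
identical(X[, "groupTrt:group1Trt1"], X[, "groupTrt"])

# Numerical rank of the design matrix versus its number of columns
qr(X)$rank
ncol(X)
```

When the rank is lower than the number of columns, lm() cannot estimate every coefficient and reports the redundant ones as NA.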



Hope this helps,

Rui Barradas

