Hi,

Yes, the two methods are equivalent.

The p-value R calculates is based on the same t-statistic used in your manual analysis. You can see this by doing the second method:

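# Stack the two groups and prepend a dummy column (0 = df1 rows, 1 = df2 rows)
y2 = rbind(df1, df2)
y2 = cbind(c(0,0,0,1,1,1), y2)
# a*b in a formula already expands to a + b + a:b
summary(lm(y2[,3] ~ y2[,1] * y2[,2]))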

Look at the values you previously calculated and see where they reappear...
print(td)
print(db)
print(sd)

Looked at the other way round, the model with the dummy variable D is one way to explain where the t-test comes from: test H0: b2=0 against H1: b2!=0, under the usual independence and normality assumptions.
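
To make the equivalence concrete, here is a minimal sketch on simulated data (not your data; the names n, x, D, y, fit, f1, f2 are made up for illustration). It assumes the manual test is the textbook comparison of two slopes with a pooled residual variance, which may not match your calculation step for step.

```r
set.seed(1)

# Simulated data: two groups of n points each (hypothetical values)
n <- 10
x <- rep(seq_len(n), 2)
D <- rep(c(0, 1), each = n)   # dummy: 0 = first group, 1 = second group
y <- 1 + 0.5 * D + 2 * x + 0.3 * D * x + rnorm(2 * n)

# Regression route: the t value on the D:x row tests H0: b2 = 0
fit <- lm(y ~ D * x)
t_lm <- coef(summary(fit))["D:x", "t value"]

# Manual route: fit each group separately, then pool the residual variance
f1 <- lm(y ~ x, subset = D == 0)
f2 <- lm(y ~ x, subset = D == 1)
s2 <- (deviance(f1) + deviance(f2)) / (2 * n - 4)   # pooled residual variance
sxx1 <- sum((x[D == 0] - mean(x[D == 0]))^2)
sxx2 <- sum((x[D == 1] - mean(x[D == 1]))^2)
se_diff <- sqrt(s2 * (1 / sxx1 + 1 / sxx2))         # SE of the slope difference
t_manual <- unname((coef(f2)["x"] - coef(f1)["x"]) / se_diff)

# The two t statistics should agree up to rounding error
all.equal(t_lm, t_manual)
```

Both routes use 2n - 4 residual degrees of freedom, so the p-values should agree as well.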

It's probably preferable to use the automatic lm-based method, because then you specify the model explicitly, whereas with the recipe-based approach the actual models and hypotheses you are testing may not be clear. Plus you get nice diagnostic statistics and pretty graphs. The downside is that you might get lured into complacency...

Zhou Fang

PS: Your model equation isn't right. Both methods also allow the intercept to vary between groups. So really you want
y = c + D.b0 + b1.x + D.b2.x
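
As a rough check of how that equation maps onto an lm formula, the design matrix can be inspected directly. This is only a sketch with made-up toy values for D and x (three points per group); the variable names are hypothetical.

```r
# Toy values, three points per group (hypothetical)
D <- c(0, 0, 0, 1, 1, 1)
x <- c(1, 2, 3, 1, 2, 3)

# One column per term of the equation: c, D.b0, b1.x, D.b2.x
X <- model.matrix(~ D * x)
colnames(X)
```

The columns should come out as the intercept, D, x and D:x, matching c, b0, b1 and b2 in that order.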

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
