The function 'regsubsets' appears to calculate a BIC value that is different from that returned by the function 'BIC'. The latter is explained in the documentation, but I can't find an expression for the statistic returned by 'regsubsets'.

Incidentally, both of these differ from the BIC that is given in Ramsey and Schafer's, The Statistical Sleuth. I assume these are all linear transformations of each other, but I'd like to know the 'regsubsets' formula (so that I can develop a way to do all-subsets selection based on the AIC rather than the BIC).

The following code defines a function that illustrates the issue.
Thanks
-Paul


script.ic <- function() {

library(datasets)
print(names(airquality))              # Ozone Solar.R Wind Temp Month Day

# Fit a model with two predictors
mod1 <- lm(Ozone ~ Wind + Temp, data=airquality)
npar <- length(mod1$coef)+1         # no. parameters in fitted model,
                                    #   including s2, is 4
nobs <- length(mod1$fitted)         # no. of observations = 116
s2 <- summary(mod1)$sigma2         # MSE = 477.6371
logL <- as.vector(logLik(mod1))     # log likelihood = -520.8705

# Use the R function BIC, defined as: -2*log-likelihood + npar*log(nobs)
tmp1 <- BIC(mod1)                           # 1060.755
tmp2 <- -2 * (-520.8705) + 4 * log(116)     # 1060.755, agrees
cat(paste("\nFrom R's BIC:",signif(tmp1,5),"(",signif(tmp2,5),
   "obtained 'by hand')\n\n"))

# Now see how 'regsubsets' calculates the BIC
tmp3 <- regsubsets(Ozone ~ Solar.R + Wind + Temp, data=airquality)
tmp3.s <- summary(tmp3)

# 'mod1' is the second model in 'tmp3'; what is the formula for this BIC?
cat("\nThe corresponding model from 'regsubsets':\n")
print(tmp3.s$which[2,])
tmp4 <- tmp3.s$bic[2]                       # -82.52875
cat(paste("\nBIC =",signif(tmp4,5),"\n"))

# Incidentally, the 'rsq' and 'rss' components of tmp3.s do not agree
# with the values in the 'mod1' lm output object.

# Just for kicks, try the formula for Schwarz's BIC from Ramsey & Schafer,
# Statistical Sleuth:  nobs*log(MSE) + npar*log(nobs))
tmp5 <- 116 * log(477.6371) + 4*log(116)    # 734.6011
cat(paste("\nStat. Sleuth's BIC =",signif(tmp5,5),"\n"))
cat("\nUff da!\n")

browser()

}

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to