Re: [R] can I do this with R?

Frank E Harrell Jr Thu, 29 May 2008 05:46:02 -0700

Xiaohui Chen wrote:

Frank E Harrell Jr wrote:
Xiaohui Chen wrote:
The step or stepAIC functions do the job. You can opt to use BIC by changing the penalty multiplier.
I think AIC and BIC are not limited to comparing two pre-defined models; they can also be used as model search criteria. You could enumerate the information criteria for all possible models if the full model is relatively small. But this does not generally scale to practical high-dimensional applications. Hence, it is often only possible to find a 'best' model that is a local optimum, e.g. as measured by AIC/BIC.
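A minimal sketch of the step approach described above, assuming simulated data: the names dat, x1..x4, fit_null and upper are illustrative and do not come from this thread. It runs forward selection for a logistic model, first with the default penalty multiplier k = 2 (AIC) and then with k = log(n) (BIC). The cautions about stepwise selection raised in the replies still apply.

```r
# Simulated data; all object and variable names are illustrative assumptions.
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(-0.5 + 1.0 * dat$x1 - 0.8 * dat$x2))

# Start from the intercept-only logistic model and allow x1..x4 to enter.
fit_null <- glm(y ~ 1, family = binomial, data = dat)
upper <- ~ x1 + x2 + x3 + x4

# Forward stepwise by AIC (k = 2 is the default penalty multiplier).
fit_aic <- step(fit_null, scope = list(lower = ~ 1, upper = upper),
                direction = "forward", trace = 0)

# Same search by BIC: set the penalty multiplier to log(n).
fit_bic <- step(fit_null, scope = list(lower = ~ 1, upper = upper),
                direction = "forward", k = log(n), trace = 0)

formula(fit_aic)
formula(fit_bic)
```

Because every candidate term here costs one degree of freedom, the BIC search follows the same path as the AIC search and simply stops earlier or at the same point.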
Sure you can use them that way, and they may perform better than other measures, but the resulting model will be highly biased (regression coefficients biased away from zero). AIC and BIC were not designed to be used in this fashion originally. Optimizing AIC or BIC will not produce well-calibrated models as does penalizing a large model.
Sure, I agree with this point. AIC is used to correct the bias of the estimates that minimize the KL distance to the true model, provided the assumed model family contains the true model. BIC is designed to approximate the marginal likelihood of the model. Those are all post-selection estimation methods. For simultaneous variable selection and estimation, there are better penalizations such as the L1 penalty, which is much better than AIC/BIC in terms of consistency.

The main point is to use some kind of penalization. Lots of people are using AIC/BIC to select models using ordinary unpenalized least squares or maximum likelihood, and that doesn't work well.

On the other hand, I would not say that BIC over-penalizes. Instead, I think AIC usually under-penalizes larger models, in the sense that it has a positive probability of incorporating irrelevant variables in linear models.
If you put some constraints on the process (e.g., if using AIC to find the optimum penalty in penalized maximum likelihood estimation), AIC works very well and BIC results in far too much shrinkage (underfitting). If using a dangerous process such as stepwise variable selection, the more conservative BIC may be better in some sense, worse in others. The main problem with stepwise variable selection is the use of significance levels for entry below 1.0 and especially below 0.1.
What's the point of using AIC on penalized MLE? I think generally you can view the penalty as a prior regularization and use a suitable optimization algorithm to find the MAP estimate.

Simulations show that AIC can choose the optimum L2 penalty in PMLE using the effective degrees of freedom of each model. Optimizing BIC chooses far too much shrinkage.
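A rough sketch of how an L2 penalty could be chosen by AIC or BIC using effective degrees of freedom. It assumes a Gaussian linear model fitted by ridge regression rather than the general penalized maximum likelihood setting discussed above, and all names (X, beta, lambdas, crit) and the simulated data are hypothetical. The effective degrees of freedom are taken as the trace of the ridge hat matrix, and the criteria are written up to additive constants.

```r
# Simulated data; every name and value below is an illustrative assumption.
set.seed(2)
n <- 100
p <- 15
X <- scale(matrix(rnorm(n * p), n, p))      # centered, standardized predictors
beta <- c(rep(0.5, 5), rep(0, p - 5))       # hypothetical true coefficients
y <- drop(X %*% beta + rnorm(n))
y <- y - mean(y)                            # center y so no intercept is needed

lambdas <- c(0, 2^(-2:10))                  # candidate L2 penalties
crit <- t(sapply(lambdas, function(lam) {
  A   <- solve(crossprod(X) + lam * diag(p))  # (X'X + lambda I)^{-1}
  H   <- X %*% A %*% t(X)                     # ridge hat matrix
  edf <- sum(diag(H))                         # effective degrees of freedom
  rss <- sum((y - H %*% y)^2)
  c(lambda = lam, edf = edf,
    aic = n * log(rss / n) + 2 * edf,         # Gaussian AIC up to a constant
    bic = n * log(rss / n) + log(n) * edf)    # Gaussian BIC up to a constant
}))

lam_aic <- crit[which.min(crit[, "aic"]), "lambda"]
lam_bic <- crit[which.min(crit[, "bic"]), "lambda"]
c(aic = lam_aic, bic = lam_bic)
```

Since the BIC multiplier log(n) exceeds 2 here, the penalty chosen by BIC is never smaller than the one chosen by AIC. Whether it amounts to too much shrinkage is a question for a simulation in which predictive accuracy is also measured.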


Frank


Frank


X

Frank E Harrell Jr wrote:

Smita Pakhale wrote:

Hi Maria,

But why do you want to use forwards or backwards
methods? These all are 'backward' methods of modeling.
Try using AIC or BIC. BIC is much better than AIC.
And, you do not have to believe me or any one else on
this.

How does that help? BIC gives too much penalization in certain contexts; both AIC and BIC were designed to compare two pre-specified models. They were not designed to fix problems of stepwise variable selection.


Frank


Just make a small data set with a few variables with
known relationship amongst them. With this simulated
data set, use all your modeling methods: backwards,
forwards, AIC, BIC etc and then see which one gives
you an answer closest to the truth. The beauty of using
a simulated dataset is that you 'know' the truth, as
you are the 'creator' of it!

smita
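One way the simulation suggested above might be set up, as a sketch under stated assumptions: only x1 truly affects the outcome, x2..x4 are pure noise, and backward elimination is run with the AIC and BIC penalty multipliers. The helper names sim_once and pick, the sample size, and the coefficient are all hypothetical choices, not taken from this thread.

```r
# Illustrative simulation; names, sample size and effect size are assumptions.
set.seed(3)
sim_once <- function(n = 150) {
  d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
  d$y <- rbinom(n, 1, plogis(0.8 * d$x1))   # only x1 truly matters
  full <- glm(y ~ x1 + x2 + x3 + x4, family = binomial, data = d)
  # Backward elimination with penalty multiplier k; return the retained terms.
  pick <- function(k) {
    names(coef(step(full, direction = "backward", k = k, trace = 0)))[-1]
  }
  list(aic = pick(2), bic = pick(log(n)))
}

runs  <- replicate(100, sim_once(), simplify = FALSE)
noise <- c("x2", "x3", "x4")

# Share of simulated data sets in which at least one noise variable is kept.
rate_aic <- mean(sapply(runs, function(r) any(noise %in% r$aic)))
rate_bic <- mean(sapply(runs, function(r) any(noise %in% r$bic)))
c(aic = rate_aic, bic = rate_bic)
```

This only counts how often noise variables survive selection. Judging bias in the retained coefficients or calibration of the final model would need further measures.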

--- Charilaos Skiadas <[EMAIL PROTECTED]> wrote:

A google search for "logistic regression with
stepwise forward in r" returns the following post:

https://stat.ethz.ch/pipermail/r-help/2003-December/043645.html

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College

On May 28, 2008, at 7:01 AM, Maria wrote:

Hello,
I am just about to install R and was wondering about a few things.
I have only worked in Matlab because I wanted to do a logistic
regression. However Matlab does not do logistic regression with
stepwise forward method. Therefore I thought about testing R. So my
question is
can I do logistic regression with stepwise forward in R?

Thanks /M


--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
