Milicic B. Marko wrote:
The only solution I can see is fitting all possible two-factor models, enabling
interactions, and then assessing whether the interaction term is significant...


any more ideas?

Please don't suggest such a thing unless you do simulations to back up its predictive performance, type I error properties, and the impact of collinearities. You'll find this approach works as well as the U.S. economy.
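
For a rough sense of the multiplicity involved, the sketch below counts the two-variable models and the number of interaction terms that would be expected to look significant purely by chance. The variable count of 1000 and the 0.05 threshold are assumptions for illustration (the original post says only "thousands"), and the calculation assumes every test holds its nominal level, which collinearity may well prevent.

```python
from math import comb

def expected_false_positives(n_variables, alpha=0.05):
    """Return (number of variable pairs, expected chance findings).

    Assumes no true interactions exist and that each interaction test
    rejects with probability alpha under the null hypothesis.
    """
    n_pairs = comb(n_variables, 2)
    return n_pairs, alpha * n_pairs

# Hypothetical size: 1000 candidate variables, tested at the 0.05 level.
n_pairs, n_false = expected_false_positives(1000)
print(f"{n_pairs} pairwise models, about {n_false:.0f} chance findings")
```

Under these assumed values there are 499,500 pairwise models and roughly 24,975 interaction terms that would pass the threshold with no real signal present.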

Frank Harrell






Milicic B. Marko wrote:
I have a huge data set with thousands of variables and one binary
variable. I know that most of the variables are correlated and are not
good predictors... but...

It is very hard to start modeling with such a huge dataset. What would
be your suggestion? How do I make a first cut: how do I eliminate most
of the variables without ignoring potential interactions? For
example, maybe variable A is not a good predictor and variable B is not
a good predictor either, but A and B together are good
predictors...
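
The situation described above can be made concrete with a toy example. The data below are hypothetical and deliberately extreme: the binary outcome is 1 exactly when A and B differ, so neither variable is informative alone while the pair is perfectly informative. This is only a sketch of the pattern, not a screening method.

```python
from itertools import product

# Hypothetical toy data: 25 copies of each (A, B) combination, with the
# outcome equal to 1 exactly when A and B differ (an XOR pattern).
rows = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2) for _ in range(25)]

def outcome_rate(data, keep):
    """Proportion of rows with outcome 1 among rows where keep(a, b) is true."""
    outcomes = [y for a, b, y in data if keep(a, b)]
    return sum(outcomes) / len(outcomes)

# One variable at a time: the outcome rate is the same at both levels.
marginal_a = [outcome_rate(rows, lambda a, b, v=v: a == v) for v in (0, 1)]
marginal_b = [outcome_rate(rows, lambda a, b, v=v: b == v) for v in (0, 1)]

# Both variables together: each combination determines the outcome.
joint = {
    (va, vb): outcome_rate(rows, lambda a, b, va=va, vb=vb: (a, b) == (va, vb))
    for va, vb in product((0, 1), repeat=2)
}

print("rate by A:", marginal_a)
print("rate by B:", marginal_b)
print("rate by (A, B):", joint)
```

Each marginal rate is 0.5, whereas the joint rates are 0 or 1, which is why a one-variable-at-a-time first cut would discard both A and B here.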

Any suggestions are welcome.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University
