Re: [R] Logistic regression problem

2008-10-01 Thread Pedro.Rodriguez
:[EMAIL PROTECTED] On Behalf Of Liaw, Andy Sent: Wednesday, October 01, 2008 12:01 PM To: Frank E Harrell Jr; [EMAIL PROTECTED] Cc: r-help@r-project.org Subject: Re: [R] Logistic regression problem From: Frank E Harrell Jr > > Bernardo Rangel Tura wrote: > > On Tue, 2008-09-30 at 18:56 -0

Re: [R] Logistic regression problem

2008-10-01 Thread Liaw, Andy
From: Frank E Harrell Jr > > Bernardo Rangel Tura wrote: > > On Tue, 2008-09-30 at 18:56 -0500, Frank E Harrell Jr wrote: > >> Bernardo Rangel Tura wrote: > >>> On Sat, 2008-09-27 at 10:51 -0700, milicic.marko wrote: > I have a huge data set with thousands of variables and one binary > >

Re: [R] Logistic regression problem

2008-10-01 Thread Robert A LaBudde
It would not be possible to answer your original question until you specify your goal. Is it to develop a model with external validity that will generalize to new data? (You are not likely to succeed, if you are starting with a "boil the ocean" approach with 44,000+ covariates and millions o

Re: [R] Logistic regression problem

2008-10-01 Thread Frank E Harrell Jr
Bernardo Rangel Tura wrote: On Tue, 2008-09-30 at 18:56 -0500, Frank E Harrell Jr wrote: Bernardo Rangel Tura wrote: On Sat, 2008-09-27 at 10:51 -0700, milicic.marko wrote: I have a huge data set with thousands of variables and one binary variable. I know that most of the variables are co

Re: [R] Logistic regression problem

2008-10-01 Thread Bernardo Rangel Tura
On Tue, 2008-09-30 at 18:56 -0500, Frank E Harrell Jr wrote: > Bernardo Rangel Tura wrote: > > On Sat, 2008-09-27 at 10:51 -0700, milicic.marko wrote: > >> I have a huge data set with thousands of variables and one binary > >> variable. I know that most of the variables are correlated and are

Re: [R] Logistic regression problem

2008-09-30 Thread Frank E Harrell Jr
Bernardo Rangel Tura wrote: On Sat, 2008-09-27 at 10:51 -0700, milicic.marko wrote: I have a huge data set with thousands of variables and one binary variable. I know that most of the variables are correlated and are not good predictors... but... It is very hard to start modeling with such a

Re: [R] Logistic regression problem

2008-09-30 Thread Jason Jones Medical Informatics
r Sent: Tuesday, September 30, 2008 2:54 PM To: Milicic B. Marko Cc: r-help@r-project.org Subject: Re: [R] Logistic regression problem Milicic B. Marko wrote: > The only solution I can see is fitting all possible 2-factor models enabling > interactions and then assessing if interaction term is sig

Re: [R] Logistic regression problem

2008-09-30 Thread Bernardo Rangel Tura
On Sat, 2008-09-27 at 10:51 -0700, milicic.marko wrote: > I have a huge data set with thousands of variables and one binary > variable. I know that most of the variables are correlated and are not > good predictors... but... > > It is very hard to start modeling with such a huge dataset. What wo

Re: [R] Logistic regression problem

2008-09-30 Thread Frank E Harrell Jr
Milicic B. Marko wrote: The only solution I can see is fitting all possible 2-factor models enabling interactions and then assessing if interaction term is significant... any more ideas? Please don't suggest such a thing unless you do simulations to back up its predictive performance, type

Re: [R] Logistic regression problem

2008-09-30 Thread Milicic B. Marko
The only solution I can see is fitting all possible 2-factor models enabling interactions and then assessing if interaction term is significant... any more ideas? Milicic B. Marko wrote: > > I have a huge data set with thousands of variables and one binary > variable. I know that most of th

[R] Logistic regression problem

2008-09-27 Thread milicic.marko
I have a huge data set with thousands of variables and one binary variable. I know that most of the variables are correlated and are not good predictors... but... It is very hard to start modeling with such a huge dataset. What would be your suggestion? How to make a first cut... how to eliminate m