Re: [R] Logistic regression problem

2008-10-01 Thread Bernardo Rangel Tura
On Tue, 2008-09-30 at 18:56 -0500, Frank E Harrell Jr wrote: Bernardo Rangel Tura wrote: On Sat, 2008-09-27 at 10:51 -0700, milicic.marko wrote: I have a huge data set with thousands of variables and one binary variable. I know that most of the variables are correlated and are not

Re: [R] Logistic regression problem

2008-10-01 Thread Frank E Harrell Jr
Bernardo Rangel Tura wrote: On Tue, 2008-09-30 at 18:56 -0500, Frank E Harrell Jr wrote: Bernardo Rangel Tura wrote: On Sat, 2008-09-27 at 10:51 -0700, milicic.marko wrote: I have a huge data set with thousands of variables and one binary variable. I know that most of the variables are

Re: [R] Logistic regression problem

2008-10-01 Thread Robert A LaBudde
It would not be possible to answer your original question until you specify your goal. Is it to develop a model with external validity that will generalize to new data? (You are not likely to succeed if you are starting with a boil-the-ocean approach with 44,000+ covariates and millions of

Re: [R] Logistic regression problem

2008-10-01 Thread Liaw, Andy
From: Frank E Harrell Jr Bernardo Rangel Tura wrote: On Tue, 2008-09-30 at 18:56 -0500, Frank E Harrell Jr wrote: Bernardo Rangel Tura wrote: On Sat, 2008-09-27 at 10:51 -0700, milicic.marko wrote: I have a huge data set with thousands of variables and one binary variable. I know

Re: [R] Logistic regression problem

2008-10-01 Thread Pedro.Rodriguez
:[EMAIL PROTECTED] On Behalf Of Liaw, Andy Sent: Wednesday, October 01, 2008 12:01 PM To: Frank E Harrell Jr; [EMAIL PROTECTED] Cc: r-help@r-project.org Subject: Re: [R] Logistic regression problem From: Frank E Harrell Jr Bernardo Rangel Tura wrote: On Tue, 2008-09-30 at 18:56 -0500, Frank E

Re: [R] Logistic regression problem

2008-09-30 Thread Milicic B. Marko
The only solution I can see is fitting all possible two-factor models with interactions enabled and then assessing whether the interaction term is significant... Any more ideas? Milicic B. Marko wrote: I have a huge data set with thousands of variables and one binary variable. I know that most of the

Re: [R] Logistic regression problem

2008-09-30 Thread Frank E Harrell Jr
Milicic B. Marko wrote: The only solution I can see is fitting all possible two-factor models with interactions enabled and then assessing whether the interaction term is significant... Any more ideas? Please don't suggest such a thing unless you do simulations to back up its predictive performance, type

Re: [R] Logistic regression problem

2008-09-30 Thread Bernardo Rangel Tura
On Sat, 2008-09-27 at 10:51 -0700, milicic.marko wrote: I have a huge data set with thousands of variables and one binary variable. I know that most of the variables are correlated and are not good predictors... but... It is very hard to start modeling with such a huge dataset. What would

Re: [R] Logistic regression problem

2008-09-30 Thread Jason Jones Medical Informatics
, September 30, 2008 2:54 PM To: Milicic B. Marko Cc: r-help@r-project.org Subject: Re: [R] Logistic regression problem Milicic B. Marko wrote: The only solution I can see is fitting all possible two-factor models with interactions enabled and then assessing whether the interaction term is significant... any more

Re: [R] Logistic regression problem

2008-09-30 Thread Frank E Harrell Jr
Bernardo Rangel Tura wrote: On Sat, 2008-09-27 at 10:51 -0700, milicic.marko wrote: I have a huge data set with thousands of variables and one binary variable. I know that most of the variables are correlated and are not good predictors... but... It is very hard to start modeling with such a

[R] Logistic regression problem

2008-09-27 Thread milicic.marko
I have a huge data set with thousands of variables and one binary variable. I know that most of the variables are correlated and are not good predictors... but... It is very hard to start modeling with such a huge dataset. What would be your suggestion? How to make a first cut... how to eliminate