Re: [R] Variable shortlisting for the logistic regression

2008-10-20 Thread Greg Snow
I nominate the two paragraphs below (or a possible shortening of them) as a new 
fortune.  While not as entertaining as many of the current fortunes, the wisdom 
gained and sentiment expressed deserve preservation and easy reference for 
future posters who think that Frank is only trying to be funny.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
801.408.8111

 -----Original Message-----
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Darin Brooks
 Sent: Sunday, October 19, 2008 9:11 AM
 To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
 Cc: r-help@r-project.org
 Subject: Re: [R] Variable shortlisting for the logistic regression

 Frank's remark was made in response to my posting.  As funny as it was -
 it was the best thing that could have happened to me.  It sparked an
 enlightening discussion between my committee and me (in particular, the
 pros and cons of stepwise vs. information theoretic approach to model
 selection).  Being new to the R help list, I had no idea who Frank was.
 I googled him (and asked around) and found very quickly that he should be
 taken seriously.  And so should his remark.

 -----Original Message-----
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Rolf Turner
 Sent: Thursday, October 16, 2008 1:34 PM
 To: useR
 Cc: r-help@r-project.org
 Subject: Re: [R] Variable shortlisting for the logistic regression



 On 17/10/2008, at 8:22 AM, useR wrote:

  Let's try to bring this discussion back again after Frank made
  a very funny remark!

 Frank's remark was *serious*.  Take it seriously.

 cheers,

 Rolf Turner


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Variable shortlisting for the logistic regression

2008-10-19 Thread Darin Brooks
Frank's remark was made in response to my posting.  As funny as it was -
it was the best thing that could have happened to me.  It sparked an
enlightening discussion between my committee and me (in particular, the pros
and cons of stepwise vs. information theoretic approach to model selection).
Being new to the R help list, I had no idea who Frank was.  I googled him
(and asked around) and found very quickly that he should be taken seriously.
And so should his remark.  

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On
Behalf Of Rolf Turner
Sent: Thursday, October 16, 2008 1:34 PM
To: useR
Cc: r-help@r-project.org
Subject: Re: [R] Variable shortlisting for the logistic regression



On 17/10/2008, at 8:22 AM, useR wrote:

 Let's try to bring this discussion back again after Frank made
 a very funny remark!

Frank's remark was *serious*.  Take it seriously.

cheers,

Rolf Turner



Re: [R] Variable shortlisting for the logistic regression

2008-10-16 Thread useR
Let's try to bring this discussion back again after Frank made
a very funny remark!

What I'm doing at the moment is:

1. I split the dataset in two (development and holdout).
2. I fit a single-predictor logistic model for every variable and
collect the following stats:

DMaxDeriv=modelD$stats[2]
DModelLR=modelD$stats[3]
DP=modelD$stats[5]
DC=modelD$stats[6]
DDxy=modelD$stats[7]
DGamma=modelD$stats[8]
DTau=modelD$stats[9]
DR2=modelD$stats[10]
DBier=modelD$stats[11]

HMaxDeriv=modelH$stats[2]
HModelLR=modelH$stats[3]
HP=modelH$stats[5]
HC=modelH$stats[6]
HDxy=modelH$stats[7]
HGamma=modelH$stats[8]
HTau=modelH$stats[9]
HR2=modelH$stats[10]
HBier=modelH$stats[11]

where D is the prefix for stats on the development sample and H is the prefix
for stats derived from the holdout sample.

3. Now I screen for factors with Somers' D greater than 0.3 whose relative
change on the holdout sample is smaller than 5%.
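
The three steps above can be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the data are simulated, glm() stands in for whatever fitting function produced modelD$stats and modelH$stats, the helper somers_dxy() is hypothetical, Somers' D is computed on both samples from the model fitted on the development sample, and "relative change" is taken to mean |H - D| / |D|, which the steps do not spell out.

```r
# Hypothetical helper: Somers' Dxy between a numeric score and a 0/1 outcome,
# via the rank (Wilcoxon) form of the concordance index C: Dxy = 2 * (C - 0.5).
somers_dxy <- function(score, y) {
  r  <- rank(score)                      # midranks handle ties
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  c_index <- (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
  2 * (c_index - 0.5)
}

# Simulated data (assumption): 20 candidate predictors, only x1 is related to y.
set.seed(1)
n <- 400
p <- 20
X <- as.data.frame(matrix(rnorm(n * p), n, p))
names(X) <- paste0("x", seq_len(p))
y <- rbinom(n, 1, plogis(1.5 * X$x1))

# Step 1: split into development and holdout samples.
dev_idx <- sample(n, n / 2)
dev  <- X[dev_idx, ]
hold <- X[-dev_idx, ]
y_dev  <- y[dev_idx]
y_hold <- y[-dev_idx]

# Step 2: one single-predictor logistic model per variable, fitted on the
# development sample and scored on both samples.
screen <- do.call(rbind, lapply(names(X), function(v) {
  d   <- data.frame(y = y_dev, x = dev[[v]])
  fit <- glm(y ~ x, family = binomial, data = d)
  DDxy <- somers_dxy(predict(fit, type = "link"), y_dev)
  HDxy <- somers_dxy(predict(fit, newdata = data.frame(x = hold[[v]]),
                             type = "link"), y_hold)
  data.frame(variable = v, DDxy = DDxy, HDxy = HDxy,
             rel_change = abs(HDxy - DDxy) / abs(DDxy))
}))

# Step 3: keep variables with Dxy > 0.3 and relative change below 5%.
screen$keep <- screen$DDxy > 0.3 & screen$rel_change < 0.05
print(screen[order(-screen$DDxy), ])
```

The thresholds 0.3 and 5% are the ones quoted in step 3; everything else in the sketch is a placeholder to be replaced by the real data and model objects.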


Any comments are very welcome.

On Oct 14, 2:48 pm, John Kane [EMAIL PROTECTED] wrote:
 --- On Mon, 10/13/08, David Scott [EMAIL PROTECTED] wrote:



  From: David Scott [EMAIL PROTECTED]
  Subject: Re: [R] Variable shortlisting for the logistic regression
  To: Frank E Harrell Jr [EMAIL PROTECTED]
  Cc: [EMAIL PROTECTED]
  Received: Monday, October 13, 2008, 6:32 PM
  On Mon, 13 Oct 2008, Frank E Harrell Jr wrote:

   useR wrote:
   Hi R helpers,

   One rather statistical question:

   What would be the best strategy to shortlist thousands of continuous
   variables automatically using R
   as preparation for logistic regression modeling?

   Thanks

   The easiest approach is to use a random number generator.
   Frank

  Got a laugh from me Frank!

  Can I nominate it for a fortune?

  David

 Seconded.



Re: [R] Variable shortlisting for the logistic regression

2008-10-16 Thread Rolf Turner


On 17/10/2008, at 8:22 AM, useR wrote:


Let's try to bring this discussion back again after Frank made
a very funny remark!


Frank's remark was *serious*.  Take it seriously.

cheers,

Rolf Turner



Re: [R] Variable shortlisting for the logistic regression

2008-10-14 Thread John Kane

--- On Mon, 10/13/08, David Scott [EMAIL PROTECTED] wrote:

 From: David Scott [EMAIL PROTECTED]
 Subject: Re: [R] Variable shortlisting for the logistic regression
 To: Frank E Harrell Jr [EMAIL PROTECTED]
 Cc: r-help@r-project.org
 Received: Monday, October 13, 2008, 6:32 PM
 On Mon, 13 Oct 2008, Frank E Harrell Jr wrote:
 
  useR wrote:
  Hi R helpers,
  
  One rather statistical question:
  
  What would be the best strategy to shortlist thousands of continuous
  variables automatically using R
  as preparation for logistic regression modeling?
  
  Thanks
 
  The easiest approach is to use a random number generator.
  Frank
 
 
 Got a laugh from me Frank!
 
 Can I nominate it for a fortune?
 
 David

Seconded.




Re: [R] Variable shortlisting for the logistic regression

2008-10-13 Thread Stephan Kolassa

Hi Marko,

this may be helpful:
http://www.ingentaconnect.com/content/bpl/rssb/2008/0070/0001/art5;jsessionid=an2la3spa0n5h.alexandra?format=print

Happy modeling!
Stephan


useR wrote:

Hi R helpers,

One rather statistical question:

What would be the best strategy to shortlist thousands of continuous
variables automatically using R
as preparation for logistic regression modeling?


Thanks



Re: [R] Variable shortlisting for the logistic regression

2008-10-13 Thread Frank E Harrell Jr

useR wrote:

Hi R helpers,

One rather statistical question:

What would be the best strategy to shortlist thousands of continuous
variables automatically using R
as preparation for logistic regression modeling?


Thanks


The easiest approach is to use a random number generator.
Frank
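
A minimal sketch of why the remark can be read literally, under assumed settings (200 observations, 500 candidate variables, a 0.05 cutoff, none of which come from the original question): every predictor below is generated independently of the outcome, yet screening on single-predictor logistic models still "selects" some of them.

```r
set.seed(42)
n <- 200                                # assumed sample size
p <- 500                                # assumed number of candidate predictors
y <- rbinom(n, 1, 0.5)                  # outcome unrelated to every predictor
X <- matrix(rnorm(n * p), n, p)         # pure-noise candidate predictors

# Wald p-value for the slope of each single-predictor logistic model
pvals <- apply(X, 2, function(x) {
  fit <- glm(y ~ x, family = binomial)
  coef(summary(fit))["x", "Pr(>|z|)"]
})

# Number of noise variables that pass the screen
n_selected <- sum(pvals < 0.05)
print(n_selected)
```

With a 0.05 cutoff, roughly 5% of the noise variables are expected to pass by chance alone, so the shortlist produced here carries no more information than a random draw of variables would.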


--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University



Re: [R] Variable shortlisting for the logistic regression

2008-10-13 Thread David Scott

On Mon, 13 Oct 2008, Frank E Harrell Jr wrote:


useR wrote:

Hi R helpers,

One rather statistical question:

What would be the best strategy to shortlist thousands of continuous
variables automatically using R
as preparation for logistic regression modeling?


Thanks


The easiest approach is to use a random number generator.
Frank



Got a laugh from me Frank!

Can I nominate it for a fortune?

David

_
David Scott Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Email:  [EMAIL PROTECTED]

Graduate Officer, Department of Statistics
Director of Consulting, Department of Statistics
