Re: [R] Variable shortlisting for the logistic regression
I nominate the 2 paragraphs below (or a possible shortening of them) as a new fortune. While not as entertaining as many of the current fortunes, the wisdom gained and the sentiment expressed deserve preservation and easy reference for future posters who think that Frank is only trying to be funny.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
801.408.8111

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Darin Brooks
Sent: Sunday, October 19, 2008 9:11 AM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Cc: r-help@r-project.org
Subject: Re: [R] Variable shortlisting for the logistic regression

Frank's remark was made in response to my posting. As funny as it was, it was the best thing that could have happened to me. It sparked an enlightening discussion between my committee and me (in particular, the pros and cons of stepwise vs. information-theoretic approaches to model selection).

Being new to the R help list, I had no idea who Frank was. I googled him (and asked around) and found very quickly that he should be taken seriously. And so should his remark.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Rolf Turner
Sent: Thursday, October 16, 2008 1:34 PM
To: useR
Cc: r-help@r-project.org
Subject: Re: [R] Variable shortlisting for the logistic regression

On 17/10/2008, at 8:22 AM, useR wrote:

> Let's try to bring this discussion back again after Frank made a very funny remark!

Frank's remark was *serious*. Take it seriously.

cheers,

Rolf Turner

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Variable shortlisting for the logistic regression
Frank's remark was made in response to my posting. As funny as it was, it was the best thing that could have happened to me. It sparked an enlightening discussion between my committee and me (in particular, the pros and cons of stepwise vs. information-theoretic approaches to model selection).

Being new to the R help list, I had no idea who Frank was. I googled him (and asked around) and found very quickly that he should be taken seriously. And so should his remark.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Rolf Turner
Sent: Thursday, October 16, 2008 1:34 PM
To: useR
Cc: r-help@r-project.org
Subject: Re: [R] Variable shortlisting for the logistic regression

On 17/10/2008, at 8:22 AM, useR wrote:

> Let's try to bring this discussion back again after Frank made a very funny remark!

Frank's remark was *serious*. Take it seriously.

cheers,

Rolf Turner
Re: [R] Variable shortlisting for the logistic regression
Let's try to bring this discussion back again after Frank made a very funny remark!

What I'm doing at the moment is:

1. I split the dataset in two (development and holdout).

2. I fit a single-predictor logistic model for every variable and collect the following stats:

   DMaxDeriv=modelD$stats[2]
   DModelLR=modelD$stats[3]
   DP=modelD$stats[5]
   DC=modelD$stats[6]
   DDxy=modelD$stats[7]
   DGamma=modelD$stats[8]
   DTau=modelD$stats[9]
   DR2=modelD$stats[10]
   DBier=modelD$stats[11]
   HMaxDeriv=modelH$stats[2]
   HModelLR=modelH$stats[3]
   HP=modelH$stats[5]
   HC=modelH$stats[6]
   HDxy=modelH$stats[7]
   HGamma=modelH$stats[8]
   HTau=modelH$stats[9]
   HR2=modelH$stats[10]
   HBier=modelH$stats[11]

   where D is the prefix for stats on the development sample and H is the prefix for stats derived from the holdout sample.

3. I then screen for factors whose Somers' D is greater than 0.3 and whose relative change on the holdout sample is smaller than 5%.

Any comments are very welcome.

On Oct 14, 2:48 pm, John Kane [EMAIL PROTECTED] wrote:

> --- On Mon, 10/13/08, David Scott [EMAIL PROTECTED] wrote:
>
> > From: David Scott [EMAIL PROTECTED]
> > Subject: Re: [R] Variable shortlisting for the logistic regression
> > To: Frank E Harrell Jr [EMAIL PROTECTED]
> > Cc: [EMAIL PROTECTED]
> > Received: Monday, October 13, 2008, 6:32 PM
> >
> > On Mon, 13 Oct 2008, Frank E Harrell Jr wrote:
> >
> > > useR wrote:
> > > > Hi R helpers,
> > > > One rather statistical question. What would be the best strategy to shortlist thousands of continuous variables automatically using R, as preparation for logistic regression modeling?
> > > > Thanks
> > >
> > > The easiest approach is to use a random number generator.
> > >
> > > Frank
> >
> > Got a laugh from me Frank! Can I nominate it for a fortune?
> >
> > David
>
> Seconded.
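The three steps described in the post above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the poster's actual code: the data are simulated, the names `somers_dxy`, `screen_one`, `results` and `shortlist` are hypothetical, `glm()` stands in for whatever fitting function produced the `$stats` vector, and the holdout statistic is computed by applying the development fit to the holdout sample (the post does not say whether `modelH` was refitted).

```r
# Simulated stand-in data (hypothetical): 200 predictors, only x1 and x2 carry signal.
set.seed(1)
n <- 1000
p <- 200
X <- as.data.frame(matrix(rnorm(n * p), n, p))
names(X) <- paste0("x", seq_len(p))
y <- rbinom(n, 1, plogis(1.5 * X$x1 - 1.0 * X$x2))
dat <- data.frame(y = y, X)

# Step 1: split into development and holdout halves.
dev_idx <- sample(n, n / 2)
dev <- dat[dev_idx, ]
hold <- dat[-dev_idx, ]

# Somers' Dxy between a score and a 0/1 outcome, using the rank
# (Mann-Whitney) form of the concordance index C: Dxy = 2 * (C - 0.5).
somers_dxy <- function(score, outcome) {
  r <- rank(score)                      # midranks, so ties count as 1/2
  n1 <- sum(outcome == 1)
  n0 <- sum(outcome == 0)
  c_index <- (sum(r[outcome == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
  2 * (c_index - 0.5)
}

# Step 2: one single-predictor logistic model per variable, fitted on the
# development sample and scored on both samples (an assumption, see above).
screen_one <- function(v) {
  fit <- glm(reformulate(v, response = "y"), family = binomial, data = dev)
  d_dev <- somers_dxy(predict(fit, newdata = dev), dev$y)
  d_hold <- somers_dxy(predict(fit, newdata = hold), hold$y)
  data.frame(variable = v, DDxy = d_dev, HDxy = d_hold,
             rel_change = abs(d_hold - d_dev) / abs(d_dev))
}
results <- do.call(rbind, lapply(names(X), screen_one))

# Step 3: keep variables with Dxy > 0.3 on development and a relative
# change of less than 5% on holdout.
shortlist <- subset(results, DDxy > 0.3 & rel_change < 0.05)
print(shortlist)
```

Both thresholds are applied to noisy estimates, so rerunning with a different seed or a different split can change which variables pass; it is worth checking how stable the shortlist is before relying on it.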
Re: [R] Variable shortlisting for the logistic regression
On 17/10/2008, at 8:22 AM, useR wrote:

> Let's try to bring this discussion back again after Frank made a very funny remark!

Frank's remark was *serious*. Take it seriously.

cheers,

Rolf Turner
Re: [R] Variable shortlisting for the logistic regression
--- On Mon, 10/13/08, David Scott [EMAIL PROTECTED] wrote:

> From: David Scott [EMAIL PROTECTED]
> Subject: Re: [R] Variable shortlisting for the logistic regression
> To: Frank E Harrell Jr [EMAIL PROTECTED]
> Cc: r-help@r-project.org
> Received: Monday, October 13, 2008, 6:32 PM
>
> On Mon, 13 Oct 2008, Frank E Harrell Jr wrote:
>
> > useR wrote:
> > > Hi R helpers,
> > > One rather statistical question. What would be the best strategy to shortlist thousands of continuous variables automatically using R, as preparation for logistic regression modeling?
> > > Thanks
> >
> > The easiest approach is to use a random number generator.
> >
> > Frank
>
> Got a laugh from me Frank! Can I nominate it for a fortune?
>
> David

Seconded.
Re: [R] Variable shortlisting for the logistic regression
Hi Marko,

this may be helpful:

http://www.ingentaconnect.com/content/bpl/rssb/2008/0070/0001/art5;jsessionid=an2la3spa0n5h.alexandra?format=print

Happy modeling!

Stephan

useR wrote:

> Hi R helpers,
> One rather statistical question. What would be the best strategy to shortlist thousands of continuous variables automatically using R, as preparation for logistic regression modeling?
> Thanks
Re: [R] Variable shortlisting for the logistic regression
useR wrote:

> Hi R helpers,
> One rather statistical question. What would be the best strategy to shortlist thousands of continuous variables automatically using R, as preparation for logistic regression modeling?
> Thanks

The easiest approach is to use a random number generator.

Frank

--
Frank E Harrell Jr
Professor and Chair
School of Medicine, Department of Biostatistics
Vanderbilt University
Re: [R] Variable shortlisting for the logistic regression
On Mon, 13 Oct 2008, Frank E Harrell Jr wrote:

> useR wrote:
> > Hi R helpers,
> > One rather statistical question. What would be the best strategy to shortlist thousands of continuous variables automatically using R, as preparation for logistic regression modeling?
> > Thanks
>
> The easiest approach is to use a random number generator.
>
> Frank

Got a laugh from me Frank! Can I nominate it for a fortune?

David

_________________________________________________________________
David Scott
Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland 1142, NEW ZEALAND
Phone: +64 9 373 7599 ext 86830
Fax: +64 9 373 7000
Email: [EMAIL PROTECTED]

Graduate Officer, Department of Statistics
Director of Consulting, Department of Statistics