Re: [R] 'all subsets' fitting algorithm for Bayesian approach

Michael Hopkins Mon, 04 Oct 2010 07:19:19 -0700

Bert

Thanks for your reply but I already have the Box & Steinberg work in my 
extensive library of 'homework'  :o)


I have some problem-specific priors that I need to use for calculating model 
probabilities so that I can produce predictive distributions using Bayesian 
model averaging - hence I need to be able to extract the key summary stats (as 
an analogue of the likelihood) for each model from an exhaustive selection of 
possible model terms.  
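
A minimal sketch of such an exhaustive enumeration in base R, assuming the terms arrive as a "+"-separated string as in the original post quoted below. The function name all_subsets_lm, its arguments, and the simulated data are hypothetical and only for illustration; the sketch records just the residual sum of squares and the coefficient count, and leaves the prior and the model probabilities to the problem-specific step.

```r
# Hypothetical helper: fit lm() for every subset of the supplied terms and
# record the residual sum of squares and the number of estimated coefficients.
all_subsets_lm <- function(term_string, response, data) {
  # Split "X1 + X2 + X1:X2" on "+" and strip surrounding whitespace
  terms <- trimws(strsplit(term_string, "+", fixed = TRUE)[[1]])
  n <- length(terms)

  # 2^n x n inclusion matrix: 1 = term included, 0 = term excluded
  incl <- as.matrix(expand.grid(rep(list(0:1), n)))
  colnames(incl) <- terms

  rss <- numeric(nrow(incl))
  n_par <- integer(nrow(incl))
  for (i in seq_len(nrow(incl))) {
    chosen <- terms[incl[i, ] == 1]
    # The all-zero row corresponds to the intercept-only model
    rhs <- if (length(chosen) > 0) chosen else "1"
    fit <- lm(reformulate(rhs, response = response), data = data)
    rss[i] <- deviance(fit)        # residual sum of squares for an lm fit
    n_par[i] <- length(coef(fit))  # coefficient count, including the intercept
  }
  data.frame(incl, rss = rss, n_par = n_par, check.names = FALSE)
}

# Simulated data, purely for illustration
set.seed(1)
d <- data.frame(X1 = rnorm(50), X2 = rnorm(50), X3 = rnorm(50), X4 = rnorm(50))
d$y <- 1 + 2 * d$X1 - d$X3 + rnorm(50)

res <- all_subsets_lm(" X1 + X2 + X3 + X4 + X1:X2", "y", d)
```

Each row of res is one model: the 0/1 columns show which terms it contains, followed by rss and n_par. The enumeration is deliberately naive, so it also fits models that include X1:X2 without both main effects; rows that violate marginality can be filtered out afterwards if that is not wanted.
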

If this is available in R already then I would love to hear about it.  I am 
aware of leaps() and regsubsets() but I am not sure that they provide exactly 
what I need here, though it may be possible to adapt them somehow.

Michael


On 1 Oct 2010, at 18:32, Bert Gunter wrote:

> ummmm... You are reinventing the wheel. In fact, several wheels: the
> statistical literature already has several different approaches worked
> out for this. For example, George Box and David Steinberg did one
> about 20 years ago, and it has been incorporated as one of the options
> in the JMP DOE model choice procedure.
> 
> So do your homework and save yourself some effort. Maybe even all your effort.
> 
> -- Bert
> 
> On Fri, Oct 1, 2010 at 7:02 AM, Michael Hopkins
> <hopk...@upstreamsystems.com> wrote:
>> 
>> Hi R experts
>> 
>> I am just wondering if something is already available (or easily adaptable) 
>> to do the following.
>> 
>> I am planning to build linear models for all possible combinations of terms, 
>> so for example if the terms are sent into a function as this string
>> 
>> " X1 + X2 + X3 + X4 + X1:X2"
>> 
>> I would want to build models for all possible combinations of these 5 terms, 
>> e.g.
>> 
>>        m1 <- lm( y ~ X1 + X3 )
>> 
>> and capture at least the residual sum of squares and total number of model 
>> parameters from each model produced.  This will become part of a Bayesian 
>> approach to infer actual model probabilities when specialist prior knowledge 
>> is also introduced into the problem.
>> 
>> At a high level this particular problem requires something like:
>> 
>> 1) the term 'string' to be broken down into its elements which are 
>> separated by "+" and, I suppose, stored in a list for easier manipulation
>> 
>> 2) a matrix with 2^5 rows and 5 columns to be formed with a 0 present if the 
>> term is not included and 1 if it is.  Then a model will be fitted to 
>> represent every row of this matrix and the key statistics stored in vectors 
>> of length 2^5
>> 
>> For N terms of course the number of models will be 2^N.
>> 
>> Is there anything available already?  This is a very similar problem to all 
>> subsets regression.
>> 
>> My skill at manipulating strings in R is very limited; can anyone recommend 
>> some links or available functions which would make the separations and 
>> constructions required easy to achieve?
>> 
>> Thanks in advance to all
>> 
>> 
>> Michael Hopkins
>> Algorithm and Statistical Modelling Expert
>> 
> 
> -- 
> Bert Gunter
> Genentech Nonclinical Biostatistics
> 467-7374
> http://devo.gene.com/groups/devo/depts/ncb/home.shtml




Michael Hopkins
Algorithm and Statistical Modelling Expert
 
Upstream
23 Old Bond Street
London
W1S 4PZ

Mob +44 0782 578 7220
DL   +44 0207 290 1326
Fax  +44 0207 290 1321

hopk...@upstreamsystems.com
www.upstreamsystems.com
 

