> If you are *really* trying to predict (rather than test hypotheses), and you 
> really use model averaging, then I would be fine with this approach -- but 
> then you wouldn't be spending any time worrying about which models were 
> weighted how strongly

My approach was to rank the models by ΔAIC = AIC (model of interest) − AICmin 
(AIC of the minimum-AIC model), i.e. the relative AIC difference, and then only 
use model averaging on the set of models where ΔAIC was 0-2 (Burnham & 
Anderson, 2002).
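
For concreteness, a minimal sketch of that ΔAIC ranking in base R (the model 
objects m1, m2, m3 are placeholders for models already fitted to the data; 
MuMIn automates this, including the averaging step):

    ## Sketch only: rank candidate models by delta AIC and keep the 0-2 set.
    ## m1, m2, m3 stand in for models already fitted to the training data.
    mods  <- list(m1 = m1, m2 = m2, m3 = m3)
    aic   <- sapply(mods, AIC)
    delta <- aic - min(aic)            # relative AIC difference
    w     <- exp(-delta / 2)
    w     <- w / sum(w)                # Akaike weights
    top   <- mods[delta <= 2]          # candidate set for model averaging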


>  I don't quite understand.

Sorry, I was trying to say that I then need to think of a way of validating the 
goodness of fit, as I want to use my training data to predict my test data, and 
I have never used a model to predict unknown values. But I am sure I will come 
to it if I read around!
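
As a rough sketch of what that prediction step might look like (the object 
names here are placeholders, and the arguments to predict() vary by model 
class):

    ## 'fit' is a hypothetical model fitted to the training data; 'test' is a
    ## held-out set whose (0/1) threat status is known, used for validation.
    ## Prediction onto truly unknown species works the same way, minus the check.
    pred <- predict(fit, newdata = test, type = "response")

    ## One simple check of predictive performance for a binary response:
    ## proportion correctly classified at a 0.5 cut-off.
    mean((pred > 0.5) == test$threatened)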

Thanks for all your help, it is greatly appreciated



On 4 Aug 2010, at 20:09, Ben Bolker wrote:

On 10-08-04 01:13 PM, Chris Mcowen wrote:
> Hi Ben,
> 
> That is great thanks.
> 
>   
>> whether you select models via p-value or AIC *should* be based on whether 
>> you are trying to test hypotheses or make predictions
>>     
> I have 7 factors, of which 5 have been shown, theoretically and empirically, 
> to have an impact on my response variable. The other two are somewhat wild 
> shots, but I have a hunch they are important too.
> 
> The problem is there are no clear analytical patterns to the variables; they 
> don't fit into neat boxed themes (size, shape, etc.) if you will, so making 
> hypotheses about how they interact is hard. Forming a subset of models to 
> test is therefore very difficult; my approach has been to use all 
> combinations of factors to generate the candidate models. I am worried that 
> this approach is taking me down the data dredging / model simplification 
> route I am trying to avoid. Is it bad practice to use all combinations? As 
> long as I rank them by Akaike weight and use model averaging techniques, 
> isn't this OK?
>   

  If you are *really* trying to predict (rather than test hypotheses), and you 
really use model averaging, then I would be fine with this approach -- but then 
you wouldn't be spending any time worrying about which models were weighted how 
strongly (although I do admit that wondering why p-values and AIC gave 
different rankings is worth thinking about -- I'm just not sure there's a short 
answer without looking through all of the data).

 You should take a look at the AICcmodavg and MuMIn packages on CRAN -- one or 
the other may (?) be able to handle lmer fits.
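
(For illustration only, not from the original message: a rough sketch of the 
all-subsets and averaging workflow MuMIn supports. The formula, data frame, and 
grouping variable are placeholders, the glmer call assumes a binary threat 
status, and whether (g)lmer fits are handled will depend on package versions.)

    ## Sketch: generate all fixed-effect combinations and average the top set.
    library(lme4)
    library(MuMIn)

    options(na.action = "na.fail")      # dredge() requires this

    full <- glmer(threatened ~ trait1 + trait2 + trait3 + (1 | genus),
                  data = traits, family = binomial)

    cand <- dredge(full)                          # all combinations of factors
    avg  <- model.avg(cand, subset = delta < 2)   # average the delta-AIC 0-2 set
    summary(avg)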
>   
>> My best guess as to what's going on here is that you have a good deal of 
>> correlation among your factors
>>     
> I tested this with Pearson's R and only one combination showed up as having a 
> strong correlation; is this not sufficient?
>   

   Often, but not necessarily.  Zuur et al. have a recent paper in Methods in 
Ecology and Evolution you might want to look at.
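
(Illustrative aside: one quick check beyond pairwise Pearson's R, assuming 
'traits' is a data frame of the categorical predictors. Sparse or empty cells 
in a cross-tabulation flag the under-represented factor combinations mentioned 
below, which pairwise correlation coefficients can miss.)

    ## Cross-tabulate two factors; empty cells mean those combinations
    ## never occur together in the data.
    with(traits, table(trait1, trait2))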
>   
>> some combinations of factors are under/overrepresented in the data set)
>>     
> That is certainly the case, but I can't do much about that. Is it not 
> sufficient to rely on Pearson's values as mentioned above?
> 
>   
>> simply fit
>> the full model and base your inference on the estimates and confidence 
>> intervals from the full model
>>     
> I want to be able to predict the threat status (the response variable) for 
> species I only have traits (factors) for; this approach would not really let 
> me do this, would it?
>   

  I don't quite understand.

 Ben

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
