> If you are *really* trying to predict (rather than test hypotheses), and
> you really use model averaging, then I would be fine with this approach
> -- but then you wouldn't be spending any time worrying about which
> models were weighted how strongly

My approach was to rank the models by the relative AIC difference,
delta = AIC(model of interest) - AICmin (the AIC of the best model), and
then use model averaging only on the set of models where delta was 0-2
(Burnham & Anderson, 2002); see the sketch below.

> I don't quite understand.

Sorry, I was trying to say that I then need to think of a way of
validating the goodness of fit, as I want to use my training data to
predict my test data, and I have never used a model to predict unknown
values. But I am sure I will come to it if I read around!
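In R, I imagine the workflow would look something like this (a minimal
sketch only, using MuMIn's dredge()/model.avg() with an lme4 fit; all
object and variable names below are placeholders, not my actual data):

## Minimal sketch: fit the global model, rank all fixed-effect
## combinations by AICc, average the delta 0-2 subset, and predict
## for the held-out species.
library(lme4)
library(MuMIn)

## Global model with all 7 trait factors and a taxonomic random effect.
## dredge() requires na.action = na.fail in the global model.
full_model <- glmer(threat_status ~ trait1 + trait2 + trait3 + trait4 +
                      trait5 + trait6 + trait7 + (1 | tax_family),
                    data = train_data, family = binomial,
                    na.action = na.fail)

## All combinations of the fixed effects, ranked by AICc.
cand_set <- dredge(full_model)

## Model-average only the set with delta AICc between 0 and 2.
avg_model <- model.avg(cand_set, subset = delta <= 2)
summary(avg_model)

## Predicted probabilities of threat for the test species;
## re.form = NA drops the random effect for species in new groups.
preds <- predict(avg_model, newdata = test_data,
                 type = "response", re.form = NA)

Comparing preds against the observed statuses in test_data would then
give a direct check of predictive performance; the AICcmodavg package
(aictab(), modavgPred()) offers a similar workflow if the candidate set
is built by hand.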
Thanks for all your help, it is greatly appreciated.

On 4 Aug 2010, at 20:09, Ben Bolker wrote:

On 10-08-04 01:13 PM, Chris Mcowen wrote:
> Hi Ben,
>
> That is great, thanks.
>
>> whether you select models via p-value or AIC *should* be based on
>> whether you are trying to test hypotheses or make predictions
>
> I have 7 factors, of which 5 have been shown, theoretically and
> empirically, to have an impact on my response variable. The other two
> are somewhat wild shots, but I have a hunch they are important too.
>
> The problem is that there are no clear analytical patterns among the
> variables; they don't fit into neat boxed themes (size, shape, etc.),
> so forming hypotheses about how they interact is hard. That makes it
> very difficult to specify a subset of models to test, and my approach
> has been to use all combinations of factors to generate the candidate
> models. I am worried that this approach is taking me down the
> data-dredging/model-simplification route I am trying to avoid. Is it
> bad practice to use all combinations? As long as I rank them by Akaike
> weight and use model-averaging techniques, isn't this OK?

If you are *really* trying to predict (rather than test hypotheses), and
you really use model averaging, then I would be fine with this approach
-- but then you wouldn't be spending any time worrying about which
models were weighted how strongly (although I do admit that wondering
why p-values and AIC gave different rankings is worth thinking about --
I'm just not sure there's a short answer without looking through all of
the data). You should take a look at the AICcmodavg and MuMIn packages
on CRAN -- one or the other may (?) be able to handle lmer fits.

>> My best guess as to what's going on here is that you have a good deal
>> of correlation among your factors
>
> I tested this with Pearson's r, and only one combination showed up as
> having a strong correlation. Is this not sufficient?

Often, but not necessarily. Zuur et al. have a recent paper in Methods
in Ecology and Evolution you might want to look at.

>> some combinations of factors are under/overrepresented in the data set
>
> That is certainly the case, but I can't do much about that. Is it not
> sufficient to rely on the Pearson's values, as mentioned above?

>> simply fit the full model and base your inference on the estimates
>> and confidence intervals from the full model
>
> I want to be able to predict the threat status (the response variable)
> for species I only have traits (factors) for; this approach would not
> really let me do that, would it?

I don't quite understand.

  Ben

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology