[R-sig-eco] unsubscribe me please
--
Erin L. Page
MS Environmental Science, Water and Wetland Resources
SUNY College of Environmental Science and Forestry
http://www.esf.edu/efb/horton/Page_bio.htm

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
Re: [R-sig-eco] interactions in stepAIC
Hi Vincenzo,

There are a couple of things that might be worth considering. In your first model you consider only main effects and no interactions. Do any of your main effects drop out after you run stepAIC? If so, when you go on to build an interaction model, don't include those main effects, only the significant parasites.

Given that you don't have any a priori predictions about interactions, why do you think there are any? Maybe it's best not to look for them, so that you don't stumble upon some interactions by chance. Have you tried plotting your data? This could help guide you with interactions. I would also recommend against including higher-order interactions that you won't be able to interpret. What do you hope to get from the interactions?

Finally, since your approach is somewhat data driven and you seem to want to reduce the number of parasite predictors, have you considered a LASSO regression? There are several LASSO implementations in R.

Best,
Chris

On 4/4/12 4:29 PM, Vincenzo Ellis wrote:
> Dear R Ecology Group Members,
>
> I have data on parasite prevalences (coded as 0s or 1s) for several species of parasites of one host species, and I am interested in seeing if these parasites can predict health parameters that I measured in the hosts. I wanted to tackle this with a multiple regression approach. I used the MASS package's stepAIC function to first figure out what parasites might be good predictors, if any. Code is:
>
>     x <- lm(HealthVar ~ Par1 + Par2 + Par3 + Par4 + Par5 + Par6, data = mydata)
>     step <- stepAIC(x, direction = "both")
>     step$anova
>
> The problem with this method is it does not take into account interactions between parasites. I have tried rewriting the code to look for interactions:
>
>     x <- lm(HealthVar ~ Par1 * Par2 * Par3 * Par4 * Par5 * Par6, data = mydata)
>     step <- stepAIC(x, direction = "both")
>     step$anova
>
> The resulting models from this code, however, don't make much sense (lots and lots of terms, and many two- and three-way interactions).
>
> I would try to code for interactions manually, but I have no a priori predictions about which parasites might be interacting, nor do I have any sense about what parasites might be making hosts sick. It just seems reasonable to assume that there may be interactions between parasites, even if I don't know which ones would be involved. Any thoughts on how to attack a dataset like this would be much appreciated.
>
> Thanks so much!!
>
> Vincenzo
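The LASSO suggested in the reply above could be tried roughly as follows. This is only a sketch under stated assumptions: the thread names no particular package, so glmnet is an assumed choice (it must be installed), the data are simulated stand-ins for the poster's mydata, and the "lambda.1se" rule is one common default rather than a recommendation from the thread.

```r
# Sketch only: simulated stand-in data, since the thread's `mydata` is not available.
library(glmnet)  # assumed choice; one of several LASSO implementations in R

set.seed(1)
n <- 200
# Six hypothetical 0/1 parasite indicators, named as in the thread
mydata <- as.data.frame(matrix(rbinom(n * 6, 1, 0.4), ncol = 6,
                               dimnames = list(NULL, paste0("Par", 1:6))))
# Simulated response: a Par1 effect plus a Par2:Par3 interaction (made up)
mydata$HealthVar <- 10 - 2 * mydata$Par1 - 3 * mydata$Par2 * mydata$Par3 + rnorm(n)

# Design matrix with main effects and all two-way interactions;
# the intercept column is dropped because glmnet adds its own
X <- model.matrix(HealthVar ~ .^2, data = mydata)[, -1]
y <- mydata$HealthVar

# alpha = 1 gives the LASSO penalty; lambda is chosen by cross-validation
cvfit <- cv.glmnet(X, y, alpha = 1)

# Coefficients at the more conservative "lambda.1se" value;
# terms shrunk exactly to zero have been dropped from the model
b <- as.matrix(coef(cvfit, s = "lambda.1se"))
selected <- rownames(b)[b[, 1] != 0 & rownames(b) != "(Intercept)"]
print(selected)
```

Note that a plain LASSO on this design matrix treats every column independently, so it may keep an interaction while dropping one of its main effects.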
[R-sig-eco] interactions in stepAIC
Dear R Ecology Group Members,

I have data on parasite prevalences (coded as 0s or 1s) for several species of parasites of one host species, and I am interested in seeing if these parasites can predict health parameters that I measured in the hosts. I wanted to tackle this with a multiple regression approach. I used the MASS package's stepAIC function to first figure out what parasites might be good predictors, if any. Code is:

    x <- lm(HealthVar ~ Par1 + Par2 + Par3 + Par4 + Par5 + Par6, data = mydata)
    step <- stepAIC(x, direction = "both")
    step$anova

The problem with this method is it does not take into account interactions between parasites. I have tried rewriting the code to look for interactions:

    x <- lm(HealthVar ~ Par1 * Par2 * Par3 * Par4 * Par5 * Par6, data = mydata)
    step <- stepAIC(x, direction = "both")
    step$anova

The resulting models from this code, however, don't make much sense (lots and lots of terms, and many two- and three-way interactions).

I would try to code for interactions manually, but I have no a priori predictions about which parasites might be interacting, nor do I have any sense about what parasites might be making hosts sick. It just seems reasonable to assume that there may be interactions between parasites, even if I don't know which ones would be involved. Any thoughts on how to attack a dataset like this would be much appreciated.

Thanks so much!!

Vincenzo
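One possible middle ground between the two calls in this post is to start from the main-effects model and let stepAIC consider interactions only up to second order, through its scope argument. The sketch below assumes simulated data in place of mydata; the effect sizes and the name step2 are made up for illustration.

```r
library(MASS)  # provides stepAIC()

# Simulated stand-in for `mydata` (hypothetical values)
set.seed(42)
n <- 150
mydata <- as.data.frame(matrix(rbinom(n * 6, 1, 0.5), ncol = 6,
                               dimnames = list(NULL, paste0("Par", 1:6))))
mydata$HealthVar <- 5 + 1.5 * mydata$Par1 - 2 * mydata$Par2 * mydata$Par4 + rnorm(n)

# Start from the main-effects model, as in the first call in the post
x <- lm(HealthVar ~ Par1 + Par2 + Par3 + Par4 + Par5 + Par6, data = mydata)

# Terms may be added or dropped, but the search never goes beyond
# two-way interactions (the upper scope) or below the intercept-only model
step2 <- stepAIC(x,
                 scope = list(lower = ~ 1,
                              upper = ~ (Par1 + Par2 + Par3 + Par4 + Par5 + Par6)^2),
                 direction = "both", trace = FALSE)
step2$anova
formula(step2)
```

Capping the upper scope at two-way terms keeps the candidate set to 6 main effects and 15 interactions, instead of the 63 terms in the full six-way model.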
Re: [R-sig-eco] Output for interactions in models that do not include all main effects
On 04/03/2012 11:31 PM, Kristen Gorman wrote:
> Dear all,
>
> I have R code to run AIC including multi-model inference. I am running into a problem in calling the output from models where both parameters in an interaction are not included as main effects.

Why would you want to do that? Why would you (for example) expect the average of the Rlipid slope to be zero if the slope varies with the value of RFGinit? Does this make sense? (This is the sort of thing that makes statisticians splutter into their tea when they see someone do it: it rarely makes sense. Well, unless you have nested effects, which you don't have here, where the interaction is the nested effect.)

If you respect marginality, there won't be a problem because the main effect is always included.

If you really want to include interactions without main effects, you can either write the formula "by hand", using paste():

    something = "Rlipid"
    form = as.formula(paste("Slipid ~ ", something, " + RFGinit:", something, sep = ""))
    lm(form, data = DataSet)

and then work out how to get the order. Or you could try using update():

    mod1 = lm(formula = Slipid ~ RFGinit*Rlipid, data = DataSet)
    mod2 = update(mod1, . ~ . - RFGinit)

HTH

Bob

> In R, the interaction will be called depending on the parameter that was used as the only main effect in the model. So, I end up generating 2 different interactions (e.g., Rlipid:RFGinit vs RFGinit:Rlipid) that are actually the same. This becomes a problem in the remaining R code that requires weighted and summed values for the parameter and SE estimates. Thus, I would like to call the interaction consistently across models. See the following code:
>
> --
> lm(formula = Slipid ~ Rlipid + RFGinit:Rlipid, data = DataSet)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
> -74.075 -19.047   7.233  20.445  45.391
>
> Coefficients:
>                 Estimate Std. Error t value Pr(>|t|)
> (Intercept)    120.33847    5.30405  22.688   <2e-16 ***
> Rlipid           0.30493    0.23615   1.291    0.202
> Rlipid:RFGinit  -0.02099    0.01773  -1.184    0.241
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 30.88 on 60 degrees of freedom
> Multiple R-squared: 0.02721, Adjusted R-squared: -0.005221
> F-statistic: 0.839 on 2 and 60 DF, p-value: 0.4372
>
> lm(formula = Slipid ~ RFGinit + Rlipid:RFGinit, data = DataSet)
>
> Residuals:
>    Min     1Q Median     3Q    Max
> -76.35 -21.63   7.09  22.46  45.71
>
> Coefficients:
>                  Estimate Std. Error t value Pr(>|t|)
> (Intercept)    131.028546   8.717104  15.031   <2e-16 ***
> RFGinit         -0.933483   0.742083  -1.258    0.213
> RFGinit:Rlipid   0.003926   0.009283   0.423    0.674
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 30.9 on 60 degrees of freedom
> Multiple R-squared: 0.02586, Adjusted R-squared: -0.00661
> F-statistic: 0.7964 on 2 and 60 DF, p-value: 0.4556
> --
>
> Is there a way to tell R to call the interaction based on alphabetical order of the 2 interaction terms and not based on the term that was used as a main effect?
>
> Thanks very much for any insight.
>
> Kristen Gorman

--
Bob O'Hara
Biodiversity and Climate Research Centre
Senckenberganlage 25
D-60325 Frankfurt am Main, Germany
Tel: +49 69 798 40226
Mobile: +49 1515 888 5440
WWW: http://www.bik-f.de/root/index.php?page_id=219
Blog: http://blogs.nature.com/boboh
Journal of Negative Results - EEB: www.jnr-eeb.org
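For the naming question itself, one workaround (not proposed in the thread, so treat it as an assumption) is to leave the models as they are and relabel the coefficient names afterwards, sorting the components of each interaction alphabetically. The helper name canonical_term is hypothetical.

```r
# Put the components of each interaction name in alphabetical order, so that
# "Rlipid:RFGinit" and "RFGinit:Rlipid" end up with the same label.
canonical_term <- function(term_names) {
  vapply(strsplit(term_names, ":", fixed = TRUE),
         # method = "radix" sorts in the C locale, so the result does not
         # depend on the user's locale settings
         function(parts) paste(sort(parts, method = "radix"), collapse = ":"),
         character(1))
}

canonical_term(c("(Intercept)", "Rlipid", "Rlipid:RFGinit", "RFGinit:Rlipid"))
# "(Intercept)" "Rlipid" "RFGinit:Rlipid" "RFGinit:Rlipid"
```

Applied to a fitted model, this could look like est <- coef(mod); names(est) <- canonical_term(names(est)), done before the estimates are weighted and summed across models. The same relabelling would be needed for the standard errors.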