[R] randomForest - what is a 'good' pseudo r-squared?

2009-07-20 Thread lara harrup (IAH-P)
Hi all, I have been trying to use the randomForest package to model insect species abundance in different habitats and to identify the key variables (landscape, climate, etc.) determining abundance. This has all worked fine and I get nice variable importance plots etc. Many thanks to everyone on t

[R] 95% Confidence Intervals for AUC - $auc.samples from the Daim Package

2009-07-13 Thread lara harrup (IAH-P)
Hi, I am trying to perform a bootstrap estimate of the classification accuracy of a logistic regression using the 'Daim' package in R, using the code at the bottom of this post. This all works great and I get the .632+ misclassification accuracy, specificity, sensitivity, AUC, etc., but what I woul

[R] Goodness of fit test / pseudo r^2 measure for Zero Inflated Model

2009-06-24 Thread lara harrup (IAH-P)
Hi, I have been using a zero-inflated negative binomial model fitted using the pscl zeroinfl command, but I would like to extract a goodness of fit measure. Are there any suitable pseudo R^2 measures available for this type of analysis to try and assess the amount of variation in the data explain

[R] Random Forest Variable Importance Interpretation

2009-06-24 Thread lara harrup (IAH-P)
Hi, I am trying to explore the use of random forests for regression to identify the important environmental/microclimate variables involved in predicting the abundance of a species in different habitats. There are approx. 40 variables and between 200 and 500 data points depending on the dataset. I

[R] Error with regsubset in leaps package - vcov and all.best option (plus calculating VIFs for subsets)

2009-05-20 Thread lara harrup (IAH-P)
Hi all, I am hoping this is just a minor problem. I am trying to implement a best subsets regression procedure on some ecological datasets using the regsubsets function in the leaps package. The dataset contains 43 predictor variables plus the response (logcount), all in a dataframe called env

[R] Installing/using "glars" package --- Error in library(glars) : 'glars' is not a valid installed package

2009-04-29 Thread lara harrup (IAH-P)
Hi all, I seem to have fallen at the first hurdle with my analysis. I have a set of binary disease outbreak data linked to a large number of landscape metrics variables and environmental variables, which I would like to use as predictor variables in a Least Angle Logistic Regression using the glars.fit

[R] Cross-Validation for Zero-Inflated Models

2009-04-15 Thread lara harrup (IAH-P)
Hi all, I have developed a zero-inflated negative binomial model using the zeroinfl function from the pscl package, for which I have carried out model selection based on AIC and have used likelihood ratio tests (lrtest from the lmtest package) to compare the nested models [My end model contains 2 fac

[R] Multiple Comparisons for (multicomp - glht) for glm negative binomial (glm.nb)

2009-03-22 Thread lara harrup (IAH-P)
Hi, I have some experimental data where I have counts of the number of insects collected in different trap types rotated through 5 different locations (variable - location). 4 different chemical attractants [A, B, C, D] were applied to the traps (variable - semio) and all were trialled at two diffe

[R] ANCOVA/glm missing/ignored interaction combinations

2008-09-03 Thread lara harrup (IAH-P)
Hi, I am using R version 2.7.2 on a Windows XP OS and have a question concerning an analysis of covariance with count data that I am trying to do. I will give details of a scaled-down version of the analysis (as I have more covariates and need to take account of over-dispersion etc.) but as I am sur

[R] Creating subplots of a ridge trace generated using matplot (ridge regression)

2008-02-22 Thread lara harrup (IAH-P)
Hi, I am using ridge regression as a method to overcome the multicollinearity in my dataset, and in order to select the lambda (ridge estimator/regularization parameter) I am generating a ridge trace using the following commands (using R version 2.6.1): >library(MASS) >model_0.5km_ridge<- lm.ridge