Re: [R] AIC in MuMIn
On Fri, 2010-08-20 at 06:31 +0800, elaine kuo wrote: > > > Oh, actually, I suppose you could automate this, so it will > return all > models with single variable: > > dd <- dredge(lm1) > parms <- !is.na(dd[, -c(1, (ncol(dd) - c(0:7)))]) > want <- which(rowSums(parms) == 1) > dd[want, ] > > Having said all this, I don't think this is a good way to do > model > selection. > > => > > > Thanks for the time-efficient formula. > Actually it is over my current ability to automate it, but it did save > a lot of time already. You don't need to automate it, I did it for you. The four lines above show you only the lines from dd (your dredge result) that contain only one covariate. That was what you asked for, wasn't it? > Moreover, please kindly explain the command beginning with parms. > data (Cement) was checked but could not ensure why the row and the > column were set to be 1 and (ncol(dd) - c(0:7)). OK; for this, you need to first get that dredge returns a data frame with some additional attributes. (It actually inherits from class "data.frame" so whilst it isn't exactly a data frame it works like one.) So we can subset it as we would any other R data frame. If you look at the object returned from dredge, you see the first column refers to the intercept, the next N columns are for your covariates in the full model you fitted, with the remaining columns representing the 'rank' statistic (AIC, BIC etc) plus some other measures. > head(dd) Global model: lm(formula = y ~ ., data = Cement) --- Model selection table (Int) XX1 X2X3 X4 k R.sq Adj.R.sq RSS AIC AICc 11 52.581.468 0.6623 4 0.9787 0.9744 57.90 64.31 69.31 24 71.651.452 0.4161 -0.2365 5 0.9823 0.9764 47.97 63.87 72.44 23 48.191.696 0.6569 0.25 5 0.9823 0.9764 48.11 63.90 72.48 13 103.101.440 -0.6140 4 0.9725 0.9670 74.76 67.63 72.63 25 111.701.052-0.41 -0.6428 5 0.9813 0.9750 50.84 64.62 73.19 17 52.53 0.1551 1.467 0.6410 5 0.9798 0.9731 54.86 65.61 74.18 delta weight 11 0.000 0.544 24 3.125 0.114 23 3.163 0.112 13 3.322 0.103 25 3.879 0.078 17 4.869 0.048 So we select out only the bits we want: dd[, -c(1, (ncol(dd) - c(0:7)))] This bit is selecting columns by dropping the first and the last eight columns. ncol(dd) gives the number of columns, ncol(dd) - 0 is the last column, ncol(dd) - 1 is the next to the last column and so on. There are eight of these columns we don't want. If you just run > c(1, (ncol(dd) - c(0:7))) [1] 1 14 13 12 11 10 9 8 7 you'll see that we have the column integers we want to drop, and then we negate it to get the subsetting to drop rather than retain these columns: -c(1, (ncol(dd) - c(0:7))) Having dropped the superfluous columns we aren't interested in, we are left with an 'array' with one column per covariate (predictor variable). If a given covariate is not in the model, it gets an NA in its column for that model. The is.na() function returns an object of the same dimension as dd[, -c(1, (ncol(dd) - c(0:7)))] but which indicates if variable is NA or not: > is.na(dd[, -c(1, (ncol(dd) - c(0:7)))]) XX1X2X3X4 11 TRUE FALSE FALSE TRUE TRUE 24 TRUE FALSE FALSE TRUE FALSE 23 TRUE FALSE FALSE FALSE TRUE 13 TRUE FALSE TRUE TRUE FALSE 25 TRUE FALSE TRUE FALSE FALSE 17 FALSE FALSE FALSE TRUE TRUE 19 FALSE FALSE TRUE TRUE FALSE The ! negates this turning TRUE into FALSE and vice versa. In computing and R, TRUE == 1, FALSE == 0, so when we do rowSums on this, we get information regarding how many covariates are in each model: > rowSums(parms) 11 24 23 13 25 17 19 26 31 27 28 29 30 16 22 32 14 20 6 4 7 15 10 9 8 3 2 3 3 2 3 3 3 3 4 4 4 4 4 2 3 5 2 3 1 1 2 2 2 2 2 1 12 18 21 5 2 1 2 3 3 1 1 0 You asked for only those models containing a single covariate, so they are the models with rowSum == 1. The which() function tells us which models (rows of dd) these are and we save this indicator in 'want' > which(rowSums(parms) == 1) 6 4 3 5 2 19 20 26 30 31 The final step is to select out rows 19, 20, 26, 30 and 31 (want) from dd as these are the models you said you were interested in. > dd[want, ] Global model: lm(formula = y ~ ., data = Cement) --- Model selection table (Int) XX1 X2 X3 X4 k R.sq Adj.R.sqRSSAIC 6 117.60 -0.7382 3 0.6745 0.6450 883.9 97.74 4 57.42 0.78913 0.6663 0.6359 906.3 98.07 3 81.48 1.869 3 0.5339 0.4916 1266.0 102.40 5 110.20-1.256 3 0.2859 0.2210 1939.0 108.00 2 82.31 1.874 3 0.2353 0.1657 2077.0 108.80 AICc delta weight 6 100.4 31
Re: [R] AIC in MuMIn
> >> > > Please kindly advise if it is possible to show the singular model with > > only one certain variable using the command subset. (or maybe others) > > I tried the command "subset=X3" but it returned multiple models > > including X3. > > > > > > The above demand might look unnecessary when visual inspection offers > > the solution with few exp. variables. > > However, it is of great importance for recognizing AIC for each > > variable in the model selection with more than 10 variables. > > No, there isn't. The object returned by dredge is a data frame so you > can subset it to your hearts content, but you'll need to select rows > where you specify that all your variables (except the one you are > interested in) are missing (NA). Without writing this all out, I don't > know of a quick way of doing this sort of subsetting. > > In effect, you want > > data(Cement) > lm1 <- lm(y ~ ., data = Cement) > dd <- dredge(lm1, subset = X1) > > want <- with(dd, is.na(X) & is.na(X2) & is.na(X3) & is.na(X4)) > want > ## how many models selected? > sum(want) > ## OK selected just 1, show it > dd[want, , drop = FALSE] > > Oh, actually, I suppose you could automate this, so it will return all > models with single variable: > > dd <- dredge(lm1) > parms <- !is.na(dd[, -c(1, (ncol(dd) - c(0:7)))]) > want <- which(rowSums(parms) == 1) > dd[want, ] > > Having said all this, I don't think this is a good way to do model > selection. > => > Thanks for the time-efficient formula. Actually it is over my current ability to automate it, but it did save a lot of time already. Moreover, please kindly explain the command beginning with parms. data (Cement) was checked but could not ensure why the row and the column were set to be 1 and (ncol(dd) - c(0:7)). Elaine [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] AIC in MuMIn
On Thu, 2010-08-19 at 04:42 +0800, elaine kuo wrote: > Yes, I tried the example in the ?dredge and agreed that something else > caused the mistake. > > > Aside from the cause which takes time to clarify (16 explanatory > variables in the model), > I would like to ask another question. > > > Please kindly advise if it is possible to show the singular model with > only one certain variable using the command subset. (or maybe others) > I tried the command "subset=X3" but it returned multiple models > including X3. > > > The above demand might look unnecessary when visual inspection offers > the solution with few exp. variables. > However, it is of great importance for recognizing AIC for each > variable in the model selection with more than 10 variables. No, there isn't. The object returned by dredge is a data frame so you can subset it to your hearts content, but you'll need to select rows where you specify that all your variables (except the one you are interested in) are missing (NA). Without writing this all out, I don't know of a quick way of doing this sort of subsetting. In effect, you want data(Cement) lm1 <- lm(y ~ ., data = Cement) dd <- dredge(lm1, subset = X1) want <- with(dd, is.na(X) & is.na(X2) & is.na(X3) & is.na(X4)) want ## how many models selected? sum(want) ## OK selected just 1, show it dd[want, , drop = FALSE] Oh, actually, I suppose you could automate this, so it will return all models with single variable: dd <- dredge(lm1) parms <- !is.na(dd[, -c(1, (ncol(dd) - c(0:7)))]) want <- which(rowSums(parms) == 1) dd[want, ] Having said all this, I don't think this is a good way to do model selection. G > > > Please kindly help and thank you in advance. > > > Elaine > > > code > library(MuMIn) > > > data(Cement) > lm1 <- lm(y ~ ., data = Cement) > dd <- dredge(lm1, beta = FALSE, eval = TRUE, rank = "AIC") > X3<-dredge(lm1, rank = "AIC", subset=X3) > > On Wed, Aug 18, 2010 at 9:18 PM, Gavin Simpson > wrote: > On Wed, 2010-08-18 at 21:11 +0800, elaine kuo wrote: > > A cause other than data based on standardized regression > > was identified. > > It is that the manual command added with target <- at the > left hand > > side. > > > C1 did not work but C2 did. > > C1 target<-dredge(mig.stds, subset = temp_max) > > > > > > C2 dredge(mig.stds, subset = temp_max) > > > Glad you have this working, but that can't possibly be the > reason for > the error. There must be something else going on. The only > difference > between the two calls is that you are storing the result > somewhere, yet > the error was coming from within the running of the dredge > function. > > If everything works from within a new, clean R session then > great. > > > G > > > > > > > Elaine > > > > On Wed, Aug 18, 2010 at 5:37 PM, elaine kuo > > > wrote: > > > > > > > > > Please suggest how to define subset in my > case > > > > > > How would I know? I still haven't seen your > data. You > > seem to be > > mistaken on what is and is not included in > your model > > and you fitted it. > > What hope do we have...? However, given the > model > > 'mig.stds' from above > > in this email: > > > > > mig.stds <-lm(SummerM_ratio ~ temp_max + > evi_mean + > > topo_var + > > > topo_mean + coast + > Iso_index_0808, > > > ## now tell R were to find > the > > variables in formula > > > data = datum.std) > > > ## If you are fitting a Gaussian GLM it is > better > > fitted with lm() > > > > > > If you want to consider dredged models > containing > > temp_max, then you > > would do > > > > dredge(mig.stds, subset = temp_max) > > > > If you want models that contain temp_max and > coast, > > then you'd do > > > > dredge(mig.stds, subset = temp_max & coast) > > > > or > > > > dredge(mig.stds, fixed = ~ temp_max + coast) > > > > The bits you include in subse
Re: [R] AIC in MuMIn
Yes, I tried the example in the ?dredge and agreed that something else caused the mistake. Aside from the cause which takes time to clarify (16 explanatory variables in the model), I would like to ask another question. Please kindly advise if it is possible to show the singular model with only one certain variable using the command subset. (or maybe others) I tried the command "subset=X3" but it returned multiple models including X3. The above demand might look unnecessary when visual inspection offers the solution with few exp. variables. However, it is of great importance for recognizing AIC for each variable in the model selection with more than 10 variables. Please kindly help and thank you in advance. Elaine code library(MuMIn) data(Cement) lm1 <- lm(y ~ ., data = Cement) dd <- dredge(lm1, beta = FALSE, eval = TRUE, rank = "AIC") X3<-dredge(lm1, rank = "AIC", subset=X3) On Wed, Aug 18, 2010 at 9:18 PM, Gavin Simpson wrote: > On Wed, 2010-08-18 at 21:11 +0800, elaine kuo wrote: > > A cause other than data based on standardized regression > > was identified. > > It is that the manual command added with target <- at the left hand > > side. > > > C1 did not work but C2 did. > > C1 target<-dredge(mig.stds, subset = temp_max) > > > > > > C2 dredge(mig.stds, subset = temp_max) > > Glad you have this working, but that can't possibly be the reason for > the error. There must be something else going on. The only difference > between the two calls is that you are storing the result somewhere, yet > the error was coming from within the running of the dredge function. > > If everything works from within a new, clean R session then great. > > G > > > > > > > Elaine > > > > On Wed, Aug 18, 2010 at 5:37 PM, elaine kuo > > wrote: > > > > > > > > > Please suggest how to define subset in my case > > > > > > How would I know? I still haven't seen your data. You > > seem to be > > mistaken on what is and is not included in your model > > and you fitted it. > > What hope do we have...? However, given the model > > 'mig.stds' from above > > in this email: > > > > > mig.stds <-lm(SummerM_ratio ~ temp_max + evi_mean + > > topo_var + > > > topo_mean + coast + Iso_index_0808, > > > ## now tell R were to find the > > variables in formula > > > data = datum.std) > > > ## If you are fitting a Gaussian GLM it is better > > fitted with lm() > > > > > > If you want to consider dredged models containing > > temp_max, then you > > would do > > > > dredge(mig.stds, subset = temp_max) > > > > If you want models that contain temp_max and coast, > > then you'd do > > > > dredge(mig.stds, subset = temp_max & coast) > > > > or > > > > dredge(mig.stds, fixed = ~ temp_max + coast) > > > > The bits you include in subset or fixed are the names > > of your variables > > that you want in or out of the models. In your case, > > the names of the > > variables as input into the model formula. With > > 'subset' you need to use > > logical operators (and [&], or [|]) whilst with > > 'fixed' you can specify > > a formula of variables that should be included or > > excluded in the same > > way you'd write any R formula. > > > > But, now having been told this, please note that this > > is *all* discussed > > on the ?dredge help page if you bother to read it. > > I've never used this > > package, and, OK, I have used R for going on for 11 or > > 12 years now so > > am used to reading help pages and understand the > > language a bit more you > > perhaps do, but you do seem to be asking questions or > > running into > > problems that are all covered by the help pages. > > > > > > => I posted it for help, after following the manual > > with the command dredge but receiving an error message > > two days ago. > > > > command target<-dredge(mig.stds, subset = > > temp_max) > >error in eval(expr, envir, enclos) : 'temp_max' not found > > > > > > One possible cause could be data = datam.std. > > datam.std was produced as the code below, which seemed to > > make it hard to find explanatory variables. > > Pl
Re: [R] AIC in MuMIn
On Wed, 2010-08-18 at 21:11 +0800, elaine kuo wrote: > A cause other than data based on standardized regression > was identified. > It is that the manual command added with target <- at the left hand > side. > C1 did not work but C2 did. > C1 target<-dredge(mig.stds, subset = temp_max) > > > C2 dredge(mig.stds, subset = temp_max) Glad you have this working, but that can't possibly be the reason for the error. There must be something else going on. The only difference between the two calls is that you are storing the result somewhere, yet the error was coming from within the running of the dredge function. If everything works from within a new, clean R session then great. G > > > Elaine > > On Wed, Aug 18, 2010 at 5:37 PM, elaine kuo > wrote: > > > > > Please suggest how to define subset in my case > > > How would I know? I still haven't seen your data. You > seem to be > mistaken on what is and is not included in your model > and you fitted it. > What hope do we have...? However, given the model > 'mig.stds' from above > in this email: > > > mig.stds <-lm(SummerM_ratio ~ temp_max + evi_mean + > topo_var + > > topo_mean + coast + Iso_index_0808, > > ## now tell R were to find the > variables in formula > > data = datum.std) > > ## If you are fitting a Gaussian GLM it is better > fitted with lm() > > > If you want to consider dredged models containing > temp_max, then you > would do > > dredge(mig.stds, subset = temp_max) > > If you want models that contain temp_max and coast, > then you'd do > > dredge(mig.stds, subset = temp_max & coast) > > or > > dredge(mig.stds, fixed = ~ temp_max + coast) > > The bits you include in subset or fixed are the names > of your variables > that you want in or out of the models. In your case, > the names of the > variables as input into the model formula. With > 'subset' you need to use > logical operators (and [&], or [|]) whilst with > 'fixed' you can specify > a formula of variables that should be included or > excluded in the same > way you'd write any R formula. > > But, now having been told this, please note that this > is *all* discussed > on the ?dredge help page if you bother to read it. > I've never used this > package, and, OK, I have used R for going on for 11 or > 12 years now so > am used to reading help pages and understand the > language a bit more you > perhaps do, but you do seem to be asking questions or > running into > problems that are all covered by the help pages. > > > => I posted it for help, after following the manual > with the command dredge but receiving an error message > two days ago. > > command target<-dredge(mig.stds, subset = > temp_max) >error in eval(expr, envir, enclos) : 'temp_max' not found > > > One possible cause could be data = datam.std. > datam.std was produced as the code below, which seemed to > make it hard to find explanatory variables. > Please kindly share your experience in R, because I am unsure > if my assumption is logical or not. > > > Also, please kindly advise how to modify the command for > dredging subset if possible. > ( command target<-dredge(mig.stds, subset = temp_max)) > Thank you in advance. > > > Elaine > > > code > library(MuMIn) > datam > <-read.csv("c:/migration/Mig_ratio_20100817.csv",header=T, > row.names=1) > > # std regression model (16 indep. variables) > datam.sd<-scale(datam) > datam.std<-as.data.frame(datam.sd) > summary (datam.std) > mean(datam.std) > > > # obtain standard deviation > sd(datam.std) > >
Re: [R] AIC in MuMIn
A cause other than data based on standardized regression was identified. It is that the manual command added with target <- at the left hand side. C1 did not work but C2 did. C1 target<-dredge(mig.stds, subset = temp_max) C2 dredge(mig.stds, subset = temp_max) Elaine On Wed, Aug 18, 2010 at 5:37 PM, elaine kuo wrote: > >> > >> Please suggest how to define subset in my case >> >> How would I know? I still haven't seen your data. You seem to be >> mistaken on what is and is not included in your model and you fitted it. >> What hope do we have...? However, given the model 'mig.stds' from above >> in this email: >> >> > mig.stds <-lm(SummerM_ratio ~ temp_max + evi_mean + topo_var + >> > topo_mean + coast + Iso_index_0808, >> > ## now tell R were to find the variables in formula >> > data = datum.std) >> > ## If you are fitting a Gaussian GLM it is better fitted with lm() >> >> If you want to consider dredged models containing temp_max, then you >> would do >> >> dredge(mig.stds, subset = temp_max) >> >> If you want models that contain temp_max and coast, then you'd do >> >> dredge(mig.stds, subset = temp_max & coast) >> >> or >> >> dredge(mig.stds, fixed = ~ temp_max + coast) >> >> The bits you include in subset or fixed are the names of your variables >> that you want in or out of the models. In your case, the names of the >> variables as input into the model formula. With 'subset' you need to use >> logical operators (and [&], or [|]) whilst with 'fixed' you can specify >> a formula of variables that should be included or excluded in the same >> way you'd write any R formula. >> >> But, now having been told this, please note that this is *all* discussed >> on the ?dredge help page if you bother to read it. I've never used this >> package, and, OK, I have used R for going on for 11 or 12 years now so >> am used to reading help pages and understand the language a bit more you >> perhaps do, but you do seem to be asking questions or running into >> problems that are all covered by the help pages. >> >> => I posted it for help, after following the manual with the command >> dredge but receiving an error message two days ago. >> > > >> command target<-dredge(mig.stds, subset = temp_max) >> >error in eval(expr, envir, enclos) : 'temp_max' not found > > One possible cause could be data = datam.std. > datam.std was produced as the code below, which seemed to make it hard to > find explanatory variables. > Please kindly share your experience in R, because I am unsure if my > assumption is logical or not. > > Also, please kindly advise how to modify the command for dredging subset if > possible. > ( command target<-dredge(mig.stds, subset = temp_max)) > Thank you in advance. > > Elaine > > code > library(MuMIn) > datam <-read.csv("c:/migration/Mig_ratio_20100817.csv",header=T, > row.names=1) > > # std regression model (16 indep. variables) > datam.sd<-scale(datam) > datam.std<-as.data.frame(datam.sd) > summary (datam.std) > mean(datam.std) > > # obtain standard deviation > sd(datam.std) > > mig.stds > <-lm(SummerM_ratio~temp_ran+temp_mean+temp_max+temp_min+evi_ran+evi_mean+evi_max+evi_min+prec_ran+prec_mean+prec_max+prec_min+topo_var+topo_mean+coast+Iso_index_0808,data=datam.std) > > summary(mig.stds) > > > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] AIC in MuMIn
On Wed, 2010-08-18 at 17:37 +0800, elaine kuo wrote: > => I posted it for help, after following the manual with the > command dredge but receiving an error message two days ago. > > command target<-dredge(mig.stds, subset = temp_max) >error in eval(expr, envir, enclos) : 'temp_max' not found You really don't get it, do you? I can't see what the problem is with your data + script, because I don't have your data! So I can't run your script, reproduce your error and work round it, then send you the solution. How many times do I have to ask for your data or, if not possible, some dummy data that reproduces the same problem? If your data weren't available to send, this is what I would have expected, as requested by the posting guide: ## dummy data - this could be anything really dat <- data.frame(FOO = rnorm(100), BAR = rnorm(100, mean = 10, sd = 4), OTHER = seq(1, 10, length = 100), Y = runif(100)) ## standardise data dat.std <- data.frame(scale(dat, center = TRUE, scale = TRUE)) ## load MuMIn require(MuMIn) ## fit full model mod <- lm(Y ~ FOO + BAR + OTHER, data = dat.std) ## data dredge dredged.mod <- dredge(mod, subset = OTHER) dredged.mod The above works: > ## dummy data > dat <- data.frame(FOO = rnorm(100), BAR = rnorm(100, mean = 10, sd = 4), + OTHER = seq(1, 10, length = 100), + Y = runif(100)) > > ## standardise data > dat.std <- data.frame(scale(dat, center = TRUE, scale = TRUE)) > > ## load MuMIn > require(MuMIn) Loading required package: MuMIn > > ## fit full model > mod <- lm(Y ~ FOO + BAR + OTHER, data = dat.std) > mod Call: lm(formula = Y ~ FOO + BAR + OTHER, data = dat.std) Coefficients: (Intercept) FOO BAROTHER 1.908e-16 -3.463e-021.635e-02 -1.299e-01 > summary(mod) Call: lm(formula = Y ~ FOO + BAR + OTHER, data = dat.std) Residuals: Min 1Q Median 3Q Max -1.83375 -0.80706 0.01736 0.81847 1.72803 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 1.908e-16 1.006e-01 1.90e-151.000 FOO -3.463e-02 1.013e-01 -0.3420.733 BAR 1.635e-02 1.011e-010.1620.872 OTHER -1.299e-01 1.013e-01 -1.2820.203 Residual standard error: 1.006 on 96 degrees of freedom Multiple R-squared: 0.01901,Adjusted R-squared: -0.01165 F-statistic: 0.62 on 3 and 96 DF, p-value: 0.6037 > dredged.mod <- dredge(mod, subset = OTHER) > dredged.mod Global model: lm(formula = Y ~ FOO + BAR + OTHER, data = dat.std) --- Model selection table (Int) BAR FOO OTH kR.sq Adj.R.sq RSS AIC AICc 1 1.886e-16 -0.1324 3 0.01754 0.007514 97.26 287.0 287.3 3 1.886e-16 -0.03471 -0.1302 4 0.01874 -0.001493 97.14 288.9 289.3 2 1.908e-16 0.01652 -0.1322 4 0.01781 -0.002439 97.24 289.0 289.4 4 1.908e-16 0.01635 -0.03463 -0.1299 5 0.01901 -0.011650 97.12 290.9 291.5 delta weight 1 0.000 0.549 3 2.049 0.197 2 2.143 0.188 4 4.239 0.066 So is this due to the way your names are formed? dat2 <- data.frame(dat, SOME_THING = rpois(100, 4)) dat2.std <- data.frame(scale(dat2, center = TRUE, scale = TRUE)) mod2 <- lm(Y ~ FOO + BAR + OTHER + SOME_THING, data = dat2.std) ## data dredge dredged.mod2 <- dredge(mod2, subset = SOME_THING) dredged.mod2 No! So I can't reproduce your problem and until you send the *exact* code you enter into R plus your data I can no longer help you. Apologies if I am beginning to sound harsh, but you aren't doing what I have asked you to do (repeatedly) and what the posting guide for this list asks you to. If you want free help here, the least you can do is respect the rules. Feel free to send the above data/code off list to me if you wish. But I won't answer anything further on this thread until they are provided. I am wasting my time in a futile attempt to help you with your problem because you won't do what I ask. G > One possible cause could be data = datam.std. > datam.std was produced as the code below, which seemed to make it > hard to find explanatory variables. > Please kindly share your experience in R, because I am unsure if my > assumption is logical or not. > > > Also, please kindly advise how to modify the command for dredging > subset if possible. > ( command target<-dredge(mig.stds, subset = temp_max)) > Thank you in advance. > > > Elaine > > > code > library(MuMIn) > datam <-read.csv("c:/migration/Mig_ratio_20100817.csv",header=T, > row.names=1) > > # std regression model (16 indep. variables) > datam.sd<-scale(datam) > datam.std<-as.data.frame(datam.sd) > summary (datam.std) > mean(datam.std) > > > # obtain standard deviation > sd(datam.std) > > > mig.stds <-lm(SummerM_ratio~temp_ran+temp_mean+temp_max+temp_min > +evi_ran+evi_mean+evi_max+evi_min+prec_ran+prec_mean+prec_max+prec_min > +topo_var+topo_mean+coast+Iso_ind
Re: [R] AIC in MuMIn
> > > > > Please suggest how to define subset in my case > > How would I know? I still haven't seen your data. You seem to be > mistaken on what is and is not included in your model and you fitted it. > What hope do we have...? However, given the model 'mig.stds' from above > in this email: > > > mig.stds <-lm(SummerM_ratio ~ temp_max + evi_mean + topo_var + > > topo_mean + coast + Iso_index_0808, > > ## now tell R were to find the variables in formula > > data = datum.std) > > ## If you are fitting a Gaussian GLM it is better fitted with lm() > > If you want to consider dredged models containing temp_max, then you > would do > > dredge(mig.stds, subset = temp_max) > > If you want models that contain temp_max and coast, then you'd do > > dredge(mig.stds, subset = temp_max & coast) > > or > > dredge(mig.stds, fixed = ~ temp_max + coast) > > The bits you include in subset or fixed are the names of your variables > that you want in or out of the models. In your case, the names of the > variables as input into the model formula. With 'subset' you need to use > logical operators (and [&], or [|]) whilst with 'fixed' you can specify > a formula of variables that should be included or excluded in the same > way you'd write any R formula. > > But, now having been told this, please note that this is *all* discussed > on the ?dredge help page if you bother to read it. I've never used this > package, and, OK, I have used R for going on for 11 or 12 years now so > am used to reading help pages and understand the language a bit more you > perhaps do, but you do seem to be asking questions or running into > problems that are all covered by the help pages. > > => I posted it for help, after following the manual with the command dredge > but receiving an error message two days ago. > > command target<-dredge(mig.stds, subset = temp_max) > error in eval(expr, envir, enclos) : 'temp_max' not found One possible cause could be data = datam.std. datam.std was produced as the code below, which seemed to make it hard to find explanatory variables. Please kindly share your experience in R, because I am unsure if my assumption is logical or not. Also, please kindly advise how to modify the command for dredging subset if possible. ( command target<-dredge(mig.stds, subset = temp_max)) Thank you in advance. Elaine code library(MuMIn) datam <-read.csv("c:/migration/Mig_ratio_20100817.csv",header=T, row.names=1) # std regression model (16 indep. variables) datam.sd<-scale(datam) datam.std<-as.data.frame(datam.sd) summary (datam.std) mean(datam.std) # obtain standard deviation sd(datam.std) mig.stds <-lm(SummerM_ratio~temp_ran+temp_mean+temp_max+temp_min+evi_ran+evi_mean+evi_max+evi_min+prec_ran+prec_mean+prec_max+prec_min+topo_var+topo_mean+coast+Iso_index_0808,data=datam.std) summary(mig.stds) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] AIC in MuMIn
> Hmmm. Thinking some more, I might not have answered your (unstated) > question. What do your mean by GLM? > > => I meant generalized linear model as well. Thanks for the references. The first one was mentioned first in my life time after keeping asking the same question. > I mean the Generalized Linear Model, not the General Linear Model. The > Generalized one allows for non-normal responses and different > mean-variance relationships and is the GLM of Nelder and Wedderburn > (1972, J. Royal Statistical Society, Series A, 135(3),370-384) and the > monograph by McCullagh and Wedderburn (1989, Generalized Linear Models, > Chapman & Hall/CRC). The R function glm() fits these kinds of model. > > The General Linear Model (IIRC) was the linking of linear regression and > anova into a single entity. The R function lm() fits these kinds of > models. > > The linear model is a special case of the Generalized Linear Model when > the Gaussian error is used with the identity link function. Hence a > Gaussian GLM (my GLM) with the identity link fitted by glm() will give > the same results as lm(), but it will do so in a very inefficient > manner. As this was what your code was doing I suggested using lm() > instead. > => Thanks for the important suggestion. I tried both lm and glm until now but no one commented on the method. > > Elaine > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] AIC in MuMIn
On Wed, 2010-08-18 at 08:51 +0100, Gavin Simpson wrote: > On Wed, 2010-08-18 at 05:42 +0800, elaine kuo wrote: > > Thank you. > > Most of the answers solved the puzzles. > > => Please explain why fitted lm is better for GLM. > > Seriously? A GLM specified as glm(, family = gaussian) is the linear > model that you'd get with lm(). lm() fits the model far more efficiently > than glm(). The code you showed specifically used 'family = gaussian', > hence my comment. Hmmm. Thinking some more, I might not have answered your (unstated) question. What do your mean by GLM? I mean the Generalized Linear Model, not the General Linear Model. The Generalized one allows for non-normal responses and different mean-variance relationships and is the GLM of Nelder and Wedderburn (1972, J. Royal Statistical Society, Series A, 135(3),370-384) and the monograph by McCullagh and Wedderburn (1989, Generalized Linear Models, Chapman & Hall/CRC). The R function glm() fits these kinds of model. The General Linear Model (IIRC) was the linking of linear regression and anova into a single entity. The R function lm() fits these kinds of models. The linear model is a special case of the Generalized Linear Model when the Gaussian error is used with the identity link function. Hence a Gaussian GLM (my GLM) with the identity link fitted by glm() will give the same results as lm(), but it will do so in a very inefficient manner. As this was what your code was doing I suggested using lm() instead. HTH G > > > But temp_ran is not in your model... > > > > > error in eval(expr, envir, enclos), 'temp_ran' not found > > > > > > When used properly (none of this datam.std$ business), subset > > will do > > what you want: > > > > > dd2 <- dredge(lm1, subset = X1) > > > dd2 > > Global model: lm(formula = y ~ ., data = Cement) > > --- > > Model selection table > >(Int)X X1 X2 X3 X4 k R.sq > > Adj.R.sq RSS > > 3 52.58 1.4680 0.6623 4 0.9787 > > 0.9744 57.90 > > > > > > => Please suggest how to define subset in my case > > How would I know? I still haven't seen your data. You seem to be > mistaken on what is and is not included in your model and you fitted it. > What hope do we have...? However, given the model 'mig.stds' from above > in this email: > > > mig.stds <-lm(SummerM_ratio ~ temp_max + evi_mean + topo_var + > > topo_mean + coast + Iso_index_0808, > > ## now tell R were to find the variables in formula > > data = datum.std) > > ## If you are fitting a Gaussian GLM it is better fitted with lm() > > If you want to consider dredged models containing temp_max, then you > would do > > dredge(mig.stds, subset = temp_max) > > If you want models that contain temp_max and coast, then you'd do > > dredge(mig.stds, subset = temp_max & coast) > > or > > dredge(mig.stds, fixed = ~ temp_max + coast) > > The bits you include in subset or fixed are the names of your variables > that you want in or out of the models. In your case, the names of the > variables as input into the model formula. With 'subset' you need to use > logical operators (and [&], or [|]) whilst with 'fixed' you can specify > a formula of variables that should be included or excluded in the same > way you'd write any R formula. > > But, now having been told this, please note that this is *all* discussed > on the ?dredge help page if you bother to read it. I've never used this > package, and, OK, I have used R for going on for 11 or 12 years now so > am used to reading help pages and understand the language a bit more you > perhaps do, but you do seem to be asking questions or running into > problems that are all covered by the help pages. > > > Finally, it would be highly appreciated to recommend any references of > > R for a beginner like me. > > Read the An Introduction to R manual that comes with R or that can be > downloaded from the R website/CRAN. Also look at the contributed > documentation section on the R website which contains numerous free > introductory guides. > > > Elaine > > HTH > > G > -- > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% > Dr. Gavin Simpson [t] +44 (0)20 7679 0522 > ECRC, UCL Geography, [f] +44 (0)20 7679 0565 > Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk > Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ > UK. WC1E 6BT. [w] http://www.freshwaters.org.uk > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minima
Re: [R] AIC in MuMIn
On Wed, 2010-08-18 at 05:42 +0800, elaine kuo wrote: > Thank you. > Most of the answers solved the puzzles. > > > Q2 > > > > I tried to display sub-model with only temp_ran using the > code below but > > failed. > > Please kindly suggest the potential failure cause. > > > > code > > > > library(MuMIn) > > datam > <-read.csv("c:/migration/Mig_ratio_20100817.csv",header=T, > > row.names=1) > > > > mig.stds <-lm(SummerM_ratio ~ temp_max + evi_mean + topo_var + > topo_mean + coast + Iso_index_0808, > ## now tell R were to find the variables in > formula > data = datum.std) > ## If you are fitting a Gaussian GLM it is better fitted with > lm() > > > => Please explain why fitted lm is better for GLM. Seriously? A GLM specified as glm(, family = gaussian) is the linear model that you'd get with lm(). lm() fits the model far more efficiently than glm(). The code you showed specifically used 'family = gaussian', hence my comment. > But temp_ran is not in your model... > > > error in eval(expr, envir, enclos), 'temp_ran' not found > > > When used properly (none of this datam.std$ business), subset > will do > what you want: > > > dd2 <- dredge(lm1, subset = X1) > > dd2 > Global model: lm(formula = y ~ ., data = Cement) > --- > Model selection table >(Int)X X1 X2 X3 X4 k R.sq > Adj.R.sq RSS > 3 52.58 1.4680 0.6623 4 0.9787 > 0.9744 57.90 > > > => Please suggest how to define subset in my case How would I know? I still haven't seen your data. You seem to be mistaken on what is and is not included in your model and you fitted it. What hope do we have...? However, given the model 'mig.stds' from above in this email: > mig.stds <-lm(SummerM_ratio ~ temp_max + evi_mean + topo_var + > topo_mean + coast + Iso_index_0808, > ## now tell R were to find the variables in formula > data = datum.std) > ## If you are fitting a Gaussian GLM it is better fitted with lm() If you want to consider dredged models containing temp_max, then you would do dredge(mig.stds, subset = temp_max) If you want models that contain temp_max and coast, then you'd do dredge(mig.stds, subset = temp_max & coast) or dredge(mig.stds, fixed = ~ temp_max + coast) The bits you include in subset or fixed are the names of your variables that you want in or out of the models. In your case, the names of the variables as input into the model formula. With 'subset' you need to use logical operators (and [&], or [|]) whilst with 'fixed' you can specify a formula of variables that should be included or excluded in the same way you'd write any R formula. But, now having been told this, please note that this is *all* discussed on the ?dredge help page if you bother to read it. I've never used this package, and, OK, I have used R for going on for 11 or 12 years now so am used to reading help pages and understand the language a bit more you perhaps do, but you do seem to be asking questions or running into problems that are all covered by the help pages. > Finally, it would be highly appreciated to recommend any references of > R for a beginner like me. Read the An Introduction to R manual that comes with R or that can be downloaded from the R website/CRAN. Also look at the contributed documentation section on the R website which contains numerous free introductory guides. > Elaine HTH G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] AIC in MuMIn
Thank you. Most of the answers solved the puzzles. Q2 > > I tried to display sub-model with only temp_ran using the code below but > > failed. > > Please kindly suggest the potential failure cause. > > > > code > > > > library(MuMIn) > > datam <-read.csv("c:/migration/Mig_ratio_20100817.csv",header=T, > > row.names=1) > > > mig.stds <-lm(SummerM_ratio ~ temp_max + evi_mean + topo_var + > topo_mean + coast + Iso_index_0808, > ## now tell R were to find the variables in formula > data = datum.std) > ## If you are fitting a Gaussian GLM it is better fitted with lm() > => Please explain why fitted lm is better for GLM. > > > > But temp_ran is not in your model... > > > error in eval(expr, envir, enclos), 'temp_ran' not found > > When used properly (none of this datam.std$ business), subset will do > what you want: > > > dd2 <- dredge(lm1, subset = X1) > > dd2 > Global model: lm(formula = y ~ ., data = Cement) > --- > Model selection table >(Int)X X1 X2 X3 X4 k R.sq Adj.R.sq RSS > 3 52.58 1.4680 0.6623 4 0.9787 0.9744 57.90 > > => Please suggest how to define subset in my case Finally, it would be highly appreciated to recommend any references of R for a beginner like me. Elaine [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] AIC in MuMIn
On Tue, 2010-08-17 at 16:05 +0800, elaine kuo wrote: > Hello, Why did you decide to post the exact same message from two different email addresses?? > I am using package MuMIn to calculate AIC for a full model with 10 > explanatory variables. > Thanks in advance in sharing your experience. > > Q1 > In the AIC list of all models, each model is differentiated by model number. > Please kindly advise if it is possible to > find the corresponding explanatory variable(s) for the model number. Did you read ?dredge and look at the example to see how it worked? I've never used this package and 30 seconds installing it, reading ?dredge and running the very first example gave me the answer! > # Example from Burnham and Anderson (2002), page 100: > data(Cement) > lm1 <- lm(y ~ ., data = Cement) > dd <- dredge(lm1) > subset(dd, delta < 4) Global model: lm(formula = y ~ ., data = Cement) --- Model selection table (Int) XX1 X2X3 X4 k R.sq Adj.R.sq RSS AIC AICc 11 52.58 1.468 0.6623 4 0.9787 0.9744 57.90 64.31 69.31 24 71.65 1.452 0.4161 -0.2365 5 0.9823 0.9764 47.97 63.87 72.44 23 48.19 1.696 0.6569 0.25 5 0.9823 0.9764 48.11 63.90 72.48 13 103.10 1.440 -0.6140 4 0.9725 0.9670 74.76 67.63 72.63 25 111.70 1.052-0.41 -0.6428 5 0.9813 0.9750 50.84 64.62 73.19 delta weight 11 0.000 0.572 24 3.125 0.120 23 3.163 0.118 13 3.322 0.109 25 3.879 0.082 (or make the last line just read dd to print the full table) So this clearly shows which variables (X, X1, etc) are in each model. So, without anything further to go with regarding your problem, does this answer this question? > Q2 error message > I tried to display sub-model with only temp_ran using the code below but > failed. > Please kindly suggest the potential failure cause. > > code > > rm(list=ls()) Please don't do the above; people might run your code and delete their workspace if they aren't looking closely! > library(MuMIn) > datam <-read.csv("c:/migration/Mig_ratio_20100817.csv",header=T, > row.names=1) You do realise this isn't reproducible as I don't have your data file? This is not what the posting guide asks you for. If you can't make the problem appear with available data, then you might need to make *your* data available for us to help. > mig.stds <-glm(datam.std$SummerM_ratio~ > datam.std$temp_max++datam.std$evi_mean+datam.std$topo_var+datam.std$topo_mean+ > datam.std$coast+datam.std$Iso_index_0808,data=datam.std,family=gaussian) Uggg! Ok, first lesson before we go any further; STOP using the data frame name in your model formula. This is WRONG, WRONG, WRONG (Is that clear enough ;-) I don't know where you learnt this, but unlearn it, FAST! If someone taught you this, have a word with them. Your model should be fitted like this: mig.stds <-lm(SummerM_ratio ~ temp_max + evi_mean + topo_var + topo_mean + coast + Iso_index_0808, ## now tell R were to find the variables in formula data = datum.std) ## If you are fitting a Gaussian GLM it is better fitted with lm() > target.model <- dredge(mig.stds, subset=datam.std$temp_ran) But temp_ran is not in your model... > error in eval(expr, envir, enclos), 'temp_ran' not found Which would explain the error: > dd2 <- dredge(lm1, subset = FOO) Error in eval(expr, envir, enclos) : object 'FOO' not found When used properly (none of this datam.std$ business), subset will do what you want: > dd2 <- dredge(lm1, subset = X1) > dd2 Global model: lm(formula = y ~ ., data = Cement) --- Model selection table (Int)X X1 X2 X3 X4 k R.sq Adj.R.sq RSS 3 52.58 1.4680 0.6623 4 0.9787 0.9744 57.90 10 71.65 1.4520 0.4161 -0.2365 5 0.9823 0.9764 47.97 9 48.19 1.6960 0.6569 0.2500 5 0.9823 0.9764 48.11 5 103.10 1.4400-0.6140 4 0.9725 0.9670 74.76 11 111.70 1.0520-0.4100 -0.6428 5 0.9813 0.9750 50.84 6 52.53 0.15510 1.4670 0.6410 5 0.9798 0.9731 54.86 8 105.00 -0.16160 1.4380-0.6379 5 0.9735 0.9647 71.91 15 62.41 1.5510 0.5102 0.1019 -0.1441 6 0.9824 0.9736 47.86 12 47.68 -0.05068 1.7240 0.6632 0.2801 6 0.9824 0.9735 47.93 13 70.97 0.01981 1.4520 0.4221 -0.2282 6 0.9823 0.9735 47.94 14 111.50 0.13500 0.9923-0.4747 -0.6273 6 0.9818 0.9727 49.44 16 59.10 -0.01831 1.5930 0.5447 0.1453 -0.1124 7 0.9824 0.9698 47.85 2 71.93 1.51200 1.73004 0.6843 0.6212 857.40 1 81.48 1.86903 0.5339 0.4916 1266.00 4 72.35 2.3120 0.4945 4 0.5482 0.4578 1227.00 7 82.99 1.93300 1.0250-0.7430 5 0.7048 0.6064 801.80 > Q3 showing sub-models for two assigned explanatory variables > The manual only explains how to exc
[R] AIC in MuMIn
Hello, I am using package MuMIn to calculate AIC for a full model with 10 explanatory variables. Thanks in advance in sharing your experience. Q1 In the AIC list of all models, each model is differentiated by model number. Please kindly advise if it is possible to find the corresponding explanatory variable(s) for the model number. Q2 error message I tried to display sub-model with only temp_ran using the code below but failed. Please kindly suggest the potential failure cause. code rm(list=ls()) library(MuMIn) datam <-read.csv("c:/migration/Mig_ratio_20100817.csv",header=T, row.names=1) mig.stds <-glm(datam.std$SummerM_ratio~ datam.std$temp_max++datam.std$evi_mean+datam.std$topo_var+datam.std$topo_mean+datam.std$coast+datam.std$Iso_index_0808,data=datam.std,family=gaussian) target.model <- dredge(mig.stds, subset=datam.std$temp_ran) error in eval(expr, envir, enclos), 'temp_ran' not found Q3 showing sub-models for two assigned explanatory variables The manual only explains how to exclude two variables. Please advise how to contain submodels regarding certain two or more variables. H [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] AIC in MuMIn
Hello, I am using package MuMIn to calculate AIC for a full model with 10 explanatory variables. Thanks in advance in sharing your experience. Q1 In the AIC list of all models, each model is differentiated by model number. Please kindly advise if it is possible to find the corresponding explanatory variable(s) for the model number. Q2 error message I tried to display sub-model with only temp_ran using the code below but failed. Please kindly suggest the potential failure cause. code rm(list=ls()) library(MuMIn) datam <-read.csv("c:/migration/Mig_ratio_20100817.csv",header=T, row.names=1) mig.stds <-glm(datam.std$SummerM_ratio~ datam.std$temp_max++datam.std$evi_mean+datam.std$topo_var+datam.std$topo_mean+datam.std$coast+datam.std$Iso_index_0808,data=datam.std,family=gaussian) target.model <- dredge(mig.stds, subset=datam.std$temp_ran) error in eval(expr, envir, enclos), 'temp_ran' not found Q3 showing sub-models for two assigned explanatory variables The manual only explains how to exclude two variables. Please advise how to contain submodels regarding certain two or more variables. Elaine [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.