Re: [R] Efficient swapping
Thanks a lot, Ista! I really appreciate it. How about a slightly different case as the following: set.seed(1) (tmp <- data.frame(x = 1:10, R1 = sample(LETTERS[1:5], 10, replace = TRUE), R2 = sample(LETTERS[2:6], 10, replace = TRUE))) x R1 R2 1 C B 2 B B 3 C E 4 E C 5 E B 6 D E 7 E E 8 D F 9 C D 10 A E Notice that the factor levels between the two factors, R1 and R2, slide by one level; that is, factor R1 does not have level F while factor R2 does not have level A. I want to swap the factor levels based on the combined levels of the two factors as shown below: tl <- unique(c(levels(tmp$R1), levels(tmp$R2))) for(ii in 1:dim(tmp)[1]) { kk <- which(tl %in% tmp[ii,'R2'], arr.ind = TRUE) - which(tl %in% tmp[ii,'R1'], arr.ind = TRUE) if(kk%%2!=0) { # swap the their levels between the two factors qq <- tmp[ii,]$R1 tmp[ii,]$R1 <- tmp[ii,]$R2 tmp[ii,]$R2 <- qq } } How to go about this case? Thanks! On Thu, Jul 6, 2017 at 5:16 PM, Ista Zahn <istaz...@gmail.com> wrote: > How about > > foo <- with(list(r1 = tmp$R1, > r2 = tmp$R2, > swapme = (as.numeric(tmp$R1) - as.numeric(tmp$R2)) %% 2 != > 0), > { > tmp[swapme, "R1"] <- r2[swapme] > tmp[swapme, "R2"] <- r1[swapme] > tmp > }) > > Best, > Ista > > On Thu, Jul 6, 2017 at 4:06 PM, Gang Chen <gangch...@gmail.com> wrote: >> Suppose that we have the following dataframe: >> >> set.seed(1) >> (tmp <- data.frame(x = 1:10, R1 = sample(LETTERS[1:5], 10, replace = >> TRUE), R2 = sample(LETTERS[1:5], 10, replace = TRUE))) >> >> x R1 R2 >> 1 1 B B >> 2 2 B A >> 3 3 C D >> 4 4 E B >> 5 5 B D >> 6 6 E C >> 7 7 E D >> 8 8 D E >> 9 9 D B >> 10 10 A D >> >> I want to do the following: if the difference between the level index >> of factor R1 and that of factor R2 is an odd number, the levels of the >> two factors need to be switched between them, which can be performed >> through the following code: >> >> for(ii in 1:dim(tmp)[1]) { >>kk <- which(levels(tmp$R2) %in% tmp[ii,'R2'], arr.ind = TRUE) - >> which(levels(tmp$R1) %in% tmp[ii,'R1'], arr.ind = TRUE) >>if(kk%%2!=0) { # swap the their levels between the two factors >> qq <- tmp[ii,]$R1 >> tmp[ii,]$R1 <- tmp[ii,]$R2 >> tmp[ii,]$R2 <- qq >> } >> } >> >> More concise and efficient way to do this? >> >> Thanks, >> Gang >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Efficient swapping
Suppose that we have the following dataframe: set.seed(1) (tmp <- data.frame(x = 1:10, R1 = sample(LETTERS[1:5], 10, replace = TRUE), R2 = sample(LETTERS[1:5], 10, replace = TRUE))) x R1 R2 1 1 B B 2 2 B A 3 3 C D 4 4 E B 5 5 B D 6 6 E C 7 7 E D 8 8 D E 9 9 D B 10 10 A D I want to do the following: if the difference between the level index of factor R1 and that of factor R2 is an odd number, the levels of the two factors need to be switched between them, which can be performed through the following code: for(ii in 1:dim(tmp)[1]) { kk <- which(levels(tmp$R2) %in% tmp[ii,'R2'], arr.ind = TRUE) - which(levels(tmp$R1) %in% tmp[ii,'R1'], arr.ind = TRUE) if(kk%%2!=0) { # swap the their levels between the two factors qq <- tmp[ii,]$R1 tmp[ii,]$R1 <- tmp[ii,]$R2 tmp[ii,]$R2 <- qq } } More concise and efficient way to do this? Thanks, Gang __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] aggregate
Yes, this works out perfectly! Thanks a lot, David. Have a wonderful day... On Wed, Aug 24, 2016 at 4:24 PM, David L Carlson <dcarl...@tamu.edu> wrote: > This will work, but you should double-check to be certain that CP and > unique(myData[, 3:5]) are in the same order. It will fail if N is not > identical for all rows of the same S-Z combination. > >> CP <- sapply(split(myData, paste0(myData$S, myData$Z)), function(x) > + crossprod(x[, 1], x[, 2])) >> data.frame(CP, unique(myData[, 3:5])) > CP N S Z > S1A 22 2.1 S1 A > S1B 38 2.1 S1 B > S2A 38 3.2 S2 A > S2B 22 3.2 S2 B > > David C > > -Original Message- > From: Gang Chen [mailto:gangch...@gmail.com] > Sent: Wednesday, August 24, 2016 2:51 PM > To: David L Carlson > Cc: r-help mailing list > Subject: Re: [R] aggregate > > Thanks again for patiently offering great help, David! I just learned > dput() and paste0() now. Hopefully this is my last question. > > Suppose a new dataframe is as below (one more numeric column): > > myData <- structure(list(X = c(1, 2, 3, 4, 5, 6, 7, 8), Y = c(8, 7, 6, > 5, 4, 3, 2, 1), N =c(rep(2.1, 4), rep(3.2, 4)), S = structure(c(1L, > 1L, 1L, 1L, 2L, 2L, 2L, 2L > ), .Label = c("S1", "S2"), class = "factor"), Z = structure(c(1L, > 1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("A", "B"), class = "factor")), > .Names = c("X", > "Y", "N", "S", "Z"), row.names = c(NA, -8L), class = "data.frame") > >> myData > > X Y N S Z > 1 1 8 2.1 S1 A > 2 2 7 2.1 S1 A > 3 3 6 2.1 S1 B > 4 4 5 2.1 S1 B > 5 5 4 3.2 S2 A > 6 6 3 3.2 S2 A > 7 7 2 3.2 S2 B > 8 8 1 3.2 S2 B > > Once I obtain the cross product, > >> sapply(split(myData, paste0(myData$S, myData$Z)), function(x) crossprod(x[, >> 1], x[, 2])) > S1A S1B S2A S2B > 22 38 38 22 > > how can I easily add the other 3 columns (N, S, and Z) in a new > dataframe? For S and Z, I can play with the names from the cross > product output, but I have trouble dealing with the numeric column N. > > > > > On Wed, Aug 24, 2016 at 1:07 PM, David L Carlson <dcarl...@tamu.edu> wrote: >> You need to spend some time with a basic R tutorial. Your data is messed up >> because you did not use a simple text editor somewhere along the way. R >> understands ', but not ‘ or ’. The best way to send data to the list is to >> use dput: >> >>> dput(myData) >> structure(list(X = c(1, 2, 3, 4, 5, 6, 7, 8), Y = c(8, 7, 6, >> 5, 4, 3, 2, 1), S = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L >> ), .Label = c("S1", "S2"), class = "factor"), Z = structure(c(1L, >> 1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("A", "B"), class = "factor")), >> .Names = c("X", >> "Y", "S", "Z"), row.names = c(NA, -8L), class = "data.frame") >> >> Combining two labels just requires the paste0() function: >> >>> sapply(split(myData, paste0(myData$S, myData$Z)), function(x) crossprod(x[, >>> 1], x[, 2])) >> S1A S1B S2A S2B >> 22 38 38 22 >> >> David C >> >> -Original Message- >> From: Gang Chen [mailto:gangch...@gmail.com] >> Sent: Wednesday, August 24, 2016 11:56 AM >> To: David L Carlson >> Cc: Jim Lemon; r-help mailing list >> Subject: Re: [R] aggregate >> >> Thanks a lot, David! I want to further expand the operation a little >> bit. With a new dataframe: >> >> myData <- data.frame(X=c(1, 2, 3, 4, 5, 6, 7, 8), Y=c(8, 7, 6, 5, 4, >> 3, 2, 1), S=c(‘S1’, ‘S1’, ‘S1’, ‘S1’, ‘S2’, ‘S2’, ‘S2’, ‘S2’), >> Z=c(‘A’, ‘A’, ‘B’, ‘B’, ‘A’, ‘A’, ‘B’, ‘B’)) >> >>> myData >> >> X Y S Z >> 1 1 8 S1 A >> 2 2 7 S1 A >> 3 3 6 S1 B >> 4 4 5 S1 B >> 5 5 4 S2 A >> 6 6 3 S2 A >> 7 7 2 S2 B >> 8 8 1 S2 B >> >> I would like to obtain the same cross product between columns X and Y, >> but at each combination level of factors S and Z. In other words, the >> cross product would be still performed each two rows in the new >> dataframe myData. How can I achieve that? >> >> On Wed, Aug 24, 2016 at 11:54 AM, David L Carlson <dcarl...@tamu.edu> wrote: >>> Your is fine, but it will be a little simpler if you use sapply() instead: >>> >>>> data.frame(Z=levels(myData$Z), CP=sapply(split(myData, myData$Z), >>> + function(x) crossprod(x[, 1], x[, 2]))) >>> Z CP >>> A A 10 >>> B B 10
Re: [R] aggregate
Thanks again for patiently offering great help, David! I just learned dput() and paste0() now. Hopefully this is my last question. Suppose a new dataframe is as below (one more numeric column): myData <- structure(list(X = c(1, 2, 3, 4, 5, 6, 7, 8), Y = c(8, 7, 6, 5, 4, 3, 2, 1), N =c(rep(2.1, 4), rep(3.2, 4)), S = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L ), .Label = c("S1", "S2"), class = "factor"), Z = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("A", "B"), class = "factor")), .Names = c("X", "Y", "N", "S", "Z"), row.names = c(NA, -8L), class = "data.frame") > myData X Y N S Z 1 1 8 2.1 S1 A 2 2 7 2.1 S1 A 3 3 6 2.1 S1 B 4 4 5 2.1 S1 B 5 5 4 3.2 S2 A 6 6 3 3.2 S2 A 7 7 2 3.2 S2 B 8 8 1 3.2 S2 B Once I obtain the cross product, > sapply(split(myData, paste0(myData$S, myData$Z)), function(x) crossprod(x[, > 1], x[, 2])) S1A S1B S2A S2B 22 38 38 22 how can I easily add the other 3 columns (N, S, and Z) in a new dataframe? For S and Z, I can play with the names from the cross product output, but I have trouble dealing with the numeric column N. On Wed, Aug 24, 2016 at 1:07 PM, David L Carlson <dcarl...@tamu.edu> wrote: > You need to spend some time with a basic R tutorial. Your data is messed up > because you did not use a simple text editor somewhere along the way. R > understands ', but not ‘ or ’. The best way to send data to the list is to > use dput: > >> dput(myData) > structure(list(X = c(1, 2, 3, 4, 5, 6, 7, 8), Y = c(8, 7, 6, > 5, 4, 3, 2, 1), S = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L > ), .Label = c("S1", "S2"), class = "factor"), Z = structure(c(1L, > 1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("A", "B"), class = "factor")), .Names > = c("X", > "Y", "S", "Z"), row.names = c(NA, -8L), class = "data.frame") > > Combining two labels just requires the paste0() function: > >> sapply(split(myData, paste0(myData$S, myData$Z)), function(x) crossprod(x[, >> 1], x[, 2])) > S1A S1B S2A S2B > 22 38 38 22 > > David C > > -Original Message- > From: Gang Chen [mailto:gangch...@gmail.com] > Sent: Wednesday, August 24, 2016 11:56 AM > To: David L Carlson > Cc: Jim Lemon; r-help mailing list > Subject: Re: [R] aggregate > > Thanks a lot, David! I want to further expand the operation a little > bit. With a new dataframe: > > myData <- data.frame(X=c(1, 2, 3, 4, 5, 6, 7, 8), Y=c(8, 7, 6, 5, 4, > 3, 2, 1), S=c(‘S1’, ‘S1’, ‘S1’, ‘S1’, ‘S2’, ‘S2’, ‘S2’, ‘S2’), > Z=c(‘A’, ‘A’, ‘B’, ‘B’, ‘A’, ‘A’, ‘B’, ‘B’)) > >> myData > > X Y S Z > 1 1 8 S1 A > 2 2 7 S1 A > 3 3 6 S1 B > 4 4 5 S1 B > 5 5 4 S2 A > 6 6 3 S2 A > 7 7 2 S2 B > 8 8 1 S2 B > > I would like to obtain the same cross product between columns X and Y, > but at each combination level of factors S and Z. In other words, the > cross product would be still performed each two rows in the new > dataframe myData. How can I achieve that? > > On Wed, Aug 24, 2016 at 11:54 AM, David L Carlson <dcarl...@tamu.edu> wrote: >> Your is fine, but it will be a little simpler if you use sapply() instead: >> >>> data.frame(Z=levels(myData$Z), CP=sapply(split(myData, myData$Z), >> + function(x) crossprod(x[, 1], x[, 2]))) >> Z CP >> A A 10 >> B B 10 >> >> David C >> >> >> -Original Message- >> From: Gang Chen [mailto:gangch...@gmail.com] >> Sent: Wednesday, August 24, 2016 10:17 AM >> To: David L Carlson >> Cc: Jim Lemon; r-help mailing list >> Subject: Re: [R] aggregate >> >> Thank you all for the suggestions! Yes, I'm looking for the cross >> product between the two columns of X and Y. >> >> A follow-up question: what is a nice way to merge the output of >> >> lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2])) >> >> with the column Z in myData so that I would get a new dataframe as the >> following (the 2nd column is the cross product between X and Y)? >> >> Z CP >> A 10 >> B 10 >> >> Is the following legitimate? >> >> data.frame(Z=levels(myData$Z), CP= unlist(lapply(split(myData, >> myData$Z), function(x) crossprod(x[, 1], x[, 2] >> >> >> On Wed, Aug 24, 2016 at 10:37 AM, David L Carlson <dcarl...@tamu.edu> wrote: >>> Thank you for the reproducible example, but it is not clear what cross >>> product you want. Jim's solution gives you the cross product of the >>> 2-column matrix with itse
Re: [R] aggregate
Thanks a lot, David! I want to further expand the operation a little bit. With a new dataframe: myData <- data.frame(X=c(1, 2, 3, 4, 5, 6, 7, 8), Y=c(8, 7, 6, 5, 4, 3, 2, 1), S=c(‘S1’, ‘S1’, ‘S1’, ‘S1’, ‘S2’, ‘S2’, ‘S2’, ‘S2’), Z=c(‘A’, ‘A’, ‘B’, ‘B’, ‘A’, ‘A’, ‘B’, ‘B’)) > myData X Y S Z 1 1 8 S1 A 2 2 7 S1 A 3 3 6 S1 B 4 4 5 S1 B 5 5 4 S2 A 6 6 3 S2 A 7 7 2 S2 B 8 8 1 S2 B I would like to obtain the same cross product between columns X and Y, but at each combination level of factors S and Z. In other words, the cross product would be still performed each two rows in the new dataframe myData. How can I achieve that? On Wed, Aug 24, 2016 at 11:54 AM, David L Carlson <dcarl...@tamu.edu> wrote: > Your is fine, but it will be a little simpler if you use sapply() instead: > >> data.frame(Z=levels(myData$Z), CP=sapply(split(myData, myData$Z), > + function(x) crossprod(x[, 1], x[, 2]))) > Z CP > A A 10 > B B 10 > > David C > > > -Original Message- > From: Gang Chen [mailto:gangch...@gmail.com] > Sent: Wednesday, August 24, 2016 10:17 AM > To: David L Carlson > Cc: Jim Lemon; r-help mailing list > Subject: Re: [R] aggregate > > Thank you all for the suggestions! Yes, I'm looking for the cross > product between the two columns of X and Y. > > A follow-up question: what is a nice way to merge the output of > > lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2])) > > with the column Z in myData so that I would get a new dataframe as the > following (the 2nd column is the cross product between X and Y)? > > Z CP > A 10 > B 10 > > Is the following legitimate? > > data.frame(Z=levels(myData$Z), CP= unlist(lapply(split(myData, > myData$Z), function(x) crossprod(x[, 1], x[, 2] > > > On Wed, Aug 24, 2016 at 10:37 AM, David L Carlson <dcarl...@tamu.edu> wrote: >> Thank you for the reproducible example, but it is not clear what cross >> product you want. Jim's solution gives you the cross product of the 2-column >> matrix with itself. If you want the cross product between the columns you >> need something else. The aggregate function will not work since it will >> treat the columns separately: >> >>> A <- as.matrix(myData[myData$Z=="A", 1:2]) >>> A >> X Y >> 1 1 4 >> 2 2 3 >>> crossprod(A) # Same as t(A) %*% A >>X Y >> X 5 10 >> Y 10 25 >>> crossprod(A[, 1], A[, 2]) # Same as t(A[, 1] %*% A[, 2] >> [,1] >> [1,] 10 >>> >>> # For all the groups >>> lapply(split(myData, myData$Z), function(x) crossprod(as.matrix(x[, 1:2]))) >> $A >>X Y >> X 5 10 >> Y 10 25 >> >> $B >>X Y >> X 25 10 >> Y 10 5 >> >>> lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2])) >> $A >> [,1] >> [1,] 10 >> >> $B >> [,1] >> [1,] 10 >> >> - >> David L Carlson >> Department of Anthropology >> Texas A University >> College Station, TX 77840-4352 >> >> >> -Original Message- >> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Jim Lemon >> Sent: Tuesday, August 23, 2016 6:02 PM >> To: Gang Chen; r-help mailing list >> Subject: Re: [R] aggregate >> >> Hi Gang Chen, >> If I have the right idea: >> >> for(zval in levels(myData$Z)) >> crossprod(as.matrix(myData[myData$Z==zval,c("X","Y")])) >> >> Jim >> >> On Wed, Aug 24, 2016 at 8:03 AM, Gang Chen <gangch...@gmail.com> wrote: >>> This is a simple question: With a dataframe like the following >>> >>> myData <- data.frame(X=c(1, 2, 3, 4), Y=c(4, 3, 2, 1), Z=c('A', 'A', 'B', >>> 'B')) >>> >>> how can I get the cross product between X and Y for each level of >>> factor Z? My difficulty is that I don't know how to deal with the fact >>> that crossprod() acts on two variables in this case. >>> >>> __ >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] aggregate
Thank you all for the suggestions! Yes, I'm looking for the cross product between the two columns of X and Y. A follow-up question: what is a nice way to merge the output of lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2])) with the column Z in myData so that I would get a new dataframe as the following (the 2nd column is the cross product between X and Y)? Z CP A 10 B 10 Is the following legitimate? data.frame(Z=levels(myData$Z), CP= unlist(lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2] On Wed, Aug 24, 2016 at 10:37 AM, David L Carlson <dcarl...@tamu.edu> wrote: > Thank you for the reproducible example, but it is not clear what cross > product you want. Jim's solution gives you the cross product of the 2-column > matrix with itself. If you want the cross product between the columns you > need something else. The aggregate function will not work since it will treat > the columns separately: > >> A <- as.matrix(myData[myData$Z=="A", 1:2]) >> A > X Y > 1 1 4 > 2 2 3 >> crossprod(A) # Same as t(A) %*% A >X Y > X 5 10 > Y 10 25 >> crossprod(A[, 1], A[, 2]) # Same as t(A[, 1] %*% A[, 2] > [,1] > [1,] 10 >> >> # For all the groups >> lapply(split(myData, myData$Z), function(x) crossprod(as.matrix(x[, 1:2]))) > $A >X Y > X 5 10 > Y 10 25 > > $B >X Y > X 25 10 > Y 10 5 > >> lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2])) > $A > [,1] > [1,] 10 > > $B > [,1] > [1,] 10 > > - > David L Carlson > Department of Anthropology > Texas A University > College Station, TX 77840-4352 > > > -Original Message- > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Jim Lemon > Sent: Tuesday, August 23, 2016 6:02 PM > To: Gang Chen; r-help mailing list > Subject: Re: [R] aggregate > > Hi Gang Chen, > If I have the right idea: > > for(zval in levels(myData$Z)) > crossprod(as.matrix(myData[myData$Z==zval,c("X","Y")])) > > Jim > > On Wed, Aug 24, 2016 at 8:03 AM, Gang Chen <gangch...@gmail.com> wrote: >> This is a simple question: With a dataframe like the following >> >> myData <- data.frame(X=c(1, 2, 3, 4), Y=c(4, 3, 2, 1), Z=c('A', 'A', 'B', >> 'B')) >> >> how can I get the cross product between X and Y for each level of >> factor Z? My difficulty is that I don't know how to deal with the fact >> that crossprod() acts on two variables in this case. >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] aggregate
This is a simple question: With a dataframe like the following myData <- data.frame(X=c(1, 2, 3, 4), Y=c(4, 3, 2, 1), Z=c('A', 'A', 'B', 'B')) how can I get the cross product between X and Y for each level of factor Z? My difficulty is that I don't know how to deal with the fact that crossprod() acts on two variables in this case. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Subtraction with aggregate
Hi Jim and Jeff, Thanks for the quick help! Sorry I didn't state the question clearly: I want the difference between 'neutral' and 'negative' for each subject. And another person offered a solution for it: aggregate(cbind(QM, yi) ~ subject, data = mydata, FUN = diff) On Thu, Jul 28, 2016 at 4:53 PM, jim holtman <jholt...@gmail.com> wrote: > Not sure what you mean by "nice way", but here is a dplyr solution: > >> library(dplyr) >> mydata <- read.table(text = "subject QMemotion yi > +s1 75.1017 neutral -75.928276 > +s2 -47.3512 neutral -178.295990 > +s3 -68.9016 neutral -134.753906 > +s1 17.2099 negative -104.168312 > +s2 -53.1114 negative -182.373474 > +s3 -33.0322 negative -137.420410", header = TRUE) >> agg <- mydata %>% > + group_by(subject) %>% > + summarise(QM = mean(QM), > + yi = mean(yi) > + ) >> >> >> agg > # A tibble: 3 x 3 > subject QM yi > > 1 s1 46.1558 -90.04829 > 2 s2 -50.2313 -180.33473 > 3 s3 -50.9669 -136.08716 > > > > Jim Holtman > Data Munger Guru > > What is the problem that you are trying to solve? > Tell me what you want to do, not how you want to do it. > > On Thu, Jul 28, 2016 at 4:40 PM, Gang Chen <gangch...@gmail.com> wrote: >> >> With the following data in data.frame: >> >> subject QMemotion yi >> s1 75.1017 neutral -75.928276 >> s2 -47.3512 neutral -178.295990 >> s3 -68.9016 neutral -134.753906 >> s1 17.2099 negative -104.168312 >> s2 -53.1114 negative -182.373474 >> s3 -33.0322 negative -137.420410 >> >> I can obtain the average between the two emotions with >> >> mydata <- read.table('clipboard', header=TRUE) >> aggregate(mydata[,c('yi', 'QM')], by=list(subject=mydata$subject), mean) >> >> My question is, what is a nice way to get the difference between the >> two emotions? >> >> Thanks, >> Gang >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Subtraction with aggregate
With the following data in data.frame: subject QMemotion yi s1 75.1017 neutral -75.928276 s2 -47.3512 neutral -178.295990 s3 -68.9016 neutral -134.753906 s1 17.2099 negative -104.168312 s2 -53.1114 negative -182.373474 s3 -33.0322 negative -137.420410 I can obtain the average between the two emotions with mydata <- read.table('clipboard', header=TRUE) aggregate(mydata[,c('yi', 'QM')], by=list(subject=mydata$subject), mean) My question is, what is a nice way to get the difference between the two emotions? Thanks, Gang __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem running mvrnorm
I’m running R 3.2.2 on a Linux server (Redhat 4.4.7-16), and having the following problem. It works fine with the following: require('MASS’) var(mvrnorm(n = 1000, rep(0, 2), Sigma=matrix(c(10,3,3,2),2,2))) However, when running the following in a loop with simulated data (Sigma): # Sigma defined somewhere else mvrnorm(n=1000, rep(0, 190), Sigma) I get this opaque message: *** caught illegal operation *** address 0x7fe78f8693d2, cause 'illegal operand' Traceback: 1: eigen(Sigma, symmetric = TRUE) 2: mvrnorm(n = nr, rep(0, NN), Sigma) Possible actions: 1: abort (with core dump, if enabled) 2: normal R exit 3: exit R without saving workspace 4: exit R saving workspace I tried to do core dump (option 1), but it didn’t go anywhere (hanging there forever). I also ran the same code on a Mac, and there was no problem at all. What is causing the problem on the Linux server? In case the variance-covariance matrix ‘Sigma’ is needed, I can provide its definition later. Thanks, Gang [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [FORGED] Problem running mvrnorm
Thanks a lot for the pointer, Rolf! You're correct that E <- eigen(Sigma,symmetric=TRUE) does lead to the same error on the RedHat machine. However, the same computation on my Mac works fine: E <- eigen(Sigma,symmetric=TRUE) E$values [1] 4.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 [19] 2.6 2.6 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 [37] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 [55] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 [73] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 [91] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 [109] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 [127] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 [145] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 [163] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 [181] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 As you can see, all the eigenvalues are truly positive. Is it possible that some numerical library or package is required by eigen() but missing on the Linux server? In case the real Sigma is useful, here is how the matrix Sigma is defined: constrSigma <- function(n, sig) { N <- n*(n-1)/2 Sigma <- diag(N) bgn <- 1 for(ii in (n-1):1) { end <- bgn+(ii-1) Sigma[bgn:end,bgn:end][lower.tri(Sigma[bgn:end,bgn:end])] <- rep(sig, ii*(ii-1)/2) if(ii<(n-1)) { xx <- cbind(rep(sig,ii), diag(ii)*sig) yy <- xx for(jj in 1:(n-1-ii)) { if(jj>1) { xx <- cbind(rep(0, ii), xx) yy <- cbind(xx, yy) } } Sigma[bgn:end,1:(bgn-1)] <- yy } bgn <- end+1 } Sigma[upper.tri(Sigma)] <- t(Sigma)[upper.tri(t(Sigma))] return(Sigma) } Sigma <- constrSigma(20, 0.1) mvrnorm(n=1000, rep(0, 190), Sigma) On Wed, Nov 18, 2015 at 3:56 PM, Rolf Turner <r.tur...@auckland.ac.nz> wrote: > > > I cobbled together a 190 x 190 positive definite matrix Sigma and ran your example. I got a result instantaneously, with no error message. (I'm running Linux; an ancient Fedora 17 system.) > > So the problem is peculiar to your particular Sigma. > > As the error message tells you, the problem comes from doing an eigendecomposition of Sigma. So start your investigation by doing > > E <- eigen(Sigma,symmetric=TRUE) > > Presumably that will lead to the same error. How to get around this error is beyond the scope of my capabilities. > > You *might* get somewhere by using the singular value decomposition > (equivalent for a positive definite matrix) rather than the eigendecomposition. I have the vague notion that the svd is more numerically robust than eigendecomposition. However I might well have that wrong. > > Doing anything in 190 dimensions is bound to be fraught with numeric > peril. > > cheers, > > Rolf Turner > > -- > Technical Editor ANZJS > Department of Statistics > University of Auckland > Phone: +64-9-373-7599 ext. 88276 > > > On 19/11/15 08:28, Gang Chen wrote: >> >> I’m running R 3.2.2 on a Linux server (Redhat 4.4.7-16), and having the >> following problem. >> >> It works fine with the following: >> >> require('MASS’) >> var(mvrnorm(n = 1000, rep(0, 2), Sigma=matrix(c(10,3,3,2),2,2))) >> >> However, when running the following in a loop with simulated data (Sigma): >> >> # Sigma defined somewhere else >> mvrnorm(n=1000, rep(0, 190), Sigma) >> >> I get this opaque message: >> >> *** caught illegal operation *** >> address 0x7fe78f8693d2, cause 'illegal operand' >> >> Traceback: >> 1: eigen(Sigma, symmetric = TRUE) >> 2: mvrnorm(n = nr, rep(0, NN), Sigma) >> >> Possible actions: >> 1: abort (with core dump, if enabled) >> 2: normal R exit >> 3: exit R without saving workspace >> 4: exit R saving workspace >> >> I tried to do core dump (option 1), but it didn’t go anywhere (hanging >> there forever). I also ran the same code on a Mac, and there was no problem >> at all. What is causing the problem on the Linux server? In case the >> variance-covariance matrix ‘Sigma’ is needed, I can provide its definition >> later. > > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem with Anova() in package car
I'm having some trouble with Anova() in package car. When the model formula is explicitly expressed: library('nlme') library('car') fm - lme(distance ~ age + Sex, data = Orthodont, random = ~ 1) Anova() works fine: Anova(fm) However, if the model formula is scanned from an external source: myModel - as.formula(distance ~ age + Sex) fm2 - lme(myModel, data = Orthodont, random = ~ 1) I get the following error: Anova(fm2) Error: object of type 'symbol' is not subsettable How to resolve the situation when the model formula is defined externally? Thanks, Gang __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with Anova() in package car
Dear John, Thanks a lot for the quick response and fix! I'm looking forward to try out the development version. I assume that the fix will be released in the official version at some point. Thanks again, Gang On Tue, Jan 13, 2015 at 5:06 PM, John Fox j...@mcmaster.ca wrote: Dear Gang, The problem was in the model.matrix.lme() method provided by the car package, and is now fixed in the development version of the car package on R-Forge. You should be able to install it from there via install.packages(car, repos=http://R-Forge.R-project.org;) after the package is next built on R-Forge, usually in a day or so. Best, John -Original Message- From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Gang Chen Sent: January-13-15 1:48 PM To: r-help Subject: [R] Problem with Anova() in package car I'm having some trouble with Anova() in package car. When the model formula is explicitly expressed: library('nlme') library('car') fm - lme(distance ~ age + Sex, data = Orthodont, random = ~ 1) Anova() works fine: Anova(fm) However, if the model formula is scanned from an external source: myModel - as.formula(distance ~ age + Sex) fm2 - lme(myModel, data = Orthodont, random = ~ 1) I get the following error: Anova(fm2) Error: object of type 'symbol' is not subsettable How to resolve the situation when the model formula is defined externally? Thanks, Gang __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. --- This email has been checked for viruses by Avast antivirus software. http://www.avast.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] String manipulation
I want to do the following: if a string does not contain a colon (:), no change is needed; if it contains one or more colons, break the string into multiple strings using the colon as a separator. For example, happy: becomes happy : :sad turns to : sad and happy:sad changes to happy : sad How to do this? Thanks, Gang __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Failure with .Rprofile on Mac OS X
Thanks for the help, Amos! The only reason that *should* happen is if there's an .Rprofile in the directory you're in when you start R. There is only one .Rprofil, which is in my home directory ~/ Where *exactly* is the .Rprofile file you want loaded The only one is in my home directory. what directory are you starting from It does not matter where I start R because on my Mac the CLI R is linked to /Library/Frameworks/R.framework/Resources/bin/R while the GUI R is linked to /Applications/R.app/Contents/MacOS/R what does R say is the user's home directory? Did you make *any* changes to Rprofile.site, or Renviron? R.home() [1] /Library/Frameworks/R.framework/Resources What is the output from Sys.getenv() in gui and cli, and do they differ? They differ slightly. I have trouble pinpointing the exact difference because the format is a little different and vim does not help much in tracking the differences. I just noticed that the CLI version has a few extra terms such as COLUMNS 130 ... DYLD_FALLBACK_LIBRARY_PATH /Library/Frameworks/R.framework/Resources/lib Thanks, Gang On Thu, Sep 18, 2014 at 7:04 PM, Amos B. Elberg amos.elb...@gmail.com wrote: The only reason that *should* happen is if there's an .Rprofile in the directory you're in when you start R. Where *exactly* is the .Rprofile file you want loaded, what directory are you starting from, and what does R say is the user's home directory? Did you make *any* changes to Rprofile.site, or Renviron? What is the output from Sys.getenv() in gui and cli, and do they differ? On Sep 18, 2014, at 11:18 AM, Gang Chen gangch...@gmail.com wrote: When R starts in GUI (e.g., /Applications/R.app/Contents/MacOS/R) on my Mac OS X 10.7.5, the startup configuration in .Rprofile works fine. However, when R starts on the terminal (e.g., /Library/Frameworks/R.framework/Resources/bin/R), it does not work at all. What could be the reason for the failure? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Failure with .Rprofile on Mac OS X
When R starts in GUI (e.g., /Applications/R.app/Contents/MacOS/R) on my Mac OS X 10.7.5, the startup configuration in .Rprofile works fine. However, when R starts on the terminal (e.g., /Library/Frameworks/R.framework/Resources/bin/R), it does not work at all. What could be the reason for the failure? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Issue with formula conversion
A random effect formulation for R package nlme is read in as a string of characters from an input file: ranEff - pdCompSymm(~1+Age) I need to convert 'ranEff' to a formula class. However, as shown below: as.formula(ranEff) ~1 + Age the pdCompSymm is lost in the conversion. Any solutions? Thanks! Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Issue with formula conversion
Thanks for the help! However, I just need to get pdCompSymm(~1 + Age) without a tilde (~) at the beginning. On Wed, Aug 27, 2014 at 3:34 PM, David Winsemius dwinsem...@comcast.net wrote: On Aug 27, 2014, at 11:19 AM, Gang Chen wrote: A random effect formulation for R package nlme is read in as a string of characters from an input file: ranEff - pdCompSymm(~1+Age) I need to convert 'ranEff' to a formula class. However, as shown below: as.formula(ranEff) ~1 + Age the pdCompSymm is lost in the conversion. Any solutions? as.formula(paste(~,ranEff)) ~pdCompSymm(~1 + Age) -- David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Issue with formula conversion
Good point! Here is an example: library(nlme) fm - lme(yield ~ nitro, data=Oats, random=list(Block=pdComSymm(~Variety-1))) Now the problem I'm facing is that the following part pdComSymm(~Variety-1) is read in as a string of characters from an external source: ranEff - 'pdComSymm(~Variety-1)' The following (ranEff1 - as.formula(ranEff)) ~Variety - 1 is not what I want. Even though fm - lme(yield ~ nitro, data=Oats, random=list(Block=pdCompSymm(ranEff1))) works, I don't know the 'pdCompSymm' part in advance and would like to make the process automatic. On Wed, Aug 27, 2014 at 3:49 PM, David Winsemius dwinsem...@comcast.net wrote: On Aug 27, 2014, at 12:44 PM, Gang Chen wrote: Thanks for the help! However, I just need to get pdCompSymm(~1 + Age) That's not a formula in the R sense of the word. You should do a better job of posting a use case. Perhaps you want an expression? -- David. without a tilde (~) at the beginning. On Wed, Aug 27, 2014 at 3:34 PM, David Winsemius dwinsem...@comcast.net wrote: On Aug 27, 2014, at 11:19 AM, Gang Chen wrote: A random effect formulation for R package nlme is read in as a string of characters from an input file: ranEff - pdCompSymm(~1+Age) I need to convert 'ranEff' to a formula class. However, as shown below: as.formula(ranEff) ~1 + Age the pdCompSymm is lost in the conversion. Any solutions? as.formula(paste(~,ranEff)) ~pdCompSymm(~1 + Age) -- David Winsemius Alameda, CA, USA David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Issue with formula conversion
Sorry for the misspelling! And more importantly, thanks a lot for the nice solution and for the quick help! On Wed, Aug 27, 2014 at 4:22 PM, David Winsemius dwinsem...@comcast.net wrote: On Aug 27, 2014, at 1:11 PM, Gang Chen wrote: Good point! Here is an example: library(nlme) fm - lme(yield ~ nitro, data=Oats, random=list(Block=pdComSymm(~Variety-1))) One problem is that youa re misspelling the function name. Now the problem I'm facing is that the following part pdComSymm(~Variety-1) is read in as a string of characters from an external source: ranEff - 'pdComSymm(~Variety-1)' The following (ranEff1 - as.formula(ranEff)) ~Variety - 1 is not what I want. Even though fm - lme(yield ~ nitro, data=Oats, random=list(Block=pdCompSymm(ranEff1))) ranEff - 'pdCompSymm(~Variety-1)' tt - parse(text=ranEff) # Returns an unevaluated expression fm - lme(yield ~ nitro, data=Oats, random=list(Block=eval(tt) )) fm Linear mixed-effects model fit by REML Data: Oats Log-restricted-likelihood: -296.5209 Fixed: yield ~ nitro (Intercept) nitro 81.8722273.7 snipped rest of output. -- David works, I don't know the 'pdCompSymm' part in advance and would like to make the process automatic. On Wed, Aug 27, 2014 at 3:49 PM, David Winsemius dwinsem...@comcast.net wrote: On Aug 27, 2014, at 12:44 PM, Gang Chen wrote: Thanks for the help! However, I just need to get pdCompSymm(~1 + Age) That's not a formula in the R sense of the word. You should do a better job of posting a use case. Perhaps you want an expression? -- David. without a tilde (~) at the beginning. On Wed, Aug 27, 2014 at 3:34 PM, David Winsemius dwinsem...@comcast.net wrote: On Aug 27, 2014, at 11:19 AM, Gang Chen wrote: A random effect formulation for R package nlme is read in as a string of characters from an input file: ranEff - pdCompSymm(~1+Age) I need to convert 'ranEff' to a formula class. However, as shown below: as.formula(ranEff) ~1 + Age the pdCompSymm is lost in the conversion. Any solutions? as.formula(paste(~,ranEff)) ~pdCompSymm(~1 + Age) -- David Winsemius Alameda, CA, USA David Winsemius Alameda, CA, USA David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Mapping from one vector to another
Suppose I have the following dataframe: L4 - LETTERS[1:4] fac - sample(L4, 10, replace = TRUE) (d - data.frame(x = 1, y = 1:10, fac = fac)) x y fac 1 1 1 B 2 1 2 B 3 1 3 D 4 1 4 A 5 1 5 C 6 1 6 D 7 1 7 C 8 1 8 B 9 1 9 B 10 1 10 B I'd like to add another column 'var' that is defined based on the following mapping of column 'fac': A - 8 B - 11 C - 3 D - 2 How can I achieve this in an elegant way (with a generic approach for any length)? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Mapping from one vector to another
Thanks a lot for the quick and elegant solutions, Sarah, Bill and Petr! I really appreciate it, including the suggestion of setting a random seed. Have a nice day! Gang On Thu, Jul 17, 2014 at 11:15 AM, Sarah Goslee sarah.gos...@gmail.com wrote: What about: d$var - c(8, 11, 3, 2)[d$fac] Side note: it's much appreciated that you included data and a clear problem statement. If you use set.seed(123) before your call to sample(), everyone who tries it will get the same fac that you do. Otherwise we all get something different. Or just generate your own example data and use dput() to include it in your email. Sarah On Thu, Jul 17, 2014 at 11:00 AM, Gang Chen gangch...@gmail.com wrote: Suppose I have the following dataframe: L4 - LETTERS[1:4] fac - sample(L4, 10, replace = TRUE) (d - data.frame(x = 1, y = 1:10, fac = fac)) x y fac 1 1 1 B 2 1 2 B 3 1 3 D 4 1 4 A 5 1 5 C 6 1 6 D 7 1 7 C 8 1 8 B 9 1 9 B 10 1 10 B I'd like to add another column 'var' that is defined based on the following mapping of column 'fac': A - 8 B - 11 C - 3 D - 2 How can I achieve this in an elegant way (with a generic approach for any length)? Thanks, Gang -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Mapping from one vector to another
Jeff, Even though the solutions from the previous responders are good enough for my current situation, the principle you just raised will be definitely beneficial to your future work. Thanks a lot for sharing the insights! Gang On Thu, Jul 17, 2014 at 12:06 PM, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote: You ask about generic methods for introducing alternate values for factors, and some of the other responses address this quite efficiently. However, a factor has meaning only within one vector at a time, since another vector may have additional values or missing values relative to the first vector. For example, you used the sample function which is not guaranteed to select at least one of each of the four letters in L4. Or, what if the data has values the mapping doesn't address? For any work in which I am dealing with categorical data in multiple places (e.g. your d data frame and whatever data structure you use to define your mapping) I prefer NOT to work with factors until all of my categories of data are moved into one vector (typically a column in a data frame). Rather, I work with character vectors during the data manipulation phase and only convert to factor when I start analyzing or displaying the data. With this in mind, I use a general flow something like: d - data.frame( x = 1, y = 1:10, fac = fac, stringsAsFactors=FALSE ) mp - data.frame( fac=LETTERS[1:4], value=c(8,11,3,2) ) d2 - merge( d, mp, all.x=TRUE ) d2$fac - factor( d2$fac ) # optional If you actually are in the analysis phase and are not pulling data from multiple external sources, then you may have already confirmed the completeness and range of values you have to work with then one of the other more efficient methods may still be a better choice for this specific task. Hadley Wickham's tidy data [1] principles address this concern more thoroughly than I have. [1] Google this phrase... paper seems to be a work in progress. On Thu, 17 Jul 2014, Gang Chen wrote: Suppose I have the following dataframe: L4 - LETTERS[1:4] fac - sample(L4, 10, replace = TRUE) (d - data.frame(x = 1, y = 1:10, fac = fac)) x y fac 1 1 1 B 2 1 2 B 3 1 3 D 4 1 4 A 5 1 5 C 6 1 6 D 7 1 7 C 8 1 8 B 9 1 9 B 10 1 10 B I'd like to add another column 'var' that is defined based on the following mapping of column 'fac': A - 8 B - 11 C - 3 D - 2 How can I achieve this in an elegant way (with a generic approach for any length)? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Display a dataframe
Perfect! Thanks a lot! On July 3, 2014 5:10:02 PM EDT, David L Carlson dcarl...@tamu.edu wrote: Not elegant, but it works: term - dimnames(dd)[[1]] dd1 - dd dimnames(dd1)[[1]] - rep(, 6) dd2 - capture.output(dd1) cat(paste(dd2,, c(Term, term)), fill=48) # Chisq DF Pr(Chisq) Term 153.0216306 1 7.578366e-35 # Sex 13.3696538 1 5.114571e-04 # Volume 0.8476713 1 7.144239e-01 # Weight 1.2196050 1 5.388764e-01 # Intensity 2.6349405 1 2.090719e-01 # ISO 6.0507714 1 2.780045e-02 # SEC David Carlson -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Gang Chen Sent: Thursday, July 3, 2014 2:56 PM To: r-help Subject: [R] Display a dataframe I have a matrix 'dd' defined as below: dd - t(matrix(c(153.0216306, 1, 7.578366e-35, 13.3696538, 1, 5.114571e-04, 0.8476713, 1, 7.144239e-01, 1.2196050, 1, 5.388764e-01, 2.6349405, 1, 2.090719e-01, 6.0507714, 1, 2.780045e-02), nrow=3, ncol=6)) dimnames(dd)[[2]] - c('# Chisq', 'DF', 'Pr(Chisq)') dimnames(dd)[[1]] - c('# Sex', '# Volume', '# Weight', '# Intensity', '# ISO', '# SEC') 'dd' displays as the following: # Chisq DF Pr(Chisq) # Sex 153.0216306 1 7.578366e-35 # Volume 13.3696538 1 5.114571e-04 # Weight 0.8476713 1 7.144239e-01 # Intensity 1.2196050 1 5.388764e-01 # ISO 2.6349405 1 2.090719e-01 # SEC 6.0507714 1 2.780045e-02 I would like to display it as: # Chisq DF Pr(Chisq)term 153.0216306 1 7.578366e-35# Sex 13.3696538 1 5.114571e-04 # Volume 0.8476713 1 7.144239e-01# Weight 1.2196050 1 5.388764e-01# Intensity 2.6349405 1 2.090719e-01# ISO 6.0507714 1 2.780045e-02# SEC This is what I came up with (cc - data.frame(data.frame(dd), term=dimnames(dd)[[1]])) X..Chisq DF Pr..Chisq.term # Sex 153.0216306 1 7.578366e-35 # Sex # Volume 13.3696538 1 5.114571e-04# Volume # Weight 0.8476713 1 7.144239e-01# Weight # Intensity 1.2196050 1 5.388764e-01 # Intensity # ISO 2.6349405 1 2.090719e-01 # ISO # SEC 6.0507714 1 2.780045e-02 # SEC But I'm not happy with the following two issues: 1) How to get rid of the row names? 2) The special characters of #, (, ,) in the column names are not displayed correctly. Any suggestions? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Display a dataframe
I really your kind help! This is exactly what I was looking for except that I need to get rid of the numbered row names. On July 3, 2014 9:57:00 PM EDT, arun smartpink...@yahoo.com wrote: Hi, May be this helps: nC - max(nchar(row.names(dd)))  term - formatC(row.names(dd), width=-nC) #or  term - sprintf(%-11s, row.names(dd))  dd1 - setNames(data.frame(unname(dd), term,stringsAsFactors=F), c(colnames(dd), formatC(term,width=-nC))) dd1 #     # Chisq DF  Pr(Chisq) term      #1 153.0216306 1 7.578366e-35 # Sex     #2 13.3696538 1 5.114571e-04 # Volume  #3  0.8476713 1 7.144239e-01 # Weight  #4  1.2196050 1 5.388764e-01 # Intensity #5  2.6349405 1 2.090719e-01 # ISO     #6  6.0507714 1 2.780045e-02 # SEC     A.K. On Thursday, July 3, 2014 3:57 PM, Gang Chen gangch...@gmail.com wrote: I have a matrix 'dd' defined as below: dd - t(matrix(c(153.0216306, 1, 7.578366e-35, 13.3696538, 1, 5.114571e-04, 0.8476713, 1, 7.144239e-01, 1.2196050, 1, 5.388764e-01, 2.6349405, 1, 2.090719e-01, 6.0507714, 1, 2.780045e-02), nrow=3, ncol=6)) dimnames(dd)[[2]] - c('# Chisq', 'DF', 'Pr(Chisq)') dimnames(dd)[[1]] - c('# Sex', '# Volume', '# Weight', '# Intensity', '# ISO', '# SEC') 'dd' displays as the following:         # Chisq DF Pr(Chisq) # Sex   153.0216306 1 7.578366e-35 # Volume  13.3696538 1 5.114571e-04 # Weight   0.8476713 1 7.144239e-01 # Intensity 1.2196050 1 5.388764e-01 # ISO    2.6349405 1 2.090719e-01 # SEC    6.0507714 1 2.780045e-02 I would like to display it as: # Chisq       DF Pr(Chisq)            term 153.0216306 1 7.578366e-35              # Sex 13.3696538 1 5.114571e-04               # Volume 0.8476713 1 7.144239e-01                # Weight 1.2196050 1 5.388764e-01                # Intensity 2.6349405 1 2.090719e-01                # ISO 6.0507714 1 2.780045e-02                # SEC This is what I came up with (cc - data.frame(data.frame(dd), term=dimnames(dd)[[1]]))        X..Chisq DF Pr..Chisq.    term # Sex   153.0216306 1 7.578366e-35   # Sex # Volume  13.3696538 1 5.114571e-04  # Volume # Weight   0.8476713 1 7.144239e-01  # Weight # Intensity 1.2196050 1 5.388764e-01 # Intensity # ISO    2.6349405 1 2.090719e-01   # ISO # SEC    6.0507714 1 2.780045e-02   # SEC But I'm not happy with the following two issues: 1) How to get rid of the row names? 2) The special characters of #, (, ,) in the column names are not displayed correctly. Any suggestions? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Display a dataframe
I have a matrix 'dd' defined as below: dd - t(matrix(c(153.0216306, 1, 7.578366e-35, 13.3696538, 1, 5.114571e-04, 0.8476713, 1, 7.144239e-01, 1.2196050, 1, 5.388764e-01, 2.6349405, 1, 2.090719e-01, 6.0507714, 1, 2.780045e-02), nrow=3, ncol=6)) dimnames(dd)[[2]] - c('# Chisq', 'DF', 'Pr(Chisq)') dimnames(dd)[[1]] - c('# Sex', '# Volume', '# Weight', '# Intensity', '# ISO', '# SEC') 'dd' displays as the following: # Chisq DF Pr(Chisq) # Sex 153.0216306 1 7.578366e-35 # Volume 13.3696538 1 5.114571e-04 # Weight 0.8476713 1 7.144239e-01 # Intensity 1.2196050 1 5.388764e-01 # ISO 2.6349405 1 2.090719e-01 # SEC 6.0507714 1 2.780045e-02 I would like to display it as: # Chisq DF Pr(Chisq)term 153.0216306 1 7.578366e-35# Sex 13.3696538 1 5.114571e-04 # Volume 0.8476713 1 7.144239e-01# Weight 1.2196050 1 5.388764e-01# Intensity 2.6349405 1 2.090719e-01# ISO 6.0507714 1 2.780045e-02# SEC This is what I came up with (cc - data.frame(data.frame(dd), term=dimnames(dd)[[1]])) X..Chisq DF Pr..Chisq.term # Sex 153.0216306 1 7.578366e-35 # Sex # Volume 13.3696538 1 5.114571e-04# Volume # Weight 0.8476713 1 7.144239e-01# Weight # Intensity 1.2196050 1 5.388764e-01 # Intensity # ISO 2.6349405 1 2.090719e-01 # ISO # SEC 6.0507714 1 2.780045e-02 # SEC But I'm not happy with the following two issues: 1) How to get rid of the row names? 2) The special characters of #, (, ,) in the column names are not displayed correctly. Any suggestions? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Any refit function available for 'car' package?
Suppose that I need to run a multivariate linear model Y = X B + E many times with the same model matrix X but each time with different response matrix Y. Is there a function available in 'car' package similar to refit() in lme4 package so that the model matrix X would not be reassembled each time? Also, runtime can be saved without repeatedly performing the same matrix computations such as (X'X)^(-1)X'. Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Change factor levels
Suppose I have a dataframe 'd' defined as L3 - LETTERS[1:3] d0 - data.frame(cbind(x = 1, y = 1:10), fac = sample(L3, 10, replace = TRUE)) (d - d0[d0$fac %in% c('A', 'B'),]) x y fac 2 1 2 B 3 1 3 A 4 1 4 A 5 1 5 A 6 1 6 B 8 1 8 A Even though factor 'fac' in 'd' only has 2 levels, but it seems to bear the birthmark of 3 levels from its parent 'd0': str(d) 'data.frame': 6 obs. of 3 variables: $ x : num 1 1 1 1 1 1 $ y : num 2 3 4 5 6 8 $ fac: Factor w/ 3 levels A,B,C: 2 1 1 1 2 1 How can I cut the umbilical cord so that factor 'fac' in 'd' would have an accurate birth certificate with the correct number of levels? Apparently the following does not work: levels(d$fac) - c('A', 'B') Also any reason for this heritage? Thanks, Gang [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Change factor levels
Thanks both Uwe and Daniel for the great help! On Sat, Dec 14, 2013 at 3:30 PM, Dániel Kehl ke...@ktk.pte.hu wrote: Dear Gang, this seem to solve your problem. http://stackoverflow.com/questions/1195826/dropping-factor-levels-in-a-subsetted-data-frame-in-r best daniel Feladó: r-help-boun...@r-project.org [r-help-boun...@r-project.org] ; meghatalmaz#243;: Gang Chen [gangch...@gmail.com] Küldve: 2013. december 14. 21:09 To: r-help Tárgy: [R] Change factor levels Suppose I have a dataframe 'd' defined as L3 - LETTERS[1:3] d0 - data.frame(cbind(x = 1, y = 1:10), fac = sample(L3, 10, replace = TRUE)) (d - d0[d0$fac %in% c('A', 'B'),]) x y fac 2 1 2 B 3 1 3 A 4 1 4 A 5 1 5 A 6 1 6 B 8 1 8 A Even though factor 'fac' in 'd' only has 2 levels, but it seems to bear the birthmark of 3 levels from its parent 'd0': str(d) 'data.frame': 6 obs. of 3 variables: $ x : num 1 1 1 1 1 1 $ y : num 2 3 4 5 6 8 $ fac: Factor w/ 3 levels A,B,C: 2 1 1 1 2 1 How can I cut the umbilical cord so that factor 'fac' in 'd' would have an accurate birth certificate with the correct number of levels? Apparently the following does not work: levels(d$fac) - c('A', 'B') Also any reason for this heritage? Thanks, Gang [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] dataframe manipulation
Suppose I have a dataframe defined as L3 - LETTERS[1:3] (d - data.frame(cbind(x = 1, y = 1:10), fac = sample(L3, 10, replace = TRUE))) x y fac 1 1 1 C 2 1 2 A 3 1 3 B 4 1 4 C 5 1 5 B 6 1 6 B 7 1 7 A 8 1 8 A 9 1 9 B 10 1 10 A I want to extract those rows that are the first occurrences for each level of factor 'fac', which are basically the first three rows above. How can I achieve that? The real dataframe is more complicated than the example above, and I can't simply list all the levels of factor 'fac' by exhaustibly listing all the levels like the following d[d$fac=='A' | d$fac=='B' | d$fac=='C', ] Thanks, Gang [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] dataframe manipulation
Perfect! Thanks a lot, A.K! On Fri, Dec 13, 2013 at 4:21 PM, arun smartpink...@yahoo.com wrote: Hi, Try: d[match(unique(d$fac),d$fac),] A.K. On Friday, December 13, 2013 4:17 PM, Gang Chen gangch...@gmail.com wrote: Suppose I have a dataframe defined as L3 - LETTERS[1:3] (d - data.frame(cbind(x = 1, y = 1:10), fac = sample(L3, 10, replace = TRUE))) x y fac 1 1 1 C 2 1 2 A 3 1 3 B 4 1 4 C 5 1 5 B 6 1 6 B 7 1 7 A 8 1 8 A 9 1 9 B 10 1 10 A I want to extract those rows that are the first occurrences for each level of factor 'fac', which are basically the first three rows above. How can I achieve that? The real dataframe is more complicated than the example above, and I can't simply list all the levels of factor 'fac' by exhaustibly listing all the levels like the following d[d$fac=='A' | d$fac=='B' | d$fac=='C', ] Thanks, Gang [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] dataframe manipulation
Another neat solution! Thanks a lot, Sarah! On Fri, Dec 13, 2013 at 4:35 PM, Sarah Goslee sarah.gos...@gmail.comwrote: What about: lapply(levels(d$fac), function(x)head(d[d$fac == x,], 1)) Thanks for the reproducible example. If you put set.seed(123) before the call to sample, then everyone who tries it will get the same data frame d. Sarah On Fri, Dec 13, 2013 at 4:15 PM, Gang Chen gangch...@gmail.com wrote: Suppose I have a dataframe defined as L3 - LETTERS[1:3] (d - data.frame(cbind(x = 1, y = 1:10), fac = sample(L3, 10, replace = TRUE))) x y fac 1 1 1 C 2 1 2 A 3 1 3 B 4 1 4 C 5 1 5 B 6 1 6 B 7 1 7 A 8 1 8 A 9 1 9 B 10 1 10 A I want to extract those rows that are the first occurrences for each level of factor 'fac', which are basically the first three rows above. How can I achieve that? The real dataframe is more complicated than the example above, and I can't simply list all the levels of factor 'fac' by exhaustibly listing all the levels like the following d[d$fac=='A' | d$fac=='B' | d$fac=='C', ] Thanks, Gang -- Sarah Goslee http://www.functionaldiversity.org [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to capture the printout on the screen?
This is most likely a silly question. First I run the following: require(car) mod.ok - lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, post.1, post.2, post.3, post.4, post.5, fup.1, fup.2, fup.3, fup.4, fup.5) ~ treatment*gender, data=OBrienKaiser) phase - factor(rep(c(pretest, posttest, followup), c(5, 5, 5)), levels=c(pretest, posttest, followup)) hour - ordered(rep(1:5, 3)) idata - data.frame(phase, hour) (fm - linearHypothesis(mod.ok, c(treatment1, treatment2), idata=idata, idesign=~phase*hour, iterms=phase:hour)) I get the output like the following (only the last part at the end is shown below): ... Multivariate Tests: Df test stat approx F num Df den Df Pr(F) Pillai2 0.6623840 0.2475987 16 8 0.99155 Wilks 2 0.4460357 0.1864957 16 6 0.99668 Hotelling-Lawley 2 0.9988990 0.1248624 16 4 0.99904 Roy 2 0.5792977 0.2896488 8 4 0.93605 I want to capture the data frame shown above, but it's not part of 'fm': str(fm) I guess the print method for an object of class Anova.mlm (print.Anova.mlm) generates the above data frame. My question is: How to capture it? I mean, how can I store the output on the screen and then extract the data frame? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to capture the printout on the screen?
Thanks a lot for the pointer! After I downloaded the source code and saw the innards of the print function, I know what to do now. Thanks again, Gang On Wed, Jul 17, 2013 at 6:00 PM, Greg Snow 538...@gmail.com wrote: If you are happy with the character strings printed out then you can use the capture.output function. If you want the numbers (without needing to convert) then look at the print method for the class of object that you are working with and see what the code is. It is probably calling another function to produce the output and you could just call that function directly. If not, you can copy that section of the code into your own function to call and have it return the object rather than printing. On Wed, Jul 17, 2013 at 3:38 PM, Gang Chen gangch...@gmail.com wrote: This is most likely a silly question. First I run the following: require(car) mod.ok - lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, post.1, post.2, post.3, post.4, post.5, fup.1, fup.2, fup.3, fup.4, fup.5) ~ treatment*gender, data=OBrienKaiser) phase - factor(rep(c(pretest, posttest, followup), c(5, 5, 5)), levels=c(pretest, posttest, followup)) hour - ordered(rep(1:5, 3)) idata - data.frame(phase, hour) (fm - linearHypothesis(mod.ok, c(treatment1, treatment2), idata=idata, idesign=~phase*hour, iterms=phase:hour)) I get the output like the following (only the last part at the end is shown below): ... Multivariate Tests: Df test stat approx F num Df den Df Pr(F) Pillai2 0.6623840 0.2475987 16 8 0.99155 Wilks 2 0.4460357 0.1864957 16 6 0.99668 Hotelling-Lawley 2 0.9988990 0.1248624 16 4 0.99904 Roy 2 0.5792977 0.2896488 8 4 0.93605 I want to capture the data frame shown above, but it's not part of 'fm': str(fm) I guess the print method for an object of class Anova.mlm (print.Anova.mlm) generates the above data frame. My question is: How to capture it? I mean, how can I store the output on the screen and then extract the data frame? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Puzzling Execution halted
I wrote an R program that does heavy computations with hundreds of lines of code. It's running fine both interactively and in batch mode on a Mac OS X computer. The program also has no problem running on a Linux system (Fedora 14) interactively. However, when I try it on the terminal in batch mode on the Linux system, it chokes in the middle of the computation with the Execution halted error message. I already put try or tryCatch in those places where computation may throw an error. And the warnings are set in default (options(warn=1)). I wish I could provide the code for help, but that seems impractical. How to debug this? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Puzzling Execution halted
Thanks for the pointer! Unfortunately it doesn't look like a memory issue: /var/log/messages contains nothing about memory problems. And the Linux system has enough RAM for this computation. Gang On Tue, Nov 20, 2012 at 1:03 PM, Milan Bouchet-Valat nalimi...@club.fr wrote: Le mardi 20 novembre 2012 à 12:54 -0500, Gang Chen a écrit : I wrote an R program that does heavy computations with hundreds of lines of code. It's running fine both interactively and in batch mode on a Mac OS X computer. The program also has no problem running on a Linux system (Fedora 14) interactively. However, when I try it on the terminal in batch mode on the Linux system, it chokes in the middle of the computation with the Execution halted error message. I already put try or tryCatch in those places where computation may throw an error. And the warnings are set in default (options(warn=1)). I wish I could provide the code for help, but that seems impractical. How to debug this? Have a look at your /var/log/messages: it might well be the out-of-memory killer killing your R process because it eats too much RAM... My two cents __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Package for comparing sensitivity, specificity, PPV, NPV, and accuracy?
Hi, I have two sets of sensitivity, specificity, positive predictive value, and negative predictive value, and accuracy from two tests on the same subjects. Is there an R package that does such paired comparisons? Thanks, Gang Chen __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Setting up infile for R CMD BATCH
Hi Elai, yes, the approach works out pretty well. Thanks a lot for spending time on this and for the great help! Gang On Wed, Feb 8, 2012 at 5:43 PM, ilai ke...@math.montana.edu wrote: I'm going to declare this SOLVED. Yes, if you don't want a separate script for batch, you will need to modify the original script so it either readline or skips it. Here is an example: # Save in file myTest.R # Add this local function to the beginning of your original program and replace all the readline prompts with myreadline. # The new function takes on two arguments: # what: the original object (in your example it was type-...) # prompt: The same string from readline # All it is doing is searching for Answers.R, sourcing if available or prompting if not. myreadline - function(what,prompt){ ans - length(grep('Answers.R',list.files(),fixed=T))0 # add a warning for multiple files if(ans) source('Answers.R') else{ ret - as.integer(readline(prompt)) assign(what,ret,1) } } # here is an interactive program to square only negative values print(x - rnorm(1)) myreadline('type','x0 ? 1:T,2:F \n') print(x^type) ### END myTest.R Running myTest interactivly: source('myTest.R') [1] -0.3712215 x0 ? 1:T,2:F 2 [1] 0.1378054 # -.37^2 source('myTest.R') [1] 0.3160747 x0 ? 1:T,2:F 1 [1] 0.3160747 # .316^1 # Create a list of answers dump('type',file='Answers.R') # run again this time with answers available source('myTest.R') [1] -1.088665 # skips prompt [1] -1.088665 # -1.088^1 (type in Answer.R ==1) # Now you can also run as batch $ R CMD BATCH myTest.R out.R $ cat out.R # ... print(x - rnorm(1)) [1] -1.248487 myreadline('type','x0 ? 1:T,2:F \n') print(x^type) [1] -1.248487 That's it. Only thing is to keep the file names in the working directory straight. Enjoy On Wed, Feb 8, 2012 at 12:39 PM, Gang Chen gangch...@gmail.com wrote: Sorry Elai for the confusions. Let me try to reframe my predicament. The main program myTest.R has been written in interactive mode with many readline() lines embedded. Suppose a user has already run the program once before in interactive mode with all the answers saved in a text file called answer.R. Now s/he does not want to go through the interactive again because it's a tedious process, and would like to run the program but in batch mode with answer.R. And that's why I tried the following which didn't pan out: R CMD BATCH answer.R output.Rout Of couse I could rewrite a separate program specifically for batch mode as you suggested previously with eval(), for example. However, is there an approach to keeping the original program so that the user could run both interactive and batch mode? That probably requires modifying the readline() part, but how? Thanks, Gang On Wed, Feb 8, 2012 at 2:08 PM, ilai ke...@math.montana.edu wrote: Gang, Maybe someone here has a different take on things. I'm afraid I have no more insights on this unless you explain exactly what you are trying to achieve, or more importantly why? That may help understand what the problem really is. Do you want to save an interactive session for future runs? then ?save.image and all your answers are in the environment. In this case consider putting an if(!exists('type') | length(type)1 | is.na(type)) before type- readline(...) in your script so type wouldn't be overwritten in subsequent runs. If your goal is to batch evaluate multiple answer files from users (why else would you ask questions with readline?), then you should have enough to go on with my answer and the examples in ?eval. Elai On Wed, Feb 8, 2012 at 9:04 AM, Gang Chen gangch...@gmail.com wrote: Hi Elai, Thanks a lot for the suggestions! I really appreciate it... Your suggestion of using eval() and creating those answers in a list would work, but there is no alternative to readline() with which I could read the input in batch mode? I'm asking this because I'd like to have the program work in both interactive and batch mode. Thanks again, Gang On Wed, Feb 8, 2012 at 12:50 AM, ilai ke...@math.montana.edu wrote: Ahh, I think I'm getting it now. Well, readlines() is not going to work for you. The help file ?readline clearly states In non-interactive use the result is as if the response was RETURN and the value is ‘’. The implication is you cannot use it to insert different answers as if you were really there. How about using eval() instead? You will need to make the answers a named list (or just assigned objects). test - expression({ if(a2) print('+') else print('I got more') b - b+3 # reassign b in the environment print(b) print(c) d^2 }) dump('test',file='myTest.R') ; rm(test) # make the answers.R file: a=5 ; b=2 ; c=2 ; d=3 source(myTest.R) eval(test) # Now, from the terminal R CMD BATCH answers.R out.R # And here is my $ cat out.R ... flushed a=5 ; b=2 ; c=2 ; d=3
Re: [R] Setting up infile for R CMD BATCH
Sorry Elai for the confusions. Let me try to reframe my predicament. The main program myTest.R has been written in interactive mode with many readline() lines embedded. Suppose a user has already run the program once before in interactive mode with all the answers saved in a text file called answer.R. Now s/he does not want to go through the interactive again because it's a tedious process, and would like to run the program but in batch mode with answer.R. And that's why I tried the following which didn't pan out: R CMD BATCH answer.R output.Rout Of couse I could rewrite a separate program specifically for batch mode as you suggested previously with eval(), for example. However, is there an approach to keeping the original program so that the user could run both interactive and batch mode? That probably requires modifying the readline() part, but how? Thanks, Gang On Wed, Feb 8, 2012 at 2:08 PM, ilai ke...@math.montana.edu wrote: Gang, Maybe someone here has a different take on things. I'm afraid I have no more insights on this unless you explain exactly what you are trying to achieve, or more importantly why? That may help understand what the problem really is. Do you want to save an interactive session for future runs? then ?save.image and all your answers are in the environment. In this case consider putting an if(!exists('type') | length(type)1 | is.na(type)) before type- readline(...) in your script so type wouldn't be overwritten in subsequent runs. If your goal is to batch evaluate multiple answer files from users (why else would you ask questions with readline?), then you should have enough to go on with my answer and the examples in ?eval. Elai On Wed, Feb 8, 2012 at 9:04 AM, Gang Chen gangch...@gmail.com wrote: Hi Elai, Thanks a lot for the suggestions! I really appreciate it... Your suggestion of using eval() and creating those answers in a list would work, but there is no alternative to readline() with which I could read the input in batch mode? I'm asking this because I'd like to have the program work in both interactive and batch mode. Thanks again, Gang On Wed, Feb 8, 2012 at 12:50 AM, ilai ke...@math.montana.edu wrote: Ahh, I think I'm getting it now. Well, readlines() is not going to work for you. The help file ?readline clearly states In non-interactive use the result is as if the response was RETURN and the value is ‘’. The implication is you cannot use it to insert different answers as if you were really there. How about using eval() instead? You will need to make the answers a named list (or just assigned objects). test - expression({ if(a2) print('+') else print('I got more') b - b+3 # reassign b in the environment print(b) print(c) d^2 }) dump('test',file='myTest.R') ; rm(test) # make the answers.R file: a=5 ; b=2 ; c=2 ; d=3 source(myTest.R) eval(test) # Now, from the terminal R CMD BATCH answers.R out.R # And here is my $ cat out.R ... flushed a=5 ; b=2 ; c=2 ; d=3 source(myTest.R) eval(test) [1] + [1] 5 [1] 2 [1] 9 proc.time() user system elapsed 0.640 0.048 0.720 Would this work? Elai On Tue, Feb 7, 2012 at 4:05 PM, Gang Chen gangch...@gmail.com wrote: Suppose I create an R program called myTest.R with only one line like the following: type - as.integer(readline(input type (1: type1; 2: type2)? )) Then I'd like to run myTest.R in batch mode by constructing an input file called answers.R with the following: source(myTest.R) 1 When I ran the following at the terminal: R CMD BATCH answer.R output.Rout it failed to pick up the answer '1' from the 2nd line in answers.R as shown inside output.Rout: source(myTest.R) input type (0: quit; 1: type1; 2: type2)? 1 [1] 1 What am I missing here? Thanks in advance, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Setting up infile for R CMD BATCH
Suppose I create an R program called myTest.R with only one line like the following: type - as.integer(readline(input type (1: type1; 2: type2)? )) Then I'd like to run myTest.R in batch mode by constructing an input file called answers.R with the following: source(myTest.R) 1 When I ran the following at the terminal: R CMD BATCH answer.R output.Rout it failed to pick up the answer '1' from the 2nd line in answers.R as shown inside output.Rout: source(myTest.R) input type (0: quit; 1: type1; 2: type2)? 1 [1] 1 What am I missing here? Thanks in advance, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Setting up infile for R CMD BATCH
Thanks for the help. You're not missing anything. In your output.Rout: the 1 right after the source('test') is the 1 inputed from answers.R. the [1] 1 is the result of test. Remove the second line from answers.R and see what happens (hint: script ends after the readline prompt). That number '1' at the 2nd line was meant to be the answer for the readline() part, but apparently it does not worked as I intended. If the script ends right after the readline prompt as you suggested, my question is, how can I feed in this answer '1' in the infile 'answers.R'? Just out of curiosity, why will you use a script that requires user input (readlines) in batch mode ? I know this sounds contradictory, but my intention is that, in case the user has all the answers available from a previous run in interactive mode, s/he may try out the batch mode the next time. On Tue, Feb 7, 2012 at 4:05 PM, Gang Chen gangch...@gmail.com wrote: Suppose I create an R program called myTest.R with only one line like the following: type - as.integer(readline(input type (1: type1; 2: type2)? )) Then I'd like to run myTest.R in batch mode by constructing an input file called answers.R with the following: source(myTest.R) 1 When I ran the following at the terminal: R CMD BATCH answer.R output.Rout it failed to pick up the answer '1' from the 2nd line in answers.R as shown inside output.Rout: source(myTest.R) input type (0: quit; 1: type1; 2: type2)? 1 [1] 1 What am I missing here? Thanks in advance, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Wide to long form conversion
Dennis, Yes, I was wrong! The variables should be: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F/J: within-subject factor (2 levels: F / J) 5) Time: within-subject factor (2 levels: 1 / 2) 6) Value So your mData3 is what I was looking for. Thanks a lot! Gang On Thu, Oct 6, 2011 at 9:09 PM, Dennis Murphy djmu...@gmail.com wrote: Hi: I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F: within-subject factor (2 levels: F1 / F2) 5) J: within-subject factor (2 levels: J1 / J2) I don't see how you can get all of 3-5 given the way your data is structured. The problem is that each column contains two of the three variables you want, but not all three. I can see a way to get Subj Group Ref Time F J S1 s Me 1 4 5 S1 s Me 2 3 6 S1 s She 1 6 10 S1 s She 2 6 9 or an 8 line version with Ref (4 Me, 4 She), Factor (F1, J1, F2, J2) repeated twice and the appropriate response vector, but not a way where you have three columns for Ref, F and J. For example, what is the 'J' for MeF1 or the F for SheJ2? With that, here are a few stabs using the reshape2 package. The first step is to do a little renaming of your data frame so that one can use the colsplit() function to generate a new set of variables. names(myData) - c(Group, Me_F_1, Me_J_1, She_F_1, She_J_1, Me_F_2, Me_J_2, She_F_2, She_J_2, Subj) library('plyr') library('reshape2') # collapses the eight columns to be reshaped into a factor named # variable with a corresponding variable named value mData - melt(myData, id = c('Subj', 'Group')) head(mData) # Split the original variables into three new columns, named # Ref, Var and Time, respectively: newvars - colsplit(mData$variable, '_', c('Ref', 'Var', 'Time')) # Append these to the melted data frame and remove 'variable' mData2 - cbind(mData, newvars)[, -3] # This comes closest to your original intent: mData3 - arrange(mData2, Subj, Ref, Var, Time) head(mData3, 8) Subj Group value Ref Var Time 1 S1 s 4 Me F 1 2 S1 s 3 Me F 2 3 S1 s 5 Me J 1 4 S1 s 6 Me J 2 5 S1 s 6 She F 1 6 S1 s 6 She F 2 7 S1 s 10 She J 1 8 S1 s 9 She J 2 # Some rearrangements to consider: mData4 - cast(mData3, Subj + Group + Ref + Time ~ Var, value_var = 'value') head(mData4, 4) Subj Group Ref Time F J 1 S1 s Me 1 4 5 2 S1 s Me 2 3 6 3 S1 s She 1 6 10 4 S1 s She 2 6 9 mData5 - cast(mData3, Subj + Group + Ref + Var ~ Time, value_var = 'value') head(mData5, 4) Subj Group Ref Var 1 2 1 S1 s Me F 4 3 2 S1 s Me J 5 6 3 S1 s She F 6 6 4 S1 s She J 10 9 If you like this one, it's probably a good idea to rename the last two columns 'Time1' and 'Time2' or something similar. HTH, Dennis On Thu, Oct 6, 2011 at 1:28 PM, Gang Chen gangch...@gmail.com wrote: I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F: within-subject factor (2 levels: F1 / F2) 5) J: within-subject factor (2 levels: J1 / J2) As this is the first time I'm learning such a conversion, could someone help me out? Many thanks, Gang myData Group MeF1 MeJ1 SheF1 SheJ1 MeF2 MeJ2 SheF2 SheJ2 Subj 1 s 4 5 6 10 3 6 6 9 S1 2 s 6 5 5 6 4 3 5 6 S2 3 s 7 4 6 5 7 4 5 3 S3 4 s 8 5 8 7 7 1 8 6 S4 5 s 10 6 4 7 9 6 4 6 S5 6 s 5 2 4 7 4 1 4 2 S6 7 s 13 2 10 4 11 2 4 3 S7 8 s 8 1 3 11 6 0 3 10 S8 9 s 6 9 5 8 6 8 5 6 S9 10 s 14 5 6 10 13 5 5 10 S10 11 s 15 2 18 2 14 1 18 2 S11 12 s 6 9 4 9 5 11 3 8 S12 13 s 5 5 0 12 4 3 0 8 S13 14 s 5 6 4 9 4 6 2 6 S14 15 s 14 5 12 3 12 3 11 3 S15 16 s 7 2 11 3 5 2 10 2 S16 17 s 1 7 4 5 1 6 3 5 S17 18 s 6 2 7 4 6 2 7
Re: [R] Wide to long form conversion
Jim, I really appreciate your help! I like the power of rep_n_stack, but how can I use rep_n_stack to get the following result? Subj Group value Ref Var Time 1S1 s 4 Me F1 2S1 s 3 Me F2 3S1 s 5 Me J1 4S1 s 6 Me J2 5S1 s 6 She F1 6S1 s 6 She F2 7S1 s10 She J1 8S1 s 9 She J2 On Fri, Oct 7, 2011 at 7:16 AM, Jim Lemon j...@bitwrit.com.au wrote: On 10/07/2011 07:28 AM, Gang Chen wrote: I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F: within-subject factor (2 levels: F1 / F2) 5) J: within-subject factor (2 levels: J1 / J2) Hi Gang, I don't know whether this is the format you want, but: library(prettyR) rep_n_stack(mydata,matrix(c(2,3,6,7,4,5,8,9),nrow=2,byrow=TRUE)) Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Wide to long form conversion
David, thanks a lot for the code! I've learned quite a bit from all the generous help... Gang On Fri, Oct 7, 2011 at 1:37 PM, David Winsemius dwinsem...@comcast.net wrote: On Oct 7, 2011, at 1:30 PM, David Winsemius wrote: On Oct 7, 2011, at 7:40 AM, Gang Chen wrote: Jim, I really appreciate your help! I like the power of rep_n_stack, but how can I use rep_n_stack to get the following result? Subj Group value Ref Var Time 1 S1 s 4 Me F 1 2 S1 s 3 Me F 2 3 S1 s 5 Me J 1 4 S1 s 6 Me J 2 5 S1 s 6 She F 1 6 S1 s 6 She F 2 7 S1 s 10 She J 1 8 S1 s 9 She J 2 I was not able to construct a one step solution with `reshape` that will contains all the columns. You can do it in about 4 steps by first making the data long and then adding annotation columns. Using just rows 1 and 26 you might get: reshape(myData[c(1,26), ], idvar=c(Group,Subj), direction=long, varying=2:9, v.names=c(value) ) Group Subj time value s.S1.1 s S1 1 4 w.S26.1 w S26 1 5 s.S1.2 s S1 2 5 w.S26.2 w S26 2 9 s.S1.3 s S1 3 6 w.S26.3 w S26 3 4 s.S1.4 s S1 4 10 w.S26.4 w S26 4 7 s.S1.5 s S1 5 3 w.S26.5 w S26 5 3 s.S1.6 s S1 6 6 w.S26.6 w S26 6 7 s.S1.7 s S1 7 6 w.S26.7 w S26 7 3 s.S1.8 s S1 8 9 w.S26.8 w S26 8 5 The 'time' variable is not really what you wanted but refers to the sequence along the original wide column names You can add the desired Ref, Var and Time columms with these constructions: str(times-rep(c(1,2), length=nrow(myData)*8 ) ) num [1:408] 1 2 1 2 1 2 1 2 1 2 ... str(times-rep(c(F,J), each=2, length=nrow(myData)*8 ) ) chr [1:408] F F J J F F J J F F ... str(times-rep(c(Me,She), each=4, length=nrow(myData)*8 ) ) chr [1:408] Me Me Me Me She She She She ... It occured to me that the ordering operation probably should have preceded teh ancillary column creation so this method is tested: longData - reshape(myData, idvar=c(Group,Subj), direction=long, #fixed the direction argument varying=2:9, v.names=c(value) ) longData - longData[order(longData$Subj), ] longData$Time - rep(c(1,2), length=nrow(myData)*8 ) longData$Var - rep(c(F,J), each=2, length=nrow(myData)*8 ) longData$Ref - rep(c(Me,She), each=4, length=nrow(myData)*8 ) Group Subj time value Time Var Ref s.S1.1 s S1 1 4 1 F Me s.S1.2 s S1 2 5 2 F Me s.S1.3 s S1 3 6 1 J Me s.S1.4 s S1 4 10 2 J Me s.S1.5 s S1 5 3 1 F She s.S1.6 s S1 6 6 2 F She s.S1.7 s S1 7 6 1 J She s.S1.8 s S1 8 9 2 J She Looking at Jim Lemon's response, I think he just misinterpreted the structure of your data but gave you a perfectly usable response. You could have done much the same thing with a minor modification: str(rep_n_stack(myData,matrix(c(2,3,6,7,4,5,8,9),nrow=1,byrow=TRUE))) 'data.frame': 408 obs. of 4 variables: $ Group : Factor w/ 2 levels s,w: 1 1 1 1 1 1 1 1 1 1 ... $ Subj : Factor w/ 51 levels S1,S10,S11,..: 1 12 23 34 45 48 49 50 51 2 ... $ group1: Factor w/ 8 levels Me.F.1,Me.F.2,..: 1 1 1 1 1 1 1 1 1 1 ... $ value1: int 4 6 7 8 10 5 13 8 6 14 ... Now you can just split apart the 'group1' column with sub() to make the three specified columns. Lemon's method has the advantage that it properly carries along the column information -- David. On Fri, Oct 7, 2011 at 7:16 AM, Jim Lemon j...@bitwrit.com.au wrote: On 10/07/2011 07:28 AM, Gang Chen wrote: I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F: within-subject factor (2 levels: F1 / F2) 5) J: within-subject factor (2 levels: J1 / J2) Hi Gang, I don't know whether this is the format you want, but: library(prettyR) rep_n_stack(mydata,matrix(c(2,3,6,7,4,5,8,9),nrow=2,byrow=TRUE)) Jim David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Wide to long form conversion
I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F: within-subject factor (2 levels: F1 / F2) 5) J: within-subject factor (2 levels: J1 / J2) As this is the first time I'm learning such a conversion, could someone help me out? Many thanks, Gang myData Group MeF1 MeJ1 SheF1 SheJ1 MeF2 MeJ2 SheF2 SheJ2 Subj 1 s45 61036 6 9 S1 2 s65 5 643 5 6 S2 3 s74 6 574 5 3 S3 4 s85 8 771 8 6 S4 5 s 106 4 796 4 6 S5 6 s52 4 741 4 2 S6 7 s 13210 4 112 4 3 S7 8 s81 31160 310 S8 9 s69 5 868 5 6 S9 10 s 145 610 135 510 S10 11 s 15218 2 14118 2 S11 12 s69 4 95 11 3 8 S12 13 s55 01243 0 8 S13 14 s56 4 946 2 6 S14 15 s 14512 3 12311 3 S15 16 s7211 35210 2 S16 17 s17 4 516 3 5 S17 18 s62 7 462 7 4 S18 19 s94 8 5 104 6 3 S19 20 s82 6 592 6 4 S20 21 s65 5 766 5 5 S21 22 s88 3 767 5 3 S22 23 s 114 6 711 6 4 S23 24 s63 2 464 2 2 S24 25 s44 6 623 4 6 S25 26 w59 4 737 3 5 S26 27 w76 3 541 0 4 S27 28 w 10414 28410 2 S28 29 w97 5 684 5 3 S29 30 w92 7 562 6 5 S30 31 w67 6 765 5 8 S31 32 w7612 76310 7 S32 33 w 123 8 9 113 4 7 S33 34 w 12210 592 6 3 S34 35 w6310 453 5 3 S35 36 w93 9 963 7 8 S36 37 w5 11 7 74 11 3 4 S37 38 w74 4 673 1 5 S38 39 w65 1 833 0 8 S39 40 w 10310 273 7 2 S40 41 w1 11 7 518 4 3 S41 42 w 105 610 104 3 9 S42 43 w63 9 242 6 0 S43 44 w9511 454 7 3 S44 45 w85 6 384 2 3 S45 46 w84 8 741 2 6 S46 47 w 122 6 2 101 5 2 S47 48 w 106 9 875 7 8 S48 49 w 13615 1 12414 0 S49 50 w78 11247 111 S50 51 w 123 9 491 7 4 S51 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Wide to long form conversion
Thanks for the pointer! I still couldn't figure out how to convert my data because the example at stackoverflow seems to have only one variable (Year) while there are three within-subject variables plus one between-subjects variable. Any further help? Gang On Thu, Oct 6, 2011 at 4:34 PM, Andrew Miles rstuff.mi...@gmail.com wrote: Take a look here. http://stackoverflow.com/questions/2185252/reshaping-data-frame-from-wide-to-long-format Andrew Miles Department of Sociology Duke University On Oct 6, 2011, at 4:28 PM, Gang Chen wrote: I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F: within-subject factor (2 levels: F1 / F2) 5) J: within-subject factor (2 levels: J1 / J2) As this is the first time I'm learning such a conversion, could someone help me out? Many thanks, Gang myData Group MeF1 MeJ1 SheF1 SheJ1 MeF2 MeJ2 SheF2 SheJ2 Subj 1 s 4 5 6 10 3 6 6 9 S1 2 s 6 5 5 6 4 3 5 6 S2 3 s 7 4 6 5 7 4 5 3 S3 4 s 8 5 8 7 7 1 8 6 S4 5 s 10 6 4 7 9 6 4 6 S5 6 s 5 2 4 7 4 1 4 2 S6 7 s 13 2 10 4 11 2 4 3 S7 8 s 8 1 3 11 6 0 3 10 S8 9 s 6 9 5 8 6 8 5 6 S9 10 s 14 5 6 10 13 5 5 10 S10 11 s 15 2 18 2 14 1 18 2 S11 12 s 6 9 4 9 5 11 3 8 S12 13 s 5 5 0 12 4 3 0 8 S13 14 s 5 6 4 9 4 6 2 6 S14 15 s 14 5 12 3 12 3 11 3 S15 16 s 7 2 11 3 5 2 10 2 S16 17 s 1 7 4 5 1 6 3 5 S17 18 s 6 2 7 4 6 2 7 4 S18 19 s 9 4 8 5 10 4 6 3 S19 20 s 8 2 6 5 9 2 6 4 S20 21 s 6 5 5 7 6 6 5 5 S21 22 s 8 8 3 7 6 7 5 3 S22 23 s 11 4 6 7 1 1 6 4 S23 24 s 6 3 2 4 6 4 2 2 S24 25 s 4 4 6 6 2 3 4 6 S25 26 w 5 9 4 7 3 7 3 5 S26 27 w 7 6 3 5 4 1 0 4 S27 28 w 10 4 14 2 8 4 10 2 S28 29 w 9 7 5 6 8 4 5 3 S29 30 w 9 2 7 5 6 2 6 5 S30 31 w 6 7 6 7 6 5 5 8 S31 32 w 7 6 12 7 6 3 10 7 S32 33 w 12 3 8 9 11 3 4 7 S33 34 w 12 2 10 5 9 2 6 3 S34 35 w 6 3 10 4 5 3 5 3 S35 36 w 9 3 9 9 6 3 7 8 S36 37 w 5 11 7 7 4 11 3 4 S37 38 w 7 4 4 6 7 3 1 5 S38 39 w 6 5 1 8 3 3 0 8 S39 40 w 10 3 10 2 7 3 7 2 S40 41 w 1 11 7 5 1 8 4 3 S41 42 w 10 5 6 10 10 4 3 9 S42 43 w 6 3 9 2 4 2 6 0 S43 44 w 9 5 11 4 5 4 7 3 S44 45 w 8 5 6 3 8 4 2 3 S45 46 w 8 4 8 7 4 1 2 6 S46 47 w 12 2 6 2 10 1 5 2 S47 48 w 10 6 9 8 7 5 7 8 S48 49 w 13 6 15 1 12 4 14 0 S49 50 w 7 8 1 12 4 7 1 11 S50 51 w 12 3 9 4 9 1 7 4 S51 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Saving a graph
I read somewhere that vector graphics such as eps or dpf are more favorable than alternatives (jpeg, bmp or png) for publication because vector graphics scale properly when enlarged. However, my problem is that the file generated from a graph of fixed size is too large (in the order of 10MB) because of many data points in multiple scatterplots. Any suggestions? Thanks, Gang [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Confused about a warning message
I define the following function to convert a t-value with degrees of freedom DF to another t-value with different degrees of freedom fullDF: tConvert - function(tval, DF, fullDF) ifelse(DF=1, qt(pt(tval, DF), fullDF), 0) It works as expected with the following case: tConvert(c(2,3), c(10,12), 12) [1] 1.961905 3.00 However, it gives me warning for the example below although the output is still as intended: tConvert(c(2,3), c(0,12), 12) [1] 0 3 Warning message: In pt(q, df, lower.tail, log.p) : NaNs produced I'm confused about the warning especially considering the fact that the following works correctly without such warning: tConvert(2, 0, 12) [1] 0 What am I missing? Thanks, Gang [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Confused about a warning message
Thanks for the help! Are you sure R version plays a role in this case? My R version is 2.13.0 Your suggestion prompted me to look into the help content of ifelse, and a similar example exists there: x - c(6:-4) sqrt(x) #- gives warning sqrt(ifelse(x = 0, x, NA)) # no warning ## Note: the following also gives the warning ! ifelse(x = 0, sqrt(x), NA) Based on the above example, now I have a solution for my situation: tConvert2 - function(tval, DF, fullDF) qt(pt(ifelse(DF=1, tval, 0), ifelse(DF=1, DF, 1)), fullDF) tConvert2(c(2,3), c(0,12), 12) [1] 0 3 However, I feel my solution is a little kludged. Any better idea? Thanks, Gang On Thu, Jul 7, 2011 at 9:04 PM, David Winsemius dwinsem...@comcast.netwrote: On Jul 7, 2011, at 8:52 PM, David Winsemius wrote: On Jul 7, 2011, at 8:47 PM, Gang Chen wrote: I define the following function to convert a t-value with degrees of freedom DF to another t-value with different degrees of freedom fullDF: tConvert - function(tval, DF, fullDF) ifelse(DF=1, qt(pt(tval, DF), fullDF), 0) It works as expected with the following case: tConvert(c(2,3), c(10,12), 12) [1] 1.961905 3.00 However, it gives me warning for the example below although the output is still as intended: tConvert(c(2,3), c(0,12), 12) [1] 0 3 Warning message: In pt(q, df, lower.tail, log.p) : NaNs produced I'm confused about the warning especially considering the fact that the following works correctly without such warning: tConvert(2, 0, 12) [1] 0 What am I missing? The fact that ifelse evaluates both sides of the consequent and alternative. I also think you should update yur R to the most recent version since a current version does not issue that warning. -- David Winsemius, MD West Hartford, CT [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] non-recursive SVAR model with vars
Hi, I have a question about SVAR modeling with the package vars. How does it handle the situation where the A (structural) matrix has a non-recursive structure in the SVAR model? In other words, what kind of algorithm does vars adopt to deal with the unidentifiable issue in a non-recursive model? Thank you, Gang [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Convert dataframe with two factors from wide to long format
I know how to convert a simple dataframe from wide to long format with one varying factor. However, for a dataset with two factors like the following, Subj T1_Cond1 T1_Cond2 T2_Cond1 T2_Cond2 1 0.125869 4.108232 1.099392 5.556614 2 1.427940 2.170026 0.120748 1.176353 How to elegantly convert to a long form as Subj Time Cond Value 1 11 0.125869 1 12 4.108232 1 21 1.099392 ... Thanks in advance! Gang [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Convert dataframe with two factors from wide to long format
Just learned another trick today. Thanks a lot to both of you for the kind help! Gang On Sun, May 22, 2011 at 6:04 PM, Dennis Murphy djmu...@gmail.com wrote: Hi: library(reshape2) d1 - melt(d, id = 'Subj') d1 - cbind(d1, colsplit(d1$variable, '_', c('Time', 'Cond'))) d1 - transform(d1, Time = substr(Time, 2, 2), Cond = substr(Cond, 5, 5))[c(1, 4, 5, 3)] str(d1) d1 You can decide whether to leave Time and Cond as character or to convert them to numeric or factor. HTH, Dennis Assume this is a dataframe named 'tst' require(reshape2) ltest - melt(tst, id.vars =1) dcast(ltest, ubj+substr(variable, 1,2) + substr(variable, 4,8) ~. ) ubj substr(variable, 1, 2) substr(variable, 4, 8) NA 1 1 T1 Cond1 0.125869 2 1 T1 Cond2 4.108232 3 1 T2 Cond1 1.099392 4 1 T2 Cond2 5.556614 5 2 T1 Cond1 1.427940 6 2 T1 Cond2 2.170026 7 2 T2 Cond1 0.120748 8 2 T2 Cond2 1.176353 dcast(ltest, ubj+substr(variable, 1,2) ~ substr(variable, 4,8) ) ubj substr(variable, 1, 2)Cond1Cond2 1 1 T1 0.125869 4.108232 2 1 T2 1.099392 5.556614 3 2 T1 1.427940 2.170026 4 2 T2 0.120748 1.176353 The modifications to get it exactly as requested are left to the reader On Sun, May 22, 2011 at 2:25 PM, Gang Chen gangch...@gmail.com wrote: I know how to convert a simple dataframe from wide to long format with one varying factor. However, for a dataset with two factors like the following, Subj T1_Cond1 T1_Cond2 T2_Cond1 T2_Cond2 1 0.125869 4.108232 1.099392 5.556614 2 1.427940 2.170026 0.120748 1.176353 How to elegantly convert to a long form as Subj Time Cond Value 1 11 0.125869 1 12 4.108232 1 21 1.099392 ... Thanks in advance! Gang [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] A question about data frame
A very simple question. With a data frame like this: n = c(2, 3, 5) s = c(aa, bb, cc) df = data.frame(n, s) I want df$s[1] or df[1,2], but how can I get rid of the extra line in the output about the factor levels: df$s[1] [1] aa Levels: aa bb cc Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] A question about data frame
Thanks! You mean something like: print(df$s[1], max.levels=0) It seems I could also do as.character(df$s[1]) Any other/better solutions? On Thu, Mar 10, 2011 at 4:40 PM, Rob Tirrell r...@stanford.edu wrote: See the max.levels argument in ?print. I think this is what you're looking for. -- Robert Tirrell | r...@stanford.edu | (607) 437-6532 Program in Biomedical Informatics | Butte Lab | Stanford University On Thu, Mar 10, 2011 at 13:35, Gang Chen gangch...@gmail.com wrote: n = c(2, 3, 5) s = c(aa, bb, cc) df = data.frame(n, s) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] A question about data frame
Yes, indeed I wanted them stored as characters instead of factor levels. Thanks a lot for the help, Jim and Phil! Gang On Thu, Mar 10, 2011 at 5:19 PM, Phil Spector spec...@stat.berkeley.edu wrote: Gang - It sounds like you want your character variables to be stored as character values, not factor values. If that's the case, use df = data.frame(n, s,stringsAsFactors=FALSE) If you want them to be factors, but not to display as factors, others have provided usable solutions. - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Thu, 10 Mar 2011, Gang Chen wrote: A very simple question. With a data frame like this: n = c(2, 3, 5) s = c(aa, bb, cc) df = data.frame(n, s) I want df$s[1] or df[1,2], but how can I get rid of the extra line in the output about the factor levels: df$s[1] [1] aa Levels: aa bb cc Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] 3D scatter plot with projections
I want to create some 3D scatter plot with a diagonal line. In addition, I'd like to have those points plus the diagonal line projected to those three planes (xy, yz and xz). Which package can I use to achieve this, scatterplot3d or something else? Thanks, Gang [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 3D scatter plot with projections
Thanks a lot for the quick help! How to project the scatter plot with the diagonal line to the three planes with scatterplot3d? I could not find such an example demonstrating that in the vignette. Thanks, Gang 2011/1/8 Uwe Ligges lig...@statistik.tu-dortmund.de On 08.01.2011 16:38, Gang Chen wrote: I want to create some 3D scatter plot with a diagonal line. In addition, I'd like to have those points plus the diagonal line projected to those three planes (xy, yz and xz). Which package can I use to achieve this, scatterplot3d or something else? Yes, scatterplot3d, rgl, and maybe also others. For looking at it interactively I always prefer rgl, for statical representations (e.g. printing) scatterplot3d can be used with all known R devices. Best, Uwe Ligges Thanks, Gang [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 3D scatter plot with projections
Yes, too bad I didn't realize that it's so simple like that! Thanks... On Sat, Jan 8, 2011 at 12:45 PM, David Winsemius dwinsem...@comcast.netwrote: On Jan 8, 2011, at 11:21 AM, Gang Chen wrote: Thanks a lot for the quick help! How to project the scatter plot with the diagonal line to the three planes with scatterplot3d? I could not find such an example demonstrating that in the vignette. I'm puzzled. If you have (x1, y1, z1) and (x2, y2, z2) as starting and ending points, Deducing the three projected segments , i.e. the starting and ending points of the projections on the xy, xz and yz planes (z = 0, y=0, and x= 0 respectively) would seem to be trivial. So maybe I just don't understand. What part is offering difficulty? Please show your code so far. -- David. Thanks, Gang 2011/1/8 Uwe Ligges lig...@statistik.tu-dortmund.de On 08.01.2011 16:38, Gang Chen wrote: I want to create some 3D scatter plot with a diagonal line. In addition, I'd like to have those points plus the diagonal line projected to those three planes (xy, yz and xz). Which package can I use to achieve this, scatterplot3d or something else? Yes, scatterplot3d, rgl, and maybe also others. For looking at it interactively I always prefer rgl, for statical representations (e.g. printing) scatterplot3d can be used with all known R devices. Best, Uwe Ligges [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Trouble installing gsl wrapper
You nailed it, Prof. Ripley! Thanks a lot... Gang On Sat, Oct 30, 2010 at 2:58 PM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote: On Sat, 30 Oct 2010, Gang Chen wrote: Hi, I'm trying to install the gsl wrapper source code (http://cran.r-project.org/src/contrib/gsl_1.9-8.tar.gz) on a Linux system (OpenSuse 11.1), but encountering the following problem. I've already installed 'gsl' version 1.14 (ftp://ftp.gnu.org/gnu/gsl/gsl-1.14.tar.gz) on the system. What's missing? Thanks a lot... Installing the gsl library correctly? I need to guess quite a bit here (and a clean install attempt would have given a few more clues). I suspect you installed into /usr/local/lib whereas your OS probably uses /usr/local/lib64 (most x86_64 Linuxen do, and you seem to be using lib64 for R). In that case ld.so most likely will not find the dynamic library in /usr/local/lib. You can avoid such problems by installing auxiliary software such as gsl from RPMs -- I would be very surprised if OpenSuse did not have gsl and gsl-devel RPMs. Otherwise you need to install from the sources by something like make install libdir=/usr/local/lib64 R CMD INSTALL gsl * installing to library ‘/usr/lib64/R/library’ * installing *source* package ‘gsl’ ... checking for gsl-config... /usr/local/bin/gsl-config checking if GSL version = 1.12... checking for gcc... gcc checking for C compiler default output file name... a.out checking whether the C compiler works... yes checking whether we are cross compiling... no checking for suffix of executables... checking for suffix of object files... o checking whether we are using the GNU C compiler... yes checking whether gcc accepts -g... yes checking for gcc option to accept ISO C89... none needed yes configure: creating ./config.status config.status: creating src/Makevars ** libs make: Nothing to be done for `all'. This was not a clean install installing to /usr/lib64/R/library/gsl/libs ** R ** inst ** preparing package for lazy loading ** help *** installing help indices ** building package indices ... ** testing if installed package can be loaded Error in dyn.load(file, DLLpath = DLLpath, ...) : unable to load shared object '/usr/lib64/R/library/gsl/libs/gsl.so': libgsl.so.0: cannot open shared object file: No such file or directory ERROR: loading failed * removing ‘/usr/lib64/R/library/gsl’ sessionInfo() R version 2.12.0 (2010-10-15) Platform: x86_64-unknown-linux-gnu (64-bit) -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Trouble installing gsl wrapper
Hi, I'm trying to install the gsl wrapper source code (http://cran.r-project.org/src/contrib/gsl_1.9-8.tar.gz) on a Linux system (OpenSuse 11.1), but encountering the following problem. I've already installed 'gsl' version 1.14 (ftp://ftp.gnu.org/gnu/gsl/gsl-1.14.tar.gz) on the system. What's missing? Thanks a lot... R CMD INSTALL gsl * installing to library ‘/usr/lib64/R/library’ * installing *source* package ‘gsl’ ... checking for gsl-config... /usr/local/bin/gsl-config checking if GSL version = 1.12... checking for gcc... gcc checking for C compiler default output file name... a.out checking whether the C compiler works... yes checking whether we are cross compiling... no checking for suffix of executables... checking for suffix of object files... o checking whether we are using the GNU C compiler... yes checking whether gcc accepts -g... yes checking for gcc option to accept ISO C89... none needed yes configure: creating ./config.status config.status: creating src/Makevars ** libs make: Nothing to be done for `all'. installing to /usr/lib64/R/library/gsl/libs ** R ** inst ** preparing package for lazy loading ** help *** installing help indices ** building package indices ... ** testing if installed package can be loaded Error in dyn.load(file, DLLpath = DLLpath, ...) : unable to load shared object '/usr/lib64/R/library/gsl/libs/gsl.so': libgsl.so.0: cannot open shared object file: No such file or directory ERROR: loading failed * removing ‘/usr/lib64/R/library/gsl’ sessionInfo() R version 2.12.0 (2010-10-15) Platform: x86_64-unknown-linux-gnu (64-bit) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Calculate variance/covariance with complex numbers
Anybody knows what functions can be used to calculate variance/covariance with complex numbers? var and cov don't seem to work: a 1 V1 0.00810014+0.00169366i V2 0.00813054+0.00158251i V3 0.00805489+0.00163295i V4 0.00809141+0.00159533i V5 0.00813976+0.00161850i var(a) 1 1 1.141556e-09 Warning message: In var(a) : imaginary parts discarded in coercion cov(a) 1 1 1.141556e-09 Warning message: In cov(a) : imaginary parts discarded in coercion Thanks in advance, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Calculate variance/covariance with complex numbers
Thanks a lot for the help, Ben Bolker and Chuck! Chuck, it seems your version needs a little modification. Instead of this: xri - matrix(rnorm(1)+1i*rnorm(1),nc=2) crossprod(xri-colMeans(xri))/(nrow(xri)-1) it seems to me it should be this (maybe there is a more elegant modification than mine?): crossprod(t(apply(xri, 1, '-', colMeans(xri/(nrow(xri)-1) Do you agree? Gang On Sat, Mar 27, 2010 at 7:07 PM, Charles C. Berry cbe...@tajo.ucsd.edu wrote: On Sat, 27 Mar 2010, Gang Chen wrote: Anybody knows what functions can be used to calculate variance/covariance with complex numbers? var and cov don't seem to work: How about? xri - matrix(rnorm(1)+1i*rnorm(1),nc=2) crossprod(xri-colMeans(xri))/(nrow(xri)-1) HTH, Chuck a 1 V1 0.00810014+0.00169366i V2 0.00813054+0.00158251i V3 0.00805489+0.00163295i V4 0.00809141+0.00159533i V5 0.00813976+0.00161850i var(a) 1 1 1.141556e-09 Warning message: In var(a) : imaginary parts discarded in coercion cov(a) 1 1 1.141556e-09 Warning message: In cov(a) : imaginary parts discarded in coercion Thanks in advance, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry (858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] metafor package: effect sizes are not fully independent
Thanks for the confirmation and pointer, Mike! Dr. Viechtbauer, I'm looking forward to the new functionality of specifying covariance structures in metafor(). Thanks both again for the great help, Gang On Sun, Feb 7, 2010 at 8:59 PM, Mike Cheung mikewlche...@gmail.com wrote: Dear Gang, It seems that it is possible to use a univariate meta-analysis to handle your multivariate effect sizes. If you want to calculate a weighted average first, Hedges and Olkin (1985) has discussed this approach. Hedges, L. V., Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press. Regards, Mike -- - Mike W.L. Cheung Phone: (65) 6516-3702 Department of Psychology Fax: (65) 6773-1843 National University of Singapore http://courses.nus.edu.sg/course/psycwlm/internet/ - On Mon, Feb 8, 2010 at 6:48 AM, Gang Chen gangch...@gmail.com wrote: Dear Mike, Thanks a lot for the kind help! Actually a few months ago I happened to read a couple of your posts on the R-help archive when I was exploring the possibility of using lme() in R for meta analysis. First of all, I didn't specify the meta analysis model for my cases correctly in my previous message. Currently I'm only interested in random- or mixed-effects meta analysis. So what you've suggested is directly relevant to what I've been looking for, especially for case (2). I'll try to gather those references you listed, and figure out the details. Also I think I didn't state my case (1) clearly in my previous post. In that case, all the effect sizes are the same and in the same condition too (e.g., happy), but each source has multiple samples of the measurement (and also measurement error, or standard error). Could this still be handled as a multivariate meta analysis since the samples for the the same source are correlated? Or somehow the multiple measures from the same source can be somehow summarized (weighted average?) before the meta analysis? Your suggestions are highly appreciated. Best wishes, Gang On Sun, Feb 7, 2010 at 10:39 AM, Mike Cheung mikewlche...@gmail.com wrote: Dear Gang, Here are just some general thoughts. Wolfgang Viechtbauer will be a better position to answer questions related to metafor. For multivariate effect sizes, we first have to estimate the asymptotic sampling covariance matrix among the effect sizes. Formulas for some common effect sizes are provided by Gleser and Olkin (2009). If a fixed-effects model is required, it is quite easy to write your own GLS function to conduct the multivariate meta-analysis (see e.g., Becker, 1992). If a random-effects model is required, it is more challenging in R. SAS Proc MIXED can do the work (e.g., van Houwelingen, Arends, Stijnen, 2002). Sometimes, it is possible to transform the multivariate effect sizes into independent effect sizes (Kalaian Raudenbush, 1996; Raudenbush, Becker, Kalaian, 1988). Then univariate meta-analysis, e.g., metafor(), can be performed on the transformed effect sizes. This approach works if it makes sense to pool the multivariate effect sizes as in your case (2)- the effect sizes are the same but in different conditions (happy, sad, and neutral). However, this approach does not work if the multivariate effect sizes are measuring different concepts, e.g., verbal achievement and mathematical achievement. Hope this helps. Becker, B. J. (1992). Using results from replicated studies to estimate linear models. Journal of Educational Statistics, 17, 341-362. Gleser, L. J., Olkin, I. (2009). Stochastically dependent effect sizes. In H. Cooper, L. V. Hedges, and J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis, 2nd edition (pp. 357-376). New York: Russell Sage Foundation. Kalaian, H. A., Raudenbush, S. W. (1996). A multivariate mixed linear model for meta-analysis. Psychological Methods, 1, 227-235. Raudenbush, S. W., Becker, B. J., Kalaian, H. (1988). Modeling multivariate effect sizes. Psychological Bulletin, 103, 111-120. van Houwelingen, H.C., Arends, L.R., Stijnen, T. (2002). Advanced methods in meta-analysis: multivariate approach and meta-regression. Statistics in Medicine, 21, 589-624. Regards, Mike -- - Mike W.L. Cheung Phone: (65) 6516-3702 Department of Psychology Fax: (65) 6773-1843 National University of Singapore http://courses.nus.edu.sg/course/psycwlm/internet/ - On Sat, Feb 6, 2010 at 6:07 AM, Gang Chen gangch...@gmail.com wrote: In a classical meta analysis model y_i = X_i * beta_i + e_i, data {y_i} are assumed to be independent effect sizes. However, I'm encountering the following two scenarios
[R] metafor package: effect sizes are not fully independent
In a classical meta analysis model y_i = X_i * beta_i + e_i, data {y_i} are assumed to be independent effect sizes. However, I'm encountering the following two scenarios: (1) Each source has multiple effect sizes, thus {y_i} are not fully independent with each other. (2) Each source has multiple effect sizes, each of the effect size from a source can be categorized as one of a factor levels (e.g., happy, sad, and neutral). Maybe better denote the data as y_ij, effect size at the j-th level from the i-th source. I can code the levels with dummy variables into the X_i matrix, but apparently the data from the same source are correlated with each other. In this case, I would like to run a few tests one of which is, for example, whether there is any difference across all the levels of the factor. Can metafor handle these two cases? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Convert bits to numbers in base 10
I have some bits stored like the following variable nn (nn - c(1, 0, 0, 1, 0, 1,0)) [1] 1 0 0 1 0 1 0 not in the format of 1001010 and I need to convert them to numbers in base 10. What's an easy way to do it? TIA, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Convert bits to numbers in base 10
Yes, such a concise and elegant solution! Thanks a lot! Gang On Thu, Apr 9, 2009 at 5:51 PM, Marc Schwartz marc_schwa...@me.com wrote: I suspect that Gang was looking for something along the lines of: sum(2 ^ (which(as.logical(rev(nn))) - 1)) [1] 74 You might also want to look at the digitsBase() function in Martin's sfsmisc package on CRAN. HTH, Marc Schwartz On Apr 9, 2009, at 4:34 PM, Jorge Ivan Velez wrote: Dear Gang, Try this: nn - c(1, 0, 0, 1, 0, 1,0) paste(nn,sep=,collapse=) See ?paste for more information. HTH, Jorge On Thu, Apr 9, 2009 at 5:23 PM, Gang Chen gangch...@gmail.com wrote: I have some bits stored like the following variable nn (nn - c(1, 0, 0, 1, 0, 1,0)) [1] 1 0 0 1 0 1 0 not in the format of 1001010 and I need to convert them to numbers in base 10. What's an easy way to do it? TIA, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Puzzled by an error with apply()
I've written a function, myFunc, that works fine with myFunc(data, ...), but when I use apply() to run it with an array of data apply(myArray, 1, myFunc, ...) I get a strange error: Error in match.fun(FUN) : '1' is not a function, character or symbol which really puzzles me because '1' is meant to be the margin of the array I want to apply over, but how come does apply() treat it as a function? I have been successfully using apply() for a while, so I must have made a stupid mistake this time. Hopefully somebody can point out something obviously wrong without me providing any details of the function. TIA, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Interpreting model matrix columns when using contr.sum
Many thanks to both Drs. Bates and Fox for the help! I also figured out yesterday what Dr. Fox just said regarding the interpretations of those coefficients for a balanced design. Thanks Dr. Bates for the suggestion of using solve(cbind(1, contr.sum(4))) to sort out the factor level effects. Model validation is very important, but interpreting those coefficients, at least in the case of balanced designs, also provides some insights about various effects for the people working in the field. Gang On Sun, Jan 25, 2009 at 11:25 AM, John Fox j...@mcmaster.ca wrote: Dear Doug and Gang Chen, With balanced data and sum-to-zero contrasts, the intercept is indeed the general mean of the response; the coefficient of a1 is the mean of the response in category a1 minus the general mean; the coefficient of a1:b1 is the mean of the response in cell a1, b1 minus the general mean and the coefficients of a1 and b1; etc. For unbalanced data (and balanced data) the intercept is the mean of the cell means; the coefficient of a1 is the mean of cell means at level a1 minus the intercept; etc. Whether all this is of interest is another question, since a simple graph of cell means tells a more digestible story about the data. Regards, John -- John Fox, Professor Department of Sociology McMaster University Hamilton, Ontario, Canada web: socserv.mcmaster.ca/jfox -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Douglas Bates Sent: January-25-09 10:49 AM To: Gang Chen Cc: R-help Subject: Re: [R] Interpreting model matrix columns when using contr.sum On Fri, Jan 23, 2009 at 4:58 PM, Gang Chen gangch...@gmail.com wrote: With the following example using contr.sum for both factors, dd - data.frame(a = gl(3,4), b = gl(4,1,12)) # balanced 2-way model.matrix(~ a * b, dd, contrasts = list(a=contr.sum, b=contr.sum)) (Intercept) a1 a2 b1 b2 b3 a1:b1 a2:b1 a1:b2 a2:b2 a1:b3 a2:b3 11 1 0 1 0 0 1 0 0 0 0 0 21 1 0 0 1 0 0 0 1 0 0 0 31 1 0 0 0 1 0 0 0 0 1 0 41 1 0 -1 -1 -1-1 0-1 0-1 0 51 0 1 1 0 0 0 1 0 0 0 0 61 0 1 0 1 0 0 0 0 1 0 0 71 0 1 0 0 1 0 0 0 0 0 1 81 0 1 -1 -1 -1 0-1 0-1 0-1 91 -1 -1 1 0 0-1-1 0 0 0 0 10 1 -1 -1 0 1 0 0 0-1-1 0 0 11 1 -1 -1 0 0 1 0 0 0 0-1-1 12 1 -1 -1 -1 -1 -1 1 1 1 1 1 1 ... I have two questions: (1) I assume the 1st column (under intercept) is the overall mean, the 2rd column (under a1) is the difference between the 1st level of factor a and the overall mean, the 4th column (under b1) is the difference between the 1st level of factor b and the overall mean. Is this interpretation correct? I don't think so and furthermore I don't see why the contrasts should have an interpretation. The contrasts are simply a parameterization of the space spanned by the indicator columns of the levels of the factors. Interpretations as overall means, etc. are mostly a holdover from antiquated concepts of how analysis of variance tables should be evalated. If you want to determine the interpretation of particular coefficients for the special case of a balanced design (which doesn't always mean a resulting balanced data set - I remind my students that expecting a balanced design to produce balanced data is contrary to Murphy's Law) the easiest way of doing so is (I think this is right but I can somehow manage to confuse myself on this with great ease) to calculate contr.sum(3) [,1] [,2] 110 201 3 -1 -1 solve(cbind(1, contr.sum(3))) 1 2 3 [1,] 0.333 0.333 0.333 [2,] 0.667 -0.333 -0.333 [3,] -0.333 0.667 -0.333 solve(cbind(1, contr.sum(4))) 1 2 3 4 [1,] 0.25 0.25 0.25 0.25 [2,] 0.75 -0.25 -0.25 -0.25 [3,] -0.25 0.75 -0.25 -0.25 [4,] -0.25 -0.25 0.75 -0.25 That is, the first coefficient is the overall mean (but only for a balanced data set), the second is a contrast of the first level with the others, the third is a contrast of the second level with the others and so on. (2) I'm not so sure about those interaction columns. For example, what is a1:b1? Is it the 1st level of factor a at the 1st level of factor b versus the overall mean, or something more complicated? Well, at the risk of sounding trivial, a1:b1 is the product of the a1 and b1 columns. You need a basis for a certain subspace and this provides one. I don't
[R] Interpreting model matrix columns when using contr.sum
With the following example using contr.sum for both factors, dd - data.frame(a = gl(3,4), b = gl(4,1,12)) # balanced 2-way model.matrix(~ a * b, dd, contrasts = list(a=contr.sum, b=contr.sum)) (Intercept) a1 a2 b1 b2 b3 a1:b1 a2:b1 a1:b2 a2:b2 a1:b3 a2:b3 11 1 0 1 0 0 1 0 0 0 0 0 21 1 0 0 1 0 0 0 1 0 0 0 31 1 0 0 0 1 0 0 0 0 1 0 41 1 0 -1 -1 -1-1 0-1 0-1 0 51 0 1 1 0 0 0 1 0 0 0 0 61 0 1 0 1 0 0 0 0 1 0 0 71 0 1 0 0 1 0 0 0 0 0 1 81 0 1 -1 -1 -1 0-1 0-1 0-1 91 -1 -1 1 0 0-1-1 0 0 0 0 10 1 -1 -1 0 1 0 0 0-1-1 0 0 11 1 -1 -1 0 0 1 0 0 0 0-1-1 12 1 -1 -1 -1 -1 -1 1 1 1 1 1 1 ... I have two questions: (1) I assume the 1st column (under intercept) is the overall mean, the 2rd column (under a1) is the difference between the 1st level of factor a and the overall mean, the 4th column (under b1) is the difference between the 1st level of factor b and the overall mean. Is this interpretation correct? (2) I'm not so sure about those interaction columns. For example, what is a1:b1? Is it the 1st level of factor a at the 1st level of factor b versus the overall mean, or something more complicated? Thanks in advance for your help, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Using apply for two datasets
I can run one-sample t-test on an array, for example a matrix myData1, with the following apply(myData1, 2, t.test) Is there a similar fashion using apply() or something else to run 2-sample t-test with datasets from two groups, myData1 and myData2, without looping? TIA, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using apply for two datasets
Thanks a lot for the suggestions, Jorge and Satoshi Takahama! Both approaches work well... Gang On Tue, Jan 6, 2009 at 2:12 PM, Satoshi Takahama s.takah...@yahoo.com wrote: Perhaps you can convert your matrices to data frames as in: mapply(t.test,as.data.frame(myData1),as.data.frame(myData2)) to test by column and mapply(t.test,as.data.frame(t(myData1)),as.data.frame(t(myData2))) to test by row? - Original Message From: Gang Chen gangch...@gmail.com To: Henrique Dallazuanna www...@gmail.com Cc: r-h...@stat.math.ethz.ch Sent: Tuesday, January 6, 2009 10:10:44 AM Subject: Re: [R] Using apply for two datasets Thanks a lot for the quick help! mapply() seems promising. However, mapply(t.test, myData1, myData2) would not work, so how can I specify the margin in mapply() which function t.test() will be applied over? For example, I specify the 2nd dimension (column) in apply(myData1, 2, t.test) to run one-sample t-test. Is there a way I can achieve the same with mapply()? Thanks again, Gang On Tue, Jan 6, 2009 at 12:34 PM, Henrique Dallazuanna www...@gmail.com wrote: I think that you can use mapply for this. On Tue, Jan 6, 2009 at 3:24 PM, Gang Chen gangch...@gmail.com wrote: I can run one-sample t-test on an array, for example a matrix myData1, with the following apply(myData1, 2, t.test) Is there a similar fashion using apply() or something else to run 2-sample t-test with datasets from two groups, myData1 and myData2, without looping? TIA, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Parallel computing with snow
I've been using parApply() in snow package for parallel computing with the following lines in R 2.8.1: library(snow) nNodes - 4 cl - makeCluster(nNodes, type = SOCK) fm - parApply(cl, myData, c(1,2), func1, ...) Since I have a Mac OS X (version 10.4.11) with two dual-core processors, I thought that I could run 4 simultaneous clusters. However with the 1st job it seems only two clusters (362 and 364 below) were running with roughly the same CPU time (4th column) while the other two clusters were pretty much idling (I assume the 1st row with PID 357 was the main process with which I started R): PID COMMAND %CPU TIME 357 R 0.0% 0:15.81 362 R99.8% 11:41.07 364 R 100.3% 12:26.43 366 R 0.0% 0:01.67 368 R 0.0% 0:01.68 Why weren't 4 clusters split roughly equally in CPU time with two barely used? I also tried a different job with fm - parApply(cl, myData, c(1,2), func2, ...), and the result is slightly different with all 4 clusters more or less involved although they were still not distributed evenly neither: PID COMMAND %CPU TIME 413 R0.0% 0:18.46 419 R 93.3% 2:53.62 421 R 93.6% 6:07.85 423 R 92.8% 5:12.13 425 R 93.3% 1:39.73 What gives? Why different usage of clusters between the two jobs? All help is highly appreciated, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Confused with default device setup
When invoking dev.new() on my Mac OS X 10.4.11, I get an X11 window instead of quartz which I feel more desirable. So I'd like to set the default device to quartz. However I'm confused because of the following: Sys.getenv(R_DEFAULT_DEVICE) R_DEFAULT_DEVICE quartz getOption(device) [1] X11 What's going on? Also is file Renviron under /Library/Frameworks/R.framework/Resources/ etc/ppc/ the one I should modify if I want to change some environment variables? But I don't see R_DEFAULT_DEVICE there. TIA, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Load a program at the front end
Thanks a lot for the suggestion! Unfortunately R --no-save prog.R does not work well with my situation because prog.R contain lines such as readline() and () that require user response in the middle of the execution. I also tried other options such as R -f prog.R and R --interactive prog.R, and they all failed. Any other suggestions? Thanks, Gang On Mon, Oct 6, 2008 at 10:12 PM, Bernardo Rangel Tura [EMAIL PROTECTED] wrote: Em Qui, 2008-10-02 às 14:36 -0400, Gang Chen escreveu: I want to run a R program, prog.R, interactively. My question is, is there a way I can start prog.R on the shell terminal when invoking R, instead of using source() inside R? TIA, Gang Hi Gang I my system just only type: R --no-save prog.R platform x86_64-unknown-linux-gnu arch x86_64 os linux-gnu system x86_64, linux-gnu status Patched major 2 minor 7.2 year 2008 month 09 day11 svn rev46532 language R version.string R version 2.7.2 Patched (2008-09-11 r46532) -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Load a program at the front end
Thanks for the suggestion, Luke! Your approach would load prog.R whenever invoking R, but that is not exactly what I want. Basically I can run prog.R through source(~/someDir/prog.R) inside R. Also since prog.R involves many lines of readline() and tclvalue() in tcltk package, the program requires the user responses in the middle of the execution. If possible, I would like to create an executable shell script RunProg with one line or two inside like R ... ~/someDir/prog.R If this is feasible, whenever I want to run prog.R, I can simply type the following at the terminal prompt: RunProg and I would still be able to interact with the user. Is this something doable? Thanks, Gang On Tue, Oct 7, 2008 at 12:43 PM, Luke Tierney [EMAIL PROTECTED] wrote: Something like env R_PROFILE=prog.R R may work for you. You may need to call .First.sys at the beginning of prog.R to get default packages loaded. luke On Tue, 7 Oct 2008, Gang Chen wrote: Thanks a lot for the suggestion! Unfortunately R --no-save prog.R does not work well with my situation because prog.R contain lines such as readline() and () that require user response in the middle of the execution. I also tried other options such as R -f prog.R and R --interactive prog.R, and they all failed. Any other suggestions? Thanks, Gang On Mon, Oct 6, 2008 at 10:12 PM, Bernardo Rangel Tura [EMAIL PROTECTED] wrote: Em Qui, 2008-10-02 às 14:36 -0400, Gang Chen escreveu: I want to run a R program, prog.R, interactively. My question is, is there a way I can start prog.R on the shell terminal when invoking R, instead of using source() inside R? TIA, Gang Hi Gang I my system just only type: R --no-save prog.R platform x86_64-unknown-linux-gnu arch x86_64 os linux-gnu system x86_64, linux-gnu status Patched major 2 minor 7.2 year 2008 month 09 day11 svn rev46532 language R version.string R version 2.7.2 Patched (2008-09-11 r46532) -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Luke Tierney Chair, Statistics and Actuarial Science Ralph E. Wareham Professor of Mathematical Sciences University of Iowa Phone: 319-335-3386 Department of Statistics andFax: 319-335-3017 Actuarial Science 241 Schaeffer Hall email: [EMAIL PROTECTED] Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Running source() on a file in another directory
Thanks a lot! It seems to work fine now with what you suggested. I also tried another approach by running the following on the terminal R -f ~dir1/prog.R and it read in some lines in prog.R, but then the execution halted at one point for some reason I could not figure out. Any way I could get R -f ... working? Thanks, Gang On Wed, Oct 1, 2008 at 11:49 PM, Gabor Grothendieck [EMAIL PROTECTED] wrote: Suppose p is a vector of paths, e.g., p - c(~/dir1, ~/dir2/dir3) Then the following will return the full pathname of the first found location: Find(file.exists, file.path(p, prog.R)) so you can source that. On Wed, Oct 1, 2008 at 6:09 PM, Gang Chen [EMAIL PROTECTED] wrote: Suppose I have a file prog.R stored in a directory under ~/dirname, and ~/dirname is set in a shell script file (e.g. .cshrc) as one of the accessible paths on terminal. On a different directory I could run prog.R interactively by executing source(~/dirname/prog.R) It seems that source() does not search for all the paths set in the shell script. So, is there a better and more elegant way to do this, for example without explicitly typing the directory? TIA, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] t.test() on a list
I have a list, myList, with each of its 9 components being a 15X15 matrix. I want to run a t-test across the list for each component in the matrix. For example, the first t-test is on myList[[1]][1, 1], myList[[2]][1, 1], ..., myList[[9]][1, 1]; and there are totally 15X15 t-tests. How can I run these t-tests in a simple way? TIA, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Load a program at the front end
I want to run a R program, prog.R, interactively. My question is, is there a way I can start prog.R on the shell terminal when invoking R, instead of using source() inside R? TIA, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] t.test() on a list
I appreciate your suggestion. The example I provided was fabricated because I was only focusing on the programming perspective on how to deal with such a data structure, not real statistical issues. Do you mind elaborating a little more why that would not be appreciate? I know I can do it with a couple of loops, but I still appreciate suggestions on how to write a line or two to run t-tests on such a data structure. Thanks, Gang On Thu, Oct 2, 2008 at 5:11 PM, Bert Gunter [EMAIL PROTECTED] wrote: I am sure you will get helpful answers. I am almost as sure that you shouldn't be doing this. I suggest you consult with your local statistician. -- Bert Gunter -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Gang Chen Sent: Thursday, October 02, 2008 11:24 AM To: [EMAIL PROTECTED] Subject: [R] t.test() on a list I have a list, myList, with each of its 9 components being a 15X15 matrix. I want to run a t-test across the list for each component in the matrix. For example, the first t-test is on myList[[1]][1, 1], myList[[2]][1, 1], ..., myList[[9]][1, 1]; and there are totally 15X15 t-tests. How can I run these t-tests in a simple way? TIA, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Running source() on a file in another directory
Suppose I have a file prog.R stored in a directory under ~/dirname, and ~/dirname is set in a shell script file (e.g. .cshrc) as one of the accessible paths on terminal. On a different directory I could run prog.R interactively by executing source(~/dirname/prog.R) It seems that source() does not search for all the paths set in the shell script. So, is there a better and more elegant way to do this, for example without explicitly typing the directory? TIA, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] trouble with tkgetOpenFile
I'm trying to use the following loop to open a window multiple times to select files, but only the last window shows up. What am I missing? library(tcltk) nWin - 6 fn - vector('list', nWin) for (ii in nWin) { fn[[ii]] - tclvalue( tkgetOpenFile( filetypes = {{Files} {.1D}} {{All files} *}, title = paste('Choose number', ii, 'time series file'))) } TIA, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] trouble with tkgetOpenFile
Such a silly mistake! Thanks a lot, John! Gang On Wed, Sep 17, 2008 at 7:21 PM, John Fox [EMAIL PROTECTED] wrote: Dear Gang, Your for loop is in error; try for (ii in seq(length=nWin)) I hope this helps, John -- John Fox, Professor Department of Sociology McMaster University Hamilton, Ontario, Canada web: socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Gang Chen Sent: September-17-08 4:31 PM To: [EMAIL PROTECTED] Subject: [R] trouble with tkgetOpenFile I'm trying to use the following loop to open a window multiple times to select files, but only the last window shows up. What am I missing? library(tcltk) nWin - 6 fn - vector('list', nWin) for (ii in nWin) { fn[[ii]] - tclvalue( tkgetOpenFile( filetypes = {{Files} {.1D}} {{All files} *}, title = paste('Choose number', ii, 'time series file'))) } TIA, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Creating an array of lists
Yes, both array of lists and list of lists work well as desired. Thanks a lot! Gang On Thu, Aug 7, 2008 at 4:05 PM, Patrick Burns [EMAIL PROTECTED] wrote: myListArray - array(list(NULL), c(3,2)) myListArray[[1,2]] - list(letters, 1:4) Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and A Guide for the Unwilling S User) Gang Chen wrote: Hi, I want to store some number of outputs from running a bunch of analyses such as lm() into an array. I know how to do this with a one-dimensional array (vector) by creating myArray - vector(mode='list', length=10) and storing each lm() result into a component of myArray. My question is, how can I do this for a multiple dimensional array? It seems array() does not have such a 'mode' option as in vector(). Any alternatives? Thanks in advance, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Mixed model with multiple response variables?
Hi, I have a data set collected from 10 measurements (response variables) on two groups (healthy and patient) of subjects performing 4 different tasks. In other words there are two fixed factors (group and task), and 10 response variables. I could analyze the data with aov() or lme() in package nlme for each response variable separately, but since most likely there are correlations among the 10 response variables, would it be more meaningful to run a MANOVA? However manova() in R seems not to allow an error term in the formula. What else can I try for this kind of multivariate mixed model? Also, if I want to find out which response variables (among the 10 measurements) are statistically significant in terms of acting as indicators for group difference, what kind of statistical analysis would help me sort them out? Thanks in advance, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [R-sig-ME] Post hoc tests with lme
Thanks for the suggestion, Somon! I did try glht from multcomp package, but the problem is that for the hypothesis H0: TypeT1 =0 and TypeT2 = 0 it gives results for two separate hypotheses H01: TypeT1 =0 and H02: TypeT2 = 0, not exactly one statistic for the original hypothesis H0. So my question is, how can I get only one statistic for H0? Any more suggestions? Thanks, Gang library(nlme) fm - lme(effort~Type-1, data=ergoStool, random=~1|Subject) library(multcomp) summary(glht(fm, linfct=c(TypeT1=0, TypeT2=0))) Simultaneous Tests for General Linear Hypotheses Fit: lme.formula(fixed = effort ~ Type - 1, data = ergoStool, random = ~1 | Subject) Linear Hypotheses: Estimate Std. Error z value p value TypeT1 == 08.556 0.576 14.85 1e-10 *** TypeT2 == 0 12.444 0.576 21.60 1e-10 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 (Adjusted p values reported -- single-step method) On 4/16/08, Simon Blomberg [EMAIL PROTECTED] wrote: Try glht in package multcomp. Simon. On Wed, 2008-04-16 at 12:00 -0400, Gang Chen wrote: Using the ergoStool data cited in Mixed-Effects Models in S and S-PLUS by Pinheiro and Bates as an example, we have library(nlme) fm - lme(effort~Type-1, data=ergoStool, random=~1|Subject) summary(fm) Now suppose I want to test the following hypothesis H0: TypeT1 =0 and TypeT2 = 0 I've tried estimable() and glh.test() in package gmodels, esticon() in package boBy, and linear.hypothesis() in package car, but it seems none of them would work with objects from lme. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Formula with no intercept
Thanks both Harold Doran and Prof. Ripley for the suggestion. Time*Group - 1 or Time*(Group-1) does seem better. However as Prof. Ripley pointed out, it is a little complicated with the interactions. For example, == set.seed(1) group - as.factor (sample (c(M,F), 12, T)) y - rnorm(12) time - as.factor (rep (1:4, 3)) summary(fit - lm ( y ~ time * group - 1)) Call: lm(formula = y ~ time * group - 1) Residuals: 1 2 3 4 5 6 7 -5.122e-01 3.916e-01 5.985e-01 9.547e-01 5.122e-01 1.665e-16 -5.985e-01 8 9 10 11 12 -9.547e-01 0.000e+00 -3.916e-01 -5.551e-17 2.220e-16 Coefficients: Estimate Std. Error t value Pr(|t|) time1 1.124930.91795 1.2250.288 time2 0.389840.91795 0.4250.693 time3-0.022730.64909 -0.0350.974 time4-1.260040.64909 -1.9410.124 groupM -0.125331.12426 -0.1110.917 time2:groupM 0.082181.58994 0.0520.961 time3:groupM 0.131871.58994 0.0830.938 time4:groupM 2.329211.58994 1.4650.217 Residual standard error: 0.918 on 4 degrees of freedom Multiple R-squared: 0.6962, Adjusted R-squared: 0.08858 F-statistic: 1.146 on 8 and 4 DF, p-value: 0.4796 = There are totally 8 fixed effects listed above. I believe I can interpret time1, time2, time3 and time4 as the fixed effects of those 4 levels of factor Time in groupF. But I'm not so sure about the other 4 fixed effects: are time2:groupM, time3:groupM, and time4:groupM the fixed effect differences of those 3 levels of factor Time between groupM and groupF? If so, what is groupM (the 5th)? Or are time2:groupM, time3:groupM, and time4:groupM the difference (between groupM and groupF) of the fixed effects of those 3 levels of time factor versus time1 while groupM (the 5th) the fixed effect of time1 or groupM versus GroupF? packages such as multcomp can post-hoc test any (coherent) set of hypotheses you choose, irrespective of the model parametrization. This does not seem true unless I'm missing something. See the following example: === set.seed(1) group - as.factor (sample (c(M,F), 12, T)) y - rnorm(12) time - as.factor (rep (1:4, 3)) fit - lm(y ~ time * group) library(multcomp) summary(glht(fit, linfct=c(time1=0, time2=0))) Error in chrlinfct2matrix(linfct, names(beta)) : variable(s) 'time1' not found summary(glht(fit, linfct=c(time2=0, time3=0))) Simultaneous Tests for General Linear Hypotheses Fit: lm(formula = y ~ time * group) Linear Hypotheses: Estimate Std. Error t value p value time2 == 0 -0.7351 1.2982 -0.566 0.797 time3 == 0 -1.1477 1.1243 -1.021 0.533 (Adjusted p values reported -- single-step method) == The problem is that glht doesn't allow any hypothesis involving time1 if intercept is included in the model specification. Any more thoughts? Thanks, Gang On 4/16/08, Prof Brian Ripley [EMAIL PROTECTED] wrote: On Wed, 16 Apr 2008, Doran, Harold wrote: R may not be giving you what you want, but it is doing the right thing. You can change what the base category is through contrasts but you can't get the marginal effects for every level of all factors because this creates a linear dependence in the model matrix. I suspect that Time*Group - 1 or Time*(Group-1) come closer to the aim. It is the first factor in the model which is coded without contrasts in a no-intercept model. Once you include interactions I think the 'convenience' is largely lost, and packages such as multcomp can post-hoc test any (coherent) set of hypotheses you choose, irrespective of the model parametrization. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Gang Chen Sent: Monday, April 14, 2008 5:38 PM To: [EMAIL PROTECTED] Subject: [R] Formula with no intercept I'm trying to analyze a model with two variables, one is Group with two levels (male and female), and other is Time with four levels (T1, T2, T3 and T4). And for the convenience of post-hoc testing I wanted to consider a model with no intercept for factor Time, so I tried formula Group*(Time-1) However this seems to give me the following terms in the model GroupMale, GroupFemale, TimeT2, TimeT3, TimeT4, GroupMale:TimeT2, GroupMale:TimeT3, GroupMale:TimeT4, GroupFemale:TimeT2, GroupFemale:TimeT3, GroupFemale:TimeT4 which is not exactly what I wanted. Also it seems (Group-1)*Time and (Group-1)*(Time-1) also give me exactly the same set of terms as Group*(Time-1). So I have some conceptual trouble understanding this. And how could I create a model with terms including all the levels of factor Time? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help
[R] Post hoc tests with lme
Using the ergoStool data cited in Mixed-Effects Models in S and S-PLUS by Pinheiro and Bates as an example, we have library(nlme) fm - lme(effort~Type-1, data=ergoStool, random=~1|Subject) summary(fm) Linear mixed-effects model fit by REML Data: ergoStool AIC BIC logLik 133.1308 141.9252 -60.5654 Random effects: Formula: ~1 | Subject (Intercept) Residual StdDev:1.332465 1.100295 Fixed effects: effort ~ Type - 1 Value Std.Error DF t-value p-value TypeT1 8.56 0.5760123 24 14.85308 0 TypeT2 12.44 0.5760123 24 21.60448 0 TypeT3 10.78 0.5760123 24 18.71102 0 TypeT4 9.22 0.5760123 24 16.01046 0 Correlation: TypeT1 TypeT2 TypeT3 TypeT2 0.595 TypeT3 0.595 0.595 TypeT4 0.595 0.595 0.595 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -1.80200345 -0.64316591 0.05783115 0.70099706 1.63142054 Number of Observations: 36 Number of Groups: 9 Now suppose I want to test the following hypothesis H0: TypeT1 =0 and TypeT2 = 0 I've tried estimable() and glh.test() in package gmodels, esticon() in package boBy, and linear.hypothesis() in package car, but it seems none of them would work with objects from lme: library(gmodels) estimable(fm, rbind(c(TypeT1=1), c(TypeT2=1))) Error in FUN(newX[, i], ...) : `param' has no names and does not match number of coefficients of model. Unable to construct coefficient vector glh.test(fm, rbind(c(TypeT1=1), c(TypeT2=1))) Error in glh.test(fm, rbind(c(TypeT1 = 1), c(TypeT2 = 1))) : Only defined for lm,glm objects library(doBy) esticon(fm, rbind(c(TypeT1=1), c(TypeT2=1))) Error in t(abs(t(tmp) * obj$fixDF$X)) : dims [product 2] do not match the length of object [4] In addition: Warning message: In esticon.lme(fm, rbind(c(TypeT1 = 1), c(TypeT2 = 1))) : The esticon function has not been thoroughly teste on 'lme' objects library(car) linear.hypothesis(fm, rbind(c(TypeT1=1), c(TypeT2=1))) Error in L %*% b : requires numeric matrix/vector arguments So is there any other package with which I can run this kind of tests? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Formula with no intercept
I'm trying to analyze a model with two variables, one is Group with two levels (male and female), and other is Time with four levels (T1, T2, T3 and T4). And for the convenience of post-hoc testing I wanted to consider a model with no intercept for factor Time, so I tried formula Group*(Time-1) However this seems to give me the following terms in the model GroupMale, GroupFemale, TimeT2, TimeT3, TimeT4, GroupMale:TimeT2, GroupMale:TimeT3, GroupMale:TimeT4, GroupFemale:TimeT2, GroupFemale:TimeT3, GroupFemale:TimeT4 which is not exactly what I wanted. Also it seems (Group-1)*Time and (Group-1)*(Time-1) also give me exactly the same set of terms as Group*(Time-1). So I have some conceptual trouble understanding this. And how could I create a model with terms including all the levels of factor Time? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Extract values from a named array
Sorry for this dumb question. Suppose I have a named array ww defined as ww - 1:5 names(ww) - c(a, b, c, d, e) How can I extract the whole array of numbers without the names? ww[1:5] does not work while ww[[1]] can only extract one number at a time. Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extract values from a named array
Thanks a lot for the suggestions! Gang On 4/7/08, Uwe Ligges [EMAIL PROTECTED] wrote: Gang Chen wrote: Sorry for this dumb question. Suppose I have a named array ww defined as ww - 1:5 names(ww) - c(a, b, c, d, e) How can I extract the whole array of numbers without the names? ww[1:5] does not work while ww[[1]] can only extract one number at a time. Use as.vector to strip these attributes: as.vector(ww) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Is there any package can be used to solve individual haplotyping problem?
Hi all, I am working for reconstructing haplotype from individual's SNPs. But I have met a problem that most of paper and software I found are focus on reconstructing haplotype from population's SNPs, rather than individual's SNPs. I wonder is there any R package can be used to solve such problem? Thanks in advance for any information, Gang Chen [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Array arithmetic
I have two arrays A and B with dimensions of (L, M, N, P) and (L, M, N), and I want to do for (i in 1:L) { for (j in 1:M) { for (k in 1:N) { if (abs(B[i, j, k]) 10e-5) C[i, j, k,] - A[i, j, k,]/B[i, j, k] else C[i, j, k,] - 0 } } } How can I get C more efficiently than looping? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Array arithmetic
Thank both of you for the suggestions! Gang On Mar 6, 2008, at 2:12 PM, Henrique Dallazuanna wrote: Hum you are rigth, I forgot of 'else'. On 06/03/2008, Benilton Carvalho [EMAIL PROTECTED] wrote: no, it won't. you're doing the right math on the valid subset... but you're not returning the zeros where needed therefore, the whole thing will get recycled to match the dimensions. b On Mar 6, 2008, at 2:03 PM, Henrique Dallazuanna wrote: I think this should work: array(A[abs(B) 10e-5]/B[abs(B) 10e-5], dim=c(L, M, N, P)) On 06/03/2008, Gang Chen [EMAIL PROTECTED] wrote: I have two arrays A and B with dimensions of (L, M, N, P) and (L, M, N), and I want to do for (i in 1:L) { for (j in 1:M) { for (k in 1:N) { if (abs(B[i, j, k]) 10e-5) C[i, j, k,] - A[i, j, k,]/B[i, j, k] else C[i, j, k,] - 0 } } } How can I get C more efficiently than looping? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/ posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Collapse an array
Suppose I have a 4-D array X with dimensions (dx, dy, dz, dp). I want to collapse the first 3 dimensions of X to make a 2-D array Y with dimensions (dx*dy*dz, dp). Instead of awkward looping, what is a good way to do this? Is there a similar function like reshape in Matlab? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Collapse an array
Thanks a lot for all the suggestions! Gang On Feb 28, 2008, at 3:20 PM, Henrique Dallazuanna wrote: Try this also: apply(x, 4, rbind) On 28/02/2008, Gang Chen [EMAIL PROTECTED] wrote: Suppose I have a 4-D array X with dimensions (dx, dy, dz, dp). I want to collapse the first 3 dimensions of X to make a 2-D array Y with dimensions (dx*dy*dz, dp). Instead of awkward looping, what is a good way to do this? Is there a similar function like reshape in Matlab? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Running R non-interactively
Normally I can run an R script in batch mode with a command like this R CMD BATCH MyScript.R MyOutput However I prefer to write another script containing something like R --no-restore --save --no-readline $1 $2 so that I could run the original script simply on the prompt as MyScript.R MyOutput The second method is almost the same as the R CMD BATCH method except MyOutput does not contain the output from the OS calls such as system (). So what option(s) should add in the second approach to make it identical to R CMD BATCH? Thanks, Gang __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Trouble setting up correlation structure in lme
Hi, I'm trying to set up AR(1) as a correlation structure in modeling some data (attached file data.txt in text format) with lme, but have trouble getting it to work. Incent, Correctness, and Oppor are 3 categorical variables, Beta is a response variable, and Time is an equally-spaced variable with 6 time points (treated as a categorical variable as well). I'm not so sure about the form in corAR1 here. Since the correlation is supposed to be along the Time points, is form=~1 | Subject/Time a correct setup? library(nlme) Data - read.table(data.txt, header=TRUE) fm - lme(Beta~Incent*Correctness+Oppor+(Time-1), random = ~1|Subj, correlation=corAR1(0.5, form=~1|Subj/Time), Data) The above lines do work, but my intuition seems pointing to a form like form = ~time | Subj/Incent/Correctness/Oppor. However fm - lme(Beta~Incent*Correctness+Oppor+(Time-1), random = ~1|Subj, correlation=corAR1(0.5, form=~Time|/Incent/Correctness/Oppor), Data) gives me the following error: Error in attributes(.Data) - c(attributes(.Data), attrib) : 'names' attribute [270] must be the same length as the vector [0] It is probably due to the fact some combinations in the data are missing, but I could not figure out how to group them so that the values on the time variable would be unique within each subject. So how can I do this? Any help is highly appreciated, Gang SubjIncent Oppor Correctness TimeBeta esp005 highnoopp correct Time1 1.156180 esp005 highnoopp correct Time2 -0.1524906 esp005 highnoopp correct Time3 -0.1374741 esp005 highnoopp correct Time4 0.1932847 esp005 highnoopp correct Time5 -0.02334221 esp005 highnoopp correct Time6 -0.4367026 esp005 highnoopp incorr Time1 -1.046768 esp005 highnoopp incorr Time2 0.7754745 esp005 highnoopp incorr Time3 -0.4571449 esp005 highnoopp incorr Time4 0.2027827 esp005 highnoopp incorr Time5 0.1213075 esp005 highnoopp incorr Time6 -0.9145339 esp005 highopp correct Time1 -0.09131422 esp005 highopp correct Time2 1.341441 esp005 highopp correct Time3 0.7011359 esp005 highopp correct Time4 -0.1799809 esp005 highopp correct Time5 0.465437 esp005 highopp correct Time6 -0.3150032 esp005 highopp incorr Time1 -0.8010447 esp005 highopp incorr Time2 0.7445193 esp005 highopp incorr Time3 -0.8163298 esp005 highopp incorr Time4 -0.7792278 esp005 highopp incorr Time5 -0.8525334 esp005 highopp incorr Time6 -0.9935786 esp005 low opp correct Time1 -1.135091 esp005 low opp correct Time2 2.080874 esp005 low opp correct Time3 0.9655223 esp005 low opp correct Time4 -0.4225955 esp005 low opp correct Time5 0.3518235 esp005 low opp correct Time6 0.7110413 esp005 low opp incorr Time1 -0.7773728 esp005 low opp incorr Time2 0.2896505 esp005 low opp incorr Time3 0.2957908 esp005 low opp incorr Time4 0.8173949 esp005 low opp incorr Time5 -0.8015972 esp005 low opp incorr Time6 -0.7611155 esp006 highnoopp correct Time1 -0.2736552 esp006 highnoopp correct Time2 0.0553967 esp006 highnoopp correct Time3 0.06530985 esp006 highnoopp correct Time4 -0.03726416 esp006 highnoopp correct Time5 -0.4627595 esp006 highnoopp correct Time6 -0.1016363 esp006 highnoopp incorr Time1 -0.1492384 esp006 highnoopp incorr Time2 0.02676032 esp006 highnoopp incorr Time3 0.01891842 esp006 highnoopp incorr Time4 0.2098106 esp006 highnoopp incorr Time5 -0.07722018 esp006 highnoopp incorr Time6 -0.0820671 esp006 highopp correct Time1 0.2699376 esp006 highopp correct Time2 0.4860634 esp006 highopp correct Time3 0.09422317 esp006 highopp correct Time4 0.1057343 esp006 highopp correct Time5 -0.09392627 esp006 highopp correct Time6 0.1700330 esp006 highopp incorr Time1 -0.2369925 esp006 highopp incorr Time2 -0.1076743 esp006 highopp incorr Time3 -0.1159856 esp006 highopp incorr Time4 0.07569676 esp006 highopp incorr Time5 -0.3941537 esp006 highopp incorr Time6 0.1117587 esp006 low opp correct Time1
[R] Trouble setting up correlation structure in lme
Hi, I'm trying to set up AR(1) as a correlation structure in modeling some data (attached file data.txt in text format) with lme, but have trouble getting it to work. Incent, Correctness, and Oppor are 3 categorical variables, Beta is a response variable, and Time is an equally-spaced variable with 6 time points (treated as a categorical variable as well). Basically I want to model the correlation with AR(1) or even ARMA along those 6 time points. First, how is the order determined among the 6 levels of Time? Alphabetically? If so, I would have to be very careful about their labels? Or can I specify their order somehow? Secondly, I'm not so sure about the form in corAR1 here. Since the correlation is supposed to be along the Time points, is form=~1 | Subject/Time a correct setup? library(nlme) Data - read.table(data.txt, header=TRUE) fm - lme(Beta~Incent*Correctness+Oppor+(Time-1), random = ~1|Subj, correlation=corAR1(0.5, form=~1|Subj/Time), Data) The above lines do work, but my intuition seems pointing to a form like form = ~time | Subj. However fm - lme(Beta~Incent*Correctness+Oppor+(Time-1), random = ~1|Subj, correlation=corAR1(0.5, form=~Time|Subj), Data) gives me the following error: Error in Initialize.corAR1(X[[2L]], ...) : Covariate must have unique values within groups for corAR1 objects Does it mean that the values on the time variable should be unique within each subject? How can I define a group variable using more than just subject since I have multiple combinations of subject and some other variables? Any help is highly appreciated, Gang SubjIncent Oppor Correctness TimeBeta esp005 highnoopp correct Time1 1.156180 esp005 highnoopp correct Time2 -0.1524906 esp005 highnoopp correct Time3 -0.1374741 esp005 highnoopp correct Time4 0.1932847 esp005 highnoopp correct Time5 -0.02334221 esp005 highnoopp correct Time6 -0.4367026 esp005 highnoopp incorr Time1 -1.046768 esp005 highnoopp incorr Time2 0.7754745 esp005 highnoopp incorr Time3 -0.4571449 esp005 highnoopp incorr Time4 0.2027827 esp005 highnoopp incorr Time5 0.1213075 esp005 highnoopp incorr Time6 -0.9145339 esp005 highopp correct Time1 -0.09131422 esp005 highopp correct Time2 1.341441 esp005 highopp correct Time3 0.7011359 esp005 highopp correct Time4 -0.1799809 esp005 highopp correct Time5 0.465437 esp005 highopp correct Time6 -0.3150032 esp005 highopp incorr Time1 -0.8010447 esp005 highopp incorr Time2 0.7445193 esp005 highopp incorr Time3 -0.8163298 esp005 highopp incorr Time4 -0.7792278 esp005 highopp incorr Time5 -0.8525334 esp005 highopp incorr Time6 -0.9935786 esp005 low opp correct Time1 -1.135091 esp005 low opp correct Time2 2.080874 esp005 low opp correct Time3 0.9655223 esp005 low opp correct Time4 -0.4225955 esp005 low opp correct Time5 0.3518235 esp005 low opp correct Time6 0.7110413 esp005 low opp incorr Time1 -0.7773728 esp005 low opp incorr Time2 0.2896505 esp005 low opp incorr Time3 0.2957908 esp005 low opp incorr Time4 0.8173949 esp005 low opp incorr Time5 -0.8015972 esp005 low opp incorr Time6 -0.7611155 esp006 highnoopp correct Time1 -0.2736552 esp006 highnoopp correct Time2 0.0553967 esp006 highnoopp correct Time3 0.06530985 esp006 highnoopp correct Time4 -0.03726416 esp006 highnoopp correct Time5 -0.4627595 esp006 highnoopp correct Time6 -0.1016363 esp006 highnoopp incorr Time1 -0.1492384 esp006 highnoopp incorr Time2 0.02676032 esp006 highnoopp incorr Time3 0.01891842 esp006 highnoopp incorr Time4 0.2098106 esp006 highnoopp incorr Time5 -0.07722018 esp006 highnoopp incorr Time6 -0.0820671 esp006 highopp correct Time1 0.2699376 esp006 highopp correct Time2 0.4860634 esp006 highopp correct Time3 0.09422317 esp006 highopp correct Time4 0.1057343 esp006 highopp correct Time5 -0.09392627 esp006 highopp correct Time6 0.1700330 esp006 highopp incorr Time1 -0.2369925 esp006 highopp incorr Time2 -0.1076743 esp006 highopp incorr Time3 -0.1159856 esp006
Re: [R] glht() and contrast() comparison
With the example you provided, it seems both glht() and contrast() work fine. Based on my limited experience with contrast(), if you encounter such an error message you just mentioned, check dat.lme$apVar You might see something like this [1] Non-positive definite approximate variance-covariance which may indicate an inappropriate model for the data? Gang On Jan 11, 2008, at 6:04 PM, array chip wrote: Hi, I have been trying glht() from multcomp package and contrast() from contrast package to test a contrast that I am interested in. With the following simulated dataset (fixed effect type with 3 levels (b, m, t), and random effect batch of 4 levels, a randomized block design with interaction), sometimes both glht() and contrast() worked and gave nearly the same p values; other times, glht()worked, but contrast() didn't and gave the error message: Error in dimnames(X) - list(dn[[1L]], unlist(collabs, use.names = FALSE)) : length of 'dimnames' [2] not equal to array extent. so apparently, the error message from contrast() is data dependent. glht() always worked. Here is the code, I just copy and paste in R repeatedly: batch-as.factor(rep(1:4,3,each=20)) type-as.factor(rep(c('b','m','t'),each=80)) y-2*(type=='m')+4*(type=='t')+rnorm(240,100,20) dat-cbind(as.data.frame(y),type=type,batch=batch) rm(batch,type,y) library(nlme) dat.lme-lme(y~type, random=~1|batch/type, data=dat) library(multcomp) summary(glht(dat.lme, linfct = mcp(type =c(b+t-2*m=0))),test=Chisqtest()) library(contrast) contrast(dat.lme, a=list(type=c('b','t')), b=list(type='m'),type='average') Does anybody have a clue? Thanks, John __ __ Never miss a thing. Make Yahoo your home page. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Residual deviance in glm
I'm running a categorical data analysis with a two-way design of nominal by ordinal structure like the Political Ideology Example (Table 9.5) in Agresti's book Categorical Data Analysis. The nominal variable is Method while the ordinal variable is Quality (Bad, Moderate, Good, Excellent). I rank/quantify Quality with another variable QualityR (1, 2, 3, 4), and run the following: fm - glm(Count ~ Quality + Method * QualityR, family=poisson, MyData) I'm pretty happy with the significance testing of the main effects and contrasts. However after examining the following deviances, I'm concerned about the Poisson fitting of the data: = Null deviance: 426.36 on 35 degrees of freedom Residual deviance: 171.71 on 16 degrees of freedom AIC: 369.78 Number of Fisher Scoring iterations: 6 = If I interpret the deviances correctly, it seems the Poisson fitting only explains (426.36-171.71)/426.36 ~ 60% of the total variability in the data. Also with a residual deviance of 171.71 on 16 degrees of freedom, the p value is 3.83738e-28. So does it indicate Poisson is not a good model for the data? If not, how can I improve the fitting? Change the ranking numbers or switch to a different model? Sorry this seems more like a statistical question than R-related. Thanks, Gang [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.