[R] from data.frame to Venn diagram
Hello All,

I have a data.frame with this structure:

    m <- matrix(sample(c(rep('yes', 10), rep('no', 10), NA), 500, replace = TRUE),
                nrow = 100, ncol = 5)
    colnames(m) <- colnames(m, do.NULL = FALSE, prefix = "col")
    m <- as.data.frame(m)

I need to generate a Venn diagram from this data.frame, displaying the various intersections of 'yes' for the different columns. Ideally, the circle for each column should be proportional to the number of non-NA entries.

The package "VennDiagram" (described here: http://www.biomedcentral.com/1471-2105/12/35) can do all this. However, I have not been able to figure out how to transform the data.frame into the required list format.

Any suggestions on how to do this?

Many thanks,
Lara

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
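One possible way to get from such a data.frame to a named list is sketched below in base R. It assumes that the plotting package wants one vector of item identifiers per set, and it uses the row numbers as identifiers. The names yes_sets, set_sizes and overlap_12 are made up for the illustration, and the sketch covers only the 'yes' sets, not the scaling of circles by non-NA counts.

```r
# Example data in the same shape as the post (seed added for reproducibility)
set.seed(1)
m <- matrix(sample(c(rep("yes", 10), rep("no", 10), NA), 500, replace = TRUE),
            nrow = 100, ncol = 5)
colnames(m) <- paste0("col", 1:5)
m <- as.data.frame(m)

# One list element per column: the row numbers whose entry is "yes".
# which() drops NA comparisons, so missing entries are never counted.
yes_sets <- lapply(m, function(column) which(column == "yes"))

# Set sizes and one pairwise intersection, computed by hand as a sanity check
set_sizes <- lengths(yes_sets)
overlap_12 <- intersect(yes_sets$col1, yes_sets$col2)
```

The resulting yes_sets is a plain named list, so it can be inspected with str() before it is handed to any plotting function.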
[R] function recode within sapply
Dear List,

I am using function recode, from package car, within sapply, as follows:

    library(car)

    L3 <- LETTERS[1:3]
    (d <- data.frame(cbind(x = 1, y = 1:10),
                     fac1 = sample(L3, 10, replace = TRUE),
                     fac2 = sample(L3, 10, replace = TRUE),
                     fac3 = sample(L3, 10, replace = TRUE)))
    str(d)

    d[, c("fac1", "fac2")] <- sapply(d[, c("fac1", "fac2")], recode,
                                     "c('A', 'B') = 'XX'", as.factor.result = TRUE)
    d[, "fac3"] <- recode(d[, "fac3"], "c('A', 'B') = 'XX'")
    str(d)

However, the class of columns fac1 and fac2 is "character" as opposed to "factor", even though I specify the option "as.factor.result = TRUE"; this option works fine with a single column.

Any thoughts?

Many thanks,
Lara
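A likely explanation is that sapply() simplifies the per-column results into a single matrix, and a matrix cannot hold factors, so the values arrive as character. The sketch below shows the difference between sapply() and lapply() in base R. To keep it runnable without extra packages it uses collapse_ab(), a hypothetical stand-in for the recode call, not the car function itself.

```r
# Hypothetical stand-in for car::recode(): merge levels "A" and "B" into "XX"
# and return a factor. Used only so the sketch runs without extra packages.
collapse_ab <- function(x) {
  x <- as.character(x)
  x[x %in% c("A", "B")] <- "XX"
  factor(x)
}

set.seed(1)
L3 <- LETTERS[1:3]
d <- data.frame(y = 1:10,
                fac1 = factor(sample(L3, 10, replace = TRUE)),
                fac2 = factor(sample(L3, 10, replace = TRUE)))

# sapply() simplifies the per-column results into one character matrix,
# so the factor class is lost before the assignment happens.
as_matrix <- sapply(d[, c("fac1", "fac2")], collapse_ab)

# lapply() returns a list with one factor per column, which the data.frame
# assignment stores column by column.
d[, c("fac1", "fac2")] <- lapply(d[, c("fac1", "fac2")], collapse_ab)
str(d)
```

Under that assumption, replacing sapply with lapply in the original call should be enough, with the remaining arguments passed on unchanged.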
[R] subsetting a list of matrices
Hi all,

I have an object that looks (roughly) like the following:

    l <- list(a = matrix(rnorm(9), 3),
              b = matrix(rnorm(9), 3),
              c = matrix(rnorm(9), 3))

    l$a[3,] <- sample(c("Message 1", "Message 2", "Message 3"))
    l$b[3,] <- sample(c("Message 1", "Message 2", "Message 3"))
    l$c[3,] <- sample(c("Message 1", "Message 2", "Message 3"))

    rownames(l$a) <- rownames(c(1:3), do.NULL = FALSE, prefix = "row")
    rownames(l$b) <- rownames(c(1:3), do.NULL = FALSE, prefix = "row")
    rownames(l$c) <- rownames(c(1:3), do.NULL = FALSE, prefix = "row")

    colnames(l$a) <- c("V1", "V2", "V3")
    colnames(l$b) <- c("V1", "V2", "V3")
    colnames(l$c) <- c("V1", "V2", "V3")

I want to extract values (row1, V1) for the three sublists a, b, c, but only for those cases in which row3 == "Message 1".

Could someone suggest how to proceed?

Many thanks in advance,
Lara
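The request can be read in more than one way. The sketch below takes one reading: for each matrix, keep the row1 entries of those columns whose row3 entry is "Message 1". The helper make_matrix() and the names row1_hits and row1_values are invented for the illustration.

```r
set.seed(1)
make_matrix <- function() {
  x <- matrix(rnorm(9), 3,
              dimnames = list(paste0("row", 1:3), c("V1", "V2", "V3")))
  # Writing text into row3 turns the whole matrix into character, as in the post
  x["row3", ] <- sample(c("Message 1", "Message 2", "Message 3"))
  x
}
l <- list(a = make_matrix(), b = make_matrix(), c = make_matrix())

# For each matrix, keep the row1 entries of the columns flagged "Message 1"
row1_hits <- lapply(l, function(x) x["row1", x["row3", ] == "Message 1"])

# Convert back to numeric, since the matrices hold character data
row1_values <- sapply(row1_hits, as.numeric)
```

If only column V1 matters, the same pattern applies with the test restricted to x["row3", "V1"].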
[R] kronecker sum
Dear All,

Could someone please suggest how to find the Kronecker sum of two 2x2 matrices, i.e. given two matrices:

    -A    A
     a   -a

and

    -B    B
     b   -b

I need:

    -A-B    A      B      0
     a     -a-B    0      B
     b      0     -A-b    A
     0      b      a     -a-b

Many thanks,
Lara
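A minimal sketch using base R's kronecker() follows. It assumes the usual definition of the Kronecker sum, X (+) Y = X %x% I + I %x% Y, with identity matrices of matching size. The function name kronecker_sum and the numeric values substituted for A, a, B and b are made up for the example. With this definition, the 4x4 layout requested in the post is obtained by taking the second matrix as the left operand.

```r
# Kronecker sum of square matrices X (n x n) and Y (m x m):
# X (+) Y = X %x% I_m + I_n %x% Y
kronecker_sum <- function(X, Y) {
  kronecker(X, diag(nrow(Y))) + kronecker(diag(nrow(X)), Y)
}

# Numeric stand-ins for the symbols in the post (arbitrary values)
A <- 1; a <- 2; B <- 3; b <- 4
Q1 <- matrix(c(-A,  A,
                a, -a), nrow = 2, byrow = TRUE)
Q2 <- matrix(c(-B,  B,
                b, -b), nrow = 2, byrow = TRUE)

# The layout requested in the post has the second matrix as the left operand
result <- kronecker_sum(Q2, Q1)
result
```

Swapping the arguments gives the same entries with rows and columns permuted, so the choice only affects the ordering of the states.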
Re: [R] subsetting a list of dataframes
Thank you all, this is exactly what I had in mind, except that I still have to get my head around apply et al. Back to the books for me then!

Lara

On Tue, May 17, 2011 at 2:41 PM, Jannis wrote:
> Have a look at lapply() and its relative sapply(). Something like:
>
> entries.with.nrows=sapply(data,function(x)dim(x)[1]>1)
>
> should give you a vector with the elements of the list that you seek marked
> with TRUE.
>
> This vector can then be used to extract a subset from your list by:
>
> data.reduced=data[entries.with.nrows]
>
> Or similar
>
> HTH
> Jannis
>
> --- Lara Poplarski wrote on Tue, 17.5.2011:
>
> > From: Lara Poplarski
> > Subject: [R] subsetting a list of dataframes
> > To: r-help@r-project.org
> > Date: Tuesday, 17 May 2011, 20:24
> >
> > Hello All,
> >
> > I have a list of dataframes, and I need to subset it by keeping only those
> > dataframes in the list that meet a certain criterion. Specifically, I need
> > to generate a second list which only includes those dataframes whose number
> > of rows is > 1.
> >
> > Could someone suggest how to do this? I have come close to what I need with
> > loops and such, but there must be a less clumsy way...
> >
> > Many thanks,
> > Lara
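The suggestion in the quoted reply can be tried on a small made-up list. The sketch below builds a logical vector with sapply(), which is needed because a list returned by lapply() cannot be used as an index, and then shows Filter() as an equivalent base R one-liner. The list contents and the name data.reduced2 are invented for the example.

```r
# Hypothetical example list: one data frame with a single row, two with more
data <- list(one   = data.frame(x = 1),
             two   = data.frame(x = 1:2),
             three = data.frame(x = 1:3))

# Logical vector (not a list), suitable for indexing
keep <- sapply(data, nrow) > 1
data.reduced <- data[keep]

# Equivalent one-liner with Filter()
data.reduced2 <- Filter(function(d) nrow(d) > 1, data)
```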
[R] subsetting a list of dataframes
Hello All,

I have a list of dataframes, and I need to subset it by keeping only those dataframes in the list that meet a certain criterion. Specifically, I need to generate a second list which only includes those dataframes whose number of rows is > 1.

Could someone suggest how to do this? I have come close to what I need with loops and such, but there must be a less clumsy way...

Many thanks,
Lara
[R] applying a function to output a matrix
Dear List,

I am applying function distCosine from package geosphere to a list of lat/lon coordinates, and I want to calculate the great circle distance between each pair of coordinates in the list and all other pairs; essentially, the output should be a matrix.

I have been able to achieve this with two nested loops, as in the example below, but this is rather slow. Can someone please suggest how to do this with "apply" or similar?

Many thanks,
Lara

    install.packages("geosphere")
    library(geosphere)

    ## generate sets of random points
    n <- 100
    lon <- runif(n, -180, 180)
    lat <- runif(n, -90, 90)

    ## package geosphere: spherical law of cosines method
    dCos <- matrix(NA_real_, nrow = length(lon), ncol = length(lat))
    for (i in 1:length(lon)) {
      for (j in 1:length(lat)) {
        dCos[i, j] <- distCosine(matrix(c(lon[i], lat[i]), ncol = 2),
                                 matrix(c(lon[j], lat[j]), ncol = 2))
      }
    }
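One loop-free option is outer(), which evaluates a vectorised function over all index pairs at once. The sketch below does not call geosphere. It uses great_circle_cos(), a hand-written spherical law of cosines with an assumed sphere radius of 6378137 m, as a stand-in so that the example runs with base R alone. geosphere also provides distm(), which may build such a matrix directly; its help page is the place to confirm the arguments.

```r
# Hand-written spherical law of cosines (stand-in, not the geosphere function).
# Inputs in degrees; r is an assumed sphere radius in metres.
great_circle_cos <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  phi1 <- lat1 * to_rad
  phi2 <- lat2 * to_rad
  dlon <- (lon2 - lon1) * to_rad
  cos_angle <- sin(phi1) * sin(phi2) + cos(phi1) * cos(phi2) * cos(dlon)
  # Clamp to [-1, 1] to guard against rounding just outside acos()'s domain
  r * acos(pmin(pmax(cos_angle, -1), 1))
}

set.seed(1)
n <- 100
lon <- runif(n, -180, 180)
lat <- runif(n, -90, 90)

# outer() passes all (i, j) index pairs as two long vectors in a single call
idx <- seq_len(n)
dCos <- outer(idx, idx, function(i, j) {
  great_circle_cos(lon[i], lat[i], lon[j], lat[j])
})
```

Because the distance function is vectorised, the whole matrix is computed in one pass instead of n * n separate calls.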
[R] diverting output from nested loops
Dear List,

I have a series of nested loops with the structure shown below, and I am struggling to figure out how to divert output to folders created with dir.create() within the loops.

What I need is for the output to end up as topNameK/subNameL/objNameM.pdf; what I get instead is a series of directories topNameK/, directories subNameL/, and files objNameM.pdf, all in the working directory.

Any hints on how to do this will be much appreciated!

Many thanks in advance,
Lara

    for (K in ...) {
      ... create object ...
      topDirName <- as.character(paste("topName", K, sep = ""))
      topDirMake <- dir.create(topDirName)
      for (L in ...) {
        subDirName <- as.character(paste("subName", L, sep = ""))
        subDirMake <- dir.create(subDirName)
        ... manipulate object ...
        for (M in ...) {
          objectName <- as.character(paste("objName", M, ".pdf", sep = ""))
          pdf(objectName)
          plot(object)
          dev.off()
        }
      }
    }
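Everything lands in the working directory because each name is passed on its own, without the enclosing folder in front. A possible fix is to build the full path with file.path() and create it with dir.create(recursive = TRUE). In the sketch below the root folder under tempdir(), the loop ranges 1:2 and the plotted data are placeholders chosen only for illustration.

```r
# Hypothetical root folder and loop ranges, chosen only for illustration
root <- file.path(tempdir(), "nested_output")

for (K in 1:2) {
  topDirName <- paste0("topName", K)
  for (L in 1:2) {
    # Build the full nested path first, then create it in one step
    subDirPath <- file.path(root, topDirName, paste0("subName", L))
    dir.create(subDirPath, recursive = TRUE, showWarnings = FALSE)
    for (M in 1:2) {
      # Open the PDF inside the nested folder rather than the working directory
      pdf(file.path(subDirPath, paste0("objName", M, ".pdf")))
      plot(1:10, main = paste(K, L, M))
      dev.off()
    }
  }
}
```

Passing full paths avoids calls to setwd(), so the working directory stays the same throughout the loops.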
[R] negative axis values in image() and scalebar in cor.plot() {psych}
Dear List,

I sent a related message yesterday, but did not receive it through the mailing list; on the R-help archives it reads as "An embedded and charset-unspecified text was scrubbed...". So here it is again, with the (little) progress I have made since then. Any help would be greatly appreciated!

I am working with a relatively large correlation matrix (~1600*1600), which I am looking to plot with function cor.plot in package psych. cor.plot adds the scalebar as an extra column to the matrix plot, and then adds an axis with as many tick marks/labels as there are rows in the matrix. This makes the values unreadable even for relatively small matrices (e.g. 100*100).

In order to make the output more readable, one would want the scalebar to have a reasonable number of tick marks/labels, say seq(from = zlim[1], to = zlim[2], by = 0.1) for a given zlim, e.g. zlim=c(-1,1). So for matrix r, I have modified the relevant bit in cor.plot to

    ...
    nf <- dim(r)[2]
    nvar <- dim(r)[1]
    ...
    if (show.legend) {
      #at1 <- (0:(nf))/(nf)
      at2 <- seq(from = zlim[1], to = zlim[2], by = 0.1)
      abline(v = (nf - 0.5)/nf)
      axis(4, at = at2, labels = at2, las = 2, ...)
    }

However, only the values between 0 and 1 appear as axis labels. I believe this has to do with the analogous behavior in image(), for example:

    r <- cor(matrix(rnorm(6000), 60, 100))
    image(r, zlim = c(-1, 1), axes = FALSE)
    at2 <- seq(from = -1, to = 1, by = 0.1)
    axis(4, at = at2, labels = at2, las = 2, cex.axis = .5)

It is not clear to me from the image() help file why this would be. Could someone provide any hints on this, or on how to modify cor.plot() successfully?

Many thanks in advance,
Lara
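A probable cause is that image(), when called without x and y, places the cells on coordinates running from 0 to 1. The at argument of axis() is given in those plot coordinates, not in z values, so ticks at negative positions fall outside the plot region. The sketch below rescales the desired label values linearly onto 0-1 before placing them. The names labels_z and at_pos are made up. Such labels are meaningful only when the axis belongs to a legend strip that runs linearly from zlim[1] to zlim[2]; whether the scalebar in cor.plot does so is an assumption to check.

```r
set.seed(1)
r <- cor(matrix(rnorm(6000), 60, 100))
zlim <- c(-1, 1)

# Values that should appear as labels
labels_z <- seq(from = zlim[1], to = zlim[2], by = 0.1)

# Map each value linearly from [zlim[1], zlim[2]] onto the 0-1 plot coordinates
at_pos <- (labels_z - zlim[1]) / (zlim[2] - zlim[1])

image(r, zlim = zlim, axes = FALSE)
axis(4, at = at_pos, labels = round(labels_z, 1), las = 2, cex.axis = 0.5)
```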
[R] modify the scalebar in cor.plot {psych}?
Dear List,

I am working with a relatively large correlation matrix (~1600*1600), which I am looking to plot with function cor.plot in package psych. cor.plot draws a scalebar with as many tick marks/subdivisions as there are rows in the matrix, which makes the values unreadable even for relatively small matrices (e.g. 100*100); this is because the scalebar is added as an extra column to the matrix plot.

Could someone suggest how to modify cor.plot to make the output more readable? Ideally, for large matrices one would want the scalebar drawn separately from the matrix, and with a reasonable number of subdivisions, say seq(-1,1,0.1) for zlim=c(-1,1).

Many thanks in advance,
Lara
Re: [R] rotate column names in large matrix
Thank you both, these are very helpful hints.

Chris, could you please suggest how to modify what you sent to also show the same labels as (horizontal) row names? I have not yet mastered the details of R graphics...

Many thanks in advance,
Lara

On Mon, Nov 15, 2010 at 3:25 PM, Chris Stubben wrote:
>
> You could display the matrix as an image then add column names (rotated using
> srt) if you really want them.
>
> x <- cor(matrix(rnorm(6000), 60, 100))
>
> # set margins with extra space at top and xpd=TRUE to write outside plot region
> op<-par(mar=c(1,1,5,1), xpd=TRUE)
>
> # display image without the 90 degree counter clockwise rotation
> image(t(x[nrow(x):1,]), axes=FALSE)
>
> ## add 100 column names
> y<-paste("column", 1:100)
> text( seq(0,1,length=100) , 1.01, y, pos = 2, srt = 270, offset=0, cex=.7)
>
> Chris Stubben
>
> Lara Poplarski wrote:
> >
> > I have a large (1600*1600) matrix generated with symnum, that I am using to
> > eyeball the structure of a dataset.
> >
> > I have abbreviated the column names with the abbr.colnames option. One way
> > to get an even more compact view of the matrix would be to display the
> > column names rotated by 90 degrees.
> >
> > Any pointers on how to do this would be most useful. Any other tips for
> > displaying the matrix in compact form are of course also welcome.
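One way the quoted approach might be extended to show the same labels horizontally as row names is sketched below, using base graphics only. The wider left margin, the label vector labs and the positions in row_pos are choices made for this example. Because the rows are reversed before plotting, row 1 sits at the top of the image.

```r
set.seed(1)
x <- cor(matrix(rnorm(6000), 60, 100))
n <- nrow(x)
labs <- paste("column", 1:n)   # the same labels are used for rows and columns

# Margins are c(bottom, left, top, right): extra room on the left and at the top
op <- par(mar = c(1, 5, 5, 1), xpd = TRUE)
image(t(x[n:1, ]), axes = FALSE)

# Column labels along the top, rotated as in the quoted reply
text(seq(0, 1, length.out = n), 1.01, labs, pos = 2, srt = 270, offset = 0, cex = 0.5)

# The rows were reversed for display, so row 1 sits at the top (y = 1)
row_pos <- seq(1, 0, length.out = n)
text(-0.01, row_pos, labs, pos = 2, offset = 0, cex = 0.5)

par(op)
```

With 1600 rows the labels will overlap at any readable size, so in practice it may be necessary to label only every k-th row.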
[R] rotate column names in large matrix
Dear List,

I have a large (1600*1600) matrix generated with symnum, that I am using to eyeball the structure of a dataset.

I have abbreviated the column names with the abbr.colnames option. One way to get an even more compact view of the matrix would be to display the column names rotated by 90 degrees.

Any pointers on how to do this would be most useful. Any other tips for displaying the matrix in compact form are of course also welcome.

Many thanks,
Lara
[R] exploratory analysis of large categorical datasets
Dear List,

I am looking to perform exploratory analyses of two (relatively) large datasets of categorical data. The first one is a binary 80x100 matrix, in the form:

    matrix(sample(c(0, 1), 25, replace = TRUE), nrow = 5, ncol = 5,
           dimnames = list(c("group1", "group2", "group3", "group4", "group5"),
                           c("V.1", "V.2", "V.3", "V.4", "V.5")))

and the second one is a multistate 750x1500 matrix, with up to 15 *unordered* states per variable, in the form:

    matrix(sample(c(1:15), 25, replace = TRUE), nrow = 5, ncol = 5,
           dimnames = list(c("group1", "group2", "group3", "group4", "group5"),
                           c("V.1", "V.2", "V.3", "V.4", "V.5")))

Specifically, I am looking to see which pairs of variables are correlated. For continuous data, I would use cor() and cov() to generate the correlation matrix and the variance-covariance matrix, which I would then visualize with symnum() or image(). However, it is not clear to me whether this approach is suitable for categorical data of this kind.

Since I am new to R, I would greatly appreciate any input on how to approach this task and on efficient visualization of the results.

Many thanks in advance,
Lara
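For unordered categories, one commonly used measure of pairwise association is Cramér's V, which is derived from the chi-squared statistic of the two-way table. Whether it suits these particular datasets is an assumption. The sketch below computes it for every pair of columns in base R. The functions cramers_v() and pairwise_v() and the toy matrix are invented for the example.

```r
# Cramér's V between two categorical vectors (0 = no association, 1 = perfect)
cramers_v <- function(a, b) {
  tab <- table(a, b)
  k <- min(dim(tab)) - 1
  if (k < 1) return(NA_real_)          # undefined when a variable is constant
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
  unname(sqrt(chi2 / (sum(tab) * k)))
}

# Apply cramers_v() to every pair of columns of a matrix
pairwise_v <- function(mat) {
  p <- ncol(mat)
  out <- matrix(NA_real_, p, p, dimnames = list(colnames(mat), colnames(mat)))
  for (i in seq_len(p)) {
    for (j in seq_len(p)) {
      out[i, j] <- cramers_v(mat[, i], mat[, j])
    }
  }
  out
}

# Small made-up example: V.3 repeats V.1, while V.2 is balanced against both
toy <- cbind(V.1 = rep(1:3, times = 3),
             V.2 = rep(1:3, each = 3),
             V.3 = rep(1:3, times = 3))
v <- pairwise_v(toy)
```

The result is a symmetric matrix with values between 0 and 1, so it can be displayed with image(v, zlim = c(0, 1)) in the same way as a correlation matrix. With 1500 variables the double loop runs more than two million tests, so it may be worth computing only the upper triangle.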