Re: [R] Rewriting the Biplot Function
Boris, Thanks very much, this looks just like what I need! All the best, Scott From: Boris Steipe [boris.ste...@utoronto.ca] Sent: 20 January 2015 20:21 To: Scott Robinson Cc: r-help@r-project.org Subject: Re: [R] Rewriting the Biplot Function Scott - A few months ago I posted on this list a modified version of biplot that takes a colour argument (and preserves the axes information, so you can use points() ). I don't have time right now to experiment and look at your code, but perhaps this does "out of the box" what you need. Cheers, Boris Previous post == Since I've wanted this capability for some time, I modified the original biplot() to accept a type parameter type={"t" (Default) | "p" | "n"}. For "t", the function behaves almost exactly as before. For "p" it plots points, and should accept all the usual arguments for that. For "n" it plots nothing except the axes. You can then add the points as desired. I also added two parameters col.arrows = "red", and col.text = "black" to have extra control. Here is an example. (Note, you have to load the function, below, first. library(MASS) data(crabs) PRC <- prcomp(crabs[, 4:8]) myBiplot(PRC) myBiplot(PRC, choices=2:3, cex = 0.7, col.text="#445599") # much as before # use filled points, color by the value found in column 4 of the data r <- range(crabs[,4]) n <- 15 cv <- cm.colors(n) c <- cv[cut(crabs[,4],n)] myBiplot(PRC, choices=2:3, type = "p", pch=20, col=c, col.arrows = "#FF6600") # finally: plot nothing then use points() for detailed control myBiplot(PRC, choices=2:3, type = "n") # no points # blue/orange crab males/females - as given by rows in the data points(PRC$x[ 1: 50, 2:3], pch=21, bg="#0066FF") points(PRC$x[ 51:100, 2:3], pch=24, bg="#0066FF") points(PRC$x[101:150, 2:3], pch=21, bg="#FF6600") points(PRC$x[151:200, 2:3], pch=24, bg="#FF6600") == myBiplot <- function (x, choices = 1L:2L, scale = 1, pc.biplot = FALSE, var.axes = TRUE, type = "t", col, col.arrows = "#FF", col.text = "#00", cex = rep(par("cex"), 2), expand = 1, xlabs = NULL, ylabs = NULL, xlim = NULL, ylim = NULL, main = NULL, sub = NULL, xlab = NULL, ylab = NULL, arrow.len = 0.1, ... ) { if (length(choices) != 2L) stop("length of choices must be 2") if (!length(scores <- x$x)) stop(gettextf("object '%s' has no scores", deparse(substitute(x))), domain = NA) if (is.complex(scores)) stop("biplots are not defined for complex PCA") lam <- x$sdev[choices] n <- NROW(scores) lam <- lam * sqrt(n) if (scale < 0 || scale > 1) warning("'scale' is outside [0, 1]") if (scale != 0) lam <- lam^scale else lam <- 1 if (pc.biplot) lam <- lam/sqrt(n) y <- t(t(x$rotation[, choices]) * lam) x <- t(t(scores[, choices])/lam) # note that from here on # x is no longer the PC object # originally pased into the function n <- nrow(x) p <- nrow(y) if (missing(xlabs)) { xlabs <- dimnames(x)[[1L]] if (is.null(xlabs)) xlabs <- 1L:n } xlabs <- as.character(xlabs) dimnames(x) <- list(xlabs, dimnames(x)[[2L]]) if (missing(ylabs)) { ylabs <- dimnames(y)[[1L]] if (is.null(ylabs)) ylabs <- paste("Var", 1L:p) } ylabs <- as.character(ylabs) dimnames(y) <- list(ylabs, dimnames(y)[[2L]]) if (length(cex) == 1L) cex <- c(cex, cex) unsigned.range <- function(x) c(-abs(min(x, na.rm = TRUE)), abs(max(x, na.rm = TRUE))) rangx1 <- unsigned.range(x[, 1L]) rangx2 <- unsigned.range(x[, 2L]) rangy1 <- unsigned.range(y[, 1L]) rangy2 <- unsigned.range(y[, 2L]) if (missing(xlim) && missing(ylim)) xlim <- ylim <- rangx1 <- rangx2 <- range(rangx1, rangx2) else if (missing(xlim)) xlim <- rangx1 else if (missing(ylim)) ylim <- rangx2 ratio <- max(rangy1/rangx1, rangy2/rangx2)/expand on.exit(par(op)) op <- par(pty = "s") if (!is.null(main)) op <- c(op, par(mar = par("mar") + c(0, 0, 1, 0)
[R] Rewriting the Biplot Function
Dear R-Help, I have been trying to rewrite the base biplot.prcomp function but am coming across some errors I don't understand. It seems like R is still 'expecting' the same values despite me rewriting and renaming the methods. My aim is simply to have an additional biplot function which could use a vector of colours, where each position of the vector gives the colour for a variable name and its arrow. Another issue I have with the default function is that when I save a very large image (to get good seperation and readability when looking at hundreds of variables) the names are very displaced from the arrows, but I haven't even started looking into that yet... I ran the code on my actual data and got the error: > colouredBiplot(prc, yCol=rep("#00", 962)) Error in if (yCol == NULL) { : argument is of length zero > traceback() 2: colouredBiplot.internal(t(t(scores[, choices])/lam), t(t(x$rotation[, choices]) * lam), yCol, ...) at colouredBiplot.R#103 1: colouredBiplot(prc, yCol = rep("#00", 962)) However when I tried creating a small example I got a different error: > options(stringsAsFactors=F) > source("C:/Work/InGenious/InGen/colouredBiplot.R") > > random <- matrix(rexp(200, rate=.1), ncol=20) > prc <- prcomp(random, center=T, scale=T) > colouredBiplot(random, rep("#00", 20)) Error in colouredBiplot(random, rep("#00", 20)) : length of choices must be 2 > sessionInfo() R version 3.0.2 (2013-09-25) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United Kingdom.1252 [2] LC_CTYPE=English_United Kingdom.1252 [3] LC_MONETARY=English_United Kingdom.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United Kingdom.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base and then immediately after the above I thought to try a traceback() and got another different error again, which I don't even understand how is possible: > colouredBiplot(random, yCol=rep("#00", 20)) Error in x$x : $ operator is invalid for atomic vectors > traceback() 1: colouredBiplot(random, yCol = rep("#00", 20)) The only things I have changed are to pass in a vector "yCol" and use it inside the else blocks of some conditionals testing 'if(yCol==NULL)': colouredBiplot.internal <- function (x, y, var.axes = TRUE, col, cex = rep(par("cex"), 2), xlabs = NULL, ylabs = NULL, expand = 1, xlim = NULL, ylim = NULL, arrow.len = 0.1, main = NULL, sub = NULL, xlab = NULL, ylab = NULL, yCol=NULL, ...) { n <- nrow(x) p <- nrow(y) if (missing(xlabs)) { xlabs <- dimnames(x)[[1L]] if (is.null(xlabs)) xlabs <- 1L:n } xlabs <- as.character(xlabs) dimnames(x) <- list(xlabs, dimnames(x)[[2L]]) if (missing(ylabs)) { ylabs <- dimnames(y)[[1L]] if (is.null(ylabs)) ylabs <- paste("Var", 1L:p) } ylabs <- as.character(ylabs) dimnames(y) <- list(ylabs, dimnames(y)[[2L]]) if (length(cex) == 1L) cex <- c(cex, cex) if (missing(col)) { col <- par("col") if (!is.numeric(col)) col <- match(col, palette(), nomatch = 1L) col <- c(col, col + 1L) } else if (length(col) == 1L) col <- c(col, col) unsigned.range <- function(x) c(-abs(min(x, na.rm = TRUE)), abs(max(x, na.rm = TRUE))) rangx1 <- unsigned.range(x[, 1L]) rangx2 <- unsigned.range(x[, 2L]) rangy1 <- unsigned.range(y[, 1L]) rangy2 <- unsigned.range(y[, 2L]) if (missing(xlim) && missing(ylim)) xlim <- ylim <- rangx1 <- rangx2 <- range(rangx1, rangx2) else if (missing(xlim)) xlim <- rangx1 else if (missing(ylim)) ylim <- rangx2 ratio <- max(rangy1/rangx1, rangy2/rangx2)/expand on.exit(par(op)) op <- par(pty = "s") if (!is.null(main)) op <- c(op, par(mar = par("mar") + c(0, 0, 1, 0))) plot(x, type = "n", xlim = xlim, ylim = ylim, col = col[1L], xlab = xlab, ylab = ylab, sub = sub, main = main, ...) text(x, xlabs, cex = cex[1L], col = col[1L], ...) par(new = TRUE) dev.hold() on.exit(dev.flush(), add = TRUE) if(yCol==NULL){ plot(y, axes = FALSE, type = "n", xlim = xlim * ratio, ylim = ylim * ratio, xlab = "", ylab = "", col = col[1L], ...) } else{ plot(y, axes = FALSE, type = "n", xlim = xlim * ratio, ylim = ylim * ratio, xlab = "", ylab = "", col = yCol, ...) } axis(3, col = col[2L], ...) axis(4, col = col[2L], ...) box(col = col[1L]) if(yCol==NULL){ text(y, labels = ylabs, cex = cex[2L], col = col[2L], ...) } else{ text(y, labels = ylabs, cex = cex[2L], col = yCol, ...) } if (var.axes){ if(yCol==NULL){ arrows(0, 0, y[, 1L] * 0.8, y[, 2L] * 0.8, col = col[2L], length = arrow.len) }else{ arrows(0, 0, y[, 1L] * 0.8, y[, 2L] * 0.8, col = yCol, length = arrow.len)}
Re: [R] BH correction with p.adjust
My understanding was that the vector was ranked, the adjusted p vector was calculated and then the vector is returned to the original order - I came across a stack overflow answer saying this: http://stackoverflow.com/questions/10323817/r-unexpected-results-from-p-adjust-fdr Although the code there does not appear to be the same as when I type "p.adjust" into the command line. The order shouldn't matter anyway since my data was ordered by p. Yesterday I tried a short example of 5 numbers and it seemed to work out though today I tried to do another short example to demonstrate that the order in the p vector you input doesn't matter but didn't quite get a working example this time. Maybe due to a rounding to first significant figure or something? > smallP <- c(0.01, 0.5, 0.0001) > names(smallP) <- c("first", "second", "last") > > p.adjust(smallP) first second last 2e-02 5e-01 3e-04 > > 0.01*3/2 [1] 0.015 > 0.5*3/3 [1] 0.5 > 0.0001*3/1 [1] 3e-04 In any case I reconstructed a large example which can be run without real data where the figure is way off and definitely not the result of a rounding error: > exampleP <- seq(from=0.001, to=0.1, by=0.0001) > length(exampleP) [1] 991 > > examplePBH <- p.adjust(exampleP, method="BH") > > exampleP[1] [1] 1e-07 > > examplePBH[1] [1] 0.1 > > exampleP[1]*length(exampleP)/1 [1] 0.991 Any help with this would be very much appreciated. It seems like it ought to be such a simple and commonly used method and yet I am struggling and not sure what to do about it. Thanks, Scott From: David Winsemius [dwinsem...@comcast.net] Sent: 21 July 2013 03:33 To: Scott Robinson Cc: r-help@r-project.org Subject: Re: [R] BH correction with p.adjust On Jul 20, 2013, at 10:37 AM, Scott Robinson wrote: > Dear List, > > I have been trying to use p.adjust() to do BH multiple test correction and > have gotten some unexpected results. I thought that the equation for this was: > > pBH = p*n/i Looking at the code for `p.adjust`, you see that the method is picked from a switch function lp <- length(p) BH = { i <- lp:1L o <- order(p, decreasing = TRUE) ro <- order(o) pmin(1, cummin(n/i * p[o]))[ro] } You may not have sorted the p-values in pList. > > where p is the original p value, n is the number of tests and i is the rank > of the p value. However when I try and recreate the corrected p from my most > significant value it does not match up to the one computed by the method > p.adjust: > >> setwd("C:/work/Methylation/IMA/GM/siteLists") >> >> hypTable <- read.delim("hypernormal vs others.txt") >> pList <- hypTable$p >> names(pList) <- hypTable$site >> >> adjusted <- p.adjust(pList, method="BH") >> adjusted[1] > cg27433479 > 0.05030589 >> >> pList[1]*nrow(hypTable)/1 > cg27433479 > 0.09269194 > No data provided, so unable to pursue this further. > I tried to recreate this is a small example of a vector of 5 p values but > everything worked as expected there. I was wondering if there is some subtle > difference about how p.adjust operates? Is there something more complicated > about how to calculate 'n' or 'i' - perhaps due to identical p values being > assigned the same rank or something? Does anyone have an idea what might be > going on here? -- David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] BH correction with p.adjust
Dear List, I have been trying to use p.adjust() to do BH multiple test correction and have gotten some unexpected results. I thought that the equation for this was: pBH = p*n/i where p is the original p value, n is the number of tests and i is the rank of the p value. However when I try and recreate the corrected p from my most significant value it does not match up to the one computed by the method p.adjust: > setwd("C:/work/Methylation/IMA/GM/siteLists") > > hypTable <- read.delim("hypernormal vs others.txt") > pList <- hypTable$p > names(pList) <- hypTable$site > > adjusted <- p.adjust(pList, method="BH") > adjusted[1] cg27433479 0.05030589 > > pList[1]*nrow(hypTable)/1 cg27433479 0.09269194 I tried to recreate this is a small example of a vector of 5 p values but everything worked as expected there. I was wondering if there is some subtle difference about how p.adjust operates? Is there something more complicated about how to calculate 'n' or 'i' - perhaps due to identical p values being assigned the same rank or something? Does anyone have an idea what might be going on here? Many thanks, Scott __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Iterating through slots of an S4 object
I want to loop through slots of an S4 object and am unsure how to do so since the only way I can find to access them is individually in the form 'object@slotName'. I have guessed at a few possibilities which did not work and I have read some S4 object tutorials and things but still unsuccessful. I presume it is possible though? Any help would be much appreciated, Scott [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] WriteXLS: 'object not found' error within function
Dear All, I am using WriteXLS to write tables with multiple sheets with the command: WriteXLS("tables", ExcelFileName = fileName, SheetNames = tableList, perl = "perl", verbose = FALSE, Encoding = c("UTF-8", "latin1"), row.names = TRUE, col.names = TRUE, AdjWidth = TRUE, AutoFilter = FALSE, BoldHeaderRow = FALSE, FreezeRow = 0, FreezeCol = 0, envir = parent.frame()) ...where "tables" is the name of a list of data.frames which are the tables to go into my sheets. I am having no problem running my code unless it is within a function in which case I get the error: "Error in get(as.character(x), envir = envir) : object 'tables' not found" I have checked that there is not an issue with scope using these two lines within my function immediately before calling the WriteXLS function: print(class(tables)) flush.console() [1] "list" At least I would have thought this would have ruled out any scope-related issues. Has anyone else had this problem or have any ideas why it might be happening? Thanks, Scott PS here is the function in full: makeTables <- function(design, designInfo, oldMatrix, matrix, annot, tableList) { rownames(design) <- designInfo[,1] fit <- lmFit(matrix, design) fit <- eBayes(fit) tables<-list() for (i in 1:length(tableList)) { table <- makeTable(oldMatrix, matrix, annot, fit, tableList[i]) print(paste(tableList[i], "=", sep="")) print(table) tables[[i]] <- table } #Saving bit (set to FALSE if not using) saveIt <- TRUE if(saveIt) { fileName <- paste0("~",paste(colnames(design)[2:length(colnames(design))], collapse = "+"),".xls") print(paste("sheetnames=",length(tableList), print(class(tables)) flush.console() WriteXLS("tables", ExcelFileName = fileName, SheetNames = tableList, perl = "perl", verbose = FALSE, Encoding = c("UTF-8", "latin1"), row.names = TRUE, col.names = TRUE, AdjWidth = TRUE, AutoFilter = FALSE, BoldHeaderRow = FALSE, FreezeRow = 0, FreezeCol = 0, envir = parent.frame()) } } __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.