Re: [R] Rewriting the Biplot Function

2015-01-20 Thread Scott Robinson
Boris,

Thanks very much, this looks just like what I need!

All the best,

Scott


From: Boris Steipe [boris.ste...@utoronto.ca]
Sent: 20 January 2015 20:21
To: Scott Robinson
Cc: r-help@r-project.org
Subject: Re: [R] Rewriting the Biplot Function

Scott -

A few months ago I posted on this list a modified version of biplot that takes 
a colour argument (and preserves the axes information, so you can use points() 
). I don't have time right now to experiment and look at your code, but perhaps 
this does "out of the box" what you need.

Cheers,
Boris

  Previous post ==
Since I've wanted this capability for some time, I modified the original 
biplot() to accept a type parameter type={"t" (Default) | "p" | "n"}. For "t", 
the function behaves almost exactly as before. For "p" it plots points, and 
should accept all the usual arguments for that. For "n" it plots nothing except 
the axes. You can then add the points as desired.

I also added two parameters col.arrows = "red", and col.text = "black" to have 
extra control.

Here is an example. (Note, you have to load the function, below, first.





library(MASS)
data(crabs)
PRC <- prcomp(crabs[, 4:8])

myBiplot(PRC)
myBiplot(PRC, choices=2:3, cex = 0.7, col.text="#445599") # much as before


# use filled points, color by the value found in column 4 of the data
r <- range(crabs[,4])
n <- 15
cv <- cm.colors(n)
c <- cv[cut(crabs[,4],n)]
myBiplot(PRC, choices=2:3, type = "p", pch=20, col=c, col.arrows = "#FF6600")


# finally: plot nothing then use points() for detailed control
myBiplot(PRC, choices=2:3, type = "n")  # no points

# blue/orange crab males/females - as given by rows in the data
points(PRC$x[  1: 50, 2:3], pch=21, bg="#0066FF")
points(PRC$x[ 51:100, 2:3], pch=24, bg="#0066FF")
points(PRC$x[101:150, 2:3], pch=21, bg="#FF6600")
points(PRC$x[151:200, 2:3], pch=24, bg="#FF6600")


==
myBiplot <- function (x, choices = 1L:2L, scale = 1,
 pc.biplot = FALSE, var.axes = TRUE,
 type = "t",
 col,
 col.arrows = "#FF",
 col.text = "#00",
 cex = rep(par("cex"), 2),
 expand = 1,
 xlabs = NULL, ylabs = NULL,
 xlim = NULL, ylim = NULL,
 main = NULL, sub = NULL,
 xlab = NULL, ylab = NULL,
 arrow.len = 0.1,
 ...
 )

{
if (length(choices) != 2L)
   stop("length of choices must be 2")
   if (!length(scores <- x$x))
   stop(gettextf("object '%s' has no scores", deparse(substitute(x))),
   domain = NA)
   if (is.complex(scores))
   stop("biplots are not defined for complex PCA")

   lam <- x$sdev[choices]
   n <- NROW(scores)
   lam <- lam * sqrt(n)

   if (scale < 0 || scale > 1)
   warning("'scale' is outside [0, 1]")
   if (scale != 0)
   lam <- lam^scale
   else lam <- 1
   if (pc.biplot)
   lam <- lam/sqrt(n)

   y <- t(t(x$rotation[, choices]) * lam)
   x <- t(t(scores[, choices])/lam)  # note that from here on
 # x is no longer the PC object
 # originally pased into the function
   n <- nrow(x)
   p <- nrow(y)

   if (missing(xlabs)) {
   xlabs <- dimnames(x)[[1L]]
   if (is.null(xlabs))
   xlabs <- 1L:n
   }
   xlabs <- as.character(xlabs)
   dimnames(x) <- list(xlabs, dimnames(x)[[2L]])

   if (missing(ylabs)) {
   ylabs <- dimnames(y)[[1L]]
   if (is.null(ylabs))
   ylabs <- paste("Var", 1L:p)
   }
   ylabs <- as.character(ylabs)
   dimnames(y) <- list(ylabs, dimnames(y)[[2L]])

   if (length(cex) == 1L)
   cex <- c(cex, cex)

   unsigned.range <- function(x) c(-abs(min(x, na.rm = TRUE)),
   abs(max(x, na.rm = TRUE)))
   rangx1 <- unsigned.range(x[, 1L])
   rangx2 <- unsigned.range(x[, 2L])
   rangy1 <- unsigned.range(y[, 1L])
   rangy2 <- unsigned.range(y[, 2L])

   if (missing(xlim) && missing(ylim))
   xlim <- ylim <- rangx1 <- rangx2 <- range(rangx1, rangx2)
   else if (missing(xlim))
   xlim <- rangx1
   else if (missing(ylim))
   ylim <- rangx2

   ratio <- max(rangy1/rangx1, rangy2/rangx2)/expand
   on.exit(par(op))
   op <- par(pty = "s")
   if (!is.null(main))
   op <- c(op, par(mar = par("mar") + c(0, 0, 1, 0)

[R] Rewriting the Biplot Function

2015-01-20 Thread Scott Robinson
Dear R-Help,

I have been trying to rewrite the base biplot.prcomp function but am coming 
across some errors I don't understand. It seems like R is still 'expecting' the 
same values despite me rewriting and renaming the methods. My aim is simply to 
have an additional biplot function which could use a vector of colours, where 
each position of the vector gives the colour for a variable name and its arrow.

Another issue I have with the default function is that when I save a very large 
image (to get good seperation and readability when looking at hundreds of 
variables) the names are very displaced from the arrows, but I haven't even 
started looking into that yet...

I ran the code on my actual data and got the error:
> colouredBiplot(prc, yCol=rep("#00", 962))
Error in if (yCol == NULL) { : argument is of length zero
> traceback()
2: colouredBiplot.internal(t(t(scores[, choices])/lam), t(t(x$rotation[,
   choices]) * lam), yCol, ...) at colouredBiplot.R#103
1: colouredBiplot(prc, yCol = rep("#00", 962))

However when I tried creating a small example I got a different error:

> options(stringsAsFactors=F)
> source("C:/Work/InGenious/InGen/colouredBiplot.R")
>
> random <- matrix(rexp(200, rate=.1), ncol=20)
> prc <- prcomp(random, center=T, scale=T)
> colouredBiplot(random, rep("#00", 20))
Error in colouredBiplot(random, rep("#00", 20)) :
  length of choices must be 2
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base




and then immediately after the above I thought to try a traceback() and got 
another different error again, which I don't even understand how is possible:

> colouredBiplot(random, yCol=rep("#00", 20))
Error in x$x : $ operator is invalid for atomic vectors
> traceback()
1: colouredBiplot(random, yCol = rep("#00", 20))




The only things I have changed are to pass in a vector "yCol" and use it inside 
the else blocks of some conditionals testing 'if(yCol==NULL)':

colouredBiplot.internal <- function (x, y, var.axes = TRUE, col, cex = 
rep(par("cex"), 2),
xlabs = NULL, ylabs = NULL, expand = 1, xlim = NULL, ylim = NULL,
arrow.len = 0.1, main = NULL, sub = NULL, xlab = NULL, ylab = NULL, 
yCol=NULL,
...)
{
n <- nrow(x)
p <- nrow(y)
if (missing(xlabs)) {
xlabs <- dimnames(x)[[1L]]
if (is.null(xlabs))
xlabs <- 1L:n
}
xlabs <- as.character(xlabs)
dimnames(x) <- list(xlabs, dimnames(x)[[2L]])
if (missing(ylabs)) {
ylabs <- dimnames(y)[[1L]]
if (is.null(ylabs))
ylabs <- paste("Var", 1L:p)
}
ylabs <- as.character(ylabs)
dimnames(y) <- list(ylabs, dimnames(y)[[2L]])
if (length(cex) == 1L)
cex <- c(cex, cex)
if (missing(col)) {
col <- par("col")
if (!is.numeric(col))
col <- match(col, palette(), nomatch = 1L)
col <- c(col, col + 1L)
}
else if (length(col) == 1L)
col <- c(col, col)
unsigned.range <- function(x) c(-abs(min(x, na.rm = TRUE)),
abs(max(x, na.rm = TRUE)))
rangx1 <- unsigned.range(x[, 1L])
rangx2 <- unsigned.range(x[, 2L])
rangy1 <- unsigned.range(y[, 1L])
rangy2 <- unsigned.range(y[, 2L])
if (missing(xlim) && missing(ylim))
xlim <- ylim <- rangx1 <- rangx2 <- range(rangx1, rangx2)
else if (missing(xlim))
xlim <- rangx1
else if (missing(ylim))
ylim <- rangx2
ratio <- max(rangy1/rangx1, rangy2/rangx2)/expand
on.exit(par(op))
op <- par(pty = "s")
if (!is.null(main))
op <- c(op, par(mar = par("mar") + c(0, 0, 1, 0)))
plot(x, type = "n", xlim = xlim, ylim = ylim, col = col[1L],
xlab = xlab, ylab = ylab, sub = sub, main = main, ...)
text(x, xlabs, cex = cex[1L], col = col[1L], ...)
par(new = TRUE)
dev.hold()
on.exit(dev.flush(), add = TRUE)
if(yCol==NULL){
plot(y, axes = FALSE, type = "n", xlim = xlim * ratio, ylim = ylim *
ratio, xlab = "", ylab = "", col = col[1L], ...)
}
else{
plot(y, axes = FALSE, type = "n", xlim = xlim * ratio, ylim = ylim *
ratio, xlab = "", ylab = "", col = yCol, ...)
}
axis(3, col = col[2L], ...)
axis(4, col = col[2L], ...)
box(col = col[1L])
if(yCol==NULL){
text(y, labels = ylabs, cex = cex[2L], col = col[2L], ...)
} else{
text(y, labels = ylabs, cex = cex[2L], col = yCol, ...)
}
if (var.axes){
if(yCol==NULL){
arrows(0, 0, y[, 1L] * 0.8, y[, 2L] * 0.8, col = col[2L],
length = arrow.len)
}else{
arrows(0, 0, y[, 1L] * 0.8, y[, 2L] * 0.8, col = yCol,
length = arrow.len)}
   

Re: [R] BH correction with p.adjust

2013-07-21 Thread Scott Robinson
My understanding was that the vector was ranked, the adjusted p vector was 
calculated and then the vector is returned to the original order - I came 
across a stack overflow answer saying this:

http://stackoverflow.com/questions/10323817/r-unexpected-results-from-p-adjust-fdr

Although the code there does not appear to be the same as when I type 
"p.adjust" into the command line. The order shouldn't matter anyway since my 
data was ordered by p.

Yesterday I tried a short example of 5 numbers and it seemed to work out though 
today I tried to do another short example to demonstrate that the order in the 
p vector you input doesn't matter but didn't quite get a working example this 
time. Maybe due to a rounding to first significant figure or something?

> smallP <- c(0.01, 0.5, 0.0001)
> names(smallP) <- c("first", "second", "last")
> 
> p.adjust(smallP)
 first second   last 
 2e-02  5e-01  3e-04 
> 
> 0.01*3/2
[1] 0.015
> 0.5*3/3
[1] 0.5
> 0.0001*3/1
[1] 3e-04

In any case I reconstructed a large example which can be run without real data 
where the figure is way off and definitely not the result of a rounding error:

> exampleP <- seq(from=0.001, to=0.1, by=0.0001)
> length(exampleP)
[1] 991
> 
> examplePBH <- p.adjust(exampleP, method="BH")
> 
> exampleP[1]
[1] 1e-07
> 
> examplePBH[1]
[1] 0.1
> 
> exampleP[1]*length(exampleP)/1
[1] 0.991

Any help with this would be very much appreciated. It seems like it ought to be 
such a simple and commonly used method and yet I am struggling and not sure 
what to do about it.

Thanks,

Scott


From: David Winsemius [dwinsem...@comcast.net]
Sent: 21 July 2013 03:33
To: Scott Robinson
Cc: r-help@r-project.org
Subject: Re: [R] BH correction with p.adjust

On Jul 20, 2013, at 10:37 AM, Scott Robinson wrote:

> Dear List,
>
> I have been trying to use p.adjust() to do BH multiple test correction and 
> have gotten some unexpected results. I thought that the equation for this was:
>
> pBH = p*n/i

Looking at the code for `p.adjust`, you see that the method is picked from a 
switch function

lp <- length(p)
BH = {
i <- lp:1L
o <- order(p, decreasing = TRUE)
ro <- order(o)
pmin(1, cummin(n/i * p[o]))[ro]
}

You may not have sorted the p-values in pList.


>
> where p is the original p value, n is the number of tests and i is the rank 
> of the p value. However when I try and recreate the corrected p from my most 
> significant value it does not match up to the one computed by the method 
> p.adjust:
>
>> setwd("C:/work/Methylation/IMA/GM/siteLists")
>>
>> hypTable <- read.delim("hypernormal vs others.txt")
>> pList <- hypTable$p
>> names(pList) <- hypTable$site
>>
>> adjusted <- p.adjust(pList, method="BH")
>> adjusted[1]
> cg27433479
> 0.05030589
>>
>> pList[1]*nrow(hypTable)/1
> cg27433479
> 0.09269194
>

No data provided, so unable to pursue this further.

> I tried to recreate this is a small example of a vector of 5 p values but 
> everything worked as expected there. I was wondering if there is some subtle 
> difference about how p.adjust operates? Is there something more complicated 
> about how to calculate 'n' or 'i' - perhaps due to identical p values being 
> assigned the same rank or something? Does anyone have an idea what might be 
> going on here?


--

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] BH correction with p.adjust

2013-07-20 Thread Scott Robinson
Dear List,

I have been trying to use p.adjust() to do BH multiple test correction and have 
gotten some unexpected results. I thought that the equation for this was:

pBH = p*n/i

where p is the original p value, n is the number of tests and i is the rank of 
the p value. However when I try and recreate the corrected p from my most 
significant value it does not match up to the one computed by the method 
p.adjust:

> setwd("C:/work/Methylation/IMA/GM/siteLists")
> 
> hypTable <- read.delim("hypernormal vs others.txt")
> pList <- hypTable$p
> names(pList) <- hypTable$site
> 
> adjusted <- p.adjust(pList, method="BH")
> adjusted[1]
cg27433479 
0.05030589 
> 
> pList[1]*nrow(hypTable)/1
cg27433479 
0.09269194 

I tried to recreate this is a small example of a vector of 5 p values but 
everything worked as expected there. I was wondering if there is some subtle 
difference about how p.adjust operates? Is there something more complicated 
about how to calculate 'n' or 'i' - perhaps due to identical p values being 
assigned the same rank or something? Does anyone have an idea what might be 
going on here?

Many thanks,

Scott
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Iterating through slots of an S4 object

2013-02-15 Thread Scott Robinson
I want to loop through slots of an S4 object and am unsure how to do so
since the only way I can find to access them is individually in the form
'object@slotName'. I have guessed at a few possibilities which did not work
and I have read some S4 object tutorials and things but still unsuccessful.

I presume it is possible though?

Any help would be much appreciated,

Scott

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] WriteXLS: 'object not found' error within function

2013-02-13 Thread Scott Robinson
Dear All,

I am using WriteXLS to write tables with multiple sheets with the command:

WriteXLS("tables", ExcelFileName = fileName, SheetNames = tableList, perl = 
"perl",
   verbose = FALSE, Encoding = c("UTF-8", "latin1"),
   row.names = TRUE, col.names = TRUE,
   AdjWidth = TRUE, AutoFilter = FALSE, BoldHeaderRow = FALSE,
   FreezeRow = 0, FreezeCol = 0,
   envir = parent.frame())


...where "tables" is the name of a list of data.frames which are the tables to 
go into my sheets. I am having no problem running my code unless it is within a 
function in which case I get the error:

"Error in get(as.character(x), envir = envir) : object 'tables' not found"

I have checked that there is not an issue with scope using these two lines 
within my function immediately before calling the WriteXLS function:

print(class(tables))
flush.console()
[1] "list"


At least I would have thought this would have ruled out any scope-related 
issues. Has anyone else had this problem or have any ideas why it might be 
happening?

Thanks,

Scott


PS here is the function in full:

makeTables <- function(design, designInfo, oldMatrix, matrix, annot, tableList)
{
rownames(design) <- designInfo[,1]

fit <- lmFit(matrix, design)
fit <- eBayes(fit)

tables<-list()

for (i in 1:length(tableList))
{
table <- makeTable(oldMatrix, matrix, annot, fit, tableList[i])
print(paste(tableList[i], "=", sep=""))
print(table)
tables[[i]] <- table
}

#Saving bit (set to FALSE if not using)
saveIt <- TRUE

if(saveIt)
{
fileName <- 
paste0("~",paste(colnames(design)[2:length(colnames(design))], collapse = 
"+"),".xls")

print(paste("sheetnames=",length(tableList),

print(class(tables))
flush.console()

WriteXLS("tables", ExcelFileName = fileName, SheetNames = tableList, 
perl = "perl",
   verbose = FALSE, Encoding = c("UTF-8", "latin1"),
   row.names = TRUE, col.names = TRUE,
   AdjWidth = TRUE, AutoFilter = FALSE, BoldHeaderRow = FALSE,
   FreezeRow = 0, FreezeCol = 0,
   envir = parent.frame())
}
}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.