[R] Controlling text and strip arrangement in xyplot
I've searched the archives and read the xyplot help but can't figure out the two lattice questions below. Consider:

    library(lattice)
    DF <- data.frame(x=rnorm(20), y=rnorm(20), g1=rep(letters[1:2], 10),
                     g2=rep(LETTERS[1:2], each=10),
                     g3=rep(rep(letters[3:4],each=5),2))
    xyplot(y ~ x | g1 + g2, groups=g3, data=DF)

1) Is there a way to get one strip per row and column of panels as below instead of the default?

     _|__a__|__b__|
      |
    B |
      |
     --
      |
    A |
      |

2) How do I control the text of the strips so that, for instance, instead of "a" and "b" it reads "g1=alpha", "g1=beta", where "alpha" and "beta" stand for the corresponding Greek symbols? (My difficulty here is not with the plotmath symbols but with controlling the text of the strips directly from the call to xyplot and not by renaming the levels of g1.)

I'd appreciate any help!

Juan Pablo Lewinger
Department of Preventive Medicine
Keck School of Medicine
University of Southern California

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
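For question 2, one possible sketch is a custom strip function that overrides the labels of the first conditioning variable only. The names g1_labels and my_strip are hypothetical and introduced here for illustration; the approach assumes that strip.default accepts plotmath expressions for factor.levels. For question 1, the useOuterStrips function in the separate latticeExtra package may be worth a look; it is not shown here.

```r
library(lattice)

set.seed(1)
DF <- data.frame(x = rnorm(20), y = rnorm(20),
                 g1 = rep(letters[1:2], 10),
                 g2 = rep(LETTERS[1:2], each = 10),
                 g3 = rep(rep(letters[3:4], each = 5), 2))

# Hypothetical labels for the levels "a" and "b" of g1 (plotmath expressions).
g1_labels <- expression(g1 == alpha, g1 == beta)

# Hypothetical strip function: replace the labels of the first conditioning
# variable (g1) and pass the real levels through for the others (g2).
my_strip <- function(which.given, ..., factor.levels) {
  levs <- if (which.given == 1) g1_labels else factor.levels
  strip.default(which.given, ..., factor.levels = levs)
}

# The levels of g1 in DF are left untouched; only the strip text changes.
p <- xyplot(y ~ x | g1 + g2, groups = g3, data = DF, strip = my_strip)
```

Printing p (or calling xyplot at the top level) draws the plot with the modified strip text.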
Re: [R] Speeding up resampling of rows from a large matrix
That's beautiful. For the full 120 x 65,000 matrix your approach took 85 seconds. A truly remarkable improvement over my 80 minutes! Thank you!
[R] Speeding up resampling of rows from a large matrix
I'm trying to:

Resample with replacement pairs of distinct rows from a 120 x 65,000 matrix H of 0's and 1's. For each resampled pair, sum the resulting 2 x 65,000 matrix by column:

      0 1 0 1 ...
    + 0 0 1 1 ...
    ___
    = 0 1 1 2 ...

For each column, accumulate the number of 0's, 1's and 2's over the resamples to obtain a 3 x 65,000 matrix G.

For those interested in the background, H is a matrix of haplotypes, each pair of haplotypes forms a genotype, and each column corresponds to a SNP. I'm using resampling to compute the null distribution of the maximum over correlated SNPs of a simple statistic.

The code:

    #---
    nSNPs <- 1000
    H <- matrix(sample(0:1, 120*nSNPs, replace=TRUE), nrow=120)
    G <- matrix(0, nrow=3, ncol=nSNPs)
    # Keep in mind that the real H is 120 x 65000
    nResamples <- 3000
    pair <- replicate(nResamples, sample(1:120, 2))
    gen <- function(x){g <- sum(x); c(g==0, g==1, g==2)}
    for (i in 1:nResamples){
       G <- G + apply(H[pair[,i],], 2, gen)
    }
    #---

The problem is that the loop takes about 80 minutes to complete and I need to repeat the whole thing 10,000 times, which would then take over a year and a half! Is there a way to speed this up so that the full 10,000 iterations take a reasonable amount of time (say a week)?

My machine has an Intel Xeon 3.40GHz CPU with 1GB of RAM.

    > sessionInfo()
    R version 2.5.0 (2007-04-23)
    i386-pc-mingw32

I would greatly appreciate any help.

Juan Pablo Lewinger
Department of Preventive Medicine
Keck School of Medicine
University of Southern California
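One possible way to avoid the per-resample apply() call is to build all resampled genotypes at once and count with colSums. This is only a sketch: it is not necessarily the approach credited in the reply, which is not quoted in this thread. The name S is introduced for illustration, the sizes are reduced, and the nResamples x nSNPs intermediate is assumed to fit in memory (for the full 65,000 columns on a 1GB machine the columns would probably have to be processed in blocks).

```r
set.seed(1)
nSNPs <- 200        # reduced for illustration; the real H is 120 x 65000
nResamples <- 500
H <- matrix(sample(0:1, 120 * nSNPs, replace = TRUE), nrow = 120)
pair <- replicate(nResamples, sample(1:120, 2))

# Row i of S is the genotype (0, 1 or 2 per SNP) for resample i.
S <- H[pair[1, ], , drop = FALSE] + H[pair[2, ], , drop = FALSE]

# Count the 0's, 1's and 2's in each column over all resamples.
G <- rbind(colSums(S == 0), colSums(S == 1), colSums(S == 2))
```

Each column of G sums to nResamples, which gives a quick sanity check on the counts.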
[R] Efficiently reading random lines from a large file
I need to read two different random lines at a time from a large ASCII file (120 x 296976) containing space-delimited 0-1 entries. The following code does the job and it's reasonably fast for my needs:

    lineNumber = sample(120, 2)
    line1 = scan(filename, what = "integer", skip=lineNumber[1]-1, nlines=1)
    line2 = scan(filename, what = "integer", skip=lineNumber[2]-1, nlines=1)

    > system.time(for (i in 50){
    + lineNumber = sample(120, 2)
    + line1 = scan(filename, what = "integer", skip=lineNumber[1]-1, nlines=1)
    + line2 = scan(filename, what = "integer", skip=lineNumber[2]-1, nlines=1)
    + })
    Read 296976 items
    Read 296976 items
    [1] 14.24  0.12 14.51    NA    NA

However, I'm wondering if there's an even faster way to do this. Is there?

    > sessionInfo()
    R version 2.4.1 (2006-12-18)
    i386-pc-mingw32

Juan Pablo Lewinger
Department of Preventive Medicine
Keck School of Medicine
University of Southern California
1540 Alcazar Street, CHP-220
Los Angeles, CA 90089-9011, USA
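Two details of the code above are worth noting: `for (i in 50)` runs the loop body once (with i equal to 50), which matches the two "Read 296976 items" messages, and `what = "integer"` makes scan return character strings, whereas `what = integer()` reads integers. If memory allows (roughly 140 MB for 120 x 296976 integers), one alternative is to read the file once and index rows afterwards. The sketch below assumes that; the small temporary file and the names M and nCols are stand-ins introduced for illustration.

```r
# Build a small stand-in for the real 120 x 296976 file (hypothetical data).
set.seed(1)
nLines <- 120
nCols <- 50
filename <- tempfile(fileext = ".txt")
M <- matrix(sample(0:1, nLines * nCols, replace = TRUE), nrow = nLines)
write.table(M, filename, row.names = FALSE, col.names = FALSE)

# Read the whole file once; byrow = TRUE keeps one file line per matrix row.
H <- matrix(scan(filename, what = integer(), quiet = TRUE),
            nrow = nLines, byrow = TRUE)

# Drawing a pair of distinct random lines is now plain indexing.
lineNumber <- sample(nLines, 2)
line1 <- H[lineNumber[1], ]
line2 <- H[lineNumber[2], ]
```

After the one-time read, each new pair of lines costs only a call to sample and two row lookups.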
Re: [R] CDF of a Multivariate Normal
> In my simulations, I have to use the values of the cumulative distribution function of a multivariate
> normal with known mean vector and dispersion matrix. Please, can you tell me if there is a package in R to do that?

There are two that I know of:

    mvtnorm
    mnormt
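As an illustration of the first of these, a minimal sketch with pmvnorm from mvtnorm follows. It assumes the mvtnorm package is installed (the code is skipped otherwise); the correlation value is arbitrary, and the closed-form orthant probability is used only as a cross-check.

```r
# Assumes the mvtnorm package is installed; skipped otherwise.
if (requireNamespace("mvtnorm", quietly = TRUE)) {
  rho <- 0.5
  Sigma <- matrix(c(1, rho, rho, 1), nrow = 2)

  # P(X <= 0, Y <= 0) for a standard bivariate normal with correlation rho.
  p <- mvtnorm::pmvnorm(lower = c(-Inf, -Inf), upper = c(0, 0),
                        mean = c(0, 0), sigma = Sigma)

  # Closed form for this orthant probability: 1/4 + asin(rho) / (2 * pi).
  p_exact <- 0.25 + asin(rho) / (2 * pi)

  print(c(pmvnorm = as.numeric(p), exact = p_exact))
}
```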
Re: [R] Random sample from log-normal distribution
> Dear all R users,
>
> Please forgive me if my question is too trivial.
> Suppose I have two variables, (x,y) which is
> log-normally distributed with expected value (mu1,
> mu2) and some variance-covariance matrix. Now I want
> to draw a random sample of size 1000 from this
> distribution. Is there any function available to do
> this?
>
> Thanks and regards,
> Megh

If what you really want is a bivariate lognormal, you can first generate a bivariate normal sample (X,Y) with the function rmvnorm in package mvtnorm. Then applying exp() elementwise to the sampled matrix, i.e. (exp(X), exp(Y)), gives a bivariate lognormal sample.
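The recipe above can be sketched in base R as follows. To keep the example free of extra packages, it swaps rmvnorm for a Cholesky-based draw, which is a substitution made here and not part of the reply. The values of mu and Sigma are made up, and they are the mean and covariance of (log(x), log(y)), not of (x, y) themselves.

```r
set.seed(1)
n <- 1000
mu <- c(0, 1)                                 # means of log(x), log(y)
Sigma <- matrix(c(1, 0.5, 0.5, 2), nrow = 2)  # covariance of (log(x), log(y))

# Bivariate normal draws via the Cholesky factor: t(R) %*% R equals Sigma.
R <- chol(Sigma)
Z <- matrix(rnorm(n * 2), nrow = n)
XY <- Z %*% R + rep(mu, each = n)

# Exponentiate elementwise to get the bivariate lognormal sample.
LN <- exp(XY)
```

With mvtnorm installed, the three lines that build XY could be replaced by a single call to rmvnorm with the same mean and covariance.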
[R] Retrieving value computed in inner function call
Dear R users,

Consider the following example function:

    f = function(a,b) {
       g = function(x) a*x + b
       h = function(x) g(x)^2 + x^2
       opt = optimize(h, lower = -1, upper = 1)
       x.min = opt$minimum
       h.xmin = opt$objective
       g.xmin = g(x.min)
       return(c(x.min, h.xmin, g.xmin))
    }

In my real problem the function that plays the role of "g" is costly to compute. Now, to minimize "h", "optimize" calls "h" with different values of x. In particular, at the end of the optimization, "h" would be called with argument x.min, the minimizer of h(x). Therefore, buried somewhere, there has to be a call to "g" with argument x=x.min, which I would like to retrieve in order to avoid the extra call to "g" in the line before the return. Can this be done without too much pain?

I'd very much appreciate any help.

Juan Pablo Lewinger
Department of Preventive Medicine
University of Southern California
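One way this could be approached is to have "h" record its most recent argument and the corresponding value of "g" in the enclosing environment with `<<-`. The sketch below does that. Whether the last call made by optimize is exactly at the reported minimum is treated as an assumption, so the cached value is used only if its argument is identical to x.min. The variables last.x, last.g and n.calls are hypothetical names added for illustration.

```r
f <- function(a, b) {
  n.calls <- 0       # counts evaluations of g, for illustration only
  last.x <- NULL     # most recent argument passed to h
  last.g <- NULL     # value of g at that argument

  g <- function(x) {
    n.calls <<- n.calls + 1
    a * x + b
  }
  h <- function(x) {
    gx <- g(x)
    # Remember the latest evaluation in the environment of this call to f.
    last.x <<- x
    last.g <<- gx
    gx^2 + x^2
  }

  opt <- optimize(h, lower = -1, upper = 1)
  x.min <- opt$minimum

  # Reuse the cached value if it belongs to x.min; otherwise recompute.
  g.xmin <- if (identical(last.x, x.min)) last.g else g(x.min)

  c(x.min = x.min, h.xmin = opt$objective, g.xmin = g.xmin, g.calls = n.calls)
}

res <- f(2, 1)
```

For a = 2 and b = 1, h(x) = 5x^2 + 4x + 1, so the minimizer should be close to -0.4 with h and g both close to 0.2 there.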
Re: [R] numeric variables converted to character when recoding missingvalues
Thanks Bert, that works of course and is much more straightforward than what I was trying. However, I'm still puzzled as to why

    x[x==999] <- NA

works (i.e. it replaces the 999s with NAs and keeps the numeric variables numeric) but

    is.na(x[x==999]) <- TRUE

doesn't (it replaces the 999s with NAs but changes all variables where a replacement was made to character).

PS: As far as I can tell, section 2.5 of "An Introduction to R", which I had read, doesn't answer my original question.

Juan Pablo Lewinger
Department of Preventive Medicine
Keck School of Medicine
University of Southern California

-----Original Message-----
From: Berton Gunter [mailto:[EMAIL PROTECTED]
Sent: Friday, June 23, 2006 3:15 PM
To: 'Juan Pablo Lewinger'; r-help@stat.math.ethz.ch
Subject: RE: [R] numeric variables converted to character when recoding missingvalues

Please read section 2.5 of "An Introduction to R". Numerical missing values are assigned as NA:

    x[x==999]<-NA

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Juan
> Pablo Lewinger
> Sent: Friday, June 23, 2006 3:00 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] numeric variables converted to character when
> recoding missingvalues
>
> Dear R helpers,
>
> I have a data frame where missing values for numeric
> variables are coded as
> 999. I want to recode those as NAs. The following only
> partially succeeds
> because numeric variables are converted to character in the process:
>
> df <- data.frame(a=c(999,1,999,2), b=LETTERS[1:4])
> is.na(df[2,1]) <- TRUE
> df
>
>     a b
> 1 999 A
> 2  NA B
> 3 999 C
> 4   2 D
>
> is.numeric(df$a)
> [1] TRUE
>
> is.na(df[!is.na(df) & df==999]) <- TRUE
> df
>      a b
> 1 <NA> A
> 2    1 B
> 3 <NA> C
> 4    2 D
>
> is.character(df$a)
> [1] TRUE
>
> My question is how to do the recoding while avoiding this
> undesirable side
> effect. I'm using R 2.2.1 (yes, I know 2.3.1 is available but
> don't want to
> switch mid project). I'd appreciate any help.
>
> Further details:
>
> platform i386-pc-mingw32
> arch     i386
> os       mingw32
> system   i386, mingw32
> status
> major    2
> minor    2.1
> year     2005
> month    12
> day      20
> svn rev  36812
> language R
>
> Juan Pablo Lewinger
> Department of Preventive Medicine
> Keck School of Medicine
> University of Southern California
[R] numeric variables converted to character when recoding missing values
Dear R helpers,

I have a data frame where missing values for numeric variables are coded as 999. I want to recode those as NAs. The following only partially succeeds because numeric variables are converted to character in the process:

    df <- data.frame(a=c(999,1,999,2), b=LETTERS[1:4])
    is.na(df[2,1]) <- TRUE
    df

        a b
    1 999 A
    2  NA B
    3 999 C
    4   2 D

    is.numeric(df$a)
    [1] TRUE

    is.na(df[!is.na(df) & df==999]) <- TRUE
    df
         a b
    1 <NA> A
    2    1 B
    3 <NA> C
    4    2 D

    is.character(df$a)
    [1] TRUE

My question is how to do the recoding while avoiding this undesirable side effect. I'm using R 2.2.1 (yes, I know 2.3.1 is available but don't want to switch mid project). I'd appreciate any help.

Further details:

    platform i386-pc-mingw32
    arch     i386
    os       mingw32
    system   i386, mingw32
    status
    major    2
    minor    2.1
    year     2005
    month    12
    day      20
    svn rev  36812
    language R

Juan Pablo Lewinger
Department of Preventive Medicine
Keck School of Medicine
University of Southern California
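The recoding asked about above can be sketched with a direct NA assignment, the form suggested in the reply to this question. The sketch also hints at a likely explanation for the side effect, which is not confirmed in the thread: `is.na(x[i]) <- TRUE` is evaluated as an extract, modify and assign-back sequence, and extracting from a mixed-type data frame with a logical matrix yields character values. The variable name extracted is introduced for illustration.

```r
df <- data.frame(a = c(999, 1, 999, 2), b = LETTERS[1:4])

# Indexing a mixed-type data frame with a logical matrix goes through
# as.matrix(), so the extracted values are character, not numeric.
extracted <- df[!is.na(df) & df == 999]
class(extracted)

# Assigning a plain NA directly leaves column a numeric.
df[df == 999] <- NA
is.numeric(df$a)
```

Because the plain NA is logical, assigning it does not force a numeric column to change type, whereas assigning back a character vector containing NAs plausibly does.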
[R] cdf of multivariate normal
I was wondering if anybody has written R code to compute the cdf of a multivariate (or at least a bivariate) normal distribution with given covariance structure.