Re: [R] Question about levels/as.numeric
On Sun, Apr 10, 2011 at 05:47:59PM +0200, Thibault Vatter wrote:

Hi, I am still new to R and this is my first post on this mailing list. I have two .csv files (each one a single column of real numbers) coming from the same database (the first one is just longer than the second), and I read them into R the following way:

  returns <- read.csv("test.csv", header = FALSE)
  returns2 <- read.csv("test2.csv", header = FALSE)

However, the two objects clearly don't seem to be equivalent:

  returns[2528:2537,1]
  [1] -0.002206 0.115696 -0.015192 0.008719 -0.004654 -0.010688 0.009453 0.002676 0.001334 -0.011326
  7470 Levels: -0.78 -0.85 -0.86 -0.0001 -0.000112 -0.000115 -0.000152 -0.000154 -0.000157 -0.00016 -0.000171 -0.000185 -0.000212 -0.000238 -0.000256 -0.000259 -0.000263 -0.000273 ... C

There is probably a non-numeric row in the data. In order to locate this row, try the following:

  which(is.na(as.numeric(as.character(returns[, 1]))))

This will show the indices of the rows that cannot be converted to numeric type.

Petr Savicky.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
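The diagnostic suggested in this reply can be tried on a small made-up column. The sketch below is only an illustration: the four values, including the stray "C", are assumptions standing in for the contents of test.csv, and treating "C" as a missing-value marker is a guess based on the printed factor levels.

```r
# Hypothetical stand-in for a CSV column that contains one stray "C".
raw <- c("-0.002206", "0.115696", "C", "0.008719")
returns <- data.frame(V1 = factor(raw))

# Convert via character first; as.numeric() applied to the factor itself
# would return the internal level codes rather than the printed values.
values <- suppressWarnings(as.numeric(as.character(returns[, 1])))

# Indices of the rows that could not be converted.
bad_rows <- which(is.na(values))
bad_rows

# If "C" really marks a missing value, it could be declared on import, e.g.
# read.csv("test.csv", header = FALSE, na.strings = "C")
```

With such a declaration the column would be read as numeric directly, so no factor conversion would be needed afterwards.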
Re: [R] random sampling with levels and with replacement
Andreas,

Thanks a lot. I combined the code below with other suggestions given on r-help and it worked.

--- On Fri, 4/8/11, Andreas Borg andreas.b...@unimedizin-mainz.de wrote:

From: Andreas Borg andreas.b...@unimedizin-mainz.de
Subject: Re: [R] random sampling with levels and with replacement
To: tab...@yahoo.com
Cc: R help r-help@r-project.org
Date: Friday, April 8, 2011, 11:13 AM

Hi,

I am not perfectly sure what you want to do, but here is what I would do to maintain the good/bad ratio in the sample (as Daniel posted, split the data and sample from the groups):

  df <- data.frame(V1 = 1:400, V2 = c(rep("good", 360), rep("bad", 40)))
  isGood <- which(df$V2 == "good")
  isBad <- which(df$V2 == "bad")
  sampleGood <- df[sample(isGood, replace = TRUE), ]
  sampleBad <- df[sample(isBad, replace = TRUE), ]
  summary(rbind(sampleGood, sampleBad))

Please include a more specific example with test data (for final in this case) next time.

Best regards,
Andreas

taby gathoni wrote:

Dear all, I have a dataset of about 400 records, with a variable that has two levels (40 bad and 360 good) among other variables. How do I come up with 10 random samples, drawn with replacement, that have the same composition as the main sample, i.e. 40 bad and 360 good? I recently discovered that the random samples I generate don't maintain the ratio. My code is:

  mysample <- final[sample(1:nrow(final), 400, replace = TRUE), ]

It does not give me the ratio of 40 bad and 360 good. Can anyone give me some pointers, please?

Thanks,
Taby
-- 
Andreas Borg
Medizinische Informatik
UNIVERSITÄTSMEDIZIN der Johannes Gutenberg-Universität
Institut für Medizinische Biometrie, Epidemiologie und Informatik
Obere Zahlbacher Straße 69, 55131 Mainz
www.imbei.uni-mainz.de
Telefon +49 (0) 6131 175062
E-Mail: b...@imbei.uni-mainz.de
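The split-and-sample idea from this thread can be wrapped in a small helper and repeated ten times, as the original question asks. This is a sketch under assumptions: the data frame is made up to match the 360/40 split, and `stratified_sample()` is a hypothetical helper name, not a function from any package.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible

# Made-up data with the 360/40 split described in the question.
df <- data.frame(V1 = 1:400,
                 V2 = c(rep("good", 360), rep("bad", 40)))

# Resample row indices with replacement within each level, so that every
# resample keeps the original number of rows per level.
stratified_sample <- function(data, group) {
  groups <- split(seq_len(nrow(data)), data[[group]])
  idx <- unlist(lapply(groups, function(i) {
    i[sample.int(length(i), replace = TRUE)]
  }), use.names = FALSE)
  data[idx, , drop = FALSE]
}

# Ten resamples, each with 40 "bad" and 360 "good" rows.
samples <- replicate(10, stratified_sample(df, "V2"), simplify = FALSE)
table(samples[[1]]$V2)
```

Indexing through `sample.int(length(i))` instead of calling `sample(i)` directly avoids the surprising behaviour of `sample()` when a group happens to contain a single row.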
Re: [R] help question
Hi

r-help-boun...@r-project.org wrote on 08.04.2011 18:24:37:

On 08/04/2011 9:20 AM, DEBERGH Patrick wrote:

hello, I am at the very beginning of using the R program. I just don't understand how one can save a program file. For example, if I type 23+456 in R and want to save this under a certain name to reload it later, I just don't get the way to do it.

R is not Microsoft Word. When you type 23+456, you're asking R to do something for you, you're not writing a document. So you can save a record of what you asked for, but you can't reload it later in R to continue on.

If you want such functionality, you should look at an editor that can cooperate with R, like TINN-R or maybe ESS.

Regards
Petr

Duncan Murdoch

I can save it with the save function; I manage to see that I have a file with the name, but there is no way to understand how to reload or reuse it. In the official R manual this is very badly described for first-time users. Thanks for helping me save R programs.

Best regards
Patrick Debergh
VP Product Development
Colibrys (Suisse) Ltd - Maladière 83 - CH-2000 Neuchâtel, Switzerland
Phone: +41 32 720 5696 / Fax: +41 32 720 57 84
mailto:patrick.debe...@colibrys.com
http://www.colibrys.com/
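The distinction made in this thread, between saving commands and saving results, can be illustrated with a few base R calls. This is only a sketch: the file paths are temporary files chosen for the example, and in practice the script would normally be written in an editor rather than with `writeLines()`.

```r
# Write the commands to a plain-text script file. A temporary file is used
# here only so that the sketch does not clutter the working directory.
script_path <- tempfile(fileext = ".R")
writeLines(c("x <- 23 + 456", "print(x)"), script_path)

# Re-run the saved commands later, possibly in a new session.
source(script_path)

# save() and load() store and restore objects (results), not commands.
data_path <- tempfile(fileext = ".RData")
save(x, file = data_path)
rm(x)
load(data_path)
x
```

So a "program" is kept as a text file and re-run with `source()`, while `save()` and `load()` are for the objects the program produced.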
Re: [R] Question about levels/as.numeric
Thanks Kent and Petr, the problem was indeed the C/missing value that I had to convert! Thanks Peter too, the factor explanation will also be quite useful for further work.

Best regards,
Thibault

On 11 April 2011 03:48, Rolf Turner rolf.tur...@xtra.co.nz wrote:

On 11/04/11 10:08, Peter Ehlers wrote:

SNIP
Checking anything with Excel is never much use.
SNIP

Fortune?

cheers,
Rolf Turner
Re: [R] Password-protect R script files
On Sun, Apr 10, 2011 at 8:18 PM, Vijayan Padmanabhan padmanabhan.vija...@gmail.com wrote:

There was a question in the R forum a very long time back on how to protect R script files from inadvertent editing by users.

The good way to do it is to include the following comment at the beginning:

  # This is a holy Script, please edit it not

Regards,
Kenn Konstabel

There is a way to do this from within R; at least in Windows XP I have tried this and it certainly works. The method is very different from the OS-based folder protection route; however, making such a method available in the open forum would only kill the very spirit of R. But if someone is able to convince me of the genuineness of his reasons to achieve such a purpose, I might decide to provide selective service to achieve the same.

Regards
Vijayan Padmanabhan
Re: [R] Password-protect R script files
On Mon, Apr 11, 2011 at 12:21 AM, Kenn Konstabel lebats...@gmail.com wrote:

On Sun, Apr 10, 2011 at 8:18 PM, Vijayan Padmanabhan padmanabhan.vija...@gmail.com wrote:

There was a question in R forum very long time back.. on how to protect R Script files from inadvertent editing by users.

The good way to do it is to include the following comment at the beginning:

  # This is a holy Script, please edit it not

If ever fate predestined a reply to be a fortunes candidate, surely this is the one.

Josh

Regards,
Kenn Konstabel
Re: [R] Password-protect R script files
On Sun, Apr 10, 2011 at 10:48:19PM +0530, Vijayan Padmanabhan wrote:

There was a question in the R forum a very long time back on how to protect R script files from inadvertent editing by users. There is a way to do this from within R; at least in Windows XP I have tried this and it certainly works.

A possible approach is to put the script into an extension package. This does not prevent the user from modifying the script, but it makes it almost impossible to change it inadvertently.

Petr Savicky.
[R] Odp: In need of help with correlations
Hi

r-help-boun...@r-project.org wrote on 09.04.2011 19:24:38:

I am in need of someone's help in correlating gene expression. I'm somewhat new to R, and can't seem to find anyone local to help me with what I think is a simple problem. I need to obtain Pearson and Spearman correlation coefficients, and corresponding p-values, for all of the genes in my dataset that correlate to one specific gene of interest. I'm working with Affymetrix Mouse 430 2.0 arrays, so I've got about 45,000 probesets (rows; with the 1st column containing identifiers) and 30 biological replicates (columns; with the top row containing the header information).

I've looked through several intro manuals and the R help files. I know that cor(x, y, use = "everything", method = c("pearson")) can help obtain the coefficients. I also know that cor.test() is supposed to test the significance of a single correlation coefficient. I've also found the Bioconductor package genefilter / genefinder that looks for correlations to a given gene (although I can't get it to work). So far I've been able to:

  # Read in the csv file
  data <- read.csv("my data.csv")
  # Check the dimensions, names, class, fix(data) to ensure the file was loaded properly
  dim(data)
  names(data)
  class(data)
  fix(data)
  # So far I've been able to successfully correlate the entire 'column' matrix through:
  x <- data[, 2:30]
  y <- data[, 2:30]
  corr.data <- cor(x, y, use = "everything", method = c("pearson"))
  write.csv(corr.data, file = "correlation of my data by columns.csv")

Now if I try to run the cor.test() function on the same matrix, I get an error message saying 'x' must be a numeric vector. This I don't understand.

The cor.test help page says:

  x, y: numeric vectors of data values. 'x' and 'y' must have the same length.

However, your data[,2:30] is most probably a data frame; see str(data[,2:20]). To be able to use cor.test you need to call it on two columns, like

  cor.test(data[,2], data[,3])

or do it in some loop (untested):

  result <- matrix(NA, 20, 20)
  for (i in 2:19) {
    for (j in (i + 1):20) {  # parentheses are needed: i+1:20 means i + (1:20)
      # cor.test() returns a list, so store a single number, here the p-value
      result[i, j] <- cor.test(data[, i], data[, j])$p.value
    }
  }

But most probably there are other ways.

Regards
Petr

And this is not my goal, but rather me trying to learn how to go about doing correlation analysis in R. I've also tried transposing the data frame using as.data.frame(t(data)), and doing so gives the same error message as above. Can anyone help me figure out how to conduct a correlation analysis for a specific gene/probeset, and help me understand why I get the above error message? I know it is probably a simple analysis that is just over my head right now, since I'm still new to R. But I can't figure it out and have been trying a bunch of different variations for the past week. Thank you in advance for your help.
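The stated goal, correlating one probeset against all others with p-values, can be sketched with `cor.test()` called once per row. Everything about the data here is an assumption: the matrix is simulated, the probeset names are invented, and `one_vs_all()` is a hypothetical helper. With real data, the identifier column would first have to be moved into the row names so that only numeric columns remain.

```r
set.seed(42)  # arbitrary; the data below are simulated, not real array data

# Simulated expression matrix: 50 probesets (rows) x 30 samples (columns).
expr <- matrix(rnorm(50 * 30), nrow = 50,
               dimnames = list(paste0("probe", 1:50), paste0("s", 1:30)))

target <- "probe1"                     # hypothetical probeset of interest
x <- expr[target, ]
others <- setdiff(rownames(expr), target)

# cor.test() takes two numeric vectors, so it is called once per probeset.
one_vs_all <- function(method) {
  t(vapply(others, function(p) {
    ct <- suppressWarnings(cor.test(x, expr[p, ], method = method))
    c(estimate = unname(ct$estimate), p.value = ct$p.value)
  }, numeric(2)))
}

pearson  <- one_vs_all("pearson")
spearman <- one_vs_all("spearman")

# Probesets most strongly associated with the target, by Pearson p-value.
head(pearson[order(pearson[, "p.value"]), ])
```

Because many tests are run at once, the p-values would usually be adjusted afterwards, for example with `p.adjust()`.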
[R] Coding matrix equation
Hi all,

I have two matrices:

  G <- matrix(c(2.0, 0.5, 0.5, 0.5, 2.0, 0.5, 0.5, 0.5, 2.0), 3, 3)
  P <- matrix(c(1.0, 0.5, 0.5, 0.5, 1.0, 0.5, 0.5, 0.5, 1.0), 3, 3)

and I want to run this equation to get a new matrix F:

  F = [P+2G]^-1/2 P [P+2G]^-1/2

Could someone please tell me how to code this in R? Many thanks in advance for your time.

Best wishes,
Matt
Re: [R] xyplot, groups and colors
On Fri, Apr 08, 2011 at 08:14:21AM -0700, Dennis Murphy wrote:

Thanks to everyone who replied! Especially this and the ggplot advice did what I wanted.

  xyplot(circumference~age, dat, groups=Tree, type='l', col.line = c('red', 'blue', 'blue', 'red', 'red'))

This is essentially what I had been doing after somehow creating the correct color vector.

After a little more fiddling around, this also works, and seems a bit less kludgy:

  dat$group2 <- factor(dat$group, labels = c('red', 'blue'))
  xyplot(circumference~age, dat, groups=Tree, type='l', col.line = levels(dat$group2))

Perfect! Using the levels directly had not occurred to me. Thanks!

cu
Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/
[R] Odp: Coding matrix equation
Hi

r-help-boun...@r-project.org wrote on 11.04.2011 09:43:03:

Hi all, I have two matrices:

  G <- matrix(c(2.0, 0.5, 0.5, 0.5, 2.0, 0.5, 0.5, 0.5, 2.0), 3, 3)
  P <- matrix(c(1.0, 0.5, 0.5, 0.5, 1.0, 0.5, 0.5, 0.5, 1.0), 3, 3)

and I want to run this equation to get a new matrix F:

  F = [P+2G]^-1/2 P [P+2G]^-1/2

Is this what you want?

  (P+2*G)^-1/2 * P * (P+2*G)^-1/2

Regards
Petr

Could someone please tell me how to code this in R? Many thanks in advance for your time.

Best wishes,
Matt
Re: [R] Coding matrix equation
Hi Matt,

Petr gave you one possibility. If you are looking for more matrix operations see:

  ?"%*%"         # the inner product of the matrices
  ?"%o%"         # the outer product of the matrices
  ?"("           # for parentheses to help order things
  require(MASS)  # load the package MASS
  ?ginv          # for the generalized inverse of a matrix

For things like constants which you just want treated normally, use the regular multiplication operator, *, not the matrix one.

HTH,
Josh

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
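If the exponent in the question is meant as a matrix inverse square root, one way to sketch it is through an eigendecomposition. That reading is an assumption, as is the requirement that P + 2G be symmetric positive definite (it is for the matrices posted). Note that R parses `X^-1/2` as `(X^-1)/2` and applies it element by element, which is a different operation. The names `inv_sqrt` and `F_mat` are made up for this example.

```r
G <- matrix(c(2.0, 0.5, 0.5,
              0.5, 2.0, 0.5,
              0.5, 0.5, 2.0), 3, 3)
P <- matrix(c(1.0, 0.5, 0.5,
              0.5, 1.0, 0.5,
              0.5, 0.5, 1.0), 3, 3)

# Inverse square root of a symmetric positive-definite matrix through its
# eigendecomposition: A^(-1/2) = V diag(1 / sqrt(lambda)) V'.
inv_sqrt <- function(A) {
  e <- eigen(A, symmetric = TRUE)
  e$vectors %*% diag(1 / sqrt(e$values)) %*% t(e$vectors)
}

A <- P + 2 * G
A_inv_half <- inv_sqrt(A)

# Matrix products use %*%; the result is named F_mat because F is an
# alias for FALSE in R.
F_mat <- A_inv_half %*% P %*% A_inv_half
F_mat
```

A quick sanity check of the helper is that `A_inv_half %*% A %*% A_inv_half` should reproduce the identity matrix up to rounding error.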
Re: [R] In need of help with correlations
On Sat, Apr 9, 2011 at 10:24 AM, Sean Farris farris...@vcu.edu wrote: I am in need of someone's help in correlating gene expression. I'm somewhat new to R, and can't seem to find anyone local to help me with what I think is a simple problem. I need to obtain pearson and spearman correlation coefficients, and corresponding p-values for all of the genes in my dataset that correlate to one specific gene of interest. I'm working with mouse Affymetrix Mouse 430 2.0 arrays, so I've got about 45,000 probesets (rows; with 1st column containing identifiers) and 30 biological replicates (columns; with the top row containing the header information). Sean, I'm the maintainer of the package WGCNA that does correlation network analysis of gene expression data. I recommend you check out the package and the tutorials at http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/index.html The package contains a couple useful functions for correlation p-values. Unlike cor.test which only takes two vectors (not matrices), you can use the function corAndPvalue to calculate Pearson correlations and the corresponding p-values for matrices. If you already have the correlation matrix pre-calculated AND you have no missing data (i.e., constant number of observations), you can also use corPvalueStudent to calculate the p-values. We don't use Spearman correlations much (we prefer the biweight midcorrelation, functions bicor and bicorAndPvalue, as a robust alternative to Pearson correlation), but you can approximate the Spearman p-values by the Student p-values (that are used for Pearson correlations). Statisticians who read this, please don't execute me for this suggestion :) To use the function cor(), you need to transpose the data so that genes are in columns and samples in rows. Just be aware that to correlate all probe sets at a time you need a 40k+ times 40k+ matrix to hold the result. 
Only a large computer (at least 32GB of memory, possibly needing 64GB) will be able to handle such a matrix and the necessary manipulations. The WGCNA package contains methods to construct co-expression networks on such big sets if necessary.

HTH,
Peter
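The memory estimate in this reply can be checked with a little arithmetic. The sketch assumes a dense matrix of 8-byte doubles and uses the approximate probeset count of 45,000 quoted earlier in the thread; the variable names are made up.

```r
n_probes <- 45000          # approximate probeset count quoted in the thread
bytes_per_double <- 8      # R stores numeric values as 8-byte doubles

# Size of one dense n x n correlation matrix, in GiB.
one_matrix_gib <- n_probes^2 * bytes_per_double / 1024^3
one_matrix_gib
```

That is roughly 15 GiB for the correlation matrix alone. A matching matrix of p-values and any temporary copies made during the computation come on top of that, which is consistent with the 32GB to 64GB range mentioned above.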
Re: [R] How do I make this faster?
Hi Hasan,

I'd be happy to help you, but I am not able to run your code. You use commandArgs to retrieve arguments of the R program, but which ones do you actually provide?

Best regards,
Andreas

Hasan Diwan wrote:

I was on vacation the last week and wrote some code to run a 500-day correlation between the Nasdaq tracking stock (QQQ) and 191 currency pairs for 500 days. The initial run took 9 hours(!) and I'd like to make it faster. So, I'm including my code below, in hopes that somebody will be able to figure out how to make it faster, either through parallelisation, or by making changes. I've marked the places where Rprof showed me it was slowing down:

  currencyCorrelation <- function(lagtime = 1) {
    require(quantmod)
    dataTrack <- getSymbols(commandArgs(trailingOnly=T)[1], from='2009-11-21', to='2011-04-03')
    stockData <- get(dataTrack)
    currencies <- row.names(oanda.currencies[grep(pattern='oz.', fixed=T, x=as.vector(oanda.currencies$oanda.df.1.length.oanda.df...21.)) == F])
    correlations <- vector()
    values <- list()
    # optimise these loops using the apply family
    for (i in currencies) {
      for (j in currencies) {
        if (i == j) next()
        fx <- getFX(paste(i, j, sep='/'), from='2009-11-20', to='2011-04-02')
        # Prepare data by getting rates for market days only
        fx <- get(fx)
        fx <- fx[which(index(fx) %in% index(QQQ$QQQ.Close))]
        correlation <- cor(fx, QQQ$QQQ.Close)
        correlations <- c(correlations, correlation)
        string <- paste(paste(i,j,sep='/'), correlation, sep=',')
        values <- c(values, paste(string,'\n', sep=''))
      }
    }
    # TODO eliminate NA's
    values <- values[which(correlations[is.na(correlations) == F])]
    correlations <- correlations[is.na(correlations) == F]
    values <- values[order(correlations, decreasing=T)]
    write.table(values, file=commandArgs(trailingOnly=T)[2], sep='', qmethod=NULL, quote = F, row.names=F, col.names=F)
    rm('currencies', 'correlations', 'values', 'fx', 'string')
    return()
  }
  lagtime <- as.integer(commandArgs(trailingOnly=T)[3])
  if (is.na(lagtime)) lagtime <- 1
  print(paste(Sys.time(), '--- starting', lagtime, 'day lag currencies correlation with', commandArgs(trailingOnly=T)[1], 'from 2009-11-20 to 2011-04-03'))
  currencyCorrelation(lagtime)
  print(paste(Sys.time(), '--- ended, results in', commandArgs(trailingOnly=T)[2]))

-- 
Andreas Borg
Medizinische Informatik
UNIVERSITÄTSMEDIZIN der Johannes Gutenberg-Universität
Institut für Medizinische Biometrie, Epidemiologie und Informatik
Obere Zahlbacher Straße 69, 55131 Mainz
www.imbei.uni-mainz.de
Telefon +49 (0) 6131 175062
E-Mail: b...@imbei.uni-mainz.de
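Without the actual command-line arguments the posted function cannot be profiled here, so the following is only a structural sketch of the "apply family" rewrite its own comment asks for. All data are simulated, `fetch_rates()` is a hypothetical stand-in for the download step, and whether the download or the vector growth dominates the nine hours is not something this sketch can establish.

```r
set.seed(7)  # arbitrary; all data here are simulated

n_days <- 500
benchmark <- cumsum(rnorm(n_days))           # stand-in for the QQQ closes
currencies <- c("USD", "EUR", "GBP", "JPY")  # small illustrative subset

# Hypothetical stand-in for the download step (getFX in the original post).
fetch_rates <- function(base, quote) cumsum(rnorm(n_days))

# Enumerate all ordered pairs up front, dropping the i == j cases.
pairs <- expand.grid(base = currencies, quote = currencies,
                     stringsAsFactors = FALSE)
pairs <- pairs[pairs$base != pairs$quote, ]

# vapply() fills a result of known length instead of growing it with c().
pairs$correlation <- vapply(seq_len(nrow(pairs)), function(k) {
  cor(fetch_rates(pairs$base[k], pairs$quote[k]), benchmark)
}, numeric(1))

# Drop missing correlations and sort, as the original function intends.
pairs <- pairs[!is.na(pairs$correlation), ]
pairs <- pairs[order(pairs$correlation, decreasing = TRUE), ]
head(pairs)
```

Keeping the pairs and their correlations in one data frame also avoids the separate `values` and `correlations` vectors that the original code has to keep in sync.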
Re: [R] Help with basic loop
Hi,

I think you can do this without a loop (well, replicate() is based on sapply()):

  prob <- numeric(1000)
  task1 <- replicate(1000, runif(1, min=0.8, max=0.9))
  task2 <- replicate(1000, runif(1, min=0.75, max=0.85))
  task3 <- replicate(1000, runif(1, min=0.81, max=0.89))
  prob <- task1*task2*task3

It might not be faster, but I don't think it can be slower. And I find the code easier and clearer. Please correct me if this is not equivalent.

HTH,
Ivan

On 4/11/2011 01:06, Daniel Malter wrote:

The loop is correct, you just need to make sure that your result is computed and stored as the n-th element that is returned by the loop. Pick up any manual of R, and looping will be explained there. Also, I would recommend that you draw a random number for every iteration of the loop. Defining the random vectors outside the loop makes sense to me only if they are the same length as n.

  prob <- numeric(1000)
  for (n in 1:1000) {
    task1 <- runif(1, min=0.8, max=0.9)
    task2 <- runif(1, min=0.75, max=0.85)
    task3 <- runif(1, min=0.81, max=0.89)
    prob[n] <- task1*task2*task3
  }

If you wanted to store the individual probabilities (task1..3), you would proceed accordingly by defining them outside the loop and storing the value in the loop as the n-th element of that vector, just like for prob.

HTH,
Daniel

-- 
View this message in context: http://r.789695.n4.nabble.com/Help-with-basic-loop-tp3440190p3440607.html
Sent from the R help mailing list archive at Nabble.com.

-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de
**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
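Daniel's closing suggestion, keeping the individual task probabilities as well as their product, could look like the sketch below. The seed and the loop index name are arbitrary choices made for the example.

```r
set.seed(123)  # arbitrary seed, only for reproducibility of the sketch
n <- 1000

# Preallocate one vector per quantity to be kept.
task1 <- numeric(n)
task2 <- numeric(n)
task3 <- numeric(n)
prob  <- numeric(n)

for (i in seq_len(n)) {
  # Store each draw as the i-th element, exactly as is done for prob.
  task1[i] <- runif(1, min = 0.80, max = 0.90)
  task2[i] <- runif(1, min = 0.75, max = 0.85)
  task3[i] <- runif(1, min = 0.81, max = 0.89)
  prob[i]  <- task1[i] * task2[i] * task3[i]
}

summary(prob)
```

After the loop, all four vectors have length `n`, so the individual draws behind any given `prob[i]` can be inspected.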
Re: [R] Password-protect R script files
On Windows at least, you could set it as read only. The user can save an edited copy of it but cannot modify the original script.

On 4/11/2011 09:36, Petr Savicky wrote:

A possible approach is to put the script into an extension package. This does not prevent the user to modify the script, but makes it almost impossible to change it inadvertently.

-- 
Ivan CALANDRA
PhD Student
University of Hamburg
ivan.calan...@uni-hamburg.de
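The read-only flag mentioned here can also be set from within R using base functions. This is a sketch rather than a protection scheme: the script path is a temporary file created for the example, and anyone with sufficient rights can simply remove the flag again.

```r
# Hypothetical script file, created in a temporary location for the sketch.
script_path <- tempfile(fileext = ".R")
writeLines("x <- 1", script_path)

# Remove all write permission bits. On Windows, Sys.chmod() can only
# toggle the read-only attribute, which is what is wanted here.
Sys.chmod(script_path, mode = "0444")

# file.access(mode = 2) returns 0 if the file is writable and -1 otherwise;
# it may still report 0 for an administrator or root account.
file.access(script_path, mode = 2)
```

As noted in the thread, this guards against inadvertent edits only; a user can still save a modified copy under another name.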
Re: [R] Help with basic loop
Hi:

Let's assume the lengths of each vector are the same so that they can be multiplied. Here's the timing on my machine:

  > system.time(replicate(1000, { prob <- numeric(1000)
  +
  + for (n in 1:1000) {
  +   task1 <- runif(1, min=0.8, max= 0.9)
  +   task2 <- runif(1, min=0.75, max= 0.85)
  +   task3 <- runif(1, min=0.81, max= 0.89)
  +   prob[n] <- task1*task2*task3
  + }
  + }))
     user  system elapsed
    16.96    0.01   17.19

  > system.time(replicate(1000, {
  +   task1 = runif(1000, min = 0.8, max = 0.9)
  +   task2 <- runif(1000, min = 0.75, max = 0.85)
  +   task3 <- runif(1000, min = 0.81, max = 0.89)
  +   prob <- task1 * task2 * task3 } ))
     user  system elapsed
     0.37    0.00    0.39

Dennis
Re: [R] Help with basic loop
Well, I was quite blind not to change 1 to 1000 in runif() and use replicate()!! It gets even faster if you create prob first.

Ivan

-- 
Ivan CALANDRA
PhD Student
University of Hamburg
ivan.calan...@uni-hamburg.de
Re: [R] plot layout with several plots ON plot area of previous plot
On 08/04/11 18:47, Greg Snow wrote:

Some of the functions that were the first in the TeachingDemos package were originally written to help me visualize something, so it is not just teachers demoing, but people demoing to themselves. It has become a bit of a misc package with several utilities that are useful in themselves, but while I have considered splitting the package, I don't see an obvious splitting (and what would I call the new part? Naming things is not my strongest talent, just look at some of the functions in TeachingDemos; luckily for my kids my wife invoked veto power there).

If someone wanted to include the function in one of the core packages then I would be happy to donate it, though generally that means one of the core members taking over maintenance and they may not want to do that (and I am happy to keep doing so).

One of my small claims to fame is that there have been 3 instances of code in the TeachingDemos package that apparently had the right combination of potential usefulness and ugly code or implementation that inspired Brian Ripley to write new functions in the core packages to do the same thing (only better). The subplot function has not been one of those, so I am guessing that Prof. Ripley (or other core members) either has not become aware of it, does not think it useful enough, or does not consider it ugly enough to need rewriting (I am hoping it's the last).

Personally I think the TeachingDemos package is useful and everyone should use it (but I may be a bit biased). I sometimes fantasize about it becoming one of the official recommended packages (but the realistic part of me admits that this is only slightly more likely to happen than the fantasy about developing super powers or having the entire house stay clean for a whole day with 4 kids at home).

Luckily Jim (and others) is good at pointing people to TeachingDemos when it is appropriate. I try to point people to Jim's package as well, but he is usually a bit faster about it.

Hi Greg,

I must say I thoroughly enjoyed reading your response and reasoning, and I will definitely take a closer look into the TeachingDemos package.

Cheers,
Rainer

-- Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany) Centre of Excellence for Invasion Biology Stellenbosch University South Africa Tel : +33 - (0)9 53 10 27 44 Cell: +33 - (0)6 85 62 59 98 Fax : +33 - (0)9 58 10 27 44 Fax (D):+49 - (0)3 21 21 25 22 44 email: rai...@krugs.de Skype: RMkrug
Re: [R] How do I make this faster?
On 04/11/2011 10:28 AM, Andreas Borg wrote:

Hi Hasan,

I'd be happy to help you, but I am not able to run your code. You use commandArgs to retrieve arguments of the R program, but which ones do you actually provide?

Best regards,
Andreas

Hasan Diwan wrote:

I was on vacation the last week and wrote some code to run a 500-day correlation between the Nasdaq tracking stock (QQQ) and 191 currency pairs for 500 days. The initial run took 9 hours(!) and I'd like to make it faster. So, I'm including my code below, in hopes that somebody will be able to figure out how to make it faster, either through parallelisation, or by making changes. I've marked the places where Rprof showed me it was slowing down:

  currencyCorrelation <- function(lagtime = 1) {
    require(quantmod)
    dataTrack <- getSymbols(commandArgs(trailingOnly=T)[1], from='2009-11-21', to='2011-04-03')
    stockData <- get(dataTrack)
    currencies <- row.names(oanda.currencies[grep(pattern='oz.', fixed=T, x=as.vector(oanda.currencies$oanda.df.1.length.oanda.df...21.)) == F])
    correlations <- vector()
    values <- list()
    # optimise these loops using the apply family
    for (i in currencies) {
      for (j in currencies) {
        if (i == j) next()
        fx <- getFX(paste(i, j, sep='/'), from='2009-11-20', to='2011-04-02')
        # Prepare data by getting rates for market days only
        fx <- get(fx)
        fx <- fx[which(index(fx) %in% index(QQQ$QQQ.Close))]
        correlation <- cor(fx, QQQ$QQQ.Close)
        correlations <- c(correlations, correlation)

In this piece of code you concatenate correlation and correlations. Because correlations grows on every iteration, a new block of memory has to be found for the object again and again. Preallocating the space you need (a rough estimate is also fine) will make this much faster. You can do this by not creating zero-length vectors for 'correlations' and 'values' before the start of the loop, but creating them at the desired length already and assigning values by index in the loop instead of concatenating. This could possibly speed up your code by several orders of magnitude.

cheers,
Paul

        string <- paste(paste(i,j,sep='/'), correlation, sep=',')
        values <- c(values, paste(string,'\n', sep=''))
      }
    }
    # TODO eliminate NA's
    values <- values[which(correlations[is.na(correlations) == F])]
    correlations <- correlations[is.na(correlations) == F]
    values <- values[order(correlations, decreasing=T)]
    write.table(values, file=commandArgs(trailingOnly=T)[2], sep='', qmethod=NULL, quote = F, row.names=F, col.names=F)
    rm('currencies', 'correlations', 'values', 'fx', 'string')
    return()
  }
  lagtime <- as.integer(commandArgs(trailingOnly=T)[3])
  if (is.na(lagtime)) lagtime <- 1
  print(paste(Sys.time(), '--- starting', lagtime, 'day lag currencies correlation with', commandArgs(trailingOnly=T)[1], 'from 2009-11-20 to 2011-04-03'))
  currencyCorrelation(lagtime)
  print(paste(Sys.time(), '--- ended, results in', commandArgs(trailingOnly=T)[2]))

-- Paul Hiemstra, MSc Global Climate Division Royal Netherlands Meteorological Institute (KNMI) Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39 P.O. Box 201 | 3730 AE | De Bilt tel: +31 30 2206 494 http://intamap.geo.uu.nl/~paul http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
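Paul's preallocation advice can be illustrated with a minimal sketch. The function pair_stat() and the vector ids are hypothetical stand-ins for the cor()/getFX() step and the currency codes, so this runs without quantmod or network access; only the way the result vector is built differs between the two versions.

```r
# Hypothetical stand-in for the expensive per-pair computation
pair_stat <- function(i, j) (i * 31 + j) %% 7

ids <- 1:50   # stand-in for the currency codes

# Version 1: grow the result with c() on every iteration (as in the posted code)
grow <- function(ids) {
  out <- vector()
  for (i in ids) for (j in ids) {
    if (i == j) next
    out <- c(out, pair_stat(i, j))
  }
  out
}

# Version 2: allocate the full length once, then fill by index
prealloc <- function(ids) {
  n <- length(ids)
  out <- numeric(n * (n - 1))   # one slot per ordered pair with i != j
  k <- 0
  for (i in ids) for (j in ids) {
    if (i == j) next
    k <- k + 1
    out[k] <- pair_stat(i, j)
  }
  out
}

# Same values either way; system.time() on larger inputs shows the difference
identical(grow(ids), prealloc(ids))
```

In the posted function, the repeated download in getFX() may well dominate the run time, so profiling after this change would show how much is gained in practice.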
[R] heatmap clustering dendrogram export
Hi,

I am a beginner in R. I used gplots to generate a heatmap as follows:

  heatmap.2(matrix, col=topo.colors(75), dendrogram="column", Rowv=FALSE, trace="none", key=TRUE, keysize=0.8, density.info="none", cexRow=0.2, cexCol=0.6)

It works well: it generates a heatmap with a column clustering dendrogram, and I can export a very nice graph. But I do not know how to export the column clustering dendrogram itself, which I need for the next step of my analysis. If I could export it in Newick (or Nexus) format, it would be much easier for me.

Thank you very much!
Re: [R] rtmvt
Hi there,

Since you failed to provide us with data and sessionInfo(), I can only guess that for some reason you call the rtmvt.rejection function instead of rtmvt.gibbs. Just look at the code of rtmvt by typing:

  rtmvt

There you can see that it is a wrapper for rtmvt.rejection or rtmvt.gibbs. You can run them directly by typing:

  tmvtnorm:::rtmvt.rejection(...)
  tmvtnorm:::rtmvt.gibbs(...)

Here ... stands for the arguments as pre-processed by rtmvt(). But first, you might upgrade your base R and the installed packages.

HTH,
Denes

I have been using the rtmvt function in the {tmvtnorm} package and I'm getting the warning:

  Acceptance rate is very low and rejection sampling becomes inefficient. Consider using Gibbs sampling.

but I AM specifying the gibbs algorithm:

  rtmvt(M, mean=q[,,i,j], sigma=((u[i,j] + nu[i])/(p+nu[i]))*delta[,,i], df=ceiling(nu[i]+p), lower=c(0,0), algorithm="gibbs")

Any ideas why I am getting this warning and how I can fix it?

-- View this message in context: http://r.789695.n4.nabble.com/rtmvt-tp3440751p3440751.html Sent from the R help mailing list archive at Nabble.com.
[R] Mclapply and print statement
Dear all,

I am using the mclapply function to split my code across the many cores my system has, and it seems to be working fine. This is the parallel version of lapply. The only problem I seem to have is that print cannot print messages. This is my call:

  Shadowlist <- mclapply(1:dimz, function(i) {
    print(sprintf('Creating the %d map',i));
    GaussRF(x=x, y=y, model=model, grid=TRUE, param=c(mean,variance,nugget,scale,Whit.alpha))
  } )

Ideally, I would like my function to produce output of the form

  'I am the processor %d and I work with the task %d',processorid,i

So far I get no output from my print(sprintf(...)) call. What do you think I should try out?

Best Regards
Alex
[R] pseudo-R by hand
hello dear list!

since we want to do a model analysis and some people would like to see pseudo-R^2 values for different types of glm of a logistic regression, i've decided to write a function that computes either nagelkerke's normed pseudo-R or cox and snell's pseudo-R. however, i am not clear about the decisive step, where i need to calculate the log of (maximum likelihood estimate of the model divided by the mle of the null model). i am well aware of the functions stats::mle and stats::logLik as well as of Design::lrm. however, I'm not sure whether mle helps me at all and I am uncertain about the logLik call I have implemented:

  #coxsnell
  lambda <- -2*log((logLik(null.model)[1]/logLik(model)[1]))
  out <- 1-exp(-lambda/n)

  #nagelkerke
  lambda <- -2*log( logLik(model)[1]/logLik(null.model)[1] )
  lambda2 <- -2*log( logLik(model)[1] )
  out <- (1-exp(-lambda/n))/(1-exp(-lambda2/n))

can anyone help me out?
[R] Comparing execution times
Dear all,

On my 'simple' computer I was running some experiments to help me understand how much faster a multicore lapply would be. I thought it might be interesting for some people to look at the results. Even though they are not accurate, they might still be a good indicator of how much improvement there can be.

A. Case. The classic: for 1:100

  for (i in c(1:dimz)){
    print(sprintf('Creating the %d map',i));
    Shadowlist[,,i] <- GaussRF(x=x, y=y, model=model, grid=TRUE, param=c(mean,variance,nugget,scale,Whit.alpha))
  }

      user   system  elapsed
  1825.699  303.100 1063.352

B. Case. Same as above but with lapply instead of for

  Shadowlist <- lapply(1:dimz, function(i) {
    print(sprintf('Creating the %d map',i));
    GaussRF(x=x, y=y, model=model, grid=TRUE, param=c(mean,variance,nugget,scale,Whit.alpha))
  } ) )

      user   system  elapsed
  1816.784  296.745 1062.142

C. Case. Foreach is considered to be easier to apply to many cores.

  foreach (i=1:dimz) %do% {
    print(sprintf('Creating the %d map',i));
    Shadowlist[,,i] <- f <- GaussRF(x=x, y=y, model=model, grid=TRUE, param=c(mean,variance,nugget,scale,Whit.alpha))
  }

      user   system  elapsed
  1027.058   13.243 1031.849

D. Case. The really multicore lapply. Great difference

  > system.time(Shadowlist <- mclapply(1:dimz, function(i) {
  +   #print(sprintf('Creating the %d map',i));
  +   GaussRF(x=x, y=y, model=model, grid=TRUE, param=c(mean,variance,nugget,scale,Whit.alpha))
  + }
  + )
  + )

      user   system  elapsed
   263.134   99.639  549.366

My computer is a normal four-core PC. Great improvement with mclapply.
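The serial-versus-forked comparison above can be reproduced on a small scale with the sketch below. It is only an illustration under stated assumptions: slow_task() is a hypothetical stand-in for one GaussRF() simulation, and mclapply() is taken from the parallel package that ships with current R (the original post predates it and would have used the multicore package).

```r
library(parallel)   # provides mclapply()

# Hypothetical stand-in for one expensive simulation such as GaussRF()
slow_task <- function(i) {
  Sys.sleep(0.05)
  i^2
}

# Forking is not available on Windows, so fall back to a single core there
cores <- if (.Platform$OS.type == "windows") 1L else 2L

t_serial   <- system.time(res_serial   <- lapply(1:8, slow_task))
t_parallel <- system.time(res_parallel <- mclapply(1:8, slow_task, mc.cores = cores))

# The results are the same; only the elapsed (wall-clock) time should differ
identical(res_serial, res_parallel)
c(serial = t_serial[["elapsed"]], parallel = t_parallel[["elapsed"]])
```

When comparing such runs, the elapsed column is the one to read; user and system times of forked children are accounted for differently, which may be why the user times in case D look so much smaller than the elapsed time.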
Re: [R] Question about levels/as.numeric
--- On Sun, 4/10/11, Rolf Turner rolf.tur...@xtra.co.nz wrote:

From: Rolf Turner rolf.tur...@xtra.co.nz
Subject: Re: [R] Question about levels/as.numeric
To: r-help@r-project.org
Received: Sunday, April 10, 2011, 9:48 PM

On 11/04/11 10:08, Peter Ehlers wrote:

SNIP
Checking anything with Excel is never much use.
SNIP

Fortune?

cheers,
Rolf Turner

Definitely!
Re: [R] glm with multiple vars
Sascha

Thanks that works.

Dirk

-- View this message in context: http://r.789695.n4.nabble.com/glm-with-multiple-vars-tp3438095p3441476.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] pseudo-R by hand
On Mon, 11 Apr 2011, Sacha Viquerat wrote:

hello dear list!

since we want to do a model analysis and some people would like to see pseudo-R^2 values for different types of glm of a logistic regression, i've decided to write a function that computes either nagelkerke's normed pseudo-R or cox and snell's pseudo-R. however, i am not clear about the decisive step, where i need to calculate the log of (maximum likelihood estimate of the model divided by the mle of the null model). i am well aware of the functions stats::mle and stats::logLik as well as of Design::lrm.

You can look at the pR2() function in the pscl package, which provides various flavors of pseudo R-squared for glm, multinom, and polr objects. The idea is to extract the observed log-likelihood using logLik(), then update the model to obtain the null model only and call logLik() again. From the two log-likelihoods and the associated number of observations, the pseudo R-squared values are computed using pR2Work(); see pscl:::pR2.glm and pscl:::pR2Work.

hth,
Z

however, I'm not sure whether mle helps me at all and I am uncertain about the logLik call I have implemented:

  #coxsnell
  lambda <- -2*log((logLik(null.model)[1]/logLik(model)[1]))
  out <- 1-exp(-lambda/n)

  #nagelkerke
  lambda <- -2*log( logLik(model)[1]/logLik(null.model)[1] )
  lambda2 <- -2*log( logLik(model)[1] )
  out <- (1-exp(-lambda/n))/(1-exp(-lambda2/n))

can anyone help me out?
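The recipe described in the reply (fit the model, update it to the null model, take both log-likelihoods) can be sketched in base R as follows. This is a minimal illustration using the textbook definitions of the Cox and Snell and Nagelkerke measures, not a copy of the pscl code; the mtcars model is an arbitrary example chosen so the block runs as is. Since logLik() already returns values on the log scale, the log of the likelihood ratio is a difference of the two values, not the log of their quotient.

```r
# Illustrative data: transmission type (am) modelled by weight (wt) in mtcars
model      <- glm(am ~ wt, data = mtcars, family = binomial)
null.model <- update(model, . ~ 1)      # intercept-only model on the same data

ll1 <- as.numeric(logLik(model))        # log-likelihood of the fitted model
ll0 <- as.numeric(logLik(null.model))   # log-likelihood of the null model
n   <- nobs(model)                      # number of observations

# Cox & Snell: 1 - (L0 / L1)^(2/n), written with log-likelihoods
cox.snell <- 1 - exp(2 * (ll0 - ll1) / n)

# Nagelkerke: Cox & Snell divided by its maximum attainable value 1 - L0^(2/n)
nagelkerke <- cox.snell / (1 - exp(2 * ll0 / n))

c(CoxSnell = cox.snell, Nagelkerke = nagelkerke)
```

Comparing these numbers with the output of pR2() on the same model would be a reasonable check before relying on a hand-written function.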
[R] multiple comparisons with generalised least squares
Dear R users,

I have used the following model:

  M1 <- gls(Nblad ~ Concentration + Season + Concentration:Season, data=DDD, weights=varIdent(form=~ 1 | Season*Concentration))

to assess the effect of Concentration and Season on nitrogen uptake by leaves (Nblad). I accounted for the difference in variance across the factor levels by using the varIdent function. Then I wanted to perform multiple comparisons with the glht function of the multcomp package:

  glht(M1, linfct = mcp(Season = "Tukey"))

However, here I got the error messages

  Error in terms.default(object) : no terms component
  Error in factor_contrasts(model) : no ‘model.matrix’ method for ‘model’ found!

Does the glht function work with a gls model? And if not, is there another way to perform multiple comparisons for a gls model? I've searched this forum for an answer to this question, but I could only find someone with the same question, which remained unanswered. I hope someone can provide an answer now!

Many thanks in advance!
Sandy

-- View this message in context: http://r.789695.n4.nabble.com/multiple-comparisons-with-generalised-least-squares-tp3441513p3441513.html Sent from the R help mailing list archive at Nabble.com.
[R] Package fgui returns error: Object of type closure is not subsettable
Hello All,

I have written three functions.

First: To input a user-specified SAS dataset and plot the boxplots of relevant variables.

Second: To extract the number of hours, minutes etc. from a variable describing a time-point using regular expressions. E.g. 'Per1, Day 2, 24 Hour' would be separated into three columns, Per (value 1), Day (value 2) and Hour (value 24).

Third: To find the summary statistics of the relevant variables from the data input using the first function.

All the functions are working fine in the R console. However, when I tried to use the 'guiv' function from the 'fgui' package, the last two functions return the error: 'Object of type closure is not subsettable'. The guiv function just provides a GUI to enter the function arguments. guiv works well with the first function but returns an error with the second and third functions. What could be the reason? Could it be because of the use of regular expressions?

Thanks
Nikhil
Re: [R] Coding matrix equation
On Mon, Apr 11, 2011 at 08:43:03AM +0100, matthew.r.robin...@sheffield.ac.uk wrote:

Hi all,

I have two matrices:

  G <- matrix(c(2.0, 0.5, 0.5, 0.5, 2.0, 0.5, 0.5, 0.5, 2.0), 3, 3)
  P <- matrix(c(1.0, 0.5, 0.5, 0.5, 1.0, 0.5, 0.5, 0.5, 1.0), 3, 3)

and I want to run this equation to get a new matrix F:

  F = [P+2G]^-1/2 P [P+2G]^-1/2

Could someone please tell me how to code this in R?

Hi. Try the following.

  sqrtSymMat <- function(A) {
    out <- eigen(A)
    D <- diag(out$values)
    U <- out$vectors
    U %*% sqrt(D) %*% t(U)
  }
  A <- sqrtSymMat(solve(P + 2*G))
  F <- A %*% P %*% A

See also the function svd() and its help ?svd. The operator A^(1/2) works component-wise. There may be a function computing the square root of a positive semidefinite matrix in some of the CRAN extension packages http://cran.at.r-project.org/web/packages/index.html but I am not sure. The package mvtnorm http://cran.at.r-project.org/web/packages/mvtnorm/index.html computes the square root of a matrix internally. See the help of the function ?rmvnorm.

Hope this helps.
Petr Savicky.
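The eigendecomposition approach from the reply can be checked numerically with the short sketch below. It reuses the two matrices from the question; the names M and Fmat are choices made here (Fmat avoids masking F, which R also uses as shorthand for FALSE), and the symmetric = TRUE argument is an added assumption that the input is symmetric, which holds for these matrices.

```r
G <- matrix(c(2.0, 0.5, 0.5,  0.5, 2.0, 0.5,  0.5, 0.5, 2.0), 3, 3)
P <- matrix(c(1.0, 0.5, 0.5,  0.5, 1.0, 0.5,  0.5, 0.5, 1.0), 3, 3)

# Square root of a symmetric positive semidefinite matrix: U sqrt(D) U'
sqrtSymMat <- function(A) {
  e <- eigen(A, symmetric = TRUE)
  e$vectors %*% diag(sqrt(e$values)) %*% t(e$vectors)
}

M    <- solve(P + 2 * G)    # (P + 2G)^-1
A    <- sqrtSymMat(M)       # (P + 2G)^-1/2
Fmat <- A %*% P %*% A       # the requested matrix

# Sanity checks: the root squares back to M, and the result is symmetric
all.equal(A %*% A, M)
all.equal(Fmat, t(Fmat))
```

The first check is the useful one when adapting this to other matrices: if it fails, the input was probably not symmetric positive semidefinite.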
[R] Polar Plots
Dear List,

Following the link below (http://rgm2.lab.nig.ac.jp/RGM2/func.php?rd_id=plotrix:clock24.plot) I got an interesting polar plot which displayed my data and the time of observation. Thank you very much for providing such details. However, I have two sets of data which I wish to display in the same polar plot. I tried using points to add the second data set but could not succeed. That is, after running the first code:

  clock24.plot(a, b, main="Test Clock24 (lines)", show.grid=FALSE, line.col="green", lwd=3)
  if(dev.interactive()) par(ask=TRUE)
  # now do a 'daylight' plot
  clock24.plot(a, b, main="Test Clock24 daytime (symbols)", point.col="blue", rp.type="s", lwd=3)
  # reset the margins
  par(mar=c(5,4,4,2))

I tried to add the second using:

  points(aa, bb, col="blue")
  Error in xy.coords(x, y) : (list) object cannot be coerced to type 'double'
  points(add = TRUE, a, b, col="blue")
  Error in xy.coords(x, y) : (list) object cannot be coerced to type 'double'

Any further help will be much appreciated.

Best regards
Ogbos
Re: [R] About Tinn-R
Dear Marcos,

Sorry, it is very difficult for me to know what happened on your computer! The fact is that the structure of the ini files is corrupted. This is where Tinn-R stores all user preferences and configurations. It will be necessary to rename the Tinn-R folder (or remove it, in which case all your prior configuration will be lost unless you have a backup made by Tinn-R). On your computer the folder is located at

  C:/Documents and Settings/Marcos/Datos de programa/

To make this folder visible, have a look at the links below:

  http://www.howtogeek.com/howto/windows/display-hidden-folders-in-xp/
  http://www.online-tech-tips.com/computer-tips/how-to-hide-files-and-folders-in-windows-xp-the-easy-way/

After renaming/removing the Tinn-R folder, start the Tinn-R program: it will create a new ini folder.

HTH,
JCFaria

-- View this message in context: http://r.789695.n4.nabble.com/About-Tinn-R-tp3440475p3441703.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Yearly aggregates and matrices
Solved the problem: I guess I was still using the main version of zoo. Thanks again!

-- View this message in context: http://r.789695.n4.nabble.com/Yearly-aggregates-and-matrices-tp3438140p3441723.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Converting edgelist to symmetric matrix/ plotting sparse network with lots of nodes
Date: Sat, 9 Apr 2011 14:34:28 -0700
From: kmshafi...@yahoo.com
To: r-help@r-project.org
Subject: [R] Converting edgelist to symmetric matrix

Hi, I have network data in the form of a couple of edgelists containing weights in the format x,y,weight whereby x represents row header and y represents column header. All edgelists are based on links among 634 nodes and I need to convert them into a 634*634 weighted matrix. I searched for online help using possible keywords I know, but I could not find a clue how to do this in R. Any help will be appreciated.

I'd replied earlier suggesting the ncol format. I'd like to follow up on that, as I have tried it with some success, but maybe someone else can comment on alternatives and suggest ideas for plotting. I have a set of nodes or states specified by two parameters (these are isotopes specified by proton and mass, connected by decay paths, with the probability of that path being its weight). This seems to almost work for your needs (note that I have taken out a lot of extraneous stuff and may have dropped something important LOL; also, setup for Rgraphviz is not simple on 'dohs, as I had to manually edit env variables etc.):

  library(Rgraphviz)
  library(QuACN)
  nxg <- read.graph("ncol.txt", format="ncol")
  nn <- igraph.to.graphNEL(nxg)
  aasd <- adjacencyMatrix(nn)
  str((aasd))
   num [1:2561, 1:2561] 0 100 95.8 2.7 0 0 0 0 0 0 ...
   - attr(*, "dimnames")=List of 2
    ..$ : chr [1:2561] "17_10" "17_9" "16_9" "15_7" ...
    ..$ : chr [1:2561] "17_10" "17_9" "16_9" "15_7" ...

  $ head ncol.txt
  17_10 17_9 100
  17_10 16_9 95.8
  17_10 15_7 2.7
  18_10 18_9 100
  19_10 19_9 100
  23_10 23_11 100
  243_100 239_98 100
  245_100 241_98 100
  246_100 242_98 92
  246_100 246_99 1

However, for my needs plotting has been a big problem. I apparently have 2561 isotopes (none of this has been validated yet LOL) that are sparsely connected by a few decay modes (presumably an acyclic directed graph, but DAG in searches didn't help much). Any thoughts on which R classes to try to visualize this, or even what I should be thinking about artistically? This is largely just a way to learn R for some other things I want to do for analyzing data on wireless devices, but I am curious about this result too. Some of the things I did try are below:

  library(Rgraphviz)
  library(QuACN)
  nxg <- read.graph("ncol.txt", format="ncol")
  foo <- adjacencyMatrix(nxg)
  ?graphNEL
  ?NELgraph
  df <- data.frame(nxg)
  plot.igraph(nxg, layout=layout.svd)
  rglplot.igraph(nxg, layout=layout.svd)
  rglplot.igraph(nxg, layout=layout.svd)
  rglplot.igraph(nxg)
  tkplot.igraph(nxg)
  library(tcltk)
  tkplot.igraph(nxg)
  tkplot(nxg)
  dx <- decompose.graph(nxg)
  nn <- igraph.to.graphNEL(nxg)
  igraph.plotting(nxg)
  library(sna)
  gplot(nxg)
  dx <- get.adjacency(nxg)
  gplot(dx)
  gplot3d(dx)
  plot(nxg)
  library(ElectroGraph)
  eg <- electrograph(nxg)
  eg <- electrograph(aasd)
  plot(eg)

Thanks.

Best regards, Shafique
[R] [R-pkgs] plyr: version 1.5
# plyr

plyr is a set of tools for a common set of problems: you need to __split__ up a big data structure into homogeneous pieces, __apply__ a function to each piece and then __combine__ all the results back together. For example, you might want to:

* fit the same model to each patient subset of a data frame
* quickly calculate summary statistics for each group
* perform group-wise transformations like scaling or standardising

It's already possible to do this with base R functions (like split and the apply family of functions), but plyr makes it all a bit easier with:

* totally consistent names, arguments and outputs
* convenient parallelisation through the foreach package
* input from and output to data.frames, matrices and lists
* progress bars to keep track of long running operations
* built-in error recovery, and informative error messages
* labels that are maintained across all transformations

Considerable effort has been put into making plyr fast and memory efficient, and in many cases plyr is as fast as, or faster than, the built-in equivalents.

A detailed introduction to plyr has been published in JSS: The Split-Apply-Combine Strategy for Data Analysis, http://www.jstatsoft.org/v40/i01/. You can find out more at http://had.co.nz/plyr/, or track development at http://github.com/hadley/plyr. You can ask questions about plyr (and data manipulation in general) on the plyr mailing list. Sign up at http://groups.google.com/group/manipulatr.

Version 1.5
--

NEW FEATURES

* new `strip_splits` function removes splitting variables from the data frames returned by `ddply`.
* `rename` moved in from reshape, and rewritten.
* new `match_df` function makes it easy to subset a data frame to only contain values matching another data frame. Inspired by http://stackoverflow.com/questions/4693849.

BUG FIXES

* `**ply` now works when passed a list of functions
* `*dply` now correctly names output even when some output combinations are missing (NULL) (Thanks to bug report from Karl Ove Hufthammer)
* `*dply` preserves the class of many more object types.
* `a*ply` now correctly works with zero length margins, operating on the entire object (Thanks to bug report from Stavros Macrakis)
* `join` now implements joins in a more SQL like way, returning all possible matches, not just the first one. It is still (a little) faster than merge. The previous behaviour is accessible with `match = "first"`.
* `join` is now more symmetric so that `join(x, y, "left")` is closer to `join(y, x, "right")`, modulo column ordering
* `named.quoted` failed when quoted expressions were longer than 50 characters. (Thanks to bug report from Eric Goldlust)
* `rbind.fill` now correctly maintains POSIXct tzone attributes and preserves missing factor levels
* `split_labels` correctly preserves empty factor levels, which means that `drop = FALSE` should work in more places. Use `base::droplevels` to remove levels that don't occur in the data, and `drop = T` to remove combinations of levels that don't occur.
* `vaggregate` now passes `...` to the aggregation function when working out the output type (thanks to bug report by Pavan Racherla)

-- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/

___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages
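The split-apply-combine idea from the announcement can be sketched with the base R functions it mentions. The example below is an illustration on the built-in iris data, chosen here only because it is always available; with plyr loaded, a call along the lines of `ddply(iris, "Species", summarise, n = length(Species), mean.Sepal.Length = mean(Sepal.Length))` should give a comparable table in one step.

```r
# Split: one data frame per species
pieces <- split(iris, iris$Species)

# Apply: summarise each piece into a one-row data frame
summaries <- lapply(pieces, function(d) {
  data.frame(Species = d$Species[1],
             n = nrow(d),
             mean.Sepal.Length = mean(d$Sepal.Length))
})

# Combine: stack the per-group results into a single data frame
result <- do.call(rbind, summaries)
result
```

The three explicit steps are what plyr wraps behind a single function name, with the input and output types encoded in the first two letters of that name.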
Re: [R] Quantile Regression and R
Pls disregard... I have it figured out. Thank you.

Regards,
Peter D. Sheldrick
Hartford Financial Services Group

_
From: Sheldrick, Peter (Specialty Casualty UW Support)
Sent: Friday, April 08, 2011 9:53 AM
To: 'r-help@R-project.org'
Subject: Quantile Regression and R

Sir or Madam:

I am new to R and the use of quantile regression. In addition, I am a finance person, not a true statistician.

The basic regression form is

  Y = (Coefficient * Variable) + Error Term

I have results from a quantile regression where I used the Barrodale and Roberts method with bootstrapping for standard errors. I am now taking another set of data and applying the quantile regression equation to determine accuracy. I am doing this in Excel so I can share it with my business customer. I think I need to add the error term to my prediction, but I cannot seem to find it in the summary output of the quantile regression, nor does my Google search reveal how it is calculated, if there is one.

Any help would be appreciated. Thanks.

Regards,
Peter D. Sheldrick
Hartford Financial Services Group
Re: [R] Converting edgelist to symmetric matrix
Hi Shafique,

If your edgelist is in the form of a text file (elist.csv) that looks like this:

    from, to, weight
    vertex1, vertex2, 3
    vertex2, vertex3, 2.3
    vertex4, vertex1, 1.2
    ...

you can convert that to a matrix using

    library(igraph)
    edge.list <- read.csv("elist.csv", header=TRUE)
    g <- graph.data.frame(edge.list, directed=FALSE)
    get.adjacency(g, type="both", attr="weight")

More options for exporting the adjacency matrix are here: http://igraph.sourceforge.net/doc/R/conversion.html

If you give more details about your data format you might get more specific help.

HTH,
Gary

On Apr 11, 2011, at 6:00 AM, r-help-requ...@r-project.org wrote:

> Date: Sat, 9 Apr 2011 14:34:28 -0700
> From: kmshafi...@yahoo.com
> To: r-help@r-project.org
> Subject: [R] Converting edgelist to symmetric matrix
>
> Hi, I have network data in the form of a couple of edgelists containing weights in the format x,y,weight, whereby x represents the row header and y represents the column header. All edgelists are based on links among 634 nodes and I need to convert them into a 634*634 weighted matrix. I could not find a clue how to do this in R. Any help will be appreciated.
>
> I'm trying to do something related and found ?read.graph. Will format="ncol" do what you need? This apparently creates a graph object that likely has the capabilities you need. Again, I haven't actually used any of this, I just found it while trying to solve a different problem.
>
> 'It is a simple text file with one edge per line. An edge is defined by two symbolic vertex names separated by whitespace. (The symbolic vertex names themselves cannot contain whitespace.) They might be followed by an optional number, this will be the weight of the edge; the number can be negative and can be in scientific notation. If there is no weight specified to an edge it is assumed to be zero.'
>
> Best regards,
> Shafique

--
Gary Weissman
http://www.babelgraph.org/
g...@babelgraph.org
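For readers without igraph, the same conversion can be sketched in base R. This is only an illustration, not the method from the message above: the toy edge list and the column names `from`, `to` and `weight` are assumptions, and node pairs with no listed edge are filled with 0.

```r
# Hypothetical toy edge list; the column names are assumptions.
edges <- data.frame(
  from   = c("vertex1", "vertex2", "vertex4"),
  to     = c("vertex2", "vertex3", "vertex1"),
  weight = c(3, 2.3, 1.2),
  stringsAsFactors = FALSE
)

# One row and one column per distinct node, initialised to 0 (no edge).
nodes <- sort(unique(c(edges$from, edges$to)))
adj <- matrix(0, nrow = length(nodes), ncol = length(nodes),
              dimnames = list(nodes, nodes))

# A two-column character matrix indexes by (row name, column name).
# Writing each weight in both directions makes the result symmetric.
adj[cbind(edges$from, edges$to)] <- edges$weight
adj[cbind(edges$to, edges$from)] <- edges$weight

print(adj)
```

With 634 nodes this gives a 634 x 634 matrix; if the same pair appears twice in the edge list, the later weight silently overwrites the earlier one, so duplicates may need aggregating first.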
[R] Edate and EOmonth
Hi,

I was wondering if anyone could point me to equivalents of the Excel EDATE and EOMONTH functions in R. I have found timeLastDayInMonth and timeFirstDayInMonth in the timeDate package. However, I am looking for a bit more flexibility: I would like to be able to obtain dates and end-of-month (EOM) dates n months before or after the input date.

Thanks,
Jorge Nieves
Re: [R] Edate and EOmonth
I think Dirk has recently done some things w/ boost date time as an Rcpp based project bdt.

http://cran.r-project.org/web/packages/RcppBDT/ChangeLog

-Whit

On Mon, Apr 11, 2011 at 10:11 AM, Jorge Nieves jorge.nie...@moorecap.com wrote:

> Hi, I was wondering if anyone could point me to the excel look alike Edate and eomonth functions in R. I have found the timeLastDayInMonth and timeFirstDayInMonth in the timeDate package. However, I am looking for a bit more flexibility. I would like to be able to obtain dates and EOM dates n months prior/forward to the input date.
>
> Thanks,
> Jorge Nieves
Re: [R] Fitting controlled released data
On 10.04.2011 21:22, EmaDaCuz wrote:

> Hi, I am new to the forum/mailing list. I have been using R for a while and I find it incredible.
> I was just wondering whether someone has ever written a library to calculate the best fit of experimental data to some controlled release models, having only the cumulative drug release at given time points. For example, there is an extension for SigmaPlot
> http://www.sigmaplot.co.uk/products/sigmaplot/productuses/prod-uses15.php
> which allows rapid fitting of 5 standard models.

R is better: just fit an arbitrary number of different models and choose the one that fits your criterion best. I doubt that this is sensible, but it is not hard to do with arbitrary (rather than 5 specific) kinds of models.

Uwe Ligges

> I prefer to use free software and therefore I would rather use R than SigmaPlot. Is there anyone who can help?
> Thank you very much
> Marco
> --
> View this message in context: http://r.789695.n4.nabble.com/Fitting-controlled-released-data-tp3440303p3440303.html
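The "fit several models and compare" idea can be sketched as follows. Everything specific here is an assumption made for illustration: the data are simulated, and the three candidate models (zero-order, Higuchi, first-order) are common textbook release models, not necessarily the five that the SigmaPlot extension offers. As the reply above cautions, picking a model purely by best fit is not automatically sensible.

```r
set.seed(42)

# Hypothetical data: cumulative release (%) at sampling times (hours),
# simulated from a first-order model with a little noise.
hours   <- c(0.5, 1, 2, 4, 6, 8, 12, 24)
release <- 100 * (1 - exp(-0.25 * hours)) + rnorm(length(hours), sd = 1)
d <- data.frame(hours, release)

fits <- list(
  # Zero-order: Q = k * t
  zero_order  = lm(release ~ 0 + hours, data = d),
  # Higuchi: Q = k * sqrt(t)
  higuchi     = lm(release ~ 0 + sqrt(hours), data = d),
  # First-order: Q = Qinf * (1 - exp(-k * t)); needs starting values
  first_order = nls(release ~ Qinf * (1 - exp(-k * hours)), data = d,
                    start = list(Qinf = 90, k = 0.1))
)

# Compare the candidates on one criterion (AIC here; lower is better).
aic  <- sapply(fits, AIC)
best <- names(which.min(aic))
print(sort(aic))
print(best)
```

Any model that can be written as a formula can be added to the list; nls() may need different starting values for real data.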
Re: [R] Edate and EOmonth
On 11 April 2011 at 10:55, Whit Armstrong wrote:
| I think Dirk has recently done some things w/ boost date time as an Rcpp
| based project bdt.
|
| http://cran.r-project.org/web/packages/RcppBDT/ChangeLog

It's on CRAN too at

    http://cran.r-project.org/web/packages/RcppBDT/

It may get an update 'soon' as Romain is adding more magic to Rcpp modules.

Dirk

| On Mon, Apr 11, 2011 at 10:11 AM, Jorge Nieves jorge.nie...@moorecap.com wrote:
|
| Hi,
|
| I was wondering if anyone could point me to the excel look alike Edate
| and eomonth functions in R. I have found the timeLastDayInMonth and
| timeFirstDayInMonth in the timeDate package. However, I am looking
| for a bit more flexibility. I would like to be able to obtain dates and
| EOM dates n months prior/forward to the input date.
|
| Thanks,
|
| Jorge Nieves

--
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com
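For comparison, the behaviour Jorge describes can also be sketched in base R without extra packages. The function names `edate()`, `eomonth()` and the helper `first_of_month()` below are hypothetical, chosen to mirror Excel; the clamping of day 29-31 to the last day of a shorter month is an assumption about the desired EDATE behaviour.

```r
# First day of the month that lies n months after (or before) `date`.
first_of_month <- function(date, n = 0) {
  lt <- as.POSIXlt(as.Date(date))
  total <- (lt$year + 1900) * 12 + lt$mon + n   # months counted from year 0
  as.Date(sprintf("%04d-%02d-01", total %/% 12, total %% 12 + 1))
}

# Last day of the month n months away: first day of the following month, minus 1.
eomonth <- function(date, n = 0) {
  first_of_month(date, n + 1) - 1
}

# Same day of the month n months away, clamped to that month's last day.
edate <- function(date, n = 0) {
  d   <- as.Date(date)
  day <- as.POSIXlt(d)$mday
  pmin(first_of_month(d, n) + day - 1, eomonth(d, n))
}

print(edate("2011-01-31", 1))    # one month after 31 January
print(eomonth("2011-04-11", 0))  # end of the current month
print(eomonth("2011-12-15", 2))  # end of the month two months ahead
```

Negative n moves backwards, so edate("2011-03-31", -1) lands on the last day of February.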
[R] heatmap clustering dendrogram export
Hi boyang zhe,

The dendrogram is stored in the object returned from heatmap.2, e.g.

    library(gplots)
    x <- heatmap.2(matrix(1:9, 3))
    dend.row <- x$rowDendrogram
    class(dend.row)
    [1] "dendrogram"
    plot(dend.row)

Amos Folarin

-- Forwarded message --
From: boyang zhe zheboy...@gmail.com
To: r-help@r-project.org
Date: Mon, 11 Apr 2011 11:15:52 +0200
Subject: [R] heatmap clustering dendrogram export

Hi, I am a beginner with R. I used gplots to generate a heatmap as follows:

    heatmap.2(matrix, col=topo.colors(75), dendrogram="column", Rowv=FALSE, trace="none", key=TRUE, keysize=0.8, density.info="none", cexRow=0.2, cexCol=0.6)

It works well. It generates a heatmap with a column clustering dendrogram and I can export a very nice graph. But I do not know how to export the column clustering dendrogram, which I want for the next step of my analysis. If I could export it in Newick (or Nexus) format, it would be much easier for me.

Thank you very much!
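The Newick export itself is not covered in the reply. A minimal base-R sketch, under stated assumptions, is to convert the dendrogram to an hclust object with as.hclust() and walk its merge table. The function name `hclust_to_newick()` is hypothetical, the output carries topology only (no branch lengths), and labels must not contain commas, parentheses or semicolons. The ape package's as.phylo() and write.tree() are a commonly used alternative that also handles branch lengths.

```r
# Write the topology of an hclust tree as a Newick string.
hclust_to_newick <- function(hc) {
  stopifnot(inherits(hc, "hclust"), !is.null(hc$labels))
  node <- function(i) {
    if (i < 0) return(hc$labels[-i])   # negative entry: a single leaf
    kids <- hc$merge[i, ]              # positive entry: an earlier merge step
    sprintf("(%s,%s)", node(kids[1]), node(kids[2]))
  }
  paste0(node(nrow(hc$merge)), ";")    # the last merge is the root
}

# Toy data: columns A and B are similar, C is different.
m <- matrix(c(1,   2,   9,
              1.1, 2.1, 9.5,
              5,   6,   1),
            nrow = 3, dimnames = list(NULL, c("A", "B", "C")))

# Column clustering, as heatmap.2 would compute it with default settings.
hc <- hclust(dist(t(m)))
newick <- hclust_to_newick(hc)
print(newick)
```

With a heatmap.2 result `x`, the starting point would be `hc <- as.hclust(x$colDendrogram)` instead of the `hclust()` call; the string can then be saved with writeLines(newick, "tree.nwk").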
[R] Geographic distance between lat-long points in R?
Dear R,

I have a bunch of geographic locations specified by lat-long coordinates. What's an easy way to calculate geographic distance between any two points? OR, perhaps there is a function for calculating a distance matrix for K sites?

Sincerely,
Scott Chamberlain
Re: [R] integration
It does. See the `lower' and `upper' arguments. Why are y and z not known? Say you want the marginal of x, i.e. integrate over x. Now, y and z are fixed. You fix them at different values, but they are known.

Ravi.
---
Ravi Varadhan, Ph.D.
Assistant Professor, Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University
Ph. (410) 502-2619
email: rvarad...@jhmi.edu

From: cindy Guo [mailto:cindy.g...@gmail.com]
Sent: Saturday, April 09, 2011 5:07 PM
To: Ravi Varadhan
Cc: r-help@r-project.org
Subject: Re: [R] integration

> 'integrate' does not allow parameter limits. For example, the limits of x are (z/y, Inf) while z and y are unknown.

On Fri, Apr 8, 2011 at 9:46 PM, Ravi Varadhan rvarad...@jhmi.edu wrote:

> ?integrate

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of cindy Guo [cindy.g...@gmail.com]
Sent: Friday, April 08, 2011 9:21 PM
To: r-help@r-project.org
Subject: [R] integration

> Hi, All,
>
> I have a density function with 3 variables which is defined on some irregular domain, and I want to get the marginal distribution of each variable. Is there any function doing this? A simple example is p(x,y,z)=x*y*z*I(xyz). So each marginal distribution is a function of the other two variables. My density form is very complicated, so I cannot do it by hand. I was just wondering if there is any function in R for this?
>
> Thanks,
> Cindy
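Ravi's point, that the limits may depend on y and z as long as those are fixed to known values at the time of the call, can be illustrated with a small sketch. The integrand exp(-x) is an assumption chosen only because its integral from z/y to infinity is known in closed form, exp(-z/y); it is not Cindy's density, and the function name `tail_integral()` is hypothetical.

```r
# Integrate over x from z/y to Inf, for fixed (known) values of y and z.
tail_integral <- function(y, z) {
  integrate(function(x) exp(-x), lower = z / y, upper = Inf)$value
}

# One evaluation: y and z are ordinary numbers here, so the limit is known.
val <- tail_integral(y = 2, z = 1)
print(val)
print(exp(-1 / 2))   # closed-form value for comparison

# To treat the result as a function of (y, z), evaluate it on a grid.
tail_v <- Vectorize(tail_integral)
grid <- outer(c(1, 2, 4), c(0.5, 1, 2), tail_v)   # rows: y, columns: z
print(grid)
```

The same pattern applies to a more complicated integrand: wrap integrate() in a function of the remaining variables and call it at whatever values are needed.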
Re: [R] Geographic distance between lat-long points in R?
Hi Scott,

have a look at the 'earth.dist' function in the package 'fossil'.

hth.

On 11.04.2011 17:37, Scott Chamberlain wrote:

> Dear R,
>
> I have a bunch of geographic locations specified by lat-long coordinates. What's an easy way to calculate geographic distance between any two points? OR, perhaps there is a function for calculating a distance matrix for K sites?
>
> Sincerely,
> Scott Chamberlain

--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf
Martinistr. 52
20246 Hamburg
T ++49/40/7410-58243
F ++49/40/7410-57790
[R] ordered logistic regression - cdplot and polr
Hi,

I have a dataset that I am trying to analyze and plot as an ordered logistic regression (y = ordinal categories 1-3, x = continuous variable with values 3-9).

First is a problem with cdplot: it produces a beautiful plot, with the right trend, but my independent factor values are transformed. The factor has values from 3-9, but the plot produces an x-axis with values from 20-140. When I force the xlim to be 3-9, it produces a plot without the trend, which can't be correct.

Second is a problem with polr: the output of the summary command for the model built with polr includes t values for lots (if not all) of my independent factor values, but does not produce a summary of the fit of the model or of the overall fit of the factor. Also, the intercepts are different from those produced with a logistic fit in JMP...

Code below, any help much appreciated.

Thanks
Beth

    LogAntDensityFactor <- as.factor(LogAntDensity)
    ### order ordinal variable
    HammerCatOrd <- ordered(HammerCat)
    ### set ordered ordinal dependent variable as factor
    HammerCatOrdFactor <- as.factor(HammerCatOrd)
    ### density plot with three levels
    cdplot(HammerCatOrdFactor ~ LogAntDensityFactor, xlab="Log(Ant Density)", ylab="Latency of response to disturbance (1-3)")
    require(MASS)
    logordered <- polr(HammerCatOrdFactor ~ LogAntDensityFactor, Hess=TRUE)
    summary(logordered, digits=3)
Re: [R] Fast version of Fisher's Exact Test
It depends on how many other programs are running, how large they are, and how much RAM you have on your machine.

If I repeatedly run the example I used below, my R session shows 170 MB of memory usage, not a huge amount relative to total memory, and not a huge amount even for 32-bit R. But if your system has 2 GB of RAM and 1.9 GB is consumed by other processes, then this example will cause swapping and speed will be reduced.

So figuring out a solution requires understanding what is causing the slowdown: not enough RAM, other programs competing for CPU cycles...

You can try switching to 64-bit R, but unless your 32-bit R is loading some large data objects, leaving little RAM, you won't see much of a difference. If you start R and do

    rm(list = ls())

to ensure no big data objects are using up RAM, does the example below still take a long time?

You haven't mentioned what operating system you are using, how much RAM you have, or what sessionInfo() reports on your machine. That information will help to figure this out.

Steven McKinney

From: Jim Silverton [jim.silver...@gmail.com]
Sent: April 9, 2011 9:21 AM
To: Steven McKinney
Subject: Re: [R] Fast version of Fisher's Exact Test

> I have 32-bit R installed but my machine is 64 bit. Do I need to upgrade R to 64 bit for it to run faster?

On Fri, Apr 8, 2011 at 6:44 PM, Steven McKinney smckin...@bccrc.ca wrote:

> Do you mean a test such as this?
>
>     > fisher.test(matrix(c(502,498,490, 510), nrow = 2))
>
>         Fisher's Exact Test for Count Data
>
>     data:  matrix(c(502, 498, 490, 510), nrow = 2)
>     p-value = 0.6228
>     alternative hypothesis: true odds ratio is not equal to 1
>     95 percent confidence interval:
>      0.8770113 1.2550998
>     sample estimates:
>     odds ratio
>       1.049119
>
> This runs quickly on my machine.
>
>     > system.time(fisher.test(matrix(c(502,498,490, 510), nrow = 2)))
>        user  system elapsed
>       0.008   0.001   0.010
>
>     > sessionInfo()
>     R version 2.12.2 (2011-02-25)
>     Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
>     locale:
>     [1] en_CA.UTF-8/en_CA.UTF-8/C/C/en_CA.UTF-8/en_CA.UTF-8
>
>     attached base packages:
>     [1] stats graphics grDevices utils datasets methods base
>
>     loaded via a namespace (and not attached):
>     [1] tools_2.12.2
>
> Can you provide an example that is running slowly for you?
>
> Steven McKinney
>
> From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Jim Silverton [jim.silver...@gmail.com]
> Sent: April 8, 2011 9:43 AM
> To: r-help@r-project.org
> Subject: Re: [R] Fast version of Fisher's Exact Test
>
>> Is anyone aware of a fast way of doing Fisher's exact test for a series of 2 x 2 tables in R? The fisher.test is really slow if n1=1000 and n2 = 1000.
>>
>> --
>> Thanks,
>> Jim.
Re: [R] Comparing execution times
Just a comment about your use of foreach:

On Mon, Apr 11, 2011 at 6:29 AM, Alaios ala...@yahoo.com wrote:
[snip]
> C.Case. Foreach is considered to be easier to be applied to manycores.
>
>     foreach (i=1:dimz) %do% {
>       print(sprintf('Creating the %d map',i));
>       Shadowlist[,,i] <- f <- GaussRF(x=x, y=y, model=model, grid=TRUE, param=c(mean,variance,nugget,scale,Whit.alpha))
>     }

You are still running this sequentially. To run in parallel, you need to load the appropriate parallel backend, register it, and use %dopar%:

    library(doMC)
    registerDoMC()   # without a registered backend, %dopar% runs sequentially
    foreach(i=1:dimz) %dopar% {
      ...
    }

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
Re: [R] Mclapply and print statement
Hi,

On Mon, Apr 11, 2011 at 5:26 AM, Alaios ala...@yahoo.com wrote:

> Dear all. I am using the mclapply function to split my code across the many cores my system has. It seems that it is working fine. This is the parallel version of lapply. The only problem that I seem to have is that print cannot print messages. The ideal for me is to have from my function an output of the form
>
>     Shadowlist <- mclapply(1:dimz, function(i) {
>       print(sprintf('Creating the %d map',i));
>       GaussRF(x=x, y=y, model=model, grid=TRUE, param=c(mean,variance,nugget,scale,Whit.alpha))
>     } )
>
>     'I am the processor %d and I work with the task %d',processorid,i
>
> So far I get no output from my print(sprintf(... call. What do you think I should try out?

Use `cat`:

    R> x <- mclapply(1:20, function(i) cat(i, "\n"))
    1 9 17 2 10 18 3 11 19 4 12 20 5 13 6 14 7 15 8 16

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
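A sketch of the "I am the processor ... and I work with the task ..." message using cat() is shown here. It assumes the parallel package that ships with current R (in 2011, mclapply lived in the multicore package), uses the worker's process id from Sys.getpid() as a stand-in for a processor number, and replaces the GaussRF() call with a trivial computation. On Windows, mclapply cannot fork, so the sketch falls back to one core.

```r
library(parallel)

# mclapply relies on forking, which is unavailable on Windows.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

res <- mclapply(1:4, function(i) {
  # cat() writes straight to the console from the worker process.
  cat(sprintf("I am process %d and I work on task %d\n", Sys.getpid(), i))
  i^2   # placeholder for the real per-task computation
}, mc.cores = cores)

print(unlist(res))
```

The order of the messages depends on scheduling and can differ between runs, which matches the interleaved output in the reply above; in some GUI front ends output from forked workers may not appear at all.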
Re: [R] ordered logistic regression - cdplot and polr
On Mon, 11 Apr 2011, Elizabeth Pringle wrote:

> Hi, I have a dataset that I am trying to analyze and plot as an ordered logistic regression (y = ordinal categories 1-3, x = continuous variable with values 3-9).
>
> First is a problem with cdplot: Produces a beautiful plot, with the right trend, but my independent factor values are transformed. The factor has values from 3-9, but the plot produces an x-axis with values from 20-140. When I force the xlim to be 3-9, it produces a plot without the trend, which can't be correct.

You transform the presumably numerical regressor LogAntDensity to a factor. Is that intended? If so, cdplot() is not suitable for visualization as it assumes a numerical x-variable. See ?cdplot. A more suitable visualization may be obtained using spineplot(), which allows both numerical and categorical x-variables. See ?spineplot.

> Second is a problem with polr: The output of the summary command of the model built with polr includes t values for lots (if not all) of my independent factor values, but does not produce a summary of the fit of the model or of the overall fit of the factor.

You could refit the model without the factor and then compare both models using anova().

hth,
Z

> Also, intercepts are different from those produced with a logistic fit in JMP...
>
> Code below, any help much appreciated.
>
> Thanks
> Beth
Re: [R] ordered logistic regression - cdplot and polr
Hi Elizabeth,

On Mon, Apr 11, 2011 at 9:59 AM, Elizabeth Pringle eprin...@stanford.edu wrote:

> Hi, I have a dataset that I am trying to analyze and plot as an ordered logistic regression (y = ordinal categories 1-3, x = continuous variable with values 3-9).
>
> First is a problem with cdplot: Produces a beautiful plot, with the right trend, but my independent factor values are transformed. The factor has values from 3-9, but the plot produces an x-axis with values from 20-140. When I force the xlim to be 3-9, it produces a plot without the trend, which can't be correct.

This is difficult to really help with without some data (we do not have LogAntDensity). Certainly, if the graph shows values from 20 - 140, it makes sense that if you then force the range to be from 3 - 9, you do not see anything. The problem is not the range, it is the data/setup.

> Second is a problem with polr: The output of the summary command of the model built with polr includes t values for lots (if not all) of my independent factor values, but does not produce a summary of the fit of the model or of the overall fit of the factor. Also, intercepts are different from those produced with a logistic fit in JMP...

Does it not output the Residual Deviance and AIC? Those relate to model fit. Two models can be compared using anova(m1, m2), so to compare the overall effect of a factor or multiple factors, just fit and compare two separate models.

> Code below, any help much appreciated. Thanks Beth
>
>     LogAntDensityFactor <- as.factor(LogAntDensity)
>     ###order ordinal variable
>     HammerCatOrd <- ordered(HammerCat)
>     ###set ordered ordinal dependent variable as factor
>     HammerCatOrdFactor <- as.factor(HammerCatOrd)

This is repetitive. ordered() makes a factor, and you could do the same with:

    factor(HammerCat, ordered = TRUE)

Another note/comment: cdplot() and polr() have formula methods and can access data from a data frame elegantly. It would be better to keep all your data bundled together in a data frame than to have different variables in various stages of transformation, but with similar names, floating around. This may not be true, but wildly unexpected values almost sound like a typo may have happened at some point, either in using the name in cdplot or in assigning data to the variable initially.

>     ###density plot with three levels
>     cdplot(HammerCatOrdFactor~LogAntDensityFactor, xlab="Log(Ant Density)", ylab="Latency of response to disturbance (1-3)")

What does str(HammerCatOrdFactor) or summary(HammerCatOrdFactor) (and ditto for LogAntDensityFactor) give? My guess is you will find they are not quite what you thought they were.

>     require(MASS)
>     logordered <- polr(HammerCatOrdFactor~LogAntDensityFactor, Hess=TRUE)

Side note: why is LogAntDensity a factor? Or do you mean factor in a vernacular sense, not in a technical is.factor(LogAntDensityFactor) sense? If LogAntDensityFactor is your only other term in the model, an example comparison could be:

    lognull <- polr(HammerCatOrdFactor ~ 1, Hess=TRUE)
    logordered <- polr(HammerCatOrdFactor ~ LogAntDensityFactor, Hess=TRUE)
    anova(lognull, logordered)

Cheers,
Josh

>     summary(logordered, digits=3)
>
> [[alternative HTML version deleted]]

Plain text emails are preferred.

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
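Putting the suggestions from this thread together, a self-contained sketch might look like the one below. The data are simulated (Beth's data were not posted), the variable names are reused from her code only for readability, and the predictor is kept numeric rather than converted to a factor, which is the assumption that makes cdplot() show the original 3-9 scale.

```r
library(MASS)   # for polr()

set.seed(1)
n <- 200

# Simulated stand-in for the real data, bundled in one data frame.
d <- data.frame(LogAntDensity = runif(n, 3, 9))       # numeric predictor
latent <- 0.8 * d$LogAntDensity + rlogis(n)           # hypothetical latent scale
d$HammerCat <- cut(latent, breaks = c(-Inf, 4, 6, Inf),
                   labels = 1:3, ordered_result = TRUE)

# Conditional density plot: numeric x, ordered factor y.
cdplot(HammerCat ~ LogAntDensity, data = d,
       xlab = "Log(Ant Density)",
       ylab = "Latency of response to disturbance (1-3)")

# Null model versus model with the predictor, compared by likelihood ratio.
lognull    <- polr(HammerCat ~ 1, data = d, Hess = TRUE)
logordered <- polr(HammerCat ~ LogAntDensity, data = d, Hess = TRUE)
print(summary(logordered, digits = 3))
print(anova(lognull, logordered))
```

With a numeric predictor, summary() reports a single slope for LogAntDensity instead of one coefficient per distinct value, and the anova() table gives the overall test of that term.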
[R] good examples of plot(table())
I am looking for good examples of visualising a tabulation using plot(table()), maybe with colour coding or indexing.

Dirk

--
View this message in context: http://r.789695.n4.nabble.com/good-examples-of-plot-table-tp3442131p3442131.html
Re: [R] Geographic distance between lat-long points in R?
I found something here: http://www.biostat.umn.edu/~sudiptob/Software/distonearth.R

    # The following program computes the distance on the surface of the earth
    # between two points point1 and point2. Both the points are of the form
    # (Longitude, Latitude)
    geodetic.distance <- function(point1, point2)
    {
      R <- 6371
      p1rad <- point1 * pi/180
      p2rad <- point2 * pi/180
      d <- sin(p1rad[2])*sin(p2rad[2])+cos(p1rad[2])*cos(p2rad[2])*cos(abs(p1rad[1]-p2rad[1]))
      d <- acos(d)
      R*d
    }

Dirk

--
View this message in context: http://r.789695.n4.nabble.com/Geographic-distance-between-lat-long-points-in-R-tp3442338p3442355.html
[R] read in summarised data as table()
I have some summarised data from a 2D pivot table which I want to visualise in R. How can I read in the data as an R table so I can use mosaicplot()?

Dirk

--
View this message in context: http://r.789695.n4.nabble.com/read-in-summarised-data-as-table-tp3442283p3442283.html
Re: [R] Fast version of Fisher's Exact Test
Hi,

On Fri, Apr 8, 2011 at 1:52 PM, Bert Gunter gunter.ber...@gene.com wrote:

> 1. I am not an expert on this.

Definitely me neither, but:

> 2. However, my strong prior would be no, since because it is exact it has to calculate all the possible configurations and there are a lot to calculate with the values of n1 and n2 you gave.

But there are situations where one could get away with an approximation given large enough samples (i.e. numbers in the contingency table), no?

For instance, my wikipedia-certified statistics course suggests that with large N, a chisq.test should give a decent approximation to the p-value. You can play with that as you like.

Also, the function sage.test in the sagenhaft package uses a binomial approximation to the Fisher Exact test. A slight modification from its examples:

    R> library(sagenhaft)
    R> s <- sage.test(c(0,5,10), c(0,30,50), n1=10000, n2=15000)
    ## And the fisher.exact equivalents:
    R> M <- list(matrix(c(0,0,10000-0,15000-0),2,2),
                 matrix(c(5,30,10000-5,15000-30),2,2),
                 matrix(c(10,50,10000-10,15000-50),2,2))
    R> m <- sapply(M, function(m) fisher.test(m)$p.value)
    ## How close are they to each other?
    R> s - m
    [1] 0.00e+00 1.110054e-05 2.916176e-06

You can find the package here: http://www.bioconductor.org/packages/release/bioc/html/sagenhaft.html

I guess you (Jim) can judge if it's (i) faster and (ii) appropriate to use in your scenario.

Enjoy,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
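If only the p-value is needed for each 2 x 2 table, one possible shortcut is to compute it directly from the hypergeometric distribution and skip the odds-ratio estimate and confidence interval that fisher.test() also produces. The sketch below is an illustration under that assumption; `fisher_p_2x2()` is a hypothetical helper, and whether it is actually faster for a given series of tables would have to be timed.

```r
# Two-sided Fisher exact p-value for the table
#   a b
#   c d
# Sums the probabilities of all tables with the same margins that are
# no more likely than the observed one.
fisher_p_2x2 <- function(a, b, c, d) {
  m <- a + c                           # first column total
  n <- b + d                           # second column total
  k <- a + b                           # first row total
  support <- max(0, k - n):min(k, m)   # possible values of the top-left cell
  p     <- dhyper(support, m, n, k)
  p_obs <- dhyper(a, m, n, k)
  sum(p[p <= p_obs * (1 + 1e-7)])      # small tolerance for ties
}

# A series of tables, one per row, in the order a, b, c, d.
tables <- rbind(c(502, 490, 498, 510),
                c(5, 30, 9995, 14970))
pvals <- apply(tables, 1, function(r) fisher_p_2x2(r[1], r[2], r[3], r[4]))
print(pvals)

# Cross-check against fisher.test() for the first table.
print(fisher.test(matrix(c(502, 498, 490, 510), nrow = 2))$p.value)
```

The first table is the one from Steven McKinney's example in this thread, so the cross-check should agree with the p-value he reported.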
Re: [R] Geographic distance between lat-long points in R?
> I have a bunch of geographic locations specified by lat-long coordinates. What's an easy way to calculate geographic distance between any two points? OR, perhaps there is a function for calculating a distance matrix for K sites?

A comparison of some geographic distance calculations is provided at http://pineda-krch.com/2010/11/23/great-circle-distance-calculations-in-r/ , along with code for calculating the Vincenty inverse formula, which relies on the WGS-84 ellipsoid approximations. The author compares the results to fields::rdist.earth, which seems to rely on a spherical model of the earth. It would be interesting to compare it to other distance functions as well.

I found that the function provided at the above URL did not handle the case of coincident points. Adding the following line after the while loop fixed this.

    if (iterLimit==100) return(0) # formula began with nearly or exactly coincident points

Enjoy the days,
cur

--
Curt Seeliger, Data Ranger
Raytheon Information Services - Contractor to ORD
seeliger.c...@epa.gov
541/754-4638
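For the distance-matrix part of the original question, a spherical-earth sketch in base R is given below. It uses the haversine formula rather than the Vincenty formula discussed above, so it ignores the ellipsoid; the function name `haversine_matrix()`, the radius of 6371 km and the example coordinates are assumptions for illustration.

```r
# Pairwise great-circle distances (km) between K sites on a sphere.
# lon and lat are numeric vectors in decimal degrees.
haversine_matrix <- function(lon, lat, radius = 6371) {
  lon <- lon * pi / 180
  lat <- lat * pi / 180
  dlat <- outer(lat, lat, "-")
  dlon <- outer(lon, lon, "-")
  a <- sin(dlat / 2)^2 + outer(cos(lat), cos(lat)) * sin(dlon / 2)^2
  # pmin() guards against rounding slightly above 1; coincident points give 0.
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Hypothetical sites: two on the equator 90 degrees apart, one duplicate, one pole.
lon <- c(0, 90, 0, 0)
lat <- c(0, 0, 0, 90)
dmat <- haversine_matrix(lon, lat)
print(round(dmat, 1))
```

Unlike the acos-based formula, the haversine form stays numerically stable for very short distances, including coincident points.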
[R] RExcel
Hi,

I am installing RExcel using the package RExcelInstaller. When I tried to run installRExcel() I got this error message:

"You don not have the R package rcom installed. The (D)COM server installed which will aloow you to use the background server in RExcel. Since rcom is not installed, foreground mode will be unavailable. You may continue with the installation, but in most circumstances you probably should cancel current installation, install the package rcom properly (do not forget to run the commands library(rcom) comRegisterRegistry() immediately after installation) and after that run this installer once again"

But the rcom package was installed without any problem; somehow the installer keeps saying that rcom is not installed. Any suggestions?

Thanks
John

    > sessionInfo()
    R version 2.12.2 (2011-02-25)
    Platform: i386-pc-mingw32/i386 (32-bit)

    locale:
    [1] LC_COLLATE=English_United States.1252
    [2] LC_CTYPE=English_United States.1252
    [3] LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C
    [5] LC_TIME=English_United States.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] RExcelInstaller_3.1-13 rcom_2.2-3.1 rscproxy_1.3-1

    loaded via a namespace (and not attached):
    [1] tools_2.12.2
Re: [R] read in summarised data as table()
I assume that you would use 'read.csv' if you are getting output from Excel. Since we have no idea of what your data looks like, it is hard to tell. At least post an example of your data and what you are expecting as output from mosaicplot using the data.

On Mon, Apr 11, 2011 at 11:20 AM, dirknbr dirk...@gmail.com wrote:

> I have some summarised data from a 2D pivot table which I want to visualise in R. How can I read in the data as a R table so I can use mosaicplot()?
>
> Dirk
>
> --
> View this message in context: http://r.789695.n4.nabble.com/read-in-summarised-data-as-table-tp3442283p3442283.html

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
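Since no data were posted, the read.csv route can only be sketched on made-up input. The pivot table below (rows = group, columns = level) and all of its names and counts are hypothetical; with a real export, `text = pivot_csv` would be replaced by the file name.

```r
# Hypothetical 2D pivot table as it might be exported from a spreadsheet:
# first column holds the row labels, remaining columns hold the counts.
pivot_csv <- "group,low,mid,high
A,10,20,5
B,7,12,30"

# Use the first column as row names so only the counts remain as data.
df <- read.csv(text = pivot_csv, row.names = 1)

# Convert the counts to a two-way table and name its dimensions.
tab <- as.table(as.matrix(df))
names(dimnames(tab)) <- c("group", "level")
print(tab)

# The table can now go to mosaicplot() or plot().
mosaicplot(tab, color = TRUE, main = "Summarised counts")
```

Because the object is a genuine table, plot(tab) dispatches to the same mosaic display.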
[R] Question about GAM (mgcv)
Dear list, i'm using the GAM function from mgcv package. I'm using this syntax: model=gam(y~offset(x)+s(log1p(x1))+s(log1p(x2))+s(x3)+s(x4)+s(5),family=quasipoisson,data=data) and I'm sequentially dropping the single term with the highest non-significant p-value from the model and re-fitting until all term are significant. Now I have: model=gam(y~offset(x)+s(log1p(x1))+s(log1p(x2))+s(x5),family=quasipoisson,data=data) summary(model) Approximate significance of smooth terms: edf Ref.df F p-value s(log1p(x1)) 1.000 1.00 36.984 8.09e-08 *** s(log1p(x2)) 13.174 13.84 5.767 5.66e-07 *** s(x5) 8.807 8.98 3.600 0.00118 ** My question is: may I increase the k parameter for the variable x1 to avoid the 1 edf and the linear relationship in the plot result. I tried: model=gam(y~offset(x)+s(log1p(x1)*,* k=15)+s(log1p(x2))+s(x5),family=quasipoisson,data=data) and all variables still significant and I have a edf higher than 1 and a non-linear relationship in the plot result. If I increase the k parameter for one variable, should I increase it for the other variables?? Does the increase (or decrease) of the k parameter changes the interpretation of the results? I'm not sure to understand when I should change or not the k parameter... and of course I read the help page choose.k {mgcv}. Thanks a lot in advance Sam [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Regression model with proportional dependent variable
Hello, dear experts. I don't have much experience in building regression models, so sorry if this is too simple and not very interesting question. Currently I'm working on the model that have to predict proportion of the debt returned by the debtor in some period of time. So the dependent variable can be any number between 0 and 1 with very high probability of 0 (if there are no payment) and if there are some payments it can very likely be 1 (all debt paid) although can be any number from 0 to 1. Not having much knowledge in this area I can't think about any appropriate model and wasn't able to find much on the Internet. Can anyone give me some ideas about possible models, any information on-line and some R functions and packages that can implement it. Thank you in advance for any help. Ihor. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] nndist R vs. ArcGIS
Can anyone tell me why I would get different average nearest neighbor values for the same set of coordinates between ArcGIS 10 and R? Sometimes the difference in distance is over 1.3 km. Alexis
[R] forest + igraph ?
Hello, Is it possible to have two meta-plots in one graph (not par(mfrow=c(2,1))? But somthing like library(metafor) library(igraph) if (interactive()) { forest(dat.Treat$RR, ci.lb=dat.Treat$lower, ci.ub=dat.Treat$upper, xlab=Relative Risk,slab=dat.Treat$ID,refline=1) forest(dat.Control$RR, ci.lb=dat.Control$lower, ci.ub=dat.Control$upper, xlab=Relative Risk,slab=dat.Control$ID,refline=1) } i.e. both metaplots on the same graph! Regards, Samor [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Override col.lines and col.symbol in panel.xyplot with type='b'
Dear useRs,

I have a longitudinal experiment with several treatment groups, ~20 subjects per group, ~6 timepoints and a continuous dependent variable. I have been successfully using lattice::xyplot with this data. However, I have been stumped by a particular application of it.

I would like to use xyplot on my data, broken into treatment groups with the groups argument, using type='b' to show subjectwise longitudinal data. So far so good, I have done this many times. But now I wish to show the same data with the color of the lines and symbols overridden in some arbitrary way, yet without changing anything else about the plot, in particular the structure/topology of the plot that comes from using the groups argument and type='b'.

This requires using a panel function of some sort. I have come to think I will need to use a function with a 'subscripts' argument as the panel function, which then itself calls panel.xyplot() and uses its arguments col.line and col.symbol. The closest example I could find is on page 73 of Sarkar's UseR! Lattice book, where the subscripts argument indexed the data within each grouped subplot, and it was used as an index for a user-generated vector of colors. This seems like what I want to do. But I could not get this paradigm to work.

Here is a simple example using trivial data (R version 2.12.2 Patched (2011-03-18 r55383) on i386-pc-mingw32, with only the default packages attached before lattice is loaded):

    require(lattice)
    set.seed(388659262)
    dat <- data.frame(Panel=rep(c('A','B'), each=4),
                      ID=factor(rep(letters[1:4], each=2)),
                      X=rep(c(0,1), times=4),
                      Y=runif(8)
    )
    # now for the arbitrary colors.
    # Let's highlight one subject red, the rest black
    dat$Color <- with(dat, ifelse(Panel=='A' & ID == 'a', 2, 1))
    dat
      Panel ID X         Y Color
    1     A  a 0 0.1138821     2
    2     A  a 1 0.7361403     2
    3     A  b 0 0.3304683     1
    4     A  b 1 0.5866701     1
    5     B  c 0 0.8819857     1
    6     B  c 1 0.7329025     1
    7     B  d 0 0.5000357     1
    8     B  d 1 0.6365438     1

    # The following standard plot is fine.
    # Each subject is colored differently,
    # I believe recycling through the colors from either
    # trellis.par.get('superpose.symbol')$col or
    # trellis.par.get('superpose.line')$col,
    # but by default they are the same anyway
    xyplot(Y ~ X | Panel, data=dat, groups=ID, type='b',
           scales=list(x=list(at=c(0,1),labels=c(0,1))))

    # But for example, this following attempt to grab the
    # corresponding values of dat$Color does not have
    # my intended effect. There are now three subjects
    # plotted per group, each group's line colors are the same,
    # and the symbol colors are nearly the same as the line colors
    # but not exactly!
    xyplot(Y ~ X | Panel, data=dat, groups=ID, type='b',
           scales=list(x=list(at=c(0,1),labels=c(0,1))),
           panel=function(..., groups, subscripts)
               panel.xyplot(..., col.symbol=dat$Color[subscripts],
                            col.line=dat$Color[subscripts])
    )

At one point in my efforts I was actually able to get the symbol colors correct, but the line colors were (to me) incomprehensibly wrong. But alas I have not been able to reproduce that to show here.

\begin{ignorant speculation alert} I suspect that having (in the example) 8 points but only 4 lines causes undesired recycling somewhere. \end{speculation}

Any assistance as to how to properly use the panel functions (or any other approach short of abandoning lattice
Re: [R] Partial italic in graph titles when looping
Follow-up question: I want to make the gene name bold and italic, AND make the p number just bold. But here's the catch: now I want the p number to appear as a superscript! For instance: TFL1^687 (the carrot is to indicate that I actually want the p number as a superscript). Thanks very much in advance! Sincerely, Josh Banta From: David Winsemius dwinsem...@comcast.net Sent: Sat, February 19, 2011 10:24:03 PM Subject: Re: [R] Partial italic in graph titles when looping On Feb 19, 2011, at 8:52 PM, Josh B wrote: Follow-up question: how would I make the gene name italic AND bold, and how would I make the p and the number just bold? Could also work inside teh .() function for (i in 1:nrow(x)){ plot(z - sort(rnorm(47)), type = s, main = ) points(z, cex = .5, col = dark red) title(main = bquote(italic(.(x[i,1]))*bold( p)*bold(.(as.character(x[i,2]) } From: David Winsemius dwinsem...@comcast.net Cc: R Help r-help@r-project.org Sent: Sat, February 19, 2011 8:33:33 PM Subject: Re: [R]Partial italic in graph titles when looping On Feb 19, 2011, at 7:41 PM, Josh B wrote: Dear all, I have a rather complicated problem. I am trying to loop through making graphs, so that the graph-making process is fully automated. For each graph, I'd like to make sure the corresponding title is formatted properly. The titles will be a combination of a gene name and numerical position within the gene. The gene name should be italic-bold, whereas the gene position should be just bold. 
Consider the following: x - read.table(textConnection(gene position FLC 3312 TFL1 687 GA1 1127), header = TRUE, as.is = TRUE) closeAllConnections() Now this, below, is essentially how I am automating the graph-making (imagine these graphs contain some sort of real data): par(mfrow = c(3,1)) for (i in 1:nrow(x)){ plot(z - sort(rnorm(47)), type = s, main = ) points(z, cex = .5, col = dark red) title(main = paste(x[i,1], p, x[i,2], sep = )) } Or perhaps (with a shuffling of the parens): for (i in 1:nrow(x)){ plot(z - sort(rnorm(47)), type = s, main = ) points(z, cex = .5, col = dark red) title(main = bquote(italic(.(x[i,1]))* p*.(x[i,2]))) } The graphs produced by this method are almost perfect, except that the gene names are not italicized (they SHOULD be). So, once again, the big question is: how would I italicize the gene names but NOT the gene positions, when looping through to make these graphs and graph titles? If I WASN'T looping to make my graph titles, I could write: title(main = expression(paste(bolditalic(FLC), bold(p3312), sep = ))) ...but I can't do that, because I'm looping (or can I?) [[elided Yahoo spam]] --- Josh Banta, Ph.D Center for Genomics and Systems Biology New York University 100 Washington Square East New York, NY 10003 Tel: (212) 998-8465 http://plantevolutionaryecology.org [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT David Winsemius, MD West Hartford, CT [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
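For the superscript follow-up, one possible sketch builds on the bquote() approach from earlier in the thread, assuming that plotmath's `^` operator and bolditalic() give the desired look. The helper name make_title is invented here for illustration, and the "p" prefix is left out because the example TFL1^687 does not show one.

```r
# Gene names and positions from the thread's example data
x <- data.frame(gene = c("FLC", "TFL1", "GA1"),
                position = c(3312, 687, 1127),
                stringsAsFactors = FALSE)

# Hypothetical helper: bold-italic gene name with the position as a bold superscript
make_title <- function(gene, position) {
  bquote(bolditalic(.(gene))^bold(.(as.character(position))))
}

op <- par(mfrow = c(3, 1))
for (i in seq_len(nrow(x))) {
  z <- sort(rnorm(47))
  plot(z, type = "s", main = "")
  points(z, cex = 0.5, col = "darkred")
  title(main = make_title(x$gene[i], x$position[i]))
}
par(op)  # restore the previous layout
```

The position is converted with as.character() so that bold() receives a string, as in the earlier reply.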
[R] Zoom on simple.violinplot
Hello, I am using the function simple.violinplot from the package UsingR. I have some outliers in my dataset so that the distribution has very long tails. As a result, the y-axis of the output of simple.violinplot extends to very large values. I would like to zoom on the y-axis with a command such as ylim=c(a,b), as in boxplot(x,ylim=c(a,b)). However, doing simple.violinplot(x,ylim=c(a,b)) does not work. Is there any way out?
[R] Problem with facet_grid in ggplot2
Hi all, I am practising a bit with ggplot2 but I have a problem when I try to use facet_grid. The following code:- p - ggplot(diamonds, aes(carat, ..density..)) + + geom_histogram(binwidth = 1) p + facet_grid(cut ~ clarity, margins=TRUE) produce the following error:- Error in class(output[[var]]) - class(value) : cannot set class to array unless the dimension attribute has length 0 I have lifted this code directly from the ggplot2 documentation! By a process of elimination it seems that the problem arises from the use of margins. I do not know why the dimension attribute is conflicting with margins unless it is something to do with the properties of the data frame. I am running R 2.12.1 Has the source code changed? Very grateful for any help Simon Hayward __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Time series of spatial data
Hi, You could try spacetime: http://cran.r-project.org/web/packages/spacetime/ Cheers. Oscar. - Oscar Perpiñán Lamigueiro Dpto. de Ingeniería Eléctrica EUITI-UPM http://procomun.wordpress.com --- En Thu, 7 Apr 2011 03:38:12 -0700 (PDT) idham idhamkha...@gmail.com escribió: Hi guys, I'm really new in R. Trying to analyze series of spatial datasets (365 satellite images) in order to find the best model that fit the data. Any suggestion which package that could help me? Thanks in advance. Cheers __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Adding margin text to lattice graphics
Hi, You can try the combination of c.trellis and update from the latticeExtra package. For example: p - xyplot(1~1) update(c(p, p, p, p), xlab='SomeText', ylab='MoreText') update(c(p, p, p, p), xlab=c('SomeText', 'SomeText2'), ylab=c('MoreText', 'MoreText2')) There are lots of examples in help(c.trellis). Cheers. Oscar. - Oscar Perpiñán Lamigueiro Dpto. de Ingeniería Eléctrica EUITI-UPM http://procomun.wordpress.com --- En Sat, 9 Apr 2011 18:33:42 -0700 Dennis Fisher fis...@plessthan.com escribió: Colleagues I am learning lattice graphics (R 2.12.2; OS X). Several days ago, I inquired about adding margin text to lattice graphics. Jim Price offered a useful reply, suggesting that I add: page = function(page) grid.text('words', x = 0.5, y = 0.01) to my call to the function. The entire function that he suggested was; xyplot(1 ~ 1, par.settings = list(layout.heights = list(bottom.padding = 10)), page = function(page) grid.text('words', x = 0.5, y = 0.01)) That worked initially and I also had success with panel.text. However, I am now working with more complicated objects in which more than one image is displayed on a page. In this instance, the text added by the command above appears with each image. I would like it to appear only once, scaled across the entire page, not relative to a single panel. Is there a different command that accomplishes my goal? Or a different implementation of this same command? Any help would be greatly appreciated. Also, because of my naivete with lattice graphics, I may be asking the question in entirely the wrong way -- please feel free to redirect me. Dennis Dennis Fisher MD P (The P Less Than Company) Phone: 1-866-PLessThan (1-866-753-7784) Fax: 1-866-PLessThan (1-866-753-7784) www.PLessThan.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
[R] Fwd: CRAN problem with plyr-1.4.1
It looks like there might be some kind of problem with the Plyr-1.4.1 packages pushed to CRAN? The web pages show 1.4.1 as the current version, but trying to fetch the source through the provided link gives a 404: http://lib.stat.cmu.edu/R/CRAN/web/packages/plyr/index.html $ wget http://lib.stat.cmu.edu/R/CRAN/src/contrib/plyr_1.4.1.tar.gz --2011-04-11 13:19:09-- http://lib.stat.cmu.edu/R/CRAN/src/contrib/plyr_1.4.1.tar.gz Resolving lib.stat.cmu.edu... 128.2.241.212 Connecting to lib.stat.cmu.edu|128.2.241.212|:80... connected. HTTP request sent, awaiting response... 404 Not Found 2011-04-11 13:19:09 ERROR 404: Not Found. This prevented me from installing ggplot2 until I went back and found an old version (1.4) of Plyr to install manually. Since it looks like Plyr was *just* updated a few days ago, I'm guessing something went awry? I checked several CRAN mirrors and got the same problem with all of them. They think the current version is 1.4.1, but they don't have any files available for download. Hope this helps, Ian [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RExcel
It is asking the obvious, but did you run the commands from the rcom package after installation, that is, library(rcom) and comRegisterRegistry(), as the installer message asks?

-- Jonathan P. Daily, Technician - USGS Leetown Science Center, 11649 Leetown Road, Kearneysville WV, 25430, (304) 724-4480

r-help-boun...@r-project.org wrote on 04/11/2011 02:08:02 PM:
> Hi, I am installing Excel using package RExcelInstaller. When I tried to run installRExcel() I got this error message:
> You don not have the R package rcom installed. The (D)COM server installed which will aloow you to use the background server in RExcel. Since rcom is not installed, foreground mode will be unavailable. You may continue with the installation, but in most circumstances you probably should cancel current installation, install the package rcom properly (do not forget to run the commands library(rcom) comRegisterRegistry() immediately after installation) and after that run this installer once again
> But rcom package was installed without any problem, somehow the installer keeps saying that rcom is not installed. Any suggestions?
> Thanks John
Re: [R] nndist R vs. ArcGIS
Alexis wrote:
> Can anyone tell me why I would get different average nearest neighbor values for the same set of coordinates between ArcGIS 10 and R? Sometimes the difference in distance is over 1.3 km.

spatstat::nndist calculates Euclidean distances rather than distances along the earth's surface, which is probably what you're getting from ArcGIS. A very short example illustrating the problem would help in determining whether I'm at all right; on the other hand, I'm glad you're testing your results. You might peek at the current thread "Geographic distance between lat-long points in R" for more information.

cur
-- Curt Seeliger, Data Ranger, Raytheon Information Services - Contractor to ORD, seeliger.c...@epa.gov, 541/754-4638
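To illustrate how much the two notions of distance can differ, here is a small base-R sketch. It is only one possible explanation of the discrepancy, and everything in it is an assumption for illustration: the two points are invented, haversine_km is a helper written here (not part of spatstat), and the earth is treated as a sphere of radius 6371 km.

```r
# Great-circle distance in km on a sphere (haversine formula)
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Two hypothetical points at 45 N, one degree of longitude apart
lon <- c(-123, -122)
lat <- c(45, 45)

# Planar (Euclidean) distance on the raw coordinates is in degrees, not km
planar_deg <- sqrt(diff(lon)^2 + diff(lat)^2)

# Naive conversion that treats every degree as about 111.2 km
planar_km <- planar_deg * 111.2

# Distance along the sphere's surface
sphere_km <- haversine_km(lon[1], lat[1], lon[2], lat[2])

c(planar_km = planar_km, sphere_km = sphere_km)
```

If the coordinates passed to nndist are longitude and latitude in degrees, a gap of this kind would be expected; with projected coordinates in metres the Euclidean result should be much closer.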
Re: [R] Fwd: CRAN problem with plyr-1.4.1
The first thing to do is try another mirror. The official (or as official as we ever get about anything) U.S. mirror is http://cran.us.R-project.org They tend to be very good about updating. Presently the source package for plyr is at version 1.5 and the binary versions are both at 1.4.1 On Mon, Apr 11, 2011 at 1:27 PM, Ian Davis ian.w.da...@gmail.com wrote: It looks like there might be some kind of problem with the Plyr-1.4.1 packages pushed to CRAN? The web pages show 1.4.1 as the current version, but trying to fetch the source through the provided link gives a 404: http://lib.stat.cmu.edu/R/CRAN/web/packages/plyr/index.html $ wget http://lib.stat.cmu.edu/R/CRAN/src/contrib/plyr_1.4.1.tar.gz --2011-04-11 13:19:09-- http://lib.stat.cmu.edu/R/CRAN/src/contrib/plyr_1.4.1.tar.gz Resolving lib.stat.cmu.edu... 128.2.241.212 Connecting to lib.stat.cmu.edu|128.2.241.212|:80... connected. HTTP request sent, awaiting response... 404 Not Found 2011-04-11 13:19:09 ERROR 404: Not Found. This prevented me from installing ggplot2 until I went back and found an old version (1.4) of Plyr to install manually. Since it looks like Plyr was *just* updated a few days ago, I'm guessing something went awry? I checked several CRAN mirrors and got the same problem with all of them. They think the current version is 1.4.1, but they don't have any files available for download. Hope this helps, Ian [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Polar Plots
On 2011-04-11 05:38, ogbos okike wrote: Dear List, Following the link below ( http://rgm2.lab.nig.ac.jp/RGM2/func.php?rd_id=plotrix:clock24.plot) I got an interesting polar plots which displayed my data and the time of observation. Thank you very much for providing such details. However, I have two set of data which I wish to display in the same polar plot. I tried using points to add the second data but could not succeed. That is, after the running the first code: clock24.plot(a,b,main=Test Clock24 (lines),show.grid=FALSE, line.col=green,lwd=3) if(dev.interactive()) par(ask=TRUE) # now do a 'daylight' plot clock24.plot(a,b, main=Test Clock24 daytime (symbols), point.col=blue,rp.type=s,lwd=3) # reset the margins par(mar=c(5,4,4,2)) I tried to add the second using: points(aa,bb,col=blue) Error in xy.coords(x, y) : (list) object cannot be coerced to type 'double' points(add = TRUE,a,b,col=blue) Error in xy.coords(x, y) : (list) object cannot be coerced to type 'double' Have you made sure that your points fit on the display? The following works for me (note: I'm using Jim Lemon's well-known penchant for eschewing the spacebar): testlen-rnorm(24)*2+5 testpos-0:23+rnorm(24)/4 clock24.plot(testlen,testpos,show.grid=FALSE,line.col=3) clock24.plot(testlen[7:19],testpos[7:19],point.col=4,rp.type=s,point.symbol=16,cex=3,add=TRUE) par('usr') #[1] -8.786689 8.786689 -8.786689 8.786689 points(3,4,pch=19,col=2,cex=3) Peter Ehlers Any further help will be much appreciated. Best regards Ogbos [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How can I include a new book on the
It's there. Thank you Ben and also Kurt! Best, Marcio Em 4/7/2011 10:19 AM, Marcio Pupin Mello escreveu: Thanks Ben! I will! Em 4/7/2011 8:32 AM, Ben Bolker escreveu: Marcio Pupin Mellomelloat ieee.org writes: I've just published a new book for R beginners in Portuguese: Conhecendo o R: uma visão estatística (something like Knowing R: an statistical approach). I'd like to include it on the list Books at R-project.org. How can I do it? More informations about the book at http://www.editoraufv.com.br/produtos/conhecendo-o-r I think you should try contacting Kurt Hornik at r-project.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Marcio Pupin Mello Survey Engineer Ph.D student in Remote Sensing National Institute for Space Research (INPE) - Brazil Laboratory of Remote Sensing in Agriculture and Forestry (LAF) www.dsr.inpe.br/~mello __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] nndist R vs. ArcGIS
On Mon, Apr 11, 2011 at 4:49 PM, smoluka smol...@geo.oregonstate.edu wrote: Can anyone tell me why I would get different average nearest neighbor values for the same set of coordinates between ArcGIS 10 and R? Sometimes the difference in distance is over 1.3 km. Edge correction? In a spatial point pattern, points near the boundary of your window are less likely to have a near neighbour because only some of the surrounding space can possibly have points. I think functions in spatstat will correct for this. Make a simple test example and tell us what functions you are using. And also try the r-sig-geo mailing list for this sort of thing. Barry
[R] Getting many substrings but only loading the original string one time.
Hi All,

I'm looking for a way to get many substrings from a longer string and then stitch them together. But, since the longer string is really, really long (like 250 MB long), I don't want to do this in a loop and load and re-load the longer string many times. Does anybody have an idea? Maybe I could pass in two vectors (the first would have the starting coordinates, and the second would have the stopping coordinates), so it would be like a vectorized version of substr, where start and stop would be vectors instead of single integers.

Example (I'm reducing the size of the string for the example) of how this might work:

    longerString <- 'HelloThisIsMyLongerString'
    startVector <- c(2,6,4)
    stopVector <- c(4,10,5)
    substrings <- vectorized_substr(longerString, startVector, stopVector)
    substrings
    [1] "ell"   "ThisI" "lo"

Then I'd like to concatenate them (there will be many of them):

    result <- paste(substrings, collapse='')
    result
    [1] "ellThisIlo"

(perhaps the paste command as I've done it is the best way, but depending on how the substrings are reported there may be different ways).

Thanks!
Jonathan
Re: [R] Getting many substrings but only loading the original string one time.
On 11/04/2011 3:48 PM, Jonathan wrote:
> Hi All, I'm looking for a way to get many substrings from a longer string and then stitch them together. But, since the longer string is really, really long (like 250 MB long), I don't want to do this in a loop and load and re-load the longer string many times. Does anybody have an idea? Maybe I could pass in two vectors (the first would have the starting coordinates, and the second would have the stopping coordinates), so it would be like a vectorized version of substr, where start and stop would be vectors instead of single integers.
> Example (I'm reducing the size of the string for the example) of how this might work:
>     longerString <- 'HelloThisIsMyLongerString'
>     startVector <- c(2,6,4)
>     stopVector <- c(4,10,5)
>     substrings <- vectorized_substr(longerString, startVector, stopVector)
>     [1] "ell"   "ThisI" "lo"

Use substring(), not substr(). It is vectorized:

    substring(longerString, startVector, stopVector)
    [1] "ell"   "ThisI" "lo"

It does this by replicating the longerString, but that doesn't mean actual copies are made: just multiple pointers to the same big one.

Duncan Murdoch

> Then I'd like to concatenate them (there will be many of them)
>     result <- paste(substrings, collapse='')
>     [1] "ellThisIlo"
> (perhaps the paste command as I've done it is the best way, but depending on how the substrings are reported there may be different ways). Thanks! Jonathan
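Putting the suggestion together with the concatenation step from the question, a short self-contained sketch (using the small example string from the thread rather than a 250 MB one) might look like this:

```r
longerString <- "HelloThisIsMyLongerString"
startVector <- c(2, 6, 4)
stopVector <- c(4, 10, 5)

# substring() is vectorized over its first and last arguments
substrings <- substring(longerString, startVector, stopVector)

# Stitch the pieces together in the order given
result <- paste(substrings, collapse = "")

substrings
result
```

Here substrings holds "ell", "ThisI" and "lo", and result is "ellThisIlo". Whether this is fast enough for a very long string and many coordinate pairs is something to measure on the real data.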
[R] Plotting a quadratic line on top of an xy scatterplot
Dear Listserv,

Here is my latest in a series of simple-seeming questions that dog me. Consider the following data:

    x <- read.table(textConnection("temperature probability
    0.11 9.4
    0 2.3
    0.38 8.7
    0.43 9.2
    0.6 15.6
    0.47 8.7
    0.09 12.8
    0.11 9.4
    0.01 7.7
    0.83 8
    0.65 9.3
    0.05 7.4
    0.34 10.1
    0.02 4.8
    0.07 9.1
    0.6 15.6
    0.01 8.4
    0.9 9.6
    0.83 8
    0.12 8.4
    0.01 8
    0 5
    0.11 9.7
    0.41 7.4
    0.05 9.4
    0.09 8.3
    0 6.1
    0.12 8.4
    0.73 7.8
    0 4.2"), header = TRUE, as.is = TRUE)
    closeAllConnections()

I modeled the relationship: Probability = f(Temperature), i.e., probability as a function of temperature. I found that there is a significant quadratic term in the model:

    summary(lm(x[,2] ~ x[,1] + I(x[,1]^2)))

Now the question is: how do I plot it? I can do this:

    plot(x[,2] ~ x[,1])

...but I would also like to add a line corresponding to the quadratic function. In other words, I want to visually show the relationship among the variables that is being modeled. How do I do it? I think the curve() command will be used, but I don't know how to employ it.

Thanks very much in advance.

Sincerely,
--- Josh Banta, Ph.D, Center for Genomics and Systems Biology, New York University, 100 Washington Square East, New York, NY 10003, Tel: (212) 998-8465, http://plantevolutionaryecology.org
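One way curve() might be employed here is to fit the model with named variables and a data argument, so that predict() can be evaluated on new x values. The sketch below uses simulated data with made-up coefficients rather than the data from the post; the names d, x, y and fit are chosen only for this illustration.

```r
set.seed(123)

# Simulated predictor and response with a quadratic relationship (illustrative only)
d <- data.frame(x = runif(30, min = 2, max = 16))
d$y <- 0.05 + 0.12 * d$x - 0.005 * d$x^2 + rnorm(30, sd = 0.05)

# Named variables plus data= let predict() accept new x values later
fit <- lm(y ~ x + I(x^2), data = d)

plot(y ~ x, data = d)

# curve() supplies a grid of x values; predict() turns them into fitted values
curve(predict(fit, newdata = data.frame(x = x)), add = TRUE, col = "red", lwd = 2)
```

A model written as lm(x[,2] ~ x[,1] + I(x[,1]^2)) cannot be used this way, because predict() has no variable names to match against newdata.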
[R] Help on calculating a variable using random numbers
I'm new to R, but I'm trying to write a program for a dissertation that generates a dataset as follows...

    subject=1:1000
    treat=rbinom(1*1000,1,.13)
    gender=rbinom(1*1000,1,.5)
    eth=runif(1*1000, min=1, max=4)
    cogat=rnorm(1*1000, 100, 16)
    map=rnorm(1*1000, 200, 9)
    simtest=data.frame(subject=subject, treat=treat, gender=gender,
                       eth=round(eth,digits=0), cogat=round(cogat,digits=0),
                       map=round(map,digits=0))
    simtest

I need to add a variable named growth. If the treat variable for an observation is 0, then growth needs to be a randomly generated number from a normal distribution with a mean of .1 and an sd of .03. If the treat variable is 1, then growth needs to be a randomly generated number from a normal distribution with a mean of .5 and an sd of .03. Please help!
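A possible sketch for the growth variable relies on rnorm() being vectorized over its mean argument, so a per-row mean can be chosen with ifelse(). Only the columns needed for the illustration are generated here, and the seed is arbitrary.

```r
set.seed(42)
n <- 1000

# Treatment indicator as in the question (about 13% treated)
treat <- rbinom(n, 1, 0.13)

# One draw per subject; the mean depends on that subject's treat value
growth <- rnorm(n, mean = ifelse(treat == 1, 0.5, 0.1), sd = 0.03)

simtest <- data.frame(subject = 1:n, treat = treat, growth = growth)

# Group means should be close to 0.1 (treat = 0) and 0.5 (treat = 1)
tapply(simtest$growth, simtest$treat, mean)
```

The same line can be added to the full data frame from the question with simtest$growth <- rnorm(nrow(simtest), mean = ifelse(simtest$treat == 1, 0.5, 0.1), sd = 0.03).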
[R] Meta-analysis of a correlation matrix
Sorry for the cross-posting, but I would like to know if anyone is aware of a package in R for this.

-- Forwarded message --
From: John Antonakis
Sent: Sunday, April 10, 2011 3:26 PM
To: RMNET
Subject: Meta-analysis of a correlation matrix (correct thread title)

Hi:

Does anyone know of a good program that can do a meta-analytic multiple regression (with multiple correlated independent variables and one dependent variable) where the data input is in the form of a meta-analyzed correlation matrix (and where the point estimates and SEs produced are consistent)?

Regards,
John.

__
Prof. John Antonakis
Faculty of Business and Economics
Department of Organizational Behavior
University of Lausanne
Internef #618
CH-1015 Lausanne-Dorigny
Switzerland
Tel ++41 (0)21 692-3438
Fax ++41 (0)21 692-3305
http://www.hec.unil.ch/people/jantonakis
Associate Editor, The Leadership Quarterly
Re: [R] Getting many substrings but only loading the original string one time.
Duncan,

That would appear to be exactly what I was looking for! I will follow up if I have trouble after implementing the script this'll be used in.

I suppose I'd be wondering whether R is a reasonably fast language to use for this type of task (given the very long string, and the large number of substrings to fetch), i.e., is it much slower than C++, or in the same ballpark?

Thanks!
Jonathan

On Mon, Apr 11, 2011 at 4:14 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

On 11/04/2011 3:48 PM, Jonathan wrote:

Hi All,

I'm looking for a way to get many substrings from a longer string and then stitch them together. But, since the longer string is really, really long (like 250 MB long), I don't want to do this in a loop and load and re-load the longer string many times. Does anybody have an idea?

Maybe I could pass in two vectors (the first would have the starting coordinates, and the second would have the stopping coordinates), so it would be like a vectorized version of substr, where start and stop would be vectors instead of single integers.

Example (I'm reducing the size of the string for the example) of how this might work:

    longerString <- 'HelloThisIsMyLongerString'
    startVector <- c(2,6,4)
    stopVector <- c(4,10,5)
    substrings <- vectorized_substr(longerString, startVector, stopVector)
    substrings
    [1] "ell"   "ThisI" "lo"

Use substring(), not substr(). It is vectorized:

    substring(longerString, startVector, stopVector)
    [1] "ell"   "ThisI" "lo"

It does this by replicating the longerString, but that doesn't mean actual copies are made: just multiple pointers to the same big one.

Duncan Murdoch

Then I'd like to concatenate them (there will be many of them):

    result <- paste(substrings, collapse='')
    result
    [1] "ellThisIlo"

(Perhaps the paste command as I've done it is the best way, but depending on how the substrings are reported there may be different ways.)

Thanks!
Jonathan
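The substring() suggestion above can be sketched end to end on the short example string from the thread. The variable name `pieces` is introduced here for illustration only; nothing in this sketch says anything about how it performs on a 250 MB string.

```r
# Short stand-in for the very long string discussed in the thread
longerString <- "HelloThisIsMyLongerString"
startVector <- c(2, 6, 4)
stopVector <- c(4, 10, 5)

# substring() is vectorized over its first/last arguments,
# so a single call extracts all the pieces
pieces <- substring(longerString, startVector, stopVector)

# Stitch the pieces together into one string
result <- paste(pieces, collapse = "")
print(pieces)
print(result)
```

Because the start and stop positions are ordinary vectors, they could equally be read from a file or computed beforehand.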
Re: [R] Problem with facet_grid in ggplot2
On 4/11/2011 9:33 AM, Simon Hayward wrote:

Hi all,

I am practising a bit with ggplot2 but I have a problem when I try to use facet_grid. The following code:

    p <- ggplot(diamonds, aes(carat, ..density..)) +
      geom_histogram(binwidth = 1)
    p + facet_grid(cut ~ clarity, margins=TRUE)

produces the following error:

    Error in class(output[[var]]) <- class(value) :
      cannot set class to "array" unless the dimension attribute has length > 0

I have lifted this code directly from the ggplot2 documentation! By a process of elimination it seems that the problem arises from the use of margins. I do not know why the dimension attribute is conflicting with margins unless it is something to do with the properties of the data frame. I am running R 2.12.1. Has the source code changed?

Very grateful for any help.
Simon Hayward

Simon,

You have correctly figured out the problem; the margins argument does not work in facet_grid. It is a known bug. See, for example,

http://groups.google.com/group/ggplot2/browse_thread/thread/8a49a200ac3172a7
http://groups.google.com/group/ggplot2/browse_thread/thread/97ba6e2f469792cc

It is still there in ggplot2 0.8.9 (the latest release).

--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University
Re: [R] Meta-analysis of a correlation matrix
I don't know if it can, but have you looked into the metafor package?

On Monday, April 11, 2011 at 1:46 PM, Iuri Gavronski wrote:

Sorry for the cross-posting, but I would like to know if anyone is aware of a package in R for this.

-- Forwarded message --
From: John Antonakis
Sent: Sunday, April 10, 2011 3:26 PM
To: RMNET
Subject: Meta-analysis of a correlation matrix (correct thread title)

Hi:

Does anyone know of a good program that can do a meta-analytic multiple regression (with multiple correlated independent variables and one dependent variable) where the data input is in the form of a meta-analyzed correlation matrix (and where the point estimates and SEs produced are consistent)?

Regards,
John.

__
Prof. John Antonakis
Faculty of Business and Economics
Department of Organizational Behavior
University of Lausanne
Internef #618
CH-1015 Lausanne-Dorigny
Switzerland
Tel ++41 (0)21 692-3438
Fax ++41 (0)21 692-3305
http://www.hec.unil.ch/people/jantonakis
Associate Editor, The Leadership Quarterly
Re: [R] Geographic distance between lat-long points in R?
Thanks very much for the help!

Scott

On Monday, April 11, 2011 at 12:54 PM, seeliger.c...@epamail.epa.gov wrote:

I have a bunch of geographic locations specified by lat-long coordinates. What's an easy way to calculate geographic distance between any two points? OR, perhaps there is a function for calculating a distance matrix for K sites?

A comparison of some geographic distance calculations is provided at http://pineda-krch.com/2010/11/23/great-circle-distance-calculations-in-r/ , along with code for calculating the Vincenty inverse formula, which relies on the WGS-84 ellipsoid approximations. The author compares the results to fields::rdist.earth, which seems to rely on a spherical model of the earth. It would be interesting to compare it to other distance functions as well.

I found that the function provided at the above URL did not handle the case of coincident points. Adding the following line after the while loop fixed this:

    if (iterLimit==100) return(0) # formula began with nearly or exactly coincident points

Enjoy the days,
cur

--
Curt Seeliger, Data Ranger
Raytheon Information Services - Contractor to ORD
seeliger.c...@epa.gov
541/754-4638
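For the distance-matrix part of the question, a spherical (haversine) calculation in base R is one possible starting point. This is only a sketch and is not the Vincenty method discussed above: it assumes a spherical earth with a mean radius of 6371 km (an assumed constant), and the function name `haversine_km` and the three sample coordinates are made up for illustration.

```r
# Great-circle distance on a sphere (haversine formula), inputs in degrees.
# radius_km is an assumed mean earth radius, not a WGS-84 ellipsoid.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Hypothetical sites, for illustration only
sites <- data.frame(lat = c(44.56, 45.52, 47.61),
                    lon = c(-123.26, -122.68, -122.33))

# K x K distance matrix; outer() passes index vectors, and the
# function above is vectorized, so no explicit loop is needed
idx <- seq_len(nrow(sites))
dist_km <- outer(idx, idx, function(i, j) {
  haversine_km(sites$lat[i], sites$lon[i], sites$lat[j], sites$lon[j])
})
print(round(dist_km, 1))
```

Coincident points give a distance of exactly zero here, so the special case mentioned above for the iterative formula does not arise in this simpler calculation.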
[R] Speeding up Multinomial Logit/Proportional odds model in R vs stata?
Hi all,

An R blogger just published a comparison between R and Stata for performing:

- Multinomial Logit
- Proportional odds model
- Generalized Logit

At: http://ekonometrics.blogspot.com/2011/04/speeding-tickets-for-r-and-stata.html

The benchmark used (as mentioned in the comment to the post) isn't the best one. Still, since the differences are on the order of 6x to 20x (in favor of Stata), I thought it might be of interest to someone on the list to check whether some substantial speed improvement is possible.

With respect,
Tal

Contact Details:---
Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
[R] simple maths question
Hi Mrs Ms R,

A simple maths question that I am trying to resolve with R: I need to calculate the SE from a p-value and its beta... How to do this...?

Thank you very much and best regards!
Georg Ehret, Geneva, Switzerland.
Re: [R] Help on calculating a variable using random numbers
Hi:

Try

    simtest <- transform(simtest, growth = rnorm(1000, m = ifelse(treat == 0, 0.1, 0.5), s = 0.03))

HTH,
Dennis

On Mon, Apr 11, 2011 at 1:16 PM, Shane Phillips sphill...@lexington1.net wrote:

I'm new to R, but I'm trying to write a program for a dissertation that generates a dataset as follows...

    subject=1:1000
    treat=rbinom(1*1000,1,.13)
    gender=rbinom(1*1000,1,.5)
    eth=runif(1*1000, min=1, max=4)
    cogat=rnorm(1*1000, 100, 16)
    map=rnorm(1*1000, 200, 9)
    simtest=data.frame(subject=subject, treat=treat, gender=gender,
                       eth=round(eth,digits=0), cogat=round(cogat,digits=0),
                       map=round(map,digits=0))
    simtest

I need to add a variable named growth. If the treat variable for an observation is 0 then growth needs to be a randomly generated a number from a normal distribution with a mean of .1 and a sd of .03. If the treat variable is 1 then growth needs to be a randomly generated a number from a normal distribution with a mean of .5 and a sd of .03.

Please help!
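The one-liner above relies on rnorm() accepting a vector of means, one per observation. A small self-contained sketch of the same idea, with the argument names spelled out as `mean` and `sd` and a quick check of the group means; the seed and the reduced data frame are assumptions made here so the example runs on its own:

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable
n <- 1000

# Reduced version of the simulated data: only the columns needed here
simtest <- data.frame(subject = 1:n,
                      treat = rbinom(n, 1, 0.13))

# rnorm() recycles over a vector of means, so each observation is drawn
# from the distribution that matches its own treat value
simtest$growth <- rnorm(n,
                        mean = ifelse(simtest$treat == 0, 0.1, 0.5),
                        sd = 0.03)

# Group means should sit close to 0.1 (treat == 0) and 0.5 (treat == 1)
group_means <- tapply(simtest$growth, simtest$treat, mean)
print(group_means)
```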
Re: [R] Geographic distance between lat-long points in R?
A comparison of some geographic distance calculations is provided at http://pineda-krch.com/2010/11/23/great-circle-distance-calculations-in-r/ , along with code for calculating the Vincenty inverse formula, which relies on the WGS-84 ellipsoid approximations.

You know, Scott, I should have included some test results of that method. Comparing the distances with Arc 9 indicates that the accuracy varies with location and with whether there is a longitudinal difference between the two points.

Comparing calculation results for points shifted 0 secs to 10 degrees North, West and Northwest from a 'base' point, the relative errors (defined as (Arc9.distance - Vincenty.distance)/Arc9.distance) range up to 0.08 in AK, AZ, CA, MT, NE, NM, UT, WA and WY, and range only up to 0.009 otherwise. In the special case of zero longitudinal offset (North-South distances only), the relative error ranges to 0.006 in those states and to 2E-7 otherwise.

Let us know if you can do better,
cur

--
Curt Seeliger, Data Ranger
Raytheon Information Services - Contractor to ORD
seeliger.c...@epa.gov
541/754-4638
Re: [R] Partial italic in graph titles when looping
On Apr 11, 2011, at 6:28 PM, Josh B wrote:

Follow-up question: I want to make the gene name bold and italic, AND make the p number just bold. But here's the catch: now I want the p number to appear as a superscript!

I am no longer clear (if I ever was) what the p number might be, but here is my guess:

    main = bquote(italic(.(x[i,1]))*bolditalic(" p")^.(as.character(x[i, 2])))

For more than a guess, post a worked example, please.

--
David.

For instance: TFL1^687 (the caret is to indicate that I actually want the p number as a superscript).

Thanks very much in advance!

Sincerely,
Josh Banta

From: David Winsemius dwinsem...@comcast.net
To: Josh B josh...@yahoo.com
Sent: Sat, February 19, 2011 10:24:03 PM
Subject: Re: [R] Partial italic in graph titles when looping

On Feb 19, 2011, at 8:52 PM, Josh B wrote:

Follow-up question: how would I make the gene name italic AND bold, and how would I make the p and the number just bold?

Could also work inside the .() function:

    for (i in 1:nrow(x)){
      plot(z <- sort(rnorm(47)), type = "s", main = "")
      points(z, cex = .5, col = "dark red")
      title(main = bquote(italic(.(x[i,1]))*bold(" p")*bold(.(as.character(x[i,2])))))
    }

From: David Winsemius dwinsem...@comcast.net
To: Josh B josh...@yahoo.com
Cc: R Help r-help@r-project.org
Sent: Sat, February 19, 2011 8:33:33 PM
Subject: Re: [R] Partial italic in graph titles when looping

On Feb 19, 2011, at 7:41 PM, Josh B wrote:

Dear all,

I have a rather complicated problem. I am trying to loop through making graphs, so that the graph-making process is fully automated. For each graph, I'd like to make sure the corresponding title is formatted properly. The titles will be a combination of a gene name and numerical position within the gene. The gene name should be italic-bold, whereas the gene position should be just bold.

Consider the following:

    x <- read.table(textConnection("gene position
    FLC 3312
    TFL1 687
    GA1 1127"), header = TRUE, as.is = TRUE)
    closeAllConnections()

Now this, below, is essentially how I am automating the graph-making (imagine these graphs contain some sort of real data):

    par(mfrow = c(3,1))
    for (i in 1:nrow(x)){
      plot(z <- sort(rnorm(47)), type = "s", main = "")
      points(z, cex = .5, col = "dark red")
      title(main = paste(x[i,1], " p", x[i,2], sep = ""))
    }

Or perhaps (with a shuffling of the parens):

    for (i in 1:nrow(x)){
      plot(z <- sort(rnorm(47)), type = "s", main = "")
      points(z, cex = .5, col = "dark red")
      title(main = bquote(italic(.(x[i,1]))*" p"*.(x[i,2])))
    }

The graphs produced by this method are almost perfect, except that the gene names are not italicized (they SHOULD be).

So, once again, the big question is: how would I italicize the gene names but NOT the gene positions, when looping through to make these graphs and graph titles?

If I WASN'T looping to make my graph titles, I could write:

    title(main = expression(paste(bolditalic(FLC), bold(p3312), sep = "")))

...but I can't do that, because I'm looping (or can I?)

Thanks in advance for your help!

---
Josh Banta, Ph.D
Center for Genomics and Systems Biology
New York University
100 Washington Square East
New York, NY 10003
Tel: (212) 998-8465
http://plantevolutionaryecology.org

David Winsemius, MD
West Hartford, CT

David Winsemius, MD
West Hartford, CT

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
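One reading of the final follow-up ("TFL1^687", gene name bold italic, position bold and superscripted) can be sketched with bquote() and plotmath as below. This is an interpretation of the request rather than a confirmed answer from the thread; the helper name `make_title` is made up here, and pdf(NULL) is used only so the sketch runs without opening a window.

```r
# Gene names and positions from the thread
x <- data.frame(gene = c("FLC", "TFL1", "GA1"),
                position = c(3312, 687, 1127),
                stringsAsFactors = FALSE)

# Hypothetical helper: bold-italic gene name with the position as a
# bold superscript; .() splices the current values into the expression
make_title <- function(gene, position) {
  bquote(bolditalic(.(gene))^bold(.(as.character(position))))
}

titles <- lapply(seq_len(nrow(x)), function(i) {
  make_title(x$gene[i], x$position[i])
})

pdf(NULL)  # null graphics device, so nothing is written to disk
par(mfrow = c(3, 1))
for (i in seq_len(nrow(x))) {
  z <- sort(rnorm(47))
  plot(z, type = "s", main = "")
  points(z, cex = 0.5, col = "darkred")
  title(main = titles[[i]])
}
invisible(dev.off())
```

If the literal letter "p" is wanted in front of the number, it could be added inside the superscript, for example with bold(paste("p", ...)) in place of bold(...).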
Re: [R] RExcel
Yes, I did, and no error message. And comRegisterRegistry() returns NULL, not sure if that matters.

John

From: Jonathan P Daily jda...@usgs.gov
Cc: r-help r-help@r-project.org; r-help-boun...@r-project.org
Sent: Mon, April 11, 2011 11:39:12 AM
Subject: Re: [R] RExcel

It is asking the obvious, but did you run the commands from the rcom package after installation (see inline ***s)?

--
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
Is the room still a room when its empty? Does the room, the thing itself have purpose? Or do we, what's the word... imbue it. - Jubal Early, Firefly

r-help-boun...@r-project.org wrote on 04/11/2011 02:08:02 PM:

[R] RExcel
array chip to: r-help
04/11/2011 02:12 PM
Sent by: r-help-boun...@r-project.org

Hi,

I am installing RExcel using package RExcelInstaller. When I tried to run installRExcel() I got this error message:

You don not have the R package rcom installed. The (D)COM server installed which will aloow you to use the background server in RExcel. Since rcom is not installed, foreground mode will be unavailable. You may continue with the installation, but in most circumstances you probably should cancel current installation, install the package rcom properly (do not forget to run the commands

    library(rcom)
    comRegisterRegistry()

immediately after installation) and after that run this installer once again

But rcom package was installed without any problem, somehow the installer keeps saying that rcom is not installed. Any suggestions?

Thanks
John

    sessionInfo()
    R version 2.12.2 (2011-02-25)
    Platform: i386-pc-mingw32/i386 (32-bit)

    locale:
    [1] LC_COLLATE=English_United States.1252
    [2] LC_CTYPE=English_United States.1252
    [3] LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C
    [5] LC_TIME=English_United States.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] RExcelInstaller_3.1-13 rcom_2.2-3.1 rscproxy_1.3-1

    loaded via a namespace (and not attached):
    [1] tools_2.12.2
[R] Revolutions Blog: March Roundup
I write about R every weekday at the Revolutions blog: http://blog.revolutionanalytics.com and every month I post a summary of articles from the previous month of particular interest to readers of r-help.

In case you missed them, here are some articles related to R from the month of March:

The doSMP package, which enables parallel processing for R on multiprocessor machines, is now available on CRAN: http://bit.ly/gTS7BJ

The Offensive Politics blog provided R code used to make a map of precinct returns in the Chicago mayoral election: http://bit.ly/fon0BJ

A connector to integrate R output into JasperReports with RevoDeployR is now available: http://bit.ly/ftkIFy

The Iowa State Department of Statistics used R to analyze distribution of stimulus funds, and has an interesting look at some of the errors in the source data: http://bit.ly/hc4q4E

The Rexer Analytics Data Miner Survey reports that R is the most commonly-used tool amongst surveyed data miners: http://bit.ly/gD9nmD

We cross-posted an essay by Revolution Analytics CEO Norman Nie, Keep an Eye on the Open-Source Analytics Stack: http://bit.ly/eeCUBK

Baseball batting averages provide an instructive lesson on checking your assumptions for T-tests: http://bit.ly/fGSK4y

We're looking for nominations for R community members to be profiled in the R-Files series on the Revolutions blog: http://bit.ly/h3YCXg

R 2.13.0 is scheduled for release on April 13: http://bit.ly/fq1OBt

Sherry LaMonica of the Revolution Analytics engineering team reviews the functions in the RevoScaleR package for Big Data: http://bit.ly/gaXChr

Amanda Cox presented at the New York R User Group on how the New York Times uses R for visualization, and you can watch it on video: http://bit.ly/gJM5tH

Revolution Analytics announces a partnership with Netezza, to bring R to the TwinFin data warehouse appliance: http://bit.ly/dTuIqD

Register your opinions about open-source software in the 2011 Future of Open Source Survey: http://bit.ly/dZG5Oy

Robert Muenchen has updated his analysis of popularity of data analysis software, featuring R: http://bit.ly/ekM5bv

Tech news site The Register publishes a profile of Revolution Analytics: http://bit.ly/fBeeWP

Joseph Rickert shares an example of building a model in R and exporting it to PMML for use with ADAPA: http://bit.ly/e8LGAN

Violins of volatility provide a novel way of visualizing financial volatility: http://bit.ly/hkFzpe

Revolution Analytics chief scientist Lee Edlefsen is interviewed at the Structure Big Data Conference in this five-minute video: http://bit.ly/ePYpt0

Other non-R-related stories in the past month included: Heritage Health and Kaggle have launched a 2-year competition with $3.2M in prize money for predicting hospitalization from health data (http://bit.ly/eH29nJ) and flying by Saturn without CGI (http://bit.ly/hXzKvQ). On a lighter note, there also was: successively upgrading every version of Windows (http://bit.ly/fZqyik), and an equation for celebrity dating habits (http://bit.ly/i5EhJS).

There are new R user groups (http://bit.ly/eC5YQe) in Orange County, CA (http://bit.ly/gEFJOr), Tallahassee, FL and Hobart, TAS (http://bit.ly/heHv3g). Meeting times for these groups can be found on the updated R Community Calendar at: http://bit.ly/bb3naW

If you're looking for more articles about R, you can find summaries from previous months at http://blog.revolutionanalytics.com/roundups/. Join the Revolution mailing list at http://revolutionanalytics.com/newsletter to be alerted to new articles on a monthly basis.

As always, thanks for the comments and please keep sending suggestions to me at da...@revolutionanalytics.com . Don't forget you can also follow the blog using an RSS reader like Google Reader, or by following me on Twitter (I'm @revodavid).

Cheers,
# David

--
David M Smith da...@revolutionanalytics.com
VP of Marketing, Revolution Analytics
http://blog.revolutionanalytics.com
Tel: +1 (650) 646-9523 (Palo Alto, CA, USA)
Re: [R] nndist R vs. ArcGIS
On 12/04/11 07:32, Barry Rowlingson wrote:

On Mon, Apr 11, 2011 at 4:49 PM, smolukasmol...@geo.oregonstate.edu wrote:

Can anyone tell me why I would get different average nearest neighbor values for the same set of coordinates between ArcGIS 10 and R? Sometimes the difference in distance is over 1.3 km.

Edge correction? In a spatial point pattern, points near the boundary of your window are less likely to have a near neighbour because only some of the surrounding space can possibly have points. I think functions in spatstat will correct for this.

No. Not as far as I am aware or can discern. The function nndist() does ***not*** invoke any edge correction. It simply calculates the distances as they are, for the points that appear in the window, and takes the appropriate minima.

cheers,
Rolf Turner

Make a simple test example and tell us what functions you are using. And also try the r-sig-geo mailing list for this sort of thing.

Barry
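As a way to build the simple test example requested above, the "distances as they are, then the minima" calculation can be reproduced in a few lines of base R and checked by hand. This is an illustrative sketch, not spatstat's implementation; the function name `nn_dist` and the four points are made up, and it assumes planar (projected) coordinates, since raw longitude/latitude values would give distances in degrees rather than kilometres.

```r
# Nearest-neighbour distance for each point, with no edge correction:
# all pairwise Euclidean distances, then the row-wise minimum
nn_dist <- function(x, y) {
  d <- as.matrix(dist(cbind(x, y)))
  diag(d) <- Inf  # a point is not its own neighbour
  apply(d, 1, min)
}

# Four hypothetical points whose answers are easy to verify by hand
pts <- data.frame(x = c(0, 1, 1, 5),
                  y = c(0, 0, 2, 0))

nn <- nn_dist(pts$x, pts$y)
mean_nn <- mean(nn)
print(nn)
print(mean_nn)
```

Running the same small set of points through both programs would show whether the discrepancy is in the distance calculation itself or somewhere upstream, such as the coordinate system.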
Re: [R] simple maths question
Georg Ehret georgehret at gmail.com writes:

Hi Mrs Ms R,

A simple maths question that I am trying to resolve with R: I need to calculate the SE from a pvalue and it's beta... How to do this...?

Thank you very much and best regards!
Georg Ehret, Geneva, Switzerland.

Without more information, I don't think you can.

**If** you are assuming a Z test (i.e. the thing you are testing against a null hypothesis H_0 = 0 is supposed to be normally distributed) then you know that

    p-value = 2*pnorm(abs(beta/SE), lower.tail=FALSE)

[based on a two-tailed test] and you can use qnorm() to invert this, but from the p-value alone you can't separate beta and SE.
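If beta itself is known in addition to the p-value, as the question seems to suggest, the relation above can be inverted for SE. A minimal sketch, assuming a two-sided Z test exactly as in the reply; the function name `se_from_p` and the example numbers are made up for illustration, and the result would not apply to a t test or a one-sided test.

```r
# Invert p = 2 * pnorm(abs(beta / SE), lower.tail = FALSE) for SE,
# assuming a two-sided Z test
se_from_p <- function(beta, p) {
  z <- qnorm(p / 2, lower.tail = FALSE)  # |beta / SE| implied by p
  abs(beta) / z
}

# Round trip with made-up numbers: start from a known SE,
# compute the p-value, then recover the SE
beta <- 0.5
se <- 0.2
p <- 2 * pnorm(abs(beta / se), lower.tail = FALSE)
se_back <- se_from_p(beta, p)
print(c(p = p, se_back = se_back))
```

For very small p-values the recovered SE is limited by how precisely the p-value was reported, so rounded p-values from a table will give only an approximate SE.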
Re: [R] Plotting a quadratic line on top of an xy scatterplot
Hi Josh,

This is by no means the fanciest solution ever, but as there are predict methods for many types of models in R, I thought I would show it this way.

    ## fit the model
    model <- lm(probability ~ poly(temperature, 2), data = x)
    ## create line values
    dat <- data.frame(temperature = seq(min(x$temperature, na.rm = TRUE),
                                        max(x$temperature, na.rm = TRUE), by = .01))
    ## add predicted y values
    dat$yhat <- predict(model, dat)
    ## plot data
    plot(probability ~ temperature, data = x)
    ## add predicted line
    lines(x = dat$temperature, y = dat$yhat, type = "l")

Hope this helps,
Josh

On Mon, Apr 11, 2011 at 12:29 PM, Josh B josh...@yahoo.com wrote:

Dear Listserv,

Here is my latest in a series of simple-seeming questions that dog me. Consider the following data:

    x <- read.table(textConnection("temperature probability
    0.11 9.4
    0 2.3
    0.38 8.7
    0.43 9.2
    0.6 15.6
    0.47 8.7
    0.09 12.8
    0.11 9.4
    0.01 7.7
    0.83 8
    0.65 9.3
    0.05 7.4
    0.34 10.1
    0.02 4.8
    0.07 9.1
    0.6 15.6
    0.01 8.4
    0.9 9.6
    0.83 8
    0.12 8.4
    0.01 8
    0 5
    0.11 9.7
    0.41 7.4
    0.05 9.4
    0.09 8.3
    0 6.1
    0.12 8.4
    0.73 7.8
    0 4.2"), header = TRUE, as.is = TRUE)
    closeAllConnections()

I modeled the relationship Probability = f(Temperature), i.e., probability as a function of temperature. I found that there is a significant quadratic term in the model:

    summary(lm(x[,2] ~ x[,1] + I(x[,1]^2)))

Now the question is: how do I plot it? I can do this:

    plot(x[,2] ~ x[,1])

...but I would also like to add a line corresponding to the quadratic function. In other words, I want to visually show the relationship among the variables that is being modeled. How do I do it? I think the curve() command will be used, but I don't know how to employ it.

Thanks very much in advance.

Sincerely,
---
Josh Banta, Ph.D
Center for Genomics and Systems Biology
New York University
100 Washington Square East
New York, NY 10003
Tel: (212) 998-8465
http://plantevolutionaryecology.org

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
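Since the original question asked specifically about curve(), here is one way it might be used as an alternative to the predict()/lines() approach above. This is a sketch under a few assumptions: the model is refit with raw polynomial terms so the coefficients can be plugged straight into a quadratic, only the first ten rows of the posted data are used to keep the example short, and the names `dat`, `fit` and `quad` are introduced here. pdf(NULL) is used only so the sketch runs without opening a window.

```r
# First ten rows of the data posted in the thread
dat <- data.frame(
  temperature = c(0.11, 0, 0.38, 0.43, 0.6, 0.47, 0.09, 0.11, 0.01, 0.83),
  probability = c(9.4, 2.3, 8.7, 9.2, 15.6, 8.7, 12.8, 9.4, 7.7, 8)
)

# Quadratic fit with raw terms, so coef() gives intercept, linear and
# squared coefficients directly
fit <- lm(probability ~ temperature + I(temperature^2), data = dat)
b <- coef(fit)

# The fitted quadratic as an ordinary function of temperature
quad <- function(t) b[[1]] + b[[2]] * t + b[[3]] * t^2

pdf(NULL)  # null graphics device, so nothing is written to disk
plot(probability ~ temperature, data = dat)
# curve() evaluates the expression over a grid of x values and,
# with add = TRUE, draws it on top of the existing scatterplot
curve(quad(x), from = min(dat$temperature), to = max(dat$temperature),
      add = TRUE)
invisible(dev.off())
```

The predict() route generalizes more easily to other model types, while curve() is convenient when the fitted relationship can be written as a simple formula in one variable.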
Re: [R] Meta-analysis of a correlation matrix
Dear Iuri,

The metaSEM package (http://courses.nus.edu.sg/course/psycwlm/Internet/metaSEM/) may be used to fit structural equation models on the pooled correlation/covariance matrices with weighted least squares as the estimation method. You may refer to the examples in tssem1() and tssem2().

Regards,
Mike

--
Mike W.L. Cheung
Phone: (65) 6516-3702
Department of Psychology
Fax: (65) 6773-1843
National University of Singapore
http://courses.nus.edu.sg/course/psycwlm/internet/

On Tue, Apr 12, 2011 at 4:53 AM, Scott Chamberlain scttchamberla...@gmail.com wrote:

I don't know if it can, but have you looked into the metafor package?

On Monday, April 11, 2011 at 1:46 PM, Iuri Gavronski wrote:

Sorry for the cross-posting, but I would like to know if anyone is aware of a package in R for this.

-- Forwarded message --
From: John Antonakis
Sent: Sunday, April 10, 2011 3:26 PM
To: RMNET
Subject: Meta-analysis of a correlation matrix (correct thread title)

Hi:

Does anyone know of a good program that can do a meta-analytic multiple regression (with multiple correlated independent variables and one dependent variable) where the data input is in the form of a meta-analyzed correlation matrix (and where the point estimates and SEs produced are consistent)?

Regards,
John.

__
Prof. John Antonakis
Faculty of Business and Economics
Department of Organizational Behavior
University of Lausanne
Internef #618
CH-1015 Lausanne-Dorigny
Switzerland
Tel ++41 (0)21 692-3438
Fax ++41 (0)21 692-3305
http://www.hec.unil.ch/people/jantonakis
Associate Editor, The Leadership Quarterly
[R] bind mean to a df
Hello,

I would like to take the mean of a column from a data frame and then bind the mean back to the data frame. I can do this using the following lines of code, but I am looking for a more elegant solution. Thank you very much.

Geoff

    name <- c('Frank','Frank','Frank','Tony','Tony','Tony','Ed','Ed','Ed');
    year <- c(2004,2005,2006,2004,2005,2006,2004,2005,2006);
    sale <- c(56,45,55,65,68,70,45,67,23);
    data <- data.frame(name=name, year=year, sale=sale);
    data;

    #is there a more elegant way to add a column of means for sale by name than what I did below?;
    mean <- data.frame(aggregate(data$sale, list(data$name), mean));
    colnames(mean) <- c('name','mean');
    mean;
    data <- merge(data, mean);
    data;
Re: [R] bind mean to a df
Hi Geoffrey,

Here is one option (data named dfrm instead of data because data() is a function too):

    ## Data
    dfrm <- data.frame(
      name = c('Frank','Frank','Frank','Tony','Tony','Tony','Ed','Ed','Ed'),
      year = c(2004,2005,2006,2004,2005,2006,2004,2005,2006),
      sale = c(56,45,55,65,68,70,45,67,23))
    ## Using with() to avoid typing names and ave() to do the work
    dfrm$mean <- with(dfrm, ave(x = sale, name, FUN = mean))
    ## look at the results
    dfrm

Cheers,
Josh

On Mon, Apr 11, 2011 at 8:46 PM, Geoffrey Smith g...@asu.edu wrote:

Hello,

I would like to take the mean of a column from a data frame and then bind the mean back to the data frame. I can do this using the following lines of code, but I am looking for a more elegant solution. Thank you very much.

Geoff

    name <- c('Frank','Frank','Frank','Tony','Tony','Tony','Ed','Ed','Ed');
    year <- c(2004,2005,2006,2004,2005,2006,2004,2005,2006);
    sale <- c(56,45,55,65,68,70,45,67,23);
    data <- data.frame(name=name, year=year, sale=sale);
    data;

    #is there a more elegant way to add a column of means for sale by name than what I did below?;
    mean <- data.frame(aggregate(data$sale, list(data$name), mean));
    colnames(mean) <- c('name','mean');
    mean;
    data <- merge(data, mean);
    data;

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
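As a quick check of what ave() returns, the sketch below rebuilds the same small data set and compares the new column with the per-name means computed separately. It relies on ave() using mean as its default FUN; the name `by_name` is introduced here for illustration.

```r
dfrm <- data.frame(
  name = rep(c("Frank", "Tony", "Ed"), each = 3),
  year = rep(2004:2006, times = 3),
  sale = c(56, 45, 55, 65, 68, 70, 45, 67, 23)
)

# ave() returns a vector as long as its input, with each element replaced
# by the summary of its group; FUN defaults to mean
dfrm$mean <- ave(dfrm$sale, dfrm$name)

# Per-name means computed independently, for comparison
by_name <- tapply(dfrm$sale, dfrm$name, mean)
print(dfrm)
print(by_name)
```

Unlike the aggregate() plus merge() route, ave() keeps the rows in their original order, which can matter if later code depends on that order.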
Re: [R] bind mean to a df
Hi,

You could try,

library(plyr)
ddply(data, .(name), transform, mean=mean(sale))
ddply(data, .(name), summarize, mean=mean(sale))

HTH,

baptiste

On 12 April 2011 15:46, Geoffrey Smith g...@asu.edu wrote:
> Hello, I would like to take the mean of a column from a data frame and then bind the mean back to the data frame. I can do this using the following lines of code, but I am looking for a more elegant solution.
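A short sketch of how the two ddply() calls differ, assuming the plyr package is installed. The data frame name dfrm and the result names per_row and per_name are assumptions made for this illustration.

```r
library(plyr)  # assumes plyr is installed

dfrm <- data.frame(
  name = c('Frank','Frank','Frank','Tony','Tony','Tony','Ed','Ed','Ed'),
  year = c(2004,2005,2006,2004,2005,2006,2004,2005,2006),
  sale = c(56,45,55,65,68,70,45,67,23))

## transform: keeps every original row and adds the group mean as a column
per_row <- ddply(dfrm, .(name), transform, mean = mean(sale))

## summarize: collapses each group to a single row holding its mean
per_name <- ddply(dfrm, .(name), summarize, mean = mean(sale))

nrow(per_row)   # one row per original observation
nrow(per_name)  # one row per name
```

The transform form is the one that binds the mean back to the data frame; the summarize form corresponds to the intermediate table of means in the original post.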
[R] model specification: help needed
Hi R experts:

I am new to mixed models. I am trying to specify a model using lmer in the lme4 package. I am not sure if I am doing it right, so I need your help, please.

Treatment / factor structure

Year: levels 1:3; the whole experiment was repeated in three years; random factor
Village: levels 1:2 (there are many more levels, only a few are shown as an example); random factor
Farm: levels 1:9 (there are many more levels, only a few are shown as an example); random factor
Variety: 10 varieties were grown (they may or may not differ between years, farms and villages; some of them were repeated); fixed effect

Thus the layout of the treatment structure would look as follows for each year:

Year[1]
  Village[1]
    Farm[1] Variety: 1, 2, 8, 9, 6, 5
    Farm[2] Variety: 6, 8, 9, 10, 4
    Farm[3] Variety: 1, 2, 5, 6, 3, 7
  Village[2]
    Farm[3] Variety: 6, 8, 3, 4, 2
    Farm[4] Variety: 3, 8, 1, 10, 2
    Farm[5] Variety: 1, 2, 3, 4, 5, 6

I am interested in the interactions as well. The following is the model I have in mind:

P_{ijklm} = M + Y_i + V_j + (YV)_{ij} + F(YV)_{k(ij)} + G_l + (GY)_{li} + (GV)_{lj} + (GYV)_{lij} + e_{ijklm}

(Y = year, V = village, G = variety, F = farm)

I tried the following model and command; am I right?

lmer(gryld ~ 1 + (1|year) + (1|village) + (1|year:village) + (Farm|year:village) + variety + (1|variety:year) + (1|variety:village) + (1|year:variety:village), data = mbtrail)

My doubt is especially about the year component: how can I specify it effectively?

Thank you for your time. I tried to post this to the mixed model forum but did not get any response. Sorry to post to all of you, but my hope is that my question is simple enough and that the bigger R community can help me!

Ram H
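One possible reading of the model equation above, written as an lmer-style formula. This is a hedged sketch, not a confirmed answer from the thread: it assumes that "farm nested within year and village" is meant as a random intercept for each year:village:farm combination, and it uses simulated data, because the mbtrail data set is not available here. The names sim, farms and model_formula are made up for the illustration.

```r
## Simulated, unbalanced layout: 3 years x 2 villages x 3 farms,
## each farm-year growing 5 of the 10 varieties (all numbers are assumptions)
set.seed(1)
farms <- expand.grid(farm = 1:3, village = 1:2, year = 1:3)
sim <- farms[rep(seq_len(nrow(farms)), each = 5), ]
sim$variety <- as.vector(replicate(nrow(farms), sample(1:10, 5)))
sim[] <- lapply(sim, factor)               # grouping variables as factors
sim$gryld <- rnorm(nrow(sim), mean = 5)    # pure-noise response, illustration only

## Term-by-term mapping of the equation:
##   M            -> intercept (implicit)
##   G_l          -> variety                      (fixed)
##   Y_i, V_j     -> (1 | year), (1 | village)
##   (YV)_ij      -> (1 | year:village)
##   F(YV)_k(ij)  -> (1 | year:village:farm)      (assumed reading of the nesting)
##   (GY), (GV)   -> (1 | variety:year), (1 | variety:village)
##   (GYV)_lij    -> (1 | variety:year:village)
model_formula <- gryld ~ variety +
  (1 | year) + (1 | village) + (1 | year:village) +
  (1 | year:village:farm) +
  (1 | variety:year) + (1 | variety:village) + (1 | variety:year:village)

## Fit only if lme4 is installed; singular-fit messages are expected with noise data
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::lmer(model_formula, data = sim)
  print(lme4::VarCorr(fit))
}
```

In lmer syntax, (Farm | year:village) requests random slopes for Farm within each year:village group, which is different from a random intercept per farm. Note also that a variance component estimated from only three years will be imprecise, so treating year as fixed is an alternative worth considering.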