Re: [R] Replace split with regex for speed ?
That's a good solution, but if you're really, really sure that the timestamps are in the format you gave, it's quite a bit faster to use substr and paste, because you don't have to do any searching in the string.

HTH
Rex

    > x = rep("09:30:00.000.633", 100)
    > system.time(y <- paste(substr(x,1,12), substr(x,14,16), sep=""))
       user  system elapsed
       0.87    0.00    0.88
    > system.time(y <- sub("\\.(\\d+)$", "\\1", x))
       user  system elapsed
       1.65    0.00    1.65
    > system.time(y <- sub("\\.(\\d+)$", "\\1", x))
       user  system elapsed
       1.65    0.00    1.66
    > system.time(y <- paste(substr(x,1,12), substr(x,14,16), sep=""))
       user  system elapsed
       0.88    0.00    0.89

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Henrique Dallazuanna
Sent: Friday, March 18, 2011 8:32 AM
To: rivercode
Cc: r-help@r-project.org
Subject: Re: [R] Replace split with regex for speed ?

Try this:

    sub("\\.(\\d+)$", "\\1", ts)

On Thu, Mar 17, 2011 at 11:01 PM, rivercode aqua...@gmail.com wrote:

Have timestamp in format HH:MM:SS.MMM.UUU and need to remove the last "." so it is in format HH:MM:SS.MMMUUU. What is the fastest way to do this, since it has to be repeated on millions of rows. Should I use regex ?

Currently doing it with a string split, which is slow:

    > head(ts)
    [1] "09:30:00.000.245" "09:30:00.000.256" "09:30:00.000.633" "09:30:00.001.309" "09:30:00.003.635" "09:30:00.026.370"

    ts = strsplit(ts, ".", fixed = TRUE)
    # Remove last "." from timestamp, from HH:MM:SS.MMM.UUU to HH:MM:SS.MMMUUU
    ts = lapply(ts, function(x) { paste(x[1], ".", x[2], x[3], sep="") } )
    ts = unlist(ts)

Thanks,
Chris

--
View this message in context: http://r.789695.n4.nabble.com/Replace-split-with-regex-for-speed-tp3386098p3386098.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited.
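The three approaches in this thread can be checked against each other before timing them on real data. The following is only a sketch: the three sample timestamps are taken from the original post, the names by_split, by_regex, by_substr and ok are made up for illustration, and the format check at the end is an added precaution (not something suggested in the thread) for the case where you are not "really, really sure" about the fixed-width layout.

```r
ts <- c("09:30:00.000.245", "09:30:00.000.256", "09:30:00.000.633")

# 1. Original approach: split on ".", then glue the pieces back together.
by_split <- unlist(lapply(strsplit(ts, ".", fixed = TRUE),
                          function(p) paste(p[1], ".", p[2], p[3], sep = "")))

# 2. Regex: drop the last "." that precedes the trailing run of digits.
by_regex <- sub("\\.(\\d+)$", "\\1", ts)

# 3. Fixed positions: characters 1-12 and 14-16, skipping the "." at 13.
by_substr <- paste(substr(ts, 1, 12), substr(ts, 14, 16), sep = "")

# Assumption: the substr version is only safe if every value really has
# the HH:MM:SS.MMM.UUU layout, so verify that once up front.
ok <- grepl("^\\d{2}:\\d{2}:\\d{2}\\.\\d{3}\\.\\d{3}$", ts)

stopifnot(all(ok),
          identical(by_split, by_regex),
          identical(by_regex, by_substr))
```

Wrapping each of the three expressions in system.time() on a vector of realistic length is then enough to compare them on your own machine.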
Re: [R] Singularity problem
Feng,

Your matrix is *not* (practically) singular; its inverse is. The message said that the *system* was singular, not the matrix. Remember Cramer's Rule:

    x_i = |A_i| / |A|

The really, really large determinant of your matrix is going to appear in the denominator of your solutions, so, essentially, you get underflow. Try working out the entire solution with Cramer's Rule if you still don't see the problem. solve doesn't really use Cramer's Rule, but working through it will give you a feel for the issue.

HTH
Rex

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Berend Hasselman
Sent: Wednesday, March 16, 2011 1:33 PM
To: r-help@r-project.org
Subject: Re: [R] Singularity problem

Peter Langfelder wrote:

On Wed, Mar 16, 2011 at 8:28 AM, Feng Li <m...@feng.li> wrote:

Dear R,

If I have remembered correctly, a square matrix is singular if and only if its determinant is zero. I am a bit confused by the following code error. Can someone give me a hint?

    > a <- matrix(c(1e20,1e2,1e3,1e3),2)
    > det(a)
    [1] 1e+23
    > solve(a)
    Error in solve.default(a) :
      system is computationally singular: reciprocal condition number = 1e-17

You are right, a matrix is mathematically singular iff its determinant is zero. However, this condition is useless in practice, since in practice one cares about the matrix being computationally singular, i.e. so close to singular that it cannot be inverted using the standard precision of real numbers. And that's what your matrix is (and the error message you got says so).

You can write your matrix as

    a = 1e20 * matrix(c(1, 1e-18, 1e-17, 1e-17), 2, 2)

Compared to the first element, all of the other elements are nearly zero, so the matrix is numerically nearly singular even though the determinant is 1e23. A better measure of how numerically unstable the inversion of a matrix is is the condition number, which IIRC is something like the largest eigenvalue divided by the smallest eigenvalue.

svd(a) indicates the problem:

    largest singular value / smallest singular value = 1e17 (condition number)

which gives a reciprocal condition number of 1e-17, and the standard solve can't handle that. (Pivoted) QR decomposition does help. And so does SVD.

Berend

--
View this message in context: http://r.789695.n4.nabble.com/Singularity-problem-tp3382093p3382465.html
Sent from the R help mailing list archive at Nabble.com.
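To make the condition-number argument concrete, here is a small sketch using Feng's matrix. The variable names (d, cond, inv_lu, inv_qr) are made up for illustration. It relies on the documented tol argument of solve(), which is the threshold the estimated reciprocal condition number is compared against; lowering it is an illustration of where the error comes from, not a general recommendation.

```r
a <- matrix(c(1e20, 1e2, 1e3, 1e3), 2)

# Condition number as the ratio of largest to smallest singular value.
d <- svd(a)$d
cond <- d[1] / d[2]

# solve() stops when the estimated reciprocal condition number falls
# below 'tol' (default .Machine$double.eps). With a smaller tol the
# same solver goes ahead.
inv_lu <- solve(a, tol = 1e-20)

# The QR-based route mentioned by Berend.
inv_qr <- qr.solve(a)

# Both inverses should reproduce the identity to within rounding.
print(cond)
print(a %*% inv_lu)
print(a %*% inv_qr)
```

The point of the exercise is that the matrix is badly scaled rather than rank deficient: rescaling rows or columns before solving would be the more robust fix in real code.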
Re: [R] (no subject)
Hi Jon,

I read your question differently. Is this the answer?
- Rex

    > ch = scan(stdin(), what=character(0), n=1)
    1: f
    Read 1 item
    > ch
    [1] "f"

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter
Sent: Tuesday, March 15, 2011 10:09 AM
To: Jonathan P Daily
Cc: r-help@r-project.org
Subject: Re: [R] (no subject)

?strsplit

    x <- "ThisIsaString"
    y <- strsplit(x, "")

This gives a list, which you can convert to a vector by unlist(y).

Incidentally, you could have found out about strsplit via R's help.search("character string") (or similar) or even googling "R string function". Please use R's native Help capabilities before posting to the list. (I will grant that the unlist() trick may not be that easy to find.)

Also, it's often worthwhile searching the Help archives first. Peter Dalgaard answered this same question here a day or two ago.

Cheers,
Bert

On Tue, Mar 15, 2011 at 6:35 AM, Jonathan P Daily jda...@usgs.gov wrote:

I was wondering if there is a way to read in a single keystroke at a time in R as a string, akin to ncurses-style interfaces. I looked into readLines, readChar, etc. using stdin, but these all require the use of an end of line. Has anyone ever had need to do this or have any ideas on how to do this?

Thanks,
Jon

PS I apologize if this double-sends, but I am having mail client issues.

--
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480

"Is the room still a room when its empty? Does the room, the thing itself have purpose? Or do we, what's the word... imbue it."
- Jubal Early, Firefly
--
Bert Gunter
Genentech Nonclinical Biostatistics
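For readers who land here looking for the strsplit answer rather than the keystroke one, Bert's two steps look like this when run. The name chars is made up for illustration.

```r
x <- "ThisIsaString"

# Splitting on the empty string yields one element per character,
# wrapped in a list with one entry per input string.
y <- strsplit(x, "")

# unlist() flattens that list into a plain character vector.
chars <- unlist(y)

print(chars)
```

Note that this splits a string that has already been read; it does not address reading a single keystroke without an end of line, which is what Jon asked about.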
Re: [R] How to read only specified columns from a data file
I think you need to read an introduction to R. For starters, read.table returns its results as a value, which you are not saving.

The probable answer to your question: read the whole file with read.table, and select the columns you need, e.g.:

    tab <- read.table(myfile, skip=2)[,1:5]

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Luis Ridao
Sent: Tuesday, March 15, 2011 11:53 AM
To: r-help@r-project.org
Subject: [R] How to read only specified columns from a data file

R-help,

I'm trying to read a data file with plenty of columns. I just need the first 5 but it does not work by doing something like:

    mycols <- rep("NULL", 430) ; mycols[c(1:4)] <- NA
    read.table(myfile, skip=2, colClasses=mycols)

Any suggestions?

Thanks in advance
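As an alternative to reading everything and subsetting, read.table can skip columns while reading: an entry of "NULL" (the character string) in colClasses drops that column, and NA leaves the type to be guessed. A minimal sketch, assuming a small made-up file with two header lines and 8 columns standing in for the 430-column original; path, n_cols, keep and classes are hypothetical names.

```r
# Build a small stand-in file: two lines to skip, then 8 columns of data.
path <- tempfile(fileext = ".txt")
writeLines(c("header line 1",
             "header line 2",
             "1 2 3 4 5 6 7 8",
             "9 10 11 12 13 14 15 16"), path)

n_cols <- 8   # 430 in the original question
keep   <- 5   # number of leading columns wanted

# NA = read and guess the type, "NULL" = skip the column entirely.
classes <- c(rep(NA, keep), rep("NULL", n_cols - keep))

# The result has to be assigned, otherwise it is only printed.
tab <- read.table(path, skip = 2, colClasses = classes)
print(tab)

unlink(path)
```

Both points from the thread show up here: the index range has to cover all five wanted columns, and the return value of read.table has to be saved.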
Re: [R] Does R have a const object?
Cheer up! R is a step closer to that concept than the old FORTRAN compilers that couldn't even guarantee that 37 was a constant if used repeatedly in a subroutine call.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Uwe Ligges
Sent: Tuesday, March 15, 2011 2:23 PM
To: xiagao1982
Cc: r-help
Subject: Re: [R] Does R have a const object?

On 15.03.2011 15:53, xiagao1982 wrote:

Hi, all,

Does R have a "const object" concept like the one in the C++ language? I want to set some data frames as constant to avoid being modified unintentionally. Thanks!

Although there is almost never a "No" in R, the best short answer is: No.

Best,
Uwe Ligges

xiagao1982
2011-03-15
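Uwe's short answer stands, but base R does offer a partial safeguard that may cover the "avoid modifying unintentionally" case: lockBinding() makes assignment to a given name an error until it is unlocked. This is only a sketch with a made-up data frame df; it is not a const in the C++ sense, since anyone can call unlockBinding() and copies of the object are freely modifiable.

```r
df <- data.frame(id = 1:3, value = c(2.5, 3.5, 4.5))

# Lock the name 'df' in the current environment.
lockBinding("df", environment())

# Any attempt to modify df through assignment now fails.
res <- try(df$value[1] <- 0, silent = TRUE)
print(inherits(res, "try-error"))
print(df$value[1])

# Deliberate changes require unlocking first.
unlockBinding("df", environment())
df$value[1] <- 0
```

The lock protects the binding, not the data: a function that receives df as an argument works on its own copy and can change that copy as it likes.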
Re: [R] *Building* a covariance matrix efficiently
Tjerk,

This is just a pseudo code outline of what you need to do:

    M = matrix(0, number of variables, number of variables)
    V = rep(0, number of variables)
    N = 0
    While (more observations to read) {
        X <- next observation
        V <- V + X
        M <- M + outer(X,X)
        N <- N+1
    }
    Compute covariance matrix from elements of V, M, and N

You just need to refer to the formula defining covariance. Outlook seems to think all my variables should be upper case.

HTH
Rex

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Tsjerk Wassenaar
Sent: Monday, March 14, 2011 10:14 AM
To: R-help
Subject: [R] *Building* a covariance matrix efficiently

deaRs,

I want to build a covariance matrix out of the data from a binary file, that I can read in chunk by chunk, with each chunk containing a single observation vector X. I wonder how to do that most efficiently, avoiding the calculation of the full symmetric matrices XX'. The trivial non-optimal approach boils down to something like:

    Q <- matrix(rnorm(10),ncol=200)
    M <- matrix(0,ncol=200,nrow=200)
    for (i in 1:nrow(Q)) M <- M + tcrossprod(Q[i,])

I would appreciate pointers to help me fill this lacuna in my R skills :)

Cheers,
Tsjerk

--
Tsjerk A. Wassenaar, Ph.D.
post-doctoral researcher
Molecular Dynamics Group
* Groningen Institute for Biomolecular Research and Biotechnology
* Zernike Institute for Advanced Materials
University of Groningen
The Netherlands
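The pseudo code outline above translates fairly directly into R. In this sketch the "file" is simulated by the rows of a small random matrix Q (4 variables, 50 observations, both made up), and the final step uses the identity sum((x - m)(x - m)') = M - V V'/N, which is the "formula defining covariance" rearranged in terms of the running sums.

```r
set.seed(1)
Q <- matrix(rnorm(50 * 4), ncol = 4)   # stand-in for the binary file
p <- ncol(Q)

M <- matrix(0, p, p)   # running sum of outer products X X'
V <- numeric(p)        # running sum of X
N <- 0                 # number of observations seen

for (i in seq_len(nrow(Q))) {
  X <- Q[i, ]                 # "next observation"
  V <- V + X
  M <- M + tcrossprod(X)      # same as outer(X, X)
  N <- N + 1
}

# Centered sum of squares and products, divided by N - 1.
covmat <- (M - tcrossprod(V) / N) / (N - 1)

print(all.equal(covmat, cov(Q)))
```

One caveat worth knowing: subtracting V V'/N from M can lose precision when the means are large relative to the spread, so for such data it may be safer to subtract a rough guess of the mean from each X before accumulating.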
Re: [R] *Building* a covariance matrix efficiently
Tsjerk,

It seems to me that memory and not time is your big efficiency problem, and I've shown you how to avoid storing your entire input. If you want to avoid doing each multiplication twice, you can replace the outer with a function that computes each product only once and accumulates sums of those products:

    iii = matrix(c(rep(1:200,200), rep(1:200,each=200)), ncol=2)
    iii = iii[iii[,1] <= iii[,2], ]

and at each step

    v2 = v2 + X[iii[,1]] * X[iii[,2]]

instead of M. I would imagine that the internal cov does this anyway, as you are not the first person to notice this symmetry, so I'm not sure of the point of this exercise.

PS: I actually know how to spell and pronounce Tsjerk, but Tsj is not a very familiar pattern for my fingers.

From: Tsjerk Wassenaar [mailto:tsje...@gmail.com]
Sent: Monday, March 14, 2011 1:41 PM
To: Dwyer Rex USRE
Subject: Re: RE: [R] *Building* a covariance matrix efficiently

Hi Rex,

Thanks for the reply. But it doesn't solve the issues of redundant calculations due to symmetry, both in the outer product and in the summation.

Cheers,
Tsjerk (correct spelling, really)

On Mar 14, 2011 5:44 PM, rex.dw...@syngenta.com wrote:

Tjerk,

This is just a pseudo code outline of what you need to do:

    M = matrix(0, number of variables, number of variables)
    V = rep(0, number of variables)
    N = 0
    While (more observations to read) {
        X <- next observation
        V <- V + X
        M <- M + outer(X,X)
        N <- N+1
    }
    Compute covariance matrix from elements of V, M, and N

You just need to refer to the formula defining covariance. Outlook seems to think all my variables should be upper case.

HTH
Rex

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org...
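Here is the index-pair idea worked through on a small case, so that the bookkeeping can be checked against cov(). It is a sketch with 4 variables instead of 200 and a simulated input matrix Q; the names v, v2 and n follow the message above, while the step that unpacks v2 back into a full symmetric matrix is an addition for illustration.

```r
set.seed(2)
p <- 4
Q <- matrix(rnorm(30 * p), ncol = p)   # stand-in for the chunked input

# All (i, j) index pairs, then only those with i <= j.
iii <- matrix(c(rep(1:p, p), rep(1:p, each = p)), ncol = 2)
iii <- iii[iii[, 1] <= iii[, 2], ]

v  <- numeric(p)            # running sum of X
v2 <- numeric(nrow(iii))    # running sums of X[i] * X[j], i <= j only
n  <- 0

for (r in seq_len(nrow(Q))) {
  X  <- Q[r, ]
  v  <- v + X
  v2 <- v2 + X[iii[, 1]] * X[iii[, 2]]
  n  <- n + 1
}

# Unpack the p(p+1)/2 sums into a symmetric matrix.
M <- matrix(0, p, p)
M[iii] <- v2
M[lower.tri(M)] <- t(M)[lower.tri(M)]

covmat <- (M - tcrossprod(v) / n) / (n - 1)
print(all.equal(covmat, cov(Q)))
```

Whether this is actually faster than tcrossprod() is an open question: the indexing is done in R while tcrossprod() runs in compiled code, so it is worth timing both before committing to either.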
Re: [R] minimum distance between line segments
I like Thomas's idea as a quick practical solution. Here is one more little variation just in case you really do have millions of these distances. Pick point P1 on line segment L1 (e.g., an endpoint). Pick 101 evenly spaced points on line segment L2. Find the nearest to P1 and call it P2. Now go back to L1 and pick a new P1. Alternate until the distance stops dropping. I think it is probably a theorem that three iterations suffice. So, you could get by with 303 distance calculations instead of 10,201.

This might also be interesting to some because it defines a function that returns a function.

    # returns a function mapping any a in [0,1] to a point between p1 and p2.
    line = function(p1,p2) function(a) (1-a)*p1 + a*p2
    line1 = line(c(pi,12),c(7,-3))          # one line
    line2 = line(c(0.1,5),c(3,7*sqrt(2)))   # another line
    print(line1(1))     # one endpoint
    print(line1(0))     # the other
    print(line1(0.5))   # midpoint
    d2 = function(p1,p2) sum((p1-p2)^2)
    # parameter a for the point on some.line nearest to some.point.
    nearest = function(some.point,some.line) {
        seq = (0:100)/100
        return(seq[which.min(sapply(seq, function(a) d2(some.point,some.line(a))))])
    }
    plot.seg = function (some.line,...) lines(rbind(some.line(0),some.line(1)),...)

    a = 0
    b = nearest(line1(a),line2)
    a = nearest(line2(b),line1)
    b = nearest(line1(a),line2)
    plot(c(-5,15),c(-5,15),asp=1,type="n",main=sqrt(d2(line1(a),line2(b))))
    plot.seg(line1,col="black")
    plot.seg(line2,col="blue")
    plot.seg(line(line1(a),line2(b)),col="red")

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Thomas Lumley
Sent: Thursday, March 10, 2011 2:54 PM
To: Mike Marchywka
Cc: r-help@r-project.org; darcy.web...@gmail.com
Subject: Re: [R] minimum distance between line segments

On Fri, Mar 11, 2011 at 2:46 AM, Mike Marchywka marchy...@hotmail.com wrote:

Date: Wed, 9 Mar 2011 10:55:46 +1300
From: darcy.web...@gmail.com
To: r-help@r-project.org
Subject: [R] minimum distance between line segments

Dear R helpers,

I think that this may be a bit of a math question as the more I consider it, the harder it seems. I am trying to come up with a way to work out the minimum distance between line segments. For instance, consider 20 random line segments:

    x1 <- runif(20)
    y1 <- runif(20)
    x2 <- runif(20)
    y2 <- runif(20)
    plot(x1, y1, type = "n")
    segments(x1, y1, x2, y2)

Initially I thought the solution to this problem was to work out the distance between midpoints (it quickly became apparent that this is totally wrong when looking at the plot). So, I thought that perhaps finding the minimum distance between each of the lines' endpoints AND their midpoints would be a good proxy for this, so I set up a loop that uses Pythagoras to work out these 9 distances and find the minimum. But, this solution is obviously flawed as well (sometimes lines actually intersect, sometimes the minimum distances are less etc). Any help/direction on this one would be much appreciated.

There are two possibilities: If the segments cross, the minimum distance is where they cross, obviously. If they don't cross, the minimum distance is from one of the four endpoints to the closest point on the other segment. The closest point on the other segment is either the nearest endpoint of the other segment or the closest point on the infinite line that extends the other segment. That gives a small set of possibilities to work with.

If you're not doing this for millions of segments and you don't need very high accuracy, however, taking lots of points from each segment and computing pairwise distances by brute force is likely to be easier:

    peri <- function(xstart,ystart,xend,yend){
        line1x <- seq(xstart[1],xend[1],length=98)
        line1y <- seq(ystart[1],yend[1],length=98)
        line2x <- seq(xstart[2],xend[2],length=100)
        line2y <- seq(ystart[2],yend[2],length=100)
        distsq <- outer(1:98,1:100, function(i,j) (line1x[i]-line2x[j])^2+(line1y[i]-line2y[j])^2)
        closest <- which(distsq==min(distsq),arr.ind=TRUE)
        rbind(c(line1x[closest[1]],line1y[closest[1]]),c(line2x[closest[2]],line2y[closest[2]]))
    }

-thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
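Thomas's "two possibilities" also lead to an exact calculation with no grid of points. The sketch below is one way to write it in two dimensions; the function names (point_seg_dist, cross2, segments_cross, seg_seg_dist) are made up for illustration and nothing here comes from a package. Segments are given by their endpoints as numeric vectors of length 2.

```r
# Distance from point p to the segment with endpoints a and b.
point_seg_dist <- function(p, a, b) {
  ab   <- b - a
  len2 <- sum(ab^2)
  # Position of the closest point on the infinite line, clamped to [0, 1]
  # so that it falls back to the nearest endpoint when outside the segment.
  s <- if (len2 == 0) 0 else sum((p - a) * ab) / len2
  s <- min(1, max(0, s))
  sqrt(sum((p - (a + s * ab))^2))
}

# z-component of the 2-D cross product.
cross2 <- function(u, v) u[1] * v[2] - u[2] * v[1]

# TRUE when the two segments cross at a point interior to both.
segments_cross <- function(p1, p2, q1, q2) {
  d1 <- cross2(p2 - p1, q1 - p1)
  d2 <- cross2(p2 - p1, q2 - p1)
  d3 <- cross2(q2 - q1, p1 - q1)
  d4 <- cross2(q2 - q1, p2 - q1)
  (d1 * d2 < 0) && (d3 * d4 < 0)
}

# Minimum distance between segment p1-p2 and segment q1-q2.
seg_seg_dist <- function(p1, p2, q1, q2) {
  if (segments_cross(p1, p2, q1, q2)) return(0)
  min(point_seg_dist(p1, q1, q2), point_seg_dist(p2, q1, q2),
      point_seg_dist(q1, p1, p2), point_seg_dist(q2, p1, p2))
}

# Two parallel unit segments one unit apart, then two crossing diagonals.
print(seg_seg_dist(c(0, 0), c(1, 0), c(0, 1), c(1, 1)))
print(seg_seg_dist(c(0, 0), c(1, 1), c(0, 1), c(1, 0)))
```

Segments that merely touch, or that overlap along the same line, are not caught by the crossing test, but in those cases an endpoint of one lies on the other, so the endpoint distances return 0 anyway.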
Re: [R] minimum distance between line segments
I think I need to retract the part about 3 iterations... not true if, e.g., the segments intersect and the angle is small.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of rex.dw...@syngenta.com
Sent: Friday, March 11, 2011 2:37 PM
To: tlum...@uw.edu; marchy...@hotmail.com
Cc: r-help@r-project.org; darcy.web...@gmail.com
Subject: Re: [R] minimum distance between line segments

I like Thomas's idea as a quick practical solution. Here is one more little variation just in case you really do have millions of these distances. Pick point P1 on line segment L1 (e.g., an endpoint). Pick 101 evenly spaced points on line segment L2. Find the nearest to P1 and call it P2. Now go back to L1 and pick a new P1. Alternate until the distance stops dropping. I think it is probably a theorem that three iterations suffice. So, you could get by with 303 distance calculations instead of 10,201.
Re: [R] using lapply
But no one answered Kushan's question about performance implications of for-loop vs lapply. With apologies to George Orwell: "for-loops BAAD, no loops GOOD."

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Uwe Ligges
Sent: Thursday, March 10, 2011 4:38 AM
To: Arun Kumar Saha
Cc: r-help@r-project.org
Subject: Re: [R] using lapply

On 10.03.2011 08:30, Arun Kumar Saha wrote:

On reply to the post http://r.789695.n4.nabble.com/using-lapply-td3345268.html

Hmmm, can you please reply to the original post and quote it? Your mail was not recognized to be in the same thread as the message of the original poster (and hence I wasted time to answer it again).

Thanks,
Uwe Ligges

Dear Kushan, this may be a good start:

    ## assuming 'instr.list' is your list object and you are applying
    ## my.strat() function on each element of that list, you can use
    ## lapply function as
    lapply(instr.list, function(x) return(my.strat(x)))

Here the result will again be a list, with the same length as your original list 'instr.list'. If instead the object returned by my.strat() is a single number, then you might want to create a vector rather than a list; in that case just use 'sapply'.

HTH
Arun
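For Kushan's actual question, the easiest way to get an answer for a particular my.strat() is to write both versions and time them. The sketch below shows the two forms side by side with a made-up list and a stand-in my.strat() (here just mean(), an assumption for illustration). It makes no claim about which is faster; the main thing to get right in the loop version is allocating the result once instead of growing it.

```r
# Hypothetical stand-ins for the objects named in the thread.
instr.list <- list(a = 1:10, b = c(2, 4, 6), c = 5)
my.strat   <- function(x) mean(x)

# for-loop version: allocate the result list once, then fill it.
res_for <- vector("list", length(instr.list))
names(res_for) <- names(instr.list)
for (i in seq_along(instr.list)) {
  res_for[[i]] <- my.strat(instr.list[[i]])
}

# lapply version: same result, no explicit bookkeeping.
res_lapply <- lapply(instr.list, my.strat)

# sapply simplifies to a vector when every result is a single number.
res_sapply <- sapply(instr.list, my.strat)

print(identical(res_for, res_lapply))
print(res_sapply)
```

Wrapping each version in system.time() with a realistically sized list shows whether the difference matters; often the cost of my.strat() itself dominates either way.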
Re: [R] Extracting only odd columns from a matrix
Or, if X1 Y1 X2 Y2 ... are really your column names:

    m[, grep("X", colnames(m))]

or

    m[, grepl("X", colnames(m))]

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Dimitris Rizopoulos
Sent: Wednesday, March 09, 2011 9:36 AM
To: Nixon, Matthew
Cc: r-help@R-project.org
Subject: Re: [R] Extracting only odd columns from a matrix

one way is using the seq(), e.g., say 'm' is your matrix, then try:

    m[, seq(1, ncol(m), by = 2)]

I hope it helps.

Best,
Dimitris

On 3/9/2011 3:20 PM, Nixon, Matthew wrote:

Hi,

This might seem like a simple question but at the moment I am stuck for ideas. The columns of my matrix in which some data is stored are of this form:

    X1 Y1 X2 Y2 X3 Y3 ... Xn Yn

with n~100. I would like to look at just the X values (i.e. odd column numbers). Is there an easy way to loop round extracting only these columns?

Any help would be appreciated. Thank you.

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/
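Both suggestions can be tried on a toy matrix with the same naming scheme. This is a sketch with a made-up 2 x 6 matrix; the "^X" anchor is a small tightening of the pattern (an addition, not from the thread) so that a name merely containing an X somewhere is not picked up by accident.

```r
m <- matrix(1:12, nrow = 2,
            dimnames = list(NULL, c("X1", "Y1", "X2", "Y2", "X3", "Y3")))

# By position: every second column starting from the first.
by_pos <- m[, seq(1, ncol(m), by = 2)]

# By name: columns whose name starts with "X".
by_name <- m[, grepl("^X", colnames(m))]

print(by_pos)
print(identical(by_pos, by_name))
```

If only one column can match, add drop = FALSE inside the brackets to keep the result a matrix rather than a plain vector.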
Re: [R] Complex sampling?
It sounds like you want a bunch of random permutations of 1:7. Try

order(runif(7))

If you need, say, 10 of them:

as.vector(sapply(1:10, function(i) order(runif(7))))

Is it more complicated than that?

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Hosack, Michael
Sent: Wednesday, March 09, 2011 1:02 PM
To: r-help@R-project.org
Subject: [R] Complex sampling?

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Hosack, Michael
Sent: Wednesday, March 09, 2011 7:34 AM
To: r-help at R-project.org
Subject: [R] Complex sampling?

R users,

I am trying to generate a randomized weekday survey schedule that ensures even coverage of weekdays in the sample, where the distribution of variable DOW is random with respect to WEEK. To accomplish this I need to randomly sample without replacement two weekdays per week for each of 27 weeks (only 5 are shown). This seems simple enough, sampling without replacement. However, I need to sample from a sequence (3:7) that needs to be completely depleted and replenished until the final selection is made. Here is an example of what I want to do, beginning at WEEK 1. I would prefer to do this without using a loop, if possible.

sample frame: [3,4,5,6,7] -- [4,5,6] -- [4],[1,2,3,(4),5,6] -- [1,2,4,5,6] -- for each WEEK in dataframe

OK, now you have me completely lost. Sorry, but I have no clue as to what you just did here. It looks like you are trying to describe some transformation/algorithm but I don't follow it.

I could not reply to this email because it had not been delivered to my inbox, so I had to copy it from the forum. I apologize for the confusion; this would take less than a minute to explain in conversation but an hour to explain well in print. Two DOW_NUMs will be selected randomly without replacement from the vector 3:7 for each WEEK. When this vector is reduced to a single integer, that integer will be selected and the vector will be restored, and a single integer will then be selected that differs from the prior selected integer (i.e. cannot sample the same day twice in the same week). This process will be repeated until two DOW_NUM have been assigned for each WEEK. That process is what I attempted to illustrate in my original message. This is beyond my current coding capabilities.

Randomly sample 2 DOW_NUM without replacement from each WEEK ( () = no two identical DOW_NUM can be sampled in the same WEEK)

sample = {3,7}, {5,6}, {4,3}, {1,5}, -- for each WEEK in dataframe

So, are you sampling from [3,4,5,6,7], or [1,2,4,5,6], or ...? Can you show an 'example' of what you would like to end up with given your data below?

Thank you,
Mike

   DATE       DOW DOW_NUM WEEK
2  2011-05-02 Mon 3       1
3  2011-05-03 Tue 4       1
4  2011-05-04 Wed 5       1
5  2011-05-05 Thu 6       1
6  2011-05-06 Fri 7       1
9  2011-05-09 Mon 3       2
10 2011-05-10 Tue 4       2
11 2011-05-11 Wed 5       2
12 2011-05-12 Thu 6       2
13 2011-05-13 Fri 7       2
16 2011-05-16 Mon 3       3
17 2011-05-17 Tue 4       3
18 2011-05-18 Wed 5       3
19 2011-05-19 Thu 6       3
20 2011-05-20 Fri 7       3
23 2011-05-23 Mon 3       4
24 2011-05-24 Tue 4       4
25 2011-05-25 Wed 5       4
26 2011-05-26 Thu 6       4
27 2011-05-27 Fri 7       4
30 2011-05-30 Mon 3       5
31 2011-05-31 Tue 4       5
32 2011-06-01 Wed 5       5
33 2011-06-02 Thu 6       5
34 2011-06-03 Fri 7       5

DF <- structure(list(DATE = structure(c(15096, 15097, 15098, 15099,
15100, 15103, 15104, 15105, 15106, 15107, 15110, 15111, 15112,
15113, 15114, 15117, 15118, 15119, 15120, 15121, 15124, 15125,
15126, 15127, 15128), class = "Date"), DOW = c("Mon", "Tue", "Wed",
"Thu", "Fri", "Mon", "Tue", "Wed", "Thu", "Fri", "Mon", "Tue", "Wed",
"Thu", "Fri", "Mon", "Tue", "Wed", "Thu", "Fri", "Mon", "Tue", "Wed",
"Thu", "Fri"), DOW_NUM = c(3, 4, 5, 6, 7, 3, 4, 5, 6, 7, 3, 4, 5, 6,
7, 3, 4, 5, 6, 7, 3, 4, 5, 6, 7), WEEK = c(1, 1, 1, 1, 1, 2, 2, 2,
2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5)), .Names = c("DATE",
"DOW", "DOW_NUM", "WEEK"), row.names = c(2L, 3L, 4L, 5L, 6L, 9L, 10L,
11L, 12L, 13L, 16L, 17L, 18L, 19L, 20L, 23L, 24L, 25L, 26L, 27L, 30L,
31L, 32L, 33L, 34L), class = "data.frame")

Dan

Daniel Nordlund
Bothell, WA USA
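One way to read the scheme described in this thread is as a pool of weekdays that is shuffled, used up one draw at a time, and reshuffled only when it is empty, with a day skipped if it was already chosen in the current week. The following is a minimal sketch under that reading, not a solution confirmed by anyone in the thread. `sample_schedule` and its arguments are hypothetical names, and it uses a loop even though the poster hoped to avoid one.

```r
# Hypothetical helper (not from the thread): pick `per_week` distinct days for
# each week from a pool that is fully used up before it is refilled.
sample_schedule <- function(n_weeks, days = 3:7, per_week = 2) {
  stopifnot(per_week <= length(days))
  pool <- sample(days)                       # shuffled pool for the first cycle
  out <- matrix(NA_integer_, nrow = n_weeks, ncol = per_week)
  for (w in seq_len(n_weeks)) {
    picked <- integer(0)
    while (length(picked) < per_week) {
      if (length(pool) == 0) pool <- sample(days)   # replenish when depleted
      k <- which(!(pool %in% picked))[1]     # first pooled day not yet used this week
      picked <- c(picked, pool[k])
      pool <- pool[-k]
    }
    out[w, ] <- picked
  }
  out
}

set.seed(42)                    # arbitrary seed, only for repeatability
sched <- sample_schedule(27)    # 27 weeks, two DOW_NUM values per week
head(sched)
```

Each row of `sched` holds the two DOW_NUM values for one week; because the pool is emptied before it is refilled, every weekday is used once per cycle of five draws.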
Re: [R] rowSums - am I getting something wrong?
Hi Thomas,

Several of us explained this in different ways just last week, so you might search the archive. Floating point numbers are an approximate representation of real numbers. Most values that can be expressed exactly in powers of 10 (0.1, 0.3 and 0.6, for instance) can't be expressed exactly in powers of 2. So the sum 0.6+0.3+0.1 is NOT clearly 1.0. You can use signif and round to overcome this.

> a = seq(0,1,0.1)
> a
 [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> a[7]-0.6
[1] 1.110223e-16
> 1-(a[4]+a[7]+a[2])
[1] -2.220446e-16
> b = rev(seq(1,0,-0.1))
> b
 [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> a-b
 [1] 0.000000e+00 2.775558e-17 5.551115e-17 1.110223e-16 1.110223e-16
 [6] 0.000000e+00 1.110223e-16 1.110223e-16 0.000000e+00 0.000000e+00
[11] 0.000000e+00
> round(a-b,10)
 [1] 0 0 0 0 0 0 0 0 0 0 0
> round(a,10)-round(b,10)
 [1] 0 0 0 0 0 0 0 0 0 0 0

The first commandment of floating point programming is "THOU SHALT NOT TEST WHETHER TWO FP NUMBERS ARE EQUAL".

HTH
Rex

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of thomas.salve...@syngenta.com
Sent: Monday, March 07, 2011 2:09 AM
To: r-help@r-project.org
Subject: [R] rowSums - am I getting something wrong?

I am trying to construct a data set with some sequences, for example:

a = seq(0,1,0.1)
m = matrix(nrow = 1331, ncol = 3)
m[,1] = rep(a,121)
m[,2] = rep(a,11,each = 11)
m[,3] = rep(a,1,each = 121)

I realize that there may be better ways of doing this, but this approach demonstrates the problem I'm having. I then want to get the sum of the rows and delete any row with a sum greater than 1. But I have a problem with rows containing any combination of the values 0.6, 0.3 and 0.1: the sum of these is clearly 1, but a request for which rows have a sum greater than 1 will return rows with these values. Row 161 is the first row containing these values:

[161,] 0.6 0.3 0.1

> which(rowSums(m) > 1)
[53] 119 120 121 132 142 143 152 153 154 161 162

As far as I can tell this only affects combinations of 0.6, 0.3 and 0.1 (though I haven't checked every value in the matrix). If I try the following:

> q=rowSums(m)
> which(q > 1)
[53] 119 120 121 132 142 143 152 153 154 161 162

But if I add and subtract 1 from this:

> q=q+1
> q=q-1
> which(q > 1)
[53] 119 120 121 132 142 143 152 153 154 162

What exactly is going on here? I don't have the problem with other combinations (e.g. 0.7, 0.2, 0.1). I assume that there is something about the data format that I don't understand, but if I make a data frame of the matrix I find the same effect.

Any help would be great

Tom
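Rex's advice can be applied to the original filter by comparing against 1 plus a small tolerance rather than against exactly 1. This is only a sketch: the tolerance `tol` is an assumed value, chosen to be far larger than rounding error and far smaller than the 0.1 grid step, and `over` and `keep` are names introduced here for illustration.

```r
a <- seq(0, 1, 0.1)
m <- cbind(rep(a, 121), rep(a, 11, each = 11), rep(a, each = 121))

tol <- 1e-9                               # assumed tolerance, not from the thread
over <- which(rowSums(m) > 1 + tol)       # rows whose sum genuinely exceeds 1
keep <- m[-over, , drop = FALSE]          # rows with sum <= 1, up to tolerance

161 %in% over                             # is row (0.6, 0.3, 0.1) still flagged?
```

An alternative with no tolerance at all would be to build the grid in integer tenths (0:10), filter on the exact integer row sums, and divide by 10 afterwards.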
Re: [R] generate 3 distinct random samples without replacement
Cesar,

I think your basic misconception is that you believe 'sample' returns a list of indices into the original vector. It does not; it returns actual elements of the vector:

> sample(runif(100),3)
[1] 0.4492988 0.0336069 0.6948440

I'm not sure why you keep resetting the seed, but if it's important, replace

d2 <- d1[-i]

with

d2 <- setdiff(d1,i)

Otherwise Duncan's suggestion is much nicer:

s = sample(d1,300,replace=FALSE)
s1 = sort(s[1:100])
s2 = sort(s[101:200])
s3 = sort(s[201:300])

If what you actually need are indices into the original vector, replace d1 with length(d1). (When you say 'distinct', I'm assuming you mean 'disjoint'.)

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Duncan Murdoch
Sent: Monday, March 07, 2011 3:52 PM
To: Cesar Hincapié
Cc: r-help@r-project.org
Subject: Re: [R] generate 3 distinct random samples without replacement

On 07/03/2011 2:17 PM, Cesar Hincapié wrote:

Hello:

I wonder if I could get a little help with random sampling in R. I have a vector of length 7375. I would like to draw 3 distinct random samples, each of length 100, without replacement. I have tried the following:

d1 <- 1:7375
set.seed(7)
i <- sample(d1, 100, replace=F)
s1 <- sort(d1[i])
s1
d2 <- d1[-i]
set.seed(77)
j <- sample(d2, 100, replace=F)
s2 <- sort(d2[j])
s2
d3 <- d2[-j]
set.seed(777)
k <- sample(d3, 100, replace=F)
s3 <- sort(d3[k])
s3
D <- data.frame(a=s1,b=s2,c=s3)

However, s2 is only 97 elements long, and s3 only 96 long. I would appreciate any suggestions on a better approach. I'm also curious to know why my second and third samples are less than 100 elements in length.

If you want 3 non-overlapping, non-repeating samples of 100, why not draw one sample of 300, and take 3 subsets of it?

The reason you were finding shorter samples is that you were using j and k as indices into vectors d2 and d3 that didn't have enough elements, and then you sorted the result, losing the NAs. For example,

d2 <- 1:10
d2[10:12]
sort(d2[10:12])

See ?sort for an explanation of how to keep NA values when you sort.

Duncan Murdoch

Thanks for your time and consideration,

Cesar A. Hincapié, DC, MHSc
Research Fellow, Division of Health Care and Outcomes Research, Toronto Western Research Institute
PhD Candidate in Epidemiology, Dalla Lana School of Public Health, University of Toronto
e. cesar.hinca...@utoronto.ca
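Duncan's suggestion, written out as a small script together with the NA-dropping behaviour of `sort()` that explains the short samples. The seed value is arbitrary and only there to make the run repeatable.

```r
set.seed(7)                        # arbitrary seed for reproducibility
d1 <- 1:7375
s <- sample(d1, 300, replace = FALSE)

# Three disjoint samples of 100, each sorted
D <- data.frame(a = sort(s[1:100]),
                b = sort(s[101:200]),
                c = sort(s[201:300]))

# sort() silently drops NA by default, which is how the short samples arose
d2 <- 1:10
d2[10:12]                          # 10 NA NA
sort(d2[10:12])                    # 10
sort(d2[10:12], na.last = TRUE)    # 10 NA NA
```

Because all 300 values come from a single draw without replacement, the three columns cannot share an element.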
Re: [R] questions about using loop, while and next
Carrie,

If your while-loop condition depends only on dt, and you don't change dt in your loop, your loop won't terminate. The only thing inside your loop is next. Perhaps you mean to write:

temp=rep(NA, 10)
for(i in 1:10) {
  dt=sum(rbinom(10, 5, 0.5))
  while (dt<25) {
    dt=sum(rbinom(10, 5, 0.5))
  }
  temp[i]=dt
}

It doesn't look like you understand next. Try reading the help with ?"next" -- the quotes are necessary in this case. If you still don't understand next, you should be able to program without it with appropriate if's.

HTH
Rex

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Carrie Li
Sent: Friday, March 04, 2011 12:10 AM
To: r-help@r-project.org
Subject: [R] questions about using loop, while and next

Hello R helpers,

I have a quick question about loop and next. In my loop, I have some random generation of data, but if the data doesn't meet some condition, then I want it to go to next and generate data again for the next round.

# just an example..
# i want to generate the data again, if the sum is smaller than 25
temp=rep(NA, 10)
for(i in 1:10) {
  dt=sum(rbinom(10, 5, 0.5))
  while (dt<25) next
  temp[i]=dt
}

I also tried

while(dt<25) {i=i+1}

But it doesn't seem right to me, since it runs nonstop. Any solutions?

Thanks for your help!

Carrie
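The same redraw-until-accepted idea can also be written with `repeat` and `break`, so that the draw appears only once in the code. This is an equivalent sketch, not what was posted in the thread; the seed is arbitrary.

```r
set.seed(1)                              # arbitrary seed for reproducibility
temp <- rep(NA_integer_, 10)
for (i in seq_along(temp)) {
  repeat {
    dt <- sum(rbinom(10, 5, 0.5))        # fresh draw on every pass
    if (dt >= 25) break                  # accept only sums of at least 25
  }
  temp[i] <- dt
}
temp
```

The loop is guaranteed to make progress because `dt` is regenerated inside it, which is exactly what the original `while (dt<25) next` never did.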
Re: [R] R usage survey
You still don't say what organization you are associated with. Your domain name and e-mail address give no hint. How do we know that Harsh Singhal is even a real person? An e-mail address at a university (for example) would go a long way to establish that. Gmail doesn't cut it for me. The preponderance of evidence is that you're just a naïve person who would give your own information to anyone who asked. On the other hand, it's possible that you are conducting industrial espionage by recording IP addresses and associating use cases with companies. In my opinion, the onus is on you to show your bona fides, and you haven't done it. That's all I have to say...

From: Harsh [mailto:singhal...@gmail.com]
Sent: Friday, March 04, 2011 4:19 AM
To: bill.venab...@csiro.au
Cc: Dwyer Rex USRE; r-help@r-project.org
Subject: Re: [R] R usage survey

The R usage survey http://goo.gl/jw1ig has been updated with the following changes:

Addition of
- Disclaimer: This data will not be used for any commercial purposes
- Do not include any personally identifiable information
- Contact: Harsh Singhal (singhalblr AT gmail DOT com) for any queries

Removal of
- Name field

My primary purpose in conducting this survey is:
- Find multiple use cases for various R packages
- Understand the nature of work when R is being used in Academia / Commercial settings
- The kind of technologies that are being used in conjunction with R (popularity of usage of Python with R, and what purpose using Python serves)

The outcome of this analysis will be published on my blog (in the process of being created). There is absolutely no commercial purpose behind collecting this information and, as earlier stated, this information will not be shared with personally identifiable information.

Thank you once again Mr. Dwyer and Mr. Venables for raising very important questions. I thank the R users who have already filled in the survey http://goo.gl/jw1ig and request more to do so.

Regards,
Harsh Singhal

On Fri, Mar 4, 2011 at 7:41 AM, bill.venab...@csiro.au wrote:

No. That's not answering the question. ALL surveys are for collecting information. The substantive issue is what purpose do you have in seeking this information in the first place and what are you going to do with it when you get it? Do you have some commercial purpose in mind? If so, what is it?

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Harsh
Sent: Friday, 4 March 2011 1:13 AM
To: rex.dw...@syngenta.com
Cc: r-help@r-project.org
Subject: Re: [R] R usage survey

Hi Rex and useRs,

The purpose of the survey has been mentioned on the survey link http://goo.gl/jw1ig but I will also reproduce it here.
- Geographical distribution of R users
- Application areas where R is being used
- Supporting technology being used along with R
- Academic background distribution of R users

The potential personally identifiable information such as name and employer name are optional fields. Actually all the fields in the survey are optional.

Some of the analysis output(s) could be along the lines of:
- Usage statistics of various R packages
- Distribution of R users across countries/cities
- Mapping various applications to packages
- Text mining of the responses to create informative word clouds

Personally, I am excited about the kind of data I will receive through this survey and the various insights that could be derived. As already mentioned, the results will be shared with the community.

Thank you Rex for raising an important point. It is indeed necessary for me to personally assure the user community that the results will be shared in a manner that will not contain any personally identifiable information. Those who wish to gain access to the raw data will be provided with all the fields but not the name and employer name fields.

Just out of curiosity: It is possible to get name, employer name, location, usage information and academic background details when searching for R users on LinkedIn and the many R related groups there. Does this also provide potential opportunities for misuse and outrageous analyses, since almost anyone can get onto LinkedIn and access user profiles?

Thank you for your interest and support.

Regards,
Harsh

On Thu, Mar 3, 2011 at 8:02 PM, rex.dw...@syngenta.com wrote:

Harsh,

Suitably analyzed for whose purposes? One man's suitable is another's outrageous. That's why people want to see the gowns at the Oscars. Under what auspices are you conducting this survey? What do you intend to do with it? You don't give any assurance that the results you post won't have personally identifiable information. I don't get the impression that you know much about survey design.
Re: [R] R usage survey
Harsh, not to worry, but you were wrong to assert that I engaged in any name calling, let alone constant name calling. I also didn't and don't claim to be an authority on survey design.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Harsh
Sent: Friday, March 04, 2011 4:13 PM
To: Ista Zahn
Cc: r-help@r-project.org
Subject: Re: [R] R usage survey

Rex,

Please accept my apologies for my inappropriate and utterly juvenile remarks. I got carried away by what I thought was criticism and was quick to respond in a scathing manner. I do accept and apologize for my inability in understanding what was essentially being asked of me. Thanks to Ista and other members for clarifying what I failed to understand. I'm now aware that I must submit to appropriately answering questions from potential respondents of the survey.

I must reiterate that "This survey is not sponsored or approved by any organization or company. The purpose of the survey is to satisfy my personal curiosity regarding R usage patterns. Results will be posted to a publicly available weblog; the data will not be used for any other purpose." (Thanks Ista for wording this out. I couldn't have done it better.)

Regards,
Harsh Singhal
http://in.linkedin.com/in/harshsinghal

On Sat, Mar 5, 2011 at 2:11 AM, Ista Zahn iz...@psych.rochester.edu wrote:

On Fri, Mar 4, 2011 at 3:20 PM, Harsh singhal...@gmail.com wrote:

Hi Ista, Spencer and Greg,

<snip>

The information being collected is purely out of personal interest and I have mentioned this earlier.

No, I don't think you did actually. This is the key thing we wanted to know up-front, and it's a shame that it took the better part of the day before we finally understood why you are conducting the survey.

There is no commercial interest involved. Is it possible that I am interested in this sort of information to better understand R's usage patterns? In doing so, the survey I am conducting would seem an appropriate way for my requirements. And how does belittling someone on a mailing list help? If anyone wants the kind of information I am collecting, are there suggestions of better ways of finding it besides the method that I have adopted? Sure, I could scrape the data off LinkedIn pages, or find other ways of doing it, but I found this suitable.

On Sat, Mar 5, 2011 at 1:27 AM, Spencer Graves spencer.gra...@structuremonitoring.com wrote:

Most surveys done in the US today are done during election season, to determine how to package candidates to attract votes. Officials elected under such circumstances spend half their time in office servicing the bribes that they accepted to pay for the surveys and the resulting advertising (and the other half soliciting more bribes, er, contributions for their next campaign). The best reference on this I know is Thomas Ferguson (1995) Golden Rule (U. Chicago Pr.). It's by now somewhat old but is still cited by leading researchers.

People have a right to be cautious of surveys, because too rarely today are surveys used for legitimate scientific purposes. Most often, they are used to defraud the public into doing things that are contrary to their best interests.

Spencer Graves

On 3/4/2011 11:37 AM, Ista Zahn wrote:

Now hold on a second Harsh! I was fairly neutral up to this point, but this response is totally uncalled for. The problem is that despite repeated requests you never clarified the purpose of your research! That is all you were asked to do, but rather than responding to this inquiry in a straightforward and honest manner you kept dodging the question. The most charitable explanation is that you just don't understand what information you were being asked to provide, which is frustrating but understandable; your last response on the other hand is completely out of line. Research participants have a right to know the purpose for which their data is being collected, and as a researcher you have a responsibility to tell them.

Rex, thank you for generating this discussion. When I first saw Harsh's original email I was just getting ready to fill out the survey. When I saw your response I delayed. Boy am I glad I did!

Best,
Ista

On Fri, Mar 4, 2011 at 2:20 PM, Harsh singhal...@gmail.com wrote:

Rex,

You're just paranoid and I'm in no way answerable to you. Your constant name calling presupposes your own naivete. The survey has a disclaimer and those who wish to respond can do so at their own discretion. Judging by the nature (and number) of respondents, there seem to be a lot of highly qualified people who have no qualms about sharing information regarding their R usage patterns. You can believe what you want and can continue to spin your imaginative tales of industrial espionage while assuming a
Re: [R] Developing a web crawler
Perl seems like a 10x better choice for the task, but try looking at the examples in ?strsplit to get started.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of antujsrv
Sent: Thursday, March 03, 2011 4:23 AM
To: r-help@r-project.org
Subject: [R] Developing a web crawler

Hi,

I wish to develop a web crawler in R. I have been using the functionalities available under the RCurl package. I am able to extract the html content of the site but I don't know how to go about analyzing the html formatted document. I wish to know the frequency of a word in the document. I am only acquainted with analyzing data sets, so how should I go about analyzing data that is not available in table format?

A few chunks of code that I wrote:

w <- getURL("http://www.amazon.com/Kindle-Wireless-Reader-Wifi-Graphite/dp/B003DZ1Y8Q/ref=dp_reviewsanchor#FullQuotes")
write.table(w, "test.txt")
t <- readLines(w)

readLines also didn't prove to be of any help. Any help would be highly appreciated. Thanks in advance.

--
View this message in context: http://r.789695.n4.nabble.com/Developing-a-web-crawler-tp3332993p3332993.html
Sent from the R help mailing list archive at Nabble.com.
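To make the ?strsplit hint concrete, here is a rough sketch of counting word frequencies in an HTML string using only base R. The `html` string is a made-up stand-in for what getURL() would return, and the regular expression that strips tags is deliberately crude (it ignores scripts, entities and malformed markup); a proper HTML parser would be more robust for real pages.

```r
# Made-up stand-in for the page content returned by getURL()
html <- "<html><body><p>Kindle is light. The Kindle screen is sharp.</p></body></html>"

text  <- gsub("<[^>]+>", " ", html)                    # crude: drop anything that looks like a tag
words <- tolower(unlist(strsplit(text, "[^A-Za-z]+"))) # split on runs of non-letters
words <- words[nzchar(words)]                          # discard empty strings from the split

freq <- sort(table(words), decreasing = TRUE)          # word counts, most frequent first
as.integer(freq["kindle"])                             # count for one word of interest
```

`table()` does the counting; everything before it is just turning the page into a character vector of words.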
Re: [R] Greek character and R
mytitle = parse(text=paste("expression(paste(delta^13,'C Station ',", i, "))"))
title(mytitle)

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Filoche
Sent: Thursday, March 03, 2011 8:16 AM
To: r-help@r-project.org
Subject: [R] Greek character and R

Dear R users.

In a loop, I set the title of my graph with:

mytitle = expression(paste(delta^13,'C Station ', i))
title(mytitle)

However, instead of using the value of i, it will literally use the character i. Does anyone know a way to concatenate the value of i to the mathematical expression?

With regards,
Phil

--
View this message in context: http://r.789695.n4.nabble.com/Greek-character-and-R-tp304p304.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] R usage survey
Harsh,

Suitably analyzed for whose purposes? One man's suitable is another's outrageous. That's why people want to see the gowns at the Oscars. Under what auspices are you conducting this survey? What do you intend to do with it? You don't give any assurance that the results you post won't have personally identifiable information. I don't get the impression that you know much about survey design.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Harsh
Sent: Thursday, March 03, 2011 5:53 AM
To: r-help@r-project.org
Subject: [R] R usage survey

Hi R users,

I request members of the R community to consider filling a short survey regarding the use of R. The survey can be found at http://goo.gl/jw1ig

Please accept my apologies for posting here for a non-technical reason. The data collected will be suitably analyzed and I'll post a link to the results in the coming weeks. Thank you all for your interest and for sharing your R usage information.

Regards,
Harsh Singhal
Re: [R] Greek character and R
Eval it. This works at my house:

plot(0)
title(eval(parse(text=paste("expression(paste(delta^13,'C Station ',", i, "))"))))

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Filoche
Sent: Thursday, March 03, 2011 9:39 AM
To: r-help@r-project.org
Subject: Re: [R] Greek character and R

Hi and thank you for the answer. However, it's not working. It will print expression(d13C Station 1).

Thanks for any help,
Phil

--
View this message in context: http://r.789695.n4.nabble.com/Greek-character-and-R-tp304p467.html
Sent from the R help mailing list archive at Nabble.com.
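An alternative that avoids parse/eval altogether, offered as a sketch rather than as anything posted in this thread: `bquote()` substitutes the current value of `i` wherever `.(i)` appears in a plotmath expression. `make_title` is a hypothetical helper name.

```r
# Hypothetical helper: build a plotmath title with the value of i spliced in
make_title <- function(i) bquote(delta^13 * "C Station " * .(i))

# One unevaluated title per station; each can be passed straight to title()
titles <- lapply(1:3, make_title)
titles[[2]]
```

Inside the plotting loop this would be used as `title(make_title(i))` after the call to `plot()`. In plotmath, `*` juxtaposes its operands without drawing a multiplication sign.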
Re: [R] R usage survey
Just out of curiosity: It is possible to get name, employer name, location, usage information and academic background details when searching for R users on LinkedIn and the many R related groups there. Does this also provide potential opportunities for misuse and outrageous analyses, since almost anyone can get onto LinkedIn and access user profiles?

[Dwyer Rex USRE] That's a no-brainer: YES!

On Thu, Mar 3, 2011 at 8:02 PM, rex.dw...@syngenta.com wrote:

Harsh,

Suitably analyzed for whose purposes? One man's suitable is another's outrageous. That's why people want to see the gowns at the Oscars. Under what auspices are you conducting this survey? What do you intend to do with it? You don't give any assurance that the results you post won't have personally identifiable information. I don't get the impression that you know much about survey design.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Harsh
Sent: Thursday, March 03, 2011 5:53 AM
To: r-help@r-project.org
Subject: [R] R usage survey

Hi R users,

I request members of the R community to consider filling a short survey regarding the use of R. The survey can be found at http://goo.gl/jw1ig

Please accept my apologies for posting here for a non-technical reason. The data collected will be suitably analyzed and I'll post a link to the results in the coming weeks. Thank you all for your interest and for sharing your R usage information.

Regards,
Harsh Singhal
Re: [R] Floating points and floor() ?
Hi Michael,

In floating point calculation, 1.0-.9 is not exactly 0.1. This is easily seen by subtracting.

> (1.0-.9)-0.1
[1] -2.775558e-17
> (1.0-.9)==0.1
[1] FALSE

David is right, you can't correct this. You can only compensate by taking care that you never, ever test whether 2 FP numbers are equal, because they almost never are. You must always ask whether the difference is small.

> round(1.0-.9-.1,15)==0
[1] TRUE

Unfortunately, most of us forget this rule once in a while and write a loop like

while (x!=0)...

that won't terminate.

HTH
Rex

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Folkes, Michael
Sent: Thursday, March 03, 2011 9:24 PM
To: r-help@r-project.org
Subject: [R] Floating points and floor() ?

Perhaps somebody could clarify for me if the following is a floating point matter or otherwise, and how am I to correct for it?

> floor(100*.1)
[1] 10
> 100*(1.0-.9)
[1] 10
> floor(100*(1-0.9))
[1] 9

Thanks!
Michael

_______________________
Michael Folkes
Salmon Stock Assessment
Canadian Dept. of Fisheries & Oceans
Pacific Biological Station
3190 Hammond Bay Rd.
Nanaimo, B.C., Canada V9T-6N7
Ph (250) 756-7264
Fax (250) 756-7053
michael.fol...@dfo-mpo.gc.ca
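Applied to the floor() question, "compensating" could look like the sketch below: round to a fixed number of decimals before taking the floor, and compare with a tolerance instead of `==`. The choice of 10 decimals is an assumption that suits values of this magnitude, not a general rule.

```r
x <- 100 * (1 - 0.9)

floor(x)                            # 9, because x is slightly below 10
floor(round(x, 10))                 # 10 once the rounding error is removed

(1 - 0.9) == 0.1                    # FALSE: exact comparison fails
isTRUE(all.equal(1 - 0.9, 0.1))     # TRUE: equal up to a small tolerance
```

`all.equal()` returns either TRUE or a description of the difference, which is why it is wrapped in `isTRUE()` when used as a condition.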
Re: [R] inefficient ifelse() ?
Hi Ivo,

It might be useful for you to study the examples below. The key from a programming language point of view is that functions like ifelse are functions of whole vectors, not elements of vectors. You either evaluate an argument or you don't; you don't evaluate only part of an argument. (Somebody correct me if I'm wrong.) As you can see from the examples, if there are no TRUEs or no FALSEs in the condition, the corresponding arms are not evaluated, but if there are some of each, both must be evaluated. This is a property of the entire condition vector. You can see all this if you type ifelse (not ?ifelse, just ifelse) and look at the definition.

If you want to operate on elements of vectors, you need to use subsetting, e.g.:

    s = rep(NA,length(t)); b=t%%2==0; s[b]=g(t[b]); s[!b]=f(t[!b])

I agree that it might be counterintuitive for a beginner, but so is 0! = 0^0 = 1, and both follow from first principles (e.g. n! = n(n-1)!). Counterintuitive is not the same as incorrect, and correct is not the same as efficient. :)

HTH
Rex

    > t = 1:30
    > ifelse(t%%2==0,g(t),f(t))
    g for 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
    f for 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
     [1]  2  6  6 12 10 18 14 24 18 30 22 36 26 42 30 48 34 54 38 60 42 66 46 72 50
    [26] 78 54 84 58 90

    > t = 2*(1:30)
    > ifelse(t%%2==0,g(t),f(t))
    g for 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60
     [1]   6  12  18  24  30  36  42  48  54  60  66  72  78  84  90  96 102 108 114
    [20] 120 126 132 138 144 150 156 162 168 174 180

    > t = 2*(1:30)+1
    > ifelse(t%%2==0,g(t),f(t))
    f for 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61
     [1]   6  10  14  18  22  26  30  34  38  42  46  50  54  58  62  66  70  74  78
    [20]  82  86  90  94  98 102 106 110 114 118 122

    > t = rep(c(1,2,NA),3)
    > ifelse(t%%2==0,g(t),f(t))
    g for 1 2 NA 1 2 NA 1 2 NA
    f for 1 2 NA 1 2 NA 1 2 NA
    [1]  2  6 NA  2  6 NA  2  6 NA

    > t = rep(NA,10)
    > ifelse(t%%2==0,g(t),f(t))
     [1] NA NA NA NA NA NA NA NA NA NA

    > t=1:30
    > ifelse(c(TRUE,FALSE,FALSE,TRUE),g(t),f(t))
    g for 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
    f for 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
    [1]  3  4  6 12

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of ivo welch
Sent: Tuesday, March 01, 2011 5:20 PM
To: William Dunlap
Cc: r-help
Subject: Re: [R] inefficient ifelse() ?

yikes. you are asking me too much. thanks everybody for the information. I learned something new. my suggestion would be for the much smarter language designers (than I) to offer us more or less blissfully ignorant users another vector-related construct in R.
It could perhaps be named %if% %else%, analogous to if else (with naming inspired by %in%, and with evaluation only of relevant parts [just as if else for scalars]), with different outcomes in some cases, but with the advantage of typically evaluating only half as many conditions as the ifelse() vector construct. %if% %else% may work only in a subset of cases, but when it does work, it would be nice to have. it would probably be my first goto function, with ifelse() use only as a fallback. of course, I now know how to fix my specific issue. I was just surprised that my first choice, ifelse(), was not as optimized as I had thought.

best,
/iaw

On Tue, Mar 1, 2011 at 5:13 PM, William Dunlap wdun...@tibco.com wrote:

> An ifelse-like function that only evaluated what was needed would be fine, but it would have to be different from ifelse itself. The trick is to come up with a good parameterization. E.g., how would it deal with things like
>     ifelse(is.na(x), mean(x, na.rm=TRUE), x)
> or
>     ifelse(x>1, log(x), runif(length(x),-1,0))
> or
>     ifelse(x>1, log(x), -seq_along(x))
> Would it reject such things? Deciding that the x in mean(x,na.rm=TRUE) should be replaced by x[is.na(x)] would be wrong. Deciding that runif(length(x)) should be replaced by runif(sum(x>1)) seems a bit much to expect. Replacing seq_along(x) with seq_len(sum(x>1)) is wrong. It would be better to parameterize the new function so it wouldn't have to think about those cases. Would you want it to depend only on a logical vector or perhaps also on a factor (a vectorized switch/case function)?
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of ivo welch
> Sent: Tuesday, March 01, 2011 12:36 PM
> To: Henrique Dallazuanna
> Cc: r-help
> Subject: Re: [R] inefficient ifelse() ?
>
> thanks, Henrique. did you mean
>     as.vector(t(mapply(function(x, f)f(x), split(t, ((t %% 2)==0)), list(f, g)))) ?
> otherwise, you get a matrix. it's a good solution, but unfortunately I don't think this can be used to redefine
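The subsetting idiom from Rex's reply earlier in this thread can be sketched as follows. The definitions of f and g are not shown anywhere in the thread; the versions here are assumptions reverse-engineered from the printed output (f doubles, g triples, each announcing its argument). Using which() is my own addition so that NA conditions are skipped rather than causing an indexing error.

```r
# Hypothetical stand-ins for the f and g used in the thread (assumed, not from the source)
f <- function(x) { cat("f for", x, "\n"); 2 * x }
g <- function(x) { cat("g for", x, "\n"); 3 * x }

t <- 1:10
b <- t %% 2 == 0                 # condition, computed once for the whole vector

s <- rep(NA_real_, length(t))
yes <- which(b)                  # positions where the condition is TRUE
no  <- which(!b)                 # positions where it is FALSE (NA positions are in neither)
s[yes] <- g(t[yes])              # g only sees the elements it is needed for
s[no]  <- f(t[no])               # likewise for f
print(s)
```

Unlike ifelse(t %% 2 == 0, g(t), f(t)), each function is called on half of the vector only, which is the saving ivo was asking about.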
Re: [R] clustering problem
Don't you expect it to be a lot faster if you cluster 20 items instead of 25000?

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Maxim
Sent: Wednesday, March 02, 2011 4:08 PM
To: r-help@r-project.org
Subject: [R] clustering problem

Hi,

I have a gene expression experiment with 20 samples and 25000 genes each. I'd like to perform clustering on these. It turned out to be much faster when I transform the underlying matrix with t(matrix). Unfortunately, I am then no longer able to use cutree to access individual clusters. In general I do something like this:

    hc <- hclust(dist(USArrests), "ave")
    library(RColorBrewer)
    library(gplots)
    clrno=3
    cols<-rainbow(clrno, alpha = 1)
    clstrs <- cutree(hc, k=clrno)
    ccols <- cols[as.vector(clstrs)]
    heatcol<-colorRampPalette(c(3,1,2), bias = 1.0)(32)
    heatmap.2(as.matrix(USArrests), Rowv=as.dendrogram(hc),col=heatcol, trace="none",RowSideColors=ccols)

Nice, I can access 3 main clusters with cutree. But what about a situation when I perform hclust like

    hc <- hclust(dist(t(USArrests)), "ave")

which I have to do in order to speed up the clustering process. This I can plot with:

    heatmap.2(as.matrix(USArrests), Colv=as.dendrogram(hc),col=heatcol, trace="none")

But where do I find information about the clustering that was applied to the rows? cutree(hc, k=clrno) delivers the clustering on the columns, so what can I do to access the levels for the rows? I guess the solution is easy, but after hours of playing around I thought it might be a good time to contact the mailing list!

Maxim
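To make Rex's point concrete: dist() works on rows, so clustering t(USArrests) clusters the 4 columns, not the 50 rows, and cutree() on that tree can only return column clusters. A minimal sketch, assuming base R only and using one tree per dimension (the object names are illustrative):

```r
# One tree per dimension: dist() always measures distances between rows
hc_rows <- hclust(dist(USArrests), "ave")      # 50 states
hc_cols <- hclust(dist(t(USArrests)), "ave")   # 4 variables; fast because there are only 4 items

row_clusters <- cutree(hc_rows, k = 3)   # cluster label for each row
col_clusters <- cutree(hc_cols, k = 2)   # cluster label for each column

print(table(row_clusters))
print(col_clusters)
```

Row cluster labels are only available from a tree built on the rows, so the speed gained by transposing comes from answering a different question.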
Re: [R] merge( , by='row.names') slowness
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of dms
Sent: Wednesday, March 02, 2011 3:16 PM
To: r-help@r-project.org
Subject: [R] merge( , by='row.names') slowness

I noticed that when joining two data.frames in R with the merge function, using by='row.names' slows things down substantially compared to just joining on a common index column. Using a dataframe size of ~10,000 rows: it's as slow as 10 minutes in the by='row.names' case versus merely 1 second using an index column. Beyond the 10^6 range, it's unusably slow.

    n <- 5
    a <- data.frame(id=as.character(1:10^n), x=rnorm(10^n)); rownames(a) <- a$id
    b <- data.frame(id=as.character(1:10^n + 10^(n-1)), y=rnorm(10^n)); rownames(b) <- b$id
    date()
    fast <- merge(a, b, all=T)
    date()
    slow <- merge(a, b, all=T, by='row.names')
    date()

Has anybody else noticed this?

_____

Hi DMS,

Well, first off, they don't give the same answer... in fact, not even the same dimension. Even so, from looking at merge.data.frame, it's not immediately obvious what would make a difference of this magnitude. The answer might be buried in the internal merge. Here for n=3:

    > system.time(print(dim(merge(a,b,all=T))))
    [1] 1100    3
       user  system elapsed
       0.01    0.00    0.01
    > system.time(print(dim(merge(a,b,all=T,by=1))))
    [1] 1100    3
       user  system elapsed
       0.01    0.00    0.02
    > system.time(print(dim(merge(a,b,all=T,by=0))))
    [1] 1100    5
       user  system elapsed
       3.26    0.00    3.17
    > system.time(print(dim(merge(a,b,all=T,by="row.names"))))
    [1] 1100    5
       user  system elapsed
       3.17    0.00    3.17
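The "not even the same dimension" remark can be checked with a small self-contained version of the example for n = 3. This is only a sketch of the comparison, not a diagnosis of the slowdown; the names by_id and by_rn are illustrative, and timings will vary by machine and R version.

```r
n <- 3
a <- data.frame(id = as.character(1:10^n), x = rnorm(10^n),
                stringsAsFactors = FALSE)
rownames(a) <- a$id
b <- data.frame(id = as.character(1:10^n + 10^(n - 1)), y = rnorm(10^n),
                stringsAsFactors = FALSE)
rownames(b) <- b$id

# Join on the shared "id" column: id, x, y
by_id <- merge(a, b, all = TRUE)

# Join on row names: "id" is no longer a key, so it is kept from both
# inputs as id.x and id.y, next to the new Row.names column
by_rn <- merge(a, b, all = TRUE, by = "row.names")

print(dim(by_id))
print(dim(by_rn))
print(names(by_rn))
```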
Re: [R] Explained variance for ICA
You determine the variance explained by *any* unit vector by taking its inner product with the data points, then finding the variance of the results. In the case of FastICA, the variance explained by the ICs collectively is exactly the same as the variance explained by the principal components (collectively) from which they are derived.

HTH
Rex

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Pavel Goldstein
Sent: Tuesday, March 01, 2011 1:24 AM
To: r-help@r-project.org
Subject: [R] Explained variance for ICA

Hello,

I am thinking of using the FastICA package to cluster microarray data, but one question stops me: can I find out how much variance each component explains (or all components together)? I would be very thankful for any help.

Thanks,
Pavel
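A minimal sketch of the recipe in the reply above, using simulated data and base R only. The helper name explained_var is hypothetical, and the check is done against prcomp() rather than FastICA, since the principal components are the case where the expected answer is known.

```r
set.seed(1)
# Simulated data: 200 points in 3 dimensions with unequal spread
X <- matrix(rnorm(200 * 3), ncol = 3) %*% matrix(c(2, 1, 0, 0, 1, 0, 0, 0, 0.5), 3)

# Variance explained by a direction: project the data onto the unit vector,
# then take the variance of the projections
explained_var <- function(X, u) {
  u <- u / sqrt(sum(u^2))        # normalise, in case u is not a unit vector
  var(as.vector(X %*% u))
}

total_var <- sum(apply(X, 2, var))

pc <- prcomp(X)
by_pc <- apply(pc$rotation, 2, function(u) explained_var(X, u))

print(by_pc / total_var)         # share of variance along each principal direction
```

For directions that are not orthogonal to one another, the individual values need not add up to the total, so per-component shares should be read with that in mind.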
Re: [R] Is there any Command showing correlation of all variables in a dataset?
?cor answers that question. If Housing is a dataframe, cor(Housing) should do it. Surprisingly, ??correlation doesn't point you to ?cor.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of JoonGi
Sent: Tuesday, March 01, 2011 5:41 AM
To: r-help@r-project.org
Subject: [R] Is there any Command showing correlation of all variables in a dataset?

Thanks in advance. I want to derive correlations of variables in a dataset. Specifically:

    library(Ecdat)
    data(Housing)
    attach(Housing)
    cor(lotsize, bathrooms)

This code gives only the correlation between two variables, but I want to examine all the combinations of variables in this dataset, and I will finally make a table in LaTeX. How can I test correlations for all combinations of variables with one simple command?

--
View this message in context: http://r.789695.n4.nabble.com/Is-there-any-Command-showing-correlation-of-all-variables-in-a-dataset-tp3329599p3329599.html
Sent from the R help mailing list archive at Nabble.com.
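A short sketch of the one-command approach. It uses the built-in iris data instead of Ecdat's Housing so that it runs without extra packages; the column filter is my own addition, because cor() requires numeric input and will stop with an error if the data frame contains factor columns.

```r
df <- iris                                   # stand-in data set: 4 numeric columns, 1 factor

# Keep only the numeric columns; cor() refuses non-numeric input
num <- df[sapply(df, is.numeric)]

cm <- cor(num)                               # full matrix of pairwise correlations
print(round(cm, 2))
```

The resulting matrix can then be passed to a table-generating function of your choice for LaTeX output.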
Re: [R] Finding pairs with least magnitude difference from mean
No, that's not what I meant, but maybe I didn't understand the question. What I suggested would involve sorting y, not x: sort the *distances*. If you want to minimize the sd of a subset of numbers, you sort the numbers and find a subset that is clumped together. If the numbers are a function of pairs, you compute the function for all pairs of numbers, and find a subset that's clumped together. Anyway, it's an idea, not a theorem, so proof is left as an exercise for the esteemed reader.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Hans W Borchers
Sent: Monday, February 28, 2011 2:17 PM
To: r-h...@stat.math.ethz.ch
Subject: Re: [R] Finding pairs with least magnitude difference from mean

rex.dwyer at syngenta.com writes:

> James,
> It seems the 2*mean(x) term is irrelevant if you are seeking to minimize sd. Then you want to sort the distances from smallest to largest. Then it seems clear that your five values will be adjacent in the list, since if you have a set of five adjacent values, exchanging any of them for one further away in the list will increase the sd. The only problem I see with this is that you can't use a number more than once. In any case, you need to compute the best five pairs beginning at position i in the sorted list, for 1<=i<=choose(n,2), then take the max over all i. There's no R in my answer such as you'd notice, but I hope it helps just the same.
> Rex

You probably mean something like the following:

    x <- rnorm(10)
    y <- outer(x, x, "+") - (2 * mean(x))
    o <- order(x)
    sd(c(y[o[1],o[10]], y[o[2],o[9]], y[o[3],o[8]], y[o[4],o[7]], y[o[5],o[6]]))

This seems reasonable, though you would have to supply a more stringent argument. I did two tests and it works alright.

--Hans Werner
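Hans Werner's sorted pairing (smallest with largest, second smallest with second largest, and so on) can be written more compactly with matrix indexing. This is a sketch of that heuristic only; as both posters note, it is not proven to give the minimum, and the seed and object names are arbitrary.

```r
set.seed(42)
x <- rnorm(10)
y <- outer(x, x, "+") - 2 * mean(x)   # y[i, j] = x[i] + x[j] - 2 * mean(x)

o <- order(x)
pairs <- cbind(o[1:5], o[10:6])       # row k pairs the k-th smallest with the k-th largest

vals <- y[pairs]                      # a two-column matrix index picks one y[i, j] per row
print(pairs)
print(sd(vals))
```

Because every number is used exactly once, the five values always sum to zero, which is why minimising their sd is the same as minimising their sum of squares.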
Re: [R] help
Generally, you can save your Excel spreadsheet as comma-separated values, and then read it with the read.csv function: ?read.csv
Or save it as tab-separated values and use read.delim.
Then look at ?barplot

Possibly you would like to read the Intro to R on the CRAN website. Go to www.r-project.org, find Documentation in the menu on the left, and click on Manuals.

HTH
Rex

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Laura Clasemann
Sent: Monday, February 28, 2011 10:03 AM
To: r-help@r-project.org
Subject: [R] help

Hi,

I was wondering if anyone could provide me with help in entering the dataset below into R? I've been having a hard time trying to figure out how to assemble it into both a frequency table and a bar graph within R. I've been trying to reproduce in R the way I had the data arranged, as below, in an Excel spreadsheet. I am uncertain what the correct commands and exact techniques are for getting it correctly organized in R. Any help would be greatly appreciated! Thank you!

    Diet        Binger-Yes  Binger-No  Total
    None                24        134    158
    Healthy              9         52     61
    Unhealthy           23         72     95
    Dangerous           12         15     27

Laura
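For a table this small, typing the counts in directly is an alternative to the CSV route Rex describes. A minimal sketch in base R; the object names are illustrative, and the Total column is recomputed rather than entered.

```r
# Counts from the table in the question, one row per diet
binge <- matrix(c(24, 134,
                   9,  52,
                  23,  72,
                  12,  15),
                ncol = 2, byrow = TRUE,
                dimnames = list(Diet   = c("None", "Healthy", "Unhealthy", "Dangerous"),
                                Binger = c("Yes", "No")))
tab <- as.table(binge)

print(addmargins(tab, 2))        # frequency table with a row-total column ("Sum")

# Grouped bar graph: one group per diet, one bar per binger status
barplot(t(tab), beside = TRUE, legend.text = TRUE,
        xlab = "Diet", ylab = "Count")
```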
Re: [R] Error
I have to agree that it's pretty hard to take something that works and figure out why it doesn't work :) The only other suggestion is that sometimes I find that this sort of error goes away if I add drop=FALSE to the subsetting, and, if so, that usually lets me figure out why.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Duncan Murdoch
Sent: Sunday, February 27, 2011 5:52 AM
To: mathijsdevaan
Cc: r-help@r-project.org
Subject: Re: [R] Error

On 11-02-26 8:26 AM, mathijsdevaan wrote:

> Mean doesn't work either... I understand that the message "replacement has 0 items, need 37597770" implies that the function is not returning any values, but I don't understand why then this is not the case in the example.
>
>     DF = data.frame(read.table(textConnection("A B C D E
>     1 1 a 1999 1 0
>     2 1 b 1999 0 1
>     3 1 c 1999 0 1
>     4 1 d 1999 1 0
>     5 2 c 2001 1 0
>     6 2 d 2001 0 1
>     7 3 a 2004 0 1
>     8 3 b 2004 0 1
>     9 3 d 2004 0 1
>     10 4 b 2001 1 0
>     11 4 c 2001 1 0
>     12 4 d 2001 0 1"),head=TRUE,stringsAsFactors=FALSE))
>     DF = DF[order(DF$B,DF$C),]
>
>     #first option - works fine in example and my target data frame
>     DF$F = ave(DF$D,DF$B, FUN = function(x) cumsum(x)-x)
>     DF$G = ave(DF$E,DF$B, FUN = function(x) cumsum(x)-x)
>
>     #second option - works fine in example but not in my target data frame
>     foo <- function(x) { unlist(lapply(x, FUN = function(z) cumsum(z) - z)) }
>     n <- ave(DF[,c(4:5)],DF$B,FUN = foo)
>
> Why is this second option not working in my target data frame (which is much bigger than the example)?

Presumably something is different about it. I don't see how you expect people to debug your problem when you don't show it. If you want help, you need to give us an example that shows the error. Start with your large dataset, and shrink it as much as possible, but not so much that the error goes away. (I suspect when you do this, you'll end up seeing your error yourself. But maybe not.)

Duncan Murdoch
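What drop=FALSE changes, in a tiny example. This only illustrates the subsetting behaviour Rex refers to; whether it is the cause of the error in this thread is not established, and the data frame here is made up.

```r
DF <- data.frame(a = 1:3, b = 4:6)

as_vector <- DF[, 1]                 # a single column is silently dropped to a plain vector
as_frame  <- DF[, 1, drop = FALSE]   # drop = FALSE keeps it as a one-column data frame

print(class(as_vector))
print(class(as_frame))
print(dim(as_frame))
```

Code that later calls lapply(), ncol() or names() on the result behaves differently for the two forms, which is how a script can work for several columns and fail for one.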
Re: [R] Finding pairs with least magnitude difference from mean
James,

It seems the 2*mean(x) term is irrelevant if you are seeking to minimize sd. Then you want to sort the distances from smallest to largest. Then it seems clear that your five values will be adjacent in the list, since if you have a set of five adjacent values, exchanging any of them for one further away in the list will increase the sd. The only problem I see with this is that you can't use a number more than once. In any case, you need to compute the best five pairs beginning at position i in the sorted list, for 1<=i<=choose(n,2), then take the max over all i. There's no R in my answer such as you'd notice, but I hope it helps just the same.

Rex

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Hans W Borchers
Sent: Saturday, February 26, 2011 6:43 AM
To: r-h...@stat.math.ethz.ch
Subject: Re: [R] Finding pairs with least magnitude difference from mean

> I have what I think is some kind of linear programming question. Basically, what I want to figure out is if I have a vector of numbers,
>
>     x <- rnorm(10)
>     x
>      [1] -0.44305959 -0.26707077  0.07121266  0.44123714 -1.10323616 -0.19712807
>      [7]  0.20679494 -0.98629992  0.97191659 -0.77561593
>     mean(x)
>     [1] -0.2081249
>
> Using each number only once, I want to find the set of five pairs where the magnitude of the differences between the mean(x) and each pair's sum is least.
>
>     y <- outer(x, x, "+") - (2 * mean(x))
>
> With this matrix, if I put together a combination of pairs which uses each number only once, the sum of the corresponding numbers is 0. For example, compare the SD between this set of 5 pairs
>
>     sd(c(y[10,1], y[9,2], y[8,3], y[7,4], y[6,5]))
>     [1] 1.007960
>
> versus this hand-selected, possibly lowest SD combination of pairs
>
>     sd(c(y[3,1], y[6,2], y[10,4], y[9,5], y[8,7]))
>     [1] 0.2367030

Your selection is not bad, as only about 0.4% of all possible distinct combinations have a smaller value -- the minimum is 0.1770076, for example [10 7 9 5 8 4 6 2 3 1].

(1) combinat() from the 'combinations' package seems slow, try instead the permutations() function from 'e1071'.

(2) Yes, unless your vector gets much larger, in which case brute force is no longer feasible.

(3) This is not a linear programming task, but a combinatorial optimization task. You could try optim() with the SANN method, or some mixed-integer linear program (e.g., lpSolve, Rglpk, Rsymphony) by intelligently using binary variables to define the sets. This does not mean that some specialized approach might not be more appropriate.

--Hans Werner

> I believe that if I could test all the various five pair combinations, the combination with the lowest SD of values from the table would give me my answer. I believe I have 3 questions regarding my problem.
>
> 1) How can I find all the 5 pair combinations of my 10 numbers so that I can perform a brute force test of each set of combinations? I believe there are 45 different pairs (i.e. choose(10,2)). I found combinations from the {Combinations} package but I can't figure out how to get it to provide pairs.
>
> 2) Will my brute force strategy of testing the SD of each of these 5 pair combinations actually give me the answer I'm searching for?
>
> 3) Is there a better way of doing this? Probably something to do with real linear programming, rather than this method I've concocted.
>
> Thanks for any help you can provide regarding my question.
>
> Best regards,
> James
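For ten numbers the brute force James asks about in question 1 is small: there are 9*7*5*3*1 = 945 ways to split ten items into five unordered pairs. A sketch in base R, using the x values printed in his message; all_pairings is a hypothetical helper written for this illustration, not a function from any of the packages mentioned in the thread.

```r
x <- c(-0.44305959, -0.26707077, 0.07121266, 0.44123714, -1.10323616,
       -0.19712807, 0.20679494, -0.98629992, 0.97191659, -0.77561593)
y <- outer(x, x, "+") - 2 * mean(x)

# Enumerate every way to split idx into unordered pairs.
# Each result is a matrix with one pair per row.
all_pairings <- function(idx) {
  if (length(idx) == 2) return(list(matrix(idx, nrow = 1)))
  first <- idx[1]
  out <- list()
  for (partner in idx[-1]) {                       # choose a partner for the first item
    rest <- setdiff(idx, c(first, partner))
    for (m in all_pairings(rest)) {                # pair up the remaining items recursively
      out[[length(out) + 1]] <- rbind(c(first, partner), m)
    }
  }
  out
}

pairings <- all_pairings(1:10)
sds <- sapply(pairings, function(p) sd(y[p]))      # y[p] picks one entry per pair

print(length(pairings))
print(min(sds))
print(pairings[[which.min(sds)]])
```

The number of pairings grows very quickly with the vector length, which is the limitation Hans Werner points out in answer (2).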
Re: [R] Calculate probabilty
Are you clear about the question you are asking? Do you want to know whether there are 6 balls or at least 6 balls? (It sounds like at least.) Do you want to know whether there are at least 6 balls in the first box, or at least 6 balls in exactly one box, or at least 6 balls in at least one box?

This is the probability that there are exactly 6 balls in the first box:

    > dbinom(6,142,1/491)
    [1] 5.53366e-07

This is the probability that there are MORE THAN 6 balls in the first box (NOT at least 6):

    > 1-pbinom(6,142,1/491)
    [1] 2.272026e-08
    > sum(sapply(7:142, function(i) dbinom(i,142,1/491)))
    [1] 2.272026e-08
    > 1-sum(sapply(0:6, function(i) dbinom(i,142,1/491)))
    [1] 2.272026e-08

This is the probability that there are at least 6 balls in the first box:

    > 1-pbinom(5,142,1/491)
    [1] 5.760862e-07

You can get all this from ?dbinom, but it's pretty confusing that the argument n and the italic n in the details are totally different things: italic n = argument size. (Likewise, italic p = argument prob, not argument p.) Questions about more than one box are a little harder since the boxes are not independent.

HTH,
Rex

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Fabrice Tourre
Sent: Thursday, February 24, 2011 3:51 PM
To: r-help@r-project.org
Subject: [R] Calculate probabilty

Hi List,

I have a question about calculating a probability using R. There are 491 boxes and 142 balls. If the balls are put into the boxes at random, how do I calculate the probability that there are six or more in one box? I have tried:

    dbinom(6,142,1/491)
    1-pbinom(6,142,1/491)

But I think I am unclear about dbinom and pbinom. Thank you very much in advance.
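The relationships between the three quantities in Rex's reply can be checked directly. A minimal sketch with named arguments to sidestep the size/n confusion he mentions; the last line is my own addition, a union (Boole) bound that only gives an upper limit for the "at least one box" version of the question, not its exact value.

```r
size <- 142      # number of balls (trials)
prob <- 1 / 491  # chance that a given ball lands in the first box

p_exactly_6  <- dbinom(6, size = size, prob = prob)        # P(X == 6)
p_more_than6 <- 1 - pbinom(6, size = size, prob = prob)    # P(X > 6)
p_at_least_6 <- 1 - pbinom(5, size = size, prob = prob)    # P(X >= 6)

# "At least 6" splits into "exactly 6" plus "more than 6"
print(c(p_exactly_6, p_more_than6, p_at_least_6))

# Upper bound on P(some box holds at least 6 balls): the boxes are not
# independent, but the union bound does not need independence
upper_bound_any_box <- 491 * p_at_least_6
print(upper_bound_any_box)
```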
Re: [R] Error
Does it work for FUN=mean? If yes, you need to print out the results of f before you return them to find the anomalous value. BTW "Error" is not a very good subject line. I don't see many posts from people reporting how well things are going :)

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of mathijsdevaan
Sent: Friday, February 25, 2011 9:31 AM
To: r-help@r-project.org
Subject: [R] Error

Hi,

I am running the following script for a different (much larger) data frame:

    DF = data.frame(read.table(textConnection("A B C D E
    1 1 a 1999 1 0
    2 1 b 1999 0 1
    3 1 c 1999 0 1
    4 1 d 1999 1 0
    5 2 c 2001 1 0
    6 2 d 2001 0 1
    7 3 a 2004 0 1
    8 3 b 2004 0 1
    9 3 d 2004 0 1
    10 4 b 2001 1 0
    11 4 c 2001 1 0
    12 4 d 2001 0 1"),head=TRUE,stringsAsFactors=FALSE))
    DF<-DF[order(DF$B,DF$C),]  #order by developer_id and year
    f<- function(x) { unlist(lapply(x, FUN = function(z) cumsum(z) - z)) }
    DF<-cbind(DF[,c(1:3)],ave(DF[, c(4:5)],DF$B, FUN = f))

I get the following error:

    Error in `[<-.data.frame`(`*tmp*`, i, , value = integer(0)) :
      replacement has 0 items, need 37597770
    In addition: Warning message:
    In max(i) : no non-missing arguments to max; returning -Inf

The dimensions of the data frame are (5,108), so the last line of the script becomes:

    DF<-cbind(DF[,c(1:3)],ave(DF[, c(4:108)],DF$B, FUN = f))

Any idea how to solve this problem? Thanks!

--
View this message in context: http://r.789695.n4.nabble.com/Error-tp3324531p3324531.html
Sent from the R help mailing list archive at Nabble.com.
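One way to sidestep passing a whole data frame to ave() is to apply ave() to each column in turn, which is the form the follow-up in this thread reports as working. This is a sketch under that assumption, not a confirmed fix for the poster's large data set; the helper name prior_sum and the "_prior" column suffix are invented for the example.

```r
DF <- read.table(text = "A B C D E
1 1 a 1999 1 0
2 1 b 1999 0 1
3 1 c 1999 0 1
4 1 d 1999 1 0
5 2 c 2001 1 0
6 2 d 2001 0 1
7 3 a 2004 0 1
8 3 b 2004 0 1
9 3 d 2004 0 1
10 4 b 2001 1 0
11 4 c 2001 1 0
12 4 d 2001 0 1", header = TRUE, stringsAsFactors = FALSE)
DF <- DF[order(DF$B, DF$C), ]

# Running total of earlier rows within a group (cumulative sum minus the current value)
prior_sum <- function(x) cumsum(x) - x

cols <- c("D", "E")   # with the large data set this would be names(DF)[4:108]
DF[paste0(cols, "_prior")] <- lapply(DF[cols], function(col) ave(col, DF$B, FUN = prior_sum))

print(DF)
```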
Re: [R] Gaps in plotting temporal data.
If you're in a hurry, it's way easier than that:

    t <- c(1,2,3,7,8,9,11,12,13)
    x <- rnorm(length(t))
    new.t <- min(t):max(t)
    new.x <- NULL
    new.x[t-min(t)+1] <- x
    plot(new.t, new.x, type='l')

This wastes max(t)-min(t)-length(t)+1 vector entries, but presumably you won't be wasting a lot of real estate along the horizontal axis of a plot. If you do, you're going to need to fix that anyway.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Duncan Murdoch
Sent: Thursday, February 24, 2011 8:45 AM
To: Christos Delivorias
Cc: r-help@r-project.org
Subject: Re: [R] Gaps in plotting temporal data.

On 24/02/2011 7:38 AM, Christos Delivorias wrote:

> I'm trying to plot some temporal data that have some gaps in them. You can see the plot here: http://www.tiikoni.com/tis/view/?id=da222e2. The problem is that during the time gaps in the TS the line plot is interpolated over the gap and I don't want it to. I've tried interleaving the gaps with an NA flag, but there are around 1 data-points sorted from multiple files, that makes it difficult to add the NA flag manually. If it's not possible to define the behaviour of the plot() function, is there another plot I can use, e.g. zoo, that will allow me to not have the lines drawn between the gaps?

Any software is going to have the same problem you had: how do you define a gap? If the definition is something simple like "time difference greater than X", then it will be fairly easy: use diff() to find all the time differences in the sorted times, and wherever those exceed X, insert a new data point with an NA value. For example,

    t <- c(1,2,3,7,8,9,11,12,13)
    x <- rnorm(length(t))
    d <- diff(t)
    gap <- which(d > 1.5)
    if (length(gap)) {
      newT <- (t[gap] + t[gap+1])/2
      t <- c(t, newT)
      x <- c(x, rep(NA, length(newT)))
      o <- order(t)
      t <- t[o]
      x <- x[o]
    }
    plot(t, x, type='l')

Duncan Murdoch
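A slightly more explicit variant of Rex's shortcut, filling the full time grid with NA up front so that the gaps are visible in the data as well as in the plot. It assumes, as his version does, that the times are whole numbers on a regular grid; for irregular times Duncan's diff()-based approach is the one to use.

```r
t <- c(1, 2, 3, 7, 8, 9, 11, 12, 13)
x <- rnorm(length(t))

new.t <- min(t):max(t)                  # complete, regularly spaced time grid
new.x <- rep(NA_real_, length(new.t))   # start with "missing" everywhere
new.x[t - min(t) + 1] <- x              # drop the observed values into their slots

print(which(is.na(new.x)))              # grid positions that fall inside a gap

# lines() and plot(type = "l") break the line wherever a value is NA
plot(new.t, new.x, type = "l")
```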
Re: [R] Running code sequentially from separate scripts (but not functions)
You don't need to write functions to source files:

    source("code1.R")
    source("code2.R")
    source("code3.R")

When you source a file with a bunch of function definitions, the definitions are just assignment statements:

    f <- function (x) ...
    g <- function (x,y,z) ...

Did you think you would break your computer if you just tried this to see if it worked? :)

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Dimitri Liakhovitski
Sent: Thursday, February 24, 2011 10:22 AM
To: r-help
Subject: [R] Running code sequentially from separate scripts (but not functions)

Hello!

I am wondering if it's possible to run - in sequence - code that is stored in several R scripts. For example:

Script in the file code1.r contains the code:

    a = 3; b = 5; c = a + b

Script in the file code2.r contains the code:

    d = 10; e = d - c

Script in the file code3.r contains the code:

    result=e/a

I understand that I could write those 3 scripts as 3 functions and source them from another script. But maybe there is a way of having them run one by one as such?

Thanks a lot!

--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com
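A self-contained way to try this out: the sketch below writes the three scripts from the question into a temporary directory and sources them in order. The temporary directory and the loop are conveniences added for the demonstration; with real files, three plain source() calls as in the reply are all that is needed.

```r
# Create the three scripts from the question in a scratch directory
script_dir <- tempfile("scripts")
dir.create(script_dir)
writeLines("a = 3; b = 5; c = a + b", file.path(script_dir, "code1.r"))
writeLines("d = 10; e = d - c",       file.path(script_dir, "code2.r"))
writeLines("result = e/a",            file.path(script_dir, "code3.r"))

# Run them one after another; each script sees the variables
# created by the scripts sourced before it
for (script in c("code1.r", "code2.r", "code3.r")) {
  source(file.path(script_dir, script))
}

print(result)
```

By default source() evaluates in the global environment, which is why the later scripts can use a, c and e without anything being passed around.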
Re: [R] weighted Voronoi diagrams
One way to do Delaunay triangulations (the dual of the Dirichlet tessellation) is to map point (x,y) to point (x,y,x^2+y^2) (I think, it's been a while) and then find the convex hull of these points in 3 dimensions. You can do the Voronoi diagram of circles by mapping (x,y,r) to (x,y,x^2+y^2-r^2).

I would try assigning an r to each point so that the area of the circle (proportional to r^2) is proportional to the size of the subject tree. You will need to scale the r^2 values so that none of the polygons disappear. See papers by Aurenhammer and/or Edelsbrunner from around 1990. I've been out of the field for a long time, so there may be more recent stuff.

Hopefully, there is an R package for convex hulls in three dimensions. Possibly, there is someone in the CS department at UMn who does computational geometry who could assist you. Maybe someone else knows of an R package that will do exactly what you want.

HTH
Rex

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Tuomas Aakala
Sent: Thursday, February 24, 2011 10:35 AM
To: r-help@r-project.org
Subject: [R] weighted Voronoi diagrams

Dear R-users,

Does anyone know how to do weighted Voronoi diagrams (Dirichlet tessellation) in R? To be more specific, I have a set of coordinates for tree locations on a plot, and I'm looking for a way to do the tessellation so that the polygon size for each tree depends on the size of the subject tree, and the size of its neighbors. So, the location of the bisection between two trees would not necessarily be at the midpoint, but determined by the tree sizes.

I have looked through the options in the tripack and deldir packages, but editing the functions in those packages is beyond my skills.

Thanks,
Tuomas
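To make the weighting concrete, here is a small sketch of the cell-membership rule that, as far as I understand it, corresponds to the lifting (x,y,r) -> (x,y,x^2+y^2-r^2): a point belongs to the site with the smallest "power distance" (squared distance minus r^2). The function name `power_cell` and the two example trees are hypothetical; this only classifies points and does not construct the polygons.

```r
# Power (Laguerre) distance from point (px, py) to each weighted site:
# squared Euclidean distance minus the squared radius of the site.
# Returns the index of the site whose cell contains the point.
power_cell <- function(px, py, sx, sy, r) {
  which.min((px - sx)^2 + (py - sy)^2 - r^2)
}

# Two hypothetical trees: a small one at x = 0 and a large one at x = 10.
sx <- c(0, 10)
sy <- c(0, 0)
r  <- c(1, 5)

# The unweighted bisector would be x = 5; with these radii the boundary
# moves toward the small tree.
cell_at_4 <- power_cell(4, 0, sx, sy, r)
cell_at_3 <- power_cell(3, 0, sx, sy, r)
c(cell_at_4, cell_at_3)
```

Evaluating `power_cell` over a fine grid and colouring by the result gives a quick picture of the weighted cells before committing to a proper construction.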
Re: [R] How to find points of intersection
How is the curve defined? If the curve is y=f(x) and the line is y=mx+b, you look for the roots of f(x)-mx-b.

    ?polyroot
    ?uniroot

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of FMH
Sent: Tuesday, February 22, 2011 6:28 AM
To: r-help@r-project.org
Subject: [R] How to find points of intersection

Dear All,

I'm looking for an appropriate way in R to compute/estimate points of intersection between a line and a curve, and would really appreciate any suggestions or ideas.

Thank you,
Fir
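A short sketch of the root-finding idea, assuming a made-up curve f(x) = x^2 and the line y = x + 2 (neither comes from the question). uniroot() needs an interval on which f(x)-mx-b changes sign; polyroot() applies only when the difference is a polynomial.

```r
f <- function(x) x^2                   # hypothetical curve
m <- 1; b <- 2                         # hypothetical line y = x + 2
g <- function(x) f(x) - m * x - b      # zero exactly at the intersections

# g(0) < 0 and g(5) > 0, so there is a root in [0, 5].
root <- uniroot(g, c(0, 5))$root

# polyroot() takes coefficients in increasing order: -2 - x + x^2.
roots <- sort(Re(polyroot(c(-2, -1, 1))))

root
roots
```

The y coordinate of each intersection is then m * root + b. uniroot() returns one root per interval, so a curve with several crossings needs one bracketing interval per crossing.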
Re: [R] Discrepancies in run times
My surmise would be that you have not analyzed the situation correctly, and you are making a false assumption about your code. Since you can't show the code, it's pretty hard to figure out what that is. I think you're going to have to produce a simple example that you can share that has the same behavior. My guess is that you will answer your own question as you try to do that.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Sébastien Bihorel
Sent: Tuesday, February 22, 2011 12:16 PM
To: r-help@r-project.org
Subject: [R] Discrepancies in run times

Dear R-users,

I am in the process of creating new custom functions and am quite puzzled by some discrepancies in execution time when I run some R scripts that call those new functions. So here is the situation:

- let's assume I have created two custom functions, called myg and myf;
- myg is mostly a plotting function, which makes heavy use of grid and lattice functions;
- myf is a function that massages data, opens and closes graph devices, and passes the data to myg:
  * myf contains loops and sub-loops which subset the data into the little pieces necessary for plotting purposes;
  * the innermost loop in myf contains two calls to myg, one in section A of the code, one in section B of the code;
  * both sections can be turned on and off based upon an input of the myf function;
  * both sections pass the same data to myg, except for some graph settings;
  * all graph devices opened in section A are closed before section B starts, and all graph devices opened in section B are closed before the next iteration of the inner loop.

Running a script passing a particular set of data to myf and turning on both sections A and B takes around 9 minutes (~3 combined minutes for Section A, ~6 combined minutes for Section B). The results of R CMD Rprof indicate that most of the execution time is used by print (see extract below).
    % total   total seconds   % self   self seconds   name
       99.4          545.84      0.0           0.00   myf
       95.2          522.70      0.0           0.06   myg
       90.7          498.20      0.0           0.02   standardGeneric
       90.6          497.70      0.0           0.00   print
       90.5          497.32      5.3          29.06   printFunction
       90.5          497.32      0.0           0.00   print.trellis
       62.7          344.58     62.6         343.96   lattice.setStatus
    ...

    % self   self seconds   % total   total seconds   name
      62.6         343.96      62.7          344.58   lattice.setStatus
       5.6          30.86       7.6           41.60   .Call.graphics
       5.3          29.06      90.5          497.32   printFunction
       3.5          19.12       3.7           20.22   .Call
       2.3          12.52       2.3           12.52   $
       1.3           7.18       1.9           10.42   match
       1.3           6.98       1.3            6.98   dev.off
    ...

Running another script passing the same set of data to myf and turning section A on and section B off takes around 3 minutes. The results of R CMD Rprof also indicate that most of the execution time is used by print (see extract below).

    % total   total seconds   % self   self seconds   name
       98.1          177.16      0.0           0.00   myf
       93.3          168.40      0.0           0.00   myg
       85.0          153.50      0.1           0.10   standardGeneric
       84.7          152.94      0.0           0.02   print
       84.6          152.74      0.0           0.00   print.trellis
       84.6          152.72      4.5           8.04   printFunction
       51.3           92.66     51.3          92.58   lattice.setStatus
    ...

    % self   self seconds   % total   total seconds   name
      51.3          92.58      51.3           92.66   lattice.setStatus
       8.5          15.34      10.7           19.36   .Call.graphics
       6.6          11.96       6.9           12.50   .Call
       4.5           8.04      84.6          152.72   printFunction
       3.4           6.14       3.4            6.14   $
       2.1           3.72       3.0            5.40   match
       0.8           1.46       0.8            1.46   dev.off
    ...

Running another script passing the same set of data to myf and turning section A off and section B on takes around 3 minutes. The results of R CMD Rprof also indicate that most of the execution time is used by print (see extract below).

    % total   total seconds   % self   self seconds   name
       98.1          175.00      0.0           0.00   myf
       90.7          161.82      0.0           0.00   myg
       86.8          154.90      0.0           0.06   standardGeneric
       86.5          154.32      0.0           0.02   print
       86.4          154.16      4.0           7.18   printFunction
       86.4          154.16      0.0           0.00   print.trellis
       52.6           93.76     52.5          93.62   lattice.setStatus
    ...

    % self   self seconds   % total   total seconds   name
      52.5          93.62      52.6           93.76   lattice.setStatus
       8.6          15.28      10.9           19.40   .Call.graphics
       4.2           7.58       4.5            7.98   .Call
       4.0           7.18      86.4          154.16   printFunction
       3.1           5.56       3.1
Re: [R] How to find points of intersection between harmonic function and a line
How is the curve represented? That's more important than its organ-of-origin. If you have values of y=f(x) at discrete time points, then y-(x+2) will change sign sometimes... the intersection point is at some time x' in between. Am I missing something subtle here? You could interpolate the time more precisely in many different ways, e.g., a spline -- read help(spline).

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of FMH
Sent: Tuesday, February 22, 2011 1:21 PM
To: r-help@r-project.org
Cc: lig...@statistik.tu-dortmund.de
Subject: [R] How to find points of intersection between harmonic function and a line

Hi,

Sorry for the very short explanation of the intersection problem. I have a wave function monitored from the heart beat over a particular interval of time. Apart from that, there is a line with positive slope (e.g. y = x+2) which lies across the wave and intersects it at a number of points.

My problem is that I have no exact equation for such a complex harmonic wave produced by the heart pulse, and so cannot manage to find the intersection points. Therefore, I would be very grateful if someone could give some ideas or suggest any packages in R that can assist me.

Thank you,
Fir
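The sign-change idea can be sketched as follows. The sampled wave here is made-up data standing in for the heart signal, and the crossing time is refined by linear interpolation between the two bracketing samples (a spline would be the next refinement).

```r
x <- seq(0, 10, by = 0.1)
y <- 5 * sin(2 * x) + 6             # stand-in for the sampled wave (made-up data)
d <- y - (x + 2)                    # changes sign wherever the wave crosses the line

# A crossing lies between x[i] and x[i + 1] when d changes sign there.
i <- which(d[-1] * d[-length(d)] < 0)

# Linear interpolation of the zero of d on each bracketing interval.
x_cross <- x[i] - d[i] * (x[i + 1] - x[i]) / (d[i + 1] - d[i])
crossings <- cbind(x = x_cross, y = x_cross + 2)
crossings
```

A sample that falls exactly on the line (d equal to zero) is not caught by the strict inequality, so real data may need an extra check for that case.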
Re: [R] Building an array from matrix blocks
Well, you can lose B by just adding to X in the first for-loop, can't you?

    for (...) X <- X + A[...]

But if you want elegance, you could try:

    X = Reduce("+", lapply(1:(p+1), function(i) A[i:(n-p-1+i), i:(n-p-1+i)]))

I imagine someone can be even more eleganter than this.

rad

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Eduardo de Oliveira Horta
Sent: Saturday, February 19, 2011 9:49 AM
To: r-help
Subject: [R] Building an array from matrix blocks

Hello,

I've googled for a while and couldn't find anything on this topic: say I have a matrix A and want to build matrices B1, B2,... using blocks from A (or equivalently an array B with B[,,i] being a block from A), and that I must sum the B[,,i]'s. I've come up with this rather non-elegant code:

    n = 6
    p = 3
    A <- matrix(1:(n^2), n, n, byrow=TRUE)
    B <- array(0, c(n-p, n-p, p+1))
    for (i in 1:(p+1)) B[,,i] <- A[i:(n-p-1+i), i:(n-p-1+i)]
    X <- matrix(0, n-p, n-p)
    for (i in 1:(p+1)) X <- X + B[,,i]

    A
         [,1] [,2] [,3] [,4] [,5] [,6]
    [1,]    1    2    3    4    5    6
    [2,]    7    8    9   10   11   12
    [3,]   13   14   15   16   17   18
    [4,]   19   20   21   22   23   24
    [5,]   25   26   27   28   29   30
    [6,]   31   32   33   34   35   36

    B
    , , 1
         [,1] [,2] [,3]
    [1,]    1    2    3
    [2,]    7    8    9
    [3,]   13   14   15
    , , 2
         [,1] [,2] [,3]
    [1,]    8    9   10
    [2,]   14   15   16
    [3,]   20   21   22
    , , 3
         [,1] [,2] [,3]
    [1,]   15   16   17
    [2,]   21   22   23
    [3,]   27   28   29
    , , 4
         [,1] [,2] [,3]
    [1,]   22   23   24
    [2,]   28   29   30
    [3,]   34   35   36

    X
         [,1] [,2] [,3]
    [1,]   46   50   54
    [2,]   70   74   78
    [3,]   94   98  102

Note that the blocks B[,,i] are obtained by sweeping the diagonal of A. I wonder if there is a better and faster way to achieve this, using block matrix operations for instance. Actually what matters most for me is getting to the matrix X, so if it is possible to do this without having to construct the array B it would be ok as well...

Interesting observation:

    system.time(for (j in 1:1) {X <- matrix(0, n-p, n-p); for (i in 1:(p+1)) X <- X + B[,,i]})
       user  system elapsed
       0.27    0.00    0.26
    system.time(for (j in 1:1) {X <- apply(B,c(1,2),sum)})
       user  system elapsed
       1.82    0.02    1.86

Thanks in advance, and best regards,
Eduardo Horta

    sessionInfo()
    R version 2.11.1 (2010-05-31)
    x86_64-pc-mingw32

    locale:
    [1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252
    [3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
    [5] LC_TIME=Portuguese_Brazil.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] Revobase_4.2.0 RevoScaleR_1.1-1 lattice_0.19-13

    loaded via a namespace (and not attached):
    [1] grid_2.11.1 pkgXMLBuilder_1.0 revoIpe_1.0 tools_2.11.1
    [5] XML_3.1-0
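As a quick check that the two suggestions in the reply agree with the posted result, here is a small sketch using the same n, p and A as the question. The names `X_loop`, `X_reduce` and `idx` are mine.

```r
n <- 6; p <- 3
A <- matrix(1:(n^2), n, n, byrow = TRUE)

# Loop version: accumulate the diagonal blocks directly, without building B.
X_loop <- matrix(0, n - p, n - p)
for (i in 1:(p + 1)) {
  idx <- i:(n - p - 1 + i)
  X_loop <- X_loop + A[idx, idx]
}

# Reduce version: build a list of blocks and fold it with "+".
X_reduce <- Reduce(`+`, lapply(1:(p + 1), function(i) {
  idx <- i:(n - p - 1 + i)
  A[idx, idx]
}))

all.equal(X_loop, X_reduce)
X_reduce
```

Both avoid the intermediate array B. Which one is faster for large n and p is something to measure with system.time() rather than assume.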
Re: [R] question about generics
    ?InternalMethods
    ?S3groupGeneric
    ?S4groupGeneric

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Erin Hodgess
Sent: Friday, February 18, 2011 10:45 PM
To: R help
Subject: [R] question about generics

Dear R People:

Is there a way to determine which functions are generics, please? I looked for something like is.Generic, but no luck.

Thanks in advance!

Sincerely,
Erin
--
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com
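Beyond the help pages, a rough programmatic check is possible. The sketch below is a heuristic of my own, not an official test: `is_s3_generic` is a hypothetical helper that looks for a UseMethod() call in the function body, and it deliberately does not detect internal generics (those are the ones listed under ?InternalMethods). For S4, isGeneric() from the methods package takes the function name as a string.

```r
library(methods)

# Heuristic: an S3 generic written in R dispatches through UseMethod().
# Primitives (including internal generics such as length) are reported as FALSE.
is_s3_generic <- function(f) {
  is.function(f) && !is.primitive(f) &&
    any(grepl("UseMethod", deparse(body(f)), fixed = TRUE))
}

is_s3_generic(print)    # dispatches via UseMethod
is_s3_generic(paste)    # ordinary function
isGeneric("show")       # S4 generics are looked up by name
```

A function that merely mentions UseMethod in a string or comment-like expression would fool this check, so treat the result as a hint.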
Re: [R] speed up the code
Yes, remove the call to intersect, and rely on the results of match to tell you whether there is an overlap. If there are any matches, all(is.na(index)) will be false. Read help for match.

    ?match

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Hui Du
Sent: Wednesday, February 16, 2011 6:29 PM
To: r-help@r-project.org
Subject: [R] speed up the code

Hi All,

The following is a snippet of my code. It works fine but it is very slow. Is it possible to speed it up by using a different data structure or a better solution? For 4 runs, it takes 8 minutes now.

Thanks a lot

    fun_activation = function(s_k, s_hat, x_k, s_hat_len)
    {
      common = intersect(s_k, s_hat);
      if(length(common) != 0)
      {
        index = match(common, s_k);
        round(sum(x_k[index]) * length(common) / (s_hat_len * length(s_k)), 3);
      }
      else
      {
        0;
      }
    }

    fun_x = function(a)
    {
      round(runif(length(a), 0, 1), 2);
    }

    symbol_len = 50;
    PHI_set = 1:symbol_len;
    S = matrix(replicate(M * M, sort(sample(PHI_set, sample(symbol_len, 1)))), M, M);
    X = matrix(mapply(fun_x, S), M, M);
    S_hat = c(28, 34, 35)
    S_hat_len = length(S_hat);
    S_hat_matrix = matrix(list(S_hat), M, M);

    system.time(
      for(I in 1:4)
      {
        A = matrix(mapply(fun_activation, S, S_hat_matrix, X, S_hat_len), M, M);
      }
    )

HXD
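One possible reading of the match-only suggestion, as a sketch. It assumes that neither s_k nor s_hat contains duplicates, which holds for the posted data (sorted samples without replacement and a fixed S_hat). The small test vectors are made up, and `fun_activation_old` is just the poster's function renamed for comparison.

```r
# Original version from the question, kept for comparison.
fun_activation_old <- function(s_k, s_hat, x_k, s_hat_len) {
  common <- intersect(s_k, s_hat)
  if (length(common) != 0) {
    index <- match(common, s_k)
    round(sum(x_k[index]) * length(common) / (s_hat_len * length(s_k)), 3)
  } else 0
}

# match() alone: positions in s_k of the elements of s_hat, NA where absent.
# Assumes no duplicates in either vector.
fun_activation <- function(s_k, s_hat, x_k, s_hat_len) {
  index <- match(s_hat, s_k)
  index <- index[!is.na(index)]
  if (length(index) == 0) return(0)
  round(sum(x_k[index]) * length(index) / (s_hat_len * length(s_k)), 3)
}

s_k   <- c(1, 28, 35, 40)          # made-up example data
x_k   <- c(0.1, 0.2, 0.3, 0.4)
s_hat <- c(28, 34, 35)

new_val <- fun_activation(s_k, s_hat, x_k, length(s_hat))
old_val <- fun_activation_old(s_k, s_hat, x_k, length(s_hat))
c(new_val, old_val)
```

Whether this removes the bottleneck depends on where the time really goes; profiling the mapply call would tell.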
Re: [R] sort a 3 dimensional array across third dimension ?
Although I suggested to someone else that for-loops be avoided, they are not in the inner loop in this code, and it's probably easier to understand than some sort of apply:

    a = array(round(100*runif(60)),dim=c(3,4,5))
    a
    for (i in 1:dim(a)[1])
      for (j in 1:dim(a)[2])
        a[i,j,] = sort(a[i,j,])
    a

Is that what you want?

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Maas James Dr (MED)
Sent: Friday, February 18, 2011 8:01 AM
To: r-help@r-project.org
Subject: [R] sort a 3 dimensional array across third dimension ?

I'm attempting to sort a 3 dimensional array that looks like this

    x
    , , 1
         [,1] [,2]
    [1,]    9    9
    [2,]    7    9
    , , 2
         [,1] [,2]
    [1,]    6    5
    [2,]    4    6
    , , 3
         [,1] [,2]
    [1,]    2    1
    [2,]    3    2

such that it ends up like this

    y
    , , 1
         [,1] [,2]
    [1,]    2    1
    [2,]    3    2
    , , 2
         [,1] [,2]
    [1,]    6    5
    [2,]    4    6
    , , 3
         [,1] [,2]
    [1,]    9    9
    [2,]    7    9

I think this is sorting across the third dimension, but several attempts using either the sort or apply functions have not worked. Any and all suggestions most welcome.

Thanks
J
===
Dr. Jim Maas
University of East Anglia
Re: [R] sort a 3 dimensional array across third dimension ?
I was going to say: The problem with for-loops (as best I understand it) is that the R code gets interpreted over and over; what you normally want to do is design the computation so that you jump into the internals of R and stay there. But the inner loop is in the R internals of the sort in this case. If the third dimension is even just moderately large, the cost of interpretation is small relative to the cost of the sort.

But my quick experiments don't exactly bear that out:

    foo = runif(1)
    system.time(for (i in 1:1000) sort(foo))
       user  system elapsed
       1.60    0.00    1.61
    system.time(for (i in 1:1000) for (j in 1:1) k=k+1)
       user  system elapsed
       7.52    0.00    7.54

I imagine I could find a prettier way with various flavors of apply, if my employer didn't have other things for me to do. Maybe someone else can explain why the for loop is so slow that the overhead to increment the index is greater than sorting 1 doubles. I know it used to be even slower in S-PLUS than in R.

-----Original Message-----
From: Maas James Dr (MED) [mailto:j.m...@uea.ac.uk]
Sent: Friday, February 18, 2011 10:06 AM
To: Dwyer Rex USRE; r-help@r-project.org
Subject: RE: sort a 3 dimensional array across third dimension ?

Hi Rex,

Thanks, this is exactly what I want, but I have to do it with many big arrays ... thus if there were a way to do it with a vectorized function, would it not be a lot more efficient?

Much appreciated!
J
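For comparison, an apply-based version might look like the sketch below; the names `a_loop` and `a_apply` are mine. apply() returns each sorted vector along the first dimension, so aperm() is needed to put the dimensions back in their original order. Note that apply() still loops in R code internally, so it is not obviously faster than the explicit loops.

```r
set.seed(1)
a <- array(round(100 * runif(60)), dim = c(3, 4, 5))

# Loop version from the reply.
a_loop <- a
for (i in 1:dim(a)[1])
  for (j in 1:dim(a)[2])
    a_loop[i, j, ] <- sort(a[i, j, ])

# apply() yields a 5 x 3 x 4 array (sorted values first), so permute the
# dimensions back to 3 x 4 x 5.
a_apply <- aperm(apply(a, c(1, 2), sort), c(2, 3, 1))

all.equal(a_loop, a_apply)
```

Timing both on arrays of realistic size with system.time() is the only reliable way to decide between them.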
Re: [R] calling pairs of variables into a function
Try putting d,e,f in a list:

    xxx = list(d,e,f)
    for (i in 1:length(xxx))
      for (j in 1:length(xxx))
        if (i != j) bigfunction(xxx[[i]], xxx[[j]])

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of squamous
Sent: Wednesday, February 16, 2011 7:11 PM
To: r-help@r-project.org
Subject: [R] calling pairs of variables into a function

Hi,

I have imported three text files into R using read.table. Their variables are called d, e and f. I want to run a function on all the possible combinations of these three files. The only way I know how to do that is like this:

    bigfunction(d,e)
    bigfunction(d,f)
    bigfunction(e,d)
    bigfunction(e,f)
    bigfunction(f,e)
    bigfunction(f,d)

Is there an easier way? I will have five files later on, so it would be useful to know! I'd imagine I can use a loop somehow, and I have installed a package (gregmisc) so that typing permutations(3,2) gives all the possible pairs of three numbers, but I don't know how to combine these things to make it work.

--
View this message in context: http://r.789695.n4.nabble.com/calling-pairs-of-variables-into-a-function-tp3309993p3309993.html
Sent from the R help mailing list archive at Nabble.com.
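A variant of the list idea that also records which pair produced which result, as a sketch. The three tiny data frames and the body of `bigfunction` are placeholders I made up; only the pairing logic matters. It uses base R's expand.grid() rather than permutations().

```r
# Stand-ins for the three imported tables and for the poster's function.
d <- data.frame(v = 1)
e <- data.frame(v = 2)
f <- data.frame(v = 3)
bigfunction <- function(x, y) x$v - y$v      # hypothetical placeholder

tables <- list(d = d, e = e, f = f)

# All ordered pairs (i, j) of table names with i != j.
pairs <- expand.grid(i = names(tables), j = names(tables),
                     stringsAsFactors = FALSE)
pairs <- pairs[pairs$i != pairs$j, ]

# Apply the function to each pair and keep the result next to its labels.
pairs$value <- mapply(function(i, j) bigfunction(tables[[i]], tables[[j]]),
                      pairs$i, pairs$j)
pairs
```

Adding more files later only means adding entries to `tables`. If `bigfunction` returns something larger than a single number, use Map() instead of mapply() and keep the results in a list.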
Re: [R] covar
I hate to sound like David "Have You Read The Posting Guide?" Winsemius, but there's no way for anyone to know what you are trying to accomplish here without a lot more information. You don't show us the output you expect and the output you got. I would have expected relatedness to be on a scale from 0 to 1, but it's clear that you'll get values > 1 in this program.

To use R effectively, you need to rephrase your computation as a matrix computation. People generally use R at least partly to avoid debugging index computations in for-loops. For-loops are also much slower than the corresponding matrix operations in R. If you want to use for-loops, you can always put in some prints and trace what's going on, just like the old days!

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Val
Sent: Wednesday, February 16, 2011 3:14 PM
To: r-h...@stat.math.ethz.ch
Subject: [R] covar

Hi all,

I want to construct relatedness among individuals; please have a look at the following script.

    #
    rm(list=ls())
    N=5
    id  = c(1:N)
    dad = c(0,0,0,3,3)
    mom = c(0,0,2,1,1)
    sex = c(2,2,1,2,2)   # 1= M and 2=F
    A=diag(nrow = N)
    for(i in 1:N){
      for(j in i:N) {
        ss = dad[j]
        dd = mom[j]
        sx = sex[j]
        if( ss > 0 && dd > 0 ) {
          if(i == j) {
            A[i,i] = 1 + 0.5*A[ss,dd]
          } else {
            A[i,j] = A[i,ss] + 0.5*(A[i,dd])
            A[j,i] = A[i,j]
          }
        }
      }  # inner for loop
    }    # outer for loop
    A

If the sex is male (sex=1) then I want to set A[i,i] = 0.5*A[ss,dd]. If it is female (sex=2) then A[i,i] = 1 + 0.5*A[ss,dd].

How do I do it? I tried several cases but it did not work for me. Your assistance is highly appreciated in advance.

Thanks
Re: [R] String manipulation
A quick way to do this is to replace \d and \D with character classes [0-9.] and [^0-9.]. This assumes that there is no scientific notation and that there is nothing like 123.45.678 in the string. You did not account for a leading minus sign. The book Mastering Regular Expressions is probably worth the expense if you are going to be doing a lot of this, even though similar content can be gleaned from online sources.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Megh Dal
Sent: Sunday, February 13, 2011 4:42 PM
To: Gabor Grothendieck
Cc: r-help@r-project.org
Subject: Re: [R] String manipulation

Hi Gabor, thanks (and Jim as well) for your suggestion. However this is not working properly for the following string:

    MyString <- "ABCFR34564IJVEOJC3434.36453"
    strapply(MyString, "(\\D+)(\\d+)(\\D+)(\\d+)", c)[[1]]
    [1] "ABCFR"   "34564"   "IJVEOJC" "3434"

So the 4th group, which is numeric, contains a decimal number, and that is not taken care of. A similar unintended result occurs here as well:

    MyString <- "ABCFR34564.354IJVEOJC3434.36453"
    strapply(MyString, "(\\D+)(\\d+)(\\D+)(\\d+)", c)[[1]]
    [1] "ABCFR"   "34564"   "."       "354"     "IJVEOJC" "3434"    "."       "36453"

Can you please tell me how I can modify that?

Thanks,

On Sun, Feb 13, 2011 at 11:10 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
> On Sun, Feb 13, 2011 at 10:27 AM, Megh Dal megh700...@gmail.com wrote:
>> Please consider the following string:
>>
>>     MyString <- "ABCFR34564IJVEOJC3434"
>>
>> Here you see that there are 4 groups in the above string. The 1st and 3rd
>> groups are English letters and the 2nd and 4th are numeric. Given a string,
>> how can I separate out those 4 groups?
>
> Try this. "\\D+" and "\\d+" match non-digits and digits respectively.
> The portions within parentheses are captures and are passed to the c
> function. Like R's split, it returns a list with a component per element
> of MyString, but MyString only has one element so we get its contents
> using [[1]].
>
>     library(gsubfn)
>     strapply(MyString, "(\\D+)(\\d+)(\\D+)(\\d+)", c)[[1]]
>     [1] "ABCFR"   "34564"   "IJVEOJC" "3434"
>
> Alternately we could convert the relevant portions to numbers at the
> same time. ~ list(...) is interpreted as a function whose body is the
> right hand side of the ~ and whose arguments are the free variables,
> i.e. s1, s2, s3 and s4.
>
>     strapply(MyString, "(\\D+)(\\d+)(\\D+)(\\d+)",
>       ~ list(s1, as.numeric(s2), s3, as.numeric(s4)))[[1]]
>
> See http://gsubfn.googlecode.com for more.
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
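The character-class suggestion can be tried without gsubfn. The sketch below uses base R's gregexpr() and regmatches() instead of strapply(); `split_groups` is a name I made up, and the same caveats apply (no scientific notation, no leading minus sign, no strings like 123.45.678).

```r
split_groups <- function(s) {
  # Runs of digits and dots alternate with runs of everything else.
  regmatches(s, gregexpr("[0-9.]+|[^0-9.]+", s))[[1]]
}

g1 <- split_groups("ABCFR34564IJVEOJC3434.36453")
g2 <- split_groups("ABCFR34564.354IJVEOJC3434.36453")
g1
g2
```

The numeric pieces can then be converted with as.numeric() on every second element, provided the string really starts with a non-numeric group.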
Re: [R] string parsing
It's only awfully inefficient if it's a bottleneck. You're not doing this more than once per item fetched from the network, and the time is insignificant relative to the fetch. If it were somehow in your inner loop, it would be worth worrying about, but your purpose is to eliminate Ms and Bs so that you'll never ever see them again. If performance is a problem, look at your inner loop, not here.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Mike Marchywka
Sent: Tuesday, February 15, 2011 9:01 PM
To: s...@gnu.org; r-h...@stat.math.ethz.ch
Subject: Re: [R] string parsing

> To: r-h...@stat.math.ethz.ch
> From: s...@gnu.org
> Date: Tue, 15 Feb 2011 17:20:11 -0500
> Subject: [R] string parsing
>
> I am trying to get stock metadata from Yahoo finance (or maybe there is
> a better source?)

Search this for yahoo: http://cran.r-project.org/web/packages/quantmod/quantmod.pdf
As a perennial page scraper, I was amazed this existed :)

> here is what I did so far:
>
>     yahoo.url <- "http://finance.yahoo.com/d/quotes.csv?f=j1jka2&s=";
>     stocks <- c("IBM","NOIZ","MSFT","LNN","C","BODY","F"); # just some samples
>     socket <- url(paste(yahoo.url,sep="",paste(stocks,collapse="+")),open="r");
>     data <- read.csv(socket, header = FALSE);
>     close(socket);
>
> data is now:
>
>            V1     V2     V3        V4
>     1  200.5B 116.00 166.25   4965150
>     2   19.1M   3.75   5.47      8521
>     3  226.6B  22.73  31.58  57127000
>     4  886.4M  30.80  74.54    226690
>     5  142.4B   3.21   5.15 541804992
>     6  276.4M  11.98  21.30    149656
>     7 55.823B   9.75  18.97  89369000
>
> now I need to do this:
>
> -- convert 55.823B to 55e9 and 19.1M to 19e6
>
>     parse.num <- function (s) { as.numeric(gsub("M$","e6",gsub("B$","e9",s))); }
>     data[1] <- lapply(data[1],parse.num);
>
> seems awfully inefficient (two regexp substitutions), is there a better way?
>
> -- iterate over stocks and data at the same time and put the results into a
> hash table:
>
>     for (i in 1:length(stocks)) cache[[stocks[i]]] <- data[i,];
>
> I do get the right results, but I am wondering if I am doing it the right
> R way. E.g., the hash table value is a data frame. A structure (record?)
> seems more appropriate.
>
> thanks!
>
> --
> Sam Steingold (http://sds.podval.org/) on CentOS release 5.3 (Final)
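For reference, the two-substitution approach from the question already works on a whole column at once, which is part of why its cost is negligible. A minimal sketch, with the values copied from the posted data frame; `parse_num` and `quotes` are names I chose.

```r
parse_num <- function(s) {
  s <- as.character(s)                       # read.csv may have produced factors
  as.numeric(sub("M$", "e6", sub("B$", "e9", s)))
}

# First two columns of three rows from the posted data.
quotes <- data.frame(V1 = c("200.5B", "19.1M", "55.823B"),
                     V2 = c(116.00, 3.75, 9.75),
                     stringsAsFactors = FALSE)

quotes$V1 <- parse_num(quotes$V1)            # one vectorized call per column
quotes
```

Suffixes other than M and B would come back as NA with a warning, so any new suffix needs its own substitution.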
Re: [R] A Math question
If y'all want to discuss this more, do it somewhere else, please. This has little to do with R except that both depend on Peano's Axioms.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of jlu...@ria.buffalo.edu
Sent: Tuesday, February 15, 2011 12:46 PM
To: Kjetil Halvorsen
Cc: r-help@r-project.org; r-help-boun...@r-project.org; Maithula Chandrashekhar
Subject: Re: [R] A Math question

Kjetil et al,

Unlike finite sums, infinite sums are not commutative. To have commutativity, one must have absolute summability, that is, the sum of the absolute values of the terms must be finite. If one has absolute summability, the infinite sum exists and is unique. This sum is not absolutely summable and thus undefined.

If one does not require commutativity, then the order of the summation must be specified. The order is often implicitly assumed to be the order of the integers. The sum of the negative integers is negative infinity, the sum of the positive integers is infinity, and the sum of these two sums is undefined. However, Riemann's rearrangement theorem shows that the terms can be re-ordered to yield any sum whatsoever. In particular, if one creates pairs of terms consisting of a positive integer and its negative, then the infinite sum is zero.

So the unique sum is undefined; otherwise the sum depends on the order of addition.

Joe

David Winsemius dwinsem...@comcast.net
Sent by: r-help-boun...@r-project.org
02/15/2011 09:17 AM
To: Kjetil Halvorsen kjetilbrinchmannhalvor...@gmail.com
cc: r-help@r-project.org, Maithula Chandrashekhar m.chandrashekhar1...@gmail.com
Subject: Re: [R] A Math question

On Feb 14, 2011, at 7:33 PM, Kjetil Halvorsen wrote:
> or even better: http://mathoverflow.net/

I beg to differ. That is designated in its FAQ as expecting "research level questions", while the forum I offered is labeled as "Welcome to Q&A for people studying math at any level and professionals in related fields." I don't think the proffered question could be considered "research level".

> On Sun, Feb 13, 2011 at 8:02 PM, David Winsemius dwinsem...@comcast.net wrote:
>> On Feb 13, 2011, at 4:47 PM, Maithula Chandrashekhar wrote:
>>> Dear all, I admit this has nothing to do with R, and perhaps not even with statistics. Strictly speaking this is a math-related question. However, I have a reasonable feeling that the experts here will come up with some elegant suggestion for my question. My question is: what is the sum of all integers? I heard somewhere that it is zero, as the positive and negative integers will just cancel each other out. Is that correct?
>>
>> There are more appropriate places to pose such questions: http://math.stackexchange.com/
>>
>> David Winsemius, MD
>> West Hartford, CT
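The point in Joe's message that the result depends on the order of addition can be illustrated with partial sums in R. This is only a sketch: the variable names and the two orderings are chosen here for illustration, and since the terms never shrink, neither ordering converges in the usual sense. The code shows only that the running totals behave differently.

```r
k <- 5  # number of (positive, negative) pairs to look at; an arbitrary choice

# Ordering A: 1, -1, 2, -2, ...  (each integer is followed by its negative)
a <- as.vector(rbind(1:k, -(1:k)))

# Ordering B: 1, 2, -1, 3, 4, -2, ...  (two positives, then one negative)
b <- as.vector(rbind(matrix(1:(2 * k), nrow = 2), -(1:k)))

cumsum(a)  # running total returns to 0 after every pair
cumsum(b)  # running total keeps growing
```

Every integer eventually appears in both orderings, yet the partial sums of A return to zero after each pair while those of B grow without bound.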
Re: [R] cycle in a directed graph
If the graph has n nodes and is represented by an adjacency matrix, you can square the matrix (log_2 n)+1 times. Then you can multiply the matrix element-wise by its transpose. The positive entries in the 7th row will tell you all nodes sharing a cycle with node 7. This assumes all edge weights are positive.

Are you sure we're not doing your graph theory homework? You asked about MSTs yesterday.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of amir
Sent: Friday, February 11, 2011 10:11 AM
To: r-help@r-project.org
Subject: [R] cycle in a directed graph

Hi,

I have a directed graph and want to find out whether there is any cycle in it. If there is, which nodes or edges are in the cycle? Is there any way to find the cycle in a directed graph in R?

Regards,
Amir
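The repeated-squaring idea from the reply can be sketched in base R as follows. Two details are assumptions added here and are not in the original post: the identity matrix is added before squaring, so that walks shorter than the final power are also counted, and the matrix is reset to 0/1 after each squaring to avoid overflow. The function name shares_cycle_with is hypothetical.

```r
# Hypothetical helper: which nodes lie on a common directed cycle with `node`?
# adj is a square adjacency matrix; adj[i, j] > 0 means there is an edge i -> j.
shares_cycle_with <- function(adj, node) {
  n <- nrow(adj)
  # Assumption: add the identity so that "reachable in at most m steps" is tracked.
  reach <- ((adj > 0) | (diag(n) > 0)) * 1
  for (step in seq_len(ceiling(log2(n)) + 1)) {
    reach <- (reach %*% reach > 0) * 1   # square, then reduce back to 0/1
  }
  both <- reach * t(reach)               # 1 where i reaches j AND j reaches i
  setdiff(which(both[node, ] > 0), node) # drop the trivial self entry
}

# Small example: 1 -> 2 -> 3 -> 1 is a cycle, and 3 -> 4 is a dead end.
adj <- matrix(0, 4, 4)
adj[1, 2] <- 1; adj[2, 3] <- 1; adj[3, 1] <- 1; adj[3, 4] <- 1

shares_cycle_with(adj, 1)  # nodes 2 and 3
shares_cycle_with(adj, 4)  # none
```

A self-loop (adj[i, i] > 0) is not reported by this sketch and would need a separate check of the diagonal.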
Re: [R] Generate multivariate normal data with a random correlation matrix
If you want a random correlation matrix, why not just generate random data and accept the correlation matrix that you get? The standard normal distribution in k dimensions is (hyper)spherically symmetric. If you generate k standard normal N(0,1) variates, you have a point in k-space with direction uniformly distributed on the (k-1)-sphere and Gaussian magnitude. If you generate k such, you have a random linear transformation with all sorts of desirable symmetries. So, if you generate a k x k matrix of standard normal variates, and another n x k standard normal variates, and multiply the two matrices to get n points in k-space, that seems to be a pretty good definition of "random correlation" to me. I'm sure you can decompose the k x k matrix to get the theoretical distribution, maybe by multiplying it by its transpose and doing an SVD; I'd have to think about that part.

... unless you have a particular distribution of correlation matrices in mind to begin with, which doesn't seem to be the case.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Szumiloski, John
Sent: Wednesday, February 09, 2011 11:30 AM
To: r-help@r-project.org
Cc: Rick DeShon
Subject: Re: [R] Generate multivariate normal data with a random correlation matrix

The knee-jerk thought I had was to express the correlation matrix as a generic Choleski decomposition, then randomly populate the triangular decomposed matrix. When you remultiply, you can simply rescale to 1s on the diagonals. Then rmnorm as usual. In R, see ?chol

If you want to get fancy, you could look at the random distribution you would use for the triangular matrix and play with that, including different distributions for different elements, elements' distributions being conditional on values of previously randomized elements, etc.

John

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Rick DeShon
Sent: Wednesday, 09 February, 2011 11:06 AM
To: r-h...@stat.math.ethz.ch
Subject: [R] Generate multivariate normal data with a random correlation matrix

Hi All.

I'd like to generate a sample of n observations from a k-dimensional multivariate normal distribution with a random correlation matrix.

My solution: The lower (or upper) triangle of the correlation matrix has n.tri = (d/2)(d+1) - d entries.
1. Take a uniform sample of n.tri possible correlations (runif(n.tri, -.99, .99)).
2. Populate a triangle of the matrix with the sampled correlations.
3. Mirror the triangle to populate the other triangle, forming a symmetric matrix, cormat.
4. Sample n observations from a multivariate normal distribution with mean vector = 0 and varcov = cormat.

Problem: This approach violates the triangle inequality property of correlation matrices. So, the matrix I've constructed is certainly a valid matrix, but it is not a valid correlation matrix, and it blows up when you submit it to a random number generator such as rmnorm. With a small matrix you sometimes get lucky and generate a valid correlation matrix, but as you increase d the probability of obtaining a valid correlation matrix drops off quickly.

So, any ideas on how to construct a correlation matrix with random entries that cover the range (or most of the range) of the correlation [-1,1]?

Here's the code I've used that won't work.

library(mnormt)
n <- 1000
d <- 50
n.tri <- ((d*(d+1))/2)-d
r <- runif(n.tri, min=-.5, max=.5)
cormat <- diag(d)
count1 = 1
for (i in 1:d){
  for (j in 1:d){
    if (i<j) {
      cormat[i,j] = r[count1]
      cormat[j,i] = cormat[i,j]
      count1 = count1+1
    }
  }
}
eigen(cormat)   # if negative eigenvalue, then the matrix violates the triangle inequality
x <- rmnorm(n, rep(0, d), cormat)   # Sample the data

Thanks in advance,
Rick DeShon
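The suggestion at the top of this thread (multiply n x k standard normal draws by a random k x k Gaussian matrix) can be sketched in base R as below. The variable names, the seed, and the sizes are illustrative assumptions. The implied covariance of the rows of X is t(A) %*% A, so the implied correlation matrix is obtained with cov2cor(), and it is positive semi-definite by construction.

```r
set.seed(1)   # arbitrary seed, only so the sketch is repeatable
k <- 5        # dimension (illustrative)
n <- 5000     # number of observations (illustrative)

A <- matrix(rnorm(k * k), k, k)   # random linear transformation
Z <- matrix(rnorm(n * k), n, k)   # n independent standard normal points in k-space
X <- Z %*% A                      # n correlated observations

Sigma <- crossprod(A)             # theoretical covariance of each row: t(A) %*% A
R <- cov2cor(Sigma)               # theoretical ("random") correlation matrix

round(R, 2)
max(abs(cor(X) - R))              # sample correlation should be close to R
```

If unit variances are wanted, one option is to pass R rather than Sigma to a multivariate normal generator such as rmnorm from the mnormt package used in the original post.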
Re: [R] Calculating rowMeans from different columns in each row?
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Marine Andersson
Sent: Thursday, February 10, 2011 3:53 PM
To: r-help@r-project.org
Subject: [R] Calculating rowMeans from different columns in each row?

Hello!

I have a dataset like this:

X1 X2 X3 X4 X5 X6 X7 X8
1 2 2 1 2 3 2 6
2 3 2 5 7 9 1 3
19 12 6 1 1 3 6

The columns X1-X6 contain ordinary numeric values. X7 contains the number of the first column that the rowMeans should be calculated from, and X8 contains the last column that should be included in the rowMeans. When I try

test <- (df[,df$X7:df$X8])

the rowMeans are calculated based on the values of X7 and X8 in the first row only.

Thanks in advance!
/Marine

[Dwyer Rex USRE] Well, if you print df$X7:df$X8, you'll see why... you can't ":" together two vectors:

c(1,2,3):c(8,10,12)
[1] 1 2 3 4 5 6 7 8
Warning messages:
1: In c(1, 2, 3):c(8, 10, 12) : numerical expression has 3 elements: only the first used
2: In c(1, 2, 3):c(8, 10, 12) : numerical expression has 3 elements: only the first used

So try:

apply(df, 1, function(v) {n=length(v); mean(v[v[n-1]:v[n]]) })
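As a worked example of the apply() approach, here is a small sketch using the first two rows of the data shown in the question. The helper name row_range_means and its from/to arguments are hypothetical; it looks the range columns up by name rather than assuming they are the last two columns.

```r
df <- data.frame(X1 = c(1, 2), X2 = c(2, 3), X3 = c(2, 2),
                 X4 = c(1, 5), X5 = c(2, 7), X6 = c(3, 9),
                 X7 = c(2, 1),   # first column to include in the mean
                 X8 = c(6, 3))   # last column to include in the mean

# The one-liner from the reply: the last two entries of each row give the range.
res1 <- apply(df, 1, function(v) { n <- length(v); mean(v[v[n - 1]:v[n]]) })

# Hypothetical variant that names the range columns explicitly.
row_range_means <- function(d, from = "X7", to = "X8") {
  apply(d, 1, function(v) mean(v[v[[from]]:v[[to]]]))
}
res2 <- row_range_means(df)

unname(res1)  # row 1 averages columns 2..6, row 2 averages columns 1..3
```

Row 1 averages 2, 2, 1, 2, 3 and row 2 averages 2, 3, 2. Both versions rely on apply() converting the data frame to a numeric matrix, so they assume all columns are numeric.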
Re: [R] A question on Duplicating
ab = paste(a, b, sep=";~;~;~")
flag = length(ab)==length(unique(ab))

This should work unless you use 3 consecutive winking elephants in other places in your program.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Nipesh Bajaj
Sent: Wednesday, February 09, 2011 10:12 AM
To: r-help@r-project.org
Subject: [R] A question on Duplicating

Hello,

I am struggling to accomplish an idea, which is as follows. I have a vector, say:

a <- c("a", "b", "c", "a")

and another:

b <- c("m", "n", "o", "m")

The lengths of those 2 vectors are always the same. The task is to check for duplicates in the vector 'a' and then to check whether there are duplicates in the same places of 'b'. If not, flag a FALSE. In the above example it is correct, hence TRUE. However, in general, how can I implement this? Can somebody please help me?

Thanks,
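One possible reading of the question is that wherever a value repeats in 'a', the value in the same positions of 'b' must repeat as well, that is, each distinct value of 'a' is paired with exactly one value of 'b'. Under that assumption (it is an interpretation, not something the thread confirms), the check can be sketched without a paste separator by counting distinct pairs. The function name dups_match is hypothetical.

```r
# Hypothetical helper: TRUE if every distinct value of a is paired with exactly
# one value of b, so that duplicates in a are duplicates at the same places in b.
dups_match <- function(a, b) {
  stopifnot(length(a) == length(b))
  pairs <- unique(data.frame(a = a, b = b, stringsAsFactors = FALSE))
  nrow(pairs) == length(unique(a))
}

a <- c("a", "b", "c", "a")
b <- c("m", "n", "o", "m")
dups_match(a, b)                       # TRUE: the repeated "a" goes with "m" both times
dups_match(a, c("m", "n", "o", "x"))   # FALSE: the repeated "a" goes with "m" and "x"
```

Note that the pasted-key comparison length(ab) == length(unique(ab)) answers a different question, namely whether any (a, b) pair occurs more than once, and it evaluates to FALSE on the example vectors.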
Re: [R] strange behavior of panel.abline inside a for-loop
Dear Marius,

Try this:

plot.list = lapply(1:10, function(i)
  xyplot(i~i, type="p", xlim=c(0,11),
         panel=function(...) { panel.xyplot(...); panel.abline(v=i) }) )
plot.list[[3]]

I imagine it will work for Mr Luftjammer, too.

Rex

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Marius Hofert
Sent: Tuesday, February 08, 2011 6:37 AM
To: Help R
Subject: [R] strange behavior of panel.abline inside a for-loop

Dear expeRts,

I would like to create a list of lattice xyplots. Here is the minimal example:

library(lattice)
plot.list <- vector("list", 10)
for(i in 1:10){
  plot.list[[i]] <- xyplot(i~i, type="p", xlim=c(0,11),
                           panel=function(...){
                             panel.xyplot(...)
                             panel.abline(v=i)
                           })
}
plot.list[[3]]

As you can see, the vertical line is *always* printed at x=10 [and not at x=i]. Why?

Cheers,
Marius
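The reason the lapply() version behaves differently can be shown without lattice. The sketch below is a simplified stand-in, not the lattice internals: a panel function is a closure that looks up i only when the plot is drawn. In the for-loop all closures share one variable i, which holds its final value once the loop has finished, whereas each lapply() call has its own i. The explicit force(i) is added here as a precaution so the sketch does not depend on when the argument is evaluated.

```r
# Closures created in a for-loop all see the same variable i.
fs_loop <- vector("list", 3)
for (i in 1:3) {
  fs_loop[[i]] <- function() i
}
loop_values <- sapply(fs_loop, function(f) f())

# Closures created inside lapply() each get their own i.
fs_apply <- lapply(1:3, function(i) {
  force(i)          # evaluate the argument now, in this call's own environment
  function() i
})
apply_values <- sapply(fs_apply, function(f) f())

loop_values   # 3 3 3
apply_values  # 1 2 3
```

In the lattice example, the panel function plays the role of function() i, and printing plot.list[[3]] is the moment at which i is finally looked up.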
Re: [R] pass nrow(x) to dots in function(x){plot(x,...)}
Hi Marianne,

The quick-and-dirty solution is to add one character and make ns global:

ns <<- nrow(x)

Poor practice, but OK temporarily if you're just debugging.

This is an issue of scope. You are assuming dynamic scope, whereas R uses static scope. 'ns' was not defined when you said paste("n=",ns); it doesn't matter what its value is later. Even though R delays evaluation of the argument until it is first needed, if it ends up being evaluated, the result is the same as if you evaluated it in the environment where it appeared in the text.

You can do something like this:

myfun <- function(x, title.fun, ...) {
  ns <- nrow(x)
  title = title.fun(ns)
  barplot(x, main=title, ... )
}
myfun(m1, title.fun=function(n) paste("n = ", n) )

Then the paste isn't evaluated until title.fun is *called*. If you don't want to always supply title.fun, you give a default:

myfun <- function(x, title.fun=paste, ...) { ...

or

myfun <- function(x, title.fun=function(...){}, ...) { ...

or

myfun <- function(x, title.fun=function(...){main}, main="", ...) { ...   # (I think)

Rex

---
Message: 4
Date: Wed, 2 Feb 2011 11:51:50 +
From: Marianne Promberger marianne.promber...@kcl.ac.uk
To: r-help@r-project.org r-help@r-project.org
Subject: [R] pass nrow(x) to dots in function(x){plot(x,...)}
Message-ID: 20110202115150.GD8598@lauren
Content-Type: text/plain; charset=us-ascii

Dear Rers,

I have a function to barplot() a matrix, e.g.

myfun <- function(x, ...) { barplot(x, ... ) }

(The real function is more complicated; it does things to the matrix first.) So I can do:

m1 <- matrix(1:20,4)
myfun(m1)
myfun(m1, main="My title")

I'd like to be able to add the number of rows of the matrix passed to the function to the ... argument, e.g.

myfun(m1, main=paste("n=",ns))

where 'ns' would be nrow(m1).

I've tried this but it doesn't work:

myfun <- function(x, ...) {
  ns <- nrow(x)
  barplot(x, ... )
}
myfun(m1, main=paste("n = ",ns) )

ns is not found.

So, basically, how do I assign an object inside a function that I can then access in the dots when executing the function?

Many thanks
Marianne

--
Marianne Promberger PhD, King's College London
http://promberger.info
R version 2.12.0 (2010-10-15) Ubuntu 9.04
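A runnable version of the title.fun idea from the reply might look like the sketch below. Several details are assumptions made here for illustration: the default title function, the invisible return of the title (so it can be inspected), and the use of pdf(NULL) to discard the drawing when the code is run non-interactively.

```r
# title.fun receives the row count computed inside myfun, so the caller never
# needs to refer to a variable that only exists inside the function.
myfun <- function(x, title.fun = function(n) paste("n =", n), ...) {
  ns <- nrow(x)
  title <- title.fun(ns)
  barplot(x, main = title, ...)
  invisible(title)   # assumption: returned only so the title can be checked
}

m1 <- matrix(1:20, 4)

pdf(NULL)   # assumption: throw the plot away; drop this line to see the plot
t1 <- myfun(m1)
t2 <- myfun(m1, title.fun = function(n) paste("Rows:", n), col = "grey")
dev.off()

t1  # "n = 4"
t2  # "Rows: 4"
```

Other barplot() arguments, such as col above, still pass through the dots unchanged. Only the title is built inside the function, where nrow(x) is known.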