Re: [R] how to replace my double for loop which is little efficient!
Thanks, Berend.

It seems it is better to attach a PDF file to avoid garbled text. Yes, what I want to compute is the Tanimoto coefficient, and the Wikipedia page you linked describes this coefficient. I also searched the R site for "tanimoto" and learned more about it. I have saved your code and studied it.

Thanks again,
Kevin

--
View this message in context: http://r.789695.n4.nabble.com/how-to-replace-my-double-for-loop-which-is-little-efficient-tp3164222p3164920.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to replace my double for loop which is little efficient!
Found this:
http://en.wikipedia.org/wiki/Jaccard_index#Tanimoto_coefficient_.28extended_Jaccard_coefficient.29
and an R site search with "tanimoto" yielded some more interesting stuff. The rest is up to you.

Berend
Re: [R] how to replace my double for loop which is little efficient!
bbslover wrote:
> thanks for your help. I am sorry I do not full understand your code, so i
> can not correct using your code to my data. here is the attachment of my
> data, and what I want to compute is the equation in the word document of
> the attachment:
>
> the code form Berend can get the answer i want to get.

I've seen what you want to do. OpenOffice mangled the equations, so I used an online conversion to PDF. Thanks.

See my previous post for the fastest version. It's just converting your formulas with some matrix algebra.

Berend
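For readers without the attachment, the conversion can be written out as follows. This is a sketch reconstructed from the loop code quoted in this thread, not from the attached document; the symbols T, G and D are labels chosen here for illustration, with X the samples-by-variables matrix x.

```latex
% Similarity of rows i and j, as computed inside the double loop (S1, S2, S3)
T(i,j) = \frac{\sum_k x_{ik} x_{jk}}
              {\sum_k x_{ik}^2 + \sum_k x_{jk}^2 - \sum_k x_{ik} x_{jk}}

% With the Gram matrix G = X X^\top, every term is an entry of G
G = X X^{\top}, \qquad
T(i,j) = \frac{G_{ij}}{G_{ii} + G_{jj} - G_{ij}}

% Total dissimilarity accumulated in diss.all
D = \sum_{i \neq j} \bigl( 1 - T(i,j) \bigr)
```

Since T(i,i) = 1, the diagonal terms contribute nothing, so summing 1 - T over all pairs, including i = j, gives the same D.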
Re: [R] how to replace my double for loop which is little efficient!
djmuseR wrote:
> On Sun, Dec 26, 2010 at 4:18 AM, bbslover wrote:
>
>> x: is a matrix 202*263, that is 202 samples, and 263 independent
>> variables
>>
>> num.compd<-nrow(x); # number of compounds
>> diss.all<-0
>> for( i in 1:num.compd)
>>   for (j in 1:num.compd)
>>     if (i!=j) {
>
> Isn't this just X'X?
>
>>       S1<-sum(x[i,]*x[j,])
>
> Aren't each of S2 and S3 just diag(X'X)?
>
>>       S2<-sum(x[i,]^2)
>>       S3<-sum(x[j,]^2)
>>       sim2<-S1/(S2+S3-S1)
>>       diss2<-1-sim2
>>       diss.all<-diss.all+diss2}
>
> I tried
>
>   s1 <- crossprod(x)
>   s2 <- diag(s1)
>   s3 <- outer(s2, s2, '+') - s1
>   s1/s3
>
> This yields a symmetric matrix with 1's along the diagonal and quantities
> between 0 and 1 in the off-diagonal. Something like it could conceivably
> be used as a similarity matrix. Is that what you're looking for with sim2?
>
> I agree with Berend: it looks like a problem that could be easily solved
> with some matrix algebra. R can do matrix algebra quite efficiently,
> y'know...
>
> (BTW, I tried this on a 1000 x 1000 input matrix:
>   system.time(myfunc(x))
>      user  system elapsed
>      0.99    0.02    1.02
>
> I expect it could be improved by an order of magnitude if one actually
> knew what you were computing... )

I did some more work along Dennis' lines:

  xtx <- tcrossprod(x)
  xtd <- diag(xtx)
  xzz <- outer(xtd, xtd, '+')
  zz <- 1 - xtx/(xzz - xtx)
  diss.all <- sum(zz)

This appears to give the desired result, and it's quite a bit faster than my alternative 2. It would indeed be nice to know what is being computed.

Berend
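One way to convince yourself that the vectorized version matches the original double loop is to run both on a small matrix. The following is only a sketch: the 20 x 30 random matrix and the names diss.loop and diss.vec are assumptions made for illustration and do not come from the thread.

```r
# Small random test matrix (illustrative assumption, not the poster's data)
set.seed(1)
x <- matrix(runif(20 * 30), nrow = 20)

# Reference: the double loop from the original post
n <- nrow(x)
diss.loop <- 0
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    if (i != j) {
      S1 <- sum(x[i, ] * x[j, ])
      S2 <- sum(x[i, ]^2)
      S3 <- sum(x[j, ]^2)
      diss.loop <- diss.loop + (1 - S1 / (S2 + S3 - S1))
    }
  }
}

# Vectorized version from the message above
xtx <- tcrossprod(x)           # n x n matrix of row inner products (S1)
xtd <- diag(xtx)               # squared row norms (S2 and S3)
xzz <- outer(xtd, xtd, "+")    # S2 + S3 for every pair (i, j)
zz  <- 1 - xtx / (xzz - xtx)   # pairwise dissimilarities, zero on the diagonal
diss.vec <- sum(zz)

# TRUE if both approaches agree up to floating-point tolerance
all.equal(diss.loop, diss.vec)
```

The vectorized sum also runs over the i == j pairs, but those entries are 1 - 1 = 0, so no explicit exclusion of the diagonal is needed.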
Re: [R] how to replace my double for loop which is little efficient!
Thanks for your help. I am sorry, but I do not fully understand your code, so I cannot apply it correctly to my data. My data are attached, and what I want to compute is the equation in the Word document in the attachment.

The code from Berend gives the answer I want.

http://r.789695.n4.nabble.com/file/n3164741/my_data.rar my_data.rar
Re: [R] how to replace my double for loop which is little efficient!
Thanks for your help, it is great. In addition: at first x was a data frame, and my code ran very slowly. After your help I changed x to a matrix, and now it is very fast. I am very grateful for your kind help, and your code is so good!

Kevin
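The data frame versus matrix difference can be reproduced with a small experiment. This is a sketch only: the 50 x 40 random data, the helper row.dot, and the repetition count are assumptions made for illustration, and the timings will vary by machine, so none are shown.

```r
# Illustrative data (assumption): the same numbers as a matrix and a data frame
set.seed(1)
x.mat <- matrix(runif(50 * 40), nrow = 50)
x.df  <- as.data.frame(x.mat)

# Hypothetical helper: inner product of rows i and j, as in the loop body
row.dot <- function(x, i, j) sum(x[i, ] * x[j, ])

# Same result either way, but row indexing of a data frame is much costlier
t.df  <- system.time(for (k in 1:200) row.dot(x.df, 1, 2))
t.mat <- system.time(for (k in 1:200) row.dot(x.mat, 1, 2))
print(rbind(data.frame = t.df[1:3], matrix = t.mat[1:3]))

# Converting once up front avoids the overhead inside the loop
x <- as.matrix(x.df)
is.matrix(x)
```

Each x[i, ] on a data frame builds a new one-row data frame, whereas on a matrix it is a plain numeric vector, which is why converting with as.matrix() before looping helps so much.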
Re: [R] how to replace my double for loop which is little efficient!
Hi:

On Sun, Dec 26, 2010 at 4:18 AM, bbslover wrote:

> Dear all,
>
> My double for loop as follows, but it is little efficient, I hope all
> friends can give me a "vectorized" program to replace my code. thanks
>
> x: is a matrix 202*263, that is 202 samples, and 263 independent
> variables
>
> num.compd<-nrow(x); # number of compounds
> diss.all<-0
> for( i in 1:num.compd)
>   for (j in 1:num.compd)
>     if (i!=j) {

Isn't this just X'X?

>       S1<-sum(x[i,]*x[j,])

Aren't each of S2 and S3 just diag(X'X)?

>       S2<-sum(x[i,]^2)
>       S3<-sum(x[j,]^2)
>       sim2<-S1/(S2+S3-S1)
>       diss2<-1-sim2
>       diss.all<-diss.all+diss2}

I tried

  s1 <- crossprod(x)
  s2 <- diag(s1)
  s3 <- outer(s2, s2, '+') - s1
  s1/s3

This yields a symmetric matrix with 1's along the diagonal and quantities between 0 and 1 in the off-diagonal. Something like it could conceivably be used as a similarity matrix. Is that what you're looking for with sim2?

I agree with Berend: it looks like a problem that could be easily solved with some matrix algebra. R can do matrix algebra quite efficiently, y'know...

(BTW, I tried this on a 1000 x 1000 input matrix:

  system.time(myfunc(x))
     user  system elapsed
     0.99    0.02    1.02

I expect it could be improved by an order of magnitude if one actually knew what you were computing... )

HTH,
Dennis

> it will cost a long time to finish this computation! i really need "rapid"
> code to replace my code.
>
> thanks
>
> kevin
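A note on which cross-product applies: the loop in the original post compares rows (samples), so the matching matrix is x %*% t(x), which is what tcrossprod(x) returns, while crossprod(x) is t(x) %*% x and compares columns (variables). A short sketch to see the difference, using random data with the poster's dimensions (the data and the names sim and d are assumptions for illustration):

```r
# Random stand-in for the poster's 202 x 263 matrix (assumption)
set.seed(1)
x <- matrix(runif(202 * 263), nrow = 202)

dim(crossprod(x))    # 263 263 : t(x) %*% x, one entry per pair of variables
dim(tcrossprod(x))   # 202 202 : x %*% t(x), one entry per pair of samples

# Row-wise similarity matrix, following the same algebra as above
s1  <- tcrossprod(x)
d   <- diag(s1)
sim <- s1 / (outer(d, d, "+") - s1)

isSymmetric(sim)     # similarity of (i, j) equals that of (j, i)
range(diag(sim))     # every sample has similarity 1 with itself
```

For a square input such as the 1000 x 1000 test matrix both products have the same shape, so the distinction only shows up in the values, not the dimensions.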
Re: [R] how to replace my double for loop which is little efficient!
bbslover wrote:
> x: is a matrix 202*263, that is 202 samples, and 263 independent
> variables
>
> num.compd<-nrow(x); # number of compounds
> diss.all<-0
> for( i in 1:num.compd)
>   for (j in 1:num.compd)
>     if (i!=j) {
>       S1<-sum(x[i,]*x[j,])
>       S2<-sum(x[i,]^2)
>       S3<-sum(x[j,]^2)
>       sim2<-S1/(S2+S3-S1)
>       diss2<-1-sim2
>       diss.all<-diss.all+diss2}
>
> it will cost a long time to finish this computation! i really need "rapid"
> code to replace my code.

Alternative 1: the j-loop only needs to start at i+1, because the dissimilarity of (i,j) equals that of (j,i). So:

  diss2.all <- 0   # initialize the accumulator before the loop
  for( i in 1:num.compd) {
    for (j in seq(from=i+1,to=num.compd,length.out=max(0,num.compd-i))) {
      S1<-sum(x[i,]*x[j,])
      S2<-sum(x[i,]^2)
      S3<-sum(x[j,]^2)
      sim2<-S1/(S2+S3-S1)
      diss2<-1-sim2
      diss2.all<-diss2.all+diss2
    }
  }
  diss2.all <- 2 * diss2.all

On my pc this is about twice as fast as your version (with 202 samples and 263 variables).

Alternative 2: none of the sum() calls are necessary. Use some matrix algebra:

  xtx <- x %*% t(x)
  diss3.all <- 0
  for( i in 1:num.compd) {
    for (j in seq(from=i+1,to=num.compd,length.out=max(0,num.compd-i))) {
      S1 <- xtx[i,j]
      S2 <- xtx[i,i]
      S3 <- xtx[j,j]
      sim2<-S1/(S2+S3-S1)
      diss2<-1-sim2
      diss3.all<-diss3.all+diss2
    }
  }
  diss3.all <- 2 * diss3.all

This is about four times as fast as alternative 1. I'm quite sure that more expert R gurus can get some more speed up.

Note: I generated the x matrix with:

  set.seed(1);x<-matrix(runif(202*263),nrow=202)

(Timings on iMac 2.16Ghz and using 64-bit R)

Berend
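The symmetry trick in both alternatives (sum over j > i, then double) carries over to a loop-free form by summing only the upper triangle of the dissimilarity matrix. The sketch below compares that with the loop of Alternative 2; the names diss.loop, diss.upper and sim are assumptions for illustration, and `i + seq_len(num.compd - i)` is used here as a shorter equivalent of the seq() call above.

```r
# Test matrix generated as described in the message above
set.seed(1)
x <- matrix(runif(202 * 263), nrow = 202)
num.compd <- nrow(x)
xtx <- x %*% t(x)

# Alternative 2, with the inner index written as i + seq_len(num.compd - i);
# for i == num.compd this is an empty vector, so the inner loop is skipped
diss.loop <- 0
for (i in seq_len(num.compd)) {
  for (j in i + seq_len(num.compd - i)) {
    diss.loop <- diss.loop + (1 - xtx[i, j] / (xtx[i, i] + xtx[j, j] - xtx[i, j]))
  }
}
diss.loop <- 2 * diss.loop

# Loop-free equivalent: build all similarities at once, sum the upper triangle
xtd <- diag(xtx)
sim <- xtx / (outer(xtd, xtd, "+") - xtx)
diss.upper <- 2 * sum(1 - sim[upper.tri(sim)])

all.equal(diss.loop, diss.upper)
```

Summing the whole matrix with sum(1 - sim) gives the same total, since the diagonal contributes zero; the upper-triangle form is shown mainly because it mirrors the j > i loop directly.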