You invoke Rprof, run your code and then terminate it:
Rprof()
....... code you want to profile
Rprof(NULL)
# generate output
summaryRprof()

example:

> Rprof()
> for (i in 1:1e6) sin(i) + cos(i) + sqrt(i)
> Rprof(NULL)
> summaryRprof()
$by.self
     self.time self.pct total.time total.pct
sin       0.24    30.77       0.24     30.77
sqrt      0.22    28.21       0.22     28.21
cos       0.16    20.51       0.16     20.51
+         0.14    17.95       0.14     17.95
:         0.02     2.56       0.02      2.56

$by.total
     total.time total.pct self.time self.pct
sin        0.24     30.77      0.24    30.77
sqrt       0.22     28.21      0.22    28.21
cos        0.16     20.51      0.16    20.51
+          0.14     17.95      0.14    17.95
:          0.02      2.56      0.02     2.56

$sample.interval
[1] 0.02

$sampling.time
[1] 0.78

On Fri, Feb 25, 2011 at 6:57 AM, Ivan Calandra <ivan.calan...@uni-hamburg.de> wrote:
> Dear Jim,
>
> I've tried to use Rprof() as you advised me, but I don't understand how it
> works.
> I've done this:
> Rprof(for (i in seq_along(seq.yvar)){
>   all_my_commands
> })
> summaryRprof()
>
> But I got this error:
> Error in summaryRprof() : no lines found in ‘Rprof.out’
>
> I couldn't really understand from the help page what I should do.
>
> In any case, the function tstsreg() is certainly what takes the most
> computing time. But I wanted to optimize the rest of the code to gain as
> much speed as possible.
>
> Ivan
>
> On 2/25/2011 12:30, Jim Holtman wrote:
>>
>> Use Rprof to find where time is being spent. Probably in 'plot', which
>> might imply it is not the 'for' loop and therefore beyond your control.
>>
>> Sent from my iPad
>>
>> On Feb 25, 2011, at 6:19, Ivan Calandra <ivan.calan...@uni-hamburg.de> wrote:
>>
>>> Thanks Nick for your quick answer.
>>> It does work (no missed bracket!) but unfortunately doesn't really speed
>>> up anything: with my real data, it takes 82.78 seconds with the double
>>> lapply() instead of 83.59 s with the double loop (a gain of about 0.8 s).
>>>
>>> It looks like my double loop was not that bad. Does anyone know another
>>> faster way to do this?
>>>
>>> Thanks again in advance,
>>> Ivan
>>>
>>> On 2/25/2011 11:41, Nick Sabbe wrote:
>>>>
>>>> Simply avoiding the for loops by using lapply (I may have missed a
>>>> bracket here or there because I did this without opening R)...
>>>> Haven't checked the speed up, though.
>>>>
>>>> lapply(seq.yvar, function(k){
>>>>   plot(mydata1[[k]]~mydata1[[ind.xvar]], type="p",
>>>>        xlab=names(mydata1)[ind.xvar], ylab=names(mydata1)[k])
>>>>   lapply(seq_along(mydata_list), function(j){
>>>>     foo_reg(dat=mydata_list[[j]], xvar=ind.xvar, yvar=k, mycol=j,
>>>>             pos=mypos[j], name.dat=names(mydata_list)[j])
>>>>     return(NULL)
>>>>   })
>>>>   invisible(NULL)
>>>> })
>>>>
>>>> HTH,
>>>>
>>>> Nick Sabbe
>>>> --
>>>> ping: nick.sa...@ugent.be
>>>> link: http://biomath.ugent.be
>>>> wink: A1.056, Coupure Links 653, 9000 Gent
>>>> ring: 09/264.59.36
>>>>
>>>> -- Do Not Disapprove
>>>>
>>>> -----Original Message-----
>>>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
>>>> On Behalf Of Ivan Calandra
>>>> Sent: Friday, 25 February 2011 11:20
>>>> To: r-help
>>>> Subject: [R] speed up process
>>>>
>>>> Dear users,
>>>>
>>>> I have a double for loop that does exactly what I want, but is quite
>>>> slow. It is not so much with this simplified example, but IRL it is
>>>> slow. Can anyone help me improve it?
>>>>
>>>> The data and code for foo_reg() are available at the end of the email; I
>>>> preferred going directly into the problematic part.
>>>> Here is the code (I tried to simplify it but I cannot do it too much or
>>>> else it wouldn't represent my problem). It might also look too complex
>>>> for what it is intended to do, but my colleagues who are also supposed
>>>> to use it don't know much about R. So I wrote it so that they don't have
>>>> to modify the critical parts to run the script for their needs.
>>>>
>>>> #column indexes for function
>>>> ind.xvar <- 2
>>>> seq.yvar <- 3:4
>>>> #position vector for legend(), stupid positioning but it doesn't matter here
>>>> mypos <- c("topleft", "topright", "bottomleft")
>>>>
>>>> #run the function for columns 3&4 as y (seq.yvar) with column 2 as x
>>>> #(ind.xvar) for all 3 datasets (mydata_list)
>>>> par(mfrow=c(2,1))
>>>> for (i in seq_along(seq.yvar)){
>>>>   k <- seq.yvar[i]
>>>>   plot(mydata1[[k]]~mydata1[[ind.xvar]], type="p",
>>>>        xlab=names(mydata1)[ind.xvar], ylab=names(mydata1)[k])
>>>>   for (j in seq_along(mydata_list)){
>>>>     foo_reg(dat=mydata_list[[j]], xvar=ind.xvar, yvar=k, mycol=j,
>>>>             pos=mypos[j], name.dat=names(mydata_list)[j])
>>>>   }
>>>> }
>>>>
>>>> I tried with lapply() or mapply() but couldn't manage to pass the
>>>> arguments for names() and col= correctly, e.g. for the 2nd loop:
>>>> lapply(mydata_list, FUN=function(x){foo_reg(dat=x, xvar=ind.xvar,
>>>>        yvar=k, col1=1:3, pos=mypos[1:3], name.dat=names(x)[1:3])})
>>>> mapply(FUN=function(x) {foo_reg(dat=x, name.dat=names(x)[1:3])},
>>>>        mydata_list, col1=1:3, pos=mypos, MoreArgs=list(xvar=ind.xvar, yvar=k))
>>>>
>>>> Thanks in advance for any hints.
>>>> Ivan
>>>>
>>>> #create data (it looks horrible with these datasets but it doesn't matter here)
>>>> mydata1 <- structure(list(
>>>>   species = structure(1:8, .Label = c("alsen", "gogor", "loalb", "mafas",
>>>>     "pacyn", "patro", "poabe", "thgel"), class = "factor"),
>>>>   fruit = c(0.52, 0.45, 0.43, 0.82, 0.35, 0.9, 0.68, 0),
>>>>   Asfc = c(207.463765, 138.5533755, 70.4391735, 160.9742745, 41.455809,
>>>>     119.155109, 26.241441, 148.337377),
>>>>   Tfv = c(47068.1437773483, 43743.8087431582, 40323.5209129239,
>>>>     23420.9455581495, 29382.6947428651, 50460.2202192311,
>>>>     21810.1456510625, 41747.6053810881)),
>>>>   .Names = c("species", "fruit", "Asfc", "Tfv"),
>>>>   row.names = c(NA, 8L), class = "data.frame")
>>>>
>>>> mydata2 <- mydata1[!(mydata1$species %in% c("thgel","alsen")),]
>>>> mydata3 <- mydata1[!(mydata1$species %in% c("thgel","alsen","poabe")),]
>>>> mydata_list <- list(mydata1=mydata1, mydata2=mydata2, mydata3=mydata3)
>>>>
>>>> #function for regression
>>>> library(WRS)
>>>> foo_reg <- function(dat, xvar, yvar, mycol, pos, name.dat){
>>>>   tsts <- tstsreg(dat[[xvar]], dat[[yvar]])
>>>>   tsts_inter <- signif(tsts$coef[1], digits=3)
>>>>   tsts_slope <- signif(tsts$coef[2], digits=3)
>>>>   abline(tsts$coef, lty=1, col=mycol)
>>>>   legend(x=pos, legend=c(paste("TSTS ",name.dat,": Y=",tsts_inter,"+",
>>>>          tsts_slope,"X",sep="")), lty=1, col=mycol)
>>>> }
>>>>
>>> --
>>> Ivan CALANDRA
>>> PhD Student
>>> University of Hamburg
>>> Biozentrum Grindel und Zoologisches Museum
>>> Abt. Säugetiere
>>> Martin-Luther-King-Platz 3
>>> D-20146 Hamburg, GERMANY
>>> +49(0)40 42838 6231
>>> ivan.calan...@uni-hamburg.de
>>>
>>> **********
>>> http://www.for771.uni-bonn.de
>>> http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
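
P.S. On the "no lines found" error quoted above: Rprof() takes the name of an output file, not the code to be profiled. As far as I can tell, Rprof(for (...) {...}) evaluates the loop, gets NULL back and so just switches profiling off, which would leave nothing for summaryRprof() to read. Here is a minimal sketch of the start/run/stop pattern, writing to a temporary file instead of ./Rprof.out. slow_work() and prof_file are made-up names for illustration and stand in for the real loop; the iteration count is only chosen so that the run is long enough to be sampled.

```r
# Hypothetical stand-in for the code to be profiled (not from the original post).
slow_work <- function(n = 5e6) {
  total <- 0
  for (i in seq_len(n)) total <- total + sin(i) + cos(i) + sqrt(i)
  total
}

# Write the samples to a temporary file rather than ./Rprof.out.
prof_file <- tempfile(fileext = ".out")

Rprof(prof_file)        # start sampling; the argument is a file name, not code
result <- slow_work()   # everything run between the two calls is profiled
Rprof(NULL)             # stop sampling

# Summarise the samples that were written to prof_file.
prof <- summaryRprof(prof_file)
print(head(prof$by.self))   # time spent in each function's own code
```

prof$by.total lists the same functions with the time spent in their callees included, which is usually the more useful view when the cost is buried inside a single call such as tstsreg().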