Hi,

This problem is fixed. I was running an older kernel, 2.6.28. I upgraded to 2.6.32-5 (Debian Sid/Squeeze) and the performance on Linux is now similar to Windows.
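Since the kernel version turned out to matter here, it may help to record the platform details next to any timings being compared across machines. Below is a minimal sketch using only the Python standard library (Python 3 assumed). The names platform_summary and stage_timer are made up for illustration and are not part of rpy2 or of the function quoted below.

```python
import platform
import time
from contextlib import contextmanager


def platform_summary():
    """Return the OS name, kernel/OS release and Python version as a dict."""
    return {
        "system": platform.system(),    # e.g. "Linux" or "Windows"
        "release": platform.release(),  # kernel release string on Linux
        "python": platform.python_version(),
    }


@contextmanager
def stage_timer(name, results):
    """Store the wall-clock seconds spent inside the with-block in results[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[name] = time.perf_counter() - start


if __name__ == "__main__":
    timings = {}
    with stage_timer("calc", timings):
        # Stand-in workload; in the real code this would be the R call.
        sum(i * i for i in range(100000))
    print(platform_summary(), timings)
```

Printing the summary together with the per-stage timings makes it easier to spot when two runs differ in kernel or OS release rather than in the code itself.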

Regards

Abhijit Bera

On Wed, May 19, 2010 at 6:49 PM, Carlos J. Gil Bellosta <c...@datanalytics.com> wrote:

> Dear Abhijit,
>
> If you think that table.CAPM is the culprit, you could run the call to
> such function in R on both platforms using Rprof to check which part
> of the function is producing the bottleneck.
>
> Best regards,
>
> Carlos J. Gil Bellosta
> http://www.datanalytics.com
>
>
> 2010/5/19 Abhijit Bera <abhib...@gmail.com>:
> > Update: it appears that the time taken isn't so much on the Data conversion.
> > The maximum time taken is in CAPM calculation. :( Anyone know why the CAPM
> > calculation would be faster on Windows?
> >
> > On Wed, May 19, 2010 at 5:51 PM, Abhijit Bera <abhib...@gmail.com> wrote:
> >
> >> Hi
> >>
> >> This is my function. It serves an HTML page after the calculations. I'm
> >> connecting to a MSSQL DB using pyodbc.
> >>
> >> def CAPM(self,client):
> >>
> >>     r=self.r
> >>
> >>     cds="1590"
> >>     bm="20559"
> >>
> >>     d1 = []
> >>     v1 = []
> >>     v2 = []
> >>
> >>     print"Parsing GET Params"
> >>
> >>     params=client.g[1].split("&")
> >>
> >>     for items in params:
> >>         item=items.split("=")
> >>
> >>         if(item[0]=="cds"):
> >>             cds=unquote(item[1])
> >>         elif(item[0]=="bm"):
> >>             bm=unquote(item[1])
> >>
> >>     print "cds: %s bm: %s" % (cds,bm)
> >>
> >>     print "Fetching data"
> >>
> >>     t3=datetime.now()
> >>
> >>     for row in self.cursor.execute("select * from (select * from ( select co_code,dlyprice_date,dlyprice_close from feed_dlyprice P where co_code in (%s,%s) ) DataTable PIVOT ( max(dlyprice_close) FOR co_code IN ([%s],[%s]) )PivotTable ) a order by dlyprice_date" %(cds,bm,cds,bm)):
> >>         d1.append(str(row[0]))
> >>         v1.append(row[1])
> >>         v2.append(row[2])
> >>
> >>     t4=datetime.now()
> >>
> >>     t1=datetime.now()
> >>
> >>     print "Calculating"
> >>
> >>     d1.pop(0)
> >>     d1vec = robjects.StrVector(d1)
> >>     v1vec = robjects.FloatVector(v1)
> >>     v2vec = robjects.FloatVector(v2)
> >>
> >>     r1 = r('Return.calculate(%s)' %v1vec.r_repr())
> >>     r2 = r('Return.calculate(%s)' %v2vec.r_repr())
> >>
> >>     tl = robjects.rlc.TaggedList([r1,r2],tags=('Geo','Nifty'))
> >>     df = robjects.DataFrame(tl)
> >>
> >>     ts2 = r.timeSeries(df,d1vec)
> >>     tsa = r.timeSeries(r1,d1vec)
> >>     tsb = r.timeSeries(r2,d1vec)
> >>
> >>     robjects.globalenv["ta"] = tsa
> >>     robjects.globalenv["tb"] = tsb
> >>     robjects.globalenv["t2"] = ts2
> >>     a = r('table.CAPM(ta,tb)')
> >>
> >>     t2=datetime.now()
> >>
> >>     page="<html><title>CAPM</title><body>Result:<br>%s<br>Time taken by DB:%s<br>Time taken by R:%s<br>Total time elapsed:%s<br></body></html>" %(str(a),str(t4-t3),str(t2-t1),str(t2-t3))
> >>     print "Serving page:"
> >>     #print page
> >>
> >>     self.serveResource(page,"text",client)
> >>
> >>
> >> On Linux
> >> Time taken by DB:0:00:00.024165
> >> Time taken by R:0:00:05.572084
> >> Total time elapsed:0:00:05.596288
> >>
> >> On Windows
> >> Time taken by DB:0:00:00.112000
> >> Time taken by R:0:00:02.355000
> >> Total time elapsed:0:00:02.467000
> >>
> >> Why is there such a huge difference in the time taken by R on the two
> >> platforms? Am I doing something wrong? It's my first Rpy2 code so I guess
> >> it's badly written.
> >>
> >> I'm loading the following libraries:
> >> 'PerformanceAnalytics','timeSeries','fPortfolio','fPortfolioBacktest'
> >>
> >> I'm using Rpy2 2.1.0 and R 2.11
> >>
> >> Regards
> >>
> >> Abhijit Bera
> >>
> >
> > ______________________________________________
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
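
A side note on the GET-parameter handling in the quoted function: the manual split on "&" and "=" can be replaced by the standard library's query-string parser. This is only a sketch, assuming Python 3 (the quoted code is Python 2). The helper name parse_capm_params is hypothetical; the default values are the ones hard-coded in the quoted function.

```python
from urllib.parse import parse_qs

# Defaults taken from the quoted CAPM function.
DEFAULTS = {"cds": "1590", "bm": "20559"}


def parse_capm_params(query, defaults=DEFAULTS):
    """Return the cds/bm values from a query string, falling back to defaults.

    parse_qs URL-decodes the values and returns a list per key; the last
    occurrence wins here, which matches the loop in the quoted code.
    """
    parsed = parse_qs(query)
    return {key: parsed.get(key, [default])[-1] for key, default in defaults.items()}


if __name__ == "__main__":
    print(parse_capm_params("cds=100&bm=200"))
```

Note that the quoted SQL is still built with string interpolation, so values parsed this way would need validating (or passing as query parameters where the driver allows it) before reaching the database.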