Birsch wrote:
> Thanks Michael and Dino.
>
> I'll profile and send an update. Got a good profiler recommendation for .NET?
> Meanwhile I noticed the sample site below causes BeautifulSoup to
> generate quite a few [python] exceptions during __init__. Does
> IronPython handle exceptions significantly slower than CPython?
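The exception question can be checked directly with a small micro-benchmark run under both interpreters. What follows is only a sketch: the function names are made up for illustration, it does not touch BeautifulSoup, and its numbers say nothing about where BeautifulSoup itself spends its time. It sticks to constructs that should also work on older Python 2 interpreters, though that has not been checked against IronPython 1.1.1.

```python
import timeit


def lookup_with_exception(d, key):
    """Fetch a key using try/except; raises and catches KeyError on a miss."""
    try:
        return d[key]
    except KeyError:
        return None


def lookup_with_check(d, key):
    """Fetch a key with an explicit membership test; never raises."""
    if key in d:
        return d[key]
    return None


def time_calls(func, d, key, repeat=100000):
    """Return elapsed seconds for `repeat` calls of func(d, key)."""
    start = timeit.default_timer()
    for _ in range(repeat):
        func(d, key)
    return timeit.default_timer() - start


if __name__ == "__main__":
    data = {"present": 1}
    # Every lookup misses, so the first variant raises on each call.
    raising = time_calls(lookup_with_exception, data, "missing")
    checking = time_calls(lookup_with_check, data, "missing")
    print("try/except on miss: %.4f s" % raising)
    print("explicit check:     %.4f s" % checking)
```

Comparing the ratio of the two timings on each runtime gives a rough idea of the relative cost of raising and catching an exception there. It is not a substitute for profiling the real workload.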
I wouldn't make any assumptions about what is taking the time until you
have profiled. :-)

Michael
http://www.manning.com/foord

>
> Repro code is simple (just build a BeautifulSoup obj with mininova's
> home page).
> Here are the .py and .cs I used to time the diffs:
>
> *bstest.py:*
> #Bypass CPython default socket implementation with IPCE/FePy
> import imp, os, sys
> sys.modules['socket'] = module = imp.new_module('socket')
> execfile('socket.py', module.__dict__)
>
> from BeautifulSoup import BeautifulSoup
> from urllib import urlopen
> import datetime
>
> def getContent(url):
>     #Download html data
>     startTime = datetime.datetime.now()
>     print "Getting url", url
>     html = urlopen(url).read()
>     print "Time taken:", datetime.datetime.now() - startTime
>
>     #Make soup
>     startTime = datetime.datetime.now()
>     print "Making soup..."
>     soup = BeautifulSoup(markup=html)
>     print "Time taken:", datetime.datetime.now() - startTime
>
> if __name__ == "__main__":
>     print getContent("http://www.mininova.org")
>
>
> *C#:*
> using System;
> using System.Collections.Generic;
> using System.Text;
> using IronPython.Hosting;
>
> namespace IronPythonBeautifulSoupTest
> {
>     public class Program
>     {
>         public static void Main(string[] args)
>         {
>             //Init
>             System.Console.WriteLine("Starting...");
>             DateTime start = DateTime.Now;
>             PythonEngine engine = new PythonEngine();
>
>             //Add paths:
>             //BeautifulSoup.py, socket.py, bstest.py located on exe dir
>             engine.AddToPath(@".");
>             //CPython Lib (replace with your own)
>             engine.AddToPath(@"D:\Dev\Python\Lib");
>
>             //Import and load
>             TimeSpan span = DateTime.Now - start;
>             System.Console.WriteLine("[1] Import: " + span.TotalSeconds);
>             DateTime d = DateTime.Now;
>             engine.ExecuteFile(@"bstest.py");
>             span = DateTime.Now - d;
>             System.Console.WriteLine("[2] Load: " + span.TotalSeconds);
>
>             //Execute
>             d = DateTime.Now;
>             engine.Execute("getContent(\"http://www.mininova.org\")");
>             span = DateTime.Now - d;
>             System.Console.WriteLine("[3] Execute: " + span.TotalSeconds);
>             span = DateTime.Now - start;
>             System.Console.WriteLine("Total: " + span.TotalSeconds);
>         }
>     }
> }
>
>
> On Wed, Feb 20, 2008 at 6:57 PM, Dino Viehland <[EMAIL PROTECTED]> wrote:
>
> We've actually had this issue reported once before a long time ago
> - it's a very low CodePlex ID -
> http://www.codeplex.com/IronPython/WorkItem/View.aspx?WorkItemId=651
>
> We haven't had a chance to investigate the end-to-end scenario.
> If someone could come up with a smaller, simpler repro that'd be
> great. Otherwise, we haven't forgotten about it; we've just had
> more immediately pressing issues to work on :(.
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Michael Foord
> Sent: Wednesday, February 20, 2008 5:20 AM
> To: Discussion of IronPython
> Subject: Re: [IronPython] Slow Performance of CPython libs?
>
> Birsch wrote:
> > Hi - We've been using IronPython successfully to allow extensibility
> > of our application.
> >
> > Overall we are happy with the performance, with the exception of
> > BeautifulSoup which seems to run very slowly: x5 or more time to
> > execute compared to CPython.
> >
> > Most of the time seems to be spent during __init__() of BS, where the
> > markup is parsed.
> >
> > We suspect this has to do with the fact that our CPython env is
> > executing .pyc files and can precompile its libs, while the IronPython
> > environment compiles each iteration. We couldn't find a way to
> > pre-compile the libs and then introduce them into the code, but in any
> > case this will result in a large management overhead since the amount
> > of CPython libs we expose to our users contains 100's of modules.
> >
> > Any ideas on how to optimize?
>
> I think it is worth doing real profiling to find out where the time is
> being spent during parsing.
>
> If it is spending most of the time in '__init__' then the time is
> probably not spent in importing - so compilation isn't relevant and it
> is a runtime performance issue. (Importing is much slower with
> IronPython and at Resolver Systems we do use precompiled binaries - but
> strangely enough it doesn't provide much of a performance gain.)
>
> Michael
> http://www.manning.com/foord
>
> > Thanks,
> > -Birsch
> >
> > Note: we're using FePy/IPCE libs with regular IP v1.1.1 runtime DLLs
> > (this was done to overcome library incompatibilities and network
> > errors). However, the relevant slow .py code (mainly SGMLParser and
> > BeautifulSoup) is the same.

_______________________________________________
Users mailing list
[email protected]
http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
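The profiling advice in this thread can be tried on the CPython side with the standard-library cProfile and pstats modules, which at least gives a baseline of which functions dominate the parse. The sketch below makes several assumptions: it is written for a current CPython 3 rather than IronPython 1.1.1, `parse_markup` is a hypothetical stand-in for the real `BeautifulSoup(markup=html)` call, and `profile_call` is a helper name invented here. It does not answer the question about a .NET profiler.

```python
import cProfile
import io
import pstats


def parse_markup(html):
    """Hypothetical stand-in for BeautifulSoup(markup=html); swap in the real call."""
    return [chunk.split(">")[0] for chunk in html.split("<") if chunk]


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and keep only the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


if __name__ == "__main__":
    html = "<html><body>" + "<p>text</p>" * 1000 + "</body></html>"
    tags, report = profile_call(parse_markup, html)
    print(report)
```

Sorting by cumulative time puts the outermost expensive calls first, which is usually the quickest way to see whether the time goes into the parser itself or into something it calls repeatedly.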
