Fredrik Lundh wrote:
> John Nagle wrote:
>
>> I'd like to hear more about what kind of performance gain can be
>> obtained from "__slots__".  I'm looking into ways of speeding up
>> HTML parsing via BeautifulSoup.  If a significant speedup can be
>> obtained when navigating large trees of small objects, that's worth
>> quite a bit to me.
>
> The following micro-benchmarks are from Python 2.5 on a Core Duo
> machine.  C0 is an old-style class, C1 is a new-style class, C2 is a
> new-style class using __slots__:
>
> # read access
> $ timeit -s "import q; o = q.C0(); o.attrib = 1" "o.attrib"
> 10000000 loops, best of 3: 0.133 usec per loop
> $ timeit -s "import q; o = q.C1(); o.attrib = 1" "o.attrib"
> 10000000 loops, best of 3: 0.184 usec per loop
> $ timeit -s "import q; o = q.C2(); o.attrib = 1" "o.attrib"
> 10000000 loops, best of 3: 0.161 usec per loop
>
> # write access
> $ timeit -s "import q; o = q.C0(); o.attrib = 1" "o.attrib = 1"
> 10000000 loops, best of 3: 0.15 usec per loop
> $ timeit -s "import q; o = q.C1(); o.attrib = 1" "o.attrib = 1"
> 1000000 loops, best of 3: 0.217 usec per loop
> $ timeit -s "import q; o = q.C2(); o.attrib = 1" "o.attrib = 1"
> 1000000 loops, best of 3: 0.209 usec per loop
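
For reference, the q module behind the quoted timings was not posted. A minimal sketch of what it might look like, assuming three empty classes that differ only in how they are declared. The bench() helper is a hypothetical convenience for rerunning the same statements from Python instead of the shell, not part of the quoted setup, and it needs a Python 3 interpreter, where C0 is just another new-style class because old-style classes no longer exist:

```python
# q.py: hypothetical reconstruction; the original module was not posted.
import timeit


class C0:
    """Old-style class under Python 2; identical to C1 under Python 3."""
    pass


class C1(object):
    """New-style class; attributes live in a per-instance __dict__."""
    pass


class C2(object):
    """New-style class using __slots__; instances have no __dict__."""
    __slots__ = ["attrib"]


def bench(cls, stmt, number=1000000):
    """Return the best-of-3 time for stmt, in microseconds per loop."""
    o = cls()
    o.attrib = 1
    # globals= makes "o" visible to the timed statement (Python 3.5+).
    times = timeit.repeat(stmt, globals={"o": o}, repeat=3, number=number)
    return min(times) / number * 1e6


if __name__ == "__main__":
    for stmt in ("o.attrib", "o.attrib = 1"):
        for cls in (C0, C1, C2):
            print("%-14s %s: %.3f usec per loop"
                  % (stmt, cls.__name__, bench(cls, stmt)))
```

Absolute numbers from a run like this will differ from the ones quoted, since they depend on the interpreter version and the machine.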

Not much of a win there.  Thanks.

>
> > I'm looking into ways of speeding up HTML parsing via BeautifulSoup.
>
> The solution to that is spelled "lxml".

I may eventually have to go to a non-Python solution.  But I've finally
made enough robustness fixes to BeautifulSoup that it's usable on large
numbers of real-world web sites.  (Only two exceptions in the last
100,000 web sites processed.  If you want to exercise your HTML parser
on hard cases, run hostile-code web sites through it.)

				John Nagle