On Thu, Dec 13, 2012 at 7:21 PM, Stefan Behnel <[email protected]> wrote:
> Maciej Fijalkowski, 13.12.2012 09:13:
>> On Thu, Dec 13, 2012 at 9:35 AM, Stefan Behnel wrote:
>>> Maciej Fijalkowski, 12.12.2012 20:10:
>>>> On Wed, Dec 12, 2012 at 7:06 PM, Joe Hillenbrand wrote:
>>>>> I was able to fix the issue with scrapy.
>>>>>
>>>>> https://github.com/joehillen/scrapy/commit/8778af5c5be50a5d746751352f8d710d1f24681c
>>>>>
>>>>> Unfortunately, scrapy takes twice as long in PyPy as in CPython. I suspect
>>>>> this is because lxml is twice as slow in PyPy as in CPython, which I found in
>>>>> lxml's benchmarks.
>>>>>
>>>>> Should lxml be added to the set of speed tests?
>>>>
>>>> No. lxml uses cpyext (the CPython extension compatibility layer), which is
>>>> and will forever be slow.
>>>
>>> Well, I don't think it would be hard for any PyPy core developer to make it
>>> twice as fast. Shouldn't be more than a day's work.
>>
>> I'm not so sure, we wouldn't know until someone tries it. What
>> optimizations did you have in mind?
>
> Anything that creates a proper fast-path in the ref-counting functions and
> that generally takes pressure off them, e.g. by keeping PyObjects alive in
> a weakref dict as long as the corresponding PyPy object lives, so that
> useless re-allocation cycles are avoided. I'm sure that really simple
> changes can bring a substantial improvement here.

Short-term allocations are usually very cheap; dictionary lookups are
not necessarily so. Do you have any specific optimizations in mind? I
don't see any easy way of doing it all.

>
>
>> For what it's worth, cpyext is not twice as slow, lxml is. cpyext is
>> likely 10-20x slower or so. I presume lowering the overhead would not
>> automatically make lxml twice as fast, since it's doing quite a lot of
>> other work.
>
> lxml's API performance suffers a lot from object/reference creation and
> deallocation time, so making object deallocation faster and making it
> happen only when necessary would certainly improve the overall performance.
>
> Stefan
>
>
> _______________________________________________
> pypy-dev mailing list
> [email protected]
> http://mail.python.org/mailman/listinfo/pypy-dev
