> So why would you need three separate implementation of the unrolled
> loop? You already have a macro named WRITE_FLEXIBLE_OR_WSTR.

Depending on where the speedup comes from in this optimization, it
may well be that the overhead of figuring out where to store the
result eats the gain from the fast test.

> Even without taking into account the unrolled loop, I wonder how much
> slower UTF-8 decoding becomes with that approach, by the way. 

In some cases, tests show that it gets faster, overall, compared to 3.2.
This is probably because strings take less memory, which means less
copying, more cache locality, etc.

Of course, it still may be possible to apply micro-optimizations to
the new implementation.

> Instead of
> testing the "kind" variable at each loop iteration, using a
> stringlib-like approach may be a better deal IMO.

Well, things have to be done in order:
1. the PEP needs to be approved
2. the performance bottlenecks need to be identified
3. optimizations should be applied.

I'm not sure what you mean by "stringlib-like" approach - if you are
talking about templating, I'd rather avoid this for maintainability
reasons, unless significant improvements can be demonstrated. Torsten
had a version that used macros for that, and it was a pain to debug.
So we put correctness and readability first.

Regards,
Martin
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to