On 13/03/2013 23:43, Chris Angelico wrote:
On Thu, Mar 14, 2013 at 3:49 AM, rusi <rustompm...@gmail.com> wrote:
On Mar 13, 3:59 pm, Chris Angelico <ros...@gmail.com> wrote:
On Wed, Mar 13, 2013 at 9:11 PM, rusi <rustompm...@gmail.com> wrote:
> Uhhh..
> Making the subject line useful for all readers

I should have read this one before replying in the other thread.

jmf, I'd like to see evidence that there has been a performance
regression compared against a wide build of Python 3.2. You still have
never answered this fundamental, that the narrow builds of Python are
*BUGGY* in the same way that JavaScript/ECMAScript is. And believe you
me, the utterly unnecessary hassles I have had to deal with when
permitting user-provided .js code to script my engine have wasted
rather more dev hours than you would believe - there are rather a lot
of stupid edge cases to deal with.

This assumes that there are only three choices:
- narrow build that is buggy (surrogate pairs for astral characters)
- wide build that is 4-fold space inefficient for wide variety of
common (ASCII) use-cases
- flexible string engine that chooses a small tradeoff of space
efficiency over time efficiency.

There is a fourth choice: narrow build that chooses to be partial over
being buggy. ie when an astral character is encountered, an exception
is thrown rather than trying to fudge it into a 16-bit
representation.

As a simple factual matter, narrow builds of Python 3.2 don't do that.
So it doesn't factor into my original statement. But if you're talking
about a proposal for 3.4, then sure, that's a theoretical possibility.
It wouldn't be "buggy" in the sense of "string indexing/slicing
unexpectedly does the wrong thing", but it would still be incomplete
Unicode support, and I don't think people would appreciate it. Much
better to have graceful degradation: if there are non-BMP characters
in the string, then instead of throwing an exception, it just makes
the string wider.

[snip]
Do you mean that instead of switching between 1/2/4 bytes per codepoint
it would switch between 2/4 bytes per codepoint?

--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to