Mistake or Troll (was Re: 'Straße' ('Strasse') and Python 2)

Terry Reedy Mon, 13 Jan 2014 15:16:24 -0800

On 1/13/2014 4:54 AM, [email protected] wrote:

I'm afraid I'm understanding Python (on this
aspect very well).


Really?

Do you belong to this group of people who are naively
writing wrong Python code (usually not properly working)
during more than a decade?

To me, the important question is whether this and previous similar posts are intentional trolls designed to stir up the flurry of responses they get or 'innocently' misleading or even erroneous. If your claim of understanding Python and Unicode is true, then this must be a troll post. Either way, please desist, or your access to python-list from google-groups may be removed.

'ß' is the fourth character in that text "Straße"
(base index 0).

As others have said, in the *unicode* text "Straße", 'ß' is the fifth character, at character index 4, ...

This assertions are correct (byte string and unicode).

whereas, when the text is encoded into bytes, the byte index depends on the encoding and the assertion that it is always 4 is incorrect. Did you know this or were you truly ignorant?
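To make the encoding dependence concrete, here is a minimal sketch, assuming Python 3 and three encodings picked only for illustration. The helper name index_4_is_sharp_s is hypothetical. It encodes the text and checks whether the single byte at index 4 is, by itself, the encoded 'ß', which is what the byte-string assertion effectively tests in Python 2.

```python
TEXT = u'Straße'

# In unicode text, 'ß' is always the fifth character (index 4).
assert TEXT[4] == u'ß'


def index_4_is_sharp_s(encoding):
    """Return True if the byte at index 4 of the encoded text is all of 'ß'.

    Hypothetical helper for illustration only.
    """
    data = TEXT.encode(encoding)
    return data[4:5] == u'ß'.encode(encoding)


for encoding in ('latin-1', 'utf-8', 'utf-16-le'):
    data = TEXT.encode(encoding)
    # Report the encoded length and whether index 4 holds the whole 'ß'.
    print(encoding, len(data), index_4_is_sharp_s(encoding))
```

With latin-1 the check succeeds because 'ß' is a single byte. With utf-8 the byte at index 4 is only the first of the two bytes that encode 'ß', and with utf-16-le it belongs to the 'r'.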

sys.version

'2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)]'

assert 'Straße'[4] == 'ß'


Sometimes true, sometimes not.

assert u'Straße'[4] == u'ß'

PS Nothing to do with Py2/Py3.

This issue has everything to do with Py2, where 'Straße' is encoded bytes, versus Py3, where 'Straße' is unicode text where each character of that word takes one code unit, whether each is 2 bytes or 4 bytes.

If you replace 'ß' with any astral (non-BMP) character, this issue appears even for unicode text in 3.2-, where an astral character requires 2, not 1, code units on narrow builds, thereby screwing up indexing, just as can happen for encoded bytes. In 3.3+, all characters use 1 code unit and indexing (and slicing) always works properly. This is another unicode issue that you appear not to understand, but might just be trolling.
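The astral case can be sketched the same way, assuming Python 3.3 or later. The character U+1D11E (MUSICAL SYMBOL G CLEF) is an arbitrary choice of astral character, and utf16_code_units is a hypothetical helper that counts the code units a narrow build would have used for the same text.

```python
# 'Stra' + an astral character in place of 'ß' + 'e'
ASTRAL_TEXT = u'Stra\U0001D11Ee'


def utf16_code_units(s):
    """Count UTF-16 code units, the storage unit of a narrow build.

    Hypothetical helper for illustration only.
    """
    return len(s.encode('utf-16-le')) // 2


# On 3.3+, every character is one code unit, so indexing is by character.
assert len(ASTRAL_TEXT) == 6
assert ASTRAL_TEXT[4] == u'\U0001D11E'
assert ASTRAL_TEXT[5] == u'e'

# The same text needs 7 UTF-16 code units, because the astral character
# takes a surrogate pair. 'Straße' needs only 6.
assert utf16_code_units(ASTRAL_TEXT) == 7
assert utf16_code_units(u'Straße') == 6
```

The extra code unit is what shifts every later index by one on a narrow build.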


--
Terry Jan Reedy



--
https://mail.python.org/mailman/listinfo/python-list
