Re: 'Straße' ('Strasse') and Python 2

Robin Becker Wed, 15 Jan 2014 04:52:49 -0800

On 15/01/2014 12:13, Ned Batchelder wrote:
........

On my utf8 based system

robin@everest ~:
$ cat ooo.py
if __name__=='__main__':
    import sys
    s='A̅B'
    print('version_info=%s\nlen(%s)=%d' % (sys.version_info,s,len(s)))
robin@everest ~:
$ python ooo.py
version_info=sys.version_info(major=3, minor=3, micro=3, releaselevel='final', serial=0)
len(A̅B)=3
robin@everest ~:
$

........

You are right that more than one codepoint makes up a grapheme, and that you'll
need code to deal with the correspondence between them. But let's not muddy
these already confusing waters by referring to that mapping as an encoding.

In Unicode terms, an encoding is a mapping between codepoints and bytes.  Python
3's str is a sequence of codepoints.

Semantics is everything. For me graphemes are the endpoint (or should be); to
get a proper rendering of a sequence of graphemes I can use either a sequence of
bytes or a sequence of codepoints. They are both encodings of the graphemes.
What Unicode calls an encoding doesn't define what encodings are in general,
i.e. mappings from some source alphabet to a target alphabet.
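
To make the three levels concrete, here is a rough sketch that counts graphemes,
codepoints and UTF-8 bytes for the same string. It assumes the mark in the
example above is U+0305 COMBINING OVERLINE. The helper rough_graphemes is
hypothetical and only approximates grapheme clusters by attaching combining
marks to the preceding base character; it is not full Unicode text segmentation.

```python
import unicodedata

def rough_graphemes(text):
    """Group each base codepoint with the combining marks that follow it.

    Approximation only: real grapheme cluster rules cover many more cases.
    """
    clusters = []
    for ch in text:
        # combining() returns a non-zero class for combining marks
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

s = 'A\u0305B'  # assumed: 'A', COMBINING OVERLINE, 'B'
codepoints = len(s)                    # str is a sequence of codepoints
utf8_bytes = len(s.encode('utf-8'))    # bytes after UTF-8 encoding
graphemes = len(rough_graphemes(s))    # approximate user-perceived characters
print('graphemes=%d codepoints=%d utf8_bytes=%d'
      % (graphemes, codepoints, utf8_bytes))
```

Under that assumption the same text comes out as 2 graphemes, 3 codepoints and
4 bytes, which is the layering I mean.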

--
Robin Becker

--
https://mail.python.org/mailman/listinfo/python-list
