On Friday, 17 August 2012 01:59:31 UTC+2, Terry Reedy wrote:
> a = '…'
> print(ord(a))
>  >>>
> 8230
>
> Most things with unicode are easier in 3.x, and some are even better in
> 3.3. The current beta is good enough for most informal work. 3.3.0 will
> be out in a month.
>
> -- 
> Terry Jan Reedy

Slightly off topic.

The character '…', Unicode name 'HORIZONTAL ELLIPSIS',
is one of those characters that exist in the cp1252 and mac-roman
coding schemes but not in iso-8859-1 (latin-1), and obviously
not in ascii. It causes Py3.3 to run several hundred percent slower
than Py<3.3 versions because of the flexible string representation
(ascii/latin-1/ucs-2/ucs-4); I found cases of up to 1000%.

>>> '…'.encode('cp1252')
b'\x85'
>>> '…'.encode('mac-roman')
b'\xc9'
>>> '…'.encode('iso-8859-1') # latin-1
Traceback (most recent call last):
  File "<eta last command>", line 1, in <module>
UnicodeEncodeError: 'latin-1' codec can't encode character '\u2026'
in position 0: ordinal not in range(256)
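
To make the "flexible string representation" mentioned above a little
more concrete, here is a minimal sketch. It assumes CPython 3.3 or
later; the helper name bytes_per_char is made up for this illustration,
and sys.getsizeof reports an implementation detail, so other
interpreters may behave differently. It only shows how much storage
each character takes, it does not measure speed.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate the storage used per character by a string made only of ch.

    Hypothetical helper for illustration. Subtracting the sizes of two
    strings of different lengths cancels out the fixed object header.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

# One sample character from each range: ascii, latin-1, BMP, non-BMP.
for ch in ("a", "\u00e9", "\u2026", "\U0001F600"):
    print("U+%04X" % ord(ch), bytes_per_char(ch), "byte(s) per character")
```

On such an interpreter, a string containing '…' (U+2026) should be
stored with 2 bytes per character, whereas a pure ascii or latin-1
string needs only 1.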

Even if one could neglect this (typographically important) glyph,
what about the characters of the European scripts (languages)
present in cp1252 or in mac-roman but not in latin-1 (e.g. the
French script/language)?
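
As a small illustration of the French case, the sketch below checks
three characters used in French (œ, Œ, Ÿ) against the three codecs
discussed here. The helper can_encode and the sample string are
assumptions chosen for this example, and the sample is not an
exhaustive list.

```python
def can_encode(ch, codec):
    """Return True if the codec can represent ch (hypothetical helper)."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# Illustrative sample only: U+0153 (oe ligature), U+0152 (OE ligature),
# U+0178 (Y with diaeresis).
FRENCH_SAMPLE = "\u0153\u0152\u0178"

for ch in FRENCH_SAMPLE:
    results = {codec: can_encode(ch, codec)
               for codec in ("cp1252", "mac-roman", "iso-8859-1")}
    print("U+%04X" % ord(ch), results)
```

Each of these characters should be reported as encodable in cp1252
and mac-roman, but not in iso-8859-1.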

Very nice. Python 2 was built for ascii users; now Python 3 is
*optimized* for, let's say, ascii users!

The future is bright for Python. French users are better
served by Apple or MS products, simply because these
corporations know you cannot write French with iso-8859-1.

PS When "TeX" moved from the ascii encoding to iso-8859-1
and the so-called Cork encoding, "they" knew this and provided
all the complementary packages to work around it. It was
in 199? (Python was not even born).

Ditto for the foundries (Adobe, Linotype, ...)

jmf
-- 
http://mail.python.org/mailman/listinfo/python-list
