Nick Coghlan added the comment:

I admit that the main thing that bothers me about the proposal in PEP 3118 is 
the inconsistency: c -> bytes, while u, w -> str.

This was less of an issue in 2.x (which was the main frame of reference when 
the PEP was written), with implicit str/unicode interoperability, but seems 
quite jarring in the 3.x world.

Status quo:
struct module: 'c' = individual bytes, 's' = multi-byte sequence
array module: the 'u' typecode may be either 2 or 4 bytes wide (Py_UNICODE); 
the addition of the 'w' typecode has been reverted
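
To make the status quo concrete, here is a small sketch, assuming a Python 3 interpreter in which the array module still accepts the 'u' typecode. The variable names are illustrative only.

```python
import array
import struct
import warnings

# 'c' packs one byte per item; each argument must be a bytes object of length 1.
packed_chars = struct.pack("3c", b"a", b"b", b"c")
# 's' packs a single multi-byte field; the count is the field width in bytes.
packed_string = struct.pack("3s", b"abc")

# Unpacking shows the difference: three 1-byte items versus one 3-byte item.
unpacked_chars = struct.unpack("3c", packed_chars)
unpacked_string = struct.unpack("3s", packed_string)

# The 'u' typecode takes str input; its item width is platform dependent.
# Any deprecation warning for 'u' is silenced (assumption: some versions warn).
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    u_arr = array.array("u", "abc")
u_itemsize = u_arr.itemsize  # 2 or 4, depending on the platform
```

Both struct codes work purely on bytes, while the array 'u' typecode accepts and returns str, which is the inconsistency described above.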

My current inclination is still to apply Victor's patch from #13072 (which 
changes array to export the appropriate integer typecodes for 'u' arrays) and 
otherwise punt on this for 3.3 and try to sort out the mess for 3.4.
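
As a rough illustration of what "appropriate integer typecodes" could mean, the sketch below picks an unsigned integer typecode whose size matches the 'u' item width and reinterprets the same bytes. The helper integer_typecode_for is hypothetical and is not taken from the patch in #13072.

```python
import array
import struct
import warnings


def integer_typecode_for(itemsize):
    """Hypothetical helper: return an unsigned integer typecode of this size."""
    for code in ("B", "H", "I", "L", "Q"):
        if struct.calcsize(code) == itemsize:
            return code
    raise ValueError("no unsigned integer typecode of size %d" % itemsize)


# Build a 'u' array (assumption: the interpreter still supports 'u').
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    u_arr = array.array("u", "hi")

# Choose the integer typecode that matches the platform's 'u' item width.
code = integer_typecode_for(u_arr.itemsize)

# Reinterpret the same raw bytes as unsigned integers of that width.
as_ints = array.array(code, u_arr.tobytes())
code_points = list(as_ints)
```

Seen this way, a consumer of the buffer gets plain integers of a known width and never needs to understand a 'u' format code.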

For 3.4, I'm inclined to favour Stefan's proposal of C, U, W mapping to 
multi-point sequences of UCS-1, UCS-2, UCS-4 code points (with corresponding 
typecodes in the array module).
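
The three fixed widths can be illustrated with standard codecs. This is only a sketch: the C, U, W codes are a proposal and do not exist in the struct module, and latin-1, utf-16-le and utf-32-le are used here as stand-ins for UCS-1, UCS-2 and UCS-4 (which holds for this text because every code point is below U+0100).

```python
# Two code points, both below U+0100, so all three widths can represent them.
text = "h\u00e9"

ucs1 = text.encode("latin-1")    # 1 byte per code point
ucs2 = text.encode("utf-16-le")  # 2 bytes per code point (no surrogates here)
ucs4 = text.encode("utf-32-le")  # 4 bytes per code point

# Bytes used per code point under each representation.
widths = tuple(len(data) // len(text) for data in (ucs1, ucs2, ucs4))
```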

Support for lowercase 'u' would then never become an official part of the 
buffer API, existing only as an array typecode.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue15625>
_______________________________________