On Wed, Mar 7, 2012 at 9:39 PM, wrote:
> Ah. I think the array module should maintain compatibility with Python 3.2,
> i.e. "u" should continue to denote Py_UNICODE, i.e. 7fa098f6dc6a should be
> reverted.
>
> It may be that the 'u' code is not particularly useful, but AFAICT, it never
> was usef
On Wed, Mar 7, 2012 at 8:50 PM, Stefan Krah wrote:
> *If* the arrays that Victor mentioned give one character per array location,
> then memoryview(str) could be used for zero-copy slicing etc.
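
For illustration, a minimal sketch of what "one code point per array location" with zero-copy slicing could look like today. Since str does not export a buffer, the text is first encoded to UTF-32 (one copy), and the slicing then happens on a memoryview of those bytes. The helper name ucs4_view is hypothetical, and the cast assumes the native 'I' format is 4 bytes wide, which holds on common platforms but is not guaranteed.

```python
import sys

def ucs4_view(text):
    """Hypothetical helper: expose text as a memoryview of 4-byte code points."""
    # Match the native byte order so the cast below reads the values correctly.
    codec = "utf-32-le" if sys.byteorder == "little" else "utf-32-be"
    raw = text.encode(codec)            # the only copy is made here
    return memoryview(raw).cast("I")    # assumption: 'I' is 4 bytes

view = ucs4_view("héllo")
sub = view[1:4]                         # slicing the view copies no data
chars = "".join(map(chr, sub))          # code points back to a str
```

Here sub shares the buffer held by view, so only the final join allocates a new string.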
A slight tangent, but it's worth trying to stick to the "code point"
term when talking about what Unico
The main reason why I raised the issue is this: If Python 3.3 is shipped
with 'u' -> UCS4 in the array module and *then* someone figures out that
the above format codes are a great idea, we'd be stuck with yet another
format code incompatibility.
Ah. I think the array module should maintain comp
"Martin v. Löwis" wrote:
> > I think it would be nice for Python3.3 to implement the PEP-3118
> > suggestion:
> >
> > 'c' -> UCS1
> >
> > 'u' -> UCS2
> >
> > 'w' -> UCS4
>
> What is the use case for these format codes?
Unfortunately I've only worked with UTF-8 so far and I'm not too familiar
On Wed, Mar 7, 2012 at 4:15 AM, Stefan Krah wrote:
> Victor Stinner wrote:
>> A Unicode string is an array of code points. Another approach is to
>> expose such string as an array of uint8/uint16/uint32 integers. I
>> don't know if you expect to get a character / a substring when you
>> read the b
> I think it would be nice for Python3.3 to implement the PEP-3118
> suggestion:
>
> 'c' -> UCS1
>
> 'u' -> UCS2
>
> 'w' -> UCS4
What is the use case for these format codes?
Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mai
Victor Stinner wrote:
> > 'c' -> UCS1
> > 'u' -> UCS2
> > 'w' -> UCS4
>
> A Unicode string is an array of code points. Another approach is to
> expose such string as an array of uint8/uint16/uint32 integers. I
> don't know if you expect to get a character / a substring when you
> read the buffer o
> In the array module the 'u' specifier previously meant "2-bytes, on wide
> builds 4-bytes". Currently in 3.3 the 'u' specifier is mapped to UCS4.
>
> I think it would be nice for Python3.3 to implement the PEP-3118
> suggestion:
>
> 'c' -> UCS1
>
> 'u' -> UCS2
>
> 'w' -> UCS4
A Unicode string is
Hello,
In the array module the 'u' specifier previously meant "2-bytes, on wide
builds 4-bytes". Currently in 3.3 the 'u' specifier is mapped to UCS4.
I think it would be nice for Python 3.3 to implement the PEP-3118
suggestion:
'c' -> UCS1
'u' -> UCS2
'w' -> UCS4
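
To make the three widths concrete, here is a rough sketch that picks the narrowest of the proposed codes able to hold every code point of a string. The mapping is only the proposal from this message, not something the array or struct modules are known to implement, and the names PROPOSED_WIDTHS and narrowest_code are made up for the example.

```python
# Proposed format codes from this thread -> bytes per code point.
# This table is an assumption for illustration, not an existing API.
PROPOSED_WIDTHS = {"c": 1, "u": 2, "w": 4}

def narrowest_code(text):
    """Return the smallest proposed code that fits every code point in text."""
    top = max(map(ord, text), default=0)   # highest code point, 0 if empty
    if top < 0x100:
        return "c"      # fits in UCS1
    if top < 0x10000:
        return "u"      # fits in UCS2 (Basic Multilingual Plane)
    return "w"          # needs UCS4

for sample in ("abc", "\u20ac", "\U0001F600"):
    code = narrowest_code(sample)
    print(repr(sample), code, PROPOSED_WIDTHS[code])
```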
Actually we could even add '