Ton Hospel wrote:
> Ok, how about the following:
> 
> The backward compatibility problem probably mainly exists for 
> unpack("C*") which has traditionally been used to get to the 
> underlying bytes of an encoded string. In my patch C was a bit
> special anyways (I made >= 256 a croak to make it easy to catch 
> that style of usage), and for pack it has the property of wrapping
> for >= 256.

Ooh, didn't know that. But that's apparently obscurely documented in perlfunc.

    $ perl -le 'print pack("C*",256+ord("a"))' | od -c
    0000000   a  \n
    0000002
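
As a rough sketch of what that wrap amounts to, assuming pack("C") simply
keeps the low 8 bits of the value (which is what the output above suggests),
the two results below should come out the same. The variable names are only
for illustration.

```shell
# Assumption: pack("C") wraps by keeping only the low 8 bits of the value.
# 256 + ord("a") is 353, and 353 & 0xFF is 97, which is ord("a") again.
wrapped=$(perl -e 'print pack("C", 256 + ord("a"))')

# The same result computed by masking explicitly, for comparison.
masked=$(perl -e 'print chr((256 + ord("a")) & 0xFF)')

echo "$wrapped $masked"
```

If the assumption holds, both values are the single character "a".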

> With the companion patch I plan for pack that would no 
> longer be needed, but changing it would be another break with backward
> compatibility. So what if I leave "C" as old style "look through
> the encoding", make all other formats (except "c" and "C") encoding 
> neutral and introduce a new letter for encoding neutral "character",
> let's say "E" (suggestions for a better letter welcome, a pity "u" is
> already taken).

"W"? (reminiscent of the old wide-char type)

> I'd still like to make U0 and C0 mode into my "reversed" interpretation,
> but now that unpack("C*") ignores the mode I think the impact will be
> very low and worth doing to make pack and unpack consistent.

OK, go for it.
