On Mon, Sep 6, 2010 at 11:20 AM, Stefan Behnel <[email protected]> wrote:
> Robert Bradshaw, 06.09.2010 19:01:
>> On Mon, Sep 6, 2010 at 9:36 AM, Dag Sverre Seljebotn
>>> I don't understand this suggestion. What happens in each of these cases,
>>> for different settings of "from __future__ import unicode_literals"?
>>>
>>> cdef char* x1 = 'abc\u0001'
>
> As I said in my other mail, I don't think anyone would use the above in
> real code. The alternative below is just too obvious and simple.
>
>
>>> cdef char* x2 = 'abc\x01'
>>
>> from __future__ import unicode_literals (or -3)
>>
>> len(x1) == 4
>> len(x2) == 4
>>
>> Otherwise
>>
>> len(x1) == 9
>> len(x2) == 4
>
> Hmm, now *that* looks unexpected to me.

But this is *exactly* how Python handles

x1 = 'abc\u0001'
x2 = 'abc\x01'
len(x1), len(x2)

both with and without unicode_literals.

> The way I see it, a C string is the
> C equivalent of a Python byte string and should always and predictably
> behave like a Python byte string, regardless of the way Python object
> literals are handled.

Python bytes are very different from strings. C (and most C libraries)
use char* for both strings and binary data.

- Robert
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
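
For reference, the lengths quoted in this thread can be checked with a small sketch. It assumes Python 3, where plain string literals behave the way they do under unicode_literals, and uses bytes literals as a stand-in for Python 2 byte strings. The variable names are illustrative and not from the thread, and the \u case is written with an escaped backslash because bytes literals have no \u escape.

```python
# Text literals: \u0001 and \x01 each produce a single character,
# so both strings are 3 + 1 = 4 characters long.
text_u = 'abc\u0001'
text_x = 'abc\x01'

# Byte strings: there is no \u escape, so the six characters
# backslash, u, 0, 0, 0, 1 stay as they are (3 + 6 = 9 bytes).
# The backslash is doubled here only to avoid the invalid-escape
# warning that recent Python versions emit for b'abc\u0001'.
bytes_u = b'abc\\u0001'
# \x01 is a valid escape in a bytes literal: one byte, 4 in total.
bytes_x = b'abc\x01'

lengths = {
    "unicode_literals": (len(text_u), len(text_x)),
    "byte strings": (len(bytes_u), len(bytes_x)),
}

for mode, (len_u, len_x) in lengths.items():
    print(f"{mode}: len(x1) == {len_u}, len(x2) == {len_x}")
```

This only illustrates the Python-level behaviour being compared; it says nothing about what Cython does when such a literal is assigned to a char*.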
