On Jul 13, 2009, at 3:35 PM, Eric Eisner wrote:

> On Tue, Jul 14, 2009 at 00:51, Robert Bradshaw wrote:
>> On Jul 13, 2009, at 6:47 AM, Eric Eisner wrote:
>>
>>> Hi,
>>>
>>> I was working on a wrapper for a c function that took an unsigned
>>> char* and its length (the string could have null bytes, so it needs a
>>> specific length). I was having some trouble getting cython to compile
>>> a simple conversion of string to unsigned char*, the way I eventually
>>> got it to work is:
>>>
>>> udata = <unsigned char*><char*>pydata
>>>
>>> This was a surprising requirement that took me a while to figure out.
>>> Is it intentional that strings cannot be directly cast to unsigned
>>> char?
>>
>> No, I don't think that's intentional.
>>
>>> If not, I assume this can be fixed easily...by someone who
>>> understands the code of course.
>>
>> Yes, I think that could be fixed relatively easy. However, note that
>> casting Python objects directly to char* is skirting all
>> unicode/charset issues.
>>
>> - Robert
>
> This application was specifically supposed to be for arbitrary data
> bytes (hence needed the null bytes) and the term string was the 2.x
> nomenclature. For a 3.x version, it would definitely need to take
> bytes
Having null bytes has nothing to do with char vs. unsigned char. I've thought about this some more, and given the amount of casting it would take to get the C compiler to not complain when trying to treat unsigned char* as strings, I actually don't think it's at all natural to convert strings to unsigned char* directly, so the double cast above seems like the right thing to do (the first cast extracts the string data, the second changes the pointer type). The same would work if you wanted to treat the contents of pydata as a void* or an int*, etc.

It would, however, be worth an entry in the FAQ.

- Robert
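For reference, here is a rough C-level sketch of the two separate points made above: embedded null bytes are a question of carrying an explicit length, while char vs. unsigned char is only a question of pointer type, handled by an explicit cast. The names sum_bytes and sum_char_buffer are hypothetical stand-ins for illustration, not part of Cython or of the wrapped library.

```c
#include <stddef.h>

/* Hypothetical stand-in for the wrapped C function: it takes an
 * unsigned char buffer plus an explicit length, so embedded null
 * bytes are processed like any other byte. */
static unsigned long sum_bytes(const unsigned char *data, size_t len)
{
    unsigned long total = 0;
    for (size_t i = 0; i < len; i++)
        total += data[i];
    return total;
}

/* Hypothetical helper mirroring the second cast in the Cython
 * expression: the bytes are untouched, only the pointer type
 * changes from char* to unsigned char*. */
static unsigned long sum_char_buffer(const char *data, size_t len)
{
    const unsigned char *udata = (const unsigned char *)data;
    return sum_bytes(udata, len);
}
```

Calling sum_char_buffer on a buffer such as "a\0b\xff" with a length of 4 visits all four bytes, whereas strlen on the same buffer would stop at the first null byte regardless of whether the pointer is char* or unsigned char*.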
