Re: Unicode problem in ucs4

2009-03-25 Thread abhi
On Mar 24, 4:55 am, "Martin v. Löwis" wrote: > > So, both Py_UNICODE and wchar_t are 4 bytes and since it contains 3 > > \0s after a char, printf or wprintf is only printing one letter. > > No. printf indeed will see a terminating character. However, wprintf > should correctly know that a wchar_t

Re: Unicode problem in ucs4

2009-03-23 Thread Martin v. Löwis
> So, both Py_UNICODE and wchar_t are 4 bytes and since it contains 3 > \0s after a char, printf or wprintf is only printing one letter. No. printf indeed will see a terminating character. However, wprintf should correctly know that a wchar_t has four bytes per character, and print it correctly. M
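
A minimal sketch of the distinction Martin draws, assuming a platform where wchar_t is wider than one byte. The helper names narrow_view_length and wide_view_length are hypothetical and not part of the thread. Byte-oriented functions stop at the first zero byte, while wide functions stop at the first zero wchar_t:

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: length as seen by byte-oriented functions such as
   printf("%s"). Scanning stops at the first zero byte, which for ASCII
   text stored in a multi-byte wchar_t comes right after (or before) the
   first character's non-zero byte. */
size_t narrow_view_length(const wchar_t *ws)
{
    return strlen((const char *)ws);
}

/* Hypothetical helper: length as seen by wide functions such as
   wprintf(L"%ls"). Scanning stops at the first zero wchar_t. */
size_t wide_view_length(const wchar_t *ws)
{
    return wcslen(ws);
}
```

For L"test", wide_view_length returns 4 regardless of the width of wchar_t, whereas narrow_view_length returns at most 1, which matches the "only one letter" symptom when the data is read as bytes.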

Re: Unicode problem in ucs4

2009-03-23 Thread M.-A. Lemburg
On 2009-03-23 12:57, abhi wrote: >>> Is there any way >>> by which I can force wchar_t to be 2 bytes, or can I convert this UCS4 >>> data to UCS2 explicitly? >> Sure: just use the appropriate UTF-16 codec for this. >> >> /* Generic codec based encoding API. >> >>object is passed through the enc
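
For illustration only, here is a hand-rolled sketch of what an explicit UCS4 to UTF-16 conversion involves. It is not the Python codec API recommended above; the function name ucs4_to_utf16 is hypothetical, and the choice to substitute U+FFFD for invalid input is an assumption of this sketch:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert n UCS4 code points to UTF-16 code units.
   out must have room for 2 * n units. Returns the number of units written. */
size_t ucs4_to_utf16(const uint32_t *in, size_t n, uint16_t *out)
{
    size_t i, j = 0;
    for (i = 0; i < n; i++) {
        uint32_t c = in[i];
        if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) {
            /* Out of range or a lone surrogate: emit U+FFFD
               (an assumption of this sketch). */
            out[j++] = 0xFFFD;
        } else if (c >= 0x10000) {
            /* Outside the BMP: split into a surrogate pair. */
            c -= 0x10000;
            out[j++] = (uint16_t)(0xD800 | (c >> 10));
            out[j++] = (uint16_t)(0xDC00 | (c & 0x3FF));
        } else {
            /* BMP code point: one 16-bit unit. */
            out[j++] = (uint16_t)c;
        }
    }
    return j;
}
```

Code points above U+FFFF need two 16-bit units, so a 2-byte representation is not always one unit per character. Inside an extension module, the codec route suggested in the reply is likely the better choice.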

Re: Unicode problem in ucs4

2009-03-23 Thread M.-A. Lemburg
On 2009-03-23 14:05, abhi wrote: > Hi Marc, >Is there any way to ensure that wchar_t size would always be 2 > instead of 4 in ucs4 configured python? Googling gave me the > impression that there is some logic written in PyUnicode_AsWideChar() > which can take care of ucs4 to ucs2 conversion

Re: Unicode problem in ucs4

2009-03-23 Thread abhi
On Mar 23, 4:57 pm, abhi wrote: > On Mar 23, 4:37 pm, "M.-A. Lemburg" wrote: > > > > > On 2009-03-23 11:50, abhi wrote: > > > > On Mar 23, 3:04 pm, "M.-A. Lemburg" wrote: > > > Thanks Marc, John, > > >          With your help, I am at least somewhere. I re-wrote the code > > > to compare Py_Unic

Re: Unicode problem in ucs4

2009-03-23 Thread abhi
On Mar 23, 4:37 pm, "M.-A. Lemburg" wrote: > On 2009-03-23 11:50, abhi wrote: > > > > > On Mar 23, 3:04 pm, "M.-A. Lemburg" wrote: > > Thanks Marc, John, > >          With your help, I am at least somewhere. I re-wrote the code > > to compare Py_Unicode and wchar_t outputs and they both look exac

Re: Unicode problem in ucs4

2009-03-23 Thread M.-A. Lemburg
On 2009-03-23 11:50, abhi wrote: > On Mar 23, 3:04 pm, "M.-A. Lemburg" wrote: > Thanks Marc, John, > With your help, I am at least somewhere. I re-wrote the code > to compare Py_Unicode and wchar_t outputs and they both look exactly > the same. > > #include > > static PyObject *unicode_

Re: Unicode problem in ucs4

2009-03-23 Thread abhi
On Mar 23, 3:04 pm, "M.-A. Lemburg" wrote: > On 2009-03-23 08:18, abhi wrote: > > > > > On Mar 20, 5:47 pm, "M.-A. Lemburg" wrote: > >>> unicodeTest.c > >>> #include > >>> static PyObject *unicode_helper(PyObject *self,PyObject *args){ > >>>    PyObject *sampleObj = NULL; > >>>            Py_UNIC

Re: Unicode problem in ucs4

2009-03-23 Thread M.-A. Lemburg
On 2009-03-23 08:18, abhi wrote: > On Mar 20, 5:47 pm, "M.-A. Lemburg" wrote: >>> unicodeTest.c >>> #include >>> static PyObject *unicode_helper(PyObject *self,PyObject *args){ >>>PyObject *sampleObj = NULL; >>>Py_UNICODE *sample = NULL; >>> if (!PyArg_ParseTuple(args, "O", &

Re: Unicode problem in ucs4

2009-03-23 Thread John Machin
On Mar 23, 6:41 pm, John Machin had a severe attack of backslashitis: > [presuming littleendian] The ucs4 string will look like "\t\0\0\0e > \0\0\0s\0\0\0t\0\0\0" in memory. I suspect that your wprintf is > grokking only 16-bit doodads -- "\t\0" is printed and then "\0\0" is > end-of-string. Try
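
A small sketch of the hypothesis above, assuming little-endian byte order as John does. The helper names ucs4_to_le_bytes and units_before_zero_u16 are hypothetical. The first writes each code point as four little-endian bytes; the second counts how many 16-bit units a reader would consume before hitting a zero unit:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write each code point as 4 little-endian bytes.
   out must have room for 4 * n bytes. Returns the number of bytes written. */
size_t ucs4_to_le_bytes(const uint32_t *cp, size_t n, unsigned char *out)
{
    size_t i;
    for (i = 0; i < n; i++) {
        out[4 * i + 0] = (unsigned char)(cp[i] & 0xFF);
        out[4 * i + 1] = (unsigned char)((cp[i] >> 8) & 0xFF);
        out[4 * i + 2] = (unsigned char)((cp[i] >> 16) & 0xFF);
        out[4 * i + 3] = (unsigned char)((cp[i] >> 24) & 0xFF);
    }
    return 4 * n;
}

/* Hypothetical helper: number of 16-bit units read before the first
   all-zero unit, i.e. what a reader that assumes 2-byte characters
   would treat as the string length. */
size_t units_before_zero_u16(const unsigned char *bytes, size_t nbytes)
{
    size_t i;
    for (i = 0; i + 1 < nbytes; i += 2) {
        if (bytes[i] == 0 && bytes[i + 1] == 0)
            return i / 2;
    }
    return nbytes / 2;
}
```

For the four code points of "test", the buffer starts with 't' followed by three zero bytes, and units_before_zero_u16 returns 1: a 16-bit reader sees one character and then end-of-string.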

Re: Unicode problem in ucs4

2009-03-23 Thread John Machin
On Mar 23, 6:18 pm, abhi wrote: [snip] > Hi Mark, >      Thanks for the help. I tried PyUnicode_AsWideChar() but I am > getting the same result i.e. only the first letter. > > sample code: > > #include > > static PyObject *unicode_helper(PyObject *self,PyObject *args){ >         PyObject *sampleO

Re: Unicode problem in ucs4

2009-03-23 Thread abhi
On Mar 20, 5:47 pm, "M.-A. Lemburg" wrote: > On 2009-03-20 12:13, abhi wrote: > > > > > > > On Mar 20, 11:03 am, "Martin v. Löwis" wrote: > >>> Any idea on why this is happening? > >> Can you provide a complete example? Your code looks correct, and should > >> just work. > > >> How do you know th

Re: Unicode problem in ucs4

2009-03-20 Thread M.-A. Lemburg
On 2009-03-20 12:13, abhi wrote: > On Mar 20, 11:03 am, "Martin v. Löwis" wrote: >>> Any idea on why this is happening? >> Can you provide a complete example? Your code looks correct, and should >> just work. >> >> How do you know the result contains only 't' (i.e. how do you know it >> does not c

Re: Unicode problem in ucs4

2009-03-20 Thread abhi
On Mar 20, 11:03 am, "Martin v. Löwis" wrote: > > Any idea on why this is happening? > > Can you provide a complete example? Your code looks correct, and should > just work. > > How do you know the result contains only 't' (i.e. how do you know it > does not contain 'e', 's', 't')? > > Regards, >

Re: Unicode problem in ucs4

2009-03-19 Thread Martin v. Löwis
> Any idea on why this is happening? Can you provide a complete example? Your code looks correct, and should just work. How do you know the result contains only 't' (i.e. how do you know it does not contain 'e', 's', 't')? Regards, Martin

Unicode problem in ucs4

2009-03-19 Thread abhi
Hi, I have a C extension which takes a unicode or string value from Python and converts it to unicode before doing more operations on it. The skeleton looks like: static PyObject *unicode_helper( PyObject *self, PyObject *args){ PyObject *sampleObj = NULL; Py_UNICODE *sample = NULL