On 2009-03-23 08:18, abhi wrote:
> On Mar 20, 5:47 pm, "M.-A. Lemburg" <m...@egenix.com> wrote:
>>> unicodeTest.c
>>> #include<Python.h>
>>> static PyObject *unicode_helper(PyObject *self,PyObject *args){
>>>     PyObject *sampleObj = NULL;
>>>     Py_UNICODE *sample = NULL;
>>>     if (!PyArg_ParseTuple(args, "O", &sampleObj)){
>>>         return NULL;
>>>     }
>>>     // Explicitly convert it to unicode and get Py_UNICODE value
>>>     sampleObj = PyUnicode_FromObject(sampleObj);
>>>     sample = PyUnicode_AS_UNICODE(sampleObj);
>>>     wprintf(L"database value after unicode conversion is : %s\n",
>>>             sample);
>> You have to use PyUnicode_AsWideChar() to convert a Python
>> Unicode object to a wchar_t representation.
>>
>> Please don't make any assumptions on what Py_UNICODE maps
>> to and always use the Unicode API for this. It is designed
>> to provide a portable interface and will not do more conversion
>> work than necessary.
>
> Hi Mark,
> Thanks for the help. I tried PyUnicode_AsWideChar() but I am
> getting the same result i.e. only the first letter.
>
> sample code:
>
> #include<Python.h>
>
> static PyObject *unicode_helper(PyObject *self,PyObject *args){
>     PyObject *sampleObj = NULL;
>     wchar_t *sample = NULL;
>     int size = 0;
>
>     if (!PyArg_ParseTuple(args, "O", &sampleObj)){
>         return NULL;
>     }
>
>     // use wide char function
>     size = PyUnicode_AsWideChar(databaseObj, sample,
>                                 PyUnicode_GetSize(databaseObj));

The third argument is the size of the buffer counted in wchar_t
characters, not in bytes (see the header comment below). The buffer
itself needs sizeof(wchar_t) * PyUnicode_GetSize(databaseObj) bytes
if you don't need a trailing NUL, and
sizeof(wchar_t) * (PyUnicode_GetSize(databaseObj) + 1) bytes if you do.

You also have to allocate the buffer that the wchar_t data is stored
in. Passing in a NULL pointer will result in a seg fault. The function
does not allocate a buffer for you:

/* Copies the Unicode Object contents into the wchar_t buffer w.  At
   most size wchar_t characters are copied.

   Note that the resulting wchar_t string may or may not be
   0-terminated.  It is the responsibility of the caller to make sure
   that the wchar_t string is 0-terminated in case this is required by
   the application.

   Returns the number of wchar_t characters copied (excluding a
   possibly trailing 0-termination character) or -1 in case of an
   error. */

PyAPI_FUNC(Py_ssize_t) PyUnicode_AsWideChar(
    PyUnicodeObject *unicode,   /* Unicode object */
    register wchar_t *w,        /* wchar_t buffer */
    Py_ssize_t size             /* size of buffer */
    );

>     printf("%d chars are copied to sample\n", size);
>     wprintf(L"database value after unicode conversion is : %s\n",
>             sample);
>     return Py_BuildValue("");
>
> }
>
>
> static PyMethodDef funcs[]={{"unicodeTest",(PyCFunction)
> unicode_helper,METH_VARARGS,"test ucs2, ucs4"},{NULL}};
>
> void initunicodeTest(void){
>     Py_InitModule3("unicodeTest",funcs,"");
>
> }
>
> This prints the following when input value is given as "test":
> 4 chars are copied to sample
> database value after unicode conversion is : t
>
> Any ideas?
>
> -
> Abhigyan
> --
> http://mail.python.org/mailman/listinfo/python-list

-- 
Marc-Andre Lemburg
eGenix.com
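
PS: A minimal sketch of the caller-side buffer handling described
above, written in plain C so that it compiles without the Python
headers. copy_wide() is a hypothetical stand-in made up for this
illustration; it only mimics the contract stated in the header comment
of PyUnicode_AsWideChar() (copy at most size wchar_t characters, not
necessarily 0-terminated, return the number copied). to_wide_buffer()
is likewise just an illustrative name, not part of any API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in following the documented contract of
   PyUnicode_AsWideChar(): copies at most `size` wchar_t characters
   from src (src_len characters long) into w, appends a terminating 0
   only if there is room for it, and returns the number of characters
   copied, excluding the terminator. */
static long copy_wide(const wchar_t *src, size_t src_len,
                      wchar_t *w, size_t size)
{
    size_t n = src_len < size ? src_len : size;

    memcpy(w, src, n * sizeof(wchar_t));
    if (size > src_len)
        w[src_len] = L'\0';
    return (long)n;
}

/* Caller side: allocate room for src_len characters plus a
   terminator, copy, then terminate explicitly instead of relying on
   the copy function to do so.  The caller must free() the result. */
static wchar_t *to_wide_buffer(const wchar_t *src, size_t src_len)
{
    wchar_t *buf = malloc((src_len + 1) * sizeof(wchar_t));
    long copied;

    if (buf == NULL)
        return NULL;
    copied = copy_wide(src, src_len, buf, src_len);
    assert(copied >= 0 && (size_t)copied <= src_len);
    buf[copied] = L'\0';
    return buf;
}
```

In the extension itself the same pattern would presumably be: allocate
PyUnicode_GetSize(databaseObj) + 1 wchar_t characters, call
PyUnicode_AsWideChar() with that buffer, set buf[copied] to 0 and free
the buffer when done. Also note that in standard C the %s conversion
of wprintf() expects a narrow char *, while %ls is the conversion for
wchar_t *; that may be why only the first letter shows up on some
platforms.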