I tried the code below with Python 2.x. For a given str or unicode object,
it copies the bytes in memory (char*) to a list of 1-character strings.
I'm getting

"hello" =  ['h', 'e', 'l', 'l', 'o']
u"hello" =  ['h', '\x00', 'e', '\x00', 'l', '\x00', 'l', '\x00', 'o', '\x00']
U"hello" =  ['h', '\x00', 'e', '\x00', 'l', '\x00', 'l', '\x00', 'o', '\x00']

on platforms with sizeof(PY_UNICODE_TYPE) = 2 and

"hello" =  ['h', 'e', 'l', 'l', 'o']
u"hello" =  ['h', '\x00', '\x00', '\x00', 'e', '\x00', '\x00', '\x00', 'l', '\x00', '\x00', '\x00', 'l', '\x00', '\x00', '\x00', 'o', '\x00', '\x00', '\x00']
U"hello" =  ['h', '\x00', '\x00', '\x00', 'e', '\x00', '\x00', '\x00', 'l', '\x00', '\x00', '\x00', 'l', '\x00', '\x00', '\x00', 'o', '\x00', '\x00', '\x00']

on platforms with sizeof(PY_UNICODE_TYPE) = 4.
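
The change in list length can be reproduced without Python. The following
is only a sketch, not the Boost.Python function at the end of this message:
it assumes a C++11 compiler and uses std::u16string and std::u32string as
stand-ins for the 2-byte and 4-byte PY_UNICODE_TYPE builds. The name
bytes_as_char_list is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost.Python or the Python C API):
// view the storage of a fixed-width character string as raw bytes and
// return one 1-character std::string per byte.
template <typename CharT>
std::vector<std::string>
bytes_as_char_list(std::basic_string<CharT> const& s)
{
  // Reading an object's storage through const char* is permitted.
  const char* c = reinterpret_cast<const char*>(s.data());
  // Size in bytes, not in characters.
  std::size_t n = s.size() * sizeof(CharT);
  std::vector<std::string> result;
  result.reserve(n);
  for (std::size_t i = 0; i < n; i++) {
    result.push_back(std::string(c + i, 1u));
  }
  return result;
}
```

Called with std::string("hello") this yields 5 entries, with
std::u16string(u"hello") 10 entries, and with std::u32string(U"hello")
20 entries. Whether 'h' comes before or after its '\x00' padding bytes
depends on the byte order of the machine.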

Will the results be different using Python 3?

I have quite a few C++ functions with const char* arguments, expecting one byte 
per character.

> - convert char* and std::string to/from Python 3 unicode string.

How would this work exactly?
Is the plan to copy the unicode data to a temporary one-byte-per-character 
buffer?
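
To make the question concrete: one way such a temporary buffer could be
filled is by encoding the code points as UTF-8. This is purely an
illustration under that assumption; nothing in this thread fixes the
encoding, and to_utf8 is a hypothetical name, not a Boost.Python or Python
C API function. It assumes a C++11 compiler.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: encode UTF-32 code points into a temporary
// UTF-8 buffer that can be passed on as const char*.
std::string to_utf8(std::u32string const& s)
{
  std::string out;
  for (char32_t cp : s) {
    if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF) {
      throw std::invalid_argument("not a valid Unicode code point.");
    }
    if (cp < 0x80) {
      // ASCII: exactly one byte per character.
      out += static_cast<char>(cp);
    }
    else if (cp < 0x800) {
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else if (cp < 0x10000) {
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else {
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}
```

With such an encoding only ASCII text ends up as one byte per character;
any other character takes two to four bytes, so C++ functions that index
the buffer by character position would still see a difference.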




----- Original Message ----
From: Stefan Seefeld <seef...@sympatico.ca>
To: Development of Python/C++ integration <cplusplus-sig@python.org>
Sent: Wednesday, March 18, 2009 11:18:03 AM
Subject: Re: [C++-sig] Some thoughts on py3k support

Haoyu Bai wrote:
>
> Yes of course we should allow users to set policy. So the problem is
> what the default behavior should be when there is no policy set by
> user explicitly. The candidates are:
>
> - raise an error
> - convert char* and std::string to/from Python bytes
> - convert char* and std::string to/from Python 3 unicode string.
>
> I personally like the last one because it would keep most of the
> existing code compatible.
>  

I agree.

Thanks,
       Stefan



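  #include <boost/python.hpp>
  #include <stdexcept>
  #include <string>

  // Python 2.x only: PyString_* and the PyUnicode_*_DATA* macros used
  // here do not exist in this form in Python 3.
  boost::python::list
  str_or_unicode_as_char_list(
    boost::python::object const& O)
  {
    PyObject* obj = O.ptr();
    boost::python::ssize_t n; // number of bytes to copy
    const char* c;            // start of the object's internal buffer
    if (PyString_Check(obj)) {
      n = PyString_GET_SIZE(obj);
      c = PyString_AS_STRING(obj);
    }
    else if (PyUnicode_Check(obj)) {
      // Size in bytes: number of characters * sizeof(PY_UNICODE_TYPE).
      n = PyUnicode_GET_DATA_SIZE(obj);
      c = PyUnicode_AS_DATA(obj);
    }
    else {
      throw std::invalid_argument("str or unicode object expected.");
    }
    // Copy each byte into its own 1-character string.
    boost::python::list result;
    for(boost::python::ssize_t i=0;i<n;i++) {
      result.append(std::string(c+i, 1u));
    }
    return result;
  }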
