<Paul_Koning <at> Dell.com> writes:

> On Jun 9, 2014, at 9:40 PM, Christian K. <ckkart <at> hoc.net> wrote:
>
> > On 09.06.14 16:00, Paul_Koning <at> Dell.com wrote:
> >>
> >> On Jun 9, 2014, at 2:53 PM, Christian K. <ckkart <at> hoc.net> wrote:
> >>
> >>> <Paul_Koning <at> Dell.com> writes:
> >>>
> >>>> On Jun 9, 2014, at 9:07 AM, Christian K. <ckkart <at> hoc.net> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> I was very pleased to see that retrieving properties of a MAPI
> >>>>> object yields either a <str> or <bytes> type depending on whether
> >>>>> the _A or _W property was queried …
> >>>>
> >>>> Really? That seems strange. As I recall, the *_W APIs are “wide
> >>>> character” ones. So in Python 3, they should both map to <str>
> >>>> type. <bytes> applies only to non-text data.
> >>>
> >>> At least for text properties like e.g. PR_SUBJECT_A / _W the former
> >>> returns a mbcs encoded "string", i.e. of bytes type and the latter a
> >>> 2-byte unicode string. Binary properties are always returned as
> >>> bytes in contrast to earlier when using python2.
> >>
> >> Yes, “bytes” for binary values is clearly correct. But MBCS and
> >> “2 byte Unicode” (more accurately called either UCS-2 or UCS-2 BMP
> >> subset, not sure which) are both text strings. The different
> >> encoding in the API doesn’t mean they should be different datatypes
> >> in Python 3; both cases are properly mapped to “str”.
> >
> > No, this is not what I am seeing. MBCS encoded properties, i.e. those
> > terminating with _A are mapped to 'bytes' and the _W ones to 'str'
> > which is consistent with the handling of unicode and encoded
> > information in python3. And this is great indeed because having to
> > distinguish between strings which can be encoded or not while having
> > the same type is really painful.
>
> Perhaps I’m missing something.
>
> I’m used to Windows API calls that come in a foo_A and foo_W flavor,
> the only difference being that the _A flavor has ASCII arguments and
> the _W flavor has Unicode arguments (for those arguments that are,
> abstractly, strings).
>
> In Python 3, the “str” type is an abstract string; its character
> repertoire is Unicode but it doesn’t have an encoding. Instead,
> encoding and decoding is done when it is converted to/from external
> interfaces: files, external API calls, etc.

True, and the type which handles that data is called "str". Contrary to
what I said before, more than two bytes per character may have to be used
internally, since Unicode defines more than 100000 characters.

> So... I would expect foo_A and foo_W to have “str” arguments, and the
> interface machinery between Python3 and those functions would run the
> appropriate encoding to generate the string representation expected.
>
> For example, if a given API wants strings in ASCII form, it would be
> str.encode (“ascii”) or perhaps str.encode (“latin1”). If it wants
> MBCS data, it would be encode to that encoding. If 2-byte Unicode,
> it would be encode to ucs-2. And so on. Ditto in the reverse
> direction, when strings are delivered by an external function.

Whenever you encode a "str" object in Python 3, i.e. call its encode()
method, you end up with a "bytes" object. Conversely, only a "bytes"
object has a decode() method. So the distinction between the pool of
Unicode characters and their encoded representation is reflected by two
different types in Python:

  Unicode text:       "str"
  any encoded string: "bytes"

For that reason, if a function returns an encoded string, the return type
has to be "bytes", and this is what happens when retrieving the _A
properties. Try it yourself: type('a') yields <class 'str'> (Unicode
text), and type('a'.encode('ascii')) yields <class 'bytes'>.

> I would only want/expect to see “bytes” types when the values in
> question are binary data streams, or unknown format. But anytime we’re
> dealing with text strings, the Python 3 approach is that the Python
> code sees “str” type, and questions of encoding have been handled at
> the edge. This is where Python 3

This is not true, see above.

Christian
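
P.S. To make the str/bytes point concrete, here is a minimal sketch that runs on any Python 3 without MAPI. The codec names are my assumptions for illustration: "cp1252" stands in for the Windows ANSI code page (the "mbcs" codec exists only on Windows), and "utf-16-le" stands in for the 2-byte wide-character form. It does not show what the MAPI bindings do internally, only how encode() and decode() move between the two types.

```python
# Text in Python 3 is "str"; every encoded form of it is "bytes".
text = "Gr\u00fc\u00dfe"  # the str "Gruesse" spelled with u-umlaut and sharp s

# Assumption: cp1252 stands in for the ANSI code page used by _A properties.
ansi_bytes = text.encode("cp1252")
# Assumption: utf-16-le stands in for the wide-character form behind _W.
wide_bytes = text.encode("utf-16-le")

# Encoding always produces bytes, whatever the codec.
assert type(text) is str
assert type(ansi_bytes) is bytes
assert type(wide_bytes) is bytes

# Decoding with the matching codec gives back the original str.
assert ansi_bytes.decode("cp1252") == text
assert wide_bytes.decode("utf-16-le") == text

# Only str has encode(); only bytes has decode().
assert not hasattr(text, "decode")
assert not hasattr(ansi_bytes, "encode")

print(type(text).__name__, type(ansi_bytes).__name__,
      len(ansi_bytes), len(wide_bytes))
```

The same five characters occupy 5 bytes in the single-byte code page and 10 bytes in the 2-byte form, which is why a caller receiving "bytes" needs to know the encoding before it can turn them back into "str".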