I got it! Sorry for the confusion, but I learned a lot, and I can share the result:
The CORRECT code *was* what I had assumed. The Python side has always been correct (no need to put "u" in front of strings; the bytes are known to be UTF-8 bytes). The culprit was my "run script" function, which reads in the file. THAT was what was "reinterpreting" the UTF-8 bytes as MacRoman (on both platforms).

Correct code below:

    SuperString ScPyObject::GetAs_String()
    {
        SuperString str;

        if (PyUnicode_Check(i_objP)) {
            // encode the unicode object to a UTF-8 byte string, then
            // fall through to the byte-string branch below
            ScPyObject utf8Str(PyUnicode_AsUTF8String(i_objP));

            str = utf8Str.GetAs_String();
        } else {
            // calling "uc" on this means "assume this is utf8"
            str.Set(uc(PyString_AsString(i_objP)));
        }

        return str;
    }

    PyObject* PyString_FromString(const SuperString& str)
    {
        return PyString_FromString(str.utf8Z());
    }
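
For anyone who hits the same symptom, here is a minimal sketch of the kind of misreading described above. It is not the actual "run script" code: it assumes Python 3 and its built-in mac_roman codec, and the sample string and helper names are made up for illustration.

```python
# Illustrative sketch only; SAMPLE and the helper names are hypothetical.
SAMPLE = "café"


def misread_as_macroman(raw_bytes):
    """Decode bytes with the wrong codec, as a buggy file loader might."""
    return raw_bytes.decode("mac_roman")


def read_as_utf8(raw_bytes):
    """Decode bytes with the codec they were actually written in."""
    return raw_bytes.decode("utf-8")


utf8_bytes = SAMPLE.encode("utf-8")        # b'caf\xc3\xa9'

garbled = misread_as_macroman(utf8_bytes)  # each UTF-8 byte becomes one MacRoman char
correct = read_as_utf8(utf8_bytes)

# The damage is reversible as long as the garbled text has not been altered:
# re-encode with the wrong codec to get the original bytes back, then decode
# them properly.
repaired = garbled.encode("mac_roman").decode("utf-8")

print(repr(garbled), repr(correct), repr(repaired))
```

The point is that the bytes themselves never change; only the codec used to interpret them does. That is why the Python side could be correct all along while the file loader produced garbage.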