New submission from paul rubin <[EMAIL PROTECTED]>:

For object serialization and some other purposes, Java encodes Unicode strings with a modified version of UTF-8:

http://en.wikipedia.org/wiki/UTF-8#Java
http://java.sun.com/javase/6/docs/api/java/io/DataInput.html#modified-utf-8

It is used in Lucene index files, among other places. It would be useful if Python had a codec for this, maybe called "UTF-8J" or something like that.

----------
components: Library (Lib)
messages: 66843
nosy: phr
severity: normal
status: open
title: add codec for java modified utf-8
versions: Python 2.5

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2857>
__________________________________
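
For illustration only, here is a rough sketch of what such a codec would have to do, assuming the two differences from standard UTF-8 described at the links above: U+0000 is written as the two bytes C0 80, and characters outside the BMP are written as a UTF-16 surrogate pair with each surrogate encoded as three bytes. The function names are hypothetical, nothing here is an existing Python codec, and it relies on the "surrogatepass" error handler of recent Python 3 releases rather than anything available in Python 2.5.

```python
def encode_modified_utf8(text: str) -> bytes:
    """Encode text the way Java's modified UTF-8 is described (sketch)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp == 0:
            # NUL becomes an overlong two-byte form, so the output has no 0x00.
            out += b"\xc0\x80"
        elif cp < 0x10000:
            # BMP characters use ordinary UTF-8; lone surrogates pass through.
            out += ch.encode("utf-8", "surrogatepass")
        else:
            # Split into a UTF-16 surrogate pair, three bytes per surrogate.
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            out += chr(high).encode("utf-8", "surrogatepass")
            out += chr(low).encode("utf-8", "surrogatepass")
    return bytes(out)


def decode_modified_utf8(data: bytes) -> str:
    """Decode modified UTF-8 bytes back to a str (sketch, minimal validation)."""
    # 0xC0 is never a continuation byte, so C0 80 can only be an encoded NUL.
    text = data.replace(b"\xc0\x80", b"\x00").decode("utf-8", "surrogatepass")
    # A round trip through UTF-16 merges surrogate pairs into single characters.
    return text.encode("utf-16-le", "surrogatepass").decode(
        "utf-16-le", "surrogatepass"
    )
```

A real codec would presumably be registered through the codecs machinery and would validate its input more strictly; this only shows the byte-level transformation.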