Gertjan Klein wrote:
> Laszlo Nagy wrote:
>
>> However, there are malformed emails and I have to put them into the
>> database. What should I do with this:
>>
> [...]
>
>> There is no encoding given in the subject but it contains 0x92. When I
>> try to insert this into the database, I get:
>>
>
> This is indeed a malformed email. The content type in the header specifies
> iso-8859-1, but this looks like Windows code page 1252, where byte
> \x92 is the right single quotation mark (Unicode U+2019).
>
> As the majority of the mail clients out there are Windows-based, and as
> far as I can tell many of them get the encoding wrong, I'd simply try to
> decode as CP1252 on error, especially if the content-type claims
> iso-8859-1. Many Windows mail clients consider iso-8859-1 equivalent to
> 1252 (it's not; the former has only control characters in the range
> \x80-\x9f, the latter puts printable characters there.)
>
Thank you very much!
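
For the archives, the fallback suggested above could be sketched roughly like this in Python 3. The helper name decode_lenient and the final iso-8859-1 fallback are assumptions for illustration, not something from the thread. Note that Python's iso-8859-1 codec never raises (it maps \x80-\x9f to control characters), so when iso-8859-1 is declared the sketch tries cp1252 first instead of waiting for an error.

```python
import codecs

# Canonical codec name, so aliases such as "latin-1" compare equal.
LATIN1 = codecs.lookup("iso-8859-1").name


def decode_lenient(raw, declared=None):
    """Decode raw header bytes, falling back to cp1252 (hypothetical helper)."""
    try:
        charset = codecs.lookup(declared or "ascii").name
    except LookupError:
        charset = "ascii"  # unknown charset label: treat as undeclared

    # Trust the declared charset unless it is iso-8859-1, which is the
    # label most often applied to data that is really cp1252.
    if charset != LATIN1:
        try:
            return raw.decode(charset)
        except (UnicodeDecodeError, LookupError):
            pass

    # Fallback: assume Windows code page 1252.
    try:
        return raw.decode("cp1252")
    except UnicodeDecodeError:
        # A few bytes (e.g. \x81) are undefined in Python's cp1252 codec;
        # iso-8859-1 accepts every byte, so this last step cannot fail.
        return raw.decode("iso-8859-1")


subject = decode_lenient(b"It\x92s here", "iso-8859-1")  # 'It\u2019s here'
```

Correctly labelled mail is not affected, since the declared charset is tried first for everything except iso-8859-1.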
-- http://mail.python.org/mailman/listinfo/python-list