Andreas Jung <[EMAIL PROTECTED]> wrote:
> Does anyone know of a Python module that is able to sniff the encoding of
> text? Please: I know that there is no reliable way to do this but I need
> something that works for most cases... so please no discussion about
> the sense of such a module and approach.

It depends on what exactly you need. One approach is pyenca; the other is:

    def try_encodings(s, encodings):
        "try to guess the encoding of string s, testing encodings given in second parameter"
        for enc in encodings:
            try:
                # decoding succeeds only if s is valid in this encoding
                unicode(s, enc)
                return enc
            except UnicodeDecodeError:
                pass
        return None

    print try_encodings(text, ['ascii', 'utf-8', 'iso8859_1', 'cp1252', 'macroman'])

Depending on what language and encodings you expect the text to be in, the first or the second approach is better.

-- 
 -----------------------------------------------------------
| Radovan Garabík http://kassiopeia.juls.savba.sk/~garabik/ |
| __..--^^^--..__    garabik @ kassiopeia.juls.savba.sk     |
 -----------------------------------------------------------
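
A note on the order of candidates in the trial-decoding loop above, with a rough Python 3 sketch. This is an adaptation, not the poster's code: the name guess_encoding and the sample byte strings are made up for illustration, and the function assumes its input is a bytes object. The point it shows is that iso8859_1 maps every byte value to a character, so it never raises UnicodeDecodeError and any encoding listed after it is never tried.

```python
def guess_encoding(data, encodings):
    """Return the first name in `encodings` that decodes `data` cleanly, else None.

    Hypothetical helper: `data` is assumed to be bytes.
    """
    for enc in encodings:
        try:
            data.decode(enc)
        except UnicodeDecodeError:
            continue  # not valid in this encoding, try the next one
        return enc
    return None


# Strict encodings first; the catch-all iso8859_1 goes last.
candidates = ["ascii", "utf-8", "cp1252", "iso8859_1"]

print(guess_encoding(b"plain text", candidates))                      # ascii
print(guess_encoding("Garab\u00edk".encode("utf-8"), candidates))     # utf-8
print(guess_encoding("Garab\u00edk".encode("iso8859_1"), candidates)) # cp1252
print(guess_encoding(b"\x81", candidates))                            # iso8859_1
```

The third call shows the limit of the approach: the bytes were produced as iso8859_1, but cp1252 also accepts them and comes first in the list. A successful decode only says the bytes are valid in that encoding, not that the guess is right.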