On 2009-03-13, Johannes Bauer <dfnsonfsdu...@gmx.de> wrote:
> Peter Otten wrote:
>
>> encoding = sys.stdout.encoding or "ascii"
>> for row in rows:
>>     id, address = row[:2]
>>     print id, address.encode(encoding, "replace")
>>
>> Example:
>>
>> >>> u"ähnlich lölich üblich".encode("ascii", "replace")
>> '?hnlich l?lich ?blich'
>
> A very good tip, Peter - I've also had this problem before and didn't
> know about your solution.

If you know beforehand that you will be using ASCII, you can eliminate the
accents, so that you get the unaccented letter (followed by a question
mark if you prefer) instead of just a question mark. NFKD normalisation
splits each accented letter into its base letter plus a combining mark,
and the combining marks can then be filtered out:

>>> from unicodedata import normalize, combining
>>> example = u"ähnlich lölich üblich"
>>> normalised = normalize('NFKD', example)
>>> normalised.encode("ascii", "replace")
'a?hnlich lo?lich u?blich'
>>> eliminated = u''.join(l for l in normalised if not combining(l))
>>> eliminated.encode("ascii", "replace")
'ahnlich lolich ublich'

--
Antoon Pardon
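
P.S. The steps above could be wrapped in a small helper along the following lines. This is only a sketch: the names strip_accents and to_ascii are made up for illustration and are not library functions, and it is written for Python 3, where encode() returns a bytes object rather than a str.

```python
from unicodedata import normalize, combining


def strip_accents(text):
    """Return text with combining marks removed after NFKD decomposition.

    strip_accents is a hypothetical helper name, not part of unicodedata.
    """
    # NFKD splits e.g. u"\xe4" into "a" followed by U+0308 (combining diaeresis).
    decomposed = normalize("NFKD", text)
    # Keep only characters that are not combining marks.
    return "".join(ch for ch in decomposed if not combining(ch))


def to_ascii(text):
    """Strip accents, then encode to ASCII; leftovers become '?'."""
    return strip_accents(text).encode("ascii", "replace")


print(to_ascii("ähnlich lölich üblich"))  # b'ahnlich lolich ublich'
print(to_ascii("Straße"))                 # b'Stra?e'
```

Note that letters without a decomposition, such as ß or ø, are not touched by the normalisation step and so still end up as a question mark.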