[EMAIL PROTECTED] wrote:
> The trick is finding the right XXXX. Has someone attempted this
> before, or am I stuck writing my own solution?

In this specific example, there is a different approach, using the
Unicode character database:

    def strip_combining(s):
        import unicodedata
        # Expand pre-combined characters into base+combinator
        s1 = unicodedata.normalize("NFD", s)
        r = []
        for c in s1:
            # add all non-combining characters
            if not unicodedata.combining(c):
                r.append(c)
        return u"".join(r)

    py> a.strip_combining(u'B\xe9la Fleck')
    u'Bela Fleck'

As the accented characters get decomposed into base character plus
combining accent, this strips off all accents in the string.

Of course, it is still fairly limited. If you have non-Latin scripts
(Greek, Cyrillic, Arabic, Kanji, ...), this approach fails, and you
would need a transliteration database for them. There is none built
into Python, and I couldn't find a transliteration database that
transliterates all Unicode characters into ASCII, either.

Regards,
Martin

--
http://mail.python.org/mailman/listinfo/python-list
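
To make the limitation concrete, here is a minimal sketch of the same idea written for Python 3 (an assumption; the session above uses Python 2 string literals). The helper is_ascii and the sample strings are only illustrative and not part of the original post. It shows that a Latin string comes out as plain ASCII, while a Cyrillic string passes through unchanged, because its letters have no decomposition into an ASCII base character.

```python
import unicodedata

def strip_combining(s):
    # Decompose pre-combined characters into base + combining marks (NFD),
    # then keep only the characters whose combining class is 0.
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def is_ascii(s):
    # Hypothetical helper for this example: True if every code point is < 128.
    return all(ord(c) < 128 for c in s)

# Latin text: the accent is removed and the result is pure ASCII.
latin = strip_combining("B\xe9la Fleck")

# Cyrillic text (the word "Moskva" in Cyrillic letters): nothing to strip,
# so the string is returned as-is and is still not ASCII.
cyrillic_in = "\u041c\u043e\u0441\u043a\u0432\u0430"
cyrillic = strip_combining(cyrillic_in)

print(repr(latin), is_ascii(latin))
print(repr(cyrillic), is_ascii(cyrillic))
```

For scripts like the second case you would still need a separate transliteration table on top of this.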