[Luis]
> The script converted the ÇÃ from the first line, but not the º from
> the second one.

That's because º (0xba, MASCULINE ORDINAL INDICATOR) is classed as a letter and not a diacritic:

  http://www.fileformat.info/info/unicode/char/00ba/index.htm

You can't encode it in ascii because it's not an ascii character, and the script doesn't remove it because it only removes diacritics.

I don't know what the best thing to do with it would be. Could you use latin-1 as your base encoding and leave it in there? I don't speak any language that uses it, but I'd guess that anyone searching for e.g. 5º (forgive me if I have the gender wrong 8-) would actually type 5º. Are there any Italian/Spanish/Portuguese speakers here who can confirm or deny that?

In the general case, you have to decide what happens to characters that aren't diacritics and don't live in your base encoding: what happens when a Chinese user searches for a Chinese character? Probably you should just encode(base_encoding, 'ignore').

-- 
Richie Hindle
[EMAIL PROTECTED]
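
For illustration, here is a minimal sketch of the behaviour described above. It assumes Python 3 and the standard unicodedata module; the original script isn't shown in this thread, so strip_diacritics and its base_encoding parameter are hypothetical names, not the script's actual code. It decomposes the text, drops the combining marks, and then encodes with 'ignore' so that anything still outside the base encoding is dropped.

```python
import unicodedata


def strip_diacritics(text, base_encoding="ascii"):
    """Remove diacritics, then drop whatever the base encoding can't hold.

    Hypothetical helper for illustration; not the script from this thread.
    """
    # NFD splits e.g. "Ç" into "C" plus a combining cedilla.
    decomposed = unicodedata.normalize("NFD", text)
    # Combining marks (the diacritics) have a non-zero combining class.
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Characters that are neither diacritics nor in the base encoding
    # are silently discarded by the 'ignore' error handler.
    return stripped.encode(base_encoding, "ignore")


if __name__ == "__main__":
    print(unicodedata.category("º"))        # a letter category, not a mark
    print(strip_diacritics("ÇÃ"))           # diacritics removed
    print(strip_diacritics("5º"))           # º dropped with ascii
    print(strip_diacritics("5º", "latin-1"))  # º kept with latin-1
```

With ascii as the base encoding the º simply disappears, whereas with latin-1 it survives as the byte 0xba, which is the trade-off raised above.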