Hi Alain,

On 2007-03-25 20:21:35 +0200, Alain Bench wrote:
> Hello Vincent,
>
> On Sunday, March 25, 2007 at 4:28:36 +0200, Vincent Lefèvre wrote:
>
> > On 2007-03-24 16:05:08 +0100, Alain Bench wrote:
> >> Setting UTF-8 after ISO-8859-1 is useless. Any string is always
> >> valid Latin-1.
> > Shouldn't characters 128-159 be regarded as invalid?
>
> No, I don't think so, for a number of reasons:
>
> - Mutt doesn't decide valid/invalid; it asks iconv, which replies
>   that Latin-1 128-159 are valid and convertible.
>
> - 128-159 are (part of) printable characters in some charsets.

Yes, I meant in ISO-8859-1, as being *non-printable* characters.

> - If avoidable, we prefer to not hardcode special cases in Mutt.

I agree, but testing printability should be sufficient.

> - We would not get a clean benefit anyway: Many UTF-8 strings would
>   still be wrongly detected as Latin-1. Not all, but many.
>
> - To properly distinguish UTF-8 from a 256 chars charset (like
>   Latin-1), we really need to set UTF-8 first. In this order,
>   invalidating 128-159 buys us nothing.

Concerning these two points, I was thinking about files that contain
both ISO-8859-1 and UTF-8, to let the user decide. Note that this is
not necessarily an error. For instance, it can happen in diffs where
some files are encoded in ISO-8859-1 and others in UTF-8.

-- 
Vincent Lefèvre <[EMAIL PROTECTED]> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)
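
For illustration, the ordering argument discussed above can be sketched in
Python. This is only a sketch under stated assumptions: Python's built-in
codecs stand in for iconv, and the helper names first_valid_charset and
has_c1_controls are hypothetical, not anything that exists in Mutt.

```python
from typing import Optional, Sequence


def first_valid_charset(data: bytes, candidates: Sequence[str]) -> Optional[str]:
    """Return the first candidate charset that decodes data without error.

    Hypothetical helper: it mimics trying a list of charsets in order,
    with Python's codecs standing in for iconv.
    """
    for charset in candidates:
        try:
            data.decode(charset)
        except UnicodeDecodeError:
            continue
        return charset
    return None


def has_c1_controls(data: bytes) -> bool:
    """True if any byte is in 0x80-0x9F (decimal 128-159), which
    ISO-8859-1 maps to non-printable C1 control characters."""
    return any(0x80 <= b <= 0x9F for b in data)


if __name__ == "__main__":
    utf8_bytes = "Lefèvre".encode("utf-8")         # b'Lef\xc3\xa8vre'
    latin1_bytes = "Lefèvre".encode("iso-8859-1")  # b'Lef\xe8vre'

    # Latin-1 first: every byte string decodes, so UTF-8 is never tried.
    print(first_valid_charset(utf8_bytes, ["iso-8859-1", "utf-8"]))    # iso-8859-1
    # UTF-8 first: UTF-8 input is recognised, Latin-1 input falls through.
    print(first_valid_charset(utf8_bytes, ["utf-8", "iso-8859-1"]))    # utf-8
    print(first_valid_charset(latin1_bytes, ["utf-8", "iso-8859-1"]))  # iso-8859-1

    # A 128-159 test catches only some UTF-8 text misread as Latin-1.
    print(has_c1_controls("É".encode("utf-8")))  # True  (bytes C3 89)
    print(has_c1_controls("è".encode("utf-8")))  # False (bytes C3 A8)
```

The last two lines show why rejecting 128-159 alone gives no clean
benefit: the UTF-8 encoding of "è" contains no byte in that range, so it
would still pass as Latin-1, whereas "É" would be caught.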
