On Sun, 23 May 2004 10:26:19 -0400 Sam Varshavchik <[EMAIL PROTECTED]> wrote:
> All right, I think this is good enough for a first attempt. I still made
> the changes that I like: mark UTF-8 and UTF-7 as preferring
> quoted-printable, and choose between quoted printable and base64 header
> encoding based on how many characters need encoding.
>
> I think this is a better approach.

RFC 2047 says that the "Q" encoding (its quoted-printable variant for
headers) "is designed to allow text containing mostly ASCII characters to
be decipherable on an ASCII terminal without decoding". In general, UTF-8
text is not "mostly ASCII", so I think base64 is preferable.

UTF-7 text is already 7-bit throughout. Body text in UTF-7 does not need
to be encoded at all; if it is encoded, then for the same reason as UTF-8
I think base64 is preferable (for headers as well).

Not all MUAs can handle both base64 and quoted-printable, and MIME itself
does not prefer or recommend either method over the other. MUAs from
Latin-script cultures tend to prefer quoted-printable and MUAs from
non-Latin-script cultures tend to prefer base64, because the charsets
commonly used by the former usually recommend quoted-printable and those
used by the latter usually recommend base64.

  + Some MUAs made for Japanese users recognize only base64 (and even
    only 7bit bodies), since base64 (and 7bit bodies) is mostly enough
    to implement Japanese-only text processing.

  + For the same reason, I worry that some Latin-based MUAs can handle
    only quoted-printable text parts.

I think the best practice is to determine the encoding method by fixed
flags (the method recommended for each charset), not by something like
the ratio of non-ASCII characters (except when the text contains only
ASCII characters; then it is 7bit). I also suggest that text in an
unknown charset be encoded with base64, since there is no way to know
whether text in that charset contains enough ASCII characters.

--- nezumi
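
To make the proposal above concrete, here is a minimal Python sketch of the "fixed flags per charset" rule. The table name `CHARSET_FLAGS`, its entries, and the function `choose_encodings` are all hypothetical and chosen only for illustration; they are not taken from the patch under discussion or from any existing library.

```python
from typing import Dict, Tuple

# Hypothetical table: charset -> (header encoding, body encoding).
# The entries are illustrative assumptions following the reasoning above,
# not values mandated by MIME or copied from any implementation.
CHARSET_FLAGS: Dict[str, Tuple[str, str]] = {
    "iso-8859-1": ("quoted-printable", "quoted-printable"),
    "iso-8859-15": ("quoted-printable", "quoted-printable"),
    "iso-2022-jp": ("base64", "7bit"),   # 7-bit charset: body needs no encoding
    "utf-7": ("base64", "7bit"),         # already 7-bit throughout
    "utf-8": ("base64", "base64"),
}

# Assumed fallback for charsets missing from the table.
UNKNOWN_CHARSET = ("base64", "base64")


def choose_encodings(text: str, charset: str) -> Tuple[str, str]:
    """Return (header_encoding, body_encoding) for text in the given charset.

    The choice depends only on the charset, never on the ratio of
    non-ASCII characters, except that pure-ASCII text is always 7bit.
    """
    if text.isascii():
        return ("7bit", "7bit")
    return CHARSET_FLAGS.get(charset.lower(), UNKNOWN_CHARSET)
```

For example, `choose_encodings("日本語", "ISO-2022-JP")` selects base64 for headers and a 7bit body, while any text in a charset missing from the table falls back to base64 for both.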