I sent this message on TBTECH first; however, it seems that almost
nobody reads the tech list, so I decided to resend it here. I
recently submitted the problem described in this message as a bug
(https://www.ritlabs.com/bt/bug_view_advanced_page.php?bug_id=0002349);
I decided to report it on the list as well, though, since it would be
interesting to hear what other people think. I hope it is not too
technical for TBUDL :).

If you choose an 8-bit encoding for your outgoing messages, but the
message does not actually contain any symbols with decimal values
higher than 127, then TB! simply marks it as "Content-Type:
text/plain; charset=us-ascii / Content-Transfer-Encoding: 7bit" when
queuing that message in the Outbox.
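
To make the behaviour described above concrete, here is a minimal
sketch in Python. It is not TB!'s actual code; the function name
headers_as_observed is made up for illustration, and the 8-bit branch
is an assumption about what happens when the body does contain
non-ASCII characters.

```python
def headers_as_observed(body: str, chosen_charset: str) -> dict:
    """Hypothetical model of the header choice described in this message."""
    raw = body.encode(chosen_charset)
    if all(octet <= 127 for octet in raw):
        # No octet above 127: the user's charset choice is silently dropped.
        return {
            "Content-Type": "text/plain; charset=us-ascii",
            "Content-Transfer-Encoding": "7bit",
        }
    # Assumed branch: the chosen charset is kept for non-ASCII bodies.
    return {
        "Content-Type": f"text/plain; charset={chosen_charset}",
        "Content-Transfer-Encoding": "8bit",
    }


print(headers_as_observed("Plain English only.", "koi8-r"))
```

With an English-only body the sketch reports us-ascii / 7bit even
though koi8-r was requested, which is exactly the complaint.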

Some people think this is the correct behaviour, and they refer to
RFC2045 et al. Somebody even reported a bug,
https://www.ritlabs.com/bt/bug_view_advanced_page.php?bug_id=0002343 -
"Possibility to leave definition of 8 bit charset in case of message
with 7-bit only without resetting to "us-ascii"". I still consider it
definitely a logical mistake, and a serious one: the RFCs say octets
with decimal values above 127 are not allowed in 7-bit data, but no
RFC says that values 0-127 must never be sent as 8-bit. Characters
themselves are not intrinsically "7-bit" or "8-bit".
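
The point that characters are not intrinsically "7-bit" or "8-bit" can
be checked with a short Python sketch. The sample strings and the
helper name fits_in_7bit are only illustrative assumptions; the codecs
used are the standard Python ones.

```python
def fits_in_7bit(data: bytes) -> bool:
    """True if no octet in the data has a value above 127."""
    return all(octet <= 127 for octet in data)


english = "Hello"
russian = "Привет"

# English text yields the same octets under all three charsets.
same_octets = (
    english.encode("ascii")
    == english.encode("koi8_r")
    == english.encode("utf-8")
)

english_is_7bit = fits_in_7bit(english.encode("koi8_r"))
russian_is_7bit = fits_in_7bit(russian.encode("koi8_r"))

print(same_octets, english_is_7bit, russian_is_7bit)
```

So labelling English text as koi8-r describes the same octets
correctly; only the Russian text actually needs octets above 127.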

As a result of this behaviour combined with some other MUAs' (e.g.
Microsoft-made ones') improper behaviour, there is the following
problem reported by various people. Suppose I send a message to my
Ukrainian friend in Canada, and he replies in Russian. I know his MUA
would try and put in the headers of his reply the same encoding as my
message had. To save him time on checking, I would indicate I want
_every message of mine_ (even if it's plain English only!) to be
"Content-Type: text/plain; charset=koi8-r / Content-Transfer-Encoding:
8bit", _which is perfectly legal_ in my view, as explained above. I
compose my message indicating "KOI8-R" as the charset to be used,
but... looking in the Outbox, I see "Content-Type: text/plain;
charset=us-ascii / Content-Transfer-Encoding: 7bit" there!
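
As a sketch of the behaviour asked for here, this is how the same
message can be built with Python's standard email package while
keeping the declared charset. It only shows that such headers can be
produced for an English-only body; it says nothing about how TB!
works internally, and the helper name build_message is made up.

```python
from email.message import EmailMessage


def build_message(body: str, charset: str = "koi8-r") -> EmailMessage:
    """Build a text/plain message that keeps the requested charset."""
    msg = EmailMessage()
    # Passing cte explicitly skips the library's own heuristic choice
    # of transfer encoding for the body.
    msg.set_content(body, charset=charset, cte="8bit")
    return msg


msg = build_message("Plain English only.")
print(msg["Content-Type"])
print(msg["Content-Transfer-Encoding"])
```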

Hope you get my point. www.livejournal.com uses UTF-8 for all its
web pages, even though I do not use any Chinese or other multi-byte
characters in my blog there. I consider this to be a good example:
characters, be they English, Ukrainian, Chinese, or whatever, are not
"7bit", "8bit", "multi-byte", etc. They _can_ be _encoded_ in various
ways; and except for those cases where it is plainly impossible to
encode them in a specific way (as it is impossible to encode Russian
as 7bit), standards do not prohibit us from using anything. So, since
TB! is a good, standards-compliant mail client which calls itself a
"mail servant" :), I would like it to respect my will, _or_ at least
to produce a warning when it changes (again, without a valid,
standards-based reason!) what I've set as my default charset.

Would you agree with that?..

My apologies for this letter being rather long; at least I hope it is
not completely boring for everybody :).

Maksym.

-- 
Maksym Kozub, MK881-UANIC    mailto: [EMAIL PROTECTED]



________________________________________________
Current version is 2.02.3 CE | "Using TBUDL" information:
http://www.silverstones.com/thebat/TBUDLInfo.html
