Re[4]: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII
Hello Maksym,

Tuesday, January 27, 2004, 2:00:24 PM, you wrote:

MK> First things first, in this case you cannot (and should not) call your
MK> character set "Default character set".

Yes, we can - it's a shortened form of "Default character set for non-7-bit ASCII characters", but the latter is way too long and much more confusing (how many people know what ASCII is and what 7-bit characters are anyway? :-)

In fact, as far as I can recall, you are the only one complaining about the current behaviour, and you are complaining about it only because your correspondent's mail client is not functioning properly. Wouldn't it be a better solution to find a simple workaround (like adding an 8-bit character to your template, such as your name in Cyrillic or a non-breaking space, Alt+0160)? :-)

MK> Also, as you see from the quoted discussion at nobat.ru, the whole
MK> issue shows some massive misunderstanding of (I would even say
MK> "illiteracy in reading of") RFCs: people keep saying you've
MK> implemented it this way because of RFC compliance

Some people do believe that the current behaviour is more RFC-compliant, but my opinion (and experience) is that the current behaviour is more logical, safe and convenient.

--
Cheers!
Stefan

Current version is 2.02.3 CE | "Using TBUDL" information: http://www.silverstones.com/thebat/TBUDLInfo.html
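The workaround suggested above (put one 8-bit character, such as a non-breaking space or a Cyrillic letter, into the template) can be illustrated with a short Python sketch. The helper name `needs_8bit_charset` is made up for this illustration and has nothing to do with The Bat! internals; it only shows that a single such character is enough to make a message no longer pure US-ASCII.

```python
def needs_8bit_charset(text: str) -> bool:
    """Return True if the text has at least one character outside US-ASCII."""
    return not text.isascii()


plain = "Hello, this is an English-only message."
with_nbsp = plain + "\u00a0"        # non-breaking space (Alt+0160)
with_cyrillic = plain + " \u0430"   # Cyrillic small letter a

for sample in (plain, with_nbsp, with_cyrillic):
    encoded = sample.encode("koi8-r")
    # First value: does the text leave US-ASCII?
    # Second value: does the KOI8-R byte stream contain an octet above 127?
    print(needs_8bit_charset(sample), max(encoded) > 127)
```

Under this sketch, only the two samples with the extra character would keep a KOI8-R label in a client that behaves the way the thread describes.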
Re[2]: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII
Hello Thomas, Monday, January 26, 2004, 8:01:14 PM, you wrote: >> The behaviour you propose cause more problems because some systems >> (especially those functioning in the US and Canada) do not know anything >> about character sets other than us-ascii (and ISO, if one is lucky) >> and that caused problem with message processing and the recipient may >> not get the message. TF> Hm. If he does use a high ASCII character, the encoding will not be TF> changed by TB. Well, here is the problem - I use a Russian character set by default. When I send messages in English, I'm always sure that they're going out in us-ascii, no need to change anything here. Now, imagine I'm sending a message in English to some of those servers from the above - in best case, I'll get my message back saying server could not process it because an unknown character set. But I used only English! :-) Do you see my point? Another point - some MUAs may/will choose different fonts for viewing messages in a non-Western character set because of font mangling... TF> What you do is change the encoding despite the sender's explicit TF> wish. Not exactly. I wish my messages I write in Russian to go out in the character set I choose. But if I write messages in English, I don't want to mess with switching my encoding - all I want is just to write a message and be sure the recipient will be able to read it without any problems :-) -- Cheers! Stefan pgp0.pgp Description: PGP signature Current version is 2.02.3 CE | "Using TBUDL" information: http://www.silverstones.com/thebat/TBUDLInfo.html
Re[3]: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII
Hello Stefan, I agree to what Thomas said in his message. ST> Well, here is the problem - I use a Russian character set by default. ST> When I send messages in English, I'm always sure that they're going ST> out in us-ascii, no need to change anything here. Now, imagine I'm ST> sending a message in English to some of those servers from the ST> above - in best case, I'll get my message back saying server could not ST> process it because an unknown character set. But I used only English! ST> :-) Do you see my point? ...skipped TF>> What you do is change the encoding despite the sender's explicit TF>> wish. ST> Not exactly. I wish my messages I write in Russian to go out in the ST> character set I choose. But if I write messages in English, I don't ST> want to mess with switching my encoding - all I want is just to write ST> a message and be sure the recipient will be able to read it without ST> any problems :-) First things first, in this case you cannot (and should not) call your character set "Default character set". Then it's something like your "Default character set, but only for everything Russian/Not-low-ASCII/...". Then please, please, please, - say that in The Bat! options. I will be still unhappy with such behaviour, but it would be at least honest, and everybody will know: "Whatever I choose in that dropdown list is _not_ my default charset". Oh, and that won't be completely honest yet: we've been missing another part of my initial message here. When I say "Options - Message encoding - Cyrillic (KOI8-R)" (or, for that matter, whatever non-US-ASCII) _while writing this very message_, TB! would still change it to US-ASCII when queuing the message to Outbox, or sending it immediately. Is _that_ logical? I think in this case I have expressly told The Bat! that I _do_ want to "mess with switching my encoding", haven't I?.. What do you think of _that_? 
Also, as you see from the quoted discussion at nobat.ru, the whole issue shows some massive misunderstanding of (I would even say "illiteracy in reading of") RFCs: people keep saying you've implemented it this way because of RFC compliance, while from what you said here it follows that it is merely your idea of user convenience, etc.

Regards,
Maksym.

--
Maksym Kozub, MK881-UANIC
mailto: [EMAIL PROTECTED]
Re: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII
Hello Stefan,

On Mon, 26 Jan 2004 09:30:00 +0200 GMT (26/01/2004, 14:30 +0700 GMT), Stefan Tanurkov wrote:

> The behaviour you propose cause more problems because some systems
> (especially those functioning in the US and Canada) do not know anything
> about character sets other than us-ascii (and ISO, if one is lucky)
> and that caused problem with message processing and the recipient may
> not get the message.

Hm. If he does use a high ASCII character, the encoding will not be changed by TB.

> In fact, at the early age TB! was functioning "your" way (i.e. it was
> setting character set even though there were no characters from that
> CS in the message at all), but we changed it to the current behaviour
> after cases caused by systems described above...

Oh, but if the recipient cannot receive messages encoded with 8-bit characters, how would he write in Cyrillic and get replies? And why would the sender explicitly choose 8-bit? If the recipient cannot receive 8-bit, that is a reason not to send him messages thus encoded; if he can, what point is there in changing it? I mean, it is between sender and recipient.

> So, basically, there is no problem on our side to assign non-ASCII
> character set for messages with only 7-bits characters, but this may
> be troublesome - this is why we don't want to do that :-)

What you do is change the encoding despite the sender's explicit wish.

--
Cheers,
Thomas.

Moderator der deutschen The Bat! Beginner Liste.

Apple - Typically a device to seduce men, usually equipped with a display screen.

Message reply created with The Bat! 2.03.47 under Chinese Windows 98 4.10 Build A using a Pentium P4 1.7 GHz, 256MB RAM
Re: (not LONGISH any more) Re[2]: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII
Hello Maksym, On Sun, 25 Jan 2004 22:09:25 +0200 GMT (26/01/2004, 03:09 +0700 GMT), Maksym Kozub wrote: > Hope you get my point. Yes. And I admit to still not having read the RFC. > Any high ASCII letter can never be 7bit data, - that's right, and > that's what RFC2045 says. What it does _not_ say is that low ASCII > (like the Latin letter "t") is intrinsically bound to be represented > as US-ASCII, and _not_ as KOI8-R or even UTF-8, for that matter. I agree with you. But see Stefan's message. -- Cheers, Thomas. Moderator der deutschen The Bat! Beginner Liste. A chicken crossing the road is poultry in motion. Message reply created with The Bat! 2.03.47 under Chinese Windows 98 4.10 Build A using a Pentium P4 1.7 GHz, 256MB RAM Current version is 2.02.3 CE | "Using TBUDL" information: http://www.silverstones.com/thebat/TBUDLInfo.html
Re: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII
On Monday, January 26, 2004 at 2:30:00 AM, Stefan Tanurkov wrote in the message "Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII" <mid:[EMAIL PROTECTED]>:

> So, basically, there is no problem on our side to assign non-ASCII
> character set for messages with only 7-bits characters, but this may
> be troublesome - this is why we don't want to do that :-)

Would that the day when we all use Unicode come faster!

--
Chris

Quoting when replying to this message is good for your karma.

At a Towing Company: "We don't charge an arm and a leg. We want tows."

Using The Bat! v2.02.3 CE on Windows XP 5.1 Build 2600 Service Pack 1
Re: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII
Hello Maksym,

Sunday, January 25, 2004, 5:07:51 PM, you wrote:

MK> As a result of this behaviour combined with some other MUAs' (e.g.
MK> Microsoft-made ones') improper behaviour, there is the following
MK> problem reported by various people.

The behaviour you propose causes more problems, because some systems (especially those operating in the US and Canada) do not know anything about character sets other than us-ascii (and ISO, if one is lucky). That has caused problems with message processing, and the recipient may not get the message at all.

In fact, in its early days TB! worked "your" way (i.e. it set the character set even though there were no characters from that CS in the message at all), but we changed it to the current behaviour after cases caused by the systems described above...

So, basically, it is no problem on our side to assign a non-ASCII character set to messages with only 7-bit characters, but this may be troublesome - this is why we don't want to do that :-)

--
Cheers!
Stefan
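The two policies argued over in this thread can be put side by side in a few lines of Python. This is only a sketch of the behaviour as described in the messages, not The Bat!'s actual code; the function name and the `keep_preferred` switch are hypothetical (the switch stands in for the option requested in bug 2343).

```python
def effective_charset(body: str, preferred: str, keep_preferred: bool = False) -> str:
    """Pick the charset label for the Content-Type header.

    keep_preferred is a hypothetical switch, not a real The Bat! setting.
    """
    if keep_preferred:
        # Policy requested in the bug report: always honour the user's choice.
        return preferred
    # Policy described by Stefan: ASCII-only bodies are labelled us-ascii,
    # anything else keeps the user's chosen charset.
    return "us-ascii" if body.isascii() else preferred


english = "Just a plain English message."
russian = "\u041f\u0440\u0438\u0432\u0435\u0442"  # a Russian greeting

print(effective_charset(english, "koi8-r"))
print(effective_charset(russian, "koi8-r"))
print(effective_charset(english, "koi8-r", keep_preferred=True))
```

The disagreement in the thread is only about the first call: whether an ASCII-only body sent by a user who chose KOI8-R should go out labelled us-ascii or koi8-r.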
Re[2]: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII
Hello Carsten,

On Mon, Jan 26 2004, 0:52:14 you wrote:

>> That sounds wrong.

CT> Huh? This is the way good mail clients work. It is not wrong at all.

It _is_ wrong. See my other messages in this thread, where I clearly demonstrate that 1) there is actually _nothing_ in RFC2045 that prevents a MUA from encoding low ASCII characters in whatever encoding, or that forces it to use US-ASCII, and 2) a MUA's failure to honour the chosen encoding results in various inconveniences for some users, and those inconveniences are actually _not_ forced by RFC compliance. That's why I describe this as "maybe wrong understanding of RFCs" on the part of MUA authors (and not only those of The Bat!, it seems).

Regards,
Maksym.

--
Maksym Kozub, MK881-UANIC
mailto: [EMAIL PROTECTED]
Re: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII
* Thomas Fernandez writes:
> Maksym Kozub wrote:

> Let me understand this. You explicitly tell TB to use:
>> "Content-Type: text/plain; charset=koi8-r /
>> Content-Transfer-Encoding: 8bit"
> but TB changes it to
>> "Content-Type: text/plain; charset=us-ascii
>> Content-Transfer-Encoding: 7bit"
> just because it doesn't detect a high ASCII character?
> That sounds wrong.

Huh? This is the way good mail clients work. It is not wrong at all.

Carsten
--
Re[2]: (not LONGISH any more) Re[2]: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII
Hello Thomas,

On Sun, 25 Jan 2004 20:41:42 you wrote:

TF> OK, let's read on:

>> Alexandr Kiselev, administrator, Dec 02, 2003, 07:28:24 pm:
>>
>> "If all characters in a message are us-ascii, then Bat has been always
>> putting us-ascii in message headers, irrespective of the default
>> encoding. This is by the way in complete accordance with the letter
>> and spirit of RFCs.

TF> The others confirm that's so in the RFCs.

In fact, that's _not_ what the RFCs say. See below.

>> I would recommend to replace one of Latin "a"'s with a Russian "а" -
>> this would be sufficient to cope with your problem.

TF> Here you have a work-around that should solve your problem for the
TF> time being.

TF> But having read the discussion you kindly translated, I don't consider
TF> it a bug in TB. Because TB behaves RFC-conform (if what was said in
TF> the thread from the forum is true, I didn't check it).

That workaround is needed _only_ because The Bat! misinterprets the RFCs (see below). That's an important part of my whole point.

TF> The work-around is therefore for an RFC, which I - as you - think
TF> should be altered. The correct way is to write to the author of the
TF> RFC rather than asking Ritlabs to violate it. I believe they have a
TF> right to be proud of their RFC-compliance. If an RFC doesn't make
TF> sense, it ought to be changed rather than ignored. IMHO.

The whole matter is this: I _don't_ think that the RFC should be altered, but I would like to take the liberty of saying that The Bat!, and the discussion participants, show a wrong understanding of what _is_ said in that RFC.

Let's have one more look at RFC2045 now. What does it say? It says:

"2.7. 7bit Data

"7bit data" refers to data that is all represented as relatively short lines with 998 octets or less between CRLF line separation sequences [RFC-821]. No octets with decimal values greater than 127 are allowed and neither are NULs (octets with decimal value 0). CR (decimal value 13) and LF (decimal value 10) octets only occur as part of CRLF line separation sequences.

2.8. 8bit Data

"8bit data" refers to data that is all represented as relatively short lines with 998 octets or less between CRLF line separation sequences [RFC-821]), but octets with decimal values greater than 127 may be used. As with "7bit data" CR and LF octets only occur as part of CRLF line separation sequences and no NULs are allowed."

To keep it short: "7bit data should _never_ ever contain values above 127. 8bit data _may_ contain values above 127."

Does it say "8bit data _should always_ contain values above 127"? _No_.

Does it say "Whatever does not contain values above 127 _is always_ 7bit data"? _No_.

When I type the Latin letter "t", am I typing "7bit data"? _Not necessarily_. It may be represented as 7-bit, 8-bit, UTF-7, UTF-8... Of course, if there is a Russian character in the same message, then the message cannot be encoded as US-ASCII anymore - see RFC2045 above. However, if there is nothing but low ASCII in that message, - please show me why, based on the definitions from RFC2045 quoted above, it cannot be encoded as 8bit KOI8-R, or Win-1252, or UTF-8...

Hope you get my point. A high ASCII letter can never be 7bit data, - that's right, and that's what RFC2045 says. What it does _not_ say is that low ASCII (like the Latin letter "t") is intrinsically bound to be represented as US-ASCII, and _not_ as KOI8-R or even UTF-8, for that matter. And I think that is not by chance. The RFC's authors understood very clearly that if a character (like that poor "t" :) ) exists in various encodings, then it can be encoded in any of those. Period.

Regards,
Maksym.

--
Maksym Kozub, MK881-UANIC
mailto: [EMAIL PROTECTED]
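The two RFC 2045 definitions quoted above are mechanical enough to check in code. The following Python sketch implements only what sections 2.7 and 2.8 say about octets and line structure; the function names are made up for this illustration, and the sketch says nothing about which charset label a mail client ought to choose.

```python
MAX_LINE_OCTETS = 998  # limit quoted from RFC 2045, sections 2.7 and 2.8


def _line_structure_ok(data: bytes) -> bool:
    """Lines of at most 998 octets; CR and LF only as part of CRLF."""
    for line in data.split(b"\r\n"):
        if len(line) > MAX_LINE_OCTETS or b"\r" in line or b"\n" in line:
            return False
    return True


def is_8bit_data(data: bytes) -> bool:
    """8bit data: valid line structure and no NUL octets."""
    return _line_structure_ok(data) and b"\x00" not in data


def is_7bit_data(data: bytes) -> bool:
    """7bit data: everything required of 8bit data, plus no octet above 127."""
    return is_8bit_data(data) and all(octet <= 127 for octet in data)


latin_t = "t"
cyrillic_t = "\u0442"  # Cyrillic small letter te

for charset in ("us-ascii", "koi8-r", "utf-8"):
    print(charset, is_7bit_data(latin_t.encode(charset)))

koi8_bytes = cyrillic_t.encode("koi8-r")
print(is_7bit_data(koi8_bytes), is_8bit_data(koi8_bytes))
```

The Latin "t" passes the 7bit check under all three charsets, because the octets are the same in each; the Cyrillic letter encoded as KOI8-R passes only the 8bit check. Under these definitions every piece of 7bit data also qualifies as 8bit data, and the definitions themselves never mention charset names.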
Re: (not LONGISH any more) Re[2]: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII
Hello Maksym,

On Sun, 25 Jan 2004 20:02:39 +0200 GMT (26/01/2004, 01:02 +0700 GMT), Maksym Kozub wrote:

TF>> just because it doesn't detect a high ASCII character?

> Yep. Looks exactly so.

TF>> That sounds wrong.

> For me too.

OK, let's read on:

> Alexandr Kiselev, administrator, Dec 02, 2003, 07:28:24 pm:
>
> "If all characters in a message are us-ascii, then Bat has been always
> putting us-ascii in message headers, irrespective of the default
> encoding. This is by the way in complete accordance with the letter
> and spirit of RFCs.

The others confirm that's so in the RFCs.

> I would recommend to replace one of Latin "a"'s with a Russian "а" -
> this would be sufficient to cope with your problem.

Here you have a work-around that should solve your problem for the time being.

But having read the discussion you kindly translated, I don't consider it a bug in TB, because TB behaves in conformance with the RFC (if what was said in the forum thread is true; I didn't check it).

The work-around is therefore for an RFC, which I - as you - think should be altered. The correct way is to write to the author of the RFC rather than asking Ritlabs to violate it. I believe they have a right to be proud of their RFC-compliance. If an RFC doesn't make sense, it ought to be changed rather than ignored. IMHO.

--
Cheers,
Thomas.

Moderator der deutschen The Bat! Beginner Liste.

"I'm in favor of love as long as it doesn't happen when 'The Simpsons' are on TV." (Anita, 6)

Message reply created with The Bat! 2.03.47 under Chinese Windows 98 4.10 Build A using a Pentium P4 1.7 GHz, 256MB RAM
(LONGISH) Re[2]: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII
Hello Thomas, On Sun, 25 Jan 2004 г. 18:39:19 you wrote: TF> Let me understand this. You explicitely tell TB to use: >> "Content-Type: text/plain; charset=koi8-r / >> Content-Transfer-Encoding: 8bit" TF> but TB changes it to >> "Content-Type: text/plain; charset=us-ascii >> Content-Transfer-Encoding: 7bit" TF> just because it doesn't detect a high ASCII character? Yep. Looks exactly so. TF> That sounds wrong. For me too. However, based on the bug No. 2343 which I also mentioned, as well as on the Russian discussion I mentioned on the bugtraq, some people think this behaviour (completely unacceptable for me) is absolutely OK. TF> But let me ask you *how* you have set TB to the first setting to being TF> with. Well, there is that "Use Character set" in account templates, as well as "Message encoding" when composing a message... If I choose a charset in those 2 places, I think that means I want to be in the specified charset, and since it is an 8-bit charset, I think it also means I want it to be "Content-Transfer-Encoding: 8bit". (To be strict, I don't think I indicate "Content-Type: text/plain" directly anywhere, - I simply use a plain text message editor. but thanks God, The Bat! does not change that anyway :))) ). You may also want to have a look at the bugtraq link I provided, where I list in every detail the steps required to reproduce this. By the way, a fellow TBUDL subscriber has already added a note there saying it's the same with Latin-9 (ISO-8859-15) encoding. Of course, feel free to ask for any additional info. The following are thing s you may be not so interested in, but still. I will now quote some parts of those discussions here (in my English translation). Those are public discussions anyway, so I think that's OK. First, there was a discussion at http://www.forum.nobat.ru/index.php?board=5;action=display;threadid=1361 , from which I myself got first information on that problem. People complained about this sort of TB! 
behaviour, saying it is still the same with the Windows-1251 charset, it is still the same if you include "%charset" in your templates, whatever.

Alexandr Kiselev, administrator, Dec 02, 2003, 07:28:24 pm:
"If all characters in a message are us-ascii, then Bat has been always putting us-ascii in message headers, irrespective of the default encoding. This is by the way in complete accordance with the letter and spirit of RFCs. I would recommend to replace one of Latin "a"'s with a Russian "а" - this would be sufficient to cope with your problem. In general terms, this is a problem with Outglitch (=Outlook - M.K.)... (skipped - M.K.)"

Sokol, Dec 03, 2003, 09:30:29 am:
"Alexandr Kiselev on Dec 02, 2003, 07:28:24 pm wrote: "If all characters in a message are us-ascii, then Bat has been always putting us-ascii in message headers, irrespective of the default encoding. This is by the way in complete accordance with the letter and spirit of RFCs."
I don't seem to remember seeing such things there. Would you kindly remind me where exactly it says so?.. (skipped - M.K.)"

Alexandr Kiselev, administrator, Dec 03, 2003, 08:15:47 pm:
"RFC2045, see the definition of 7bit data and 8bit data. Yours are 7bit data, so they're encoded under RFC822, i.e. us-ascii. Moreover, I checked on purpose, and Pegasus for example uses exactly the same logic as The Bat!. Same for Becky. So don't tell me this is wrong. You can install some smtp-gateway like Xray, and forcefully change the respective header field. I cannot give any other advice. Oh well... If you have honestly purchased your Outglitch, then you can complain to Microsoft. Let them patch their mistake."

Vadim, administrator, Dec 05, 2003, 10:54:21 am:
"...It is precisely because monsters like Microsoft write programs which do not conform to standards that people like you appear... IMHO, there is a standard, so it must be complied with. And all clients comply with it, while Microsoft invents something of their own and most people comply with that for some [incomprehensible] reasons..."

I got a feeling of all that being definitely a wrong approach, and started a new thread at http://www.forum.nobat.ru/index.php?board=3;action=display;threadid=1564 :

Me on Jan 14, 2004, 03:42:05 pm:
(arguments like those provided in the previous message, as well as in my bugtraq message, skipped...) "...So this is a really illogical behaviour of The Bat!, and if Pegasus and Becky use the same logic... it doesn't mean the logic itself is correct. By the way, regarding Pegasus... Harris puts it in the help files: "Allow 8-bit MIME encodings" "If you check this control, Pegasus Mail will generate MIME messages using the MIME "8
Re: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII
Hello Peter, On Sun, 25 Jan 2004 18:30:51 +0100 GMT (26/01/2004, 00:30 +0700 GMT), Peter Meyns wrote: TF>> just because it doesn't detect a high ASCII character? TF>> That sounds wrong. > That's exactly what happens here with my default setting of Latin-9 > (ISO-8859-15). Then it sounds like a bug to me. I cannot imagine how this can be desired. > Have a look at this message's source. Did. Says us-ascii. :-( -- Cheers, Thomas. Moderator der deutschen The Bat! Beginner Liste. "Reason for leaving last job: maturity leave." Message reply created with The Bat! 2.03.47 under Chinese Windows 98 4.10 Build A using a Pentium P4 1.7 GHz, 256MB RAM Current version is 2.02.3 CE | "Using TBUDL" information: http://www.silverstones.com/thebat/TBUDLInfo.html
Re: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII
Hi Thomas, on Sun, 25 Jan 2004 23:39:19 +0700GMT (25.01.04, 17:39 +0100GMT here), you wrote in mid:[EMAIL PROTECTED] : TF> Let me understand this. You explicitely tell TB to use: >> "Content-Type: text/plain; charset=koi8-r / >> Content-Transfer-Encoding: 8bit" TF> but TB changes it to >> "Content-Type: text/plain; charset=us-ascii >> Content-Transfer-Encoding: 7bit" TF> just because it doesn't detect a high ASCII character? Here's the same with high ASCII character. -- Cheers Peter Anything that can go wr ...Segmentation violation -- Core dumped. Winamp currently playing: Caetano Veloso & Gilberto Gil - Tradição Current version is 2.02.3 CE | "Using TBUDL" information: http://www.silverstones.com/thebat/TBUDLInfo.html
Re: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII
Hi Thomas, on Sun, 25 Jan 2004 23:39:19 +0700GMT (25.01.04, 17:39 +0100GMT here), you wrote in mid:[EMAIL PROTECTED] : >> If you choose an 8-bit encoding for your outgoing messages, but the >> message actually does not contain any symbols with decimal values >> higher than 127, then TB! would just make it "Content-Type: >> text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit", when >> queuing that message in Outbox. TF> [...] TF> Let me understand this. You explicitely tell TB to use: >> "Content-Type: text/plain; charset=koi8-r / >> Content-Transfer-Encoding: 8bit" TF> but TB changes it to >> "Content-Type: text/plain; charset=us-ascii >> Content-Transfer-Encoding: 7bit" TF> just because it doesn't detect a high ASCII character? TF> That sounds wrong. That's exactly what happens here with my default setting of Latin-9 (ISO-8859-15). Have a look at this message's source. I won't use any umlauts, the charset will be set to us-ascii. I had to change my tagline though... -- Cheers Peter Fatal error: Close your eyes and press ESCAPE three times. Winamp currently playing: Caetano Veloso & Gilberto Gil - Tradicao Current version is 2.02.3 CE | "Using TBUDL" information: http://www.silverstones.com/thebat/TBUDLInfo.html
Re: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII
Hello Maksym,

On Sun, 25 Jan 2004 17:07:51 +0200 GMT (25/01/2004, 22:07 +0700 GMT), Maksym Kozub wrote:

> If you choose an 8-bit encoding for your outgoing messages, but the
> message actually does not contain any symbols with decimal values
> higher than 127, then TB! would just make it "Content-Type:
> text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit", when
> queuing that message in Outbox.

[...]

Let me understand this. You explicitly tell TB to use:

> "Content-Type: text/plain; charset=koi8-r /
> Content-Transfer-Encoding: 8bit"

but TB changes it to

> "Content-Type: text/plain; charset=us-ascii
> Content-Transfer-Encoding: 7bit"

just because it doesn't detect a high ASCII character?

That sounds wrong.

But let me ask you *how* you have set TB to the first setting to begin with.

--
Cheers,
Thomas.

Moderator der deutschen The Bat! Beginner Liste.

Things You Would Never Know Without the Movies: It is always possible to park directly outside any building you are visiting.

Message reply created with The Bat! 2.03.47 under Chinese Windows 98 4.10 Build A using a Pentium P4 1.7 GHz, 256MB RAM
Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII
I sent this message to TBTECH first; it seems however that almost nobody reads the tech list, so I decided to resend it here.

I recently submitted the problem described in this message as a bug (https://www.ritlabs.com/bt/bug_view_advanced_page.php?bug_id=0002349); I decided to also report it on the list though - it would be interesting to hear what other people think. Hope it's still not too technical for TBUDL :).

If you choose an 8-bit encoding for your outgoing messages, but the message actually does not contain any symbols with decimal values higher than 127, then TB! would just make it "Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit", when queuing that message in Outbox.

Some people think this is the correct behaviour, and they refer to RFC2045 et al. Somebody even reported a bug https://www.ritlabs.com/bt/bug_view_advanced_page.php?bug_id=0002343 - "Possibility to leave definition of 8 bit charset in case of message with 7-bit only without resetting to "us-ascii""

I still consider it definitely a logical mistake, and a serious one, since the RFCs say octets with decimal values above 127 are not allowed in 7-bit data, but no RFC says that 0-127 should never be encoded as 8-bit - characters themselves are not intrinsically "7-bit" or "8-bit".

As a result of this behaviour combined with some other MUAs' (e.g. Microsoft-made ones') improper behaviour, there is the following problem reported by various people. Suppose I send a message to my Ukrainian friend in Canada, and he replies in Russian. I know his MUA would try to put in the headers of his reply the same encoding as my message had. To save him time on checking, I would indicate I want _every message of mine_ (even if it's plain English only!) to be "Content-Type: text/plain; charset=koi8-r / Content-Transfer-Encoding: 8bit", which is perfectly legal in my view, as explained above. I compose my message indicating "KOI8-R" as the charset to be used, but... looking in the Outbox, I see "Content-Type: text/plain; charset=us-ascii / Content-Transfer-Encoding: 7bit" there!

Hope you get my point. www.livejournal.com uses UTF-8 for all its webpages, even though I do not use any Chinese or other double-byte characters in my blog there. I consider this to be a good example: characters, be they English, Ukrainian, Chinese, or whatever, are not "7bit", "8bit", "double-byte", etc. They _can_ be _encoded_ in various ways; and other than for those cases where it is just plainly impossible to encode them in a specific way (like it is impossible to encode Russian as 7bit), - standards do not prohibit us from using anything.

So, seeing a good, standards-compliant mail client like TB!, which calls itself a "mail servant" :), I would like it to respect my will, _or_ at least to produce a warning when it changes (again, without a valid, standards-based reason!) what I've set as my default charset.

Would you agree with that?..

My apologies for this letter being rather long, - at least I hope it is not completely boring for everybody :).

Maksym.

--
Maksym Kozub, MK881-UANIC
mailto: [EMAIL PROTECTED]
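The claim that plain English text can be carried under any of the charsets named in this thread is easy to check. The sketch below uses Python's standard library purely as an illustration: it compares the bytes produced by several encodings, then builds a message with an explicit charset label. Python's `email` package chooses the Content-Transfer-Encoding itself, and that choice is a detail of the library, not something the thread is about.

```python
from email.mime.text import MIMEText

text = "Plain English only."

# Charsets mentioned in the thread; an ASCII-only text should map to the
# same octets under each of them.
encodings = ["us-ascii", "koi8-r", "windows-1251", "iso-8859-15", "utf-8"]
encoded = {name: text.encode(name) for name in encodings}
identical = len(set(encoded.values())) == 1
print(identical)

# Labelling the same ASCII-only body with the charset the sender chose.
msg = MIMEText(text, "plain", "koi8-r")
print(msg.get_content_charset())
```

Since the octets are identical in every case, the only thing that differs between the two outcomes described above is the label in the Content-Type header.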