Re[4]: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII

2004-01-27 Thread Stefan Tanurkov
Hello Maksym,

Tuesday, January 27, 2004, 2:00:24 PM, you wrote:

MK> First things first, in this case you cannot (and should not) call your
MK> character set "Default character set".

Yes, we can - it's a shortened form of "Default character set for
non-7-bit ASCII characters", but the latter is way too long and much
more confusing (how many people know what ASCII is and what 7-bit
characters are anyway? :-)

In fact, you are the only one complaining about the current behaviour
as far as I can recall and you are complaining about it only because
the mail client of your correspondent is not functioning properly.
Wouldn't it be a better solution to find a simple workaround (like
adding an 8-bit character, such as your name in Cyrillic or a
non-breaking space, Alt+0160, to your template)? :-)
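
A minimal sketch of why that workaround has any effect, assuming the
client only looks for octets above 127 in the encoded body. The helper
name has_high_octets and the sample text are hypothetical, not anything
from The Bat! itself:

```python
# Hypothetical illustration: one non-ASCII character is enough to make
# the encoded body contain an octet above 127.
plain = "Best regards, Maksym"
with_nbsp = plain + "\u00a0"  # U+00A0 NO-BREAK SPACE appended to the template


def has_high_octets(text: str, charset: str) -> bool:
    """Return True if encoding `text` in `charset` yields any octet above 127."""
    return any(octet > 127 for octet in text.encode(charset))


print(has_high_octets(plain, "koi8_r"))
print(has_high_octets(with_nbsp, "koi8_r"))
```

The first call reports no high octets and the second reports at least
one, which is the only difference the workaround relies on.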

MK> Also, as you see from the quoted discussion at nobat.ru, the whole
MK> issue shows some massive misunderstanding of (I would even say
MK> "illiteracy in reading of") RFCs: people keep saying you've
MK> implemented it this way because of RFC compliance

Some people do believe that the current behaviour is more
RFC-compliant, but my opinion (and experience) is - the current
behaviour is more logical, safe and convenient.


-- 
Cheers!
 Stefan



Current version is 2.02.3 CE | "Using TBUDL" information:
http://www.silverstones.com/thebat/TBUDLInfo.html

Re[2]: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII

2004-01-27 Thread Stefan Tanurkov
Hello Thomas,

Monday, January 26, 2004, 8:01:14 PM, you wrote:

>> The behaviour you propose causes more problems because some systems
>> (especially those functioning in the US and Canada) do not know anything
>> about character sets other than us-ascii (and ISO, if one is lucky),
>> and that causes problems with message processing, so the recipient may
>> not get the message.

TF> Hm. If he does use a high ASCII character, the encoding will not be
TF> changed by TB.

Well, here is the problem - I use a Russian character set by default.
When I send messages in English, I'm always sure that they're going
out in us-ascii, no need to change anything here. Now, imagine I'm
sending a message in English to some of those servers from the
above - in the best case, I'll get my message back saying the server could
not process it because of an unknown character set. But I used only English!
:-)  Do you see my point?

Another point - some MUAs may/will choose different fonts for viewing
messages in a non-Western character set because of font mangling...


TF> What you do is change the encoding despite the sender's explicit
TF> wish.

Not exactly. I want the messages I write in Russian to go out in the
character set I choose. But if I write messages in English, I don't
want to mess with switching my encoding - all I want is just to write
a message and be sure the recipient will be able to read it without
any problems :-)


-- 
Cheers!
 Stefan



Re[3]: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII

2004-01-27 Thread Maksym Kozub
Hello Stefan,

I agree with what Thomas said in his message.

ST> Well, here is the problem - I use a Russian character set by default.
ST> When I send messages in English, I'm always sure that they're going
ST> out in us-ascii, no need to change anything here. Now, imagine I'm
ST> sending a message in English to some of those servers from the
ST> above - in the best case, I'll get my message back saying the server could
ST> not process it because of an unknown character set. But I used only English!
ST> :-)  Do you see my point?
...skipped
TF>> What you do is change the encoding despite the sender's explicit
TF>> wish.

ST> Not exactly. I want the messages I write in Russian to go out in the
ST> character set I choose. But if I write messages in English, I don't
ST> want to mess with switching my encoding - all I want is just to write
ST> a message and be sure the recipient will be able to read it without
ST> any problems :-)

First things first, in this case you cannot (and should not) call your
character set "Default character set". Then it's something like your
"Default character set, but only for everything
Russian/Not-low-ASCII/...". Then please, please, please, - say that in
The Bat! options. I will still be unhappy with such behaviour, but it
would at least be honest, and everybody will know: "Whatever I
choose in that dropdown list is _not_ my default charset".

Oh, and that won't be completely honest yet: we've been missing
another part of my initial message here. When I say "Options - Message
encoding - Cyrillic (KOI8-R)" (or, for that matter, whatever
non-US-ASCII) _while writing this very message_, TB! would still
change it to US-ASCII when queuing the message to Outbox, or sending
it immediately. Is _that_ logical? I think in this case I have
expressly told The Bat! that I _do_ want to "mess with switching my
encoding", haven't I?.. What do you think of _that_?

Also, as you see from the quoted discussion at nobat.ru, the whole
issue shows some massive misunderstanding of (I would even say
"illiteracy in reading of") RFCs: people keep saying you've implemented
it this way because of RFC compliance, while from what you said here
it follows that it is merely your idea of user convenience, etc.

Regards,
Maksym.

-- 
Maksym Kozub, MK881-UANIC mailto: [EMAIL PROTECTED]





Re: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII

2004-01-26 Thread Thomas Fernandez
Hello Stefan,

On Mon, 26 Jan 2004 09:30:00 +0200 GMT (26/01/2004, 14:30 +0700 GMT),
Stefan Tanurkov wrote:

> The behaviour you propose causes more problems because some systems
> (especially those functioning in the US and Canada) do not know anything
> about character sets other than us-ascii (and ISO, if one is lucky),
> and that causes problems with message processing, so the recipient may
> not get the message.

Hm. If he does use a high ASCII character, the encoding will not be
changed by TB.

> In fact, in its early days TB! was functioning "your" way (i.e. it was
> setting the character set even though there were no characters from that
> CS in the message at all), but we changed it to the current behaviour
> after problems caused by the systems described above...

Oh, but if the recipient cannot receive 8-bit character encoded
messages, how would he write in Cyrillic and get replies? And why
would the sender explicitly choose 8-bit? If the recipient cannot
receive 8-bit, there is a reason not to send him messages thus encoded;
if he can, what point is there in changing it? I mean, it is
between sender and recipient.

> So, basically, it is no problem on our side to assign a non-ASCII
> character set to messages with only 7-bit characters, but this may
> be troublesome - this is why we don't want to do that :-)

What you do is change the encoding despite the sender's explicit wish.

-- 

Cheers,
Thomas.

Moderator of the German The Bat! Beginner list.

Apple - Typically a device to seduce men, usually equipped with a
display screen.

Message reply created with The Bat! 2.03.47
under Chinese Windows 98 4.10 Build  A 
using a Pentium P4 1.7 GHz, 256MB RAM







Re: (not LONGISH any more) Re[2]: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII

2004-01-26 Thread Thomas Fernandez
Hello Maksym,

On Sun, 25 Jan 2004 22:09:25 +0200 GMT (26/01/2004, 03:09 +0700 GMT),
Maksym Kozub wrote:

> Hope you get my point.

Yes. And I admit to still not having read the RFC.

> Any high ASCII letter can never be 7bit data, - that's right, and
> that's what RFC2045 says. What it does _not_ say is that low ASCII
> (like the Latin letter "t") is intrinsically bound to be represented
> as US-ASCII, and _not_ as KOI8-R or even UTF-8, for that matter.

I agree with you. But see Stefan's message.

-- 

Cheers,
Thomas.

Moderator of the German The Bat! Beginner list.

A chicken crossing the road is poultry in motion.

Message reply created with The Bat! 2.03.47
under Chinese Windows 98 4.10 Build  A 
using a Pentium P4 1.7 GHz, 256MB RAM







Re: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII

2004-01-26 Thread Chris
On Monday, January 26, 2004 at 2:30:00 AM, Stefan Tanurkov wrote in
the message "Fwd: Bug (maybe wrong understanding of RFCs): an encoding
selected by the user sometimes silently replaced with 7-bit US-ASCII"
<mid:[EMAIL PROTECTED]>:

> So, basically, it is no problem on our side to assign a non-ASCII
> character set to messages with only 7-bit characters, but this may
> be troublesome - this is why we don't want to do that :-)
Would that the day when we all use Unicode come faster!

-- 
Chris
Quoting when replying to this message is good for your karma.

At a Towing Company: "We don't charge an arm and a leg. We want tows."

Using The Bat! v2.02.3 CE on Windows XP 5.1 Build 2600 Service Pack 1



Re: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII

2004-01-26 Thread Stefan Tanurkov
Hello Maksym,

Sunday, January 25, 2004, 5:07:51 PM, you wrote:

MK> As a result of this behaviour combined with some other MUAs' (e.g.
MK> Microsoft-made ones') improper behaviour, there is the following
MK> problem reported by various people.

The behaviour you propose causes more problems because some systems
(especially those functioning in the US and Canada) do not know anything
about character sets other than us-ascii (and ISO, if one is lucky),
and that causes problems with message processing, so the recipient may
not get the message.

In fact, in its early days TB! was functioning "your" way (i.e. it was
setting the character set even though there were no characters from that
CS in the message at all), but we changed it to the current behaviour
after problems caused by the systems described above...

So, basically, it is no problem on our side to assign a non-ASCII
character set to messages with only 7-bit characters, but this may
be troublesome - this is why we don't want to do that :-)

-- 
Cheers!
 Stefan



Re[2]: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII

2004-01-25 Thread Maksym Kozub
Hello Carsten,

On Mon, Jan 26 2004, 0:52:14 you wrote:

>> That sounds wrong.

CT> Huh? This is the way good mail clients work. It is not wrong at all.

It _is_ wrong. See my other messages in this thread, where I clearly
demonstrate that 1) there is actually _nothing_ in RFC2045 that prevents
a MUA from encoding low ASCII characters in whatever encoding, or that
forces it to use US-ASCII, and 2) a MUA's failure to honour the chosen
encoding results in various inconveniences for some users, and those
inconveniences are actually _not_ forced by RFC compliance. That's why
I describe this as "maybe wrong understanding of RFCs" on the part of
MUAs' authors (not only those of The Bat!, it seems).

Regards,
Maksym.

-- 
Maksym Kozub, MK881-UANIC mailto: [EMAIL PROTECTED]





Re: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII

2004-01-25 Thread Carsten Thönges
* Thomas Fernandez writes:
> Maksym Kozub wrote:

> Let me understand this. You explicitly tell TB to use:

>> "Content-Type: text/plain; charset=koi8-r /
>> Content-Transfer-Encoding: 8bit"

> but TB changes it to

>> "Content-Type: text/plain; charset=us-ascii
>> Content-Transfer-Encoding: 7bit"

> just because it doesn't detect a high ASCII character?

> That sounds wrong.

Huh? This is the way good mail clients work. It is not wrong at all.

Carsten
-- 





Re[2]: (not LONGISH any more) Re[2]: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII

2004-01-25 Thread Maksym Kozub
Hello Thomas,

On Sun, 25 Jan 2004 20:41:42 you wrote:

TF> OK, let's read on:

>> Alexandr Kiselev, administrator, Dec 02, 2003, 07:28:24 pm:
>> 
>> "If all characters in a message are us-ascii, then Bat has always been
>> putting us-ascii in message headers, irrespective of the default
>> encoding. This is by the way in complete accordance with the letter
>> and spirit of RFCs.

TF> The others confirm that's so in the RFCs.

In fact, that's _not_ what the RFCs say. See below.

>> I would recommend replacing one of the Latin "a"s with a Russian "а" -
>> this would be sufficient to cope with your problem.

TF> Here you have a work-around that should solve your problem for the
TF> time being.

TF> But having read the discussion you kindly translated, I don't consider
TF> it a bug in TB, because TB behaves RFC-conformantly (if what was said in
TF> the thread from the forum is true; I didn't check it).

That workaround is needed _only_ because The Bat! misinterprets the
RFCs (see below). That's an important part of my whole point.

TF> The work-around is therefore for an RFC, which I - as you - think
TF> should be altered. The correct way is to write to the author of the
TF> RFC rather than asking Ritlabs to violate it. I believe they have a
TF> right to be proud of their RFC-compliance. If an RFC doesn't make
TF> sense, it ought to be changed rather than ignored. IMHO.

The whole matter is this: I _don't_ think that RFC should be altered,
but I would like to take the liberty to say there is a wrong
understanding shown by The Bat!, and by the discussion participants,
of what _is_ said in that RFC. Let's have one more look at RFC2045
now. What does it say? It says:

"2.7. 7bit Data
"7bit data" refers to data that is all represented as relatively short
lines with 998 octets or less between CRLF line separation sequences
[RFC-821]. No octets with decimal values greater than 127 are allowed
and neither are NULs (octets with decimal value 0). CR (decimal value
13) and LF (decimal value 10) octets only occur as part of CRLF line
separation sequences.
2.8. 8bit Data
"8bit data" refers to data that is all represented as relatively short
lines with 998 octets or less between CRLF line separation sequences
[RFC-821]), but octets with decimal values greater than 127 may be
used. As with "7bit data" CR and LF octets only occur as part of CRLF
line separation sequences and no NULs are allowed."

To keep it short: "7bit data should _never_ ever contain octets above
127. 8bit data _may_ contain octets above 127."

Does it say "8bit data _should always_ contain octets above 127"? _No_.

Does it say "Whatever does not contain octets above 127 _is always_ 7bit
data"? _No_.
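
The section 2.7 definition quoted above can be written as a check on
raw octets. This is a rough sketch only; the function name is_7bit_data
is made up for illustration. Note that it decides whether some octets
_qualify_ as 7bit data and says nothing about which charset label they
must carry:

```python
def is_7bit_data(data: bytes) -> bool:
    """Check `data` against the RFC 2045 section 2.7 definition of 7bit data."""
    # No octets above 127 and no NULs are allowed.
    if any(octet > 127 or octet == 0 for octet in data):
        return False
    # CR and LF may only occur as part of a CRLF pair.
    without_crlf = data.replace(b"\r\n", b"")
    if b"\r" in without_crlf or b"\n" in without_crlf:
        return False
    # At most 998 octets between CRLF line separation sequences.
    return all(len(line) <= 998 for line in data.split(b"\r\n"))


print(is_7bit_data(b"Plain English text\r\n"))
print(is_7bit_data("Привет\r\n".encode("koi8_r")))
```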

When I type the Latin letter "t", am I typing "7bit data"? _Not
necessarily_. It may be represented as 7-bit, 8-bit, UTF-7, UTF-8...
Of course, if there is a Russian character in the same message, then
the message cannot be encoded as US-ASCII anymore - see RFC2045 above.
However, if there is nothing but low ASCII in that message, - please
show me why, based on the definitions from RFC2045 quoted above, it
cannot be encoded as 8bit KOI8-R, or Win-1252, or UTF-8...
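
A quick way to see this for the letter "t": encode it under several
charsets and compare the octets. The list of codecs below is just my
selection of Python codec names for the charsets mentioned here:

```python
text = "t"
encodings = ["ascii", "koi8_r", "cp1252", "iso8859_15", "utf-8"]

# Encode the same character under each charset and collect the octets.
encoded = {name: text.encode(name) for name in encodings}
for name, octets in encoded.items():
    print(name, octets)

# True when every charset produced exactly the same octet sequence.
identical = len(set(encoded.values())) == 1
print(identical)
```

All of these charsets share the low ASCII range, so the octets on the
wire are the same whichever label is put in the header.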

Hope you get my point. Any high ASCII letter can never be 7bit data, -
that's right, and that's what RFC2045 says. What it does _not_ say is
that low ASCII (like the Latin letter "t") is intrinsically bound to
be represented as US-ASCII, and _not_ as KOI8-R or even UTF-8, for
that matter.

And I think it is not by chance. The RFC's authors understood very
clearly that if a character (like that poor "t" :) ) exists in various
encodings, then it can be encoded in any of those. Period.

Regards,
Maksym.

-- 
Maksym Kozub, MK881-UANIC mailto: [EMAIL PROTECTED]





Re: (not LONGISH any more) Re[2]: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII

2004-01-25 Thread Thomas Fernandez
Hello Maksym,

On Sun, 25 Jan 2004 20:02:39 +0200 GMT (26/01/2004, 01:02 +0700 GMT),
Maksym Kozub wrote:

TF>> just because it doesn't detect a high ASCII character?

> Yep. Looks exactly so.

TF>> That sounds wrong.

> For me too.

OK, let's read on:

> Alexandr Kiselev, administrator, Dec 02, 2003, 07:28:24 pm:
> 
> "If all characters in a message are us-ascii, then Bat has always been
> putting us-ascii in message headers, irrespective of the default
> encoding. This is by the way in complete accordance with the letter
> and spirit of RFCs.

The others confirm that's so in the RFCs.

> I would recommend replacing one of the Latin "a"s with a Russian "а" -
> this would be sufficient to cope with your problem.

Here you have a work-around that should solve your problem for the
time being.

But having read the discussion you kindly translated, I don't consider
it a bug in TB, because TB behaves RFC-conformantly (if what was said in
the thread from the forum is true; I didn't check it).

The work-around is therefore for an RFC, which I - as you - think
should be altered. The correct way is to write to the author of the
RFC rather than asking Ritlabs to violate it. I believe they have a
right to be proud of their RFC-compliance. If an RFC doesn't make
sense, it ought to be changed rather than ignored. IMHO.

-- 

Cheers,
Thomas.

Moderator of the German The Bat! Beginner list.

"I'm in favor of love as long as it doesn't happen when 'The Simpsons'
are on TV." (Anita, 6)

Message reply created with The Bat! 2.03.47
under Chinese Windows 98 4.10 Build  A 
using a Pentium P4 1.7 GHz, 256MB RAM







(LONGISH) Re[2]: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII

2004-01-25 Thread Maksym Kozub
Hello Thomas,

On Sun, 25 Jan 2004 18:39:19 you wrote:

TF> Let me understand this. You explicitly tell TB to use:

>> "Content-Type: text/plain; charset=koi8-r /
>> Content-Transfer-Encoding: 8bit"

TF> but TB changes it to

>> "Content-Type: text/plain; charset=us-ascii
>> Content-Transfer-Encoding: 7bit"

TF> just because it doesn't detect a high ASCII character?

Yep. Looks exactly so.

TF> That sounds wrong.

For me too. However, based on the bug No. 2343 which I also mentioned,
as well as on the Russian discussion I mentioned on the bugtraq, some
people think this behaviour (completely unacceptable for me) is
absolutely OK.

TF> But let me ask you *how* you have set TB to the first setting to begin
TF> with.

Well, there is that "Use Character set" in account templates, as well
as "Message encoding" when composing a message... If I choose a
charset in those 2 places, I think that means I want to be in the
specified charset, and since it is an 8-bit charset, I think it also
means I want it to be "Content-Transfer-Encoding: 8bit". (To be
strict, I don't think I indicate "Content-Type: text/plain" directly
anywhere - I simply use a plain text message editor, but thank God,
The Bat! does not change that anyway :))) ). You may also want to have
a look at the bugtraq link I provided, where I list in every detail
the steps required to reproduce this. By the way, a fellow TBUDL
subscriber has already added a note there saying it's the same with
Latin-9 (ISO-8859-15) encoding.

Of course, feel free to ask for any additional info.

The following are things you may not be so interested in, but still.
I will now quote some parts of those discussions here (in my English
translation). Those are public discussions anyway, so I think that's
OK.

First, there was a discussion at
http://www.forum.nobat.ru/index.php?board=5;action=display;threadid=1361
, from which I myself got first information on that problem. People
complained about this sort of TB! behaviour, saying it is still the
same with Windows-1251 charset, it is still the same if you include
"%charset" in your templates, whatever.

Alexandr Kiselev, administrator, Dec 02, 2003, 07:28:24 pm:

"If all characters in a message are us-ascii, then Bat has always been
putting us-ascii in message headers, irrespective of the default
encoding. This is by the way in complete accordance with the letter
and spirit of RFCs. I would recommend replacing one of the Latin "a"s
with a Russian "а" - this would be sufficient to cope with your
problem. In general terms, this is a problem with Outglitch (=Outlook
- M.K.)... (skipped - M.K.)"

Sokol, Dec 03, 2003, 09:30:29 am:

"Alexandr Kiselev on Dec 02, 2003, 07:28:24 pm wrote: "If all
characters in a message are us-ascii, then Bat has always been putting
us-ascii in message headers, irrespective of the default encoding.
This is by the way in complete accordance with the letter and spirit
of RFCs."

I don't seem to remember seeing such things there. Would you kindly
remind me where exactly it says so?.. (skipped - M.K.)"

Alexandr Kiselev, administrator, Dec 03, 2003, 08:15:47 pm:

"RFC2045, see the definition of 7bit data and 8bit data. Yours are 7bit
data, so they're encoded under RFC822, i.e. us-ascii.

Moreover, I purposely checked, and Pegasus for example uses exactly the
same logic as The Bat!. Same for Becky. So don't tell me this is
wrong. You can install some SMTP gateway like Xray, and forcibly
change the respective header field. I cannot give any other
advice. Oh well... If you have honestly purchased your Outglitch, then
you can complain to Microsoft. Let them patch their mistake."

Vadim, administrator, Dec 05, 2003, 10:54:21 am:

"...It is precisely because monsters like Microsoft write programs
that do not conform to standards that people like you
appear... IMHO, there is a standard, so it must be complied with. And
all clients comply with it, while Microsoft invents something of their
own and most people comply with that for some [incomprehensible]
reasons..."

I got a feeling of all that being definitely a wrong approach, and
started a new thread at
http://www.forum.nobat.ru/index.php?board=3;action=display;threadid=1564
:

Me on Jan 14, 2004, 03:42:05 pm :

(arguments like those provided in the previous message, as well as in
my bugtraq message, skipped...) "...So this is a really illogical
behaviour of The Bat!, and if Pegasus and Becky use the same logic...
it doesn't mean the logic itself is correct. By the way, regarding
Pegasus... Harris puts it in the help files: "Allow 8-bit MIME
encodings" "If you check this control, Pegasus Mail will generate MIME
messages using the MIME "8

Re: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII

2004-01-25 Thread Thomas Fernandez
Hello Peter,

On Sun, 25 Jan 2004 18:30:51 +0100 GMT (26/01/2004, 00:30 +0700 GMT),
Peter Meyns wrote:

TF>> just because it doesn't detect a high ASCII character?

TF>> That sounds wrong.

> That's exactly what happens here with my default setting of Latin-9
> (ISO-8859-15).

Then it sounds like a bug to me. I cannot imagine how this can be
desired.

> Have a look at this message's source.

Did. Says us-ascii. :-(

-- 

Cheers,
Thomas.

Moderator of the German The Bat! Beginner list.

"Reason for leaving last job: maturity leave."

Message reply created with The Bat! 2.03.47
under Chinese Windows 98 4.10 Build  A 
using a Pentium P4 1.7 GHz, 256MB RAM







Re: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII

2004-01-25 Thread Peter Meyns
Hi Thomas,

on Sun, 25 Jan 2004 23:39:19 +0700GMT (25.01.04, 17:39 +0100GMT here),
you wrote in mid:[EMAIL PROTECTED] :

TF> Let me understand this. You explicitly tell TB to use:

>> "Content-Type: text/plain; charset=koi8-r /
>> Content-Transfer-Encoding: 8bit"

TF> but TB changes it to

>> "Content-Type: text/plain; charset=us-ascii
>> Content-Transfer-Encoding: 7bit"

TF> just because it doesn't detect a high ASCII character?

Here's the same with high ASCII character.

-- 
Cheers
Peter

Anything that can go wr ...Segmentation violation -- Core dumped.

Winamp currently playing: Caetano Veloso & Gilberto Gil - Tradição





Re: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII

2004-01-25 Thread Peter Meyns
Hi Thomas,

on Sun, 25 Jan 2004 23:39:19 +0700GMT (25.01.04, 17:39 +0100GMT here),
you wrote in mid:[EMAIL PROTECTED] :

>> If you choose an 8-bit encoding for your outgoing messages, but the
>> message actually does not contain any symbols with decimal values
>> higher than 127, then TB! would just make it "Content-Type:
>> text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit", when
>> queuing that message in Outbox.
TF> [...]

TF> Let me understand this. You explicitly tell TB to use:

>> "Content-Type: text/plain; charset=koi8-r /
>> Content-Transfer-Encoding: 8bit"

TF> but TB changes it to

>> "Content-Type: text/plain; charset=us-ascii
>> Content-Transfer-Encoding: 7bit"

TF> just because it doesn't detect a high ASCII character?

TF> That sounds wrong.

That's exactly what happens here with my default setting of Latin-9
(ISO-8859-15).

Have a look at this message's source. I won't use any umlauts, the
charset will be set to us-ascii. I had to change my tagline though...
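
For anyone who would rather not read the raw source by eye, here is a
small sketch using Python's standard email package to pull out the two
header values in question. The message text below is a made-up
stand-in, not the actual source of this mail:

```python
from email import message_from_string

# Made-up message source standing in for a real one saved from the client.
raw_source = (
    "From: sender@example.org\n"
    "Subject: test\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "Content-Transfer-Encoding: 7bit\n"
    "\n"
    "No umlauts here.\n"
)

msg = message_from_string(raw_source)
declared_charset = msg.get_content_charset()          # charset parameter of Content-Type
transfer_encoding = msg["Content-Transfer-Encoding"]  # raw header value
print(declared_charset, transfer_encoding)
```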

-- 
Cheers
Peter

Fatal error: Close your eyes and press ESCAPE three times.

Winamp currently playing: Caetano Veloso & Gilberto Gil - Tradicao





Re: Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII

2004-01-25 Thread Thomas Fernandez
Hello Maksym,

On Sun, 25 Jan 2004 17:07:51 +0200 GMT (25/01/2004, 22:07 +0700 GMT),
Maksym Kozub wrote:

> If you choose an 8-bit encoding for your outgoing messages, but the
> message actually does not contain any symbols with decimal values
> higher than 127, then TB! would just make it "Content-Type:
> text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit", when
> queuing that message in Outbox.
[...]

Let me understand this. You explicitly tell TB to use:

> "Content-Type: text/plain; charset=koi8-r /
> Content-Transfer-Encoding: 8bit"

but TB changes it to

> "Content-Type: text/plain; charset=us-ascii
> Content-Transfer-Encoding: 7bit"

just because it doesn't detect a high ASCII character?

That sounds wrong.

But let me ask you *how* you have set TB to the first setting to begin
with.

-- 

Cheers,
Thomas.

Moderator of the German The Bat! Beginner list.

Things You Would Never Know Without the Movies: It is always possible
to park directly outside any building you are visiting.

Message reply created with The Bat! 2.03.47
under Chinese Windows 98 4.10 Build  A 
using a Pentium P4 1.7 GHz, 256MB RAM







Fwd: Bug (maybe wrong understanding of RFCs): an encoding selected by the user sometimes silently replaced with 7-bit US-ASCII

2004-01-25 Thread Maksym Kozub
I sent this message on TBTECH first; it seems however that there is
almost nobody reading the tech list, so I decided to resend it here. I
recently submitted the problem described in this message as a bug
(https://www.ritlabs.com/bt/bug_view_advanced_page.php?bug_id=0002349);
decided to also report on the list though - it would be interesting to
hear what other people think. Hope it's still not too technical for
TBUDL :).

If you choose an 8-bit encoding for your outgoing messages, but the
message actually does not contain any symbols with decimal values
higher than 127, then TB! would just make it "Content-Type:
text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit", when
queuing that message in Outbox.
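
In code, the behaviour I am describing would look roughly like the
sketch below. This is only my reconstruction from what I see in the
Outbox; the function name headers_as_observed is hypothetical and I
have no knowledge of how TB! implements it internally:

```python
def headers_as_observed(body: str, chosen_charset: str) -> dict:
    """Reconstruct the observed header choice for a plain-text body."""
    octets = body.encode(chosen_charset)
    if all(octet < 128 for octet in octets):
        # No octet above 127: the user's charset is silently dropped.
        return {
            "Content-Type": "text/plain; charset=us-ascii",
            "Content-Transfer-Encoding": "7bit",
        }
    return {
        "Content-Type": f"text/plain; charset={chosen_charset}",
        "Content-Transfer-Encoding": "8bit",
    }


print(headers_as_observed("Plain English only", "koi8-r"))
print(headers_as_observed("Привет", "koi8-r"))
```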

Some people think this is the correct behaviour, and they refer to
RFC2045 et al. Somebody even reported a bug
https://www.ritlabs.com/bt/bug_view_advanced_page.php?bug_id=0002343 -
"Possibility to leave definition of 8 bit charset in case of message
with 7-bit only without resetting to "us-ascii"". I still consider it
definitely a logical mistake, and a serious one, since RFCs say octets
with decimal values above 127 are not allowed in 7-bit data, but
no RFC says 0-127 should never be encoded as 8-bit - characters
themselves are not intrinsically "7-bit" or "8-bit".

As a result of this behaviour combined with some other MUAs' (e.g.
Microsoft-made ones') improper behaviour, there is the following
problem reported by various people. Suppose I send a message to my
Ukrainian friend in Canada, and he replies in Russian. I know his MUA
would try and put in the headers of his reply the same encoding as my
message had. To save him time on checking, I would indicate I want
_every message of mine_ (even if it's plain English only!) to be
"Content-Type: text/plain; charset=koi8-r / Content-Transfer-Encoding:
8bit", _which is perfectly legal in my view_, as explained above. I
compose my message indicating "KOI8-R" as the charset to be used,
but... looking in the Outbox, I see "Content-Type: text/plain;
charset=us-ascii / Content-Transfer-Encoding: 7bit" there!
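
For comparison, here is a sketch with Python's standard email package,
which (as far as I can tell) will label a pure-ASCII body with whatever
charset and transfer encoding the caller asks for. It shows one library
accepting the combination and is not a statement about what any MUA
ought to do:

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Plain English only"
# Pure-ASCII body, explicitly labelled as KOI8-R with 8bit transfer encoding.
msg.set_content("Hello, this body is pure ASCII.\n", charset="koi8-r", cte="8bit")

print(msg["Content-Type"])
print(msg["Content-Transfer-Encoding"])
```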

Hope you get my point. www.livejournal.com uses UTF-8 for all those
webpages, even though I do not use any Chinese or other double-byte
characters in my blog there. I consider this to be a good example:
characters, be they English, Ukrainian, Chinese, or whatever, are not
"7bit", "8bit", "double-byte", etc. They _can_ be _encoded_ in various
ways; and other than for those cases where it is just plainly
impossible to encode them in a specific way (like it is impossible to
encode Russian as 7bit), - standards do not prohibit us from using
anything. So, seeing a good, standards-compliant, mail client like
TB!, which calls itself "mail servant" :), I would like it to respect
my will, _or_ at least to produce a warning when it changes (again,
without a valid, standards-based reason!) what I've set as my default
charset.
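
The two options I am asking for (respect my choice, or at least warn)
could be sketched like this. The function name resolve_charset and the
honour_choice switch are hypothetical, purely to illustrate the idea:

```python
import warnings


def resolve_charset(body: str, chosen_charset: str, honour_choice: bool = True) -> str:
    """Pick the charset label, never downgrading silently."""
    needs_8bit = any(octet > 127 for octet in body.encode(chosen_charset))
    if needs_8bit or honour_choice:
        return chosen_charset
    # Downgrading is still possible, but the user is told about it.
    warnings.warn(
        f"charset {chosen_charset!r} replaced with 'us-ascii': "
        "body has no octets above 127"
    )
    return "us-ascii"


print(resolve_charset("Plain English only", "koi8-r"))
```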

Would you agree with that?..

My apology for this letter being rather long, - at least I hope it is
not completely boring for everybody :).

Maksym.

-- 
Maksym Kozub, MK881-UANIC mailto: [EMAIL PROTECTED]



