Hi there!
On 8 Oct 99, at 21:58, Keith Russell wrote
about "Re: Tips Language, 1.36 release":
> >> Вы можете легко выбрать "любим
> >>
> >> What's up?
>
> > It's in Russian:-)) "You may easily select the belov.....":-)
> > Must be a bug:-(
>
> Okay, so how do I view the Russian? When I viewed Alexander's message
> with The Bat! 1.36 at work, I saw very nice Cyrillic characters.
> However, Tip of the Day was displayed as pure garbage. I tried
> switching to every Cyrillic encoding, and every encoding displayed garbage.
Okay, I'll try to explain <sigh>:-)) It's not simple to explain
anyway, it's more of a "trial and error" thing IMHO, and besides,
IMO it's usually *pretty* hard to explain these 8-bit matters to an
English speaker (maybe too proud of your own language? no
personal offence, please:-))).
So, plain English text (actually, each Latin letter) is treated
in the same way everywhere: these letters are internally
represented as decimal values between 0 and 127. That is,
when your computer wants to print out the letter "A", it actually
looks up the character numbered 65 (decimal) in
the font *currently used*. In all the fonts *I* know, Latin letters
occupy *exactly the same* places, so whatever font you use,
your computer will be able to print them correctly. More
generally, the set of characters with decimal values
between 0 and 127 is referred to as the "plain ASCII set", and
these characters are always there at your fingertips.
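
A small sketch of the point above, in Python (my choice for
illustration only, it has nothing to do with how TB works inside).
The list of encodings is just a sample I picked; the claim being
checked is that printable ASCII bytes come out the same under each
of them.

```python
# The letter "A" is stored as the number 65 in plain ASCII.
code_for_a = ord("A")

# Sample single-byte encodings, chosen for illustration only.
ENCODINGS = ["ascii", "cp1252", "cp1251", "koi8_r", "cp866"]

# All printable ASCII byte values (32..126).
ascii_bytes = bytes(range(32, 127))

# Decode the same bytes under every encoding and compare the results.
decoded = {enc: ascii_bytes.decode(enc) for enc in ENCODINGS}
all_same = len(set(decoded.values())) == 1

print(code_for_a, all_same)
```

If the encodings disagreed anywhere in this range, `all_same` would
be false; for the ones listed here it is not.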
With the foreign (to you:-))) languages the situation is *very*
different indeed. Almost every other language uses its own
language-specific symbols, which are missing from "plain
ASCII". With European languages it's quite simple, since they
usually have only a few such characters each (for example,
the umlauted letters and that "ß" (sharp s) one for German, etc.). But for
Cyrillic languages the situation is almost fatal (for example,
Russian uses a Cyrillic-based alphabet consisting of 33
characters, *none* of which belongs to the plain ASCII set). The
situation becomes even worse when Eastern
languages (like Korean or Chinese) are concerned. They
have *more* than 256 characters each, so they cannot "map"
each character to a single byte of information, and hence they
use various kinds of two-byte encodings.
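
To put numbers on this, here is a rough Python sketch. The
encodings used (KOI8-R for Russian, GBK for Chinese, EUC-KR for
Korean) are only examples I picked; other encodings exist for each
of these languages.

```python
# The 33 letters of the Russian alphabet (lower case).
russian_alphabet = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
koi8 = russian_alphabet.encode("koi8_r")

letter_count = len(russian_alphabet)
# One byte per letter, and every byte lies outside plain ASCII (>= 128).
one_byte_each = len(koi8) == letter_count
all_high = all(b >= 128 for b in koi8)

# A single Chinese or Korean character needs two bytes in these encodings.
chinese_len = len("中".encode("gbk"))
korean_len = len("한".encode("euc_kr"))

print(letter_count, one_byte_each, all_high, chinese_len, korean_len)
```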
Let's restrict ourselves to the case of 8-bit languages (actually,
I don't know enough about the 2-byte case:-))). Nowadays
almost every font in common use has the so-called
"script mechanism", which works as follows:
1. for the characters #0--127 the font is identical to plain ASCII, for
*every* script
2. for the high-bit characters (128--255) it's *different* for
every script, that is, when you choose, say, "Arial" with
"Western" script, you'll get Western-European umlauts etc. at
the positions 128--255, but when you choose the same font
with "Cyrillic" script, you'll find Cyrillic letters there.
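
The two rules above can be imitated in Python. As an assumption
for this sketch, I use the Windows code pages cp1252 and cp1251 as
stand-ins for the "Western" and "Cyrillic" scripts; the font
machinery itself is not modelled here.

```python
# Rule 1: bytes below 128 mean the same thing under both "scripts".
low_bytes = b"Arial"
low_western = low_bytes.decode("cp1252")
low_cyrillic = low_bytes.decode("cp1251")

# Rule 2: bytes from 128 up are read differently by each "script".
high_bytes = bytes([0xC0, 0xE4, 0xFF])
high_western = high_bytes.decode("cp1252")   # Western letters with accents
high_cyrillic = high_bytes.decode("cp1251")  # Cyrillic letters

print(low_western == low_cyrillic)
print(high_western, high_cyrillic)
```

The same three byte values come out as "Àäÿ" under the Western
stand-in and as "Адя" under the Cyrillic one.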
Now let's look closer at how TB uses this technique. Suppose
somebody (say, me) sends you a message in Russian. The
standard "encoding"[1] for e-mail in Russian is KOI8-R, which
is stated in the message headers (kludges in TB's terminology:-))).
On your side TB "sees" this string in the headers and
*automatically* switches the font script used to display the
message to "Cyrillic". *But* if the font you're in the
habit of using doesn't contain *this particular script*, the switch
will fail, and then *you* will fail to see the Russian text correctly.
So we have the answer to your first question: apparently, you are
using *different* fonts at home and at work, one *with* Cyrillic
support (i.e. script), the other without it.
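
For the curious, here is roughly what "seeing the string in the
headers" amounts to, sketched with Python's standard email module.
This is not TB's code; the message, the address and the greeting
are made up for the example.

```python
from email import message_from_bytes

# A made-up message: the Content-Type header names the charset,
# and the body is Russian text stored as KOI8-R bytes.
raw = (
    b"From: alex@example.invalid\r\n"
    b"Subject: test\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=koi8-r\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
) + "Привет".encode("koi8_r")

msg = message_from_bytes(raw)

# Read the charset from the headers, then use it to decode the body.
charset = msg.get_content_charset()
body = msg.get_payload(decode=True).decode(charset)

print(charset, body)
```

A mail client then still has to find a font that can actually draw
the decoded letters, which is the step that fails when the font has
no Cyrillic script.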
The tips file is a completely different case. TB has no
control over what font script is used to display the "tips window",
so it uses the font+script pair which is the default for your
system (apparently, in your case the script is Western).
Obviously, Russian text displayed using the wrong font
script looks like gibberish:-))
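
The gibberish effect is easy to reproduce in Python. I don't know
how the tips file is really stored, so the sketch below simply
*assumes* Windows-1251 bytes being shown with a Western
(Windows-1252) interpretation.

```python
tip = "Вы можете легко выбрать"

# Assumption: the text is stored as Windows-1251 bytes.
stored = tip.encode("cp1251")

# Shown with the wrong (Western) interpretation, it turns into gibberish.
shown = stored.decode("cp1252")

# The bytes themselves are intact, so the process can be reversed.
recovered = shown.encode("cp1252").decode("cp1251")

print(shown)
print(recovered == tip)
```

Nothing is lost in the file itself: the bytes are fine, they are
just being drawn with the wrong set of letters.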
Well, HTH:-)
[1] See RFC 2047 if you're interested in these matters.
> Now after installing the new version at home, both Alexander's example
> AND the tips are equally unreadable....
SY, Alex
(St.Petersburg, Russia)
--
Thought for the day:
Everything in excess! To enjoy the flavor of life, take big
bites. Moderation is for monks.
---
PGP public keys on keyservers:
0xA2194BF9 (RSA); 0x214135A2 (DH/DSS)
fingerprints:
F222 4AEF EC9F 5FA6 7515 910A 2429 9CB1 (RSA)
A677 81C9 48CF 16D1 B589 9D33 E7D5 675F 2141 35A2 (DH/DSS)
---