Hi there!

On 8 Oct 99, at 21:58, Keith Russell wrote
    about "Re: Tips Language, 1.36 release":

> >> Вы можете легко выбрать "любим
> >> 
> >> What's up?
> 
> > It's in Russian:-)) "You may easily select the belov.....":-)
> > Must be a bug:-(
> 
> Okay, so how do I view the Russian? When I viewed Alexander's message
> with The Bat! 1.36 at work, I saw very nice Cyrillic characters.
> However, Tip of the Day was displayed as pure garbage. I tried
> switching to every Cyrillic encoding, and every encoding displayed garbage.

Okay, I'll try to explain <sigh>:-)) It's not simple to explain
anyway; it's more of a "trial and error" thing IMHO, and besides,
it's usually *pretty* hard to explain these 8-bit matters to an
English speaker (maybe too proud of your own language? No
personal offence, please:-))).

So, plain English text (actually, each Latin letter) is treated
in the same way everywhere. These letters are internally
represented as decimal values between 0 and 127; that is,
when your computer wants to print out the letter "A", it actually
looks up the character numbered 65 (decimal) in
the font *currently used*. In all the fonts *I* know, Latin letters
occupy *exactly the same* places, so whatever font you use,
your computer will be able to print them correctly. More
generally, the set of characters with decimal values
between 0 and 127 is referred to as the "plain ASCII set", and
these characters are always there at your fingertips.
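
To make the "65" point concrete, here is a small Python sketch (Python is just my pick for illustration, it has nothing to do with TB itself). It assumes Python's standard codec names for a few Western and Cyrillic 8-bit code pages and checks that the values 0 to 127 decode to the same characters in all of them.

```python
# Illustration only: these are Python's standard codec names, chosen
# here as examples of Western and Cyrillic 8-bit code pages.
CODECS = ["ascii", "latin_1", "cp1252", "cp1251", "koi8_r"]


def decode_low_half(codec):
    """Decode the byte values 0..127 with the given codec."""
    return bytes(range(128)).decode(codec)


# The number the computer actually stores for the letter "A".
letter_code = ord("A")

# Decode the lower half under every codec and compare the results.
low_halves = {name: decode_low_half(name) for name in CODECS}
all_agree = len(set(low_halves.values())) == 1

print(letter_code)
print(all_agree)
```

If `all_agree` comes out true, the lower half of the table really is shared, which is why Latin text survives any of these encodings.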

With the foreign (to you:-))) languages the situation is *very*
different indeed. Almost every other language uses its own
language-specific symbols, which are missing from "plain
ASCII". With Western European languages it's quite simple, since they
usually have only a few such characters each (for example,
the umlauted letters and the sharp "ss" for German, etc.). But for
languages written in Cyrillic the situation is almost fatal (for example,
Russian uses a Cyrillic-based alphabet consisting of 33
characters, *none* of which belongs to the plain ASCII set). The
situation becomes even worse when East Asian
languages (like Korean or Chinese) are concerned. They
have *more* than 256 characters each, so they cannot "map"
each character to a single byte of information, and hence they
use various kinds of two-byte encodings.
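
A rough Python sketch of the byte counts, for the curious. The codec names (`koi8_r`, `gb2312`, `euc_kr`) are Python's standard ones and are only my examples of one-byte and two-byte encodings; the helper names are made up for this illustration.

```python
# Illustration only: helper names are hypothetical, codec names are
# Python's standard ones.
RUSSIAN_ALPHABET = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"


def fits_in_ascii(ch):
    """True if the character has a place in plain ASCII (0..127)."""
    try:
        ch.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


def byte_length(ch, codec):
    """How many bytes one character takes in the given encoding."""
    return len(ch.encode(codec))


alphabet_size = len(RUSSIAN_ALPHABET)
none_in_ascii = not any(fits_in_ascii(c) for c in RUSSIAN_ALPHABET)

# One byte per letter is enough for Russian in an 8-bit encoding...
russian_bytes = byte_length("я", "koi8_r")
# ...but Chinese and Korean characters need two bytes each here.
chinese_bytes = byte_length("中", "gb2312")
korean_bytes = byte_length("한", "euc_kr")

print(alphabet_size, none_in_ascii)
print(russian_bytes, chinese_bytes, korean_bytes)
```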

Let's restrict ourselves to the case of 8-bit languages (actually, 
I don't know enough about the 2-byte case:-))). Nowadays 
almost every font used by the public has the so-called 
"scripting mechanism", which works as follows:
1. at positions 0--127 the font matches plain ASCII, for
*every* script
2. at the high-bit positions (128--255) it's *different* for
every script; that is, when you choose, say, "Arial" with the
"Western" script, you'll get Western European umlauts etc. at
positions 128--255, but when you choose the same font
with the "Cyrillic" script, you'll find Cyrillic letters there.
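
The two rules above can be sketched in Python like this. Note the assumption: I let the Windows code pages cp1252 and cp1251 stand in for the "Western" and "Cyrillic" scripts; the `SCRIPTS` table and `glyph_at` helper are made up for the illustration and are not how a font is really queried.

```python
# Assumption: cp1252 plays the role of the "Western" script and
# cp1251 the role of the "Cyrillic" script. Names are hypothetical.
SCRIPTS = {"Western": "cp1252", "Cyrillic": "cp1251"}


def glyph_at(position, script):
    """Character found at a given position (0..255) under a script."""
    return bytes([position]).decode(SCRIPTS[script])


# Rule 1: a position below 128 gives the same letter in both scripts.
low_western = glyph_at(0x41, "Western")
low_cyrillic = glyph_at(0x41, "Cyrillic")

# Rule 2: a position above 127 depends on the script chosen.
high_western = glyph_at(0xE0, "Western")    # Latin a with grave accent
high_cyrillic = glyph_at(0xE0, "Cyrillic")  # Cyrillic small letter a

print(low_western, low_cyrillic)
print(high_western, high_cyrillic)
```

The last two characters may look alike on screen, but they are different letters sitting at the same position.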

Now let's look closer at how TB uses this technique. Suppose
somebody (say, me) sends you a message in Russian. The
standard "encoding"[1] for e-mail in Russian is KOI8-R, which
is stated in the message headers (kludges in TB's terminology:-))).
On your side, TB "sees" this string in the headers and
*automatically* switches the font script used to display the
message to "Cyrillic". *But* if the font you're in the
habit of using doesn't contain *this particular script*, the switch fails,
and then *you* fail to see the Russian text correctly. So we
have the answer to your first question: apparently, you are
using *different* fonts at home and at work, one *with* Cyrillic
support (i.e. script), the other without it.
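
Here is a minimal sketch of the "read the charset from the headers, then pick a script" step, using Python's standard `email` module. I don't know TB's internals, so the `SCRIPT_FOR_CHARSET` table is purely hypothetical, and the message itself (addresses, subject, body) is made up for the example.

```python
from email import message_from_bytes

# Hypothetical lookup table for illustration; NOT TB's real logic.
SCRIPT_FOR_CHARSET = {
    "koi8-r": "Cyrillic",
    "windows-1251": "Cyrillic",
    "iso-8859-1": "Western",
    "us-ascii": "Western",
}

# A made-up 8-bit message whose headers declare KOI8-R.
raw = (
    b"From: alex@example.org\r\n"
    b"To: keith@example.org\r\n"
    b"Subject: test\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=koi8-r\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n" + "Привет!".encode("koi8_r") + b"\r\n"
)

msg = message_from_bytes(raw)

# The charset named in the Content-Type header (lower-cased).
charset = msg.get_content_charset("us-ascii")

# The script a mail client might switch to for this charset.
script = SCRIPT_FOR_CHARSET.get(charset, "Western")

# Decode the body bytes using the declared charset.
text = msg.get_payload(decode=True).decode(charset).strip()

print(charset, script, text)
```

Whether the decoded text is then *visible* still depends on the font having Cyrillic glyphs, which is exactly the part that differs between two machines.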

The tips file is a completely different case. TB has no
control over which font script is used to display the "tips window",
so it uses the font+script pair that is the default for your
system (apparently, in your case the script is Western).
Obviously, Russian text displayed using the wrong font
script looks like gibberish:-))
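
You can reproduce the gibberish yourself with a few lines of Python. The assumptions here are mine: that the Russian tips text is stored as cp1251 bytes and that the "Western" default behaves like cp1252. I don't know what encoding the real tips file uses.

```python
# Assumptions (not confirmed anywhere): the text is stored as cp1251
# and the default "Western" script corresponds to cp1252.
original = "Вы можете легко выбрать"

# The bytes as they would sit in the file.
stored_bytes = original.encode("cp1251")

# The same bytes shown with the wrong (Western) script.
garbled = stored_bytes.decode("cp1252")

# Nothing is lost: reading the same bytes as Cyrillic restores it.
recovered = garbled.encode("cp1252").decode("cp1251")

print(garbled)
print(recovered == original)
```

So the text itself is intact; only the choice of script for display is wrong.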

Well, HTH:-)

[1] See RFC 2047 (and RFC 1489 for KOI8-R itself) if you're interested in these matters.

> Now after installing the new version at home, both Alexander's example
> AND the tips are equally unreadable....


SY, Alex
(St.Petersburg, Russia)
-- 
Thought for the day:
  Everything in excess! To enjoy the flavor of life, take big
  bites. Moderation is for monks.

--- 
PGP public keys on keyservers:
0xA2194BF9 (RSA);   0x214135A2 (DH/DSS)
fingerprints:
F222 4AEF EC9F 5FA6  7515 910A 2429 9CB1 (RSA)
A677 81C9 48CF 16D1 B589  9D33 E7D5 675F 2141 35A2 (DH/DSS) 
--- 
