I for one am so glad we now have Unicode.
I remember when in pre-Unicode days my then-girlfriend was writing a PhD
thesis in German about Russian linguistics. She had fonts for both
alphabets, but due to technical limitations the different letters had to
share the same code points. And at one point somehow the correct
formatting got lost in her word processor... A complete and utter disaster!
You are not serious, are you?
Charlie
* Naena Guru <naenag...@gmail.com> [2013-01-08 22:56]:
Thank you for commenting and Happy New Year.
CP-1252 is a perfectly legal web character set, and nobody is going to
argue with you if you want to use it in legal ways. (I.e. writing
Latin script in it, not Sinhala.) But .
Okay, what is implied is I am doing something illegal. Define what I
am doing that is illegal and cite the rule and its purpose of
preventing what harm to whom.
May I ask if the following two are Latin script, English or Singhala?
1. This is written in English.
2. mee laþingaþa síhalayi.
For me, both are Latin script and 1 is English and 2 is Singhala
(it says, 'this is romanized Singhala').
The following are the *only* Singhala language web pages that pass
HTML validation (Challenge me):
http://www.lovatasinhala.com/
They are in romanized Singhala.
The statement,
the death of most character sets makes everyone's systems smaller
and faster
is *FALSE*. Compare the sizes of the following two files that are
copies of a newspaper article. The top part in red has a few more words
in romanized Singhala in the romanized Singhala file. Notice the size
of each file:
1. http://ahangama.com/jc/uniSinDemo.htm size: 38,092 bytes
2. http://ahangama.com/jc/RSDemo.htm size: 18,922 bytes
As the size of the page grows, the Unicode Sinhala file tends toward
double the size of its romanized Singhala version. Unicode Sinhala
characters become 50% larger when UTF-8 encoded for transmission (three
bytes each instead of two in UTF-16). That is three times the size of
the romanized Singhala file. So, the Unicode Sinhala file consumes 3
times the bandwidth needed to send the romanized Singhala file.
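The per-character byte costs behind this comparison can be checked with a short Python sketch. The sample strings are assumptions chosen for illustration: the romanized sentence is the one quoted earlier in this message, and the Sinhala sample is simply five code points from the Unicode Sinhala block. Neither is taken from the two demo files, so this shows bytes per character, not the file sizes listed above.

```python
# Sketch: bytes needed per encoding for short samples (not the demo files).

# "mee laþingaþa síhalayi." written with escapes for þ (U+00FE) and í (U+00ED)
romanized = "mee la\u00feinga\u00fea s\u00edhalayi."
# Five code points from the Sinhala block (U+0D80..U+0DFF), illustrative only
sinhala = "\u0dc3\u0dd2\u0d82\u0dc4\u0dbd"


def byte_len(text: str, encoding: str) -> int:
    """Return the number of bytes `text` occupies in `encoding`."""
    return len(text.encode(encoding))


romanized_cp1252 = byte_len(romanized, "cp1252")  # 1 byte per character
romanized_utf8 = byte_len(romanized, "utf-8")     # þ and í take 2 bytes each
sinhala_utf16 = byte_len(sinhala, "utf-16-le")    # 2 bytes per code point
sinhala_utf8 = byte_len(sinhala, "utf-8")         # 3 bytes per code point

print(romanized_cp1252, romanized_utf8, sinhala_utf16, sinhala_utf8)
```

The overall ratio between two real pages also depends on how many characters each form needs for the same text and on how much of the file is markup, which this sketch does not measure.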
more likely to correctly show them the document instead of trash
Again *demonstrably WRONG*: Unicode Sinhala is trash on a machine that
does not have the fonts. It is trash also if the font used by the OS
is improperly made, such as on the iPhone. It is generally trash because
the SLS1134 standard corrupts at least one writing convention (the Brandy
issue). On the other hand, romanized Singhala is always readable
whether you have the font or not. It is not helpful to criticize
Singhala related things without making a serious effort to understand
the issues. Blind men thought different things about the elephant.
If you mean that everyone should start using 16-bit Unicode
characters, I have no objection to that. It would happen if and
when all applications implement it. I cannot fight that even if I want
to. But I do not see users of English doing anything different to what
they are doing now, like my typing now, I think, using 8-bit
characters. (I can verify that by copying it and pasting into a text
editor.)
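One way to make that check without a text editor is sketched below in Python. The sample sentence and the helper name `fits_in_one_byte_per_char` are assumptions for illustration only.

```python
def fits_in_one_byte_per_char(text: str, encoding: str = "ascii") -> bool:
    """True if every character of `text` encodes to exactly one byte."""
    try:
        return len(text.encode(encoding)) == len(text)
    except UnicodeEncodeError:
        # At least one character does not exist in this encoding at all.
        return False


sample = "This is written in English."
print(fits_in_one_byte_per_char(sample))           # plain ASCII text
print(fits_in_one_byte_per_char(sample, "utf-8"))  # same bytes in UTF-8
```

Text that stays within ASCII produces the same bytes in ASCII, CP-1252, and UTF-8, so the check gives the same answer for all three.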
I showed that the Singhala can be romanized and all the problems of
ill-conceived Unicode Indic can be eliminated by carefully studying
the grammar of the language and romanizing. (I used the word
'transliterate' earlier, but the correct word is transcribe). I did it
for Singhala and made an OpenType font to show it perfectly in the
traditional Singhala script. So far, there are one RS smartfont and six
Unicode fonts, even after spending $20M for a foreign expert to tell how
to make fonts, though that information is right on the web in the same
language the expert spoke in.
My work irritates some, maybe because it is an affront to their belief
that they know all and decide all. Some feel let down that they could
not think of it earlier, and maybe write about a strange discovery
like abugida and write a book on the nonsense. Most of all, I think it
is just a cultural block on this side of the globe.
As for Lankan technocrats, their worry is that the purpose of ICTA
would come unraveled. I went there in November and it was revealed to
me (by one of its employees) that its purpose is to provide a single
point of contact for foreign vendors that can use local experts as
their advocates.
On Thu, Jan 3, 2013 at 12:56 AM, Leif Halvard Silli
<xn--mlform-...@xn--mlform-iua.no> wrote:
Asmus Freytag, Mon, 31 Dec 2012 06:44:44 -0800:
> On 12/31/2012 3:27 AM, Leif Halvard Silli wrote:
>> Asmus Freytag, Sun, 30 Dec 2012 17:05:56 -0800:
>> The Web archive for this very list, needs a fix as well …
>
>
> The way to formally request any action by the Unicode Consortium is
> via the contact form (found on the home page).
Good idea. Done!
It turned out to be, it seems to me, only an issue of mislabeling the
monthly index pages as ISO-8859-1 instead of UTF-8, whereas the
messages themselves are archived correctly. I therefore requested
that they properly label the index pages.
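The effect of that kind of mislabeling can be reproduced with a small Python sketch. The sample word is an assumption chosen because it contains one non-ASCII letter; it is not taken from the archive pages themselves.

```python
# Sketch: UTF-8 bytes read back under the wrong label (ISO-8859-1).
text = "m\u00e5lform"                    # sample word containing the letter å
raw = text.encode("utf-8")               # how the page is actually stored

mislabeled = raw.decode("iso-8859-1")    # what a reader sees with the wrong label
correct = raw.decode("utf-8")            # what a reader sees with the right label

print(mislabeled)
print(correct)
```

The bytes are identical in both cases; only the declared encoding changes, which is why relabeling the index pages is enough and the stored messages need no conversion.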
Happy new year!
--
leif h silli