I for one am so glad we now have Unicode.

I remember when in pre-Unicode days my then-girlfriend was writing a PhD thesis in German about Russian linguistics. She had fonts for both alphabets, but due to technical limitations the different letters had to share the same code points. And at one point somehow the correct formatting got lost in her word processor... A complete and utter disaster!

You are not serious, are you?

Charlie


* Naena Guru <naenag...@gmail.com> [2013-01-08 22:56]:

Thank you for commenting and Happy New Year.

    CP-1252 is a perfectly legal web character set, and nobody is going to
    argue with you if you want to use it in legal ways. (I.e. writing
    Latin script in it, not Sinhala.) But .

Okay, the implication is that I am doing something illegal. Define what I am doing that is illegal, cite the rule, and state its purpose: what harm does it prevent, and to whom?

May I ask if the following two are Latin script, English or Singhala?

    1. This is written in English.
    2. mee laþingaþa síhalayi.


For me, both are Latin script; 1 is English and 2 is Singhala (it says, 'this is romanized Singhala').

The following are the *only* Singhala language web pages that pass HTML validation (challenge me):
http://www.lovatasinhala.com/
They are in romanized Singhala.

The statement,

    the death of most character sets makes everyone's systems smaller
    and faster

is *FALSE*. Compare the sizes of the following two files, which are copies of a newspaper article. The top part in red has a few more words in the romanized Singhala file. Notice the size of each file:

    1. http://ahangama.com/jc/uniSinDemo.htm size: 38,092 bytes
    2. http://ahangama.com/jc/RSDemo.htm size: 18,922 bytes

As the size of the page grows, the Unicode Sinhala version tends toward double the size of its romanized Singhala version. Unicode Sinhala characters become 50% larger when UTF-8 encoded for transmission. That is three times the size of the romanized Singhala file. So the Unicode Sinhala file consumes 3 times the bandwidth needed to send the romanized Singhala file.
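The per-character byte counts behind this comparison can be checked with a short Python sketch. It measures raw text only, not whole HTML files, and the Sinhala string is a hypothetical five-code-point sample from the Sinhala block (U+0D80 to U+0DFF), not a transcription of the romanized sentence quoted above.

```python
def encoded_size(text: str, encoding: str) -> int:
    """Return how many bytes `text` occupies in the given encoding."""
    return len(text.encode(encoding))


# Romanized sentence quoted earlier in this message (Latin letters only).
romanized = "mee laþingaþa síhalayi."

# Hypothetical sample: five code points from the Sinhala block.
# It is NOT a transcription of the romanized sentence.
sinhala_sample = "\u0dc3\u0dd2\u0d82\u0dc4\u0dbd"

for label, text, encoding in [
    ("romanized, CP-1252", romanized, "cp1252"),
    ("romanized, UTF-8", romanized, "utf-8"),
    ("Sinhala sample, UTF-16", sinhala_sample, "utf-16-le"),
    ("Sinhala sample, UTF-8", sinhala_sample, "utf-8"),
]:
    size = encoded_size(text, encoding)
    print(f"{label}: {len(text)} code points, {size} bytes")
```

Each code point in the Sinhala block takes 2 bytes in UTF-16 and 3 bytes in UTF-8, which is where the 50% figure comes from. Measured file sizes also depend on markup and on any compression applied in transit, which this sketch does not model.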

    more likely to correctly show them the document instead of trash

Again *demonstrably WRONG*: Unicode Sinhala is trash on a machine that does not have the fonts. It is also trash if the font used by the OS is improperly made, as on the iPhone. It is generally trash because the SLS1134 standard corrupts at least one writing convention (the Brandy issue). On the other hand, romanized Singhala is always readable whether you have the font or not. It is not helpful to criticize Singhala-related things without making a serious effort to understand the issues. Blind men thought different things about the elephant.

If you mean that everyone should start using 16-bit Unicode characters, I have no objection to that. It would happen if and when all applications implement it. I cannot fight that even if I want to. But I do not see users of English doing anything different from what they are doing now, like my typing now, I think, using 8-bit characters. (I can verify that by copying it and pasting it into a text editor.)
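The copy-and-paste check mentioned above can also be approximated in Python. This is a minimal sketch that inspects a string only; it says nothing about what a particular mail client or editor actually sends. The helper name `is_one_byte_per_char` is made up for illustration.

```python
def is_one_byte_per_char(text: str) -> bool:
    """True when UTF-8 stores every character of `text` in exactly one byte.

    That holds only for pure ASCII text, because every non-ASCII
    character needs at least two bytes in UTF-8.
    """
    return len(text.encode("utf-8")) == len(text)


english = "This is written in English."
romanized = "mee laþingaþa síhalayi."

print(is_one_byte_per_char(english))    # plain English letters and punctuation
print(is_one_byte_per_char(romanized))  # contains þ and í

# The same English text measured as 16-bit code units.
print(len(english.encode("utf-16-le")), "bytes in UTF-16")
```

Plain English text is byte-for-byte the same in ASCII, CP-1252, and UTF-8, so moving to UTF-8 changes nothing for it; only UTF-16 doubles its size.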

I showed that Singhala can be romanized, and that all the problems of ill-conceived Unicode Indic can be eliminated by carefully studying the grammar of the language and romanizing it. (I used the word 'transliterate' earlier, but the correct word is 'transcribe'.) I did it for Singhala and made an OpenType font to show it perfectly in the traditional Singhala script. So far there is one RS smartfont and six Unicode fonts, even after $20M was spent on a foreign expert to explain how to make fonts, though that information is right on the web in the same language the expert spoke.

My work irritates some, maybe because it is an affront to their belief that they know all and decide all. Some feel let down that they did not think of it earlier, and would rather write about a strange discovery like Abugida and write a book on the nonsense. Most of all, I think it is just a cultural block on this side of the globe.

As for Lankan technocrats, their worry is that the purpose of ICTA would come unraveled. I went there in November and it was revealed to me (by one of its employees) that its purpose is to provide a single point of contact for foreign vendors that can use local experts as their advocates.


On Thu, Jan 3, 2013 at 12:56 AM, Leif Halvard Silli <xn--mlform-...@xn--mlform-iua.no> wrote:

    Asmus Freytag, Mon, 31 Dec 2012 06:44:44 -0800:
    > On 12/31/2012 3:27 AM, Leif Halvard Silli wrote:
    >> Asmus Freytag, Sun, 30 Dec 2012 17:05:56 -0800:
    >> The Web archive for this very list, needs a fix as well …
    >
    >
    > The way to formally request any action by the Unicode Consortium is
    > via the contact form (found on the home page).

    Good idea. Done!

    Turned out to only be - it seems to me - an issue of mislabeling the
    monthly index pages as ISO-8859-1 instead of UTF-8. Whereas the very
    messages themselves are archived correctly. And thus I made the
    request
    that they properly label the index pages.

    Happy new year!
    --
    leif h silli
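
The mislabeling described in the quoted message, UTF-8 pages declared as ISO-8859-1, can be reproduced in a few lines of Python. This is only a sketch of the general effect on a sample word taken from this thread; it does not use the actual archive pages.

```python
# A word from this thread containing one non-ASCII letter.
original = "síhalayi"

# The bytes as a correctly encoded UTF-8 page would store them.
utf8_bytes = original.encode("utf-8")

# What a reader sees when those bytes are labeled ISO-8859-1.
mislabeled = utf8_bytes.decode("iso-8859-1")
print(repr(mislabeled))

# The stored bytes are intact, so correcting the label recovers the text.
recovered = mislabeled.encode("iso-8859-1").decode("utf-8")
print(recovered == original)
```

Because only the label is wrong, nothing needs re-encoding; declaring the right charset is enough, which matches the request made to the list maintainers.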


