Re: ISO 8859-11 (Thai) cross-mapping table

2002-10-18 Thread Arthit Suriyawongkul
Doug Ewell wrote:
> 
> Arthit Suriyawongkul  wrote:
> 
> >> These 9 code positions (0xA0, 0xDB..0xDE, 0xFC..0xFF) appear to be
> >> undefined in TIS 620.2533.  Reference [3] below does show a "word
> >> separator character" at 0xDC, which I interpret as U+200B ZERO WIDTH
> >> SPACE, but the other positions are still undefined.  So this may not
> >
> > does it mean 0xA0 / U+00A0 ?
> 
> As you mentioned in your other message, ISO/IEC 8859-11 does assign
> U+00A0 NO-BREAK SPACE to code position 0xA0, but TIS 620 does not.
> 
> If you are asking whether the "word separator character" I mentioned
> should be U+00A0, I don't think that would be a good idea for Thai since
> U+00A0 is supposed to be rendered as a visible space (like U+0020) but
> without breaking lines between the surrounding words.  My understanding
> is that visible spaces are not used in Thai.  So U+00A0 would do exactly
> the wrong thing for Thai, whereas U+200B would do the right thing
> (indicating a word boundary without displaying a space).  This is also
> officially recommended by the Unicode Standard.
> 
> -Doug Ewell
>  Fullerton, California


I think your idea is better, Doug :)

The "word separator character" should be invisible; you're right.

thanks,
art
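
For anyone who wants to check the difference between the two candidates discussed above, here is a minimal Python sketch (not part of the original exchange). It only uses the standard `unicodedata` module; the two Thai words and the helper name `describe` are chosen purely for illustration.

```python
import unicodedata

NBSP = "\u00a0"   # NO-BREAK SPACE: rendered as a visible space
ZWSP = "\u200b"   # ZERO WIDTH SPACE: marks a boundary, shows nothing

def describe(ch):
    """Return (code point, Unicode name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

# Two Thai words, used here only as sample data.
words = ["ภาษา", "ไทย"]
with_zwsp = ZWSP.join(words)   # word boundary marked, no visible gap added
with_nbsp = NBSP.join(words)   # visible gap added, line break forbidden

if __name__ == "__main__":
    print(describe(NBSP))
    print(describe(ZWSP))
    # The boundary can be recovered from the ZWSP-joined text.
    print(with_zwsp.split(ZWSP))
```

Whether a given renderer actually breaks lines at U+200B is up to that renderer; the sketch only shows the character properties.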




Re: ISO 8859-11 (Thai) cross-mapping table

2002-10-07 Thread Arthit Suriyawongkul

>>>You can get Unicode-format mapping tables for TIS 620 and many other
>>>encodings at http://crl.nmsu.edu/~mleisher/csets.html
>>
>>Thanks. Looking at that, it appears the mapping is imperfect. There
>>are about 10 characters in TIS-620 that are mapped to the Unicode
>>replacement character. This is from 1998 though. Has Unicode's Thai
>>support improved any in later versions?
> 
> These 9 code positions (0xA0, 0xDB..0xDE, 0xFC..0xFF) appear to be
> undefined in TIS 620.2533.  Reference [3] below does show a "word
> separator character" at 0xDC, which I interpret as U+200B ZERO WIDTH
> SPACE, but the other positions are still undefined.  So this may not be

Does that mean 0xA0 / U+00A0?

regards,
Art





Re: ISO 8859-11 (Thai) cross-mapping table

2002-10-06 Thread Arthit Suriyawongkul

Dear Elliotte,

ISO-8859-11 has been finished for almost a year :)

For reference:
ISO/IEC 8859-11, 8-bit single-byte coded graphic character sets -- Part 11:
Latin/Thai alphabet
http://www.iso.ch/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=28263&ICS1=35&ICS2=40&ICS3=

or see its draft at
http://www.nectec.or.th/it-standards/iso8859-11/


ISO-8859-11 = TIS-620 + { NBSP }

NBSP = Non-Breaking Space (0xA0 / U+00A0)
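
The relation above can be checked against the codec tables that ship with Python. This is only a sketch: it assumes the standard `iso8859_11` and `tis_620` codecs, and the helper name `undefined_positions` is made up for the example. It says what Python's tables contain, not what the standards documents themselves say.

```python
def undefined_positions(codec):
    """Return the byte values that `codec` refuses to decode."""
    missing = []
    for value in range(256):
        try:
            bytes([value]).decode(codec)
        except UnicodeDecodeError:
            missing.append(value)
    return missing

iso_missing = undefined_positions("iso8859_11")
tis_missing = undefined_positions("tis_620")

# Positions that are undefined only in Python's tis_620 table.
only_tis = sorted(set(tis_missing) - set(iso_missing))

if __name__ == "__main__":
    print("iso8859_11 undefined:", [hex(v) for v in iso_missing])
    print("extra in tis_620:    ", [hex(v) for v in only_tis])
```

In the CPython versions I have looked at, the second line lists just 0xa0, which matches the "TIS-620 + { NBSP }" description; treat that as an observation about those tables rather than a guarantee.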



As for those "REPLACEMENT CHARACTER" entries in the tables at
http://crl.nmsu.edu/~mleisher/csets.html:
IMHO, they should be replaced with "UNDEFINED",

since the Unicode mapping for CP874 (another Thai encoding,
almost the same as Windows-874; both are TIS-620 variants)
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP874.TXT
also uses "UNDEFINED".
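
To show why "UNDEFINED" and "REPLACEMENT CHARACTER" are different things for a consumer of such a table, here is a small Python sketch. The `SAMPLE` text is a hand-written excerpt in what I assume is the usual layout of those mapping files (byte, code point, `#` comment), not a copy of any real file, and `parse_mapping` is a hypothetical helper.

```python
SAMPLE = """\
# Hand-written excerpt, for illustration only
0x80\t0x20AC\t#EURO SIGN
0x81\t      \t#UNDEFINED
0xA0\t0x00A0\t#NO-BREAK SPACE
0xA1\t0x0E01\t#THAI CHARACTER KO KAI
"""

def parse_mapping(text):
    """Map byte value -> character, or None where the table leaves it undefined."""
    table = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if not fields:
            continue                      # blank line or comment-only line
        byte = int(fields[0], 16)
        # A missing second column means "no mapping", which is not the
        # same as mapping the byte to U+FFFD.
        table[byte] = chr(int(fields[1], 16)) if len(fields) > 1 else None
    return table

if __name__ == "__main__":
    for byte, ch in parse_mapping(SAMPLE).items():
        print(hex(byte), "UNDEFINED" if ch is None else f"U+{ord(ch):04X}")
```

Keeping `None` for undefined positions lets an encoder tell "this byte has no meaning" apart from "this byte means U+FFFD".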


You may be interested in some comments at
http://bugzilla.mozilla.org/show_bug.cgi?id=127755

:)

regards,
Art


Elliotte Rusty Harold wrote:
> The Unicode data files at 
> http://www.unicode.org/Public/MAPPINGS/ISO8859/ do not include a mapping 
> for ISO-8859-11, Thai. Is there any particular reason for this? Is 
> ISO-8859-11 unfinished or deprecated or unable to be mapped to Unicode 
> or some such? If none of these things are true, is there a mapping chart 
> for this set somewhere?
> 
> I'm working on adding output support for the ISO sets to XOM 
> (http://www.cafeconleche.org/XOM/) and need the mapping tables to figure 
> out which characters need to be escaped in which character sets.
> 
> -- 
> Elliotte Harold
> 
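
As a rough illustration of the "which characters need to be escaped" question, here is a Python sketch. It is not XOM code (XOM is a Java library); it leans on Python's own codec tables and the standard `xmlcharrefreplace` error handler, and the helper names are invented for the example.

```python
def needs_escape(ch, codec):
    """True if `ch` has no byte in `codec` and must be written as &#N;."""
    try:
        ch.encode(codec)
        return False
    except UnicodeEncodeError:
        return True

def escape_for(text, codec):
    """Encode `text`, replacing unencodable characters with &#N; references."""
    return text.encode(codec, errors="xmlcharrefreplace")

if __name__ == "__main__":
    sample = "\u0e01\u20ac"   # THAI CHARACTER KO KAI followed by EURO SIGN
    print(escape_for(sample, "iso8859_11"))
    print(needs_escape("\u0e01", "latin_1"))
```

This only covers characters missing from the target character set; markup characters such as `<` and `&` still need the usual XML escaping.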







TIM - A Table-base Input Method Module

2002-07-22 Thread Arthit Suriyawongkul

Is anybody here interested in this Table-based Input Method?

  http://sourceforge.net/projects/wenju/


I got this link from gtk-i18n-list.

:)

regards,
Art


-------- Original Message --------
Subject: Re: TIM - A Table-base Input Method Module
Date: Sun, 21 Jul 2002 09:03:06 -0400
From: Daniel Yacob <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED], [EMAIL PROTECTED]

many months later...


 > I have just finished such an IM module, which you can find at
 >   http://sourceforge.net/projects/wenju/
 > I call it TIM (Table-based Input Method).  I haven't released a package
 > yet, but it is in the CVS.


I do like this idea.  If I were to give a wish list of features I'd like
to see in an IM description file, I'd no doubt end up describing what
Keyman uses, perhaps because it is what I'm most familiar with, but it
also has some nice expressive syntax.

It has occurred to me before that it would be nice to be able to import
Keyman .kmn files directly.  Has an XML definition for IMs ever been
developed?  It would *really* be nice to have some kind of universal,
vendor-independent IM definition, like Unicode is for charsets.

Could TIM be taken in this direction?  Towards an XIM?  I'd be happy to
participate in defining an XML schema for it.  Anyone interested?

cheers,

/Daniel





Re: What is TISI character Code?

2002-07-12 Thread Arthit Suriyawongkul

Dear Sreedhar,

TISI is the Thai Industrial Standards Institute; the Thai Industrial
Standard character set is TIS-620.

To make your apps support Thai, please consider conforming to these
standards:


TIS 620-2533 (1990) Standard for Thai Character Codes for Computers
UDC 681.3.04:003.62   ISBN 974-606-153-4

TIS 820-2538 (1995) Layout of Thai Character Keys on Computer Keyboards
UDC 681.3.02:003.62   ISBN 974-607-416-4

TIS 1566-2541 (1998) Thai Input/Output Methods for Computers
ICS 35.060   ISBN 974-607-898-4


Thai Industrial Standards Institute, Ministry of Industry
http://www.tisi.go.th

regards,
Art

Sreedhar.M wrote:
> Hi,
> I would like to make my application compatible with the Thai language.
> While looking into that, I heard the term "TISI character code", so I would
> like to know more about it. Please let me know if anybody has an idea
> regarding this.
> Thanks in Advance.
> with Regards,
> Sreedhar M.






Re: [OT] Again about "Burma" and "Myanmar"

2002-06-26 Thread Arthit Suriyawongkul

Marco,

Sorry that I can't answer your question,
but just for your information:

Thai people call the Burmese/Myanmar people and country "Pha-ma"
(short "Pha", long "ma").

This supports the claim that
"Burma" would be a phonetic transcription.
(The Ph sound in Thai is closer to B than to M.)

regards,
Art

Marco Cimarosti wrote:
> Some time ago, someone said that "Burma" and "Myanmar" are just different
> romanizations of the same Burmese word.
> 
> Particularly, "Burma" would be a phonetic transcription based on the English
> spelling, while "Myanmar" would be letter-by-letter transliteration of the
> Burmese spelling.
> 
> Can someone confirm or deny this? Can someone show me the word in original
> writing and a faithful (e.g. IPA) transcription of its sound?
> 
> Is it normal that Burmese "my-" sounds like "b"?
> 
> Thanks in advance.
> 
> _ Marco
> 






Re: Thai character names

2002-06-02 Thread Art - Arthit Suriyawongkul

Hi,

> ] the Thai character names look like this:
> ] 
> ]   0E01  THAI CHARACTER KO KAI
> ]   0E02  THAI CHARACTER KHO KHAI
> ]   0E03  THAI CHARACTER KHO KHUAT
> ] 
> ] I wonder what the `ko' and `kho' parts actually mean, since the `kai',
> ] `khai', etc. final parts already define an unambiguous name for the
> ] particular character.
> ] 
> ] How do Thai people alphabetize a word?  Do they say `ko kai', `kho
> ] khai', ... or do they say `kai', `khai', ...?
> 
> We do say 'ko kai', 'kho khai', ... from the first day we learn the Thai
> language in school ;)

Yep, the name has two parts.

(1)   (2)
(ko)  (kai)
(kho) (khai)
(kho) (khuat)

The 1st part tells us the sound of the letter
(in this case, the semi-syllables 'k-', 'kh-' and 'kh-').

The 2nd part tells us the visual shape of the letter.
'kho khai' and 'kho khuat' both give the same semi-syllable,
but they differ in shape (see the Unicode chart).

The 2nd part makes it clear which letter we're talking about
-- maybe it is just like 'A for Ant', 'B for Boy', ...
(Anyway, no two English letters share the same
 semi-syllable, so it is not necessary to do that there :) )
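
The two-part structure can be read straight out of the Unicode character names. Here is a minimal Python sketch using the standard `unicodedata` module; the helper `name_parts` is just for illustration and only handles names of the form "THAI CHARACTER X Y".

```python
import unicodedata

PREFIX = "THAI CHARACTER "

def name_parts(ch):
    """Split e.g. 'THAI CHARACTER KHO KHAI' into ('KHO', 'KHAI')."""
    name = unicodedata.name(ch)
    if not name.startswith(PREFIX):
        raise ValueError(f"not a THAI CHARACTER name: {name}")
    sound, _, word = name[len(PREFIX):].partition(" ")
    return sound, word

# The three letters quoted above: U+0E01, U+0E02, U+0E03.
parts = [name_parts(chr(cp)) for cp in range(0x0E01, 0x0E04)]

if __name__ == "__main__":
    for cp, (sound, word) in zip(range(0x0E01, 0x0E04), parts):
        print(f"{cp:04X}  part 1 = {sound:<4} part 2 = {word}")
```

U+0E02 and U+0E03 share part 1 (KHO) and are told apart only by part 2.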


Art :)