Re: Pupil's question about Burmese
On 11/10/2010 02:17 PM, Shawn Steele wrote:
> As mentioned, the "solution" is to fix the app to use Unicode. Especially for a language like this. In these cases, machines will be fairly inconsistent even if they did support some code page, but Unicode works most everywhere.

As far as I know, there has never been a standard code page for Myanmar text; Unicode was the first time storage of Burmese text was standardised for computers. There are several different legacy font families in use for Myanmar, each with its own slightly different mapping to Latin code points. The font in question has a Unicode cmap table, but the map is from Latin code points to glyphs, not from Myanmar code points to glyphs. There are also several fonts which map incorrectly from the Myanmar Unicode block, using the Mon, Shan and Karen code points for glyph variants so that the font can avoid having OpenType/Graphite/AAT rules.

If anyone is having trouble installing genuine Myanmar Unicode fonts, I have some instructions at http://www.thanlwinsoft.org/ThanLwinSoft/MyanmarUnicode/gettingStarted.php

Keith
RE: Pupil's question about Burmese
FWIW: The OS really likes Unicode, so lots of the text input, etc, are really Unicode. ANSI apps (including non-Unicode web pages), get the data back from those controls in ANSI, so you can lose data that it looked like you entered. As mentioned, the "solution" is to fix the app to use Unicode. Especially for a language like this. In these cases, machines will be fairly inconsistent even if they did support some code page, but Unicode works most everywhere. Usually it's not difficult for a web page to switch to UTF-8. If it's a form, it's even possible that overriding it on your end might get the data posted back in UTF-8 and succeed (if you're really lucky), but the real fix is to have the web server serve Unicode. -Shawn http://blogs.msdn.com/shawnste From: unicode-bou...@unicode.org [unicode-bou...@unicode.org] on behalf of Peter Constable [peter...@microsoft.com] Sent: Tuesday, November 09, 2010 10:42 PM To: James Lin; Ed Cc: Unicode Mailing List Subject: RE: Pupil's question about Burmese A non-Unicode web page is like a non-Unicode app. Web pages, and apps, should use Unicode.' Peter -Original Message- From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On Behalf Of James Lin Sent: Tuesday, November 09, 2010 11:24 AM To: Ed Cc: Unicode Mailing List Subject: RE: Pupil's question about Burmese Oh, don't get me wrong. By having Unicode is like wearing a crown and be a king. It's best thing out there. What I am referring is, if a web page is not Unicode supported, or any applications that do not support Unicode, even if running a windows 7 with English locale(even though natively, it supports UTF-16), it is not possible to directly copy/paste without having the correct supported locale, if not, you may damaging the bytes of the characters which show corruptions. 
Even though most modern API is and hopefully written in Unicode calls, not all (legacy) applications are written in Unicode, so conversion is still necessary to even handling the non-ASCII data. Let me know if I am still missing something here. -Original Message- From: Ed [mailto:ed.tra...@gmail.com] Sent: Tuesday, November 09, 2010 11:02 AM To: James Lin Cc: Unicode Mailing List Subject: Re: Pupil's question about Burmese > > Yes, displaying is fine, but the original question is copying and > pasting; without the correct locale settings, you can’t copy/paste > without corrupting the byte sizes. Copy/paste is generally handle by > OS itself, not application. Even if you have unicode support > application, you can display, but you can’t handle none-ASCII characters. Why not? Modern Win32 OSes use UTF-16. Presumably most modern applications are written using calls to the modern API which should seamlessly support copy-and-paste of Unicode text, regardless of script or language -- so long as the script or language is supported at the level of displaying the text correctly and you have a font that works for that script. Actually, even if the text display is imperfectly (i.e., one sees square boxes when lacking a proper font, or even if OpenType GPOSs and GSUBs are not correct for a Complex Text Layout script like Burmese), copy-and-paste of the raw Unicode text should still work correctly. Is this not the case?
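Shawn's remark that an ANSI app can lose data that looked fine when it was typed can be sketched in a few lines of Python. This is only an illustration under assumptions: the sample string, the choice of cp1252 as the "ANSI" code page, and the helper names are not from the thread, and Python's "replace" error handler merely stands in for whatever substitution a real legacy app performs.

```python
# Assumption: six code points from the Myanmar block, intended to spell
# the word "Myanmar" in Burmese script. Any Myanmar-block text would do.
burmese = "\u1019\u103C\u1014\u103A\u1019\u102C"


def through_ansi(text: str, code_page: str = "cp1252") -> str:
    """Simulate a non-Unicode app: force the text through one legacy code page.

    Characters the code page cannot represent are replaced with '?'.
    """
    return text.encode(code_page, errors="replace").decode(code_page)


def through_utf8(text: str) -> str:
    """Simulate a Unicode-aware page or app: round-trip the text via UTF-8."""
    return text.encode("utf-8").decode("utf-8")


lossy = through_ansi(burmese)   # every Burmese character is lost
safe = through_utf8(burmese)    # identical to the input

print(repr(lossy))
print(safe == burmese)
```

No Windows code page covers the Myanmar block, so in this sketch every character is replaced, whereas the UTF-8 round trip preserves the text.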
RE: Pupil's question about Burmese
You can copy and paste between Unicode-enabled apps to your heart's content. Only legacy, non-Unicode apps need system locale support. Peter From: James Lin [mailto:james_...@symantec.com] Sent: Tuesday, November 09, 2010 10:24 AM To: Peter Constable; Andrew Cunningham Cc: JP Blankert (thuis & PC based); Unicode Mailing List; stichtingburnout Subject: Re: Pupil's question about Burmese >> So, for instance, every copy of Windows 2000 or later versions is capable of >> displaying Hindi or Armenian text, regardless of the system locale setting; >> >>every copy of Windows Vista or later is capable of displaying, in >> addition, text in scripts such as Khmer and Ethiopic; and every copy of >> Windows 7 is, >>additionally, able to display text in scripts Tifinagh and >> Tai Le. In all these cases, the system locale setting has no bearing. Yes, displaying is fine, but the original question is copying and pasting; without the correct locale settings, you can't copy/paste without corrupting the byte sizes. Copy/paste is generally handle by OS itself, not application. Even if you have unicode support application, you can display, but you can't handle none-ASCII characters. On 11/8/10 6:22 PM, "Peter Constable" wrote: From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On Behalf Of Andrew Cunningham >> Your system locale has to handle the Burmese language. So you need to >> either install Windows 7 in Burmese or change under Regional / >> Language options in Control panel, under Adv tab. > well considering Burmese is a language that is not supported by Microsoft ... > the above is relatively irrelevant. At whatever point Burmese _is_ supported in Windows, system locale will not be relevant. To be clear, the legacy Windows notion of system locale is relevant only in relation to apps that support only legacy Windows encodings, not Unicode. 
There is no system locale support for languages such as Hindi or Armenian or Khmer, but that does not prevent display of text in those scripts in Unicode-capable applications. So, for instance, every copy of Windows 2000 or later versions is capable of displaying Hindi or Armenian text, regardless of the system locale setting; every copy of Windows Vista or later is capable of displaying, in addition, text in scripts such as Khmer and Ethiopic; and every copy of Windows 7 is, additionally, able to display text in scripts Tifinagh and Tai Le. In all these cases, the system locale setting has no bearing. Peter
RE: Pupil's question about Burmese
A non-Unicode web page is like a non-Unicode app. Web pages, and apps, should use Unicode. Peter -Original Message- From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On Behalf Of James Lin Sent: Tuesday, November 09, 2010 11:24 AM To: Ed Cc: Unicode Mailing List Subject: RE: Pupil's question about Burmese Oh, don't get me wrong. By having Unicode is like wearing a crown and be a king. It's best thing out there. What I am referring is, if a web page is not Unicode supported, or any applications that do not support Unicode, even if running a windows 7 with English locale(even though natively, it supports UTF-16), it is not possible to directly copy/paste without having the correct supported locale, if not, you may damaging the bytes of the characters which show corruptions. Even though most modern API is and hopefully written in Unicode calls, not all (legacy) applications are written in Unicode, so conversion is still necessary to even handling the non-ASCII data. Let me know if I am still missing something here. -Original Message- From: Ed [mailto:ed.tra...@gmail.com] Sent: Tuesday, November 09, 2010 11:02 AM To: James Lin Cc: Unicode Mailing List Subject: Re: Pupil's question about Burmese > > Yes, displaying is fine, but the original question is copying and > pasting; without the correct locale settings, you can’t copy/paste > without corrupting the byte sizes. Copy/paste is generally handle by > OS itself, not application. Even if you have unicode support > application, you can display, but you can’t handle none-ASCII characters. Why not? Modern Win32 OSes use UTF-16. Presumably most modern applications are written using calls to the modern API which should seamlessly support copy-and-paste of Unicode text, regardless of script or language -- so long as the script or language is supported at the level of displaying the text correctly and you have a font that works for that script. 
Actually, even if the text display is imperfectly (i.e., one sees square boxes when lacking a proper font, or even if OpenType GPOSs and GSUBs are not correct for a Complex Text Layout script like Burmese), copy-and-paste of the raw Unicode text should still work correctly. Is this not the case?
RE: Pupil's question about Burmese
Oh, don't get me wrong. Having Unicode is like wearing a crown and being a king. It's the best thing out there. What I am referring to is this: if a web page does not support Unicode, or an application does not support Unicode, then even on Windows 7 with an English locale (which natively supports UTF-16) it is not possible to copy/paste directly without the correct supported locale; otherwise you may damage the bytes of the characters, which shows up as corruption. Even though most modern APIs are (hopefully) written with Unicode calls, not all (legacy) applications are written in Unicode, so conversion is still necessary even to handle non-ASCII data. Let me know if I am still missing something here. -Original Message- From: Ed [mailto:ed.tra...@gmail.com] Sent: Tuesday, November 09, 2010 11:02 AM To: James Lin Cc: Unicode Mailing List Subject: Re: Pupil's question about Burmese > > Yes, displaying is fine, but the original question is copying and > pasting; without the correct locale settings, you can’t copy/paste > without corrupting the byte sizes. Copy/paste is generally handle by > OS itself, not application. Even if you have unicode support > application, you can display, but you can’t handle none-ASCII characters. Why not? Modern Win32 OSes use UTF-16. Presumably most modern applications are written using calls to the modern API which should seamlessly support copy-and-paste of Unicode text, regardless of script or language -- so long as the script or language is supported at the level of displaying the text correctly and you have a font that works for that script. Actually, even if the text display is imperfectly (i.e., one sees square boxes when lacking a proper font, or even if OpenType GPOSs and GSUBs are not correct for a Complex Text Layout script like Burmese), copy-and-paste of the raw Unicode text should still work correctly. Is this not the case?
Re: Pupil's question about Burmese
> > Yes, displaying is fine, but the original question is copying and pasting; > without the correct locale settings, you can’t copy/paste without corrupting > the byte sizes. Copy/paste is generally handle by OS itself, not > application. Even if you have unicode support application, you can display, > but you can’t handle none-ASCII characters. Why not? Modern Win32 OSes use UTF-16. Presumably most modern applications are written using calls to the modern API, which should seamlessly support copy-and-paste of Unicode text, regardless of script or language -- so long as the script or language is supported at the level of displaying the text correctly and you have a font that works for that script. Actually, even if the text displays imperfectly (i.e., one sees square boxes when lacking a proper font, or the OpenType GPOS and GSUB rules are not correct for a Complex Text Layout script like Burmese), copy-and-paste of the raw Unicode text should still work correctly. Is this not the case?
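Ed's point, that copying raw Unicode text does not depend on fonts or rendering, can be illustrated with a small Python sketch. The helper names are hypothetical, and the byte layout (UTF-16LE followed by a terminating NUL) is an assumption used here to model how Unicode text is commonly handed between Windows applications; no real clipboard API is called.

```python
# Assumption: six Myanmar-block code points (all in the Basic Multilingual
# Plane), used purely as sample data.
text = "\u1019\u103C\u1014\u103A\u1019\u102C"


def to_clipboard_bytes(s: str) -> bytes:
    """Encode as UTF-16LE with a terminating NUL (assumed clipboard layout)."""
    return (s + "\0").encode("utf-16-le")


def from_clipboard_bytes(b: bytes) -> str:
    """Decode the assumed layout and drop the terminating NUL."""
    return b.decode("utf-16-le").rstrip("\0")


payload = to_clipboard_bytes(text)
pasted = from_clipboard_bytes(payload)

# No font, shaping engine, or system locale is consulted at any point:
# the code points survive even if nothing on the machine can draw them.
print(pasted == text)
print(len(payload))
```

The round trip involves only code points and bytes, which is why square boxes on screen do not imply that the underlying text was damaged.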
Re: Pupil's question about Burmese
>> So, for instance, every copy of Windows 2000 or later versions is capable of displaying Hindi or Armenian text, regardless of the system locale setting; every copy of Windows Vista or later is capable of displaying, in addition, text in scripts such as Khmer and Ethiopic; and every copy of Windows 7 is, additionally, able to display text in scripts Tifinagh and Tai Le. In all these cases, the system locale setting has no bearing. Yes, displaying is fine, but the original question is copying and pasting; without the correct locale settings, you can't copy/paste without corrupting the byte sizes. Copy/paste is generally handled by the OS itself, not the application. Even if you have a Unicode-supporting application, you can display, but you can't handle non-ASCII characters. > On 11/8/10 6:22 PM, "Peter Constable" wrote: > From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On Behalf > Of Andrew Cunningham > >>> >> Your system locale has to handle the Burmese language. So you need to >>> >> either install Windows 7 in Burmese or change under Regional / >>> >> Language options in Control panel, under Adv tab. > >> > well considering Burmese is a language that is not supported by Microsoft >> ... the above is relatively irrelevant. > > At whatever point Burmese _is_ supported in Windows, system locale will not be > relevant. To be clear, the legacy Windows notion of system locale is relevant > only in relation to apps that support only legacy Windows encodings, not > Unicode. There is no system locale support for languages such as Hindi or > Armenian or Khmer, but that does not prevent display of text in those scripts > in Unicode-capable applications. 
So, for instance, every copy of Windows 2000 > or later versions is capable of displaying Hindi or Armenian text, regardless > of the system locale setting; every copy of Windows Vista or later is capable > of displaying, in addition, text in scripts such as Khmer and Ethiopic; and > every copy of Windows 7 is, additionally, able to display text in scripts > Tifinagh and Tai Le. In all these cases, the system locale setting has no > bearing. > > > > Peter > > >
Re: Pupil's question about Burmese
Dear Ngwe Tun, The forthcoming ICU 4.6 will include a Burmese locale (using CLDR data), with support for Burmese collation. http://site.icu-project.org/ Best regards, Peter Edberg On Nov 9, 2010, at 2:05 AM, Ngwe Tun wrote: > ... > > We are in dead-lock because without releasing Myanmar Opentype specifiction > for burmese by Microsoft. We can't implement burmese in opentype adopted > rendering engine like pango and harfbuzz. > > We are not satisify just typing burmese text and printing burmese text. We > want to have effective use of unicode data in burmese language processing > like spelling check, machine translation and OCR. > > ... > > I've encouraged to use Unicode standards among Myanmar Users. Myanmar Users > willing to use unicode standards in their works, personal and every > application. But there are no advantages in using Unicode Standards and CLDR > too. If Unicode.org make standards and do not apply those standards in > software and systems, how can we trust those standards. Myanmar Users do not > wait on Microsoft, Apple, Oracle implementation. They are going wrong or > breakthrough solution. > >
Re: Pupil's question about Burmese
Dear Peter Constable, *Burmese _is_ supported in Windows.* It makes things worse than ever to create another story like pseudo-Unicode, such as Zawgyi, in Windows too. We are in a deadlock because Microsoft has not released a Myanmar OpenType specification for Burmese. We can't implement Burmese in OpenType-based rendering engines like Pango and HarfBuzz. We are not satisfied with just typing and printing Burmese text. We want to make effective use of Unicode data in Burmese language processing, such as spell checking, machine translation and OCR. So, do we need a system locale for Burmese? How about CultureInfo for the Microsoft .NET Framework? I have encouraged the use of Unicode standards among Myanmar users. Myanmar users are willing to use Unicode standards in their work, personally and in every application. But there are no advantages in using Unicode standards and CLDR either. If Unicode.org makes standards and those standards are not applied in software and systems, how can we trust those standards? Myanmar users do not wait for Microsoft, Apple or Oracle implementations. They are going for wrong or breakthrough solutions. Again, I have to urge caution about ethnic languages. We should take care of the Mon, Shan and Karen languages, which are encoded in Unicode 5.1, but Microsoft has not yet assigned anything for those languages in Windows 7. I have been trying to get a Burmese Language Pack in Microsoft Windows since 2002. I gave up and will no longer try to get it. Microsoft is not waiting on stable standards, politics and/or technical issues. I don't know of any reason for delaying our beloved language. Thanks for reading this and for supporting a language spoken by 40 million people. We petitioned Microsoft at http://petition.myanmarlanguage.org/ http://my.wiktionary.org is a good dictionary site. It has been started but is not yet finished. 
Best Ngwe Tun On Tue, Nov 9, 2010 at 8:52 AM, Peter Constable wrote: > From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On > Behalf Of Andrew Cunningham > > >> Your system locale has to handle the Burmese language. So you need to > >> either install Windows 7 in Burmese or change under Regional / > >> Language options in Control panel, under Adv tab. > > > well considering Burmese is a language that is not supported by Microsoft > ... the above is relatively irrelevant. > > At whatever point Burmese _is_ supported in Windows, system locale will not > be relevant. To be clear, the legacy Windows notion of system locale is > relevant only in relation to apps that support only legacy Windows > encodings, not Unicode. There is no system locale support for languages such > as Hindi or Armenian or Khmer, but that does not prevent display of text in > those scripts in Unicode-capable applications. So, for instance, every copy > of Windows 2000 or later versions is capable of displaying Hindi or Armenian > text, regardless of the system locale setting; every copy of Windows Vista > or later is capable of displaying, in addition, text in scripts such as > Khmer and Ethiopic; and every copy of Windows 7 is, additionally, able to > display text in scripts Tifinagh and Tai Le. In all these cases, the system > locale setting has no bearing. > > > > Peter > > > > >
RE: Pupil's question about Burmese
From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On Behalf Of Andrew Cunningham >> Your system locale has to handle the Burmese language. So you need to >> either install Windows 7 in Burmese or change under Regional / >> Language options in Control panel, under Adv tab. > well considering Burmese is a language that is not supported by Microsoft ... > the above is relatively irrelevant. At whatever point Burmese _is_ supported in Windows, system locale will not be relevant. To be clear, the legacy Windows notion of system locale is relevant only in relation to apps that support only legacy Windows encodings, not Unicode. There is no system locale support for languages such as Hindi or Armenian or Khmer, but that does not prevent display of text in those scripts in Unicode-capable applications. So, for instance, every copy of Windows 2000 or later versions is capable of displaying Hindi or Armenian text, regardless of the system locale setting; every copy of Windows Vista or later is capable of displaying, in addition, text in scripts such as Khmer and Ethiopic; and every copy of Windows 7 is, additionally, able to display text in scripts Tifinagh and Tai Le. In all these cases, the system locale setting has no bearing. Peter
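Peter's distinction between legacy encodings, which depend on the system locale, and Unicode, which does not, can be sketched as follows. The Russian sample word and the two code pages are assumptions chosen only because both are standard Windows code pages; they do not appear in the thread.

```python
# Assumption: a Cyrillic sample word, stored the way a legacy (non-Unicode)
# app on a Russian-locale system would store it: as cp1251 bytes.
word = "Привет"
raw = word.encode("cp1251")

# The same bytes mean different things depending on which code page the
# system locale selects for legacy apps.
as_cyrillic_locale = raw.decode("cp1251")  # the intended text
as_western_locale = raw.decode("cp1252")   # mojibake on a Western-locale system

# A Unicode encoding carries no such dependency on the system locale.
utf8_round_trip = word.encode("utf-8").decode("utf-8")

# Hindi has no legacy Windows code page at all, yet Unicode handles it.
hindi = "\u0939\u093F\u0928\u094D\u0926\u0940"
hindi_round_trip = hindi.encode("utf-16-le").decode("utf-16-le")

print(as_cyrillic_locale)
print(as_western_locale)
print(utf8_round_trip == word, hindi_round_trip == hindi)
```

Only the legacy path changes its result with the assumed locale; the Unicode round trips are the same on every system.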
Re: Pupil's question about Burmese
Your system locale has to handle the Burmese language. So you need to either install Windows 7 in Burmese or change under Regional / Language options in Control panel, under Adv tab. On 11/8/10 1:26 PM, "JP Blankert (thuis & PC based)" wrote: > http://www.burmese-dictionary.org/tastatur.php?terme=hotel&termb=%5Bkdw%2Cf&id=2970 > > Dear Unicoders, > > As you know I am only still a pupil in unicode, am studying, but the > answer to my next question may be a leap forward in my learning. > > Burmese: > http://www.burmese-dictionary.org/tastatur.php?terme=hotel&termb=%5Bkdw%2Cf&id=2970 > > I installed the fonts recommended in the dictionary, have Windows 7 and > Opera 10.6. When I type 'hotel' it is translated into Burmese. But when > I want to copy and paste the word, on the very same page, > > [kdw, > > appears. Explanations, tips? What happens on your screens? > > Thanks, br, Philippe Blankert > > >
Re: Pupil's question about Burmese
Hello Philippe Blankert - Thanks for your interest in Unicode... http://www.burmese-dictionary.org/tastatur.php?terme=hotel&termb=%5Bkdw%2Cf&id=2970 That page isn't in Unicode at all, it's an 8859-1 encoded page. That's part of the problem. Then, the Burmese characters on the page are all *images*, and when you click the buttons to type into the field, it seems to send ASCII text to the input field. And the WWin_Burmese1 font, which I just downloaded to check, is an "ASCII-hack" font that is not encoded in Unicode. Hope that helps. Rick
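Rick's diagnosis can be checked against the URL itself. The short Python sketch below decodes the termb query parameter from the link in this thread and tests whether any of its characters fall in the Unicode Myanmar block (U+1000 to U+109F). The helper name is hypothetical; the block range is taken from the Unicode code charts.

```python
from urllib.parse import unquote

# Myanmar block in Unicode: U+1000..U+109F
MYANMAR_START, MYANMAR_END = 0x1000, 0x109F


def in_myanmar_block(ch: str) -> bool:
    """Return True if the character is encoded in the Myanmar block."""
    return MYANMAR_START <= ord(ch) <= MYANMAR_END


# The termb parameter exactly as it appears in the URL quoted in the thread.
termb = unquote("%5Bkdw%2Cf")

code_points = [f"U+{ord(c):04X}" for c in termb]
any_myanmar = any(in_myanmar_block(c) for c in termb)
all_ascii = all(ord(c) < 0x80 for c in termb)

print(termb)
print(code_points)
print(any_myanmar, all_ascii)
```

The parameter holds only ASCII characters, so what gets copied and pasted is Latin text; it looks Burmese on the page only because the "ASCII-hack" font draws Burmese glyphs at those code points.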
Re: Pupil's question about Burmese
OK, I sent an email to the web site contact address. Who knows if that will help? Maybe if a *lot* of people bombard them with emails, they will get the message ... : >> from Ed >> to diction...@burmese-dictionary.org >> date Mon, Nov 8, 2010 at 5:07 PM >> subject Please Convert Your Website to Unicode! >> mailed-by gmail.com >> >> >> Hi! >> >> Your website is not encoded using Unicode. >> >> This causes great confusion for your users. For example, they cannot >> "copy and paste" Burmese text because the text is really just ASCII >> Latin letters with a "hacked" Burmese font. >> >> It would be much better if you would convert your web site to use Unicode. >> >> Thank you! >> >> Sincerely -- Ed Trager >> On Mon, Nov 8, 2010 at 4:49 PM, Rick McGowan wrote: > Hello Philippe Blankert - > > Thanks for your interest in Unicode... > >> >> http://www.burmese-dictionary.org/tastatur.php?terme=hotel&termb=%5Bkdw%2Cf&id=2970 > > That page isn't in Unicode at all, it's an 8859-1 encoded page. That's part > of the problem. > > Then, the Burmese characters on the page are all *images*, and when you > click the buttons to type into the field, it seems to send ASCII text to the > input field. > > And the WWin_Burmese1 font, which I just downloaded to check, is an > "ASCII-hack" font that is not encoded in Unicode. > > Hope that helps. > > Rick > > >