Re: Unicode on a website
Elaine Keown <[EMAIL PROTECTED]> wrote: > Is there some automatic procedure that will happen soon, where a new > UTF-8 will come out that has all the Hebrew symbols from Unicode 2.0 > and 3.0? Does the increase in size of the Hebrew character set > interact with UTF-8 in some negative way? UTF-8 is just a way of expressing Unicode. Any of the 1,114,112 possible code points in Unicode, whether assigned or not, can be expressed in UTF-8. What this means is that there is no such thing as "a new UTF-8" that contains more characters than some previous UTF-8. If you find Web pages that don't have these additional characters (and you feel that they should), the problem is simply that the page was written using an earlier version of Unicode, or that the author of the page was unaware that the characters have been added. This has nothing to do with any limitation of UTF-8. -Doug Ewell Fullerton, California
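To illustrate the point that UTF-8 is only a byte encoding of code points, here is a minimal Python sketch. The helper name utf8_bytes is made up for this example. It encodes a Hebrew letter, a Hebrew cantillation mark of the kind used in Bible text, and the highest code point in the Unicode range, all with the same unchanged algorithm. One caveat that is a property of Python rather than anything stated in the message: its strict UTF-8 codec refuses the surrogate code points U+D800..U+DFFF, so those are left out.

```python
def utf8_bytes(code_point: int) -> bytes:
    """Encode a single Unicode code point as UTF-8 (hypothetical helper)."""
    return chr(code_point).encode("utf-8")


# U+05D0 HEBREW LETTER ALEF: two bytes in UTF-8.
print(utf8_bytes(0x05D0).hex())    # d790

# U+0591 HEBREW ACCENT ETNAHTA, a cantillation mark: also two bytes.
print(utf8_bytes(0x0591).hex())    # d691

# U+10FFFF, the last code point in the Unicode range: four bytes.
print(utf8_bytes(0x10FFFF).hex())  # f48fbfbf
```

Nothing in the encoder depends on which Unicode version assigned a character, which is why no "new UTF-8" is needed when characters are added.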
Re: Unicode on a website
From: "Elaine Keown" <[EMAIL PROTECTED]> > I'm interested in using the more recent Unicode Hebrew versions on Web sites. These versions have about 30 more symbols for Hebrew Bible text than the original Unicode from the early 90s. > > But the UTF-8 versions I found on the Web only seem to have the early 90s version of Hebrew, and it doesn't have these 30 extra symbols. > > How does this work? Is there some automatic procedure that will happen soon, where a new UTF-8 will come out that has all the Hebrew symbols from Unicode 2.0 and 3.0? Does the increase in size of the Hebrew character set interact with UTF-8 in some negative way? Hi Elaine, This is not a UTF-8 issue at all; it is an issue with font support. As soon as a font supports the code points, you will see it display things appropriately. If you consider the number of code points needed for some scripts, the number required for a script such as Hebrew is really no big deal. :-) michka Michael Kaplan Trigeminal Software, Inc. http://www.trigeminal.com/
Re: Unicode on a website
Hello, I'm interested in using the more recent Unicode Hebrew versions on Web sites. These versions have about 30 more symbols for Hebrew Bible text than the original Unicode from the early 90s. But the UTF-8 versions I found on the Web only seem to have the early 90s version of Hebrew, and it doesn't have these 30 extra symbols. How does this work? Is there some automatic procedure that will happen soon, where a new UTF-8 will come out that has all the Hebrew symbols from Unicode 2.0 and 3.0? Does the increase in size of the Hebrew character set interact with UTF-8 in some negative way? Thanks, Elaine
RE: TATAP => TATAR
Cathy, I have found four references to support your contention: a reference to it using a Latin script; another to "Azari-Arabic, Azari-Cyrilic & Azari-Turkish"; a Mac font system, which I can't try because I don't have a Mac; and a TrueType font that I installed, which seems to produce both a dotted and an accented i. BTW, I also found that there seems to be a movement to Latinize Uigur as well, which started about 1960. Carl -Original Message- From: Cathy Wissink [mailto:[EMAIL PROTECTED]] Sent: Tuesday, September 19, 2000 10:22 AM To: 'Carl W. Brown'; Unicode List Subject: RE: TATAP => TATAR I believe Azeri also uses the dotless i/dotted i Turkish-style casing. Cathy -Original Message- From: Carl W. Brown [mailto:[EMAIL PROTECTED]] Sent: Tuesday, September 19, 2000 9:03 AM To: Unicode List Subject: RE: TATAP => TATAR >-Original Message- >From: Herman Ranes [mailto:[EMAIL PROTECTED]] >Sent: Tuesday, September 19, 2000 6:30 AM >To: Unicode List >Cc: [EMAIL PROTECTED] >Subject: Re: TATAP => TATAR >Several Tatar language links here: >http://members.tripod.com/~anttikoski/eng_tatar.html >In particular, the Tatar-Bashkir latin alphabet is presented in RFE/RL's >site at >http://rferl.org/bd/tb/tatar/TATAR/abs.html >Are all these characters supported in UNICODE? I was unaware that they were moving back to the Latin alphabet. What jumps out at me is that case conversion code like the code that I just submitted for inclusion into ICU is wrong. Turkish is not the only language with dotted and dotless i. I assume that Tatar and Bashkir should follow the same rules as Turkish. Are there other languages? So I guess that I should check for "ba", "tt" & "tr" for special case shifting. I presume that the alphabet is listed in proper sort order? Carl
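The special-case shifting discussed in this thread can be sketched in Python as follows. This is only an illustration, not the ICU code mentioned above: the function names and the language set are made up for the example. "tr", "tt" and "ba" are the codes named in the message, "az" is added for Azeri per Cathy's note, and whether Tatar and Bashkir really follow the Turkish rule is the thread's assumption, not something checked here.

```python
# Language codes ASSUMED to use Turkish-style dotted/dotless i casing.
# This set is a guess taken from the thread, not an authoritative list.
DOTTED_I_LANGS = {"tr", "az", "tt", "ba"}

DOTLESS_SMALL_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I
DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE


def to_upper(text: str, lang: str) -> str:
    """Upper-case text, pairing i with U+0130 and U+0131 with I if needed."""
    if lang in DOTTED_I_LANGS:
        text = text.replace("i", DOTTED_CAPITAL_I).replace(DOTLESS_SMALL_I, "I")
    return text.upper()


def to_lower(text: str, lang: str) -> str:
    """Lower-case text, pairing U+0130 with i and I with U+0131 if needed."""
    if lang in DOTTED_I_LANGS:
        text = text.replace(DOTTED_CAPITAL_I, "i").replace("I", DOTLESS_SMALL_I)
    return text.lower()


print(to_upper("istanbul", "tr"))  # İSTANBUL
print(to_upper("istanbul", "en"))  # ISTANBUL
print(to_lower("ISTANBUL", "tr"))  # ıstanbul
```

The two i letters are swapped before the generic case mapping runs, so the default mapping never gets a chance to pair i with I for these languages.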
(no subject)
please remove me from this list
Re: Can anyone help me!!!
From: "James Kass" <[EMAIL PROTECTED]> > > IE 5.5 support all of the Unicode Indian scripts. > > I just tried it on a couple of Devanagari sites > > because the English Windows comes with mangal > > true type font. > > May we see links to some of those pages? Here are a few such pages: http://www.trigeminal.com/index.asp?1081 http://www.trigeminal.com/frmrpt2dap.html?1081 http://www.trigeminal.com/frmrpt2dap_readme.htm?1081 They all use an explicit style for fonts in a CSS: { font-family:Mangal,Code2000,Arial Unicode MS; font-size:12pt; } I put Mangal first since it is included in Windows 2000, and Arial Unicode MS last, as the feedback I have gotten is that Code2000 looks much better than it does for several Indic scripts. michka a new book on internationalization in VB at http://www.i18nWithVB.com/
FTP and UTF-8
Does anybody know of a publicly accessible FTP server that supports RFCs 2389 (negotiation of new features) and 2640 (internationalization)? Preferably one that allows anonymous uploads (for testing purposes)? In case you're not aware of these RFCs, they provide for UTF-8 based FTP. Thanks! - Frank
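For anyone who wants to probe a candidate server, here is a rough Python sketch using the standard ftplib module. RFC 2389 defines the FEAT command, whose multi-line reply lists one feature per line with a leading space, and RFC 2640 has a conforming server list UTF8 there. The helper names and the sample reply text are made up for illustration, and the host passed in is whatever server you want to test.

```python
import ftplib


def advertised_features(feat_reply: str) -> set:
    """Return the feature names found in a FEAT reply (hypothetical helper)."""
    features = set()
    for line in feat_reply.splitlines():
        # Feature lines start with a space; the 211 status lines do not.
        if line.startswith(" "):
            parts = line.split()
            if parts:
                features.add(parts[0].upper())
    return features


def server_advertises_utf8(host: str) -> bool:
    """Log in anonymously and check the FEAT reply for UTF8 (needs network)."""
    with ftplib.FTP(host) as ftp:
        ftp.login()
        try:
            reply = ftp.sendcmd("FEAT")
        except ftplib.error_perm:
            return False  # server rejected FEAT, so no RFC 2389 support
    return "UTF8" in advertised_features(reply)


# Made-up sample reply, shaped like the examples in RFC 2389.
sample = "211-Extensions supported:\n UTF8\n LANG EN*\n MDTM\n211 End"
print(sorted(advertised_features(sample)))  # ['LANG', 'MDTM', 'UTF8']
```

Seeing UTF8 in the reply only shows that the server claims support; it does not show that uploads with non-ASCII names actually round-trip.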
Re: Can anyone help me!!!
Here's a page about Indic scripts in Unicode which offers some pointers: http://www.tamil.net/people/sivaraj/unicode.html Carl W. Brown had written about testing Unicode Devanagari support by visiting some web pages. > IE 5.5 support all of the Unicode Indian scripts. > I just tried it on a couple of Devanagari sites > because the English Windows comes with mangal > true type font. May we see links to some of those pages? Best regards, James Kass, - Original Message - From: "sanatan mohanty" <[EMAIL PROTECTED]> To: "Unicode List" <[EMAIL PROTECTED]> Sent: Saturday, September 23, 2000 5:50 AM Subject: Can anyone help me!!! > > > Dear Friends!. > > How are you!. > > i have a project to make a webpage, which will be unicode enable. i can > show indian language fonts. i can type those fonts on the webpage itself > on text boxes!. and it should be at least work on netscape and windows > explorer!, and at least LINUX and Windows OS supports it!. > > so, can u people give me some brief ideas abt keyboard mapping, unicode > font setting, display setting > > > i will be grateful to you all for your help.. > > waiting for you kind response.. > > Regards, > > Sanatan > >
Re: Unicode on a website: ? Devanagari
You will find examples of Devanagari on the ICU locale explorer pages. http://oss.software.ibm.com/icu/demo/ Try Marathi, Konkani, and Hindi. The encoding should be UTF-8 by default or you can change it at the bottom of the page. Hindi especially has an extensive but incomplete list of translated language and country names. -s
RE: Unicode on a website
"Carl W. Brown" <[EMAIL PROTECTED]> wrote: > scsu makes sense for large blocks of data. Send the frame work in > utf-8 but use HTTP to request the bulk data in scsu. If it is a > small amount of data you don't want to pay the overhead of the > compression. SCSU was intentionally designed to be extremely low in overhead. This is one of the main differences between SCSU and most other compression schemes. > You don't need a BOM with UTF-8. Not for byte-ordering purposes, but it is often handy as a signature. Auto-detection of UTF-8 is not difficult, but not foolproof either -- there are legitimate sequences of Latin-1 characters that look like UTF-8. Using the signature EF BB BF at the beginning of a file is a more reliable indication that the file is UTF-8. -Doug Ewell Fullerton, California
Re: Unicode on a website: ? Devanagari
Perhaps because of poor support, there don't seem to be any substantial works in Devanagari Unicode on the web. Aside from test pages or charts, like the ones found on Alan Wood's excellent site, the best bet would seem to be to make your own pages. Naidunia uses dynamic fonts, but with a non-Unicode mapping. Mark Leisher has a Perl script to convert Naidunia's pages to Unicode. http://clr.nmsu.edu/~mleisher/nai.html Best regards, James Kass, - Original Message - From: "Carl W. Brown" <[EMAIL PROTECTED]> To: "Unicode List" <[EMAIL PROTECTED]> Sent: Saturday, September 23, 2000 9:58 PM Subject: RE: Unicode on a website: ? Devanagari Chris, Just came across an interesting site: http://www.hclrss.demon.co.uk/unicode/ Follow some of the links. Carl -Original Message- From: Christopher J. Fynn [mailto:[EMAIL PROTECTED]] Sent: Saturday, September 23, 2000 1:15 PM To: Unicode List Subject: Re: Unicode on a website: ? Devanagari Anyone know of any Devanagari documents (Sanskrit, Hindi, Nepali) on the Web using UTF-8 (other than the pages at http://titus.uni-frankfurt.de/unicode/samples/rvbeispx.htm ) - especially any using Dynamic fonts? I am not interested in Devanagari sites using font based encodings. - Chris