Re: Euro
At 12:13 PM 7/28/00 -0800, Roozbeh Pournader wrote: I was not talking about the shape. I think all of us have seen it, and many have also read the documents which define its exact shape using a ruler and a compass. I was talking about the origin of the shape. In some sense, except for purists, this discussion is rapidly becoming moot. The 'euro glyphs' have been out in the wild, on shop displays, in newsprint etc. for well over a year now. If you will, the 'common man's' idea of what a proper Euro glyph is, is fast becoming influenced by what he sees on a daily basis, not by the origin of the glyph or by the logo (which is prescribed only for its appearance on the currency itself). Given the name, I'm sure even the 'non-European' font designers that Werner likes to blame aren't suggesting that the logo for the 'e'uro is based on a 'c'. However, when you try to put the thing together with the serifs used in many of the common type faces, the result can indeed look a bit like a 'c'. This seems particularly true for monospaced fonts. A./
Re: Euro
Yeah, how WOULD you make a serifed, rounded E that doesn't look silly and doesn't look like a C with an extra line? Well, maybe you can, I dunno. Anyone who can do that, I'd like to see it. -- Robert Lozyniak Accusplit pedometer manufactures can go suck eggs My page: http://walk.to/11 [EMAIL PROTECTED] - email (917) 421-3909 x1133 - voicemail/fax Asmus Freytag [EMAIL PROTECTED] wrote: At 12:13 PM 7/28/00 -0800, Roozbeh Pournader wrote: I was not talking about the shape. I think all of us have seen it, and many have also read the documents which define its exact shape using a ruler and a compass. I was talking about the origin of the shape. In some sense, except for purists, this discussion is rapidly becoming moot. The 'euro glyphs' have been out in the wild, on shop displays, in newsprint etc. for well over a year now. If you will, the 'common man's' idea of what a proper Euro glyph is, is fast becoming influenced by what he sees on a daily basis, not by the origin of the glyph or by the logo (which is prescribed only for its appearance on the currency itself). Given the name, I'm sure even the 'non-European' font designers that Werner likes to blame aren't suggesting that the logo for the 'e'uro is based on a 'c'. However, when you try to put the thing together with the serifs used in many of the common type faces, the result can indeed look a bit like a 'c'. This seems particularly true for monospaced fonts. A./
Re: Euro
I found it! Everybody's invited to take a look at: http://www.tug.org/TUGboat/Articles/tb19-2/tb59inn.pdf On Sat, 29 Jul 2000, Asmus Freytag wrote: If you will, the 'common man's' idea of what a proper Euro glyph is, is fast becoming influenced by what he sees on a daily basis, not by the origin of the glyph or by the logo (which is prescribed only for its appearance on the currency itself). Ok, but I only want to know about the historical origins. Given the name, I'm sure even the 'non-European' font designers that Werner likes to blame aren't suggesting that the logo for the 'e'uro is based on a 'c'. However, when you try to put the thing together with the serifs used in many of the common type faces, the result can indeed look a bit like a 'c'. This seems particularly true for monospaced fonts. Take a look at the referenced article.
Encoding of non-characters
Did I read recently (in a message that I shortsightedly deleted) something to the effect that a character encoding scheme (CES) or transfer encoding syntax (TES) needs to be able to encode the non-characters U+D800 through U+DFFF, and presumably U+xxFFFE and U+xxFFFF as well? I've been playing around with a TES (or maybe it's a CES; I'm still having a little trouble knowing exactly where to draw the line). Don't worry, I'm not going to propose it anywhere as Yet Another UTF. I'm just playing around with Unicode, and hopefully teaching myself something along the way. Anyway, my scheme encodes non-BMP characters not *as* surrogates, but using the surrogate mechanism in a slightly modified way. Like UTF-16, this makes it impossible to encode the BMP non-characters in the range U+D800 through U+DFFF. Normally I wouldn't think this was a problem, but I thought someone (Davis?) just said recently that it should be possible to round-trip these thingies, for some reason. The situation would be different in the case of U+xxFFFE and U+xxFFFF, because while the surrogates occupy entire ranges that can be utilized in a special way, you kind of have to *deliberately* exclude the FFFx characters. Nonetheless, the same question applies: Must these bogus code points be representable in a CES or TES, or can they be handled conformantly by raising an error or mapping them to U+FFFD? -Doug Ewell Fullerton, California
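The two handling options named at the end of the question (raise an error, or map to U+FFFD) can be sketched in a few lines of Python. This is only an illustration of the choice, not an answer to the conformance question; the function names and the `policy` parameter are made up for the example.

```python
# Sketch only: two ways an encoder front end might treat code points it
# cannot (or will not) represent. All names here are hypothetical.

REPLACEMENT_CHARACTER = 0xFFFD


def is_surrogate(cp: int) -> bool:
    """True for U+D800 through U+DFFF."""
    return 0xD800 <= cp <= 0xDFFF


def is_fffe_or_ffff(cp: int) -> bool:
    """True for U+xxFFFE and U+xxFFFF in any plane."""
    return (cp & 0xFFFE) == 0xFFFE


def sanitize(code_points, policy="replace"):
    """Return a list of code points that is safe to hand to the encoder.

    policy="replace" maps each offending code point to U+FFFD;
    policy="error" raises ValueError on the first one found.
    """
    result = []
    for cp in code_points:
        if is_surrogate(cp) or is_fffe_or_ffff(cp):
            if policy == "error":
                raise ValueError(f"cannot encode U+{cp:04X}")
            result.append(REPLACEMENT_CHARACTER)
        else:
            result.append(cp)
    return result


print([f"U+{cp:04X}" for cp in sanitize([0x41, 0xD800, 0x1FFFF])])
```

Note that the "replace" policy is lossy by design, so it cannot satisfy a round-trip requirement if one exists; the "error" policy at least makes the loss visible to the caller.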
Re: Display Persian characters under Linux
Darya Said-Akbari wrote: Hi, this is my first email to the Unicode email list. There is a lot I want to learn from you all. So even if my questions are sometimes stupid, I would still like to read your answers on all issues. The only stupid question is the one you didn't ask. My goal in turning my interest to Unicode is to get Persian letters on my monitor and into a database, let's say in Oracle8i. The operating system will be Linux. So, what I have done until now is to buy the Unicode Standard 3.0. But that is not enough and therefore I need your help. The short answer is that Oracle supports Arabic script data entry from files or keyboard, and you should check your manuals, on-line resources at Oracle.com and Oracle user groups, and tech support (in that order) for information on importing files and setting data input modes. You should also check Linux.org and the Linux user group sites for information on font and keyboard mapping file availability and installation procedures. For example, send the message subscribe linux-utf8 to the e-mail address [EMAIL PROTECTED] to join a mailing list discussion entirely devoted to issues of Unicode on Linux. What steps do I have to take to make my dream real? Yes, I have several character sets on my machine but they are all European ones. And honestly I am a little bit afraid to touch them, Yes, leave them alone, since you need them for other purposes. since I don't know the difference between a character set and Unicode. UNIX systems have a font mapping file (name, please, somebody?) containing character set information, including the mapping between PostScript font names and UNIX font names. The UNIX names include the standard ID for the covered character set. Most of what you have will be identified as 8859-1, which is ASCII plus Latin-1. The current Arabic script font standard (which covers Farsi) is ISO 8859-6. You may see ECMA-114 or ASMO 449 mentioned in some places. Yes, you need to find Farsi fonts encoded in either 8859-6 or Unicode. Any search engine can find a number of sites for you. Reading the first pages of the book makes me more confused. There is something said about rendering. It seems that when I use the ARABIC letters I have to be concerned with rendering. Reading the Unicode Standard 3.0 through from the beginning is definitely not recommended. Skip to page 189, where the description of Arabic script rendering begins, and be sure to look at the code chart and notes for U+0600-U+06FF, Arabic, pp. 389-395. Also skim through the Bidirectional Behavior section starting on p. 55. Bidi only matters to you if your Farsi data is sometimes mixed with material in other writing systems. Is there anybody who can give me a quickstart to get rid of confusing charsets, unicode, rendering etc.? I know I have to put more time into this issue and I am prepared for this. But a little success would really motivate me. Yes. You can prepare test data files for convenient import into Oracle in any Farsi-capable word processor or spreadsheet program that reads and writes ISO-8859-6 and UTF-8. You don't have to wait until you have the fonts and keyboard layouts installed on your Linux system. You can also get Oracle to generate Farsi output files to be viewed on a different system. best regards Darya Said-Akbari -- Edward Cherlin Generalist "A knot!" exclaimed Alice. "Oh, do let me help to undo it." Alice in Wonderland
Re: Bangla(Bengali) letter Missing
Now I came to the conclusion that there is a way to represent khando-ta in the Standard, and that is quite satisfactory. However, some indications are confusing. So I am writing my understanding:
Ta + Virama + ZWNJ = ta with explicit virama
Ta + Virama + consonant = Conjunct (ta + consonant)
Ta + Virama = Khando-ta (when it occurs finally)
Ta + Virama + ZWJ = Khando-ta (explicit half-consonant)
This was my suggestion:
[Ta] [virama] - [khando-ta] (when final)
[Ta] [virama] [ZWNJ] - [khando-ta]
[Ta] [virama] [] - [appropriate conjunct form]
[Ta] [Virama] [ZWJ] - [Ta Virama]
The difference is only ZWNJ and ZWJ after virama. I think you should try the guidelines of the Unicode 3.0 standard. My opinion follows the guideline. Since all Indic languages are derived from Sanskrit, I think the guideline for Devanagari is not entirely useless for Bangla. of the 'Bengali Script' and *not* the 'Bangladeshi language'. Assami and monipuri writers *do* make the distinction, but they have the luxury of being able to use Assami Va (U+09F1) as well as Ba (U+09AC) to produce the two forms shown in my gif. Speakers of Bangla make the distinction of the two forms depending only on context. e.g. svaamii is spelt sbami and pronounced shami and not sbami by Bangladeshis, whilst in Assamiya it is spelt svami (with U+09F1) not sbami. My question is, should speakers of Bangla be restricted to be able to form only the common forms, or should there be a way for us to produce both forms shown? Or perhaps do you expect us (Bangladeshis) to use the assami Va? In the grammar book by Munir chowdhury, Mofazzal Haider Ch. and Ibrahim Khalil (Text book for S.S.C), vba is omitted from the bangla character set. It is confusing for common people. So I think the decision is wise. Regards, Zia
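For readers who want to experiment with the sequences discussed above, here is a small Python sketch that builds them as code point strings. The labels follow the first listing in the message ("my understanding"); the two listings in the thread disagree about which joiner produces which form, and what actually appears on screen depends entirely on the font and rendering engine, so the labels are assumptions for illustration only.

```python
# Sketch only: the four Ta + Virama sequences from the discussion,
# written out as code points. Labels are taken from one side of the
# thread and are not a statement of what any renderer will display.
import unicodedata

TA = "\u09A4"      # BENGALI LETTER TA
BA = "\u09AC"      # BENGALI LETTER BA (used here as the sample consonant)
VIRAMA = "\u09CD"  # BENGALI SIGN VIRAMA
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"     # ZERO WIDTH JOINER

SEQUENCES = {
    "ta with explicit virama": TA + VIRAMA + ZWNJ,
    "conjunct (ta + consonant)": TA + VIRAMA + BA,
    "khando-ta (final)": TA + VIRAMA,
    "khando-ta (explicit half-consonant)": TA + VIRAMA + ZWJ,
}


def describe(sequence: str) -> str:
    """Render a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(ch):04X}" for ch in sequence)


for label, sequence in SEQUENCES.items():
    names = ", ".join(unicodedata.name(ch) for ch in sequence)
    print(f"{label}: {describe(sequence)} ({names})")
```

Pasting the generated strings into an application with a Bengali-capable font is a quick way to see which interpretation a given renderer follows.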
Re: Display Persian characters under Linux
On Sat, 29 Jul 2000, Edward Cherlin wrote: The current Arabic script font standard (which covers Farsi) is ISO 8859-6. You may see ECMA-114 or ASMO 449 mentioned in some places. ISO 8859-6 does not cover Farsi. There are at least six missing important letters, PEH, TCHEH, JEH, KEHEH, GAF, and FARSI YEH. Yes, you need to find Farsi fonts encoded in either 8859-6 or Unicode. Any search engine can find a number of sites for you. There's no Farsi font encoded in 8859-6 because of the stated reason. Bidi only matters to you if your Farsi data is sometimes mixed with material in other writing systems. Or you have alphabetic data mixed with numerical data. Bidi is needed even for pure Persian text, since the numbers are written left-to-right. --roozbeh
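The point about the six missing letters is easy to check mechanically. The sketch below uses Python's standard `iso8859_6` codec and the `unicodedata` module; the helper name and the dictionary are just for illustration, and the code points are the standard Unicode ones for the letters named in the message.

```python
# Sketch only: confirm that the Persian letters named above have no
# representation in ISO 8859-6, and show the bidi classes involved.
import unicodedata

PERSIAN_LETTERS = {
    "PEH": "\u067E",
    "TCHEH": "\u0686",
    "JEH": "\u0698",
    "KEHEH": "\u06A9",
    "GAF": "\u06AF",
    "FARSI YEH": "\u06CC",
}


def encodable_in_8859_6(ch: str) -> bool:
    """True if the character can be encoded with the ISO 8859-6 codec."""
    try:
        ch.encode("iso8859_6")
        return True
    except UnicodeEncodeError:
        return False


missing = [name for name, ch in PERSIAN_LETTERS.items()
           if not encodable_in_8859_6(ch)]
print("Not in ISO 8859-6:", ", ".join(missing))

# Letters are strongly right-to-left (class AL), while the Persian
# digits carry a numeric class, which is why even pure Persian text
# with numbers needs the bidirectional algorithm.
print(unicodedata.bidirectional("\u067E"))  # ARABIC LETTER PEH
print(unicodedata.bidirectional("\u06F1"))  # EXTENDED ARABIC-INDIC DIGIT ONE
```

A letter shared with Arabic, such as BEH (U+0628), encodes without error, so the codec itself is working; only the Persian additions fail.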
Re: FW: Oracle and Surrogate Pairs
At 2:41 AM -0800 7/25/2000, [EMAIL PROTECTED] wrote: Hi all, I have been developing/converting software to support multiple languages, especially Japanese, Korean and later on French etc. I have increased all the required fields by a factor of 3. Keeping "True, but within a year or so, there *will* be surrogates assigned in Unicode. " in mind, what do you think the value of this "factor" should be? I think you should upgrade to software that handles variable length fields, or to programming practices that make use of them. Thanks Regards, Samir Mehrotra, i-flex Solutions Limited, a CitiCorp venture capital company at SEI-CMM level 5. [EMAIL PROTECTED] -Original Message- From: John H. Jenkins [SMTP:[EMAIL PROTECTED]] Sent: Tuesday, July 25, 2000 8:12 AM To: Unicode List Subject: Re: Oracle and Surrogate Pairs Does the field in question need to support literally any possible character in Unicode 3.0 and beyond (since 3.0 does not have any surrogates assigned!)? True, but within a year or so, there *will* be surrogates assigned in Unicode. One cannot be premature in supporting them at this point. = John H. Jenkins [EMAIL PROTECTED] [EMAIL PROTECTED] http://www.blueneptune.com/~tseng -- Edward Cherlin Generalist "A knot!" exclaimed Alice. "Oh, do let me help to undo it." Alice in Wonderland
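To put numbers on the "factor" question, the following Python sketch measures how much storage individual characters need in standard UTF-8 and UTF-16. It assumes those two encodings as defined by Unicode; how a particular database counts or stores characters may differ, and the function name is made up for the example.

```python
# Sketch only: per-character storage cost in standard UTF-8 and UTF-16.
# Database-specific encodings and length semantics are not modelled.

def storage_cost(text: str) -> dict:
    """Count code points, UTF-8 bytes, and UTF-16 code units in text."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        "utf16_units": len(text.encode("utf-16-be")) // 2,
    }


SAMPLES = {
    "ASCII letter": "A",
    "Latin-1 letter": "\u00E9",
    "CJK ideograph (BMP)": "\u65E5",
    "supplementary character": "\U00020000",
}

for label, ch in SAMPLES.items():
    print(label, storage_cost(ch))
```

Under these assumptions a factor of 3 covers any BMP character in UTF-8, but a character encoded with a surrogate pair needs 4 bytes in UTF-8 and two 16-bit units in UTF-16, which is one reason variable-length fields are the more robust choice.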
Re: Display problems
The best I've got so far is: Java To allow Java applets (and/or programs) to draw Unicode characters in the fonts you have available, you will need to hand-edit the font.properties files that the Java runtime uses. Since you may have several Java runtimes installed on your machine (for different browsers, development environments, etc), you will need to search for all the files containing the letters "font.properties". These files may also be in .jar files, depending on your configuration. Once you have found the files, Sun provides instructions [http://java.sun.com/products/jdk/1.1/docs/guide/intl/fontprop.html] on how to edit them to add new fonts. (This may take some patience: the description is not exactly straightforward.) Edward Cherlin wrote: At 6:41 AM -0800 7/25/2000, Mark Davis wrote: The issue of how to get Java to display Unicode characters comes up periodically. Since the instructions on how to do it are fairly arcane (hand-editing the font.properties files), I'd like to see someone compose a short description to add to the material we have on http://www.unicode.org/help/display_problems.html. If there is already a good description in a persistent page, we could just provide a link to it. Any volunteers? Mark Since I want to display Unicode in Java, and haven't yet found out the gory details, I volunteer to accept all available information, figure out what it means, test it on Windows, Mac, and Linux (Red Hat and Yellow Dog), and write it up properly. If someone else comes up with a complete procedure, I still volunteer to test it, and to edit it with illustrations (screen shots, diagrams, and code). I can produce multilingual documents in MSWord, FrameMaker, PDF, HTML, and TeX. Does anybody know whether this process could be reduced to an installation utility written in Java? -- Edward Cherlin Generalist "A knot!" exclaimed Alice. "Oh, do let me help to undo it." Alice in Wonderland
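The first step described above, finding every font.properties file on a machine, including copies packed inside .jar files, can be automated. The Python sketch below is one way to do the search; it only locates the files and does not edit them. The function name and the `archive!member` notation for entries inside a jar are conventions invented for this example.

```python
# Sketch only: locate font.properties files (and locale variants such
# as font.properties.ja) under a directory tree, looking inside .jar
# archives as well. Nothing is modified.
import os
import zipfile


def find_font_properties(root: str) -> list:
    """Return sorted paths of font.properties files found under root.

    Entries inside a jar are reported as "<jar path>!<member name>".
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if name.startswith("font.properties"):
                hits.append(path)
            elif name.endswith(".jar"):
                try:
                    with zipfile.ZipFile(path) as jar:
                        for member in jar.namelist():
                            # Jar members always use "/" as separator.
                            base = member.rsplit("/", 1)[-1]
                            if base.startswith("font.properties"):
                                hits.append(f"{path}!{member}")
                except (zipfile.BadZipFile, OSError):
                    # Unreadable or corrupt archive: skip it.
                    continue
    return sorted(hits)
```

Calling `find_font_properties` on the directory where a Java runtime or browser is installed gives the list of files that would need hand-editing according to Sun's instructions.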
Re: Display problems
Hi, Some time ago I wrote up how-to-edit font.properties file for Communicator users. In it I illustrate how to set up multilingual display with Cyberbit font. The file has been available from the International Users page under Communicator's HELP menu. I hope to update this document slightly in the near future. http://home.netscape.com/eng/intl/jdkfontinfo.html - Kat Mark Davis wrote: The best I've got so far is: Java To allow Java applets (and/or programs) to draw Unicode characters in the fonts you have available, you will need to hand-edit the font.properties files that the Java runtime uses. Since you may have several Java runtimes installed on your machine (for different browsers, development environments, etc), you will need to search for all the files containing the letters "font.properties". These files may also be in .jar files, depending on your configuration. Once you have found the files, Sun provides instructions [http://java.sun.com/products/jdk/1.1/docs/guide/intl/fontprop.html] on how to edit them to add new fonts. (This may take some patience: the description is not exactly straightforward.) Edward Cherlin wrote: At 6:41 AM -0800 7/25/2000, Mark Davis wrote: The issue of how to get Java to display Unicode characters comes up periodically. Since the instructions on how to do it are fairly arcane (hand-editing the font.properties files), I'd like to see someone compose a short description to add to the material we have on http://www.unicode.org/help/display_problems.html. If there is already a good description in a persistent page, we could just provide a link to it. Any volunteers? Mark Since I want to display Unicode in Java, and haven't yet found out the gory details, I volunteer to accept all available information, figure out what it means, test it on Windows, Mac, and Linux (Red Hat and Yellow Dog), and write it up properly. 
If someone else comes up with a complete procedure, I still volunteer to test it, and to edit it with illustrations (screen shots, diagrams, and code). I can produce multilingual documents in MSWord, FrameMaker, PDF, HTML, and TeX. Does anybody know whether this process could be reduced to an installation utility written in Java? -- Edward Cherlin Generalist "A knot!" exclaimed Alice. "Oh, do let me help to undo it." Alice in Wonderland -- Katsuhiko Momoi Netscape International Client Products Group [EMAIL PROTECTED] What is expressed here is my personal opinion and does not reflect official Netscape views.
RE: FW: Oracle and Surrogate Pairs
Or you can use SQL Server without upgrading your existing databases. SQL Server 7.0 supports two character sets - CHAR for legacy character set and NCHAR for Unicode (UCS-2). SQL is surrogate safe. You can store surrogates in the NCHAR data column. Michael -Original Message- From: Edward Cherlin [mailto:[EMAIL PROTECTED]] Sent: Saturday, July 29, 2000 2:33 PM To: Unicode List Subject: Re: FW: Oracle and Surrogate Pairs At 2:41 AM -0800 7/25/2000, [EMAIL PROTECTED] wrote: Hi all, I have been developing/converting software to support multiple languages, especially Japanese, Korean and later on French etc. I have increased all the required fields by a factor of 3. Keeping "True, but within a year or so, there *will* be surrogates assigned in Unicode. " in mind, what do you think the value of this "factor" should be? I think you should upgrade to software that handles variable length fields, or to programming practices that make use of them. Thanks Regards, Samir Mehrotra, i-flex Solutions Limited, a CitiCorp venture capital company at SEI-CMM level 5. [EMAIL PROTECTED] -Original Message- From: John H. Jenkins [SMTP:[EMAIL PROTECTED]] Sent: Tuesday, July 25, 2000 8:12 AM To: Unicode List Subject: Re: Oracle and Surrogate Pairs Does the field in question need to support literally any possible character in Unicode 3.0 and beyond (since 3.0 does not have any surrogates assigned!)? True, but within a year or so, there *will* be surrogates assigned in Unicode. One cannot be premature in supporting them at this point. = John H. Jenkins [EMAIL PROTECTED] [EMAIL PROTECTED] http://www.blueneptune.com/~tseng -- Edward Cherlin Generalist "A knot!" exclaimed Alice. "Oh, do let me help to undo it." Alice in Wonderland
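What "storing surrogates" in a 16-bit column amounts to is the UTF-16 arithmetic below: one supplementary code point becomes two 16-bit code units. This Python sketch shows only that arithmetic; it says nothing about how SQL Server itself stores, compares, or measures such data, and the function names are made up for the example.

```python
# Sketch only: the UTF-16 surrogate pair calculation for code points
# beyond the BMP (U+10000 through U+10FFFF).

def to_surrogates(cp: int) -> tuple:
    """Split a supplementary code point into (high, low) surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError(f"U+{cp:04X} is not a supplementary code point")
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return high, low


def from_surrogates(high: int, low: int) -> int:
    """Combine a (high, low) surrogate pair back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogates(0x20000)
print(f"U+20000 -> {high:04X} {low:04X}")
print(f"back to U+{from_surrogates(high, low):04X}")
```

A column that is merely "surrogate safe" passes both units through unchanged; anything that counts characters, truncates, or sorts still has to know that the two units belong together.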
Re: Bangla(Bengali) letter Missing
My question is, should speakers of Bangla be restricted to be able to form only the common forms, or should there be a way for us to produce both forms shown? Or perhaps do you expect us (Bangladeshis) to use the assami Va? In the grammar book by Munir chowdhury, Mofazzal Haider Ch. and Ibrahim Khalil (Text book for S.S.C), vba is omitted from the bangla character set. It is confusing for common people. So I think the decision is wise. Regards, Zia Okay, I think by this you mean that for bangla (language) only the common forms should be displayed? If that is so, it only leaves the problem of how to display Unicode encoded text, where the language is not known. I mean should 'da virama ba' be displayed as 'dba' or 'dva'? A bangladeshi would expect 'dva' (the common form) but an assamiya reader would expect 'dba' or 'da virama ba' to be displayed ('dva' being displayed only for da virama va) Hmm. I think then, that the default behaviour of an application should be to render the common forms, and only display the other forms, or with a visible virama, when the language is known not to be bangla. (But I'm biased, I doubt that someone from assam would agree with that.) (Another solution would be to insist that assami writers always insert a ZWNJ / ZWJ before their Ba's so that we don't confuse them with Va's ;-)) That only leaves the problem of how to deal with Assamiya text quoted within Bangla text. Oh well, I think i'll leave that for another day. In any case, I think distinguishing between Ba and Va is only going to be a problem in rare circumstances. Best regards Abdul