Re: Chess symbols, ZWJ, Opentype and holly type ornaments.
William Overington WOverington at ngo dot globalnet dot co dot uk wrote: In view of the fact that some people are unwilling to let my ideas be discussed in this forum upon their academic merit but simply use an ad hominem attack almost every time I post (before many people can have the chance to sit down and, if they wish, have a serious read of my ideas), when it seems that their objection is really about the Unicode Consortium having included the word published in section 13.5 of chapter 13 of the Unicode specification, and they seem to angrily refute things which I have not said, I think that it would be best for me not to post details of my research in this forum. I don't recall seeing any ad hominem attacks. I do recall seeing a lot of criticism (attacks, if you will) of some of your ideas based on their merit, none based on the fact that they are William Overington's ideas. Also, as I have tried to convey before, many of us lead relatively busy lives and receive a lot of e-mail, and don't always have time to read through a post of 2,000 words or more. When it gets that long, it's better to post it on your Web site and send us an announcement. In Section 13.5, my objection was to the word promoted. It apparently gives the impression that characters can use the PUA as a stepping stone to full Unicode status, when in fact all characters are considered for inclusion in Unicode without regard to PUA implementations. Someone else may have had a problem with the word published. There also seems to be the problem of the great tidal wave that everybody is expected to be using the very latest equipment. I'm using a 166 MHz Pentium classic with 24 MB of RAM and Windows 95. So it's obvious you're not talking about me here. My understanding was that this forum was a place to ask questions of end users of the Unicode system. I have done that. In this thread I have asked interesting scientific questions. 
And gotten back some answers you didn't want to hear, namely that the ideas don't fall within the intended scope of Unicode and have already been (or can easily be) solved using other technologies or mechanisms. Ad hominem attacks have prevented those questions being discussed properly, possibly because some people may be too embarrassed to respond to the scientific questions when an atmosphere of ad hominem attack prevails. My understanding is that academic freedom is about being able to hold unpopular ideas without personal disadvantage. James Kass, for one, has responded positively to some of your inquiries. Academic freedom means everyone has a chance to listen to the ideas of others. Nobody has infringed upon your right to post your essays. (This is a moderated list, and Sarasvati could have withheld your postings if it were appropriate to do so, but it is not and she has not.) Academic freedom also means people have a right to object or criticize the ideas of others, or at least point out where the ideas are flawed. Ask anyone in the scientific or research community whether new ideas are always met with universal approval. I feel that the fact that I am trying to use the Unicode specification as it exists rather than on some nudge nudge wink wink understanding of how some people feel that it should be interpreted is at the root of the problem. See, I think it's the other way around. I just reread Section 13.5 and I don't see anything about the character-glyph model or other policies of Unicode being suspended for the PUA. The issue of not encoding additional ligatures isn't a secret; it's been published in several documents available on the Web. I have provided a pointer to one already; I can provide more if you like. The potentially interesting question of whether an OpenType fount may be programmed to produce a two colour display has not been discussed. 
Such a discussion could have either established that it could be done, or that it could not be done in which case perhaps some extension to OpenType could be produced for the future which could have that facility. If so, how would that facility best be produced? This is how progress is achieved. It is indeed a potentially interesting question. I'm not a font expert, so I have stayed out of that discussion. Note, however, that not all printers support color, so there would need to be an appropriate fallback mechanism for rendering the information that would have been displayed in a second color. -Doug Ewell Fullerton, California
Creative IDN Opportunities
Couldn't help but cringe at the last line of this press release. Can anyone give me a quick update on the status of IDN standards work? It's been a while since I checked it out... WEB Addresses Take On New Look As Multilingual Symbol-Based Capability Launched http://www.globalization.com/newsIndex.cfm?newsID=news57700062020024 Neteka Inc. today announced its International Domain Name Server (DNS) has gone live on the dot.BZ Registry, extending the Internet's accepted character set for Uniform Resource Locator (URL) addresses from English letters and numbers, to the thousands of symbols and letters of the world's many alphabets. Web addresses supporting the world's languages give the non-English Internet community a chance to break free from the confines of the English alphabet. In addition, the expanded character set gives rise to untapped creative opportunities to combine symbols and letters for unique Web addresses.
RE: Google and Unicode
Is this strictly true? I think there are cases where the results are sent back ISO-8859-1. It would not surprise me if there was a more complex algorithm which tried to determine the requesting browser. I love UTF-8 but some older browsers do not tolerate it very well. Sigh. -Paul -Original Message- From: Roozbeh Pournader [mailto:[EMAIL PROTECTED]] Sent: Thursday, June 20, 2002 9:40 AM To: Unicode List Subject: Google and Unicode Did anyone notice that Google now uses Unicode (UTF-8) in displaying the search results? No more of that 'This page contains Russian characters that..' :) roozbeh
Ethiopic chromatic fonts (Was: Chess symbols, ZWJ, Opentype and holly type ornaments.)
At 10:32 6/20/2002, [EMAIL PROTECTED] wrote: The potentially interesting question of whether an OpenType fount may be programmed to produce a two colour display has not been discussed. Did you raise that question? That's something I might have noticed if it had been stated in a two-line post. But I didn't notice it, and I'm guessing it's because it was in the midst of some 500 lines. The question interests me because a while ago now I was amusing myself with the idea of being able to do this kind of thing in Graphite (another smart-font technology akin to OpenType) in order to emulate dual-coloured Ethiopic manuscripts -- specifically, I was thinking of a way to handle the paragraph marks that are done with four black dots interspersed with five red dots. Can an OpenType (or Graphite) font be programmed to do this? No. Should the technology be revised to accommodate this? There's not a clear enough case to warrant the increased complexity, I think. (But it would be possible to implement, and it's still amusing to imagine doing so.) Peter, what do you see as the options for achieving something like this? Some aspects of colour use in Ethiopic manuscripts can clearly be handled using markup (e.g. the small, raised red glyphs providing chant instructions, for which I'm wondering if existing ruby notation solutions might be easily adapted). The paragraph marker is a tricky problem, though. William will be thrilled to hear that one option would be to use a PUA codepoint for a zero-width combining character to represent the red dots, and include a variant glyph for the Ethiopic paragraph mark that contains only the black dots (or vice versa). If the user input the standard paragraph mark U+1368, the default glyph would be used, but if the user then input the PUA codepoint for the combining dots, we would contextually change the default paragraph mark for the variant with space for the combining dots. 
In VOLT notation: uni1368 - uni1368.black | uni1368.red This puts us in a position where we can use markup to colour the different elements. Unfortunately, this has to be done using a PUA codepoint, because markup applies to characters, not glyphs. By making the formation contextually dependent on the PUA character, we ensure that a correct paragraph sign is always shown if the PUA codepoint is not used or if a font is used that does not contain a glyph for the PUA codepoint (in which case you would get the paragraph sign followed by a .notdef glyph -- not pretty, but unambiguous). John Hudson Tiro Typeworks www.tiro.com Vancouver, BC [EMAIL PROTECTED] Language must belong to the Other -- to my linguistic community as a whole -- before it can belong to me, so that the self comes to its unique articulation in a medium which is always at some level indifferent to it. - Terry Eagleton
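The contextual substitution described above can be sketched in a few lines of Python. This only illustrates the logic and is not VOLT or OpenType code: the PUA code point U+E000 for the combining red dots is an arbitrary assumption (the post does not name one), and the function shape_paragraph_marks is hypothetical. The glyph names follow the post.

```python
# Sketch of the contextual rule: U+1368 followed by a (hypothetical) PUA
# "combining red dots" character selects the black-only variant glyph.
PARAGRAPH_MARK = "\u1368"   # ETHIOPIC PARAGRAPH SEPARATOR
RED_DOTS_PUA = "\uE000"     # assumption: any agreed PUA code point would do


def shape_paragraph_marks(text, font_has_pua_glyph=True):
    """Map characters to glyph names, applying the contextual substitution."""
    glyphs = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch == PARAGRAPH_MARK and nxt == RED_DOTS_PUA and font_has_pua_glyph:
            # Variant pair: each glyph maps back to its own character,
            # so markup can colour the two parts separately.
            glyphs += ["uni1368.black", "uni1368.red"]
            i += 2
        elif ch == RED_DOTS_PUA and not font_has_pua_glyph:
            # Font lacks the PUA glyph: default mark stays, PUA shows as .notdef.
            glyphs.append(".notdef")
            i += 1
        else:
            glyphs.append("uni%04X" % ord(ch))
            i += 1
    return glyphs
```

Without the PUA character the default glyph uni1368 is always chosen, which is the fallback behaviour the post relies on.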
Re: Hexadecimal characters.
-- On Thu, 20 Jun 2002 9:42:12 Frank da Cruz wrote: At 03:03 AM 6/20/02 -0400, Tom Finch wrote: I wish to propose sixteen consecutive digits for the purpose of displaying hexadecimal values. [...] Has this been considered? I seem to recall that it has. The problem is, they're just new copies of old characters. An A used in hexadecimal notation is just an A. Besides the problem with normalization, you have the problem with all look-alike characters - people won't use them consistently. Even if this got adopted, 99% of time you looked at hexadecimal numbers, they would be in plain old ASCII, so you don't really gain anything but confusion. It's a no-go. The proposal that was rejected is this one: ftp://kermit.columbia.edu/kermit/ucsterminal/hex.txt - Frank This is for full byte representation, which requires of course 256 rather than 16 characters. I looked at the code chart and there are many 16 character sequences empty. Oh and hi John Cowan, recognize me from hexadecimal lojban? _ Communicate with others using Lycos Mail for FREE! http://mail.lycos.com/
Re: Hexadecimal characters.
At 03:03 AM 6/20/02 -0400, Tom Finch wrote: I wish to propose sixteen consecutive digits for the purpose of displaying hexadecimal values. [...] Has this been considered? [David Starner] I seem to recall that it has. The problem is, they're just new copies of old characters. An A used in hexadecimal notation is just an A. Besides the problem with normalization, you have the problem with all look-alike characters - people won't use them consistently. Even if this got adopted, 99% of time you looked at hexadecimal numbers, they would be in plain old ASCII, so you don't really gain anything but confusion. It's a no-go. [Tom Finch] I looked at the code chart and there are many 16 character sequences empty. That is true enough -- but the more appropriate place to look is the BMP roadmap: http://www.unicode.org/roadmaps/bmp-3-6.html where you can see that many of those empty columns are already accounted for by roadmapped allocations for living minority scripts. The BMP is rather tight now for allocation, and it is unlikely that the committees are going to look kindly on miscellaneous collections of dubious stuff for encoding there. Of course there is plenty of space in Plane 1 for just about everything, but... That said, David Starner has this one right. There really is no good reason to create clones of 0..9, A..F to represent hexadecimal digits. The existing characters do that just fine, and represent an overwhelming legacy data representation precedent that any proposal such as Tom Finch's would have to cope with. Introducing new characters for these would just introduce confusion and would be unlikely to be implemented in any useful way. --Ken
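The normalization and look-alike problems mentioned above can already be observed with the fullwidth forms, which are existing compatibility clones of the ASCII letters and digits. The Python sketch below uses them purely as a stand-in for hypothetical hex-digit clones; it does not model the actual proposal.

```python
import unicodedata

ascii_hex = "20A7"
fullwidth_hex = "\uFF12\uFF10\uFF21\uFF17"   # FULLWIDTH "2", "0", "A", "7"

# The two strings look alike but are different code point sequences,
# so a plain comparison or search treats them as unrelated.
same_before = (ascii_hex == fullwidth_hex)

# NFKC folds the compatibility clones back onto the ASCII originals.
folded = unicodedata.normalize("NFKC", fullwidth_hex)
same_after = (ascii_hex == folded)

print(same_before, same_after)
```

Any software that compares, searches, or parses hexadecimal text would need a folding step like this for every new set of clones.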
RE: Google and Unicode
On Thu, 20 Jun 2002, Paul Deuter wrote: Is this strictly true? I think there are cases where the results are sent back ISO-8859-1. Might be. However, try a search on japanese with IE. The first page is, quite definitely, UTF-8. I'd say it's about time one of the major search engines went over to Unicode, big time, and what we have here seems like a big Go Girl! for Google. Sampo Syreeni, aka decoy - mailto:[EMAIL PROTECTED], tel:+358-50-5756111 student/math+cs/helsinki university, http://www.iki.fi/~decoy/front openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
Re: Ethiopic chromatic fonts (Was: Chess symbols, ZWJ, Opentype and hollytype ornaments.)
On 06/20/2002 01:34:34 PM John Hudson wrote: The question interests me because a while ago now I was amusing myself with the idea of being able to do this kind of thing in Graphite (another smart-font technology akin to OpenType) in order to emulate dual-coloured Ethiopic manuscripts -- specifically, I was thinking of a way to handle the paragraph marks that are done with four black dots interspersed with five red dots. Can an OpenType (or Graphite) font be programmed to do this? No. Should the technology be revised to accommodate this? There's not a clear enough case to warrant the increased complexity, I think. (But it would be possible to implement, and it's still amusing to imagine doing so.) Peter, what do you see as the options for achieving something like this? If by the options you mean what kind of mechanism it would take, then it would amount to a substitution rule along the lines (using some pseudo notation) of gU1368 gU1368_a [colour = red] gU1368_b [colour = black] or gU1368 gU1368_a [colour = alt] gU1368_b [colour = default] If you mean what likelihood I see of anyone implementing support for something like this, that would be slim. Some aspects of colour use in Ethiopic manuscripts can clearly be handled using markup (e.g. the small, raised red glyphs providing chant instructions, for which I'm wondering if existing ruby notation solutions might be easily adapted). The paragraph marker is a tricky problem, though. Indeed, since it requires a character's shape to be divided into two differently-coloured glyphs. It probably wouldn't be hard to implement a smart-font system that could support switching between default and alternate colours (where the actual colour choices are specified somewhere else in the system), with control handled using some kind of feature or attribute system (e.g. in the Graphite Description Language, this could easily be expressed as a glyph attribute), but I don't really expect anybody to implement this. 
- Peter --- Peter Constable Non-Roman Script Initiative, SIL International 7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA Tel: +1 972 708 7485 E-mail: [EMAIL PROTECTED]
Re: Creative IDN Opportunities
I think it is somehow tied into the whole ICANN political mess. I haven't sorted it out yet but I am interested if anyone else has... Barry Caplan www.i18n.com At 02:13 PM 6/20/2002 -0400, Suzanne M. Topping wrote: Couldn't help but cringe at the last line of this press release. Can anyone give me a quick update on the status of IDN standards work? It's been a while since I checked it out...
RE: Rotated Glyphs
Thanks to Jungshik Shin for the solution to the problem and to Marco for his comments; a corrected page reflecting both is up: http://www.columbia.edu/kermit/glass.html (if you looked at it before, you'll need to refresh the images). I also added a bit more about BIDI, using the Hebrew University ALEPH library system as an illustration. - Frank
Re: Hexadecimal characters.
-- On Thu, 20 Jun 2002 12:56:25 Kenneth Whistler wrote: At 03:03 AM 6/20/02 -0400, Tom Finch wrote: I wish to propose sixteen consecutive digits for the purpose of displaying hexadecimal values. [...] Has this been considered? [David Starner] I seem to recall that it has. The problem is, they're just new copies of old characters. An A used in hexadecimal notation is just an A. Besides the problem with normalization, you have the problem with all look-alike characters - people won't use them consistently. Even if this got adopted, 99% of time you looked at hexadecimal numbers, they would be in plain old ASCII, so you don't really gain anything but confusion. It's a no-go. [Tom Finch] I looked at the code chart and there are many 16 character sequences empty. That is true enough -- but the more appropriate place to look is the BMP roadmap: http://www.unicode.org/roadmaps/bmp-3-6.html where you can see that many of those empty columns are already accounted for by roadmapped allocations for living minority scripts. The BMP is rather tight now for allocation, and it is unlikely that the committees are going to look kindly on miscellaneous collections of dubious stuff for encoding there. Of course there is plenty of space in Plane 1 for just about everything, but... That said, David Starner has this one right. There really is no good reason to create clones of 0..9, A..F to represent hexadecimal digits. The existing characters do that just fine, and represent an overwhelming legacy data representation precedent that any proposal such as Tom Finch's would have to cope with. Introducing new characters for these would just introduce confusion and would be unlikely to be implemented in any useful way. --Ken Hmm, so representing Devanagari digits is more important than hexadecimal, which is used almost more than decimal on the web? I know inertia is a law of the universe, but this is ridiculous. Hexadecimal is very important and deserves to be in Plane 0. 
I see a good spot in misc technical (23D--oh look hexadecimal again).
Re: Chess symbols, ZWJ, Opentype and holly type ornaments.
In view of the fact that some people are unwilling to let my ideas be discussed in this forum upon their academic merit but simply use an ad hominem attack almost every time I post (before many people can have the chance to sit down and, if they wish, have a serious read of my ideas), when it seems that their objection is really about the Unicode Consortium having included the word published in section 13.5 of chapter 13 of the Unicode specification, ... Speaking here as an editor of the Unicode Standard, I do not find the word published in section 13.5 of the book. Perhaps William was thinking of the subheader Promotion of Private-Use Characters. Since -- despite the explicit text that follows in that section -- some people seem to be getting the wrong idea about private-use character assignments as a step towards standardization, it is quite likely that the editorial committee will be rewriting that section for Unicode 4.0, to provide further clarification for users. I feel that the fact that I am trying to use the Unicode specification as it exists rather than on some nudge nudge wink wink understanding of how some people feel that it should be interpreted is at the root of the problem. If parts of the Unicode Standard are unclear and are leading to misinterpretations or incompatible interpretations of how characters should be used -- including private-use agreements for private-use characters, then airing those issues is certainly germane to this discussion list. I think what a number of people on the list have been hinting -- or openly stating -- is that prolixity is not a virtue on an email list when trying to convey one's ideas. --Ken
Re: MySQL 3.23.51 and unicode
On Thu, Jun 20, 2002 at 04:29:33PM +0700, Art - Arthit Suriyawongkul wrote: but as long as it can store ASCII-encoded text, it can also store UTF-8-encoded text (just store, not understand). If that's true, then with some additional work (in the user program layer, not MySQL), can we support it? Yes. LiveJournal.com, for example, uses UTF-8 and MySQL quite successfully. -- Evan Martin [EMAIL PROTECTED] http://neugierig.org
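The idea that a byte-transparent store can hold UTF-8 without understanding it can be illustrated without MySQL at all. In the Python sketch below a plain dict stands in for the database column (an assumption made only for illustration, not a MySQL API), and the encoding and decoding happen in the application layer.

```python
# A dict stands in for a byte-transparent column; it is not a MySQL API.
store = {}


def save(key, text):
    store[key] = text.encode("utf-8")    # the application encodes


def load(key):
    return store[key].decode("utf-8")    # the application decodes


save("greeting", "สวัสดี")
round_trip = load("greeting")

# What the store cannot do for you: it sees bytes, not characters.
char_count = len(round_trip)
byte_count = len(store["greeting"])
print(char_count, byte_count)
```

Column length limits, sorting, and case-insensitive comparison performed by the store would still operate on bytes, which is the "just store, not understand" caveat.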
Re: Hexadecimal characters.
Tom Finch wrote: Hexadecimal is very important and deserves to be in Plane 0. Hmmm, well.. In this case, importance has nothing to do with it, and going off on a comparison of the importance of Devanagari as opposed to Hex will not prevail in this discussion. Hex is already representable with characters in plane zero, as people have been pointing out. There are the ten digits 0-9 and the letters A-F. People have explained this, and why your proposal would be confusing and not cater to legacy data. What is the problem you are trying to solve by encoding 16 things in a row? And how would people convert their legacy data forward while avoiding confusion, etc? And how do you propose to deal with multiple representation problems? Legacy data? Rick
Re: Hexadecimal characters.
Tom Finch scripsit: Hmm, so representing Devanagari digits is more important than hexadecimal, which is used almost more than decimal on the web? I know inertia is a law of the universe, but this is ridiculous. Hexadecimal is very important and deserves to be in Plane 0. I see a good spot in misc technical (23D--oh look hexadecimal again). Clueless still, I see. (Yes, that's ad hominem.) -- John Cowan [EMAIL PROTECTED] http://www.reutershealth.com I amar prestar aen, han mathon ne nen,http://www.ccil.org/~cowan han mathon ne chae, a han noston ne 'wilith. --Galadriel, _LOTR:FOTR_
Re: Hexadecimal characters.
At 17:45 -0400 2002-06-20, Tom Finch wrote: Hmm, so representing Devanagari digits is more important than hexadecimal, which is used almost more than decimal on the web? I know inertia is a law of the universe, but this is ridiculous. Hexadecimal is very important and deserves to be in Plane 0. I see a good spot in misc technical (23D--oh look hexadecimal again). Hexadecimal is represented in Plane 0, with 0123456789AaBbCcDdEeFf. I don't get it. -- Michael Everson *** Everson Typography *** http://www.evertype.com
Re: Hexadecimal characters.
Tom Finch said: Hmm, so representing Devanagari digits is more important than hexadecimal, which is used almost more than decimal on the web? I think you may be misconstruing the purpose of the character encoding here. If I want to represent the hexadecimal numbers 0x60DB 0x618A in email or in HTML hexadecimal NCR's or whatever, guess what -- I can use ASCII (or Latin-1 or Unicode) characters: 6 0 D B 6 1 8 A -- and that is what everyone does. It is also what is *required* by the HTML and XML standards for the representation of hexadecimal NCR's on the web, by the way. If I want to represent Devanagari digits, on the other hand, I don't have an ASCII representation to hand -- those *require* separate encoding, since Devanagari characters are not the same as Latin characters or Arabic digits. So Devanagari digits were encoded in Unicode. Simple. I know inertia is a law of the universe, but this is ridiculous. Hexadecimal is very important and deserves to be in Plane 0. Umm. It *is* in Plane 0: U+0030..U+0039, U+0041..U+0046 (and U+0061..U+0066), to be exact. I see a good spot in misc technical (23D--oh look hexadecimal again). Nobody has any quarrel with the notion that hexadecimal notation is very important in computer science -- and vital for character encoding discussions. The issue is whether we need any separate characters to represent hexadecimal digits, when we already have the digits everybody has been using for decades encoded. --Ken
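The two facts above (the hexadecimal digits already sit in the ASCII range, and hexadecimal NCRs are spelled with them) can be checked with the Python standard library. This is a small sketch; the sample string simply reuses the two values from the message.

```python
import html
import string

# Every character Python treats as a hexadecimal digit, with its code point.
for ch in string.hexdigits:
    print(f"U+{ord(ch):04X} {ch}")

# A hexadecimal numeric character reference is written with those same
# ASCII characters, and a standard parser resolves it to the code point.
ncr = "&#x60DB;&#x618A;"
decoded = html.unescape(ncr)
code_points = [f"{ord(c):04X}" for c in decoded]
print(code_points)
```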
Re: Hexadecimal characters.
At 15:03 -0700 2002-06-20, Kenneth Whistler wrote: In any case, I wonder if Tom could explain what is special about hexadecimal expressed with 0..9, A..F, as opposed to any other base numeric system that might be in widespread use, (duodecimal and vigesimal come to mind) which would lead to a particular argument that it should be encoded with a distinct set of characters. Actually I heard once that there were duodecimal digits (hm for the six-fingered I guess) for 11 and 12 out there somewhere that somebody was mulling over proposing. -- Michael Everson *** Everson Typography *** http://www.evertype.com
Re: Hexadecimal characters.
Kenneth Whistler scripsit: In any case, I wonder if Tom could explain what is special about hexadecimal expressed with 0..9, A..F, as opposed to any other base numeric system that might be in widespread use, (duodecimal and vigesimal come to mind) which would lead to a particular argument that it should be encoded with a distinct set of characters. Oh, just wait. The next step will be to propose separate characters for binary 1 and 0. He will also propose these novel glyphs for hex digits A-F: *** * * * *** *** * * * * * * * *** *** *** * *** * * * * * * * * * * *** * A fanatic is someone who can't change his mind and won't change the subject. --Winston Churchill -- John Cowan [EMAIL PROTECTED] http://www.reutershealth.com I amar prestar aen, han mathon ne nen,http://www.ccil.org/~cowan han mathon ne chae, a han noston ne 'wilith. --Galadriel, _LOTR:FOTR_
Re: Hexadecimal characters.
For the scripts which have their own digits, are there conventions to write hexadecimal numbers with those digits? If I read a Devanagari text book, will I see 20A7, or २०?७ (where ? stands for whatever is used for A)? Thanks, Eric.
Re: Hexadecimal characters.
-- On Thu, 20 Jun 2002 15:14:13 Rick McGowan wrote: Tom Finch wrote: Hexadecimal is very important and deserves to be in Plane 0. Hmmm, well.. In this case, importance has nothing to do with it, and going off on a comparison of the importance of Devanagari as opposed to Hex will not prevail in this discussion. Agreed. Hex is already representable with characters in plane zero, as people have been pointing out. There are the ten digits 0-9 and the letters A-F. People have explained this, and why your proposal would be confusing and not cater to legacy data. What is the problem you are trying to solve by encoding 16 things in a row? And how would people convert their legacy data forward while avoiding confusion, etc? And how do you propose to deal with multiple representation problems? Legacy data? Rick Another example might be superscript 4 (2074h). You can already say 2^4 for sixteen, but the new character allows you to say it more easily. Further, there already are multiple representation problems--A-F as well as a-f. The problem being solved is properly supporting the base sixteen system. Multiple representation is a problem, as you say yourself, and the current way cannot avoid this (A-F or a-f). Also using letters to stand for numeric data can lead to confusion--this is BAD. Legacy data will be dealt with by accepting the old system as long as necessary.
Re: Hexadecimal characters.
-- On Thu, 20 Jun 2002 15:14:13 Rick McGowan wrote: What is the problem you are trying to solve by encoding 16 things in a row? To answer this, it is better to have 16 in a row as it makes computation of a numeric value from the character value easier and more straightforward. A different proposal could be made for just 6 extra digits for hexadecimal if it is determined that space is really at a premium. But then you lose the unambiguousness of sixteen separate characters.
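For comparison, here is roughly what the per-digit computation looks like in each case, as a Python sketch. The contiguous block is purely hypothetical: the base U+E000 (a private-use code point) is an assumption chosen only so the code runs, and both function names are made up for this illustration.

```python
HYPOTHETICAL_BASE = 0xE000   # assumption: stand-in for 16 consecutive hex digits


def value_contiguous(ch):
    """Sixteen in a row: the value is a single subtraction plus a range check."""
    v = ord(ch) - HYPOTHETICAL_BASE
    if not 0 <= v <= 15:
        raise ValueError(ch)
    return v


def value_ascii(ch):
    """Existing 0-9, A-F, a-f: three ranges instead of one."""
    if "0" <= ch <= "9":
        return ord(ch) - ord("0")
    if "A" <= ch <= "F":
        return ord(ch) - ord("A") + 10
    if "a" <= ch <= "f":
        return ord(ch) - ord("a") + 10
    raise ValueError(ch)
```

The difference amounts to two extra range tests per digit.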
Re: Chess symbols, ZWJ, Opentype and holly type ornaments.
On Thursday, June 20, 2002, at 03:25 PM, Kenneth Whistler wrote: I think what a number of people on the list have been hinting -- or openly stating -- is that prolixity is not a virtue on an email list when trying to convey one's ideas. IOW, brevity's wit's soul. == John H. Jenkins [EMAIL PROTECTED] [EMAIL PROTECTED] http://homepage.mac.com/jenkins/
Re: Hexadecimal characters.
-- On Thu, 20 Jun 2002 16:00:15 Eric Muller wrote: For the scripts which have their own digits, are there conventions to write hexadecimal numbers with those digits? If I read a Devanagari text book, will I see 20A7, or २०?७ (where ? stands for whatever is used for A)? Thanks, Eric. These would remain decimal. Some scripts have numerals beyond 0-9 already, but those are part of the tradition associated with it. For instance Mayan or Sumerian would have numerals beyond ten (if they were to be included in Unicode).
Re: Hexadecimal characters.
Hmmm. I was hoping this discussion would go away after the initial round of reasons why it won't happen. The problem being solved is properly supporting the base sixteen system. It is already properly supported. In fact, Unicode contains far more than a mere 16 entities sufficient for hexadecimal. With Unicode, any number base up to about 94,000 can easily be represented. It should satisfy even the hippest numerologists. Also using letters to stand for numeric data can lead to confusion-- this is BAD. Perhaps, but it's like spelling versus spelling reform. The current representation is already engrained in the computer-literate culture, and you'll be hard-pressed to change it, especially without a compelling story. And this story isn't very compelling. it is better to have 16 in a row as it makes computation of a numeric value from the character value easier and more straightforward. So what? This isn't rocket science. The hex-binary conversion problem is so trivial that every beginning CS student has probably had a homework assignment to solve it. Big deal. Five lines of library code. ...But then you lose the unambiguousness of sixteen separate characters. We already have done: because everybody already uses 0-9, a-f and A-F, and there's tons of software that already deals with this and mounds of existing data. The problem won't be solved, it will be augmented with yet another representation. The proposal is a non-starter. There isn't even a glimmer of serious interest here, and it's rather pointless to continue this discussion. Rick
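To give a sense of scale for the "five lines of library code" remark, here is one way such a routine might look in Python. It is only a sketch (the name parse_hex is made up here); in practice the built-in int(s, 16) already does this.

```python
def parse_hex(s):
    """Convert a string of 0-9, A-F, a-f to an integer."""
    value = 0
    for ch in s:
        # str.index raises ValueError for anything that is not a hex digit.
        value = value * 16 + "0123456789abcdef".index(ch.lower())
    return value


print(parse_hex("60DB") == int("60DB", 16))
```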
Re: Hexadecimal characters.
From: Tom Finch [EMAIL PROTECTED] Rick McGowan wrote: What is the problem you are trying to solve by encoding 16 things in a row? To answer this, it is better to have 16 in a row as it makes computation of a numeric value from the character value easier and more straightforward. A different proposal could be made for just 6 extra digits for hexadecimal if it is determined that space is really at a premium. But then you lose the unambiguousness of sixteen separate characters. Or, since the proposal has already been rejected you can just write the conversion code using the existing numbers/letters and call it a day? :-) MichKa Michael Kaplan Trigeminal Software, Inc. -- http://www.trigeminal.com/
Re: Chess symbols, ZWJ, Opentype and holly type ornaments.
IOW, brevity's wit's soul. Well-spoken, dear Polonius. But better to Adorn the soul of wit so briefly put to us. My liege, and madam, to expostulate What majesty should be, what duty is, Why day is day, night is night, and time is time. Were nothing but to waste night, day, and time. Therefore, since brevity is the soul of wit, And tediousness the limbs and outward flourishes, I will be brief. Your noble son is mad. --the Bard
Re: Hexadecimal characters.
At 10:12 AM 6/20/02 +0100, Avarangal wrote: Long time ago I raised this matter in this forum. Hope you will go through filling the proposal forms, etc... In addition to your reasons, hex code code points need to be established but not the character shapes. All languages may not need to use the 0-9 and a-f shapes. But need to use the same code points. How do you enter these? Right now, if I want to write a hexadecimal number, I write using my keyboard. In fact *all* keyboards have a way for me to enter 0-9 and A-F or a-f. *No* existing keyboard has mappings to these novel characters. Some systems let me give longer key sequences to designate a Unicode character, but that's not very convenient. Therefore, the most likely consequence of adding 16 characters would be that they are *not* being universally used, and possibly only used by a few people or a few applications. The proposal fails to address this migration issue, as well as a number of other issues others have mentioned, such as the issue of confusability caused by similarity to existing characters, compatibility mappings etc. etc. In sum, the downsides of taking such an action would have to be outweighed by other benefits, which themselves need to be clearly established (and not just taken for granted) before the proposal could reasonably be considered. I personally doubt that benefits can be shown to outweigh the substantial negative impacts such a proposal would have, and think it very unlikely that they would be so compelling as to warrant encoding on the BMP. But this is all speculation until somebody actually writes the whole thing down - and not just a sketch. A./
Re: Google and Unicode
Google also seems to sniff locales; for instance, it feeds me Thai-language pages when I use a Thai locale in my browser. Might be. However, try a search on japanese with IE. The first page is, quite definitely, UTF-8. I'd say it's about time one of the major search engines went over to Unicode, big time, and what we have here seems like a big Go Girl! for Google.
Think twice before submitting a proposal
This is especially in reference to those hex digits. Here is what I have to say about the matter: To discourage frivolous character proposals, the Unicode Consortium requires you to come up with these (I am not sure if this is all the requirements, there might be more):
1. You gotta fill out a form. Probably not that hard, except that some ISO standards are referenced, and you might have some research work to do to find out just what these standards are. If you don't know where to access these documents, you're stuck.
2. You gotta have a font including the proposed characters. I do not know what type of font, but No Font = No Complete Proposal.
3. You gotta give them instances of the characters in actual use. I think you have to send them something to prove this, I'm not sure what.
SO... If you are the only guy who uses these hex digits, and you don't have a font containing the digits, you basically have less chance proposing these characters to Unicode than proposing marriage to Anna Kournikova. However, if you find some previously undiscovered (and illiterate) jungle tribe, and they count in base 16, and you introduce literacy to the tribe, and give them YOUR digits, and if they ACCEPT and USE your digits, then it's pretty safe to say they're in. But you still need that font. Weird. I am told that it is good that we have non-literals for digits so we can do higher math, that without them higher math would be all but impossible, and now we have come full circle to using letters for digits!! If Thinkit wants to see something interesting, he should see how numbers are expressed in Braille. And the real reason I did not propose those two digits to Unicode was that I did not know how much WORK it would be. Sheesh, where CAN you get the relevant ISO documents anyway?