Re: Unicode 4.0 and ISO10646-2003

2004-04-22 Thread Asmus Freytag
At 10:44 AM 4/22/2004, Frank Yung-Fong Tang wrote: I saw the announcement of publishing ISO/IEC 10646: 2003, Information technology -- Universal Multiple-Octet Coded Character Set (UCS) From http://anubis.dkuug.dk/jtc1/sc2/open/02n3729.htm I expect there are no differences from Unicode 4.0,

Re: U+0140

2004-04-20 Thread Asmus Freytag
At 03:49 PM 4/19/2004, Kenneth Whistler wrote: The Unicode Standard is not prescriptive about rendering, beyond the basics required to simply ensure correct mapping of textual content into streams of characters. If one font vendor wants to have a raised glyph for the MIDDLE DOT and another wants

Re: Downloading UCD 4.0.0

2004-04-19 Thread Asmus Freytag
At 08:42 AM 4/19/2004, Theo Veenker wrote: Hi, Until now I always downloaded the latest version of the UCD and worked with that. Now I want to download the UCD files for 4.0.0 again. I know it is all in http://www.unicode.org/Public/4.0-Update/, but in http://www.unicode.org/ucd/ I read this:

Re: U+0140 Catalan middle-dot

2004-04-17 Thread Asmus Freytag
At 06:16 PM 4/15/2004, Philippe Verdy wrote: The other reason is that the middle-dot, being a punctuation, would be likely to have extra spacing on both sides, which would make it inappropriate for rendering Catalan words. Also such punctuation would probably forbid kerning of the middle-dot

Re: U+0140

2004-04-17 Thread Asmus Freytag
At 01:54 PM 4/17/2004, Michael Everson wrote: The samples Asmus sent suggest to me that a school of typographers made a set of bad decisions, even if they were really famous and got paid lots of money and their fonts are widely shipped! In all charity, Michael, your opinion seems to be mainly

Re: U+0140

2004-04-16 Thread Asmus Freytag
At 12:26 AM 4/16/2004, Alexandros Diamantidis wrote: * Philippe Verdy [EMAIL PROTECTED] [2004-04-16 01:22]: U+0387 GREEK ANO TELEIA wrong form? it's a small square, and is the greek semicolon, and is then separating words. U+0387 is canonically equivalent to U+00B7. About its shape, whether
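The canonical equivalence mentioned in this excerpt can be checked with a short sketch. It uses Python's standard `unicodedata` module; the helper name `canonically_equivalent` is an illustrative assumption, not something from the thread.

```python
import unicodedata

ANO_TELEIA = "\u0387"  # GREEK ANO TELEIA
MIDDLE_DOT = "\u00B7"  # MIDDLE DOT


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# U+0387 has a singleton canonical decomposition to U+00B7, so any
# normalization form replaces it with the middle dot.
print(canonically_equivalent(ANO_TELEIA, MIDDLE_DOT))
print(unicodedata.normalize("NFC", ANO_TELEIA) == MIDDLE_DOT)
```

Because normalization folds one code point into the other, a distinct glyph for U+0387 would not survive any process that normalizes the text.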

Re: U+0140

2004-04-15 Thread Asmus Freytag
At 03:31 PM 4/15/2004, Peter Kirk wrote: [PA] Isn't this the one that should be used in dictionaries ? See http://www.unicode.org/unicode/standard/reports/tr14/tr14-6.html Why are you guys citing the 1999 (!) version of this TR? It's 2004, Unicode 4.0.1 has been published and we are up to

RE: Newbie questions: 1) Surrogates in WinXP? 2) Unicode in PostScript?

2004-04-08 Thread Asmus Freytag
At 10:49 PM 4/7/2004, Peter Constable wrote: , and the length it reports is the number of code units, not the number of characters or graphemes in the string. True; that is documented. However, that's very common; many APIs relating to UTF-8 would report the number of bytes, not the number of
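The difference between code units, bytes, and characters discussed here can be made concrete with a small sketch. The function name `lengths` and the sample string are assumptions chosen for illustration; the thread itself concerns Windows APIs, which this Python example only approximates.

```python
def lengths(text: str) -> dict:
    """Report the length of the same text in three different units."""
    return {
        "code_points": len(text),
        # Each UTF-16 code unit is two bytes; supplementary characters
        # take two code units (a surrogate pair).
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        "utf8_bytes": len(text.encode("utf-8")),
    }


# 'a' (ASCII), 'é' (U+00E9), and U+10380 UGARITIC LETTER ALPA (outside the BMP)
sample = "a\u00E9\U00010380"
print(lengths(sample))
```

None of these three numbers is a count of graphemes; user-perceived characters need segmentation rules on top of the code points.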

Re: CJK U+3ADA and U+66F6

2004-04-08 Thread Asmus Freytag
James, this is the kind of thing that you should report via our error reporting form. Here on the open list, it's liable to get lost (no-one owns excerpting issues from this forum). The contact form can be found on our home page under contact us. A./ At 12:03 PM 4/8/2004, [EMAIL PROTECTED]

RE: Doulos SIL (was: French typographic thin space)

2004-04-07 Thread Asmus Freytag
At 01:29 PM 4/7/2004, Richard Cook wrote: On Wed, 7 Apr 2004, Peter Constable wrote: They were encoded that way some while before they were accepted in Unicode. Also, until Unicode 4.1 is published, there is a possibility that codepoints may change. I see. I assumed the codepoint assignments

Re: names of the chars?

2004-04-07 Thread Asmus Freytag
At 09:11 PM 4/7/2004, Tobias Stamm wrote: Greetings to all standartisers! I'm new here so forgive me my stupidness. I just have one little question to which I didn't found the answer in the whole homepage: What is the standard of the characters names? You are looking for the character naming

Re: New Currency sign in Unicode

2004-04-02 Thread Asmus Freytag
At 09:37 AM 4/1/2004, you wrote: [EMAIL PROTECTED] wrote: The cedi sign should be of the size of the dollar sign ($) or the euro sign (EUR). The site you provided is using the cent sign. The Ghana web site uses a better version of the cent sign for the cedi. See

Re: Line Break class of U+FE51 Small Ideographic Comma

2004-04-02 Thread Asmus Freytag
At 12:34 PM 4/2/2004, Kenneth Whistler wrote: But by all means, make the proposal to the UTC if fixing this inconsistency seems important and there is some argument to be made for it. I might add that 'merely' fixing an apparent inconsistency cannot be enough of a rationale for making this change.

Re: New Currency sign in Unicode

2004-04-02 Thread Asmus Freytag
At 11:44 AM 4/2/2004, Kenneth Whistler wrote: Rick said: We also learn from the bird stamps web site cited later that the government of Ghana is extremely inconsistent about their images and usage of their own currency sign. I.e., they apparently don't have a standard for it. So, I don't

Re[2]: Fixed Width Spaces (was: Printing and Displaying DependentVowels)

2004-04-02 Thread Asmus Freytag
Somebody wrote: non-breaking and non-stretching are presentational properties, not semantic ones. They don't change the meaning of the space: it's still just a space, not a hyphen or the letter g. They don't affect non-visual media; we don't break lines in spoken speech. Louis XVI is

Re: Printing and Displaying Dependent Vowels

2004-03-30 Thread Asmus Freytag
At 04:28 PM 3/29/2004, Kenneth Whistler wrote: I will say again as I have said before - but the above (and what I snipped) is extra evidence for it - that what is broke ... is the rule that the isolated (generally spacing) form of a combining mark should be formed by SPACE or NBSP followed by

Re: Printing and Displaying Dependent Vowels

2004-03-29 Thread Asmus Freytag
At 12:19 PM 3/29/2004, Ernest Cline wrote: [Original Message] From: Peter Kirk [EMAIL PROTECTED] On 29/03/2004 06:56, John Cowan wrote: Peter Kirk scripsit: Using NBSP rather than SPACE has several advantages, and has long been specified in Unicode, although not widely implemented. It

Re: [OT] proscribed words... (was:What is the principle?)

2004-03-28 Thread Asmus Freytag
At 09:46 AM 3/28/2004, Philippe Verdy wrote: It was like the US telecommunications act which set fines for transmitting its set of proscribed words including in programs that were designed to filter the words out of text. Does this list really exist? Seriously, there's no word that can be

Re: What is the principle?

2004-03-28 Thread Asmus Freytag
At 07:53 PM 3/27/2004, [EMAIL PROTECTED] wrote: What does the collation standard say to do with unassigned codepoints anyhow? Variation selectors are not unassigned characters. But, they might be regarded as such by any application predating VSs. And, likewise for any VS sequences approved

Fwd: Re: [OT] proscribed words... (was:What is the principle?)

2004-03-28 Thread Asmus Freytag
Date: Sun, 28 Mar 2004 15:26:12 -0800 To: Philippe Verdy [EMAIL PROTECTED] From: Asmus Freytag [EMAIL PROTECTED] Subject: Re: [OT] proscribed words... (was:What is the principle?) At 02:46 PM 3/28/2004, Philippe Verdy wrote: From: Asmus Freytag [EMAIL PROTECTED] Does this list really exist

Re: What is the principle?

2004-03-27 Thread Asmus Freytag
At 05:32 PM 3/26/2004, John Cowan wrote: Asmus Freytag scripsit: Another drawback is the fact that too few systems handle any variation selectors gracefully. Well, at least they should be easy to handle in fonts: add the selectors to the font as invisible characters, and then create mandatory

RE: RTL - LTR

2004-03-27 Thread Asmus Freytag
John, Look at UTR#20 and at UAX#9 (the 4.0.1 version is due out shortly). Taken together they suggest that the non-plain text way is to keep such text direction overrides out of band (i.e. in markup) and to apply the bidi algorithm segment by segment in a marked up file. If you export to plain

Re: What is the principle?

2004-03-27 Thread Asmus Freytag
At 05:47 PM 3/27/2004, John Cowan wrote: Asmus Freytag scripsit: This can be tricky esp. when the user doesn't know a VS is present and the font used to view the data doesn't have an alternate glyph. Well, surely it'll turn into the black blob, or the reversed question mark, or whatever

Re: What is the principle?

2004-03-26 Thread Asmus Freytag
At 01:33 PM 3/26/2004, Jim Allan wrote: Arcane Jill posted: (A) A proposed character will be rejected if its glyph is identical in appearance to that of an extant glyph, regardless of its semantic meaning, Obviously not. Unicode encodes characters not glyphs. That particular glyphs of one

Re: What is the principle?

2004-03-26 Thread Asmus Freytag
At 02:03 PM 3/26/2004, Ernest Cline wrote: [Original Message] From: Asmus Freytag [EMAIL PROTECTED] There are millions of fonts out there with variations of the zodiac. Font shifting would seem to be the correct answer to implement glyph variations there. (A wrong font will ruin the mood

Re: [Slightly OT] Font examiner program/utility?

2004-03-24 Thread Asmus Freytag
At 12:14 PM 3/24/2004, Mike Ayers wrote: Does anyone know of a good program for examining fonts? What I am looking for is some way to, given a font, find out both the glyphs contained and the code points (bad term?) at which those glyphs are situated. Ability to read hinting/shaping

Re: vertical direction control

2004-03-24 Thread Asmus Freytag
At 02:58 PM 3/24/2004, Thomas Kuehne wrote: Am 2004-03-23 20:23 schrieb Asmus Freytag: I don't think I know of a scenario where it is critical for a resource limited device to display the kinds of texts you list below. Reading the font data and processing it into a display representation poses

Re: vertical direction control

2004-03-23 Thread Asmus Freytag
At 02:55 PM 3/23/2004, Thomas Kuehne wrote: Is somebody already using a PUA assignment for vertical text direction controls? from http://www.unicode.org/faq/bidi.html#1 [...] the choice of vertical layout is usually treated as a formatting style; therefore, the Unicode Standard does not define

Re: vertical direction control

2004-03-23 Thread Asmus Freytag
At 06:09 PM 3/23/2004, Thomas Kuehne wrote: Am Mittwoch 24 März 2004 00:09 schrieb Asmus Freytag: Is somebody already using a PUA assignment for vertical text direction controls? I think the idea was that these don't belong in plain text. Markup languages have had vertical layout controls

Re: tick, tick box, cross, cross box

2004-03-21 Thread Asmus Freytag
At 02:26 AM 3/21/2004, Philippe Verdy wrote: Look into Wingdings and Dingbats code blocks, ** Philippe, this is a new low in sloppy inaccuracy even for you. WingDings is a name of a series of fonts shipped by MS. They contain many symbols not found in Unicode. There is no

RE: help needed with adding new character

2004-03-19 Thread Asmus Freytag
At 09:48 AM 3/19/2004, Mike Ayers wrote: In less than half an hour of looking at printed samples, I've been able to locate two instances of the symbol replacing the letter A in a word. If that's not use in text, I don't know what is. That is use in text as a glyph variant, which is,

Re: What's the BMP being saved for?

2004-03-19 Thread Asmus Freytag
At 07:13 AM 3/19/2004, Marion Gunn wrote: Ar 15:33 + 2004/03/18, scríobh Arcane Jill: This probably is going to sound like a really dumb question, but ... Is the BMP being saved for something? ... Arcane Jill There are never any dumb questions, Jill, only dumb answers. And some of the latter

Re: help needed with adding new character

2004-03-18 Thread Asmus Freytag
At 10:34 AM 3/18/2004, Michael Everson wrote: I think the ANARCHY SIGN is perfectly good, but I think it is a glyph variant of an existing character. Just as 2117 and 24C5 are similar, but unrelated the *ANARCHY SIGN is not the same as 24B6. A./

Re: help needed with adding new character

2004-03-18 Thread Asmus Freytag
At 08:27 AM 3/18/2004, Jon Wilson wrote: Hi folks, I believe there is a character missing from the standard. I would like to apply to have it included, but I am a typography and Unicode novice, so I require some assistance with the application process. The character in question is a variant of

RE: help needed with adding new character

2004-03-18 Thread Asmus Freytag
At 04:18 PM 3/18/2004, Mike Ayers wrote: Note that in *that* rendition of the anarchy symbol, the crossbar on the A does *not* touch the circle on either edge, but it may just be that the renderer was a little short of black paint. I find

Re: OT? Languages with letters that always take diacriticals

2004-03-16 Thread Asmus Freytag
At 12:07 PM 3/16/2004, Antoine Leca wrote: (For example, old German in Fraktur typeface has been decided to be just different font, but the same Latin letters as we know today) Like U+017F? ;-) A little known fact is that the long s cannot be implemented as your typical context-based glyph

RE: New Public Review Issue

2004-02-24 Thread Asmus Freytag
At 12:11 PM 2/24/2004, Kenneth Whistler wrote: Think of variation selection as being more appropriate when what we are talking about are for most purposes simply *free variants* for presentation -- either is equally correct to most people under most circumstances -- but where for particular

Re: Character allocation

2004-02-10 Thread Asmus Freytag
At 01:20 PM 2/7/2004, Laurentiu Iancu wrote: I noticed that a new combining character, U+1DC2 Combining Snake Below, has been added. Just out of curiosity, what were the reasons why this character was allocated at this code point rather than, for instance, U+0358, the last free position in the

Re: Public Review Issue #27

2004-02-09 Thread Asmus Freytag
At 04:12 PM 2/9/2004, Kenneth Whistler wrote: That leaves item A. And it is mostly a matter of determining what is the best mechanism for getting people to know how they should spell the metegs with the minimum of confusion. Putting something in the Unicode Standard might be appropriate, or there

Re: Mongolian Unicoding (was Re: Cuneiform Free Variation Selectors)

2004-01-20 Thread Asmus Freytag
Just a few comments on Andrew's note: At 06:43 AM 1/19/2004, Andrew C. West wrote: An analogy for those not familiar with the Mongolian script is the much beloved long s, which is a positional glyph variant of the ordinary letter s for some languages at some periods of time. The long s does not

Re: Mongolian Unicoding (was Re: Cuneiform Free Variation Selectors)

2004-01-18 Thread Asmus Freytag
At 09:23 PM 1/18/2004, [EMAIL PROTECTED] wrote: Seriously, it's my understanding that implementation guidelines for Mongolian script and Unicode are still being worked out. You are correct. A group of experts is currently working out a definite description of how Mongolian should work. All the

Re: Long S in Germany (was: 0364 COMBINING LATIN SMALL LETTER E)

2004-01-08 Thread Asmus Freytag
At 04:08 PM 1/8/2004, D. Starner wrote: Otto Stolz [EMAIL PROTECTED] wrote: Gerd Schumacher wrote: The long s [...] has been abandoned from the Roman alphabet in Germany in the mid of the 19th century. You mean the 20th century, don't you? I have a facsimile reprint of the 1914 issue of

Re: Mathematical exist and forall in Unicode

2004-01-02 Thread Asmus Freytag
Another rule which isn't written into Unicode but I like (don't know if Everson and Whistler and others will), is the font clarity rule. Given a font minus one character, I should be able to predict what that character will look like. If I have a Sütterlin font or a Fraktur font, I know what

Re: Korean compression (was: Re: Ternary search trees for Unicode dictionaries)

2003-12-03 Thread Asmus Freytag
- Original Message - From: Frank Yung-Fong Tang [EMAIL PROTECTED] UTF-16: 6,634,430 bytes; UTF-8: 7,637,601 bytes; SCSU: 6,414,319 bytes; BOCU-1: 5,897,258 bytes; Legacy encoding (*): 5,477,432 bytes. (*) (KS C 5601, KS X 1001, or EUC-KR) What is the size
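The ordering of the sizes quoted above follows from how many bytes each encoding spends per Hangul syllable. A rough sketch, assuming a short sample word rather than the corpus measured in the thread; the function name `encoded_sizes` is hypothetical, and SCSU and BOCU-1 are left out because Python's standard library has no codec for them.

```python
def encoded_sizes(text: str) -> dict:
    """Return the byte length of the text in several encodings."""
    return {
        "UTF-16": len(text.encode("utf-16-le")),  # no BOM counted
        "UTF-8": len(text.encode("utf-8")),
        "EUC-KR": len(text.encode("euc-kr")),
    }


# Three precomposed Hangul syllables (U+D55C U+AD6D U+C5B4): two bytes each
# in UTF-16 and EUC-KR, three bytes each in UTF-8.
sample = "\uD55C\uAD6D\uC5B4"
print(encoded_sizes(sample))
```

Real text mixes Hangul with ASCII spaces, digits, and punctuation, which cost one byte in UTF-8 and EUC-KR but two in UTF-16; that would be consistent with the legacy figure above coming out smaller than the UTF-16 one.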

Re: BOM as WJ?

2003-11-21 Thread Asmus Freytag
At 05:52 AM 11/20/2003, Philippe Verdy wrote: We need a comprehensive new technical report that lists all the exceptions to the general category system, as these line-breaking or word-breaking or grapheme cluster breaking properties are orthogonal to the basic GC system and to the combining class

Re: BOM as WJ?

2003-11-21 Thread Asmus Freytag
At 05:44 AM 11/19/2003, Philippe Verdy wrote: However, a couple of paragraphs up, the definition for No-Break Space says: U+00A0 [No-Break Space] behaves like the following coded character sequence: U+FEFF [Zero Width No-Break Space] + U+0020 [Space] + U+FEFF [Zero Width No-Break Space].

Re: Unicode and Script Encoding Initiative in San Jose Mercury News

2003-10-28 Thread Asmus Freytag
At 09:35 PM 10/27/03 -0800, Doug Ewell wrote: That said, I can try to improve my use of real Unicode punctuation on these lists, if I have time to paste it in (since my keyboard doesn't support it). Please don't. I remember being told by someone a few years back that I should limit my use of

Re: [OT by now] Re: Traditional dollar sign

2003-10-27 Thread Asmus Freytag
At 09:30 PM 10/26/03 -0800, Doug Ewell wrote: I can't speak for the whole of the last two centuries, but certainly current American bills and coins do not use either symbol. The bills in common use say ONE DOLLAR, FIVE DOLLARS, TEN DOLLARS, and TWENTY DOLLARS; the coins say ONE CENT, FIVE

Re: U+0BA3, U+0BA9

2003-10-26 Thread Asmus Freytag
At 02:08 PM 10/25/03 -0700, Doug Ewell wrote: So, in effect the UNICODE character names attempt to be a unified transliteration scheme for all languages? Are these principles laid down somewhere or is this more informal? The Unicode character names attempt to be (a) unique and (b) reasonably

Re: Traditional dollar sign

2003-10-25 Thread Asmus Freytag
At 03:36 AM 10/26/03 +1100, Simon Butcher wrote: Just a quick question.. The description for U+0024 (DOLLAR SIGN) states that the glyph may contain one or two vertical bars. Is there a codepoint specifically for the traditional double-bar form, or any plan to include one in the future? No. I

Re: New contribution N2676

2003-10-25 Thread Asmus Freytag
At 05:51 PM 10/25/03 +0100, Raymond Mercier wrote: Among the new characters in N2676 there is 10186 G GREEK ARTABE SIGN This is one of the many signs found in papyri, such as those edited by Kenyon. This symbol represents apparently a measure of volume used for grain. It appears as a small

RE: Traditional dollar sign

2003-10-25 Thread Asmus Freytag
At 11:02 AM 10/26/03 +1100, Simon Butcher wrote: Hi! snip I was taught at school that the double-bar form was used when Australia switched to decimal currency in 1966, and that it was incorrect to write the single-bar form when referring to Australian dollars. It would be interesting if

RE: Backslash n [OT] was Line Separator and Paragraph Separator

2003-10-24 Thread Asmus Freytag
At 02:05 PM 10/24/03 +0100, Jill Ramonsky wrote: Here's a better idea. Let's just stick with the idea that ANY C0 or C1 control has no place being anywhere in a line of text, and so any sequence of one or more of them will be interpreted as a line-break! Sorted once and for all! I'm not sure

Re: PUA

2003-10-19 Thread Asmus Freytag
Why does this have to be in 'plain text'?? Plain text can be streams or strings. For streams, such a mechanism might make sense, if you could identify a compelling case that's not better handled by HTML, XML etc. For strings, embedding font names in front of characters just violates some

RE: Unicode Public Review Issues update (braille)

2003-10-16 Thread Asmus Freytag
I noticed that this message had not gotten a reply. At 05:07 PM 10/7/03 +0200, Kent Karlsson wrote: A question about the issues already open: What is the justification for proposing to make Braille Lo? Shortly before this came up as a Public Review Issue, I suggested that Braille characters

Re: Canonical equivalence in rendering: mandatory or recommended?

2003-10-16 Thread Asmus Freytag
At 02:26 AM 10/16/03 -0700, Peter Kirk wrote: You can never tell whether something is going to be a performance issue -- not just measurably slower, but actually affecting usability -- until you do some profiling. Guessing does no good. Well, did the people who wrote this in the standard do some

Re: Beyond 17 planes, was: Java char and Unicode 3.0+

2003-10-16 Thread Asmus Freytag
At 08:03 AM 10/16/03 -0700, Peter Kirk wrote: Or perhaps a way can be found to graciously retire UTF-16 in some distant future version of Unicode. That is likely to become viable long before the extra planes are needed. This discussion is a pure numbers game. Since no-one can define a hard

Re: Beyond 17 planes, was: Java char and Unicode 3.0+

2003-10-16 Thread Asmus Freytag
At 10:16 PM 10/16/03 +0200, Philippe Verdy wrote: Standards should always be designed with the idea of integrating well with other standards, without introducing contradictory objectives. This is what Americans call motherhood and apple pie - feel-good statements that are lofty but do nothing to

Re: Beyond 17 planes, was: Java char and Unicode 3.0+

2003-10-16 Thread Asmus Freytag
At 09:59 PM 10/16/03 +0200, Philippe Verdy wrote: We're not discussing about addition of characters standardized by joint efforts of Unicode's UTC and ISO's WG2, and I'm not expecting a lot of changes in this area. But about a more general scheme in which the Unicode/ISO10646 would become a part

Re: Canonical equivalence in rendering: mandatory or recommended?

2003-10-15 Thread Asmus Freytag
I'm going to answer some of Peter's points, leaving aside the interesting digressions into Java subclassing etc. that have developed later in the discussion. At 04:19 AM 10/15/03 -0700, Peter Kirk wrote: I note the following text from section 5.13, p.127, of the Unicode standard v.4:

Re: Canonical equivalence in rendering: mandatory or recommended?

2003-10-15 Thread Asmus Freytag
At 01:44 PM 10/15/03 -0700, Peter Kirk wrote: The guidelines are concerned with the average case: displaying the characters as *text*. [The use of the word 'must' in a guideline is always awkward, since that word has such a strong meaning in the normative part of the standard.] So, are you

Re: Variation sequences in code charts?

2003-10-12 Thread Asmus Freytag
At 03:07 PM 10/12/03 -0400, Laurentiu Iancu wrote: Hello, I was wondering if it would be a good idea to include variation sequences in the code charts, as notes below the base characters that have standardized variants. To me it would seem as a convenient place to reference them, but I realize

Re: Unicode Public Review Issues update: BRAILLE

2003-10-07 Thread Asmus Freytag
At 10:32 AM 10/7/03 +0530, [EMAIL PROTECTED] wrote: The only justification mentioned so far for changing Braille from So to Lo is to be able to use Braille in identifiers. I'm not sure why someone would want to use Braille in this way, for a start how would these identifiers be translated into

Re: Unicode Public Review Issues update

2003-10-06 Thread Asmus Freytag
At 10:29 AM 10/6/03 +0530, [EMAIL PROTECTED] wrote: The Unicode Technical Committee has posted some new issues for public review and comment. Details are on the following web page: http://www.unicode.org/review/ A question about the issues already open: What is the justification for

RE: Unicode Hebrew proposal: nomenclature..

2003-10-03 Thread Asmus Freytag
At 04:24 PM 10/3/03 -0700, Peter Constable wrote: HEBREW BABYLONIAN (SIMPLE) ATNACH I don't know that parens in names are acceptable. Also, might it make sense to hyphenate the first two words (the first word in the name of characters in the Hebrew block doesn't need to be HEBREW). Hence,

Re: FW: Web Form: Other Question: British pound sign - U+00A3

2003-10-01 Thread Asmus Freytag
At 10:50 AM 10/1/03 -0700, Magda Danish \(Unicode\) wrote: Our problem is the representation of the £ sign (British pound sign - U+00A3). When we type this character into our pages and then set the character encoding in our pages to Unicode (UTF-8) (either by setting it directly in the HTTP
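The excerpt is cut off before the symptom is described, so the following is only a sketch of one common failure in this situation: the page bytes and the declared encoding disagree. The variable names are illustrative.

```python
POUND = "\u00A3"  # POUND SIGN

# In UTF-8 the pound sign is the two-byte sequence C2 A3.
utf8_bytes = POUND.encode("utf-8")
print(utf8_bytes.hex())

# If those UTF-8 bytes are read as ISO-8859-1, a stray U+00C2 appears
# in front of the pound sign.
misread = utf8_bytes.decode("iso-8859-1")
print(misread == "\u00C2\u00A3")

# The reverse mismatch: a lone A3 byte from an ISO-8859-1 page is not
# well-formed UTF-8, so a strict decoder rejects it.
try:
    POUND.encode("iso-8859-1").decode("utf-8")
except UnicodeDecodeError:
    print("lone 0xA3 is not valid UTF-8")
```

Either way, the fix is to make the stored bytes and the declared character encoding agree.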

Re: Internal Representation of Unicode

2003-09-30 Thread Asmus Freytag
At 11:15 AM 9/30/03 -0400, John Cowan wrote: Isaac Newton spent an unconscionable amount of time, by our standards, messing about with astrology and numerology One of the aspects of character encoding and standardization that seems to have an unholy fascination for people is its numerical aspect.

Re: Possible error in the Unicode code charts

2003-09-30 Thread Asmus Freytag
At 10:34 PM 9/30/03 +0200, Stefan Persson wrote: The code charts tells that U+0308 Combining diæresis may be used for indicating the “double derivate.” I have only heard people calling this the “second derivate” - is “double derivate” a valid name for this? The

RE: About that alphabetician...

2003-09-25 Thread Asmus Freytag
At 05:41 PM 9/25/03 +0100, Richard Ishida wrote: Aha. Maybe, next time I try to explain it on the plane, I'll say something like: Unicode is a standard for enabling your computer to represent all the letters of all the alphabets of the world. Still not terribly accurate and deliberately vague

Re: Questions on ZWNBS - for line initial holam plus alef

2003-09-18 Thread Asmus Freytag
At 08:36 PM 9/18/03 -0400, Noah Levitt wrote: On Mon, Aug 11, 2003 at 12:57:11 -0700, Kenneth Whistler wrote: Kent asked: How should a freestanding double diacritic be encoded (for purposes of meta-discussions, and the like): SPACE, dbl diacritic or SPACE, dbl diacritic, SPACE? It

Re: QBCS

2003-09-02 Thread Asmus Freytag
At 08:26 PM 9/1/03 -0700, Doug Ewell wrote: Tex Texin tex at i18nguy dot com wrote: In most industry usages, MBCS refers to variable width encodings, not fixed width. Well, if variable-width encodings are referred to as both DBCS (see, for example, http://czyborra.com/charsets/cjk.html#dbcs)

RE: Missing Ugaritic Code Chart Link

2003-08-31 Thread Asmus Freytag
At 10:40 AM 8/31/03 -0400, Jim Allan wrote: The code chart menu page at http://www.unicode.org/charts/ does not contain a link to the Ugaritic characters However the Ugaritic chart exists and can be obtained by using the direct url http://www.unicode.org/charts/PDF/U10380.pdf. I've just

RE: Clones (was RE: Hexadecimal)

2003-08-19 Thread Asmus Freytag
Compatibility characters: The recommendations for compatibility characters are necessarily vague, since their use in legacy data (and legacy environments) is strongly dependent on what is (or was) customary in a given environment. If a process merely warehouses text data (or parses only a very

Re: Question about properties of some Code Points

2003-07-22 Thread Asmus Freytag
At 04:50 AM 7/22/03 +0200, Chris Jacobs wrote: Where am I going with this? Basically what I'm after is a clean/clear way to tell if quotation marks and parentheses (plus the other bracketing characters such as '[' or '{' are opening or closing punctuation. That's the real question here!
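One starting point for the question in this excerpt is the general category of each character. The sketch below uses Python's `unicodedata`; the function `bracket_role` is a hypothetical helper, and for quotation marks the result is only a heuristic, since Pi and Pf mean "initial" and "final" and their actual role depends on language conventions.

```python
import unicodedata

OPENING = {"Ps", "Pi"}  # open punctuation, initial quotation mark
CLOSING = {"Pe", "Pf"}  # close punctuation, final quotation mark


def bracket_role(ch: str):
    """Return 'open', 'close', or None based on the general category."""
    category = unicodedata.category(ch)
    if category in OPENING:
        return "open"
    if category in CLOSING:
        return "close"
    return None


for ch in "([{)]}\u201C\u201D\"":
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), bracket_role(ch))
```

The ASCII quotation mark comes out as neither, since its category is Po: the same character serves on both sides, and its role has to be inferred from context.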

Re: French group separators, was Re: The character for 10**24 i nJapanesenumbers (jo)

2003-07-10 Thread Asmus Freytag
At 05:09 PM 7/8/03 -0400, you wrote: Even if this were done, I wonder if most software would understand U+2007 or other non-breaking spaces as spaces for the purpose of full-justification or right-justification and hide them when they would otherwise appear at column right position. Such usage

RE: When is a character a currency sign?

2003-07-08 Thread Asmus Freytag
Unicode assigns the general category value, Sc, or Symbol, [c]urrency to all characters whose *primary* function is to act as a currency symbol. That excludes all characters that have other, unrelated uses, as long as those are not more specialized than the use as currency sign. That's an
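The general category of any character can be looked up directly. A minimal sketch with Python's `unicodedata`, in which the helper name `is_currency_symbol` is an assumption; the module reports the currency category under the abbreviation `Sc` (while `Sk` is the modifier-symbol category).

```python
import unicodedata


def is_currency_symbol(ch: str) -> bool:
    """True if the character's general category is Sc (Symbol, currency)."""
    return unicodedata.category(ch) == "Sc"


for ch in "$\u00A3\u20AC\u00A4^R":
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), is_currency_symbol(ch))
```

A letter such as "R", although it is used to write some currencies, stays a letter; that is the primary-function distinction the message describes.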

Updated: Unicode TR#20 Unicode in XML

2003-06-13 Thread Asmus Freytag
Consortium. The location on the Unicode website is http://www.unicode.org/reports/tr20/ Asmus Freytag Technical Vice President The Unicode Consortium

RE: Letterforms based on p

2003-06-09 Thread Asmus Freytag
At 10:00 AM 6/9/03 +0300, [EMAIL PROTECTED] wrote: It also appears along with other symbols used in the OED at http://dictionary.oed.com/public/help/Advanced/symbols.htm#mod1letter. (Again, not all these symbols are currently part of Unicode.) To state the obvious (and random email does not

Letterforms based on p

2003-06-06 Thread Asmus Freytag
I keep coming across a letterlike symbol based on the letter p. In going through my collections, I found it listed in a table of symbols in an excerpt from the US Government Printing office style manual from 1984. That symbol is named 'per' and looks like To me, the symbol looks like something

Re: Letterforms based on p

2003-06-06 Thread Asmus Freytag
At 01:34 AM 6/7/03 +0200, Philippe Verdy wrote: - Original Message - From: Asmus Freytag [EMAIL PROTECTED] Can anyone shed further light on this character? I assume this is a lower case form, does anyone care to confirm that? Isn't your per symbol it similar to the form variant

Re: RE: IPA Null Consonant

2003-06-03 Thread Asmus Freytag
At 05:21 PM 6/2/03 -0400, Jim Allan wrote: Rick McGowan asked: Can someone point more specifically to where it says anything about variation selectors? This pointer is to the table of contents/overview... Well, at http://www.usefulcontent.org/docs/manuals/REC-MathML2-20010221/chapter6.htm

Re: U+1D29

2003-05-30 Thread Asmus Freytag
At 02:56 PM 5/29/03 -0700, Kenneth Whistler wrote: António asked: I've just downloaded the PDF files with 4.0 additions (U40-*.pdf). One question: How is one supposed to tell apart the glyphs for U+1D29 and U+1D18?... Or one isn't?... (OK, this question is probably more suited to be posed to

Re: Annotation

2003-03-27 Thread Asmus Freytag
At 09:10 PM 3/26/03 +, Michael Everson wrote: At 10:48 -0800 26/03/2003, Kenneth Whistler wrote: And the reason why U+2030 PER MILLE SIGN is the right answer is that salinity is measured in grams per 1 kg of solution. The question :-) Yes, what is the question? Shall Ken add salinity

Re: Which cross is this?

2003-03-22 Thread Asmus Freytag
At 11:13 AM 3/22/03 +0100, Pim Blokland wrote: David Starner schreef: Criss-cross-lain. s. The alphabet; so called in consequence of its being formerly preceded in the horn-book by a ✠ to remind us of the cross of Christ; hence the term. Christ-Cross-line came at last to mean

RE: ANSI requires licence fees to use ISO language and country code?

2003-03-21 Thread Asmus Freytag
At 12:15 PM 3/21/03 -0800, Kenneth Whistler wrote: Let's try this one on for size: == However, if you load the list of ISO/IEC 10646 character codes in a commercial product, thus giving an added value to your product, we

Re: What are provisional properties

2003-03-12 Thread Asmus Freytag
At 11:55 AM 3/13/03 +0900, you wrote: Dear Unicoders, The unicode beta page mentions that a new concept of provisional properties has been introduced to 4.0. Unfortunately, no text is available that elaborates this. Is there any way to learn more about that prior to publication of TUS 4.0?

Re: CGJ and ZWJ (was Re: Currency symbols)

2003-03-10 Thread Asmus Freytag
At 06:47 PM 3/10/03 -0800, Kenneth Whistler wrote: Sorry. I mean such an invisible character that would keep those letters together, even when the inter-character space is expanded, like as if they were in the same lead type. (The same thing I'd use decompose U+0133 into i+THING+j.) What

Re: Looking for information on the UnicodeData file

2003-03-05 Thread Asmus Freytag
At 04:57 PM 3/5/03 +0100, Pim Blokland wrote: I apologize if this question has been asked before, but I'm relatively new at this. My question is: where can I find formal definitions of the terms used in the Character Name field of the UnicodeData.txt file? Most specifically, precise

Re: UTF-8 Error Handling (was: Re: Unicode 4.0 BETA available for review)

2003-03-03 Thread Asmus Freytag
delete illegal sequences, but substitute a replacement character for missing characters. Mark [EMAIL PROTECTED] IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193 (408) 256-3148 fax: (408) 256-0799 - Original Message - From: Asmus Freytag [EMAIL PROTECTED] To: Mark Davis [EMAIL PROTECTED

Re: UTF-8 Error Handling (was: Re: Unicode 4.0 BETA available for review)

2003-03-03 Thread Asmus Freytag
At 11:52 AM 3/3/03 -0800, Mark Davis wrote: Perhaps I wasn't clear; I agree with you on that. 1) It is conformant to skip or substitute text, with just a code at the end indicating that something of that sort was done. It's a subtle point, but can be put into your formulation: What I was after

Re: UTF-8 Error Handling (was: Re: Unicode 4.0 BETA available for review)

2003-03-03 Thread Asmus Freytag
At 01:07 PM 3/3/03 -0800, Mark Davis wrote: If your converter purports to produce any one of the Unicode encoding forms, then it cannot conformantly produce malformed Unicode as a result. If, of course, it does not purport to do that, it can do anything it wants to. Then, as long as the

Re: UTF-8 Error Handling (was: Re: Unicode 4.0 BETA available for review)

2003-03-02 Thread Asmus Freytag
At 07:21 AM 3/2/03 -0800, Mark Davis wrote: C12a When a process interprets a code unit sequence which purports to be in a Unicode character encoding form, it shall treat ill-formed code unit sequences as an error condition, and shall not interpret such sequences as
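The two behaviours discussed in this thread, treating an ill-formed sequence as an error versus substituting a replacement character, can be sketched with Python's built-in UTF-8 decoder. This illustrates the general idea only, not the wording of the conformance clause, and the sample bytes are an assumption chosen for the example.

```python
# C0 AF is an overlong (non-shortest-form) encoding of '/', which is
# ill-formed UTF-8 and must not be interpreted as '/'.
data = b"ab\xc0\xafcd"

# Strict decoding treats the ill-formed sequence as an error condition.
try:
    data.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError as exc:
    strict_failed = True
    print("rejected at byte offset", exc.start)

# Lenient decoding substitutes U+FFFD REPLACEMENT CHARACTER instead of
# silently dropping the bytes or interpreting them as a character.
lenient = data.decode("utf-8", errors="replace")
print(lenient.encode("unicode_escape").decode("ascii"))
```

Substituting rather than deleting keeps a visible marker where the damage was, so the two halves of the text cannot fuse into something that was never in the input.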

Re: UTF-8 (was:Unicode 4.0 BETA available for review)

2003-02-26 Thread Asmus Freytag
Can we retitle this thread? I'm getting actual replies to my posting of the BETA that I need to keep track of, and the run-on discussion of UTF-8 under this title is distracting. Thanks for your help, A./ At 04:56 PM 2/26/03 -0800, you wrote: Yung-Fong Tang wrote: I see a hole here. How about

Re: Currency symbols (was: Re: guarani sign)

2003-02-25 Thread Asmus Freytag
At 12:55 PM 2/25/03 +, Anto'nio Martins-Tuva'lkin wrote: Most (all?) of them are composable, either by means of letter + slash (OSLI) or by ZWJ (for things like Pta or Pts, if anything), Using ZWJ for such things is frowned upon. The ZWJ may be used to request a ligature between two

Re: [OpenType] PS glyph `phi' vs `phi1'

2003-02-21 Thread Asmus Freytag
At 07:26 AM 2/21/03 +0100, Werner LEMBERG wrote: Show me a widely used font which contains both U+03C6 and U+03D5. That was not the issue. The issue is when font wanted to add 03D5 that they would not just put the opposite glyph into 03D5. Or just end up having a duplicate glyph. Fonts that have

Re: [OpenType] PS glyph `phi' vs `phi1'

2003-02-20 Thread Asmus Freytag
At 12:08 AM 2/21/03 +0100, Werner LEMBERG wrote: Virtually all fonts I know of use the pre-3.0 glyph representations. Sigh. Any suggestion how to fix this mess? [...] To give just one very widely available example Times New Roman has always used the post 3.0 glyph. A./

Re: Wrong Charakter Categories (was: Hot Beverage font)

2003-02-19 Thread Asmus Freytag
.] Asmus Freytag Technical Vice President The Unicode Consortium

Re: Never say never

2003-02-12 Thread Asmus Freytag
At 08:13 AM 2/12/03 -0800, Doug Ewell wrote: Even then, you may be behind a time lag of more than one month because the UTC meetings minutes are posted a little late. So, to be fully aware, apart from becoming a member, you should also attend UTC meetings. I would imagine that issues like

Re: Plane 14 Tag Deprecation Issue (was Re: VS vs. P14 (was Re: Indic Devanagari Query))

2003-02-07 Thread Asmus Freytag
At 11:54 AM 2/6/03 -0800, Kenneth Whistler wrote: My personal opinion? The whole debate about deprecation of language tag characters is a frivolous distraction from other technical matters of greater import, and things would be just fine with the current state of the documentation. But, if formal

Re: VS vs. P14 (was Re: Indic Devanagari Query)

2003-02-07 Thread Asmus Freytag
At 01:52 AM 2/7/03 -0800, Andrew C. West wrote: Ah, but decorative motifs are not plain text. Ah, but it could be. Ah, but it wouldn't be Unicode. A(h)./
