Re: Line breaking for Kanbun and Bopomofo characters

2004-02-10 Thread Kenneth Whistler
Katsuhiko Momoi asked: > > In Kanbun reading (classical Chinese), I always thought that these > characters are a part of the preceding character so that a line should > not break before it. For example, 0x3191 is an instruction to skip the > preceding character and read the next character first

Re: Character allocation

2004-02-10 Thread Kenneth Whistler
Asmus said: > The editors at one point had a draft allocation there and forgot that the hole > opened up when that allocation was withdrawn. Umm. Not really. > For that reason alone, and for no other, it might be worthwhile to ask WG2 to > move that character and close the 0300 block. Everyone

Re: Unified Canadian Syllabics

2004-02-09 Thread Kenneth Whistler
Chris Harvey wrote: > I think I posted this to the list last week, but I haven't seen it come up. You may have run up against the size of message constraints currently imposed on the [EMAIL PROTECTED] email list because of the MyDoom virus. Some comments on particular issues: > Misnamed Charact

Re: Public Review Issue #27

2004-02-09 Thread Kenneth Whistler
Peter C opined: > >Well, perhaps there is a step that's needed to propose representations > >for the alternate positions of meteg, one of these making use of ZWJ or > >ZWNJ (whichever) and to get UTC to approve that so that it's formally a > >part of the standard and, hence, an interoperable repre

Re: Public Review Issue #27

2004-02-09 Thread Kenneth Whistler
> Was any decision made at the UTC meeting concerning Public Review Issue > #27? I ask because I am waiting to encode a text which needs to use ZWJ > and ZWNJ within combining character sequences (and for which there is > already publicly available font support!), but I don't want to do > anyt

Re: Collation charts out of date

2004-01-30 Thread Kenneth Whistler
Peter Kirk asked: > It does look very odd that 1D28 has been separated from the other pi's, > 1D29 from the other rho's etc. Is there a good reason for that? I know > everyone hates the UPA (except for Uralicists presumably), but these > letters are still clearly variants of pi and rho. The sam

Re: Unwanted publicity?

2004-01-28 Thread Kenneth Whistler
Tim Partridge wrote: > I was somewhat surprised to see the word Unicode on page 8 of the Metro > newspaper (London, UK) today (January 28, 2004). > > Unfortunately it was in the middle of an article about Mydoom, where it says > "The message may read 'The message contains Unicode characters and h

Re: U+02C1 and U+02E4

2004-01-27 Thread Kenneth Whistler
Peter asked: > What is the difference between U+02C1 and U+02E4? The first is supposed > to be a miscellaneous phonetic modifier and apparently the pair of > U+02C0 which marks "ejective or glottalized" (not 1996 IPA). Which > should be used to mark pharyngealisation (as in 1996 IPA)? U+02E4,

Re: Unicode forms for internal storage - BOCU-1 speed

2004-01-22 Thread Kenneth Whistler
Philippe suggested: > I don't object to proposals to define new "UTF-*" forms, > but this should still be > proposals for an otherwise distinctly named encoding form, chosen by the > proposal author out of the "UTF-*" naming space. The UTC clearly does object to proposals to define "new 'UTF-*' for

RE: Cuneiform Free Variation Selectors

2004-01-20 Thread Kenneth Whistler
Peter Kirk suggested: > Presumably the same principles can be applied when you run into a newly > discovered (probably archaic) cuneiform character. Except that for some > reason, Ken, you classified "dynamic cuneiform" as Type VI: Glyph > Description Language. Why can't it be seen as Type V: I

Re: Cuneiform Free Variation Selectors

2004-01-20 Thread Kenneth Whistler
Dean Snyder continued: > >> But NO ONE mentioned free variation selectors in the discussion until > >> yesterday. > > > >This is not the case. *I* mentioned free variation selectors > >during both of the ICE meetings. They weren't discussed at any > >great length, precisely because I and the other

RE: Chinese FVS? (was: RE: Cuneiform Free Variation Selectors)

2004-01-20 Thread Kenneth Whistler
John Jenkins tried to present some usage cases for Han FVS combinations, and Mike Ayers responded with a bunch more questions: > Ummm - if this simplified form were used at all, wouldn't it already > be encoded? Isn't there a process for getting such encoded? Has this > process broken down

Re: Cuneiform Free Variation Selectors

2004-01-20 Thread Kenneth Whistler
Dean Snyder asserted: > >No, we do not need to rehearse the pros and cons of the "dynamic" > >model for Cuneiform already. Abundant evidence for why it has not > >been chosen has already been presented. > > But NO ONE mentioned free variation selectors in the discussion until > yesterday. This

Amazing Facts About Irish (was stupidly continuing as Re: Klingon)

2004-01-15 Thread Kenneth Whistler
> > In Irish, however, initial digraphs like "tS" and "hO" and "gC" *are* a > standard > > part of the orthography, and constitute the normal capitalization > convention: > > words beginning thus are capitalized on the second letter, not the first. > > Interesting. I did not know that of Irish...

Re: Klingon

2004-01-15 Thread Kenneth Whistler
> Mark E. Shoulson scripsit: > > > It's incredibly useful, Philippe, to have some inkling of what you're > > talking about before you answer. > > What, and ruin his large and growing reputation as one of the masters of > misinformation? He'll be challenging Abrigon Gusiq next. ghojmoHwl'Daj v

Re: Chinese rod numerals

2004-01-14 Thread Kenneth Whistler
> Thus for example, referring to the page from a 13th > century book reproduced in Needham (1959) p. 132, I would translate the > passage from the bottom of the fourth column from the right (reading > right to left) roughly as: > > " ... having done that, multiply the breadth of the yellow hypo

Re: Chinese rod numerals

2004-01-12 Thread Kenneth Whistler
John Jenkins responded: > Personally, I think it's an excellent idea. I have my doubts, personally, but concur that getting a proposal together to debate the merits is a good idea. > It'd be good to get it on > the UTC agenda for next month, so if you could start on the form. I > can give yo

Re: U+0185 in Zhuang and Azeri (was Re: unicode Digest V4 #3)

2004-01-05 Thread Kenneth Whistler
[Doing a little cut and pasting here to coalesce the context...] > Peter Kirk wrote, > > > > I note an incorrect glyph for U+0185 in Code2000 and in Arial Unicode > > MS; this looks like b with no serif at the bottom but should be much > > shorter, like ь, the Cyrillic soft sign. > James Kas

Re: Latin letter GHA or Latin letter IO ?

2004-01-05 Thread Kenneth Whistler
Peter said: > As you will see, I have requested precisely this clarification for > U+0184/0185, to clarify that this letter is used in pan-Turkic alphabets > as well as in Zhuang. I am also asking for a change in the reference > glyph for U+0185, because in both Zhuang and pan-Turkic this shoul

Re: Pre-1923 characters? (was: unicode Digest V4 #3)

2004-01-05 Thread Kenneth Whistler
> >Not a good idea: the Nogai and Khakass languages appear to have used both > >gha/oi and "i with lower right hook" according to > >http://www.writingsystems.net/languages/nogai/nogailatin.htm and > >http://www.writingsystems.net/languages/khakass/khakasslatin.htm . > > > >Charles Cox > > > > >

Re: Latin letter GHA or Latin letter IO ?

2004-01-05 Thread Kenneth Whistler
Peter Kirk wrote in response to Philippe Verdy: > But you do seem to have found a real problem with the standard. If the > character name is not guaranteed to be an accurate means of > identification of the character, and the glyph is not normative, how can > I know from the standard that U+01A

Re: Aramaic unification and information retrieval

2003-12-22 Thread Kenneth Whistler
Peter Kirk said: > Anyway, I don't see the main purpose of > collation as producing lists of legible words, but rather as matching in > text and database searches. Collation is used for both purposes, of course. And there is nothing which requires you to use the same rules for sorting lists as

Re: Arabic Presentation Forms-A

2003-12-17 Thread Kenneth Whistler
Philippe asked: > The "Arial Unicode MS" font does not have a glyph for the Rial currency sign > so I won't comment lots about it, even if it's a special ligature of its > component letters: > it's just regrettable that it's > not found in Arial Unicode MS (unless this Rial sign is traditional an

American English translation of character names (was Re: Stability of WG2)

2003-12-17 Thread Kenneth Whistler
Jim Allan noted: > On the other hand, there is nothing to prevent the Unicode consortium or > any other body or any single person from creating a new *additional* > corrected set of names if the Unicode consortium or any other body or > any single person wishes to do so. > > That would just be

Re: Case mapping of dotless lowercase letters

2003-12-16 Thread Kenneth Whistler
Correcting myself: > Note that none of the 3 sets of equivalence classes violates > *canonical* equivalence, because none of the 8 sequences involved > is canonically equivalent to any other. In other words, no matter > which of the 3 approaches you take to case folding, in no instance > are you c

Re: Case mapping of dotless lowercase letters

2003-12-16 Thread Kenneth Whistler
John Cowan noted: Here's what happens exactly:

source      simple case folding   full case folding   tr/az case folding
dotted i    dotted i              dotted i            dotted i
dotless i   dotless i             dotless i           dotless i
dotted I    dot
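
The folding behaviour summarized in this message can be tried out in Python. This is a minimal sketch and not part of the original thread: `str.casefold()` applies the default full case folding, while `fold_tr_az` is a hypothetical helper that assumes the two Turkic (T) mappings listed in CaseFolding.txt, namely I to dotless i and dotted I to i.

```python
DOTTED_CAPITAL_I = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"    # LATIN SMALL LETTER DOTLESS I


def fold_full(text: str) -> str:
    """Default (non-Turkic) full case folding, as done by str.casefold()."""
    return text.casefold()


def fold_tr_az(text: str) -> str:
    """Hypothetical tr/az folding: apply the two Turkic mappings first."""
    text = text.replace(DOTTED_CAPITAL_I, "i")   # dotted I  -> dotted i
    text = text.replace("I", DOTLESS_SMALL_I)    # dotless I -> dotless i
    return text.casefold()


# Print the code point, then the full and tr/az foldings, for all four letters.
for ch in ("i", DOTLESS_SMALL_I, DOTTED_CAPITAL_I, "I"):
    print(f"U+{ord(ch):04X}", ascii(fold_full(ch)), ascii(fold_tr_az(ch)))
```

Under the default full folding, dotted capital I becomes i followed by U+0307 COMBINING DOT ABOVE, which is why the two approaches produce different equivalence classes.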

Speaking of glottophagic hegemony (was Re: [OT] CJK -> CJC (Re: Corea?))

2003-12-16 Thread Kenneth Whistler
Wow. Antonio is running it down! > Etc. All this crackpot misguided political correctness reeks of > unconscious glottophagic hegemony, cultural parochialism and well-meaning > gringocentered patronizing -- it's unsettling to sniff (in this and > other threads) whiffs of it in a forum such as this.

Re: Stability of WG2 (was: Re: [OT] CJK -> CJC)

2003-12-15 Thread Kenneth Whistler
Doug wrote: > Perhaps that is Peter's point: that some day, changes in the membership > and market pressures (which have shown to be an influence on other ISO > committees) could result in a different attitude toward the written > policies of WG2 from that which currently exists. > > > It s

Re: [OT] CJK -> CJC (Re: Corea?)

2003-12-15 Thread Kenneth Whistler
Peter Kirk noted: > Anyway I was thinking not so much of a voluntary decision by WG2, but > that there might perhaps be pressure, even a directive, from the top of > ISO to change "Korean" to "Corean", which even you, even WG2, might be > unable to resist. That would constitute a *technical* c

Re: Text Editors and Canonical Equivalence (was Coloured diacritics)

2003-12-10 Thread Kenneth Whistler
Peter Kirk continued: > >Once again, people are falling afoul of the subtle distinctions > >that the Unicode conformance clauses are attempting to make. > > > > > In that case the distinctions are too subtle and need to be clarified. > C9 states that "no process can assume that another process

Re: Text Editors and Canonical Equivalence (was Coloured diacritics)

2003-12-10 Thread Kenneth Whistler
Peter Kirk averred: > Agreed. C9 clearly specifies that a process cannot assume that another > process will give a correct answer to the question "is this string > normalised?", because that is to "assume that another process will make > a distinction between two different, but canonical-equiva

Re: New symbols (was Qumran Greek)

2003-12-08 Thread Kenneth Whistler
Elaine asked: > > is > > a complete listing of new symbols to go into Unicode > > Thanks!--how many Web sites do you all have? http://www.dkuug.dk/JTC1/SC2 is the official website of -- you guessed it -- JTC1/SC2, the JTC1 subcommittee which m

Re: New symbols (was Qumran Greek)

2003-12-08 Thread Kenneth Whistler
Mark Shoulson wondered: > >And not complete. That is simply the draft for the PDAM > >(preliminary draft amendment) to 10646. It will be subject > >to national ballot comments, which will, no doubt, result > >in further additions, as well as some minor modifications to > >what is currently there.

Re: Glottal stops (bis) (was RE: Missing African Latin letters (bis))

2003-12-08 Thread Kenneth Whistler
Michael Everson asked: > Your solution then, for Athapascan orthography? First of all, the preferred spellings are Athabascan (or Athabaskan [ANLC] or Athapaskan [Smithsonian]). There are *many* Athabascan orthographies, not just one, of course. See: http://www.uaf.edu/anlc/orthography.html fo

RE: Transcoding Tamil in the presence of markup (was Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup))

2003-12-08 Thread Kenneth Whistler
Peter Jacobi said: > Unicode doesn't prevent styling, of course. But having 'logical' order > instead of 'visual' makes it a hard task for the application and the > renderer. > This is witnessed by the thin-spread support for this. Yes, but having visual order instead of logical order makes *othe

Re: New symbols (was Qumran Greek)

2003-12-08 Thread Kenneth Whistler
> > Also, where will the new numbers for the accepted TLG > > items be posted? Debbie said everything got in, but I > > don't know where to find their assigned code points. > > is a > complete listing of new symbols to go into Unicode 4.1. > Reme

RE: Glottal stops (bis) (was RE: Missing African Latin letters (bis))

2003-12-05 Thread Kenneth Whistler
Peter, > For those situations in which unmarked-case glottal has been used, I > think it would cause the least confusion to leave 0294 as a cap-height > glyph, and call it upper case. I don't have time to argue this out today, but it is wrong, wrong, wrong, wrong, wrong. Oh, by the way, did I sa

Glottal stops (bis) (was RE: Missing African Latin letters (bis))

2003-12-05 Thread Kenneth Whistler
Peter said: > > On this list we > > have discussed the relation of > > > > U+0294 LATIN SMALL LETTER GLOTTAL STOP > > Actually, is LATIN LETTER GLOTTAL STOP. It is only the general category > property in the UCS that suggests lowercase. > > > > with an x-height *LATIN SMALL LETTER GLOTTAL STO

Re: Compression through normalization

2003-12-05 Thread Kenneth Whistler
Doug asked: > Mark indicated that a compression-decompression cycle should not only > stick to canonical-equivalent sequences, which is what C10 requires, but > should convert text only to NFC (if at all). Ken mentioned > normalization "to forms NFC or NFD," but I'm not sure this was in the > sam

Re: Sort Order

2003-12-04 Thread Kenneth Whistler
Mustafa Jabbar inquired: > Please also inform me about what will be the sorting for Bangla. > Thanks and regards The Unicode Standard is *not* a sorting standard -- nor is any character encoding. The reason why it might seem to be, on occasion, is that there is a long history of people fiddling
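
The point that code point order is not a sort order can be illustrated with a few lines of Python. This is only a sketch with Latin sample words, not Bangla: `crude_key` is a hypothetical stand-in that strips combining marks and folds case, and it is not the Unicode Collation Algorithm or any language-specific tailoring.

```python
import unicodedata

WORDS = ["apple", "Zebra", "\u00E9clair"]


def crude_key(word: str) -> str:
    """Hypothetical sort key: drop combining marks, then fold case."""
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()


by_code_point = sorted(WORDS)             # raw encoding order
by_crude_key = sorted(WORDS, key=crude_key)

print(by_code_point)
print(by_crude_key)
```

Sorting by raw code points puts "Zebra" first, because capital Z has a lower code point than any lowercase letter; real collation needs rules that live outside the character encoding.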

Re: Compression through normalization

2003-12-04 Thread Kenneth Whistler
Mark said: > The operations of compression followed by decompression can conformantly produce > any text that is canonically equivalent to the original without purporting to > modify the text. (How the internal compressed format is determined is completely > arbitrary - it could NFD, compress, dec
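
A round trip of the kind described here might look like the following Python sketch. It assumes NFC as the internal form and `zlib` as the compressor; both are illustrative choices, and the function names are hypothetical.

```python
import unicodedata
import zlib


def compress(text: str) -> bytes:
    """Normalize to NFC (an assumed internal form), then deflate the UTF-8 bytes."""
    return zlib.compress(unicodedata.normalize("NFC", text).encode("utf-8"))


def decompress(blob: bytes) -> str:
    """Inflate and decode; the result is the NFC form of the original text."""
    return zlib.decompress(blob).decode("utf-8")


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


original = "A\u030Angstro\u0308m"          # decomposed spelling
restored = decompress(compress(original))  # comes back precomposed

print(ascii(restored), canonically_equivalent(original, restored))
```

The restored string is not identical to the input code point for code point, but it is canonically equivalent to it, which is the property under discussion.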

Re: meteorological symbols

2003-12-03 Thread Kenneth Whistler
Eric Scace asked: >At the risk of re-triggering yet another "what is a character" > discussion... Have meteorological symbols been considered for > incorporation in Unicode? (A search of the archives did not turn > up any discussion.) Not per se. And yes, this will trigger another "what is

Re: Free Fonts

2003-12-03 Thread Kenneth Whistler
Philippe, > > Sorry, but you really do not know what you are talking about. What cannot > > be freely distributed are *rasterisers* that make use of Apple patented > > technology that interpret TT instruction sets. Anyone can make, hint and > > ship -- freely or for a licensing fee -- a font wi

Fontasmagoria (was: Re: MS Windows and Unicode 4.0 ?)

2003-12-03 Thread Kenneth Whistler
Patrick Andries continued: > > On Dec 2, 2003, at 7:35 PM, Patrick Andries wrote: > > > > > Well, some fonts would be better than none (and they have to be made > > > so that > > > the Unicode standard be printed). > > > > > > > The Unicode standard doesn't require Unicode to be printed. A lot of

RE: MS Windows and Unicode 4.0 ?

2003-12-01 Thread Kenneth Whistler
Philippe wrote: > Oh God... Surrogates were standardized long before they started > being used in Unicode 3.2 for new codepoint assignments out of > the BMP... Actually, the first supplementary graphic characters were assigned for Unicode 3.1. Unicode 3.2 added only BMP characters. > It was clea
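
For readers unfamiliar with how supplementary code points map onto surrogates, here is a small Python sketch of the UTF-16 arithmetic. The function name is hypothetical, and U+1D11E is used only as a sample supplementary code point.

```python
def to_surrogates(code_point: int) -> tuple[int, int]:
    """Split a supplementary code point into a UTF-16 high/low surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


high, low = to_surrogates(0x1D11E)
print(f"{high:04X} {low:04X}")

# Cross-check against Python's own UTF-16 encoder.
encoded = "\U0001D11E".encode("utf-16-be")
print(encoded.hex().upper())
```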

Re: "Www" as an internet riddle

2003-11-17 Thread Kenneth Whistler
> «That's good for symbolizing e-mail», I said, «but that joint supports > no POP3/SMTP access, only webbrowsing. «You should go for a "www" > instead...» «Well, I want it as one character only. Any ideas, dummy?» > > This dummy then produced U+02AC to the startled friend, and hurried in > search

Re: compatibility characters (in XML context)

2003-11-14 Thread Kenneth Whistler
John Cowan said: > Kenneth Whistler scripsit: > > > However, there were character encoding standards committees, > > predating the UTC, which did not understand this principle, > > and which encoded a character for the Ångstrom sign as a > > separate symbol. In

Re: compatibility characters (in XML context)

2003-11-14 Thread Kenneth Whistler
Stefan Persson asked: > Alexandre Arcouteil wrote: > > Is that a clear indication that \u212B is actually a compatibility > > character and then should be, according to XML 1.1 recommendation, > > replaced by the \u00C5 character? > > Isn't U+00C5 a compatibility character for U+0041 U+030A,
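
The relationship between the three representations asked about here can be checked with Python's `unicodedata` module. This is a minimal sketch using only the code points named in the question; the variable names are illustrative.

```python
import unicodedata

ANGSTROM_SIGN = "\u212B"       # ANGSTROM SIGN
A_WITH_RING = "\u00C5"         # LATIN CAPITAL LETTER A WITH RING ABOVE
A_PLUS_RING = "\u0041\u030A"   # A followed by COMBINING RING ABOVE

# A decomposition field without a <tag> prefix is a canonical mapping.
sign_mapping = unicodedata.decomposition(ANGSTROM_SIGN)
ring_mapping = unicodedata.decomposition(A_WITH_RING)

nfc = unicodedata.normalize("NFC", ANGSTROM_SIGN)
nfd = unicodedata.normalize("NFD", ANGSTROM_SIGN)

print(sign_mapping, "|", ring_mapping)
print(ascii(nfc), ascii(nfd))
```

Both mappings are canonical rather than compatibility decompositions: U+212B normalizes to U+00C5 under NFC and to U+0041 U+030A under NFD.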

Re: What does i18n mean?

2003-11-14 Thread Kenneth Whistler
Ted Smith asked: > what does i18n mean? I see it bandied about a lot. > > My guess is "internationalisation", Correct. Or "internationalization", depending on your spelling conventions. > but actually when you pronounce > "eye won ayht en" it doesn't sound anything like that word. It is pron
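
The abbreviation is a numeronym: the first letter, the count of letters in between, and the last letter. A tiny Python sketch (the function name is hypothetical) shows why both spellings give the same result:

```python
def numeronym(word: str) -> str:
    """Abbreviate a word as first letter + number of inner letters + last letter."""
    if len(word) < 4:
        return word  # too short to be worth abbreviating
    return f"{word[0]}{len(word) - 2}{word[-1]}"


print(numeronym("internationalization"))
print(numeronym("internationalisation"))
print(numeronym("localization"))
```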

Re: compatibility characters (in XML context)

2003-11-14 Thread Kenneth Whistler
Alexandre, > Philippe Verdy wrote: > > > From: "Kent Karlsson" <[EMAIL PROTECTED]> > > > >>Philippe Verdy wrote: > >> > >>>(1) a singleton (example the Angström symbol, canonically > >>>mapped to A with diaeresis, > >> > >>The Ångström (note spelling) sign is canonically mapped to > >>cap

Re: Ewellic

2003-11-13 Thread Kenneth Whistler
D Starner asked: > Jim Allan <[EMAIL PROTECTED]> writes: > > > Perhaps rather than "cipher" one should say that Unicode does not encode > > separately scripts or systems intended solely as transliterations of > > other scripts. Ciphers are a common example of such scripts and systems. > > What

Re: Ewellic

2003-11-12 Thread Kenneth Whistler
Jim Allan responded to Michael Everson: > I posted: > > > /Accordingly both Ewellic and Theban could be treated as ciphers of / > > /subsets of the Latin script. / > > Michael Everson responded: > > > I don't see how that follows at all. > > We have two scripts in which the forms of the c

Re: Comb. Diacritics Sup.

2003-11-11 Thread Kenneth Whistler
António asked: > The BMP roadmap shows "Comb. Diacritics Sup." at U+1DC0 .. U+1DFF in > parenthesised blue, a block «for which proposals have been formally > submitted to the UTC or to WG2. There is generally a link to the formal > proposal.» But no document is linked to it. Is it possible to acce

RE: Hexadecimal digits?

2003-11-11 Thread Kenneth Whistler
Jill Ramonsky summarized: > In summary then, suggestions which seem to cause considerably less > objection than the Ricardo Cancho Niemietz proposal are: > (1) Invent a new DIGIT COMBINING LIGATURE character, which allows you to > construct any digit short of infinity > (2) Use ZWJ for the same

Re: Berber/Tifinagh

2003-11-10 Thread Kenneth Whistler
Mark Shoulson wrote: > > We can't write meta-rules for everything, Curtis. And it isn't a good > > use of my time, anyway, to try. "Informed whim" if you will. > > Whim sounds about right. And this isn't a criticism. I honestly doubt > I could satisfactorily defend the choice of unifying Fre

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread Kenneth Whistler
> Kenneth Whistler wrote on Theban, > > >Because of that, nobody (seriously) involved in > >10646 or Unicode has bothered to try to provide a > >character encoding proposal for it. And James Kass responded: > > There was a link sent to this list

RE: Tamil 0BB3 and 0BD7

2003-11-10 Thread Kenneth Whistler
Peter Jacobi noted: > but it would still hold, that: > U+0B95 U+0BC6 U+0BB3 and > U+0B95 U+0BCC > are indistinguishable in written Tamil. This is a true ambiguity in the writing system. ==> ke-l.a ==> kau Every analysis of Tamil that I see distinguishes the two letters, l.a versus -a
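
Although the two sequences may look alike on paper, they stay distinct in the encoding. The following Python sketch, using only the code points quoted above plus U+0BD7 from the subject line, shows that U+0BCC decomposes to U+0BC6 U+0BD7 and not to U+0BC6 U+0BB3.

```python
import unicodedata

KE_LLA = "\u0B95\u0BC6\u0BB3"   # KA + vowel sign E + letter LLA
KAU = "\u0B95\u0BCC"            # KA + vowel sign AU

au_mapping = unicodedata.decomposition("\u0BCC")
kau_nfd = unicodedata.normalize("NFD", KAU)

print(au_mapping)
print(unicodedata.name("\u0BB3"), "|", unicodedata.name("\u0BD7"))
print(kau_nfd == KE_LLA)
```

So normalization never merges the two spellings; any ambiguity is visual and belongs to the writing system, not to canonical equivalence.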

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread Kenneth Whistler
> At 01:57 PM 11/10/2003, Peter Kirk wrote: > > > Was it a whim that Theban > >and Klingon were rejected? First of all, Theban hasn't been rejected. It has never formally been considered by either the UTC or WG2 for character encoding. Why? Because it is so patently obvious that it is a Latin c

Re: Berber/Tifinagh (was: Swahili & Banthu)

2003-11-10 Thread Kenneth Whistler
Philippe Verdy wrote: > You seem to forget that Tifinagh is not a unified script, but a set of > separate > scripts where the same glyphs are used with distinct semantic functions. I think Philippe is running off the rails here. Tifinagh is a script. It comes in a number of local varieties, adap

RE: Hexadecimal digits?

2003-11-10 Thread Kenneth Whistler
Jill Ramonsky asked: > My question went unanswered, so I'll ask it again - do I get a vote? Well, Philippe did try to answer the question, but if you need it again, then the ultimate answer is no. By the way, in case people have not noticed, this is a WG2 document, not a UTC document, submitted

Re: UTF-9

2003-10-30 Thread Kenneth Whistler
John Cowan wrote: > > http://panda.com/tops-20/utf9.txt > > > > Res ipsa loquitur. But apparently res ipsa non loquitur, because Philippe continued: > > Are there still now platforms where storage bytes are not octets but nonets? > i.e. 9-bit based platforms? If so this proposal makes sense, bu

Re: Terminology verification

2003-10-30 Thread Kenneth Whistler
Lars Marius Garshol asked: > I'm working on a specification for a data model and would like to > check that my definition of the string type makes sense. Well, language designers and data modelers may want to chime in with alternate opinions, but here is my two cents on this topic. > > The defi

Re: [OT by now] Re: Traditional dollar sign

2003-10-27 Thread Kenneth Whistler
> ... Ironically, > in 1943-45 nickels were actually minted in silver, as nickel was considered > strategic for the war effort. Current nickels are 75% copper and 25% > nickel, the same as the cladding of the other coins. (Pennies are > copper-clad zinc, however.) Prior to 1982, pennies were a

Re: U+0BA3, U+0BA9

2003-10-27 Thread Kenneth Whistler
> > Character names attempt to be (a) unique and (b) reasonably > > mnemonic. Anything beyond that is a bonus. They expressly do *not* > > represent any form of transliteration or transcription scheme. > > Kenneth Whistler <[EMAIL PROTECTED]> wrote: > > The 10646 nam

Re: Traditional dollar sign

2003-10-27 Thread Kenneth Whistler
Doug Ewell noted: > The dollar sign was used > occasionally for decoration on large-sized (pre-1929) U.S. currency, but > not on small-sized issues (except for the bank-only $100,000 note). And very rarely even at that. See: http://www.money.org/bebeeexhibit.html for many exhibits of all kinds

Re: New contribution N2676

2003-10-24 Thread Kenneth Whistler
Philippe said: > Interesting: these arabic symbols are proposed, but with strange names. > I understand that these diacritics are needed to fit with their character > properties (notably in BiDi contexts). > > 0659 ARABIC ZWARAKAY . Pashto > Why not ARABIC MACRON ? Well, Zwarakay may be appro

Re: U+0BA3, U+0BA9

2003-10-24 Thread Kenneth Whistler
Peter Jacobi asked: > Can someone clarify the status of > U+0BA3 TAMIL LETTER NNA and > U+0BA9 TAMIL LETTER NNNA > > Comparing the glyph shapes with TSCII character tables > it is quite clear that U+0BA3 is NNNA and U+0BA9 is NNA. > > This makes also a lot of sense for non-speakers of Tamil, >

Re: Bangla: [ZWJ], [VIRAMA] and CV sequences

2003-10-09 Thread Kenneth Whistler
Gautam asked: > I stand corrected. Long syllabic /r l/ as well as > Assamese /r v/ are indeed additions beyond the ISCII > code chart. My objection, however, was not against > their inclusion but against their placement. I > understand why long syllabic /r l/ could not be placed > with the vowels,

Re: Bangla: [ZWJ], [VIRAMA] and CV sequences

2003-10-08 Thread Kenneth Whistler
Gautam said: > > The encoding of most Indic scripts is based on ISCII > > - and that's not going > > to change. It was adopted since ISCII was the > > pre-existing Indian national > > character encoding standard for these scripts. > > I understand that this is so. But perhaps it is > worthwhile f

RE: Bangla: [ZWJ], [VIRAMA] and CV sequences

2003-10-08 Thread Kenneth Whistler
Gautam suggested: > You are absolutely right. I am suggesting that the > language-specific viramas be retained as > script-specific *explicit* viramas that never > disappear. In addition, let's have a script-specific > ZWJ which behaves in the way you describe in the > preceding paragraph. The exp

Re: Unicode Public Review Issues update: BRAILLE

2003-10-07 Thread Kenneth Whistler
Asmus said: > In conclusion, it seems that the correct set of *default* properties for > Braille would be determined by the needs of inserting Braille strings into > other text (for educational manuals and similar specifications). > > As Marco has pointed out that means BIDI = L and I believe i

Re: Byzantine musical notation

2003-10-06 Thread Kenneth Whistler
Nick Nicholas asked: > (Resend) > > I can't really find the answer to this question online, especially > because the proposal documents for it don't seem to have been posted to > anubis.dkuug.dk. Furthermore, this is not actually an area I know > anything about. :-) So: > > Byzantine musical

Re: Punctuation symbols for partial cuneiform characters

2003-09-03 Thread Kenneth Whistler
Well, since Michael is engaged in an all-guns-blazing campaign on the public list, I guess I need to weigh in, too. > > Don't worry. The scholars aren't using them anyway so there won't be > > any disunification cost. TBD. > > Ah, but one of my minions (laughs hysterically) has pointed out the

Re: Character codes for Egyptian transliteration

2003-08-29 Thread Kenneth Whistler
[ME] > > I do not want to add a combining > > Egyptological ring-thingy to Unicode. It is not a productive mark. A > > capital and small letter i with a deformed dot is what's needed, > > that's all. [PK] > I thought it was policy never to add new precomposed characters, however > unproducti

Re: Breaking free from UNICODE

2003-08-19 Thread Kenneth Whistler
Don Osborn wondered: > Such opinions - and they are not necessarily isolated cranks - make one > wonder if there is not a huge "outreach" gap in Unicode's longterm strategy. Perhaps. Although I don't think I would characterize it as a "huge" gap. > A session on internet & African languages that

Re: Hexadecimal

2003-08-15 Thread Kenneth Whistler
Jill Ramonsky asked: > Thoughts anyone? Well, yes... > If the semantic difference between (for example) uppercase D and > mathematical bold uppercase D was considered sufficiently great so as to > require a new codepoint, then I am tempted to wonder if the same might be > considered true of he

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Kenneth Whistler
Kent Karlsson said: > I see no particular *technical* problem with using WJ, though. In > contrast > to the suggestion of using CGJ (re. another problem) anywhere else but > at the end of a combining sequence. CGJ has combining class 0, despite > being invisible and not ("visually") interfering w
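
The effect of CGJ having combining class 0 can be seen in how normalization reorders marks. This Python sketch uses two Latin combining marks purely for illustration (the thread itself concerns Hebrew); the variable names are hypothetical.

```python
import unicodedata

CGJ = "\u034F"          # COMBINING GRAPHEME JOINER
DOT_ABOVE = "\u0307"    # combining class 230
DOT_BELOW = "\u0323"    # combining class 220

cgj_class = unicodedata.combining(CGJ)

# Without CGJ, canonical ordering moves the class-220 mark first.
plain = unicodedata.normalize("NFD", "a" + DOT_ABOVE + DOT_BELOW)

# With CGJ in between, the class-0 character blocks the reordering.
blocked = unicodedata.normalize("NFD", "a" + DOT_ABOVE + CGJ + DOT_BELOW)

print(cgj_class, ascii(plain), ascii(blocked))
```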

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Kenneth Whistler
Peter Kirk followed up: > On 07/08/2003 07:27, Philippe Verdy wrote: > > >On Thursday, August 07, 2003 2:40 AM, Doug Ewell <[EMAIL PROTECTED]> wrote: > > > >>Kenneth Whistler wrote: > >> > >>>But I challenge you to find anything in

Unicode 4.0 is online at last!

2003-08-14 Thread Kenneth Whistler
Well, I've been promising that good things would come to those who wait. ;-) At last, the Unicode website has been updated with the online chapters for Unicode 4.0. See: http://www.unicode.org/versions/Unicode4.0.0/ Or just go to the Unicode 4.0 link from the home page. Enjoy. --Ken P.S. Just

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Kenneth Whistler
Kent asked: > How should a freestanding double diacritic be encoded (for purposes of > meta-discussions, and the like): or diacritic, SPACE>? It *could* be represented as , of course, or for that matter , or other possibilities. The combining character sequence, in either case, is the sequenc

Re: Roadmap---Mandaic, Early Aramaic, Samaritan

2003-08-14 Thread Kenneth Whistler
Elaine Keown responded to Michael: > > I really, really, really don't have time to debug your > > dissatisfaction with the use of the word "Aramaic" in the Roadmaps. > > This is NOT something anyone is working actively on right now. When a > > I'm not writing about nomenclature---not the point a

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Kenneth Whistler
Peter Kirk asked: > Thanks for the clarification. I probably misunderstood Jon's intention. > But is there a problem if, for example, an application sees the string > and regularises it (wrongly!) to combining mark>? Then you have a problem, of course. What the Unicode Standard says about ap

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Kenneth Whistler
Peter Kirk wrote: > I think this may be a "Peter mistake". I meant to refer to spacing > diacritics. Sorry. > > It is certainly highly inappropriate for spacing diacritics to > be considered word boundaries. Why? It is entirely dependent on the orthography and conventions involved. There is pr

Re: Handwritten EURO sign (off topic?)

2003-08-14 Thread Kenneth Whistler
Jim Cloos wrote: > They aren’t really SI preficies in this context. Milli, centi, kilo, > mega and giga (at least) have part of the global lexicon; terra is ^ > not far behind (especially if disk sizes continue to grow). Does that r

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Kenneth Whistler
Ted Hopp asked: > I believe that reasonable people might reasonably conclude from factoids 1 > and 2 that SPACE is indeed a format character. > > Reasonable, but evidently wrong. Explanation, please? I provided the text deconstruction in my last email, but to continue, the confusion arises from

Re: Conflicting principles

2003-08-14 Thread Kenneth Whistler
John Cowan asked: > I would like to ask the old farts^W^Wrespected elders of the UTC > which principle they consider more important, abstractly speaking: > the principle that combining marks always follow their base characters > (a typographical principle), or that text is stored, with a few minor

Re: Aramaic scripts

2003-08-14 Thread Kenneth Whistler
Raymond Mercier wrote: > There are less obvious omissions: > > 1. Kharoshthi, a RtoL script much used in North West India, > and regarded by everyone as a derivative from a form of > the Aramaic script used in that region. ... And to add to Michael's reply, the historical status of Kharoshthi

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Kenneth Whistler
Peter responded to Mark: > On 05/08/2003 14:40, Mark Davis wrote: > > >Where did you get the notion that space is not a base character? And > >base characters include those that are not control or format > >characters. Space is neither one. > > > >The standard specifically states in a number of p

Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Kenneth Whistler
Peter Kirk asked: > A similar issue which is not Hebrew related would be a (mythical) > requirement to display a diacritic like 0315, 031B or 0322 in isolation. > It would not always be appropriate to use a space or NBSP as a base > character as this would indent the glyph from the beginning of

Re: Compatibility decompositions

2003-08-14 Thread Kenneth Whistler
John Cowan asked: > I realize that existing compatibility decompositions are a rag-bag, > especially those marked with the generic tag rather than one > of the specific tags such as , , or . I wonder > what principles, if any, can be enunciated for giving a newly introduced > character a compati

Re: Conflicting principles

2003-08-14 Thread Kenneth Whistler
Philippe Verdy asked: > > Ken's point of course is that however bizarre the backing store for > > Sindarin and English Tengwar modes may be, combining characters per > > se must follow their base characters no matter what. > > Even if that breaks the logical analysis of text? Yes. And that is th

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-11 Thread Kenneth Whistler
Philippe replied: > From: "Kenneth Whistler" <[EMAIL PROTECTED]> > > Of course a standard which mandates space folding is also > > within its rights to mandate, for example, the non-use of > > nonspacing marks applied to SPACE characters. It can simply > >

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-11 Thread Kenneth Whistler
Philippe Verdy wrote: > Spacing diacritics are not "on the edge" of the standard, The "edge" I was speaking of was the requirement for the exact display width of a nonspacing diacritic on top of a SPACE to be specifiable in some determinant way. > when they > are already given a full block and

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-11 Thread Kenneth Whistler
Peter Kirk responded: > On 11/08/2003 06:59, Jon Hanna wrote: > > >There are only two theoretical problems that I can see here, the first is > >that a whitespace character other than space gets converted to space by > >attribute value normalisation, and that this changes the meaning of the text >

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-10 Thread Kenneth Whistler
Peter Kirk asked: > If I want to do this, should I explicitly encode a dotted circle, or > should I encode nothing and expect the font to generate the dotted > circle, as it often does? If you want to represent the text content of a dotted circle with an accent on it, the recommended representa

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-10 Thread Kenneth Whistler
Peter Kirk said: > Tell Microsoft! (See Noah Levitt's posting.) Indeed. > > If this is indeed "The standard way to do what you want", then the > standard needs to make it clear that the sequence of mark> or has the properties which I want, i.e. it > has the width of the combining mark alone

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-10 Thread Kenneth Whistler
John Cowan asked: > > D17a Defective combining character sequence: A combining character > > sequence that does not start with a base character. > > > > * Defective combining character sequences occur when a sequence > >of combining characters appears at the start of a stri

Re: Conflicting principles

2003-08-09 Thread Kenneth Whistler
Philippe, > Just look at musical notations where a upper horizontal parenthesis > is used to group some elements (sorry I don't know how you name > it exactly in English or Italian), despite there's a measure break > in the middle, which may span to the other musical line: you end > up with two pa

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-08 Thread Kenneth Whistler
Philippe continued: > On Saturday, August 09, 2003 12:49 AM, Michael Everson wrote: > > > At 14:22 -0700 2003-08-08, Kenneth Whistler wrote: > > > > > Philippe, you are tilting at windmills, here. There is no chance > > > that the UTC is going

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-08 Thread Kenneth Whistler
Thomas Widman suggested: > Peter Kirk <[EMAIL PROTECTED]> writes: > > > On 08/08/2003 08:54, Philippe Verdy wrote: > > > > > ... Could there be another codepoint assigned that has > > >these properties: > > > > > >20CF;ZERO WIDTH SYMBOL;Sk;0;ON; 0020N; > > > [...] > > But I'm not sure th
