RE: Questions about Unicode history

2002-01-31 Thread Alan Wood

> Marco Cimarosti asked
> 
>  Are OpenType fonts currently implemented in any
>  platform other than Windows?

Yes.

Apple supplies 4 Japanese OpenType fonts with Mac OS X - Hiragino Kaku
Gothic Pro, Hiragino Kaku Gothic Std, Hiragino Maru Gothic Pro and Hiragino
Mincho Pro.

Adobe supplies TektonPro with InDesign 1.5 for Mac OS 9.

Alan Wood





RE: Introducing the idea of a "ROMAN VARIANT SELECTOR" (was: Re: Proposing Fraktur)

2002-01-31 Thread Yves Arrouye

> quite a lot of space. However, Fraktur is already encoded in the
> Mathematical whatever-it's-called block. This variant selector would mean
> that lots of characters can be displayed in two *different* ways. I'd
> prefer
> that Fraktur diacritics were added instead, and that the mathematical
> letters were to be used for Fraktur texts.

I hope not. These were encoded there because they convey a specific meaning
when used for mathematics. If you use them to spell out names, then you're
abusing them and potentially confusing software that would rely on their
mathematical semantics.

I think it's time to have another proposal for French, FRENCH VARIANT
SELECTOR, where we do not use Fraktur but some other font variation. And we
may need a QUEBEC VARIANT SELECTOR if they have different rules... Or should
it be a QUEBEC FRENCH VARIANT SELECTOR to show the relationship?

YA





RE: Beta version

2002-01-31 Thread Kent Karlsson



> > Of course, e.g., a, , and a should be ordered the same
> > at the primary level for the Nordic languages.
> 
> "ä", "æ", "a ¨-above", and "a e-above" should all be sorted the same in
> Swedish, no matter whether they're written in capital or small letters. Of
> course (?), the "e-above" should always be a small "e". "a e-above" should
> not be sorted as "a", as you stated above.

Ooops.  I wish I could completely disable "US-ASCII" in the mailer!

What I originally wrote and meant, before the mail program mangled it,
was in accordance with what Stefan says here (resending in UTF-8):

---

Of course, e.g., ä, , and æ should be ordered the same
at the primary level for the Nordic languages.

/kent k

PS (with regard to the related issue with Vietnamese)

Notice how the UCA (UTS 10) requires that 
be ordered at the primary level as an a (), when å is
tailored at the primary level (e.g. to be near the end of the alphabet),
but the dot is a secondary level difference.  (Unfortunately, 14651 does
not make the same requirement...)
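
To make the idea of "ordered the same at the primary level" concrete, here is a minimal sketch of a two-level sort key. It is not the UCA algorithm and not a real Swedish tailoring: the alphabet table, the VARIANTS mapping, and the function name sort_key are all assumptions made up for illustration. The only point it shows is that æ and ä compare equal on the first (primary) pass and are separated only by a later, weaker (secondary) comparison.

```python
# Hypothetical two-level sort key, loosely modelled on primary and
# secondary collation weights. Not the UCA and not a real tailoring.

# Swedish alphabet order: å, ä, ö follow z.
ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
PRIMARY = {ch: i for i, ch in enumerate(ALPHABET)}

# Assumption for illustration: æ shares the primary weight of ä and
# differs from it only at the secondary level.
VARIANTS = {"æ": ("ä", 1)}


def sort_key(word):
    """Return (primary weights, secondary weights) for a lowercase word."""
    primary, secondary = [], []
    for ch in word.lower():
        base, variant = VARIANTS.get(ch, (ch, 0))
        primary.append(PRIMARY[base])  # KeyError for letters outside the table
        secondary.append(variant)
    # Tuples compare element by element, so every primary weight is
    # considered before any secondary weight.
    return (primary, secondary)


words = ["zebra", "æra", "ära", "apa", "ärm"]
print(sorted(words, key=sort_key))
```

Because the whole primary list is compared first, "æra" sorts next to "ära" and before "ärm", rather than being pushed elsewhere in the alphabet.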





RE: Questions about Unicode history

2002-01-31 Thread Marco Cimarosti

Thank you all for all the precious answers that I am receiving publicly and
privately.

I am collecting enough material to write a book about the history of
encoding, rather than just a short article about Unicode!

I think that much of this material has general interest, so I will post a
RESUME of all the answers as soon as I see that the thread has expired.

*** I assume that I CAN RE-POST the PRIVATE ANSWERS that I received. If any
of the authors wishes me not to republish their messages or part of them, or
wishes to remain anonymous, please let me know separately. ***

Most of the answers, of course, are contained in Magda Danish's as yet
unpublished summary of Unicode history. Where that is the case, I will simply
refer to "the Unicode history on the Unicode web site"; everybody will be able
to read it as soon as it is completed and published.

_ Marco




Re: Shamrock

2002-01-31 Thread jgo

>At 09:39 -0800 2002-01-29, Kenneth Whistler wrote:
>>Michael,
>>
>>   > >>At some stage I will be requesting a shamrock, as this is used in a
>>>   >>number of dictionaries as a symbol denoting horticulture.
>>>   >
>>>   >What about U+2663?
>>>
>>>   Where on earth did that annotation come from? A club is not a shamrock.
>>...
>  That doesn't mean the two are unifiable.
>
>  A shamrock is "any of various plants with trifoliate leaves, esp.
>  Trifolium minus, T. repens, or Medicago lupulina, used as the
>  national emblem of Ireland." Shamrock leaves are *heart-shaped*.
>
>  A clover, on the other hand, has round leaves, usually three, four
>  when you're lucky.  A clover is not used as an emblem for Ireland, nor
>  is a clover pictured in Íslensk ordabók as a symbol for botany.

Yes, we used to have both growing around the yard.

>  Further, the card suits do not derive from symbols of hearts, spades,
>  diamonds, or clubs. From
>  http://www.themysticeye.com/info/playingcard.htm :
>
>  "Designed in the Middle Ages, the tarot deck reflected medieval
>  society...--the deck included 56 cards
>  divided into four suits: cups (the church); swords (the military);
>  pentacles, or 5-pointed stars (merchants); and batons (farmers)...
>  Clover leaves. Deriving from batons, also known as wands or staves.

also "rods".

>  An interesting page suggesting that the Tarot cards may ultimately
>  have derived from China cites anthropologist W. H. Wilkinson writing
>  in 1895 on the subject.
>  http://www.ahs.uwaterloo.ca/~museum/Archive/Wilkinson/Wilkinson.html

"Somewhere around the 11th century AD, the letter W was added to
  distinguish 2 U's from a U & a V; & in the same fashion, the
  letter J came into manifestation as a variant form of the Latin
  letter I." --- William Eisen 1980 _The English Cabalah_ pg 42

"In the Cabalah, 2 of the suits of the Minor Arcana are the movers &
  the other 2 are the moved.  The more powerful suits initiate the force,
  & the weaker suits react to it." --- William Eisen 1980
  _The English Cabalah_ pp 139-140

"The Hermit is a symbol of attainment, rather than a symbol of quest...
  [He carries] the Lamp of Truth, & it contains within it the 6-pointed
  star of the Seal of Solomon.  The Hermit stands isolated & alone.
  He is always hooded, & he is robed in a mantle of discretion.  The
  staff that he leans on is the staff of intuition, & these 3 symbols...
  are what constitute his inner strength.  But what is the secret of
  his power?  It is his great symbol of authority, the letter H, the
  8th letter of the English alphabet.  Turn the number 8 over on its
  side & it becomes... the symbol of infinity.  Open up the letter H
  into its component parts, & it becomes 1-1 or nothing...  [The Magician]
  stands before a table upon which are the symbols of the 4 natural
  elements of earth, water, air, & fire.  These are represented by
  a pentacle, a cup, a sword, & a wand, respectively.  The cosmic lemniscate
  symbol of infinity is above his head, & around his waist is the occult
  symbol of eternity -- a serpent swallowing its tail.  His black hair,
  which is bound by a white band, signifies the limitation of ignorance
  by knowledge...  Strength  A woman, garlanded with flowers [with the
  symbol of infinity above her head] & dressed in a simple white robe,
  is closing the mouth of a ferocious lion with as much ease as if it
  were a lamb.  In other decks she is opening it...  She is the High
  Priestess (the letter F), the counterpart of the Magician (the letter
  She is demonstrating that the powers of the mind are far superior to
  the physical strength of the lion, & that she is truly 'One to Obey'."
  --- William Eisen 1980 _The English Cabalah_ pp 349 & 350 & 366

"Indeed, when religious people quarrel about religion, or hungry
  people quarrel about victuals, it looks as if they had not much
  of either among them." --- Benjamin Franklin (quoted in Joseph Lewis
  _Benjamin Franklin, FreeThinker_; quoted in B. James 2002 January
  _The Irish Times_ vol 1 #6 pg 6)

John G. Otto, Eagle Scout, Knight, Cybernetic Praxeologist
Existence, Consciousness, Identity, Life, Liberty, Property, Privacy, Justice





Re: Questions about Unicode history

2002-01-31 Thread Mark Davis

For when particular characters were added to Unicode, you can also
consult the new DerivedAge.txt, currently in the BETA at:

http://www.unicode.org/Public/BETA/Unicode3.2/DerivedAge-3.2.0d2.txt
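
For anyone who wants to query such a file programmatically, the following is a rough sketch of a parser. It assumes the file follows the usual Unicode Character Database layout ("code point or range ; value # comment"). The SAMPLE text is hand-written for illustration and is not an excerpt of the real file, and the function names parse_age_lines and age_of are made up here.

```python
# Sketch of a parser for lines shaped like "XXXX..YYYY ; version # comment".
# SAMPLE is illustrative data only, not copied from DerivedAge.txt.
SAMPLE = """\
# illustrative data only
0041..005A    ; 1.1 # LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
20AC          ; 2.1 # EURO SIGN
"""


def parse_age_lines(text):
    """Return a list of (first, last, version) tuples."""
    entries = []
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not data:
            continue
        span, version = (field.strip() for field in data.split(";"))
        first, _, last = span.partition("..")
        # A single code point is treated as a one-element range.
        entries.append((int(first, 16), int(last or first, 16), version))
    return entries


def age_of(code_point, entries):
    """Look up the version string for a code point, or None if not listed."""
    for first, last, version in entries:
        if first <= code_point <= last:
            return version
    return None


entries = parse_age_lines(SAMPLE)
print(age_of(ord("€"), entries))
```

A linear scan is enough for a sketch; with the full file a sorted list and binary search would probably be the more sensible choice.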

Mark
—

Πόλλ’ ἠπίστατο ἔργα, κακῶς δ’ ἠπίστατο 
πάντα — Ὁμήρου Μαργίτῃ
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]

http://www.macchiato.com

- Original Message -
From: "Kenneth Whistler" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Wednesday, January 30, 2002 12:18
Subject: Re: Questions about Unicode history


> Marco,
>
> I'll answer as many of your questions as I can, and will
> cc this to the unicode list (in part to forestall a gazillion
> "Well, I think maybe X" responses).
>
> --Ken
>
> > - When did the Unicode project start, and who started it?
>
> The detailed history for this will soon be available on the
> Unicode website. The short answer is that Joe Becker (Xerox) and
> Lee Collins (Apple) were highly instrumental in getting the
> ball rolling on this, and the preliminary work they did,
> primarily on Han unification, dated from 1987.
>
> However, "the Unicode project" had many beginnings -- many points
> where you could mark a milestone in its early development. And
> the Unicode Consortium celebrated a number of 10-year
> anniversaries, starting from 1998 and continuing through last year.
>
> >
> > - Is it true Han Unification was the core of Unicode, and the idea of an
> > universal encoding come afterwards?
>
> The effort by Xerox and Apple to do a Han unification was key to
> the motivation that eventually led to a serious effort to actually
> *do* Unicode and then to establish the Unicode Consortium to
> standardize and promote it. However, the idea of a universal encoding
> predated that considerably. In some respects the Xerox Character Code
> Standard (XCCS) was a serious attempt at providing a universal
> character encoding (although it did not include a unified Han
> encoding, but only Japanese kanji). XCCS 2.0 (1980) contained, in
> addition to Japanese kanji: Latin (with IPA), Hiragana, Bopomofo, Katakana,
> Greek, Cyrillic, Runic, Gothic, Arabic, Hebrew, Georgian, Armenian,
> Devanagari, Hangul jamo, and a wide variety of symbols. The early
> Unicoders mined XCCS 2.0 heavily for the early drafts of Unicode 1.0,
> and always regarded it as the prototype for a universal encoding.
>
> Additionally, you have to consider that the beginning of the ISO project
> for a Multi-octet Universal Character Set (10646) predated the
> formal establishment of Unicode. Part of the impetus for the serious
> work to standardize Unicode was, of course, discontent with the
> then architecture of the early drafts of 10646.
>
> >
> > - Who and when invented the name "Unicode"?
>
> This one has a definitive answer: Joe Becker coined the term,
> for "unique, universal, and uniform character encoding", in 1987.
> First documented use is in December, 1987.
>
> >
> > - When did the ISO 10646 project start?
>
> Unfortunately, the document register for early WG2 documents doesn't
> have dates for all the early documents, and I don't have all the
> early documents to check. But...
>
> The 4th meeting of WG2 was held in London in February, 1986. The
> first three meetings were in Geneva, Turin, and London, respectively.
> That puts the likely timeframe for the Geneva meeting, and the
> establishment of WG2 by SC2 at about 1984. The *only* project for WG2
> was 10646.
>
> Some of the older oldtimers on the list may have more exact information
> about the early WG2 work.
>
> >
> > - When did Unicode and ISO 10646 merge?
>
> It wasn't a single date that can be pointed to, like the signing
> of an armistice. In some respects, Unicode and ISO 10646 are *still*
> merging, as modifications and amendments to deal with niggling little
> architectural edge cases are worked out.
>
> However the key dates were:
>
> January 3, 1991. Incorporation of the Unicode Consortium, which
>    signalled to SC2 that the Unicoders were serious in their
>    intentions.
>
> May, 1991. Meeting #19 of WG2 in San Francisco. An ad hoc meeting
>    took place between WG2 members and some Unicoders, which paved
>    the way for the later "merger" of the standards.
>
> June, 1991. The 10646 DIS 1 was defeated in its ballotting. This left
>    the only reasonable way forward an architectural compromise with
>    the Unicode Standard, which at that point was in copy edit and
>    about to go to press.
>
> June 3, 1991. The date of "10646M proposal draft to merge Unicode and
>    10646", by Ed Hart. This was a key document in the resulting
>    merger of features.
>
> August, 1991. The Geneva WG2 meeting accepted Han unification, combining
>    marks, dropped byte-by-byte restrictions on code values for UCS-2,
>    and accepted Unicode repertoire additions. From that point forward,
>    the overall aspect of what became ISO/IEC 10646-1:1993 was clear.
>
> >
> > - Wha

Keyboard mapping on Windows XP?

2002-01-31 Thread Rick McGowan

Recently I got Windows XP. Now I need to "fix" the keyboard.

On Windows 98 I used to use the great ZDKeyMap utility (a virtual driver  
available at zdnet.com) to remap several keys on my keyboard. This utility  
doesn't work with Windows XP.

Does anyone out there have a keyboard re-mapping utility -- free or cheap  
or even expensive! -- that works for Windows XP?

Thanks,
Rick

P.S. I've looked around a bit on the Net and did come up with "Remapper  
XP" utility, but it only swaps Control with CapsLock. If anyone knows how  
to swap the Escape and Tilde/Backquote keys, I'd be much obliged for info!









RE: Questions about Unicode history

2002-01-31 Thread Greenwood, Timothy

> - When did the ISO 10646 project start?

A paper that I wrote ("International Character Sets - the 7/8 bit story") for an April 
1985 conference at Digital references a note from Masami Hasegawa, the original editor 
of 10646.  This note was dated 17 October 1984. Masami's paper "Towards Multi-Lingual 
Data Processing" for the same conference has the paragraph 

'In the plenary meeting of TC97/SC2 of ISO, which is a sub-committee for information 
coding, it was decided that an International Standard is needed for a two  byte 
graphic character set. Thus a working group WG2, two-octet graphic, was formed to 
write a draft proposal.'


> - When did Unicode and ISO 10646 merge?

See 
http://groups.google.com/groups?q=hasegawa+ISO+10646&hl=en&selm=10635%40sun103.crosfield.co.uk&rnum=2
 for a report on the first (or one of the first) merger meetings. 


> - When was ISO 8859 published?

The above paper has it that the ECMA standard was approved in December 1984 and that 
ISO and ANSI were approving it as the paper was written in early 1985.


Tim Greenwood





Re: Keyboard mapping on Windows XP?

2002-01-31 Thread Frank da Cruz

> Recently I got Windows XP. Now I need to "fix" the keyboard.
> 
> On Windows 98 I used to use the great ZDKeyMap utility (a virtual driver  
> available at zdnet.com) to remap several keys on my keyboard. This utility  
> doesn't work with Windows XP.
> 
> Does anyone out there have a keyboard re-mapping utility -- free or cheap  
> or even expensive! -- that works for Windows XP.
> 
> Thanks,
>   Rick
> 
> P.S. I've looked around a bit on the Net and did come up with "Remapper  
> XP" utility, but it only swaps Control with CapsLock. If anyone knows how  
> to swap the Escape and Tilde/Backquote keys, I'd be much obliged for info!
> 
I gave up on all this a long time ago.  I have to use PCs with many OS's, and
figuring out how to remap the keys on each one is too much of a time waster.
Instead, you can buy a very nice keyboard in the classic IBM tradition --
heavy, great touch, tactile/audible feedback, i.e. just like IBM keyboards
were before the advent of the cheesy $5 keyboard, but with Ctrl/Caps Lock
swapped and also Esc swapped with Grave/Tilde:

  http://www.pckeyboard.com/customizer.html

For this layout you want the "Linux II" model.

- Frank




dynamic font (WEFT)

2002-01-31 Thread Kundan Singh



hi all
 
can anybody tell me how to make a dynamic font for global use (one that can be
uploaded to any server) without tying it to a specific URL?
 
thanks in advance
kundan


Re: Beta version

2002-01-31 Thread Kenneth Whistler

David Starner and Doug Ewell wondered:

> In a message dated 2002-01-30 21:48:47 Pacific Standard Time, 
> [EMAIL PROTECTED] writes:
> 
> > Is there someplace where we, the unwashed masses, have access to these
> > documents?
> 
> Yeah.  Good question.  I've found some of them myself, in particular the code 
> charts, by poking around the WG2 site at dkuug.dk and in other places.  If 
> they're on the public Internet, I have every right to see them and download 
> them, but they clearly weren't put there for that purpose.

The official location of the WG2 site is:

http://www.dkuug.dk/

where you can navigate to the WG2 page.

Or you can avoid the navigation bar, and go direct to:

http://www.dkuug.dk/jtc1/sc2/wg2/

The official location of the SC2 document register is:

http://lucia.itscj.ipsj.or.jp/servlets/ScmDoc10?Com_Id=02

(I wish they had a normal index page as a starting point, but there
you are.)

If you go to the SC2 document register, you can search for
SC2 N3584, which is the PDAM 2 to 10646-1, and for SC2 N3585,
which is the PDAM 1 to 10646-2. Or just pick the document
range N3551-N3600 and browse the list.

FDAM's, administrative documents, and some other documents are locked
on that list, but PDAM's and notices are publicly available without
password access.

The two PDAM documents are actually simply links to zipped documents
also sitting on the dkuug.dk server in the SC2 section there. So
if you want to avoid going through the register, here you go:

http://anubis.dkuug.dk/jtc1/sc2/open/02n3584.zip
http://anubis.dkuug.dk/jtc1/sc2/open/02n3585.zip

If you want to provide feedback on the documents, the right way to
do it is to go through a national body. Obviously, for Unicoders,
the easiest way to do that is by getting formal feedback (as documents,
not just unicode list email chatter) to the UTC, where joint positions
are developed for Unicode and L2 feedback to WG2 through ballotting.

But if you are on the unicode email list but feel it is appropriate
to provide your feedback through another active national body (Ireland,
Japan, Germany, Sweden, Finland, ... whatever), then that, too, is
of course, up to you.

If you are in any doubt about who the national bodies are, ISO
keeps a list:

http://www.iso.ch/iso/en/aboutiso/isomembers/MemberList.MemberSummary?MEMBERCODE=10

will get you directly there.

--Ken




Re: Shamrock

2002-01-31 Thread Wm Seán Glen



I concur with Michael: a shamrock, as a symbol for Ireland, is not a
clover, even though they may be in the same genus. It would be like
confusing the smiley face with Mr Yuck.
Although, as a symbol of Éire, I still prefer "Azure, a harp Or,
strung argent".
Wm Seán Glen

  - Original Message -
  From: Otto Stolz
  To: [EMAIL PROTECTED]
  Sent: Tuesday, 29 January, 2002 8:14
  Subject: Re: Shamrock

  Mr. Everson said:
  > At some stage I will be requesting a shamrock, as
  > this is used in a number of dictionaries as a symbol denoting
  > horticulture.

  What about U+2663?

  Best wishes,
    Otto Stolz


Re: Proposing Fraktur

2002-01-31 Thread Stefan Persson

- Original Message -
From: "Kenneth Whistler" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: den 31 januari 2002 01:04
Subject: Re: Proposing Fraktur


> > And so what? I thought the meaning of Unicode was that all languages should
> > be fully supported in plain text, using one single font to displaying all of
> > the characters. With old Swedish, this isn't possible.
>
> I think this misconstrues the mission of Unicode as an encoding. The goal
> is to encode sufficient characters to enable the correct and legible
> representation of *plain* text in any script (modern or historic).

This distinction has to be made everywhere (read: including in plain text),
otherwise the text is grammatically wrong.

- Original Message -
From: "Yves Arrouye" <[EMAIL PROTECTED]>
To: "'Stefan Persson'" <[EMAIL PROTECTED]>; "Karl Pentzlin"
<[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: den 31 januari 2002 09:54
Subject: RE: Introducing the idea of a "ROMAN VARIANT SELECTOR" (was: Re:
Proposing Fraktur)

> > quite a lot of space. However, Fraktur is already encoded in the
> > Mathematical whatever-it's-called block. This variant selector would mean
> > that lots of characters can be displayed in two *different* ways. I'd prefer
> > that Fraktur diacritics were added instead, and that the mathematical
> > letters were to be used for Fraktur texts.
>
> I hope not. These were encoded there because they convey a specific meaning
> when used for mathematics. If you use them to spell out names, then you're
> abusing them and potentially confusing software that would rely on their
> mathematical semantics.

Letters A through Z and ALPHA through OMEGA are used in *both* text and
mathematics, and I see no problem with this. Why would this cause problems
with the Fraktur letters in the Mathematical Alphanumeric Symbols block?
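
The character properties both sides are arguing about can be inspected directly. The sketch below uses only Python's standard unicodedata module and takes no side in the argument. It shows two facts about the Mathematical Alphanumeric Symbols block: the Fraktur letters there fold to plain Latin letters under compatibility normalization, and the block has holes, because some Fraktur capitals such as C had already been encoded in Letterlike Symbols.

```python
import unicodedata

fraktur_a = "\U0001D504"

# Name of the character in the Mathematical Alphanumeric Symbols block.
print(unicodedata.name(fraktur_a))

# Compatibility normalization (NFKC) folds it to the ordinary Latin letter.
print(unicodedata.normalize("NFKC", fraktur_a))

# U+1D506, where a Fraktur capital C would be expected, is unassigned;
# name() returns the supplied default instead of raising.
print(unicodedata.name("\U0001D506", "<unassigned>"))

# The Fraktur capital C lives in the Letterlike Symbols block instead.
print(unicodedata.name("\u212D"))
```

Any process that applies NFKC to running text would therefore turn a word spelled with these letters back into plain Latin letters, and a naive "add an offset to A..Z" mapping would hit the unassigned positions.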

> I think it's time to have another proposal for French, FRENCH VARIANT
> SELECTOR, where we do not use Fraktur but some other font variation. And we
> may need a QUEBEC VARIANT SELECTOR if they have different rules... Or should
> it be a QUEBEC FRENCH VARIANT SELECTOR to show the relationship?

Do you have to use *both* kinds of characters at the same time in the same
document? In old Swedish you have to use *both* "a"'s at the same time,
otherwise the text is grammatically wrong, even in plain text.

Stefan







Re: Shamrock

2002-01-31 Thread Michael Everson

At 10:06 -0800 2002-01-31, Wm Seán Glen wrote:
>I concur with Michael, a shamrock, as a symbol for Ireland, is not a 
>clover, even though they may be in the same genus. It would be like 
>confusing the smiley face for Mr Yuck.
>Although, as a symbol of Éire, I still prefer "Azure, a harp Or, 
>strung argent"
>Wm Seán Glen

Hence, the annotation "= shamrock" should be removed from the clubs suit.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com




Re: Unicode support in IBM AS -400

2002-01-31 Thread Markus Scherer

Anil Joshi wrote:

> The exact questions that I am looking for are
> a. Does AS-400 support Unicode if so what kind of support it is. I mean can
> I have files names in local language. Say file names in Japanese.


As far as I know, OS/400 filenames have been 16-bit Unicode for a number of years, and 
some other parts of the OS support Unicode as well.


> b. Does QShell support multilingual scripts I mean can I write a script that
> can contain Japanese file names. 


I don't know - I am not that familiar with OS/400...

Some releases of ICU have been ported to OS/400. ICU 2.0 is being ported right now.
Please watch this space if you are interested: 
http://oss.software.ibm.com/icu/download/

Best regards,
markus





Re: Introducing the idea of a "ROMAN VARIANT SELECTOR" (was: Re: Proposing Fraktur)

2002-01-31 Thread Kenneth Whistler

Stefan suggested:

> > Maybe something like a "ROMAN VARIANT SELECTOR" would be appropriate:
> 
> In any case, it'd be better to have *two* selectors, one to turn on Fraktur,
> and a different one to turn it off. Otherwise, you'd have to put the variant
> selector after *every* letter you want to be in antiqua, which would require
> quite a lot of space. 

That sounds like a perfect example of where use of markup is appropriate.

Using Karl Pentzlin's Duden example:

Das sinkende Schiff sandte SOS-Rufe. (The sinking ship emitted SOS calls.)
fff Ffff ff Ff aaaff

Das sinkende Schiff sandte SOS-Rufe.

or conversely, perhaps better:

Das sinkende Schiff sandte SOS-Rufe.

And your rendering process then has all the context to decide whether
to apply the Fraktur rendering rules or the Antiqua rendering rules
to particular segments.
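
As a rough illustration of how scoped markup hands that context to a rendering process, here is a small sketch. The <antiqua> element name is invented for this example (the message does not specify any tag names), and the function name segments and the default style "fraktur" are likewise assumptions. The sketch only shows that the plain text stays untouched while the scope of each style is recoverable from the markup.

```python
import re

# Hypothetical markup: the <antiqua> element name is invented for this sketch.
MARKED_UP = "Das sinkende Schiff sandte <antiqua>SOS</antiqua>-Rufe."


def segments(text, default="fraktur"):
    """Split marked-up text into (run, style) pairs."""
    result = []
    # With one capture group, re.split alternates between text outside
    # the element (even indices) and text inside it (odd indices).
    parts = re.split(r"<antiqua>(.*?)</antiqua>", text)
    for i, part in enumerate(parts):
        if part:
            result.append((part, "antiqua" if i % 2 else default))
    return result


runs = segments(MARKED_UP)
for run, style in runs:
    print(repr(run), style)

# The plain text content is recoverable by dropping the style information.
plain = "".join(run for run, _ in runs)
print(plain)
```

Nothing stateful is left in the character stream itself: removing the markup yields exactly the plain sentence.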

What makes you think that having these "selectors" encoded as characters,
rather than as markup, will improve the situation? You would need
other markup, anyway, to make actual font choices, for example.

And in general, the piling on of more stateful formatting
controls is damaging to the standard and damaging to the clear
relationship between the plain text content conveyed in Unicode
and the markup conveyed in markup languages such as HTML and XML.
(That relationship is already complicated enough at the edges, for
things like bidi -- and adding more instances of stateful transitions
that the purveyors of markup language believe belong in their
standards rather than in plain text doesn't help anything.)

There is a reason why similar stateful selectors such as U+206E NATIONAL
DIGIT SHAPES and U+206F NOMINAL DIGIT SHAPES are formally deprecated
in the Unicode Standard.

--Ken





When to use markup: (Was:Introducing the idea of a "ROMAN VARIANT SELECTOR" (was: Re: Proposing Fraktur))

2002-01-31 Thread Asmus Freytag

At 09:42 AM 1/30/02 +0100, Karl Pentzlin wrote:
>The question is, are typesetting rules "part of the script"?
>
>(I mean rules in the sense of obligatory regulations, not guidelines).

This distinction is a very German way of approaching the question.

>If yes, (in my opinion) the plain text must carry the information that is
>needed to follow them. If no, their execution can be left to higher level
>protocols (which then have to decide whether a word is a foreign word
>[to be set in Roman letters] or a name [to be set in Fraktur letters],
>such at least according to German typesetting rules).

A more productive distinction would be along these lines:

a) is the feature necessary for correctly expressing the content
b) is the feature rule based, and
b.1) is the rule implementable w/o knowledge of semantics, or
c) when implementing the feature, is it necessary to
c.1) provide scope information, or
c.2) is local context sufficient

Looking at this list, roughly in reverse order:

Higher level protocols, understood as "markup languages" in particular,
do really well, when implementing something requires defining a scope,
since in them, all text data and the effect of all syntax are scoped
already.

If layout features can be determined algorithmically, it makes little
sense to add what can be derived from the existing text data, also into
the markup. Allowing for duplicate representation of information, always
allows the possibility of something getting out of step.

If semantic knowledge is required to implement a feature, this knowledge
must be supplied. If the extra information can be expressed as point-like,
local context, then it makes much *less* sense to use higher level markup
compared to character codes. Character codes, in a way, provide the ideal
representation of point like context in a data stream.

Finally, we get back to the original argument. Whether a typesetting
rule (and by rule I mean both conventions and legislated rules) is
supported by information added to the plain text or not, does not depend
on whether a national authority promulgates it, or whether it just
represents the consensus of the users of the language.

If, in practice, such a rule can be ignored, yet not change the meaning
of the text, it's a good candidate for not being implemented via plain
text. However, this is not absolute:

Leaving out italics from a document can not only change the level of
emphasis, but for example in English, there are occasional circumstances
where the use of italics removes a possible ambiguity in interpreting
a sentence. Nevertheless (except for mathematics) italics were left to
a higher level protocol (style markup).

Overriding bad hyphenation, or bad line breaks, is supported by SHY and
NBSP, even though hyphenation is not required at all to express the
content of a text, nor would bad line breaks e.g. after "Dr." change
the meaning of the text.

In the latter two cases, character codes were added (fairly early) to
plain text, because using point-like context to support these very
common algorithms (hyphenation and linebreak) is an elegant solution,
while adding markup for the same purpose would be inelegant to the
extreme.
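
The "point-like context" idea can be sketched in a few lines. The rule below is deliberately naive and is not the Unicode line breaking algorithm: it assumes a break is allowed only after an ordinary space or a soft hyphen, and the function name break_opportunities is made up for this example. What it shows is that a single character in the data stream (NBSP or SHY) is enough to steer the algorithm, with no scope to open or close.

```python
SHY = "\u00AD"   # SOFT HYPHEN: marks a permitted hyphenation point
NBSP = "\u00A0"  # NO-BREAK SPACE: a space that must not end a line


def break_opportunities(text):
    """Indices before which a line break is permitted (naive rule)."""
    allowed = []
    for i, ch in enumerate(text):
        # An ordinary space or a soft hyphen allows a break after it;
        # a no-break space is simply not in this set.
        if ch == " " or ch == SHY:
            allowed.append(i + 1)
    return allowed


text = "Dr." + NBSP + "Smith hyphen" + SHY + "ates"
print(break_opportunities(text))
```

Here no break is offered between "Dr." and "Smith", while the soft hyphen adds an opportunity inside "hyphenates" that the text would otherwise not have.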

Like everything else in character encoding, there are shades of gray,
and levels of gradation, so not everything is clear cut. But recognizing
up front that character codes may legitimately serve the support of
algorithms, even where the feature implemented by the algorithm is
merely common, and not absolutely and minimally required, is useful.

A./




Re: Keyboard mapping on Windows XP?

2002-01-31 Thread Michael \(michka\) Kaplan

Hi Rick,

I cannot ever get this sort of thing to work; I usually end up creating a
new keyboard layout instead


MichKa

Michael Kaplan
Trigeminal Software, Inc.  -- http://www.trigeminal.com/

- Original Message -
From: "Rick McGowan" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, January 31, 2002 8:31 AM
Subject: Keyboard mapping on Windows XP?


> Recently I got Windows XP. Now I need to "fix" the keyboard.
>
> On Windows 98 I used to use the great ZDKeyMap utility (a virtual driver
> available at zdnet.com) to remap several keys on my keyboard. This utility
> doesn't work with Windows XP.
>
> Does anyone out there have a keyboard re-mapping utility -- free or cheap
> or even expensive! -- that works for Windows XP.
>
> Thanks,
> Rick
>
> P.S. I've looked around a bit on the Net and did come up with "Remapper
> XP" utility, but it only swaps Control with CapsLock. If anyone knows how
> to swap the Escape and Tilde/Backquote keys, I'd be much obliged for info!
>





Re: Proposing Fraktur

2002-01-31 Thread David Starner

On Thu, Jan 31, 2002 at 07:32:40PM +0100, Stefan Persson wrote:
> Do you have to use *both* kinds of characters at the same time in the same
> document? In old Swedish you have to use *both* "a"'s at the same time,
> otherwise the text is grammatically wrong, be it so in plain text.

Being grammatically wrong implies that there's an error in the normal
form of the language - that is, the spoken form for most languages. And
I don't see it as any different from the rules that you must put the
titles of books in italics.

-- 
David Starner - [EMAIL PROTECTED], dvdeug/jabber.com (Jabber)
Pointless website: http://dvdeug.dhis.org
What we've got is a blue-light special on truth. It's the hottest thing 
with the youth. -- Information Society, "Peace and Love, Inc."




RE: Unicode support in IBM AS -400

2002-01-31 Thread Jones, Bob

Within DB2/400 you can use Unicode using the "GRAPHIC" and "VARGRAPHIC" data
types using CCSID of 13488 (UCS-2).  In addition, as Markus mentions, ICU is
supported on the AS/400.  My understanding is that at some point (V5R1 or
V5R2, I think) IBM is planning to ship ICU with the OS.  As far as filenames
and scripts, I'm not sure what you can do with these as far as Unicode.

We have been able to run in UCS-2 with our C based product on the AS/400
with the DB2/400 database.  We use ICU for a lot of the internal string
functionality.
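
For readers unfamiliar with UCS-2, here is a minimal sketch of what such data looks like at the byte level. It assumes that CCSID 13488 data is big-endian UCS-2, as stated above for the encoding and assumed here for the byte order; it does not use any DB2 or ICU API, and the helper name to_ucs2_be is made up for this example. Python's "utf-16-be" codec produces the same bytes as UCS-2 for characters inside the Basic Multilingual Plane, so the sketch rejects anything outside it.

```python
def to_ucs2_be(text):
    """Encode text as big-endian 16-bit units, rejecting non-BMP characters."""
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("UCS-2 cannot represent characters outside the BMP")
    # For BMP characters, UTF-16BE and big-endian UCS-2 are byte-identical.
    return text.encode("utf-16-be")


data = to_ucs2_be("日本語")
print(data.hex(), len(data))
```

Each of the three characters occupies exactly two bytes, which is the fixed-width property that makes a GRAPHIC column length easy to reason about.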

Bob

Bob Jones
OneWorld Tools Development
JDEdwards
Denver, Colorado

-Original Message-
From: Markus Scherer [mailto:[EMAIL PROTECTED]]
Sent: Thursday, January 31, 2002 12:36 PM
To: unicode
Subject: Re: Unicode support in IBM AS -400


Anil Joshi wrote:

> The exact questions that I am looking for are
> a. Does AS-400 support Unicode if so what kind of support it is. I mean can
> I have files names in local language. Say file names in Japanese.


As far as I know, OS/400 filenames have been 16-bit Unicode for a number of
years, and some other parts of the OS support Unicode as well.


> b. Does QShell support multilingual scripts I mean can I write a script that
> can contain Japanese file names. 


I don't know - I am not that familiar with OS/400...

Some releases of ICU have been ported to OS/400. ICU 2.0 is being ported
right now.
Please watch this space if you are interested:
http://oss.software.ibm.com/icu/download/

Best regards,
markus





Unicode History

2002-01-31 Thread Mark Davis



There has been a lot of recent interest in Unicode 
history. Magda has put together a set of pages based on some of our internal 
documents. While we will continue to flesh out and improve these pages, the 
initial versions are publicly available, under "Historical Data" 
on:
 
   http://www.unicode.org/unicode/consortium/consort.html
 
Mark


Re: When to use markup: (Was:Introducing the idea of a "ROMAN VARIANT SELECTOR"(was: Re: Proposing Fraktur))

2002-01-31 Thread ろ〇〇〇〇 ろ〇〇〇

.. about Fraktur vs. Roman being a codepoint difference rather than a 
markup difference..
>
>Like everything else in character encoding, there are shades of 
>gray,
>and levels of gradation, so not everything is clear cut. But 
>recognizing
>up front that character codes may legitimately serve the support of
>algorithms, even where the feature implemented by the algorithm is
>merely common, and not absolutely and minimally required, is useful.
>
>A./
>

かたかなむようぞ！ひらがなをつかえる！！

I DON'T NEED LOWERCASE! I CAN USE CAPITAL LETTERS!



→　じゅういっちゃん　←
　だんせいらしさむよう




commands

2002-01-31 Thread John Noronha



 


Old Hungarian

2002-01-31 Thread Gaspar Sinai

Hi,
Does anyone know if anything happened since the last
proposal in 1988 to include Old Hungarian

 http://wwwold.dkuug.dk/JTC1/SC2/WG2/docs/n1686/n1686.htm

into Unicode? I plan to input text in Szekely Rovasiras,
and I am about to assign PUA code points. Converting from the PUA
later would hurt portability if the script is finally included...

What are the chances that it will be included?

Thanks
gaspar





Re: Old Hungarian

2002-01-31 Thread DougEwell2

In a message dated 2002-01-31 20:20:33 Pacific Standard Time, 
[EMAIL PROTECTED] writes:

> Does anyone know if anything happened since the last
> proposal in 1988 to include Old Hungarian

Actually 1998.  But yes, I was wondering about the status of the rovásírás as 
well.

-Doug Ewell
 Fullerton, California