Re: Corporate influence on Unicode development (long)

2002-07-22 Thread Peter_Constable


On 07/21/2002 07:30:33 PM "Doug Ewell" wrote:

>First of all, the figure that William (or any other individual) really
>should be looking at is not $12,000 for a full membership, but $600 for
>a "specialist" membership or $120 for an "individual" membership.  (BTW,
>I would be interested in hearing -- perhaps off-line -- from individuals
>who hold or have held such memberships, to find out how they felt their
>memberships benefited them and Unicode.)

Only the $12,000 membership makes you eligible to vote in the UTC. Associate
and Specialist memberships do, though, give you access to the "insider's"
mailing list (where real proposals get discussed) and to documents, which
has been very useful.



- Peter


---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <[EMAIL PROTECTED]>







Corporate influence on Unicode development (long)

2002-07-21 Thread Doug Ewell

I don't know if William Overington is still a subscriber to this mailing
list -- he may have gone away to find (or form) a new group more
sympathetic to his "novel" applications of Unicode -- but one of the
issues he raised about two weeks ago, right about the time the
chromatic-code and precomposed-ligature debates were coming to a head,
was an insinuation that Unicode is unduly influenced by large corporate
interests.

William based this claim, at least in part, on the $12,000 fee required
for "full" membership in the Unicode Consortium, a membership level
described on the Unicode Web site as being appropriate for "your company
or organization" rather than for individuals.

I promised (or maybe threatened) to discuss this issue, from the
standpoint of an individual who is interested in Unicode but has yet to
join the Consortium due to financial considerations.

First of all, the figure that William (or any other individual) really
should be looking at is not $12,000 for a full membership, but $600 for
a "specialist" membership or $120 for an "individual" membership.  (BTW,
I would be interested in hearing -- perhaps off-line -- from individuals
who hold or have held such memberships, to find out how they felt their
memberships benefited them and Unicode.)

Second, many good ideas have come from this list, and we have to assume
that UTC listens to some of them and can be influenced by some of them.
It wouldn't be smart to ignore truly good ideas just because they come
from a free mailing list.

Some list members have already pointed out that the character repertoire
of Unicode/10646 can hardly be said to be reflective of the interests of
big business.  It is hard to imagine how big business would have
benefited from "pushing through" scripts like Tagbanwa or Old Italic, or
non-script blocks like Byzantine Musical Symbols.  The American
Mathematical Society, largely responsible for the big chunk of math
symbols added to Unicode 3.1, doesn't seem like a stereotypical "large
corporate interest" either.

Indeed, if big business interests were at the heart of the Unicode
character repertoire, we would probably be seeing a lot more of the
precomposed ligatures that William favored so strongly.  They would have
given Microsoft and Apple a cheap, easy way to claim "support" for
ligatures without the additional pain and complexity of performing
ligation in a more general, productive way.

And in fact, I had originally planned to write this post to debunk the
entire notion that corporate interest plays any part at all in the
development of Unicode.

But there's more to Unicode than its character repertoire; as Ken and
others remind us, character properties and technical reports and usage
guidelines are what separate Unicode from 10646.  And it is here that
some corporate influences do appear to seep in, and where the Consortium
and UTC may want to be careful to avoid either the appearance or the
reality of inextricable corporate tie-ins to Unicode.

The precomposed-ligature debate brought forth several responses phrased
in terms of, "No, you don't need a precomposed ligature at U+E7xx, or
even a ZWJ hint, because Technology Such-and-So will automatically
handle it."  Technology Such-and-So could be an application like
InDesign or FrameMaker, or it could be a font architecture like OpenType
or AAT; in the latter case there were frequent discussions of GSUB and
GPOS entries and tables, as though those were part and parcel of
Unicode.  In either case, one could reasonably infer that a particular
vendor's product or a particular technology is necessary to implement
some aspect of Unicode properly, which isn't -- or shouldn't be -- the
case.

Just today (Sunday), Mark Davis responded to a question about the
Unicode Collation Algorithm in part by pointing out how ICU ("a
particular implementation of the UCA") solves the problem.  The solution
was followed shortly with links to ICU-related sites.  Now, even though
ICU is an open-source library and thus not a money-making product of
IBM, and even though ICU may be easy to use and may greatly facilitate
the use of UCA, it's still important to realize that neither UCA nor any
other aspect of Unicode *requires* ICU.  I could roll my own UCA
implementation if I wanted to, and assuming it was correct and followed
the Unicode Standard and UTS #10, it would be just as legitimate and
just as "Unicode" as if I used ICU or any other library or tool.
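
To make the "roll my own" point concrete with something far smaller than
UCA, here is a minimal sketch of a hand-written UTF-8 encoder in Python.
It is only an illustration: the function names utf8_encode and encode_text
are made up for this example, and the comparison against Python's built-in
encoder is just a convenient cross-check, not an official conformance test.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode scalar value as UTF-8 (hypothetical helper)."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("outside the Unicode code space")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded")
    if code_point < 0x80:
        # One byte: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:
        # Two bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


def encode_text(text: str) -> bytes:
    """Encode a whole string by encoding each code point in turn."""
    return b"".join(utf8_encode(ord(ch)) for ch in text)


if __name__ == "__main__":
    sample = "a\u00e9\u20ac\U0001F600"
    # Cross-check against the library encoder; the two should agree.
    print(encode_text(sample) == sample.encode("utf-8"))
```

If the hand-written version and the library agree on every input, nothing
about the result depends on which one produced it, which is the point being
made about UCA and ICU.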

The Unicode FTP site includes sample implementations for algorithms such
as UTF-8, SCSU, UCA, and the Bidi algorithm.  (UTF-7 was once on this
list as well; thankfully, nobody talks about UTF-7 much any more.)  At
some point, the Binary-Ordered Compression for Unicode ("BOCU")
algorithm -- implemented in ICU and already mentioned in the SCSU
Technical Standard, despite having no official status in Unicode -- may
be added to this list as well.  It would be highly desirable for Unicode
to continue to provide reference implementations ra