Character properties

2002-07-15 Thread Suzanne M. Topping

Is there a place on unicode.org which describes the concept of
properties in greater detail?

 -Original Message-
 From: Kenneth Whistler [mailto:[EMAIL PROTECTED]]

 Other properties accrue more directly to characters, per se.
 They attach to the abstract character, and get associated with
 a code point more indirectly by virtue of the encoding of that
 character. The numeric value of a character would be a good example
 of this. No one expects an unassigned code point or an assigned
 dingbat character or a left bracket to have a numeric value property
 (except perhaps a future generation of Unicabbalists).
  
  There are no corresponding features in other character sets usually.
 
 Correct. Before the development of the Unicode Standard, character
 encoding committees tended to leave property assignments
 either up to implementations (considering them obvious) or up
 to standardization committees whose charter was character
 processing -- e.g. SC22/WG15 POSIX in the ISO context.
 
 The development of a Universal character encoding necessitated
 changing that, bringing character property development and
 standardization under the same roof as character encoding.
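
The numeric-value example in the message above can be tried out with a minimal sketch. It assumes Python's standard unicodedata module as the source of property data; the helper name numeric_value is made up for illustration, and the results reflect whatever Unicode version the interpreter ships with.

```python
import unicodedata

def numeric_value(ch):
    """Return the numeric value property of ch, or None if it has none."""
    # unicodedata.numeric() returns the supplied default when the
    # character carries no numeric value.
    return unicodedata.numeric(ch, None)

samples = {
    "DIGIT SEVEN": "7",
    "VULGAR FRACTION ONE HALF": "\u00bd",
    "THAI DIGIT THREE": "\u0e53",
    "LEFT SQUARE BRACKET": "[",
    "AIRPLANE (a dingbat)": "\u2708",
    "U+0378 (unassigned at the time of writing)": "\u0378",
}

for label, ch in samples.items():
    print(f"U+{ord(ch):04X} {label}: {numeric_value(ch)}")
```

Digits and fractions report a value; the bracket, the dingbat and the unassigned code point report None, which matches the expectation described in the message.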
 




Re: Common input methods for IPA

2002-07-15 Thread Doug Ewell

Marc Wilhelm Küster kuester at saphor dot net wrote:

 Lukas' German-based phonetic keyboard is something I'll definitely
 take a deeper look into -- what I saw so far on the quoted URL is
 promising. It comes closest to a turn-key solution for Germany for
 the time being. At the same time I'll also have a look at the UniPad
 keyboard.

Remember that SC UniPad is a standalone Unicode text editor.  Keyboards
designed for UniPad can't really be used with anything else.

Also, I haven't made my IPA keyboard for UniPad available yet, for the
reasons I mentioned: it's based on the images at gy.com (which are
somewhat difficult to interpret and which can hardly be said to depict a
standard IPA keyboard), plus there are several characters missing that
I haven't been able to find in Unicode.

If you still want to have a look at it, though, I can e-mail it to you
(and for that matter, to anyone who might be able to help me find those
missing characters!).

I have released six other custom keyboards for UniPad; anyone who is
interested should check http://www.unipad.org/keyboard/ for more
information.

-Doug Ewell
 Fullerton, California





Re: Proposal: Ligatures w/ ZWJ in OpenType

2002-07-15 Thread Doug Ewell

Concerning the use of ZWJ to request ligation in the Latin script (and,
less contentiously, the use of ZWNJ to prevent it), many -- including
some experts and UTC members -- have stated that ZWJ should only be used
in exceptional circumstances, or when the requested ligature is
necessary grammatically or orthographically instead of stylistically
(however that is determined).

I'm starting to see why I disagree so strongly with this position.  It's
not that I'm eager to pepper my text with ZWJs or to require other
writers to do the same, or even that I think modern English text in most
circumstances really requires much more than the basic f-ligatures.

No, what bothers me is that the ZWJ/ZWNJ ligation scheme is starting to
look just like the DOA (deprecated on arrival) Plane 14 language tags.
In each case, Unicode has created a mechanism to solve a genuine (if
limited) need, but then told us -- officially or unofficially -- that we
should not use it, or that it is reserved for use with special
protocols which are never defined or mentioned again.

I think I've lost the battle regarding Plane 14 tags -- though I can't
promise I'll never use them in plain text without those mysterious
special protocols -- but the fight for ZWJ ligation continues.

The UTC may have intended that ZWJ ligation be used only in rare and
exceptional circumstances, but UAX #27, revised section 13.2 doesn't say
that.  It says that ZWJ and ZWNJ *may be used* to request ligation or
non-ligation, and that font vendors should add ZWJ to their ligature
mapping tables as appropriate.  It does acknowledge that some fonts
won't (or shouldn't) include glyphs for every possible ligature, and
never claims that they must (or should).  It specifically does *not* say
that ZWJ ligation is to be restricted to certain orthographies, or to
cases where ligation changes the meaning of the text.

As Michael and Asmus have pointed out, without ZWJ ligation we will
continue to see numerous, very serious proposals to add more ligated
presentation forms to Unicode.  Is that what we want?  Not everyone will
buy into the notion that AAT and OpenType will automagically handle all
ligation scenarios.

ZWJ/ZWNJ for ligation control is part of Unicode.  It is not always the
best solution, but it is *a* solution, and should be available to the
user without restriction or discouragement.

-Doug Ewell
 Fullerton, California
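
As a rough plain-text illustration of the mechanism discussed above, the sketch below inserts ZWJ (U+200D) to request a ligature and ZWNJ (U+200C) to prevent one. The function names are hypothetical, and the sketch says nothing about whether a given font or layout engine will honor the request. The stripping helper is only reasonable for Latin text, since both characters carry other meanings in scripts such as Arabic and the Indic scripts.

```python
ZWJ = "\u200d"   # ZERO WIDTH JOINER
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

def request_ligature(text, pair):
    """Insert ZWJ between the two letters of every occurrence of pair."""
    first, second = pair
    return text.replace(first + second, first + ZWJ + second)

def prevent_ligature(text, pair):
    """Insert ZWNJ between the two letters of every occurrence of pair."""
    first, second = pair
    return text.replace(first + second, first + ZWNJ + second)

def strip_ligation_controls(text):
    """Remove ZWJ/ZWNJ, e.g. before searching or comparing Latin text."""
    return text.replace(ZWJ, "").replace(ZWNJ, "")

marked = request_ligature("act", "ct")
print([f"U+{ord(c):04X}" for c in marked])
print(strip_ligation_controls(marked) == "act")
```

Because the joiners are ordinary characters in the text stream, any process that compares or searches such text has to decide whether to ignore them, which is one practical cost of marking ligation in plain text.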





Is UniCode's Thai character representation is acceptable by TISI or not?

2002-07-15 Thread Sreedhar M

Hi Samphan,
   Thank you for your kind response. Please let me know whether
Unicode's Thai character representation is acceptable to TISI. It is
essential to our project.
Thanks in advance.
Regards,
Sreedhar M.
- Original Message -
From: Samphan Raruenrom [EMAIL PROTECTED]
To: Sreedhar.M [EMAIL PROTECTED]
Sent: Monday, July 15, 2002 4:46 PM
Subject: Re: tis-620


 tis-620 is in the process of being registered as iso-8859-11, so you can
 use the proposal for information about tis-620; the chart is at the end.

 http://www.nectec.or.th/it-standards/iso8859-11/index.html

 Sreedhar.M wrote:
  Hi Samphan,
  Thank you for your kind information. But the URL you
  specified displays links that are all in Thai. I need
  the information in English. Please let me know whether any
  information is available in English regarding this.
  Thanks in advance.

 --
 Samphan Raruenrom
 Information Research and Development Division,
 National Electronics and Computer Technology Center, Thailand.
 http://www.nectec.or.th/home/index.html







Re: Proposal: Ligatures w/ ZWJ in OpenType

2002-07-15 Thread John H. Jenkins


On Monday, July 15, 2002, at 09:58 AM, Doug Ewell wrote:
 No, what bothers me is that the ZWJ/ZWNJ ligation scheme is starting to
 look just like the DOA (deprecated on arrival) Plane 14 language tags.
 In each case, Unicode has created a mechanism to solve a genuine (if
 limited) need, but then told us -- officially or unofficially -- that we
 should not use it, or that it is reserved for use with special
 protocols which are never defined or mentioned again.


I'm not sure I agree with you here.  The position of the UTC is not that 
ZWJ should never be used and we're sorry we added it, which is the case of 
the Plane 14 language tags.  It's that the ZWJ should not be the primary 
mechanism for providing ligature support in many cases.  That's as far as 
it goes.

 The UTC may have intended that ZWJ ligation be used only in rare and
 exceptional circumstances, but UAX #27, revised section 13.2 doesn't say
 that.

The latest word is the Unicode 3.2 document, not the Unicode 3.1 document.
It says:

Ligatures and Latin Typography (addition)

It is the task of the rendering system to select a ligature (where 
ligatures are possible) as part of the task of creating the most pleasing 
line layout. Fonts that provide more ligatures give the rendering system 
more options.

However, defining the locations where ligatures are possible cannot be 
done by the rendering system, because there are many languages in which 
this depends not on simple letter pair context but on the meaning of the 
word in question. 

ZWJ and ZWNJ are to be used for the latter task, marking the non-regular 
cases where ligatures are required or prohibited. This is different from 
selecting a degree of ligation for stylistic reasons. Such selection is 
best done with style markup. See Unicode Technical Report #20, “Unicode in 
XML and other Markup Languages” for more information.

  It says that ZWJ and ZWNJ *may be used* to request ligation or
 non-ligation, and that font vendors should add ZWJ to their ligature
 mapping tables as appropriate.  It does acknowledge that some fonts
 won't (or shouldn't) include glyphs for every possible ligature, and
 never claims that they must (or should).  It specifically does *not* say
 that ZWJ ligation is to be restricted to certain orthographies, or to
 cases where ligation changes the meaning of the text.


This is correct.  Nor is this changed in Unicode 3.2.  The goal is to make 
the ZWJ mechanism available to people who feel it is appropriate to meet 
their needs, but to try to inform them that in the majority of cases, a 
higher-level protocol would be better.

Adobe doesn't have to revise InDesign, for example, to insert ZWJ all over 
when a user selects text and turns optional ligatures on.  OTOH, the hope 
is that if ligatures are available InDesign will honor the ZWJ-marked ones,
even if ligation has been turned off.

John Hudson has recommended what seems a reasonable way to handle this in 
OT.  Apple will be releasing new versions of its font tools in the near 
future, and the documentation will include a recommendation for how this 
can be done with AAT.  We've been revising our own fonts as the 
opportunity presents itself to support ZWJ as well.  (The system and 
ATSUI-savvy applications require no revision.)

The push-back coming from the font community on the issue has to do mostly 
with a communications problem (they weren't made aware of it in as timely 
a fashion as would have been best) and with the concern that font developers 
and application/OS developers will be forced to add ligature support where 
they have felt it inappropriate in the past.

 ZWJ/ZWNJ for ligation control is part of Unicode.  It is not always the
 best solution, but it is *a* solution, and should be available to the
 user without restriction or discouragement.


It's discouraged when it's inappropriate.  It isn't deprecated.  There are 
numerous places where Unicode provides multiple ways of representing 
something.  In this instance, Unicode is trying to delineate where a 
particular mechanism is appropriate and where inappropriate.

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/





General question

2002-07-15 Thread Rick McGowan

Good morning Unicadets --

This one came in to the Unicode office. If anyone has any hints, please  
reply to the sender directly. Thanks,
Rick


 Date/Time:Mon Jul 15 05:48:43 EDT 2002
 Contact:  [EMAIL PROTECTED]
 Report Type:  General question

 where can I get the free tool to translate my c++ source code to Unicode 
 compliant for internationalization?






Re: Common input methods for IPA

2002-07-15 Thread Michael Everson

Doug,

Let's take this IPA keylayout discussion to the 
[EMAIL PROTECTED] list. I'll be posting a 
comparison chart of four different layouts there shortly.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com




Re: Common input methods for IPA

2002-07-15 Thread Michael Everson

At 08:24 -0700 2002-07-15, Doug Ewell wrote:

Also, I haven't made my IPA keyboard for UniPad available yet, for the
reasons I mentioned: it's based on the images at gy.com (which are
somewhat difficult to interpret and which can hardly be said to depict a
standard IPA keyboard), plus there are several characters missing that
I haven't been able to find in Unicode.

I did also find some characters in SIL's key layouts which are not 
encoded, though a couple of the missing ones are handled now by the 
new UPA additions.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com




Re: Proposal: Ligatures w/ ZWJ in OpenType

2002-07-15 Thread John Hudson

At 08:58 AM 15-07-02, Doug Ewell wrote:

No, what bothers me is that the ZWJ/ZWNJ ligation scheme is starting to
look just like the DOA (deprecated on arrival) Plane 14 language tags.
In each case, Unicode has created a mechanism to solve a genuine (if
limited) need, but then told us -- officially or unofficially -- that we
should not use it, or that it is reserved for use with special
protocols which are never defined or mentioned again.
...

ZWJ/ZWNJ for ligation control is part of Unicode.  It is not always the
best solution, but it is *a* solution, and should be available to the
user without restriction or discouragement.

I don't think I am trying to discourage people from using ZWJ/ZWNJ for 
ligation control, or to impose restrictions upon it, but I do have concerns 
about the practicalities of implementing such control in a way that 
provides users of ZWJ with the results they desire while not breaking 
existing ligature implementations. I really am trying to figure out a clear 
and consistent way to make ZWJ work. Of course, I can only try to propose 
part of the solution, because ZWJ has an impact not only on how fonts are 
made but on how layout engines handle the relationship of control 
characters and glyphs. The implementation note in TR27 stating that font 
developers should add glyph substitution lookups for ZWJ sequences to their 
fonts seems to me to display an incomplete understanding of the technology 
involved. The comments on 'Ligatures and Latin Typography' in TR28 -- naive 
comments, I think: the layout issues involved are in no way limited to 
Latin typography -- instead of clarifying the situation, retreat to 
a vaguer position. Perhaps the idea is that, by keeping things vague, the 
UTC permits freedom of implementation, but so far all I am seeing in 
response is confusion: confusion about what ZWJ signifies in text, and how 
it should be implemented in line layout. If Doug is worried that ZWJ will 
be 'deprecated on arrival', he might also worry that ZWJ will be so 
variously interpreted as to become useless as a reliable means of achieving 
any consistent result.

I have other, more general concerns, about the poor communication between 
the UTC and the people who make fonts. This is not the UTC's fault. Unlike 
other technologies that are related to and influenced by Unicode, e.g. web 
standards and technology, there is no parallel organisation governing the 
development of font software, no 'Font Technology Consortium'. This means 
that communication between UTC and font developers has been, at best, ad 
hoc. I am trying to do something to rectify this situation, since I believe 
it will benefit everyone if UTC can rely on more regular, consistent and 
informed involvement from the type industry, and font developers can 
receive and digest information from the UTC that has an impact on font 
technology in a timely fashion.

John Hudson

Tiro Typeworks  www.tiro.com
Vancouver, BC   [EMAIL PROTECTED]

Language must belong to the Other -- to my linguistic community
as a whole -- before it can belong to me, so that the self comes to its
unique articulation in a medium which is always at some level
indifferent to it.  - Terry Eagleton





Re: TrueType signature bits

2002-07-15 Thread Raymond Mercier



Thanks for the confirmation.
Do you know what role the unicode bits play in the use of the font - in MS 
Word, for example ?
As far as I can see, even if the bits are set carelessly, or not set at 
all, the font seems to work in Word.

Raymond







C++ unicode

2002-07-15 Thread Raymond Mercier

I am sure there is no tool, free or otherwise, for making C++ code
Unicode compliant.
If you are talking about Windows code, you could make a start by just
putting #define UNICODE before the Windows headers.
Then you will have a lot of editing to do.
I have just finished that sort of project, so as to be able to handle
arbitrary (non-ANSI) filenames, as in my Fontlist v.5:
http://ourworld.compuserve.com/homepages/RaymondM
Raymond Mercier



Re: TrueType signature bits

2002-07-15 Thread David Starner

At 10:11 PM 7/15/02 +0100, Raymond Mercier wrote:
Do you know what role the unicode bits play in the use of the font

Among other things, the new X font architecture will try to use them
to pick a font in a culturally acceptable typeface.





Re: TrueType signature bits

2002-07-15 Thread John Hudson

At 02:11 PM 15-07-02, Raymond Mercier wrote:

Do you know what role the unicode bits play in the use of the font - in MS 
Word, for example ?
As far as I can see, even if the bits are set carelessly, or not set at 
all, the font seems to work in Word.

At the moment, they are mainly of use when the font lacks 8-bit codepage 
support -- e.g. it is an Indic font, for which there are no registered 
codepages on the system -- and an application needs to determine whether 
this font is suitable to display a particular text. There are likely to be 
other fallback mechanisms behind this one, e.g. checking for the presence 
of particular characters in the font cmap table.
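
The order of checks described above (declared Unicode-range bits first, then a look at the cmap) might be sketched as follows. This is a hypothetical illustration, not how any particular application does it: the function name, the idea of passing the range bits as a plain integer, and representing the cmap as a set of code points are all assumptions. Bit 15 is used as the assumed position of the Devanagari range; the OpenType OS/2 table documentation is the place to confirm bit assignments.

```python
def font_can_display(text, unicode_range_bits, range_bit, cmap_codepoints):
    """Guess whether a font suits text, using signature bits then cmap.

    unicode_range_bits: integer holding the font's declared range bits
    range_bit: index of the bit for the script in question (assumed)
    cmap_codepoints: set of code points the font maps to glyphs (assumed)
    """
    # First trust the declared Unicode-range signature bit, if set.
    if unicode_range_bits & (1 << range_bit):
        return True
    # Fallback: the bit is unset or was set carelessly, so check that
    # every character of the text is present in the cmap.
    return all(ord(ch) in cmap_codepoints for ch in text)

DEVANAGARI_BIT = 15  # assumed bit position, see the lead-in above
text = "\u0915\u0916"  # DEVANAGARI LETTER KA, KHA

print(font_can_display(text, 1 << DEVANAGARI_BIT, DEVANAGARI_BIT, set()))
print(font_can_display(text, 0, DEVANAGARI_BIT, {0x0915, 0x0916}))
print(font_can_display(text, 0, DEVANAGARI_BIT, {0x0915}))
```

The second call shows why a font with unset bits can still appear to work: the cmap fallback finds the characters anyway.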

John Hudson

Tiro Typeworks  www.tiro.com
Vancouver, BC   [EMAIL PROTECTED]






Re: Is UniCode's Thai character representation is acceptable by TISI or not?

2002-07-15 Thread Samphan Raruenrom

Sreedhar M wrote:
 Thank you for your kind response. Please let me know whether
 Unicode's Thai character representation is acceptable to TISI. It is
 essential to our project.

Yes. TISI took part in defining the representation of Thai characters in
ISO 10646 (and so, indirectly, in Unicode). Unicode has a backward-compatibility
goal, so it carries the whole Thai block of TIS-620 over directly:
unicode = tis620 - 0xa0 + 0x0e00
This is ideal and eases the transition of code. We can modify our code
just a little to make it work with both tis-620 and Unicode (see
libinthai, a Thai word-break library, as an example).
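
The offset formula above can be sketched in a few lines. The function name is hypothetical, and the accepted byte ranges (0xA1-0xDA and 0xDF-0xFB) are an assumption about which TIS-620 positions are assigned; anything else above 0x7F is rejected rather than guessed at. The result is compared against Python's built-in tis_620 codec as a sanity check.

```python
def tis620_to_unicode(data):
    """Decode TIS-620 bytes using unicode = tis620 - 0xA0 + 0x0E00."""
    out = []
    for byte in data:
        if byte < 0x80:
            # The lower half is plain ASCII and maps to itself.
            out.append(chr(byte))
        elif 0xA1 <= byte <= 0xDA or 0xDF <= byte <= 0xFB:
            # Thai block: a fixed offset from the TIS-620 position.
            out.append(chr(byte - 0xA0 + 0x0E00))
        else:
            raise ValueError(f"unassigned TIS-620 byte 0x{byte:02X}")
    return "".join(out)

# A Thai greeting, written with escapes to keep the source ASCII-only.
greeting = "\u0e2a\u0e27\u0e31\u0e2a\u0e14\u0e35"
raw = greeting.encode("tis_620")

print(tis620_to_unicode(raw) == greeting)
print(tis620_to_unicode(raw) == raw.decode("tis_620"))
```

The reverse direction is the same arithmetic with the signs flipped, which is why code can be made to work with both encodings with little change.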

However, there are still some problems that go beyond the assignment of code
points, namely character properties. There are some mistakes in the Unicode
character properties for Thai characters, and you have to code around them.