Re: Who to make OTF

2002-03-08 Thread John Hudson

At 19:48 3/7/2002, K S Rohilla wrote:

Hi,
all,
I am font designer
Pl. suggest me Who to make Open Type Fonts

Lots of people are making or trying to make OT fonts. The user community 
for Microsoft's VOLT tool now numbers almost 2,500 people, many of them 
Indic and Arabic developers. I'm not sure what percentage of these are 
professional type designers and font developers. There seem to be a lot of 
enthusiastic but not very experienced amateurs keen to increase the number 
of fonts supporting their native scripts, but I'm afraid most of the 
products I have seen are not very good.

What are you looking for? Someone to help you make an OT font? Someone to 
make an OT font from your existing designs? If you are not already a member 
of the VOLT community, you should join and ask your question there. There 
are developers, both professional and amateur, working on Devanagari and 
Bengali fonts, and probably on other Indian scripts. See 
http://www.microsoft.com/typography/developers/volt/default.htm for more 
information.

John Hudson

Tiro Typeworks  www.tiro.com
Vancouver, BC   [EMAIL PROTECTED]

... es ist ein unwiederbringliches Bild der Vergangenheit,
das mit jeder Gegenwart zu verschwinden droht, die sich
nicht in ihm gemeint erkannte.

... every image of the past that is not recognized by the
present as one of its own concerns threatens to disappear
irretrievably.
   Walter Benjamin





RE: Devanagari variations

2002-03-08 Thread Marco Cimarosti

Peter Constable wrote:
 On 03/07/2002 02:16:10 PM James E. Agenbroad wrote:
 
 A similar but not the same situation is found in the fourth example in
 figure 9-3 of Unicode 3.0 (page 214) where an independent vowel has the
 reph (an abridged form of the consonant 'ra') above it.  Unicode wants
 this encoded as consonant + halant + independent vowel. I believe it is
 better considered as a consonant + vowel sign combination which happens
 to have an odd display and at least one Sanskrit textbook agrees.
 
 I may be wrong, but I believe that example has <ra, halant, ra,
 independent i>. The first ra is the one that transforms into the reph.

You are wrong, in fact, sorry. Although figure 9-3 does not show code point
values, both the glyphs and the abbreviated letter names make it clear that
the sequence is:

U+0930 (DEVANAGARI LETTER RA)
U+094D (DEVANAGARI SIGN VIRAMA)
U+090B (DEVANAGARI LETTER VOCALIC R)

James' idea is that the same graphemes could have been better represented
with sequence:

U+0930 (DEVANAGARI LETTER RA)
U+0943 (DEVANAGARI VOWEL SIGN VOCALIC R)

It is an interesting idea, because ra never occurs with matra r., so
there is no danger of confusion. But it is probably too late to change
it: doing so would break compatibility with ISCII and existing Unicode fonts.
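
The two candidate encodings can be compared directly. The following is a minimal sketch in Python (the helper name `describe` is made up for illustration); it only lists the code points named above and checks that normalization does not treat the two sequences as equivalent, so a renderer would have to unify them itself.

```python
import unicodedata

# Sequence as read from figure 9-3: RA + VIRAMA + LETTER VOCALIC R
standard = "\u0930\u094D\u090B"
# Alternative suggested in this thread: RA + VOWEL SIGN VOCALIC R
alternative = "\u0930\u0943"


def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


for label, sequence in (("standard", standard), ("alternative", alternative)):
    print(label)
    for line in describe(sequence):
        print("   ", line)

# No canonical equivalence links the two spellings, so NFC leaves them distinct.
same_after_nfc = (
    unicodedata.normalize("NFC", standard)
    == unicodedata.normalize("NFC", alternative)
)
print("equal after NFC:", same_after_nfc)
```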

_ Marco




RE: Keyboard Layouts for Office XP in Windows 98

2002-03-08 Thread Marco Cimarosti

Lateef Sagar wrote:
 How can I create such a keyboard layout that can be
 used with Office XP (in Windows 98).

http://www.tavultesoft.com/keyman/

It also works on Win 98.

_ Marco




Re: Devanagari variations

2002-03-08 Thread Michael Everson

At 15:36 -0600 07/03/2002, [EMAIL PROTECTED] wrote:

I may be wrong, but I believe that example has <ra, halant, ra,
independent i>. The first ra is the one that transforms into the reph.

You're wrong. RI in this case is a way of writing the vocalic r. 
Compare Kr.s.n.a and Krishna.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com




Re: Devanagari variations

2002-03-08 Thread Michael Everson

At 15:16 -0500 07/03/2002, James E. Agenbroad wrote:
On Wed, 6 Mar 2002 [EMAIL PROTECTED] wrote:

  On 03/06/2002 08:25:18 AM Michael Everson wrote:
  [snip]

  In
  Cham, independent vowels can take dependent vowel signs. In
  Devanagari, I guess that doesn't occur, but the Brahmic model
  shouldn't be understood to preclude this behaviour.
 [snip]
  - Peter

A similar but not the same situation is found in the fourth example in
figure 9-3 of Unicode 3.0 (page 214) where an independent vowel has the
reph (an abridged form of the consonant 'ra') above it.  Unicode wants
this encoded as consonant + halant + independent vowel. I believe it is
better considered as a consonant + vowel sign combination which happens to
have an odd display and at least one Sanskrit textbook agrees.

Is that the sample you showed me when I was a-photocopying at the 
Library of Congress in August, James? You're saying that RA + virama 
+ INDEPENDENT VOCALIC R and RA + VOWEL SIGN VOCALIC R should both 
produce the same glyph?
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com




Re: Devanagari variations

2002-03-08 Thread Michael Everson

Using Apple's WorldText, I can confirm that short I did not reorder 
correctly when preceded by 0294. But the 0294 glyph was in another 
font.

I wonder could we see some samples of this in actual Limbu text?
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com




RE: Devanagari variations

2002-03-08 Thread Michael Everson

At 11:26 +0100 2002-03-08, Marco Cimarosti wrote:

You are wrong, in fact, sorry. Although figure 9-3 does not show code point
values, both the glyphs and the abbreviated letter names make it clear that
the sequence is:

   U+0930 (DEVANAGARI LETTER RA)
   U+094D (DEVANAGARI SIGN VIRAMA)
   U+090B (DEVANAGARI LETTER VOCALIC R)

James' idea is that the same graphemes could have been better represented
with sequence:

   U+0930 (DEVANAGARI LETTER RA)
   U+0943 (DEVANAGARI VOWEL SIGN VOCALIC R)

It is an interesting idea, because ra never occurs with matra r., so
there is no danger of confusion. But it is probably too late to change
it: doing so would break compatibility with ISCII and existing Unicode fonts.

Well, in Apple's WorldText version 1.1 I just typed both of these. 
The first one displayed as RA VIRAMA (visible) VOCALIC R and the 
second displayed as REPHA VOCALIC R. So in at least one 
implementation the latter is supported.
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com




MS Command Prompt

2002-03-08 Thread Patrick Rourke

 From: Doug Ewell [EMAIL PROTECTED]
 Indie was doing the right thing by typing Alt+0248 to get the Latin-1
 character, instead of Alt+248 to get the MS-DOS character.  That isn't
 the problem.

 In Windows 95, 98, and NT 4, everything that happens in the command
 prompt goes through the MS-DOS code page -- 437, 850 or whatever.  Since
 Indie's code page is set to 437, and U+00F8 LATIN SMALL LETTER O WITH
 STROKE is not in code page 437, the internal conversion tables in NT 4
 converted 'ø' to 'o', a reasonable if imperfect fallback.  Note that
 Alt+0243 works just fine, because U+00F3 is in code page 437.  Also note
 that if Indie had been using 850 instead of 437, there would have been
 no problem, since 850 does include U+00F8.

 Windows 2000 is different.  You can set your command prompt code page to
 437 and type Alt+0248, and you will still get the 'ø' you want.  The
 Alt+0xxx logic has been decoupled from the active code page issue, which
 is nice.

 Martin is right, you can change the code page; but I don't know if that
 will help Indie.  What's kind of fun is that in Windows 2000, you can
 change your code page to 65001 and do all your command-prompt work in
 UTF-8.
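
The code page coverage described above can be checked with a short Python sketch. It relies on Python's own `cp437`, `cp850`, and `cp1252` codec tables, which are assumed here to match the Windows code pages at these positions; the helper `try_encode` is made up for illustration. Python's strict encoder reports a missing character rather than substituting 'o', so the NT 4 fallback itself is not reproduced.

```python
def try_encode(ch, code_page):
    """Return the encoded bytes, or None if the code page lacks the character."""
    try:
        return ch.encode(code_page)
    except UnicodeEncodeError:
        return None


O_WITH_STROKE = "\u00F8"  # LATIN SMALL LETTER O WITH STROKE
O_WITH_ACUTE = "\u00F3"   # LATIN SMALL LETTER O WITH ACUTE

for code_page in ("cp437", "cp850", "utf-8"):
    for ch in (O_WITH_STROKE, O_WITH_ACUTE):
        print(code_page, f"U+{ord(ch):04X}", try_encode(ch, code_page))

# Alt+0248 is interpreted in the Windows ANSI code page (assumed to be 1252),
# while Alt+248 is interpreted in the OEM code page (437 in this thread).
print(repr(bytes([248]).decode("cp1252")))
print(repr(bytes([248]).decode("cp437")))
```

With code page 65001 the same character is simply its UTF-8 byte sequence, which is why the question of coverage disappears there.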


In Windows XP, if I type the Alt+0248 in the command prompt with the font
set to raster fonts, I get an o.  If I type it in a command prompt with
the font set to Lucida Console, I get the ø.  However, it only works if I
change the font before I type the character.

So I am guessing that in XP, whatever code page you have selected, if the
default font for the command line doesn't have the character you want,
you're stuck with the closest approximation in that font.

Don't know if this will help any with NT.

Patrick Rourke
[EMAIL PROTECTED]







Re: Keyboard Layouts for Office XP in Windows 98

2002-03-08 Thread Yaap Raaf

At 07:37 +0100 2002.03.08, Lateef Sagar wrote:

MS Office XP installs many keyboard layouts (like
Arabic etc) in Windows 98. For Windows NT/2000/XP
there is a shareware software Keyboard Layout Manager
32 bit, but I haven't found out any software yet that
allows making a non-ASCII keyboard layout for Windows
98.
How can I create such a keyboard layout that can be
used with Office XP (in Windows 98).

Do you mean the Keyboard Layout Manager at
http://www.klm.freeservers.com/index.html  ?

<quote>
This program allows you to create and modify Microsoft keyboard 
layout files. It works with Windows 95, Windows 95-OSR/2, Windows 98 
and Windows ME operating systems. Also, it works with Windows NT 4.0, 
Windows XP, and Windows 2000 operating systems. 
</quote>

How can I create such a keyboard layout that can be used 
with Office XP (in Windows 98).

Office XP in Windows 98  ??





-- 






RE: Concerning mathematics

2002-03-08 Thread Murray Sargent

Stefan Persson [mailto:[EMAIL PROTECTED]] asks how in the formula
 
m<sub>fågel</sub> = 1 kg

would the italic å be encoded? 
 
 
Mathematics has a set of standard letters for mathematical symbols. They can include
diacritics, which can be expressed using the appropriate combining marks. In your
formula

m<sub>fågel</sub> = 1 kg

the m is a mathematical symbol, while the fågel is a natural language subscript.
Italic shouldn't be used for such a subscript, since italic is used for symbols in
mathematical notation (and consequently mathematical journals will change to upright
fågel for this case). Otherwise one might construe fågel to be a subscript consisting
of the product of the five variables. Such natural language text is conveniently handled
with characters from the BMP, although you need some kind of markup to turn it into a
subscript. If you insist on using italic for this kind of text and for characters like
the italic ø that aren't used in standard mathematical notation, you can fall back to
markup. Since such usage is extremely rare and not recommended for mathematical text,
it wasn't perceived as important to represent unambiguously in plain text.
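
As a rough illustration of the distinction drawn above, the following Python sketch builds the variable from the mathematical italic block and the subscript from ordinary BMP letters, and shows how a diacritic on a mathematical letter would be expressed with a combining mark. The variable names are made up for illustration, and the subscripting itself is left to markup, as the message says.

```python
import unicodedata

math_m = "\U0001D45A"      # the variable m as a mathematical italic letter
subscript_text = "fågel"   # natural-language subscript: ordinary BMP letters

print(unicodedata.name(math_m))
print([f"U+{ord(ch):04X}" for ch in subscript_text])

# å is a base letter plus COMBINING RING ABOVE under canonical decomposition.
decomposed = unicodedata.normalize("NFD", "å")
print([unicodedata.name(ch) for ch in decomposed])

# A mathematical italic a with a ring has no precomposed form: it stays a
# two-code-point sequence (base letter + combining mark) under NFC.
math_a_ring = "\U0001D44E\u030A"
print(len(unicodedata.normalize("NFC", math_a_ring)))
```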
 
Murray
 




Re: MS Command Prompt

2002-03-08 Thread Doug Ewell

Patrick Rourke [EMAIL PROTECTED] wrote:

 In Windows XP, if I type the Alt+0248 in the command prompt with the
 font set to raster fonts, I get an o.  If I type it in a command prompt
 with the font set to Lucida Console, I get the ø.  However, it only
 works if I change the font before I type the character.

 So I am guessing that in XP, whatever code page you have selected, if
 the default font for the command line doesn't have the character you
 want, you're stuck with the closest approximation in that font.

I hadn't thought of that.  In Windows 2000 I am using Lucida Console,
while my colleague's NT 4 computer on which I conducted the test was
using the Terminal bitmap font.  I didn't know the NT 4 system was
doing substitutions based on what was available in the font, but it
seems that's what's happening.  Thanks for the info.

-Doug Ewell
 Fullerton, California






RE: Keyboard Layouts for Office XP in Windows 98

2002-03-08 Thread Peter_Constable

On 03/08/2002 04:39:49 AM Marco Cimarosti wrote:

Lateef Sagar wrote:
 How can I create such a keyboard layout that can be
 used with Office XP (in Windows 98).

http://www.tavultesoft.com/keyman/

It also works on Win 98.

There are some issues to keep in mind in relation to Win9x/Me. I won't 
explain all the gory details (I probably have sometime earlier on this 
list), but in a nutshell, for most of the life of Win9x/Me, the characters 
that could be entered from a keyboard were limited to only those in some 
Windows codepage, and a given layout couldn't mix characters from 
different codepages. Late in 2000, MS added a new mechanism that involved 
using the system message WM_UNICHAR rather than WM_CHAR. This invention 
was quite slick since it could be used without breaking existing software 
and without requiring any patches to Windows itself. With old apps, it 
would just get ignored (not perfect, but not bad). All it would take to 
use it is (a) an input method that will generate it, and (b) apps that 
will recognise it.

Tavultesoft Keyman will attempt to communicate with an app using 
WM_UNICHAR. If the app doesn't recognise that message, then Keyman will 
gracefully resort to plan B -- if the developer of the particular input 
method included rules for ANSI mode as well as Unicode, then Keyman will 
fall back to ANSI mode; otherwise, it deactivates that input method (the 
IM can be reactivated when focus is switched to another app).
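
The fallback order described above can be modelled in a few lines. This is a pure-Python simulation, not Windows code: the classes and `choose_input_mode` are hypothetical, and only the message numbers come from the Windows SDK headers. Probing with `UNICODE_NOCHAR` is how the Windows documentation describes testing for WM_UNICHAR support; whether Keyman probes in exactly this way is an assumption.

```python
# Message numbers as defined in the Windows SDK headers.
WM_CHAR = 0x0102
WM_UNICHAR = 0x0109
UNICODE_NOCHAR = 0xFFFF  # probe value: "do you understand WM_UNICHAR?"


class LegacyApp:
    """Hypothetical app that only knows WM_CHAR."""

    def handle(self, message, wparam):
        # Unknown messages are ignored, which reads as "not handled".
        return message == WM_CHAR


class UnicodeAwareApp:
    """Hypothetical app that also accepts WM_UNICHAR."""

    def handle(self, message, wparam):
        return message in (WM_CHAR, WM_UNICHAR)


def choose_input_mode(app, has_ansi_rules):
    """Pick an input mode in the order the message above describes."""
    if app.handle(WM_UNICHAR, UNICODE_NOCHAR):
        return "unicode"
    if has_ansi_rules:
        return "ansi"          # plan B: the layout ships ANSI rules too
    return "deactivated"       # no usable mode for this app


for app in (LegacyApp(), UnicodeAwareApp()):
    for has_ansi_rules in (True, False):
        mode = choose_input_mode(app, has_ansi_rules)
        print(type(app).__name__, has_ansi_rules, mode)
```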

There are not many apps at this point that support WM_UNICHAR, but Word 
2002 is one of them. The other apps in the Office suite do not, however, 
with the minor exception that the RichEdit control does support it, so it 
is supported wherever those other apps use the RichEdit control (e.g. the 
text boxes in search/replace dialogs). (I've been told that Keyman can be 
used to give full Unicode input support on Win 98 with Internet Messenger; 
I'm guessing it must be using RichEdit.)

If you are using Word 2000, you can obtain an add-in (WordLink) from 
Tavultesoft that will add support for WM_UNICHAR.

One last point: Keyman 5 did not provide support for supplementary plane 
characters. This will be added in Keyman 6, which will be available this 
spring.

So, if you are on Win9x/Me and want to use Unicode characters that are 
*not* supported by a Windows codepage, it can be done with certain 
limitations. Here's a summary:


                          Unicode characters that can be input using
                          Keyman 5              Keyman 6 (when released)

Word 2000                 limited by            limited by
                          Windows codepages     Windows codepages

Word 2000 w/ WordLink     all of BMP            all (planes 0 - 16)

other Office 2000 apps    limited by            limited by
                          Windows codepages     Windows codepages

Word 2002                 all of BMP            all (planes 0 - 16)

other Office XP apps      limited by            limited by
                          Windows codepages     Windows codepages
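
To make the "all of BMP" versus "all (planes 0 - 16)" distinction concrete, here is a small Python sketch (the function names are made up for illustration). It computes the plane of a code point and shows that characters outside plane 0 take two UTF-16 code units, a surrogate pair, where BMP characters take one.

```python
def plane(code_point):
    """Return the Unicode plane (0 to 16) that a code point belongs to."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    return code_point >> 16


def utf16_units(ch):
    """Return the UTF-16 code units for a single character."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


for ch in ("\u0915", "\U00010400"):
    units = " ".join(f"{unit:04X}" for unit in utf16_units(ch))
    print(f"U+{ord(ch):04X}  plane {plane(ord(ch))}  UTF-16: {units}")
```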


I'm hoping that when Office dotNet appears that support for WM_UNICHAR 
will have been added to other apps in the Office suite. (Chris Pratley, 
can you comment on that?)



- Peter


---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: [EMAIL PROTECTED]





Re: Devanagari variations

2002-03-08 Thread Peter_Constable

On 03/08/2002 06:54:54 AM Michael Everson wrote:

Using Apple's WorldText, I can confirm that short I did not reorder
correctly when preceded by 0294. But the 0294 glyph was in another
font.

I wonder could we see some samples of this in actual Limbu text?

It's on its way.



- Peter







Re: Devanagari variations

2002-03-08 Thread Peter_Constable

On 03/08/2002 05:09:46 AM Michael Everson wrote:

At 15:36 -0600 07/03/2002, [EMAIL PROTECTED] wrote:

I may be wrong, but I believe that example has  ra, halant, ra,
independent i . The first ra is the one that  transforms into the reph.

You're wrong. RI in this case is a way of writing the vocalic r.
Compare Kr.s.n.a and Krishna.

I guess that's what I get for commenting on things beyond my ken. Mea culpa.



- Peter







Re: Concerning mathematics

2002-03-08 Thread Peter_Constable

On 03/08/2002 04:09:14 AM Stefan Persson wrote:

The Standard contains several special mathematics characters, such as,
for example, A (italic A). But I thought of some letters that might not
be fully supported: Let's say that you find a formula like this in some
Swedish book:

m<sub>fågel</sub> = 1 kg

Surely subscripted qualifiers of this sort -- which, being from a spoken 
language rather than math, could contain any string using any script -- are 
something to be handled by MathML and not character encoding.



- Peter







Re: Devanagari variations

2002-03-08 Thread Peter_Constable

Jim Agenbroad responded (off list):

Not quite. On page 214 of 3.0 there is one RA vowel, a halant and a RI
vowel: RA(d) + RI(n) --> RI(n) + RA(sup)   (parens in lieu of subscript)

I didn't realise that RI meant the vocalic R. I mistook it to mean 
something else. I find it a weakness of that section that such notations 
are not defined and prominently displayed in an easy-to-find location.

Thanks for setting me straight. I should have known you knew what you were 
talking about.


Peter




Re: Devanagari variations

2002-03-08 Thread John Cowan

[EMAIL PROTECTED] scripsit:

 I didn't realise that RI meant the vocalic R.  

It reflects the modern Hindi pronunciation of Skt /r=/.


-- 
John Cowan [EMAIL PROTECTED] http://www.reutershealth.com
I amar prestar aen, han mathon ne nen,    http://www.ccil.org/~cowan
han mathon ne chae, a han noston ne 'wilith.  --Galadriel, _LOTR:FOTR_




RE: Devanagari enthousiasm!

2002-03-08 Thread Rick Cameron

It appears that hindi.exe installs Uniscribe - which, AFAIK, is not
permitted by Microsoft - so much for honouring license agreements!

That's another reason why they'd package it as an EXE.

- rick cameron

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] 
Sent: Wednesday, 6 March 2002 12:14
To: Yaap Raaf
Cc: [EMAIL PROTECTED]
Subject: Re: Devanagari enthousiasm!



On 06-03-2002 04:29:20 PM Yaap Raaf wrote:

At 14:02 +0100 2002.03.06, [EMAIL PROTECTED] wrote:

I am on a Mac and can't open it,

Well, this is going to be a problem for non-Windows clients, I admit.

it's a
244K .exe  Why an .exe?

I don't know if this is what the BBC was trying to do, but using an
executable installer package is at least one way to make sure people see the
license agreement...

Bob






Support for Japanese characters

2002-03-08 Thread Eric Ray



Need help please.

Problem:

1. Current library is built for unix and supports ASCII characters only.

2. This library must now accept wide characters from a Japanese client.


Facts:
--
1. The library does not really evaluate the Japanese characters to make
logical decisions. We believe we can base64 encode the character array to
avoid any "bad things happening in the code" (such as hitting a null value
or other values that could potentially cause problems).

2. Cannot rewrite library in time allowed and don't really need to based
on Fact item #1. Plus, pressure to get product to market is greater than
internationalizing the library.


What I need help with:
--
1. How do I set up an ASCII based unix machine, test application and test
environment to send Japanese characters to the library in question?

2. Do I need to create hex input or binary input to represent Japanese
characters? Since I'm using a standard keyboard, how do we get Japanese
characters into the application?

3. What am I not considering here? What gotchas will I come across by not
making my library i18nized?


Unfortunately, I've never done any i18n or l10n work before so I'm really
having trouble figuring out where and how to get started. Any advice is
appreciated.


Thanks.

Eric Ray


Re: Devanagari variations

2002-03-08 Thread Michael Everson

At 10:29 -0600 2002-03-08, [EMAIL PROTECTED] wrote:
Jim Agenbroad responded (off list):

 Not quite. On page 214 of 3.0 there is one RA vowel, a halant and a RI
  vowel: RA(d) + RI(n) --> RI(n) + RA(sup)   (parens in lieu of subscript)

I didn't realise that RI meant the vocalic R. I mistook it to mean
something else. I find it a weakness of that section that such notations
are not defined and prominently displayed in an easy-to-find location.

Actually, I would like to see that written R with dot below. We 
should use decent transliteration in those notations; why not?
-- 
Michael Everson *** Everson Typography *** http://www.evertype.com




RE: Keyboard Layouts for Office XP in Windows 98

2002-03-08 Thread Chris Pratley

I should point out that Word2002 does not actually support WM_UNICHAR
(actually no OfficeXP app does). Only RichEdit 4.0 (riched20.dll) does.
RichEdit is used in many places in the system and in Office and various
applets such as WordPad, and likely Messenger, so that can be handy but
it is not universal.

However, the recommended method for communicating in Unicode to apps
including Office is to
a) use an NT-based OS such as NT4/Win2000/WindowsXP. Everything just
works.
b) or use the Text Services Framework, which is shipped in WindowsXP and
also in OfficeXP. This is what, I believe, Keyman actually uses now to
get Unicode in Word2002 on Win98/Me - or the specific Word (object model
based) method Peter mentions in his message.

Keep in mind that most OfficeXP installations are now running on either
Win2k or WinXP, and this trend is accelerating. The large majority of
customers upgrade their OS or their entire machine at the time they
acquire major new software.

By the time we ship the next release of Office, the % of people who a)
want to get a new version of Office and who b) insist on remaining with
their old Win9x/ME OS will be very small indeed (not zero, I
understand). Generally speaking, the Office team tries to make sure you
can do everything on older OSes that we offer on the newer ones, but
there is a limit to how much back-porting and investment in workarounds
for older OS limitations we will make. We'd rather invest in more
powerful features for the newer OSes that most people are using. So it
is unlikely we will be improving our multilingual support on Win9x/Me -
instead we'll extend it even further on the newer OSes.

Chris


Sent with OfficeXP on WindowsXP



RE: Keyboard Layouts for Office XP in Windows 98

2002-03-08 Thread Peter_Constable

On 03/08/2002 01:11:37 PM Chris Pratley wrote:

I should point out that Word2002 does not actually support WM_UNICHAR
(actually no OfficeXP app does). 

My mistake (how could I forget -- I was disappointed when it didn't quite 
make it). Word 2002 still needs WordLink, but Publisher 2002 does support 
WM_UNICHAR.


However, the recommended method for communicating in Unicode to apps
including Office is to
a) use an NT-based OS such as NT4/Win2000/WindowsXP. Everything just
works.

I quite agree. There are many users who will be on Win98 for a while 
though (at least, many that I need to support).


b) or use the Text Services Framework, which is shipped in WindowsXP and
also in OfficeXp. This is what, I believe Keyman actually uses now to
get Unicode in Word2002 on Win98/Me - or the specific Word (object model
based) method Peter mentions below.

Not yet. It will in Keyman 6. 

I'll revise my summary in relation to MS apps and Win9x/Me:

                          Unicode characters that can be input using
                          Keyman 5              Keyman 6 (when released)

Word 2000                 limited by            limited by
                          Windows codepages     Windows codepages

Word 2000 w/ WordLink     all of BMP            all (planes 0 - 16)

other Office 2000 apps    limited by            limited by
                          Windows codepages     Windows codepages

Word 2002                 limited by            all (planes 0 - 16)
                          Windows codepages

Word 2002 w/ WordLink     all of BMP            all (planes 0 - 16)

Publisher 2002            all of BMP            all (planes 0 - 16)

other Office XP apps      limited by            limited by
                          Windows codepages     Windows codepages



- Peter







Re: Devanagari variations

2002-03-08 Thread James E. Agenbroad

On Fri, 8 Mar 2002, Michael Everson wrote:

 At 15:16 -0500 07/03/2002, James E. Agenbroad wrote:
 On Wed, 6 Mar 2002 [EMAIL PROTECTED] wrote:
 
   On 03/06/2002 08:25:18 AM Michael Everson wrote:
   [snip]
 
   In
   Cham, independent vowels can take dependent vowel signs. In
   Devanagari, I guess that doesn't occur, but the Brahmic model
   shouldn't be understood to preclude this behaviour.
  [snip]
   - Peter
 
 A similar but not the same situation is found in the fourth example in
 figure 9-3 of Unicode 3.0 (page 214) where an independent vowel has the
 reph (an abridged form of the consonant 'ra') above it.  Unicode wants
 this encoded as consonant + halant + independent vowel. I believe it is
 better considered as a consonant + vowel sign combination which happens to
 have an odd display and at least one Sanskrit textbook agrees.
 
 Is that the sample you showed me when I was a-photocopying at the 
 Library of Congress in August, James? You're saying that RA + virama 
 + INDEPENDENT VOCALIC R and RA + VOWEL SIGN VOCALIC R should both 
 produce the same glyph?
 -- 
 Michael Everson *** Everson Typography *** http://www.evertype.com
 
 
   Friday, March 8, 2002
Michael,
 Yes.  
 [Call me Jim]
 Regards,
  Jim Agenbroad ( [EMAIL PROTECTED] )
 It is not true that people stop pursuing their dreams because they
grow old, they grow old because they stop pursuing their dreams. Adapted
from a letter by Gabriel Garcia Marquez.
 The above are purely personal opinions, not necessarily the official
views of any government or any agency of any.
 Addresses: Office: Phone: 202 707-9612; Fax: 202 707-0955; US
mail: I.T.S. Sys.Dev.Gp.4, Library of Congress, 101 Independence Ave. SE, 
Washington, D.C. 20540-9334 U.S.A.
Home: Phone: 301 946-7326; US mail: Box 291, Garrett Park, MD 20896.  





Re: Support for Japanese characters

2002-03-08 Thread Barry Caplan

At 12:21 PM 3/8/2002 -0600, Eric Ray wrote:
Need help
please. 

Problem: 

1. Current library built for unix and supports ASCII
characters only. 

2. This library must now accept wide characters from
Japanese client.

You need to double-byte enable the library for all but the most trivial
uses. Doing so is not trivial.


Facts:
--
1. The library does not really evaluate the Japanese
characters to make logical decisions. 

If the data just passes through, that might be relatively
trivial.

We believe we can base64
encode the character array to avoid any "bad things happening in the
code" (such as hitting a null value or other values that could
potentially cause problems).

Is the (non-Japanese) data already base64 encoded? If so, why? Why
create trouble handling that just to avoid checking for null values?
Anyway, if you really aren't going to process the Japanese characters in
this library except to pass them thru, then you need to take the Japanese
text, base64 encode it, and then pass it to the library the usual way.
Then retrieve it the usual way, base64 decode it, and voila!
Of course this may just move your questions to other parts of your
program, but you haven't asked about those places. Without knowing what
the application is or what the configuration is, except "unix",
it is hard to say more.
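
The pass-through approach can be sketched as follows. This is a minimal Python illustration, not the library in question: `ascii_only_library` is a hypothetical stand-in that rejects NUL bytes and anything outside ASCII, and the helper names `wrap` and `unwrap` are made up. The point is only that a base64 payload survives such a channel whichever Japanese encoding the caller uses, provided both ends agree on it.

```python
import base64


def wrap(text, encoding="utf-8"):
    """Encode text to bytes, then to an ASCII-only base64 payload."""
    return base64.b64encode(text.encode(encoding))


def unwrap(payload, encoding="utf-8"):
    """Reverse of wrap(): base64 decode, then decode the bytes to text."""
    return base64.b64decode(payload).decode(encoding)


def ascii_only_library(data):
    """Hypothetical stand-in for the library: accepts ASCII without NULs."""
    if any(byte == 0 or byte > 0x7F for byte in data):
        raise ValueError("library saw a NUL or non-ASCII byte")
    return data  # the data just passes through


text = "日本語のテキスト"
results = {}
for encoding in ("utf-8", "shift_jis", "euc_jp"):
    payload = ascii_only_library(wrap(text, encoding))
    results[encoding] = unwrap(payload, encoding)
    print(encoding, payload, results[encoding] == text)
```

Passing the raw encoded bytes without the base64 step would trip the same stand-in, which is the situation the original question is trying to avoid.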

2. Cannot rewrite library in time allowed and don't
really need to based on Fact item #1. Plus, pressure to get product
to market is greater than internationalizing the
library.

This is probably a guaranteed method to fail in Japan. Japanese users and
your Japanese partners, if you have them, have had many years of
experience with bad software from the US that claims to work. They will
know how to break it quickly. Then you will learn a hard lesson
about doing business with Japanese customers while not taking heed of the
well-known requirement for quality.



What I need help with:
--
1. How do I set up an ASCII based unix machine, test
application and test environment to send Japanese characters to the
library in question.

I see from your web site that the application is likely some sort
of encryption device, possibly for email. Having run the Japanese
software group at an email company in the past, I can tell you Japanese
email is fraught with its own perils under any circumstances.
Without knowing what the actual channel is that you want to pass the text
thru, it is hard to say how you will want to test it.
You also have not described the time schedule and why you consider it
tight. Is it safe to assume that your plan to counteract any lack of
experience and time schedule is to spend money to hire someone who has
both?

2. Do I need to create hex input or binary input to
represent Japanese characters. Since I'm using a standard keyboard
how do we get Japanese characters into the
application?

Use the Japanese Input Method Editor supplied with or for the
operating system. But that does not guarantee that the data will actually
get to the application properly if the application has not been coded to
handle it. This is part of internationalizing your code, and now you see
why cutting corners during the initial development is coming back to
haunt you.

3. What am I not considering here? What gotchas
will I come across by not making my library
i18nized?

The gotchas are going to fall into the categories of "Won't
work" or "Data passes thru OK, but the rest of the application
doesn't know how to handle it". Off the top of my head, I would watch
out for endianness when you base64 encode Japanese multibyte text too.
Probably OK, but worth taking a close look at.
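
The endianness concern applies when the text is held as 16-bit units (UTF-16) rather than in a byte-oriented encoding such as Shift_JIS, EUC-JP, or UTF-8. A small Python sketch, with made-up variable names, shows the same text yielding different base64 payloads depending on byte order, and a receiver that assumes the wrong order getting different text back without any error being raised.

```python
import base64

text = "日本"

little_endian = base64.b64encode(text.encode("utf-16-le"))
big_endian = base64.b64encode(text.encode("utf-16-be"))
print(little_endian != big_endian)

# Decoding with the wrong byte order does not fail here; it silently
# produces other characters.
wrong = base64.b64decode(little_endian).decode("utf-16-be")
print(wrong == text)

# Including a byte order mark lets the receiver detect the order.
with_bom = base64.b64encode(text.encode("utf-16"))
print(base64.b64decode(with_bom).decode("utf-16") == text)

# Byte-oriented encodings have no byte order to get wrong.
print(base64.b64decode(base64.b64encode(text.encode("utf-8"))).decode("utf-8") == text)
```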


Unfortunately, I've never done any i18n or l10n work before
so I'm really having trouble figuring out where and how to get
started. Any advice is appreciated.

There is no magic bullet here in general. If Zixit values the opportunity
in Japan, I would suggest you be open to the offers you are sure to get
from experienced folks to assist you. If you don't get any, contact me
off-list and I will put you in touch with some.

Barry Caplan
Publisher,
www.i18n.com



Re: Devanagari variations

2002-03-08 Thread James E. Agenbroad

On Fri, 8 Mar 2002 [EMAIL PROTECTED] wrote:

 Jim Agenbroad responded (off list):
 
 Not quite. On page 214 of 3.0 there is one RA vowel, a halant and a RI
 vowel: RA(d) + RI(n) --> RI(n) + RA(sup)   (parens in lieu of subscript)
 
 I didn't realise that RI meant the vocalic R. I mistook it to mean 
 something else. I find it a weakness of that section that such notations 
 are not defined and prominently displayed in an easy-to-find location.
 
 Thanks for setting me straight. I should have known you knew what you were 
 talking about.
 
 
 Peter
 
 
   Friday, March 8, 2002
Peter,
 I agree there is a weakness there.  Maybe more than one. 
 I have mailed you (Peter) the Deshpande and Monier Williams examples
I cited.  
 Have a nice weekend all!
 Regards,
  Jim Agenbroad ( [EMAIL PROTECTED] )