Support for Japanese characters

2002-03-08 Thread Eric Ray



Need help, please.

Problem: 


1. Current library is built for Unix and supports ASCII characters only.

2. This library must now accept wide characters from a Japanese client.



Facts:
--
1. The library does not really evaluate the Japanese characters to make
logical decisions. We believe base64-encoding the character array avoids any
"bad things happening in the code" (such as hitting a null value or other
values that could potentially cause problems).

2. We cannot rewrite the library in the time allowed, and don't really need
to, based on Fact item #1. Plus, pressure to get the product to market is
greater than the pressure to internationalize the library.


What I need help with:
--
1. How do I set up an ASCII-based Unix machine, test application, and test
environment to send Japanese characters to the library in question?

2. Do I need to create hex input or binary input to represent Japanese
characters? Since I'm using a standard keyboard, how do we get Japanese
characters into the application?

3. What am I not considering here? What gotchas will I come across by not
making my library i18nized?


Unfortunately, I've never done any i18n or l10n work before, so I'm really
having trouble figuring out where and how to get started. Any advice is
appreciated.


Thanks.

Eric Ray


Re: Support for Japanese characters

2002-03-08 Thread Barry Caplan

At 12:21 PM 3/8/2002 -0600, Eric Ray wrote:
> Need help please.
>
> Problem:
>
> 1. Current library built for unix and supports ASCII
> characters only.
>
> 2. This library must now accept wide characters from
> Japanese client.

You need to double-byte enable the library for all but the most trivial
uses. Doing so is not trivial.


> Facts:
> --
> 1. The library does not really evaluate the Japanese
> characters to make logical decisions.

If the data just passes through, that might be relatively
trivial.

> We believe base64
> encode the character array to avoid any "bad things happening in the
> code" (such as hitting a null value or other values that could
> potential cause problems).

Is the (non-Japanese) data already base64 encoded? If so, why? Why
create trouble handling that just to avoid checking for null values?

Anyway, if you really aren't going to process the Japanese characters in
this library except to pass them through, then you need to take the Japanese
text, base64 encode it, and then pass it to the library the usual way.
Then retrieve it the usual way, base64 decode it, and voila!

Of course this may just move your questions to other parts of your
program, but you haven't asked about those places. Without knowing what
the application is, or what the configuration is beyond "unix",
it is hard to say more.
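
As a rough illustration of that pass-through idea, here is a minimal Python
sketch. Everything named in it is hypothetical: legacy_ascii_passthrough only
imitates an ASCII-only library, wrap_for_library and unwrap_from_library are
made-up helper names, and Shift_JIS is an assumed client encoding. The real
library and channel may behave differently.

```python
import base64


def legacy_ascii_passthrough(data: bytes) -> bytes:
    """Hypothetical stand-in for the ASCII-only library.

    It rejects NUL bytes and anything outside 7-bit ASCII, then hands
    the data back unchanged.
    """
    if any(b == 0 or b > 0x7F for b in data):
        raise ValueError("NUL or non-ASCII byte reached the library")
    return data


def wrap_for_library(text: str, encoding: str = "shift_jis") -> bytes:
    """Encode the text to bytes, then base64 it so only ASCII remains."""
    return base64.b64encode(text.encode(encoding))


def unwrap_from_library(wrapped: bytes, encoding: str = "shift_jis") -> str:
    """Reverse the wrapping: base64 decode, then decode the byte encoding."""
    return base64.b64decode(wrapped).decode(encoding)


original = "\u65e5\u672c\u8a9e"  # Japanese text written as Unicode escapes
wrapped = wrap_for_library(original)
returned = legacy_ascii_passthrough(wrapped)
assert unwrap_from_library(returned) == original
```

Both ends have to agree on the encoding name. Base64 preserves the bytes
exactly but carries no record of which character encoding produced them.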

> 2. Cannot rewrite library in time allowed and don't
> really need to based on Fact item #1. Plus, pressure to get product
> to market is greater than internationalizing the
> library.

This is probably a guaranteed way to fail in Japan. Japanese users, and
your Japanese partners if you have them, have had many years of
experience with bad software from the US that claims to work. They will
know how to break it quickly. Then you will learn a hard lesson
about doing business in Japan without heeding the well-known
requirement for quality.



> What I need help with:
> --
> 1. How do I set up an ASCII based unix machine, test
> application and test environment to send Japanese characters to the
> library in question.

I see from your web site that the application is likely some sort
of encryption device, possibly for email. Having run the Japanese
software group at an email company in the past, I can tell you Japanese
email is fraught with its own perils under any circumstances.
Without knowing what the actual channel is that you want to pass the text
through, it is hard to say how you will want to test it.

You also have not described the time schedule and why you consider it
tight. Is it safe to assume that your plan to counteract the lack of
experience and the tight schedule is to spend money to hire someone who has
both experience and time?

> 2. Do I need to create hex input or binary input to
> represent Japanese characters. Since I'm using a standard keyboard
> how do we get Japanese characters into the
> application?

Use the Japanese Input Method Editor supplied with or for the
operating system. But that does not guarantee that the data will actually
get to the application properly if the application has not been coded to
handle it. This is part of internationalizing your code, and now you see
why cutting corners during the initial development is coming back to
haunt you.
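
For automated test input, as opposed to interactive typing with an IME, one
option is to generate the bytes directly. The sketch below is only an
illustration under assumptions: the sample string, the helper names
make_fixture and hex_dump, and the list of candidate encodings are all made
up for the example, and which encoding the Japanese client really sends is
something you would have to find out.

```python
# Sample text written with Unicode escapes so this file stays pure ASCII.
SAMPLE = "\u30c6\u30b9\u30c8"  # katakana for "test"

# Encodings commonly seen for Japanese text. Which one the client
# actually uses is an assumption to verify.
ENCODINGS = ("shift_jis", "euc_jp", "iso2022_jp", "utf-8")


def make_fixture(text: str, encoding: str) -> bytes:
    """Return the raw bytes a client using `encoding` would send."""
    return text.encode(encoding)


def hex_dump(data: bytes) -> str:
    """Render bytes as space-separated hex pairs for logs or test scripts."""
    return " ".join(f"{b:02x}" for b in data)


if __name__ == "__main__":
    for enc in ENCODINGS:
        print(f"{enc:12s} {hex_dump(make_fixture(SAMPLE, enc))}")
```

The same three characters come out as different byte sequences in each
encoding, so a fixture is only meaningful together with its encoding name.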

> 3. What am I not considering here? What gotchas
> will I come across by not making my library
> i18nized?

The gotchas are going to fall into two categories: "Won't work" or
"Data passes through OK, but the rest of the application doesn't know
how to handle it". OTTOMH, I would watch out for endianness when you
base64 encode Japanese multibyte text too. Probably OK, but worth
taking a close look at.
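
To make the endianness point concrete: base64 works on bytes and has no
byte order of its own, and byte-oriented encodings such as Shift_JIS or
EUC-JP have a fixed byte order. The concern applies if the "wide characters"
arrive as 16-bit units, for example UTF-16. Whether that is the case here is
an assumption, and the helper name b64_of is made up for this sketch.

```python
import base64

TEXT = "\u65e5\u672c"  # two kanji, written as Unicode escapes


def b64_of(text: str, encoding: str) -> str:
    """Base64 of the text's bytes in the given encoding, as an ASCII string."""
    return base64.b64encode(text.encode(encoding)).decode("ascii")


le = b64_of(TEXT, "utf-16-le")
be = b64_of(TEXT, "utf-16-be")

# Same characters, different byte order, different base64 text.
assert le != be

# Decoding with the wrong byte order raises no error here. It quietly
# produces different characters.
wrong = base64.b64decode(le).decode("utf-16-be")
assert wrong != TEXT

# 16-bit encodings also put NUL bytes into plain ASCII letters, which is
# the kind of value an ASCII-only C library may treat as end of string.
assert "A".encode("utf-16-le") == b"A\x00"
```

If the data is 16-bit, sender and receiver need to agree on the byte order
before encoding, or the text needs to carry a byte order mark.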


> Unfortunately, I've never done any i18n or l10n work before
> so I'm really having trouble figuring out where and how to get
> started. Any advice is appreciated.

There is no magic bullet here in general. If Zixit values the opportunity
in Japan, I would suggest you be open to the offers you are sure to get
from experienced folks to assist you. If you don't get any, contact me
off-list and I will put you in touch with some.

Barry Caplan
Publisher,
www.i18n.com