Hi Danny,
Implementing Unicode is a good thing for creating multilingual applications
and for supporting code that is distributed worldwide (or at least to a
number of locales). Based on your questions below, you should probably start
with the Unicode FAQ (on the Unicode website) and with the standard itself
(The Unicode Standard, Version 3.0). You might also want to look at Ken Lunde's
excellent book _CJKV Information Processing_, which explains all about
encodings used in Asia and how they relate to each other.
WRT web browsers, etc., it is more common to use the multibyte encoding of
Unicode (called UTF-8) for HTML applications. Most web development
environments support UTF-8 pretty well. Note that the encoding at the
browser says nothing about the internal processing of your system, which may
use the 16-bit encoding of Unicode (called UTF-16; its predecessor, UCS-2,
is the same for most characters but has no surrogate pairs).
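To make the UTF-8 versus UTF-16 distinction concrete, here is a minimal Python sketch that encodes the same text both ways and compares the byte counts. The function name and the sample string are my own illustrative choices, not anything from your system.

```python
# Illustrative sketch: encoded_sizes and the sample string are hypothetical.
def encoded_sizes(text):
    """Return the byte length of text in UTF-8 and in UTF-16 (big-endian, no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-be")),
    }

# "Unicode " followed by three kanji (U+65E5, U+672C, U+8A9E).
sample = "Unicode \u65e5\u672c\u8a9e"
print(encoded_sizes(sample))

# Either encoding decodes back to the same characters, so the wire format
# (UTF-8 for HTML) can differ from the internal format (UTF-16).
assert sample.encode("utf-8").decode("utf-8") == sample
assert sample.encode("utf-16-be").decode("utf-16-be") == sample
```

ASCII text takes one byte per character in UTF-8 and two in UTF-16, while these kanji take three bytes in UTF-8 and two in UTF-16. The characters themselves are identical either way.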
There is nothing wrong with using "legacy encodings" (which is what
Unicoders call encodings that aren't Unicode;-)) for your HTML interface.
This may make it easier for in-country users to view the web pages without
adjusting their browser's font settings and preferences. Older browsers
(Netscape and IE 4.x and earlier) defaulted to using a font for Unicode that
only supported Latin (Western European) characters, so Asian users would
sometimes see black squares instead of their own characters unless they
adjusted their settings.
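One way to combine the two approaches is to process text as Unicode internally and convert to a legacy encoding only when writing the HTML. The Python sketch below assumes a hypothetical locale-to-encoding table (the locale keys, the choice of EUC-KR for KS C 5601 data, and the function name are my assumptions), and it falls back to numeric character references for anything the legacy encoding cannot represent.

```python
# Hypothetical mapping from locale to the legacy encoding served to the browser.
# The codec names are standard Python codecs; the table itself is an assumption.
LEGACY_ENCODINGS = {
    "ja": "shift_jis",   # Japanese
    "zh-TW": "big5",     # Traditional Chinese
    "zh-CN": "gb2312",   # Simplified Chinese
    "ko": "euc_kr",      # Korean (KS C 5601 characters in EUC packaging)
}

def to_legacy(text, locale):
    """Encode Unicode text for a locale; unknown locales get UTF-8.

    Characters missing from the target encoding become HTML numeric
    character references instead of raising an error.
    """
    codec = LEGACY_ENCODINGS.get(locale, "utf-8")
    return codec, text.encode(codec, errors="xmlcharrefreplace")

codec, data = to_legacy("\u65e5\u672c", "ja")  # two kanji, both in Shift-JIS
print(codec, data)
```

The codec name returned alongside the bytes is what you would declare in the page's charset, so the browser and the server agree on the encoding.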
It should be noted that, until Unicode 3.1 came out recently, there were a
number of characters encoded in some of the legacy encodings you cite which
were not included in Unicode. Support for Unicode 3.1 is planned for most
environments eventually, but for the most part it is unavailable at present.
The characters in question are generally considered quite rare, but you
should be aware of this gap as a potential objection that Asian users
might have to a pure Unicode approach.
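This matters for a "16-bit Unicode" design because many of the ideographs added in Unicode 3.1 (CJK Extension B, starting at U+20000) sit above U+FFFF and need two 16-bit code units in UTF-16. The Python sketch below shows the difference; the helper names are hypothetical and the code is only an illustration of the arithmetic, not a statement about any particular platform's support.

```python
# Illustrative helpers (hypothetical names) for spotting characters that do
# not fit in a single 16-bit code unit.
def utf16_units(ch):
    """Number of 16-bit code units needed to store one character in UTF-16."""
    return len(ch.encode("utf-16-be")) // 2

def is_supplementary(ch):
    """True if the character lies above U+FFFF, outside the 16-bit range."""
    return ord(ch) > 0xFFFF

bmp_kanji = "\u65e5"        # U+65E5, fits in one 16-bit unit
ext_b_char = "\U00020000"   # U+20000, first code point of CJK Extension B

for ch in (bmp_kanji, ext_b_char):
    print(hex(ord(ch)), utf16_units(ch), is_supplementary(ch))
```

Code that assumes one 16-bit unit per character will split such a character in half, so it is worth checking string length, indexing, and truncation logic against a sample like U+20000.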
Good luck with your implementation.
Best Regards,
Addison
Addison P. Phillips
Globalization Architect / Manager, Globalization Engineering
webMethods, Inc. 432 Lakeside Drive, Sunnyvale, CA
+1 408.962.5487 (phone) +1 408.210.3659 (mobile)
-------------------------------------------------
Internationalization is an architecture. It is not a feature.
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]]On Behalf Of Magda Danish (Unicode)
> Sent: Wednesday, August 01, 2001 10:34 AM
> To: [EMAIL PROTECTED]
> Subject: FW: Unicode in Asia Question
>
>
>
>
> -----Original Message-----
> From: NORIEGA,DANNY (A-HongKong,ex1)
> [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, August 01, 2001 2:56 AM
> To: '[EMAIL PROTECTED]'
> Subject: Unicode in Asia Question
>
>
> Hi:
>
> My company is planning to implement 16-bit Unicode. The
> proposal is to
> go strictly and solely with Unicode (16 bit Unicode for Asia/Japan).
>
> Up to this point we have specified the following encodings:
> - Big 5 (Traditional Chinese)
> - GB 2312 (Simplified Chinese)
> - Shift - JIS (Japanese)
> - KSC 5601 1967 (Korean)
> - Iso-8859-1 (Western character sets)
> - Unicode (we believe this is used for Russian)
>
> I do not fully understand the need for the various encodings.
> I believe
> there are local preferences for browsers (vendors, versions, plugins,
> etc.) that are related to encoding. I have also heard there is some
> need, in Japan for example, where web users routinely view the HTML
> source and expect Shift-JIS.
>
> Can you confirm what browser preferences (encoding driven) are user
> musts? By this I mean IE 4.0+, Netscape, Mosaic, KK Man, etc. And,
> what are the customer needs that would make a specific encoding
> (Unicode, Big 5, GB 2312 or KSC 5601 1967) a must?
>
> The basic question I'm trying to answer is if we move forward
> with using
> strictly Unicode, will my customers in Asia be adversely affected by
> this decision? Will they not be able to view content I place on my
> website?
>
> Best regards,
>
> Danny Noriega
> Asia eBusiness Manager
> Agilent Technologies Hong Kong Ltd.
> [EMAIL PROTECTED]
>
>