See
http://www.microsoft.com/globaldev/reference/iso.asp
or more specifically
http://www.microsoft.com/globaldev/reference/iso/28591.htm
http://www.microsoft.com/globaldev/reference/iso/28592.htm
http://www.microsoft.com/globaldev/reference/iso/28594.htm
http://www.microsoft.com/globaldev/refe
> >See
> >
> >http://www.microsoft.com/globaldev/reference/iso.asp
> >
> >or more specifically
> >
> >http://www.microsoft.com/globaldev/reference/iso/28591.htm
> >http://www.microsoft.com/globaldev/reference/iso/28592.htm
> >http://www.microsoft.com/globaldev/reference/iso/28594.htm
>htt
> >> On the cover of my French driver's license, it says ``Driving
> >> license'' in 10 languages (all the EU languages at the time it was
> >> printed). The titles are ordered alphabetically by the name of the
> >> language in the language itself. The Portuguese don't seem to mind.
> >>
> >>
>I admit to nitpicking because in this particular case, the language names,
>we may be just lucky so that there are no collation conflicts.
I believe this is an accurate statement... we ARE lucky, so far.
>But believing that there is a collation order that works across a
> > >(Has somebody written a comprehensive collection of all these collation
> > >problems?)
>
Ok, here is the full list of ones I know about, and the VB code that would
demonstrate them, as needed:
(Note: All of this is coming from the book I am working on that discusses
i18n for Visual Basic,
> I think somebody just mentioned that many Italians like "i" and "j" to be
> "equal".
>
Ah, since I am very "Windows" based I always bow to the built-in sorts in
the NLS database, and never recognize other ones until I have a customer
clamoring for support of that sort in an application for whic
Well, along with posts made to the list earlier, there is the problem of
languages that may have native speakers who are unhappy with your collation
scheme. Period. In my experience the fastest way to piss off a user is to
refuse them the right to see things sorted as they would prefer.
But the g
: Robert A. Rosenberg[SMTP:[EMAIL PROTECTED]]
> Sent: Thursday, June 15, 2000 1:27 PM
> To: Unicode List
> Cc: Unicode List
> Subject: RE: Linguistic precedence [was: (TC304.2313) AND/OR:
>
> At 07:53 AM 06/15/2000 -0800, Michael Kaplan (Trigeminal Inc.) wr
One of the things I like about Windows: it's so easy to look at different date
formats. See
http://www.trigeminal.com/samples/setlocalesample.asp
It's a US NT4 server so I could do everything I wanted to like Japan,
Korea, TamilNadu, etc. But I tried for a little variety, and stuck a few RTL
langs
To Windows 2000 (and Windows NT circa SP4 as well), UTF-8 is another
multibyte encoding, which you can get to via "code page 65001" and
MultiByteToWideChar and get from via WideCharToMultiByte. So the only
difference between it and any other code page, be it iso-8859-1 or
windows-1252 is that happ
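
The idea that UTF-8 is handled as just one more code page can be sketched outside Win32 as well. The Python below is only an illustration, not the Windows API: `to_wide` and `from_wide` are hypothetical stand-ins for MultiByteToWideChar and WideCharToMultiByte, and the mapping from code page numbers to Python codec names is my assumption.

```python
# Hypothetical mapping from Windows code page numbers to Python codec names.
# 65001 is the UTF-8 code page mentioned above; the other two are assumed
# equivalents for iso-8859-1 and windows-1252.
CODE_PAGES = {
    65001: "utf-8",
    28591: "iso-8859-1",
    1252: "cp1252",
}


def to_wide(data: bytes, code_page: int) -> str:
    """Stand-in for MultiByteToWideChar: bytes in a code page -> text."""
    return data.decode(CODE_PAGES[code_page])


def from_wide(text: str, code_page: int) -> bytes:
    """Stand-in for WideCharToMultiByte: text -> bytes in a code page."""
    return text.encode(CODE_PAGES[code_page])


# The same two bytes are one character as UTF-8 and two as a single-byte page.
raw = from_wide("\u00e9", 65001)
as_utf8 = to_wide(raw, 65001)
as_1252 = to_wide(raw, 1252)
```

The calling code is identical for every entry in the table; only the code page number changes.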
> if it is xml, then have a look at the xml spec (with the errata list!!).
> it is very clearly specified how to figure that all out there.
> ...
>
Actually, the XML spec is quite clear that neither UTF-16 nor UTF-8 requires
the encoding tag. XML is defined by one of the following:
1) Starts w
Danger is a relative term, I think.
Windows 2000 Notepad includes one so that it can easily recognize a file you
saved as UTF-8 actually being UTF-8 the next time you load it. If you remove
it, then obviously Notepad may not be able to recognize the file as UTF-8.
You should obviously never dis
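
A rough sketch of why the signature matters when a file is loaded again. This is not Notepad's actual detection logic, which is not described here; `has_utf8_bom`, `load_text`, and the cp1252 fallback are all assumptions made for illustration.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """True if the data starts with the UTF-8 signature EF BB BF."""
    return data.startswith(codecs.BOM_UTF8)


def load_text(data: bytes, fallback: str = "cp1252") -> str:
    """Decode as UTF-8 when the signature is present, else use the fallback.

    The fallback code page is an assumption; a real editor would pick the
    system default or apply further heuristics.
    """
    if has_utf8_bom(data):
        # Drop the signature so it does not end up in the text.
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    return data.decode(fallback)


with_bom = codecs.BOM_UTF8 + "\u00e9".encode("utf-8")
without_bom = "\u00e9".encode("utf-8")
```

With the signature the text round-trips; without it, this sketch falls back to the single-byte code page and the character comes back as two.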
Thus far it is something that has been implemented in the fonts, rather than
anywhere else. For example, there are several ligatures in Tamil that will
display one way with the Latha font and the other way with Monotype Tamil
Arial (the way set out in Unicode 3.0 is done in the latter).
Thus s
> > Thus since people who write the language sent both,
>
>
> Do you mean that Tamil writers *purposely* use both the "ancient" and the
> "modern" forms in the same document?
> What is the intent?
>
Yes, that is what I am saying. If you go to several of the Tamil resource
sites on the web, you
I agree Gary.
Windows 2000 Notepad, however, does not agree and writes one.
Since Notepad in prior versions of Windows was in fact the de facto standard
HTML editor, clearly it is a program to be reckoned with. People
should be aware of the fact that there are going to be MANY files out there
Microsoft is very COM-based for its actual data access methods and COM
uses BSTRs that are BOM-less UTF-16. Because of that, the actual storage
format of any database ends up irrelevant since it will be converted to
UTF-16 anyway.
Given that this is what the data layers do, performance is cer
> Sent: Friday, June 23, 2000 11:34 AM
> To: Michael Kaplan (Trigeminal Inc.)
> Cc: Unicode List
> Subject: RE: UTF-8 BOM Nonsense
>
> At 11:31 AM 06/22/2000 -0800, Michael Kaplan (Trigeminal Inc.) wrote:
> >I do not believe that this will require it to be added to a
>But what is the semantic intent, then?
>In other words, what may mean the use of "elephant-trunk" ai vs the "normal" one?
>What may mean the use of the rounded naa vs the "normal", two parts, one?
I do not know enough about Tamil usage to understand THAT part. :-)
is
case is hiding the differences.
Michael
> --
> From: [EMAIL PROTECTED][SMTP:[EMAIL PROTECTED]]
> Sent: Friday, June 23, 2000 2:27 PM
> To: Michael Kaplan (Trigeminal Inc.)
> Cc: Unicode List; [EMAIL PROTECTED]
> Subject: RE: Java
If you are on a Microsoft platform and have the code page support for the
Arabic code page, then a simple MultiByteToWideChar call will take care of
it. Here are the code page numbers to use:
Arabic (ASMO 708): 708
Arabic (DOS): 720
Arabic (ISO): 28596
Arabic (Mac
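
Off Windows, roughly the same conversion can be sketched in Python. The pairing of code page numbers with Python codec names below is my assumption, not something the Win32 API provides, and `decode_arabic` is a hypothetical helper. Only the numbers listed in full above are considered, and 708 is left out because I do not know of a standard Python codec for it.

```python
# Assumed mapping from Windows code page numbers to Python codec names.
ARABIC_CODECS = {
    720: "cp720",        # Arabic (DOS)
    28596: "iso8859_6",  # Arabic (ISO)
}


def decode_arabic(data: bytes, code_page: int) -> str:
    """Decode legacy Arabic bytes to text for a known code page number."""
    try:
        codec = ARABIC_CODECS[code_page]
    except KeyError:
        raise ValueError(f"no codec mapped for code page {code_page}")
    return data.decode(codec)


# Bytes C7 E4 in ISO 8859-6 are ALEF followed by LAM.
sample = decode_arabic(b"\xc7\xe4", 28596)
```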
I have heard the same thing, and think it underscores a point that MANY
companies forget: not all dialects of Arabic are the same, despite the fact
that most software packages have *one* Arabic version.
Issues such as this one can obviously cause major problems, since they even
affect logical vs.