Unicoders, read below. Lisa
Send in your submissions now!
Call for Papers!
Twenty-seventh Internationalization and Unicode Conference (IUC27)
Theme: Unicode, Cultural Diversity and Multilingual Computing
See Call for Papers at:
On 12/10/2004 00:10, Mike Ayers wrote:
From: Hohberger, Clive [mailto:[EMAIL PROTECTED]]
Sent: Monday, October 11, 2004 11:08 AM
I agree with you... almost.. I think that AD and BC are
really ordinal numbers, which denote relative position in a
series from a 1-origin point. I thought 1 AD
From: Doug Ewell [EMAIL PROTECTED]
Theodore H. Smith delete at elfdata dot com wrote:
- the file mixes UTF-8 and UTF-16
Does this file mix UTF-8 and UTF-16? I thought it just had surrogates
encoded into UTF-8? Of course a surrogate should never exist in UTF-8.
You are right. Philippe's statement
Years were frequently written with Roman numerals - which of course have
no zero.
- Chris
The Unicode Technical Committee has posted three new public review issues.
Details are on the following web page:
http://www.unicode.org/review/
Briefly the new issues are:
47 Changes to default collation of Latin in UCA
In collation, searching, and matching according to the Unicode
On Tue, 12 Oct 2004 20:25:16 +0200, Philippe Verdy [EMAIL PROTECTED] wrote:
From: Doug Ewell [EMAIL PROTECTED]
Theodore H. Smith delete at elfdata dot com wrote:
- the file mixes UTF-8 and UTF-16
Does this file mix UTF-8 and UTF-16? I thought it just had surrogates
encoded into UTF-8?
From: Clark Cox [EMAIL PROTECTED]
unless the file was used as a test for CESU-8
The whole point of the CESU-8-like section is that it is not legal UTF-8.
Except that the document does not even cite CESU-8 but only UTF-16! The
text itself is puzzling as well as nearly all its suggestions about
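To make the point about surrogates in UTF-8 concrete, here is a minimal Python sketch. The sample bytes and the helper names (CESU8_LIKE, is_strict_utf8, cesu8_like_to_str) are assumptions made up for illustration; they are not taken from the test file under discussion. The sketch shows that a strict UTF-8 decoder rejects a surrogate pair written as two three-byte sequences, and how such CESU-8-like bytes could be repaired if one chose to accept them.

```python
# Hypothetical sample: U+10000 written the CESU-8 way, i.e. each UTF-16
# surrogate (D800, DC00) encoded as its own three-byte sequence.
CESU8_LIKE = b"\xed\xa0\x80\xed\xb0\x80"


def is_strict_utf8(data: bytes) -> bool:
    """Return True only if data is well-formed UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def cesu8_like_to_str(data: bytes) -> str:
    """Decode CESU-8-like bytes by letting the surrogates through,
    then pairing them via a round trip through UTF-16."""
    halves = data.decode("utf-8", "surrogatepass")
    return halves.encode("utf-16-le", "surrogatepass").decode("utf-16-le")


if __name__ == "__main__":
    print(is_strict_utf8(CESU8_LIKE))              # strict decoder rejects it
    print(cesu8_like_to_str(CESU8_LIKE) == "\U00010000")
    print("\U00010000".encode("utf-8").hex())      # the legal four-byte form
```

The legal UTF-8 form of the same character is the single four-byte sequence F0 90 80 80, which is why the six-byte form counts as an error for a conforming decoder.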
- Original Message -
From: Christopher Fynn [EMAIL PROTECTED]
To: Unicode List [EMAIL PROTECTED]
Sent: Tuesday, October 12, 2004 8:34 PM
Subject: Re: bit notation in ISO-8859-x is wrong
Years were frequently written with Roman numerals - which of course have
no zero.
Major arcana
There has been a further update to the document for Public Review Issue
#48 (Directional Run) to clarify and expand the proposed definition. If you
have already reviewed the document, I apologize for the inconvenience. The
revised document is linked from the review page:
But for certain purposes e.g. historical astronomical calculations
(used for establishing chronology from records of eclipses etc) the
year numbers used are effectively negative numbers (and zero) AD.
Well, astronomers normally convert everything to Julian Day (JD)
numbers, starting at
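The year numbering described in the quoted text (zero and negative numbers in place of BC years) can be sketched as follows. This is only an illustration under the usual convention that 1 BC maps to year 0 and 2 BC to year -1; the function names are hypothetical, and nothing here computes Julian Day numbers.

```python
from typing import Tuple


def to_astronomical_year(year: int, era: str) -> int:
    """Map a historical year (no year zero) to astronomical numbering."""
    if year < 1:
        raise ValueError("historical years start at 1 in either era")
    era = era.upper()
    if era == "AD":
        return year
    if era == "BC":
        return 1 - year  # 1 BC -> 0, 2 BC -> -1, ...
    raise ValueError("era must be 'AD' or 'BC'")


def from_astronomical_year(n: int) -> Tuple[int, str]:
    """Inverse mapping: astronomical year number back to (year, era)."""
    return (n, "AD") if n >= 1 else (1 - n, "BC")


if __name__ == "__main__":
    for y, e in [(2, "BC"), (1, "BC"), (1, "AD")]:
        print(y, e, "->", to_astronomical_year(y, e))
```

With this numbering, the interval between two years is a plain subtraction, which is the practical reason for preferring it in calculations.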
Title: RE: bit notation in ISO-8859-x is wrong
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of Werner LEMBERG
Sent: Tuesday, October 12, 2004 2:26 PM
But for certain purposes e.g. historical astronomical calculations
(used for establishing chronology from records
Philippe Verdy wrote:
Examples of bad assumptions that a reader could make:
- [quote](...) Experience so far suggests
that most first-time authors of UTF-8 decoders find at least one
serious problem in their decoder by using this file.[/quote]
This suggests to the reader that if its browser or
Using a certain newly Unicode-aware database application which shall
remain nameless (FileMaker 7):
imported UTF-8 sequences like [U+0065][U+0303] e, tilde get remapped
internally to [U+1ebd] LATIN SMALL LETTER E WITH TILDE.
Is this kind of behavior what one would expect?
It's problematic (and
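The remapping described above matches what Unicode normalization to NFC does. Whether the application in question actually applies NFC is an assumption; the sketch below only shows, with Python's standard unicodedata module, that NFC composes the imported sequence into U+1EBD and that NFD decomposes it again.

```python
import unicodedata

decomposed = "e\u0303"  # U+0065 followed by U+0303 COMBINING TILDE
composed = unicodedata.normalize("NFC", decomposed)
restored = unicodedata.normalize("NFD", composed)

if __name__ == "__main__":
    print([f"U+{ord(c):04X}" for c in composed])   # one code point after NFC
    print(unicodedata.name(composed))
    print([f"U+{ord(c):04X}" for c in restored])   # two code points after NFD
```

The two forms are canonically equivalent, so a comparison after normalizing both sides to the same form treats them as the same text, but the stored code points differ from what was imported.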
From: Philipp Reichmuth [EMAIL PROTECTED]
Don't you think you are stretching things a bit? This is a UTF-8 parser
stress test file. If an application opens it in a different encoding,
well, of course the results will be different, and things will not look
UTF-8-ish. Again, this is a