RE: Roundtripping Solved

2004-12-17 Thread Arcane Jill
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Lars Kristan Subject: RE: Roundtripping Solved However, requirements 1 and 2 are actually taken from Unicode standard, they are not my requirements. How's that? Well, they are my requirements also, but instead

Re: Roundtripping Solved

2004-12-17 Thread Arcane Jill
This probably doesn't make any difference, Peter, but just so we're talking the same language as each other, I had actually defined f() to return a stream of Unicode characters, not a stream of UTF-16 code units, so I would have written this as: UTF-16(f(s8)) = UTF-16(utf_8_decode(s8)) which si

RE: Roundtripping Solved

2004-12-16 Thread Arcane Jill
Arcane Jill wrote: #for all possible octet sequences s: #length of (UTF-8(f(s)) <= length of s, No, that is not the requirement. It is: bytelength(f(s)) <= 2*bytelength(s) You haven't understood. By definition, s is an octet stream, and f(s) is a Unicode character s

Re: Roundtripping Solved

2004-12-16 Thread Arcane Jill
the business of the UTC. Hope I haven't misunderstood things completely. That would be /so/ embarrassing! Jill -Original Message- From: Peter Kirk [mailto:[EMAIL PROTECTED] Sent: 16 December 2004 12:09 To: Lars Kristan Cc: Arcane Jill; Unicode Subject: Re: Roundtripping Solved The on

RE: Roundtripping Solved

2004-12-16 Thread Arcane Jill
-Original Message- From: Lars Kristan [mailto:[EMAIL PROTECTED] As for your solution, I didn't really analyze it. But it is escaping, isn't it? Yes With a lot of overhead. If you call string length "overhead", yes. This was to provide reasonable assurance that an escape sequence won't be

Re: Roundtripping Solved

2004-12-16 Thread Arcane Jill
[mailto:[EMAIL PROTECTED] Sent: 15 December 2004 16:28 To: Unicode Mailing List Cc: Arcane Jill Subject: Re: Roundtripping Solved Of course, Jill's scheme uses non-private-use Unicode scalar values to achieve what is essentially a private-use function, so this is still non-conformant. (A simi

Re: Roundtripping Solved

2004-12-15 Thread Arcane Jill
incorrect identification down astronomically low. Jill -Original Message- From: Peter Kirk [mailto:[EMAIL PROTECTED] Sent: 15 December 2004 12:54 To: Arcane Jill Cc: Unicode Subject: Re: Roundtripping Solved But would it not work just as well to for Lars' purposes to use, instead of y

RE: Roundtripping in Unicode

2004-12-15 Thread Arcane Jill
-Original Message- From: [EMAIL PROTECTED] On Behalf Of Philippe Verdy Sent: 14 December 2004 22:47 To: Marcin 'Qrczak' Kowalczyk Cc: [EMAIL PROTECTED] Subject: Re: Roundtripping in Unicode From: "Marcin 'Qrczak' Kowalczyk" <[EMAIL PROTECTED]> "Ar

Roundtripping Solved

2004-12-15 Thread Arcane Jill
I followed (and understood) Lars' explanation as to why the NOT- solution wouldn't work for him. Shame really - but here's another bash at a solution, again without breaking the Unicode model. If I have understood this correctly, these are Lars' requirements: 1) There exists a function, f()

RE: Roundtripping in Unicode

2004-12-14 Thread Arcane Jill
I've been following this thread for a while, and I've pretty much got the hang of the issues here. To summarize: Unix filenames consist of an arbitrary sequence of octets, excluding 0x00 and 0x2F. How they are /displayed/ to any given user depends on that user's locale setting. In this scenario
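
The summary above (filenames are raw octets; what a user sees depends on the locale) can be sketched roughly as follows. This is only an illustration: the helper names `is_valid_unix_filename` and `view_as` and the sample filename are hypothetical, not taken from the thread.

```python
def is_valid_unix_filename(name: bytes) -> bool:
    """Return True if name is non-empty and has no NUL (0x00) or '/' (0x2F) octet."""
    return len(name) > 0 and 0x00 not in name and 0x2F not in name


def view_as(name: bytes, encoding: str) -> str:
    """Decode the same octets the way a user with the given locale encoding would see them."""
    return name.decode(encoding, errors="replace")


# Hypothetical filename: 'cafe' with e-acute, as written by a Latin-1 user.
raw = b"caf\xe9"

latin1_view = view_as(raw, "latin-1")  # displays as 'caf' + U+00E9
utf8_view = view_as(raw, "utf-8")      # 0xE9 is not well-formed UTF-8 here, so U+FFFD appears
```

The octets are identical in both cases; only the interpretation differs, which is why the same filename can look valid to one user and broken to another.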

Re: Roundtripping in Unicode

2004-12-13 Thread Arcane Jill
If I have understood this correctly, filenames are not "in" a locale, they are absolute. Users, on the other hand, are "in" a locale, and users view filenames. The same filename can "look" different to two different users. To user A (whose locale is Latin-1), a filename might look valid; to user

Re: When to validate?

2004-12-13 Thread Arcane Jill
I like that. Makes total sense. Thanks. Jill -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Antoine Leca Sent: 10 December 2004 17:38 To: Unicode Subject: Re: When to validate? As a result, your strings are likely to be some structures. Then, it is pretty easy

When to validate?

2004-12-10 Thread Arcane Jill
have to do the validation somewhere else - for example something like t = tolower(trim(validate(s))). where validate(s) does nothing but throw an exception if s is invalid. Other people must have had to make decisions like this. What's the preferred strategy? Arcane Jill
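
One way to read the `t = tolower(trim(validate(s)))` strategy is sketched below, assuming that s is a UTF-8 byte string and that validate() hands its argument back unchanged when it is well-formed. The function bodies are illustrative guesses, not the poster's implementation.

```python
def validate(s: bytes) -> bytes:
    """Raise UnicodeDecodeError if s is not well-formed UTF-8; otherwise return s unchanged."""
    s.decode("utf-8", errors="strict")
    return s


def trim(s: bytes) -> bytes:
    """Strip leading and trailing ASCII whitespace; assumes s was already validated."""
    return s.strip()


def tolower(s: bytes) -> bytes:
    """Lowercase the text; assumes s was already validated."""
    return s.decode("utf-8").lower().encode("utf-8")


# Validation happens exactly once, at the boundary; the inner functions trust their input.
t = tolower(trim(validate(b"  Caf\xc3\x89  ")))
```

The design choice here is that only validate() ever raises on malformed input, so trim() and tolower() need no error handling of their own.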

Re: US-ASCII (was: Re: Invalid UTF-8 sequences)

2004-12-09 Thread Arcane Jill
- Original Message - From: "Arcane Jill" <[EMAIL PROTECTED]> To: "Unicode" <[EMAIL PROTECTED]> Sent: Friday, December 10, 2004 7:17 AM Subject: RE: US-ASCII (was: Re: Invalid UTF-8 sequences) Yes, of course it was a joke. Rest assured, if I perceive any k

RE: US-ASCII (was: Re: Invalid UTF-8 sequences)

2004-12-09 Thread Arcane Jill
next time. :-) Oh, and thanks for the interesting historical character set info. Jill -Original Message- From: Doug Ewell [mailto:[EMAIL PROTECTED] Sent: 09 December 2004 16:28 To: Unicode Mailing List Cc: Arcane Jill Subject: US-ASCII (was: Re: Invalid UTF-8 sequences) I hope that's j

Re: Invalid UTF-8 sequences (was: Re: Nicest UTF)

2004-12-09 Thread Arcane Jill
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Antoine Leca Sent: 09 December 2004 11:29 To: Unicode Mailing List Subject: Re: Invalid UTF-8 sequences (was: Re: Nicest UTF) Windows filesystems do know what encoding they use. Err, not really. MS-DOS *need to k

Re: Nicest UTF

2004-12-06 Thread Arcane Jill
is simply that number expressed in binary. But now I'm getting /very/ silly - please don't take any of this seriously.) :-) The "UTF-24" thing seems a reasonably sensible question though. Is it just that we don't like it because some processors have alignment restrictions

RE: Nicest UTF

2004-12-02 Thread Arcane Jill
Oh for a chip with 21-bit wide registers! :-) Jill -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Antoine Leca Sent: 02 December 2004 12:12 To: Unicode Mailing List Subject: Re: Nicest UTF There are other factors that might influence your choice. For example,

RE: Fixed Width Spaces (was: Printing and Displaying DependentVowels)

2004-04-02 Thread Arcane Jill
OK, I was wrong about the ZX80 character set. Seems I was actually thinking about the ZX Spectrum. Ahem. Its character set is listed here: http://www.madhippy.com/8-bit/sinclair/zxspecman/zxmanappa.html Note the distinction between character 0x20 and character 0x80. Arcane Jill

RE: Fixed Width Spaces (was: Printing and Displaying DependentVowels)

2004-04-02 Thread Arcane Jill
indistinguishable from space, but was NOT space. Of course ZX80 characters did not, in general, have properties, but line breaking algorithms looked for character 0x00, not character 0x80, and so graphic-space behaved like a non-space, not like a space. Arcane Jill > -Original Mess

Re: Fixed Width Spaces (was: Printing and Displaying DependentVowels)

2004-04-01 Thread Arcane Jill
de is another matter entirely, but it sounds good to me, so I'll raise it for discussion. Philippe's idea does have precedent. Arcane Jill > -Original Message- > From: Philippe Verdy [mailto:[EMAIL PROTECTED]] > Sent: Thursday, April 01, 2004 10:52 AM > Subject: Re: Fixed

Re: [OT] proscribed words... (was:What is the principle?)

2004-03-29 Thread Arcane Jill
> -Original Message- > From: Asmus Freytag [mailto:[EMAIL PROTECTED] > Sent: Sunday, March 28, 2004 10:56 PM > Subject: Re: [OT] proscribed words... (was:What is the principle?) > > > being more used to the European practice of > banning certain ideas. Eh? Cou

What is the principle?

2004-03-26 Thread Arcane Jill
Hi, Ignoring all compatibility characters; ignoring everything that has gone before; and considering only present and future characters (that is, characters currently under consideration for inclusion in Unicode, and characters which will be under consideration in the future), which of the fol

RE: help needed with adding new character

2004-03-19 Thread Arcane Jill
UST be hateful or violent, and also no reason why adherents of such a philosophy should not be able to organise sufficiently to agree on standardizing the use of a character. It seems to me that quips such as those below are detrimental, and irrelevant to issues of character encoding. Arcane

What's the BMP being saved for?

2004-03-18 Thread Arcane Jill
erical order on a first-come first-served basis. Maybe someone could assuage my curiosity? Arcane Jill

Re: Investigating: LATIN CAPITAL LETTER J WITH DOT ABOVE

2004-03-17 Thread Arcane Jill
But if you lowercased that, surely you'd get . How should that be rendered? > -Original Message- > From: Kent Karlsson [mailto:[EMAIL PROTECTED] > > A dotted capital J can already be encoded as . > Hence, a separate precomposed such character will not be added. > > /kent k > > > Well, i

RE: Infix profanity (Very OT) (was Phonology)

2004-02-05 Thread Arcane Jill
> Nope, sorry.  Not American -- Minbari. > > For more info on the Minbari, please see: > http://www.sadgeezer.com/babylon5/minbari.htm > > Best regards, > > James Kass > Good point. I was actually referring to the writers, not the character, but you could certainly argue that the writers

RE: Phonology [was: interesting SIL-document]

2004-02-05 Thread Arcane Jill
  -Original Message- From: Hohberger, Clive [mailto:[EMAIL PROTECTED]] Sent: Wednesday, February 04, 2004 11:08 PM To: Mike Ayers; 'John Burger'; [EMAIL PROTECTED] Subject: RE: Phonology [was: interesting SIL-document] Mike, Actually "be-f***king-hind" is a B

RE: Does Java 1.5 support Unicode math alphanumerics as variable names?

2004-01-26 Thread Arcane Jill
I would be very surprised if it did, since Java chars are still only sixteen bits wide, and the new math alphanumerics are not in BMP. Still, I'd be very happy to be proved wrong on this one. Actually, I'd quite like to use these as variable names in other languages too, like in C++ for examp
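
The point about sixteen-bit chars can be checked with a small sketch: a mathematical alphanumeric such as U+1D400 lies outside the BMP, so its UTF-16 form takes two code units. The helper name `utf16_code_units` is made up for this illustration.

```python
def utf16_code_units(text: str) -> list:
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


bold_a = "\U0001D400"            # MATHEMATICAL BOLD CAPITAL A
units = utf16_code_units(bold_a)  # a high surrogate followed by a low surrogate
in_bmp = ord(bold_a) <= 0xFFFF    # False: the code point is above U+FFFF
```

Any language whose character type is a single sixteen-bit unit would therefore need two such units to hold this one character.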

RE: [OT] Keyboards (was: American English translation of character names)

2003-12-19 Thread Arcane Jill
---Original Message- > From: Marco Cimarosti [mailto:[EMAIL PROTECTED]] > Sent: Thursday, December 18, 2003 5:44 PM > To: 'Arcane Jill'; [EMAIL PROTECTED] > Subject: RE: [OT] Keyboards (was: American English translation of > character names) > > > Arcane J

RE: [OT] Keyboards (was: American English translation of character names)

2003-12-18 Thread Arcane Jill
From: Doug Ewell [mailto:[EMAIL PROTECTED] > Sent: Thursday, December 18, 2003 4:28 PM > To: Unicode Mailing List > Cc: Arcane Jill > Subject: Re: [OT] Keyboards (was: American English translation of > character names) > > > On U.S. keyboards, there is no letter key to the left

RE: American English translation of character names

2003-12-18 Thread Arcane Jill
> -Original Message- > From: Eric Scace [mailto:[EMAIL PROTECTED]] > Sent: Thursday, December 18, 2003 3:57 PM > To: John Cowan; Arcane Jill > Cc: [EMAIL PROTECTED] > Subject: RE: American English translation of character names > > >    The logical "no

[OT] Keyboards (was: American English translation of character names)

2003-12-18 Thread Arcane Jill
else from this part of the world care to confirm this? Or perhaps explain why?). Jill > -Original Message- > From: John Cowan [mailto:[EMAIL PROTECTED]] > Sent: Thursday, December 18, 2003 2:31 PM > To: Arcane Jill > Cc: [EMAIL PROTECTED] > Subject: Re: American

RE: American English translation of character names

2003-12-18 Thread Arcane Jill
Thanks, that's interesting. It may well be the case that printers, typesetters, etc., are the only people who actually need these things to have names, so I guess their names should be respected. The rest of us just seem to get by without them, somehow. For example, U+00AC (NOT SIGN) is someth

RE: American English translation of character names

2003-12-18 Thread Arcane Jill
> From: Christopher John Fynn [mailto:[EMAIL PROTECTED]] > There is plenty of disagreement about what the "proper" name  for many > characters should be Or, indeed, why the "proper" name for a character must be in English, and spellable in ASCII, instead of, say, Japanese. > From: Kenneth Whi

RE: Case mapping of dotless lowercase letters

2003-12-17 Thread Arcane Jill
> Would it not make more sense to have not two, but three different kinds of lowercase i: , and ?. (And similarly for uppercase). Of course, then you might as well invent COMBINING SOFT DOT ABOVE so we can use it elsewhere. I should have mentioned that in this hypothetical scheme, the fol

RE: Case mapping of dotless lowercase letters

2003-12-17 Thread Arcane Jill
Far be it from me to stir things up even further, but... QUESTION - Is the rendering of {U+0065} {U+0302} (that's ) locale-dependent? I may have got this totally wrong, but it occurs to me that in non-Turkic fonts, U+0065 is "soft-dotted". That is, the dot disappears in the presence of any C

G-Strings

2003-12-16 Thread Arcane Jill
There was talk recently on this list of mapping grapheme clusters to the PUA (for application internal use only, obviously, not for export to the real world). I actually did this recently, though my app ended up in an incomplete state since I got bored and moved onto something else. The algorit

RE: Case mapping of dotless lowercase letters

2003-12-16 Thread Arcane Jill
> Do we have Unicode DNS yet? Yup. You can put Chinese letters in domain names now. You do it like this: (1) Convert to NFC (2) Encode in UTF-8 (3) Replace all reserved characters (space, %, etc.) with the three character string "%hh" (where hh is hex for the substituted character) (4) Now

RE: Stability of WG2

2003-12-16 Thread Arcane Jill
Speaking as a Brit, I would like to know the answer to this one too. What's the problem with answering online? And if you're really not going to answer this online, you could have just emailed Peter privately, instead of telling the whole list that you're going to keep the answer secret from a

RE: Case mapping of dotless lowercase letters

2003-12-16 Thread Arcane Jill
This occurred to me even before I read Philippe's email. Since {U+0069} is not canonically equivalent to {U+0131}{U+0307}, I don't see anything to stop me from registering the domain name "un{U+0131}{U+0307}code.org", for example. It is in NFC, after all. Jill -Original Message-
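
Both claims in this message can be checked with Python's standard `unicodedata` module. The sketch below compares NFD forms to test canonical equivalence; the helper name `canonically_equivalent` is introduced here for illustration only.

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


dotted_i = "\u0069"             # LATIN SMALL LETTER I
dotless_i_dot = "\u0131\u0307"  # LATIN SMALL LETTER DOTLESS I + COMBINING DOT ABOVE

equivalent = canonically_equivalent(dotted_i, dotless_i_dot)
already_nfc = unicodedata.normalize("NFC", dotless_i_dot) == dotless_i_dot
```

Here `equivalent` comes out False and `already_nfc` comes out True, so normalizing to NFC alone would not fold the look-alike sequence into a plain i.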

RE: Case mapping of dotless lowercase letters

2003-12-15 Thread Arcane Jill
Yes, I know - same as dotted a, b, c, d, e, f, g and so on are distinct from dotless a, b, c, d, e, f, g and so on. I just meant that U+0069 could have been considered dotless - with dotted i being somewhere else. This wouldn't necessarily stop font designers for Western markets from putting a

RE: [OT reversing letters to avoid offence] Re: [Fwd: Re: Swastika to be banned by Microsoft?]

2003-12-15 Thread Arcane Jill
Not wishing to bring the conversation down too low-brow, ABBA often spelt their name with the first B reversed. Jill (in a silly mood --- and I sure am glad that this thread is marked OT). > -Original Message- > From: Mark E. Shoulson [mailto:[EMAIL PROTECTED]] > Sent: Monday, Decembe

RE: Case mapping of dotless lowercase letters

2003-12-15 Thread Arcane Jill
I sometimes wonder whether or not it was a wise choice to regard "LATIN SMALL LETTER I" and "LATIN SMALL LETTER DOTLESS I" as distinct. Too late to change it now, of course, but (with the benefit of hindsight) it occurs to me that if U+0069 had been regarded as dotless, all these problems woul

RE: Text Editors and Canonical Equivalence (was Coloured diacritics)

2003-12-12 Thread Arcane Jill
And what, I find myself wondering, does "nearly infinite" mean? Could you perhaps give us an example of a number which is both finite and "nearly infinite" ? ;-) Jill (just havin a larf) -Original Message- From: Philippe Verdy [mailto:[EMAIL PROTECTED] Sent:Friday, December 12,

RE: Text Editors and Canonical Equivalence (was Coloured diacritics)

2003-12-11 Thread Arcane Jill
n text would a reasonable feature for a text editor to offer. (Even in XML documents, it would only affect one character, if I've understood this thread correctly). Jill > -Original Message- > From: Marco Cimarosti [mailto:[EMAIL PROTECTED]] > Sent: Tuesday, December 09, 2

RE: Text Editors and Canonical Equivalence (was Coloured diacritics)

2003-12-09 Thread Arcane Jill
Hmm. Now here's some C++ source code (syntax colored as Philippe suggests, to imply that the text editor understands C++ at least well enough to color it) int n = wcslen(L"café"); (That's int n = wcslen(L"café"); for those without HTML email) The L prefix on a string literal makes it a wide
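
The message is cut off here, so the following is only one plausible reading of where the example is heading: the length of the literal depends on whether the editor stores the é precomposed or decomposed. A small Python sketch of that effect, using code-point counts as a stand-in for what wcslen would report:

```python
import unicodedata

# Same visible text, two canonically equivalent encodings.
precomposed = unicodedata.normalize("NFC", "cafe\u0301")  # c, a, f, U+00E9
decomposed = unicodedata.normalize("NFD", "caf\u00e9")    # c, a, f, e, U+0301

nfc_length = len(precomposed)  # counts code points
nfd_length = len(decomposed)   # one more, because of the combining acute accent
```

If an editor silently switched between these two forms on save, the value assigned to n in the C++ line above would change even though the source looks identical on screen.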

RE: MS Windows and Unicode 4.0 ?

2003-12-04 Thread Arcane Jill
Okay, I've read enough. I've got the message. Microsoft's view = make the customer pay through the nose for everything you can possibly get away with Linux view = you can have whatever you want for free, but you have to be techy enough to understand it in some detail and/or write it yourself Appl

Free Fonts

2003-12-03 Thread Arcane Jill
This should really be in a FAQ somewhere on the Unicode web site, methinks. One thing - the fonts print spectacularly well, but don't seem to display well on the screen (at least, not in Microsoft Word). Any idea why that might be? Jill -Original Message- From: Philippe Verdy [mail

RE: MS Windows and Unicode 4.0 ?

2003-12-03 Thread Arcane Jill
Sigh. What it is to be constantly misunderstood. In an earlier email on this thread, Peter Constable said "So, out of the box, Windows XP does not support (e.g.) Sinhalese, or ship with Sinhalese fonts. And so, if the next version of Windows does include support for Sinhalese and perhaps even

RE: MS Windows and Unicode 4.0 ?

2003-12-03 Thread Arcane Jill
Actually, a number of points have been made in the course of this thread. Of course it is true that Apple's Last Resort font doesn't display every character with an approximation of its shape, I acknowledge that. I still think it's a lot better than nothing though. But - to clarify my expectat

RE: MS Windows and Unicode 4.0 ?

2003-12-02 Thread Arcane Jill
You misunderstand me. Whilst I have no objection to paying for ADDED value, I'm talking about what comes built in, out of the box. Consider the literary equivalent. Suppose I went to a library and borrowed a book, took it home, and attempted to read it (the real world equivalent of viewing a

RE: Fonts on Web Pages

2003-12-02 Thread Arcane Jill
TED] > Sent: Tuesday, December 02, 2003 1:50 PM > To: Arcane Jill > Cc: [EMAIL PROTECTED] > Subject: Re: Fonts on Web Pages > > Well, note that that technology works with Netscape 4.x and > nothing else: > no IE, no Mozilla/Netscape 6/Netscape 7, no Opera. Overall, I t

RE: Fonts on Web Pages

2003-12-02 Thread Arcane Jill
] Sent: Tuesday, December 02, 2003 12:51 PM To: Arcane Jill Cc: [EMAIL PROTECTED] Subject: Re: Fonts on Web Pages Of course Adobe was designed  to do just the problem you defined,

RE: Fonts on Web Pages

2003-12-02 Thread Arcane Jill
e W3C or some other bunch. Jill -Original Message- From: Raymond Mercier [mailto:[EMAIL PROTECTED]] Sent: Tuesday, December 02, 2003 11:29 AM To: Arcane Jill Cc: [EMAIL PROTECTED] Subject: Re: Fonts on Web Pages Surely Adobe Acrobat will solve both problems ? The recipient only needs to

Fonts on Web Pages

2003-12-02 Thread Arcane Jill
Anyone know the current status on embedded fonts in web pages? I basically have two questions. (1) Assume the existence of a font to which I legally own the copyright. For example, let's say I invented it. Now, I design a web page which uses this font. Now, it's easy (but terribly inconvenient

RE: MS Windows and Unicode 4.0 ?

2003-12-02 Thread Arcane Jill
Damn right. I would like to know this too. In particular, I want all the math characters working, and all the musical symbols working. Note that many of these are not in the BMP. I want to be able to put these characters on web pages, and know that they will be displayed correctly on my own ch

RE: no more precomposed characters for 1:1 conversion

2003-12-02 Thread Arcane Jill
Forgive my ignorance. What is ICU? (I like to know what something is before I download it). Jill > -Original Message- > From: Markus Scherer [mailto:[EMAIL PROTECTED] > Sent: Monday, December 01, 2003 10:36 PM > To: unicode > Subject: no more precomposed characters for 1:1 conversion > > >

RE: MS Windows and Unicode 4.0 ?

2003-12-01 Thread Arcane Jill
argue that the default case mappings should be the ones used everywhere. Jill > -Original Message- > From: Mark E. Shoulson [mailto:[EMAIL PROTECTED] > Sent: Monday, December 01, 2003 1:58 PM > To: Arcane Jill > Cc: [EMAIL PROTECTED] > Subject: Re: MS Windows and Unicode 4.0 ? > > > Shouldn't it permit "assa" and "aßa" to co-exist? It isn't like ß is > canonically equivalent to ss

RE: numeric properties of Nl characters in the UCD

2003-12-01 Thread Arcane Jill
No probs, Doug. I was actually ill over the weekend, and I think I was probably way too sensitive on Friday when it was coming on. I guess I didn't really notice at the time and blamed everyone else for having a go at me when I should have been blaming a bunch of nasty microbes for making me fe

RE: MS Windows and Unicode 4.0 ?

2003-12-01 Thread Arcane Jill
Indeed. The current Windows OS still stores filenames as strings of sixteen-bit wide words (not codepoints; not characters). It allows filenames "assa" and "aßa" to coexist in the same folder, despite its claim to being case-insensitive, and I have even managed to create filenames containing un
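
Whether "assa" and "aßa" count as the same name depends on which case mapping is applied. The sketch below uses Python's built-in string methods to show the difference; it says nothing about how Windows itself compares filenames, which the message does not describe.

```python
a = "assa"
b = "a\u00dfa"  # 'a' + LATIN SMALL LETTER SHARP S + 'a'

# Full case mappings expand sharp s to "SS" / "ss", so the names collide.
same_upper = a.upper() == b.upper()
same_fold = a.casefold() == b.casefold()

# Lowercasing leaves sharp s alone, so the names stay distinct.
same_lower = a.lower() == b.lower()
```

A comparison built on one-to-one character mappings would behave like the last line and let both files coexist; one built on full case folding would treat them as the same name.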

RE: Complex Combining

2003-12-01 Thread Arcane Jill
Of course, one really important point is that Unicode text should remain stateless. It would be foolish indeed if, starting from an arbitrary point in the string, one had to parse backwards and forwards to see if there were any invisible brackets. In the extreme, one would have to scan the ent

RE: Complex Combining

2003-11-28 Thread Arcane Jill
You are getting personal and indulging in ad hominem. I consider this out of order. Yes I have read TUS Section 2.2, and indeed the whole of the rest of the book - and understood it, too, so you can stop wondering that right now. Unicode design principles do not change the fact that there are

RE: Decimal digit property - What's it for?

2003-11-28 Thread Arcane Jill
" property) is the one which remains unanswered. Thanks again. Jill > -Original Message- > From: Jim Allan [mailto:[EMAIL PROTECTED]] > Sent: Thursday, November 27, 2003 6:56 PM > To: [EMAIL PROTECTED] > Subject: Re: Decimal digit property - What's it for? >

RE: numeric properties of Nl characters in the UCD

2003-11-28 Thread Arcane Jill
ompletely out of context, then I'd feel a lot happier. Of course I know what "decimal" means in everyday language. Do you think I'm an idiot? Please stop treating me as one. Jill > -Original Message- > From: Doug Ewell [mailto:[EMAIL PROTECTED]] > Sent: Thursday

Decimal digit property - What's it for?

2003-11-27 Thread Arcane Jill
Hi, It has been explained to me that the "decimal digit" property has the following meaning: "Decimal numbers are those used in decimal-radix number systems. In particular, the sequence of the ONE character followed by the TWO character is interpreted as having the value of twelve". What's th

Complex Combining

2003-11-27 Thread Arcane Jill
MAIL PROTECTED]] Sent: Thursday, November 27, 2003 1:01 PM To: Arcane Jill Cc: [EMAIL PROTECTED] Subject: RE: numeric properties of Nl characters in the UCD Arcane Jill writes: > Gotcha. It's all starting to make sense now. Including the opposition to hex. > > Maybe one coul

RE: numeric properties of Nl characters in the UCD

2003-11-27 Thread Arcane Jill
ts in any radix; "number integer" for integer types such as circled 2 which can't be used positionally; "number fraction" for fractions, and "number other" for everything else. Or maybe some other similar scheme. Is it too late to change things now? Jill  --

RE: numeric properties of Nl characters in the UCD

2003-11-27 Thread Arcane Jill
...which brings me back to my question (which no-one's answered yet). What do the properties "digit" versus "decimal digit" actually MEAN? Is it possible for someone to give a PRECISE definition. I mean, it seems pretty clear that "decimal digit" does NOT mean "radix ten digit" (otherwise circl

RE: numeric properties of Nl characters in the UCD

2003-11-26 Thread Arcane Jill
In full agreement with Philippe here. But also, ever since I first discovered Unicode, I have had the opinion that the descriptions in what is now UCD.html are very confusingly worded. For a start, the three types of numeric property are called "decimal digit", "digit", and "numeric". Now, as

RE: Compression through normalization

2003-11-26 Thread Arcane Jill
In the case of GIF versus JPG, which are usually regarded as "lossless" versus "lossy", please note that there is no "original", in the sense of a stream of bytes. Why not? Because an image is not a stream of bytes. Period. What is being compressed here is a rectangular array of pixels, and tha

RE: numeric properties of Nl characters in the UCD

2003-11-26 Thread Arcane Jill
That is almost precisely what I said. You repeated it perfectly. Thanks. But actually, there is one small difference between what I said and what you said. I merely observed that no characters have different non-null values for the various number-related properties. But you state (emphasis on

RE: numeric properties of Nl characters in the UCD

2003-11-25 Thread Arcane Jill
Actually, I don't understand why UnicodeData.txt has no less than three different fields for numerical value anyway. I mean, it's not as though there exists EVEN A SINGLE CODEPOINT for which two or more of these fields exist and are defined differently from each other. One never sees, for exam
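
The three numeric fields can be inspected through Python's `unicodedata` module, which exposes them as `decimal`, `digit` and `numeric`. The sketch below is a spot check on three characters, not a scan of the whole database, and the helper names are invented for the illustration.

```python
import unicodedata


def numeric_fields(ch: str) -> tuple:
    """Return (decimal, digit, numeric) for ch, with None where a field is empty."""
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )


def fields_agree(ch: str) -> bool:
    """True if every non-empty numeric field of ch carries the same value."""
    values = [v for v in numeric_fields(ch) if v is not None]
    return all(v == values[0] for v in values)


ascii_five = numeric_fields("5")        # all three fields filled
circled_two = numeric_fields("\u2461")  # no decimal value, but digit and numeric
one_half = numeric_fields("\u00bd")     # numeric only
```

In these samples the fields differ only in which ones are filled in, never in the value they carry, which matches the observation in the message.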

RE: Compression through normalization

2003-11-25 Thread Arcane Jill
I'm pretty sure it depends on whether you regard a text document as a sequence of characters, or as a sequence of glyphs. (Er - I mean "default grapheme clusters" of course). Regarded as a sequence of characters, normalisation changes that sequence. But regarded as a sequence of glyphs, normali

RE: creating a test font w/ CJKV Extension B characters.

2003-11-20 Thread Arcane Jill
Is anyone able to answer this? I for one would really like to know. Thanks > -Original Message- > From: Frank Yung-Fong Tang [mailto:[EMAIL PROTECTED] > Sent: Thursday, November 20, 2003 2:29 AM > To: John Jenkins > Cc: [EMAIL PROTECTED] > Subject: Re: creating a test font w/ CJKV Extension

RE: creating a test font w/ CJKV Extension B characters.

2003-11-20 Thread Arcane Jill
Actually, I'd also like to know how to create OTF fonts, not just TTF fonts, as OTF seems to be the new big thing, and TTF's successor. Jill