Re: Off-Topic (Re: This spoofing and security thread)

2002-02-13 Thread John Hudson
At 19:13 2/13/2002, Patrick Andries wrote: >There is also one in French where "e" accounts for 15,3% of letters in a >typical text > >It's called "La disparition" (320 pages without an "e"), by Georges Perec. The English one is a translation of Perec's _La disparition_ by Gilbert Adair, enti

Re: This spoofing and security thread

2002-02-13 Thread David Starner
On Wed, Feb 13, 2002 at 08:46:31PM -0800, Yves Arrouye wrote: > > What do you mean? I've done works for Project Gutenberg, and looked at a > > number of books with thoughts of reducing them to ASCII. In my opinion, > > Windows-1252 has every character that most English books will need, > > Especi

Re: Off-Topic (Re: This spoofing and security thread)

2002-02-13 Thread John Cowan
Patrick Andries scripsit: > Quite a feat indeed : since "e" accounts for 13% of letters in a typical > English text. Indeed. It's called "Gadsby", and the author of "La disparition" certainly knew it. > There is also one in French where "e" accounts for 15,3% of letters in a > typical text..

Re: This spoofing and security thread

2002-02-13 Thread Elliotte Rusty Harold
At 7:03 PM -0800 2/13/02, Asmus Freytag wrote: >As I tried to hint at above, attempting to give this answer is at >best possible in a fuzzy, probabilistic sense. Even such a simple >statement that 'e' is used for English, can be misleading. There's >at least one novel that does entirely witho

RE: This spoofing and security thread

2002-02-13 Thread Yves Arrouye
> What do you mean? I've done works for Project Gutenberg, and looked at a > number of books with thoughts of reducing them to ASCII. In my opinion, > Windows-1252 has every character that most English books will need, Especially those books that you want to reduce to ASCII :-) YA

Re: spoof buddies

2002-02-13 Thread David Starner
On Wed, Feb 13, 2002 at 06:39:30AM +0300, Vladimir Ivanov wrote: > I have not heard about official (i.e. approved by Unicode Consortium) > Russian names of Unicode Characters. Of course, they could be constructed. > But that implies that such names must be constructed for every official > language

Re: This spoofing and security thread

2002-02-13 Thread David Starner
On Wed, Feb 13, 2002 at 07:03:40PM -0800, Asmus Freytag wrote: > This has been attempted for some sets of latin based languages. I don't > have a link to one of the documents that do that. Main problem is that > many *more* characters are actually used (and used quite commonly) by users > of these

Off-Topic (Re: This spoofing and security thread)

2002-02-13 Thread Patrick Andries
Asmus Freytag wrote: > . There's at least one novel that does entirely without that letter, > but is certainly in English. Quite a feat indeed : since "e" accounts for 13% of letters in a typical English text. There is also one in French where "e" accounts for 15,3% of letters in a typic

Re: This spoofing and security thread

2002-02-13 Thread Asmus Freytag
At 06:37 PM 2/11/02 +, Juliusz Chroboczek wrote: >We, ASCII-age programmers, are used to considering plain text >rendering as being injective up to binary identity. We carefully >choose fonts that distinguish between O and 0, 1 and l. We use >editors that warn us about non-native line endin

Analysis of ISO 639 and mappings to SIL Ethnologue

2002-02-13 Thread Peter_Constable
[apologies in advance to those who receive this multiple times] In connection to work that Gary Simons and I have been doing in interaction with ISO/TC 37/SC 2/WG 1, we have added some new pages to the Ethnologue web site that present an analysis we have done of the existing ISO 639 language

Re: UTF-16 is not Unicode

2002-02-13 Thread Lars Marius Garshol
* Michael Everson | | I think it's clear that Unicode should give some advice as to how to | announce encoding options in a useful way to the end user. For the | two encodings we are discussing, may I suggest the following | standard menu items: | | Unicode (Raw, UTF-16) | Unicode (Web, UTF-8)

RE: UTF-16 is not Unicode

2002-02-13 Thread Marco Cimarosti
David Starner wrote: > On Tue, Feb 12, 2002 at 08:12:08PM +0100, Marco Cimarosti wrote: > > OK, UTF-8 is my favorite default UTF too. However, whatever > the default is, > > it is easier to just call it "Unicode", and call the other > options "Unicode > > (something else)". > > > > That puts on

U+200B, Zero Width Space

2002-02-13 Thread Ake Persson
Unicoders, In Unicode 3.0, U+200B has the "White_space" property. http://www.unicode.org/Public/3.0-Update/PropList-3.0.0.txt In Unicode 3.1, U+200B has lost its "White_space" property. http://www.unicode.org/Public/3.1-Update/PropList-3.1.0.txt I can't find out why. Can anyone help me? Kin
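
For anyone who wants to check how a current system treats U+200B, here is a minimal sketch in Python. It does not answer why the property changed; it only reports what the Unicode database bundled with the local Python says. Note the assumptions: the helper name describe() is made up for illustration, and str.isspace() is only an approximation of the "White_space" property (Python derives it from the general category and bidirectional class, not from PropList.txt).

```python
import unicodedata

ZWSP = "\u200b"  # ZERO WIDTH SPACE


def describe(ch: str) -> dict:
    """Report what the local Python Unicode database says about one character.

    Hypothetical helper for illustration; "isspace" approximates, but is not
    identical to, the White_space property listed in PropList.txt.
    """
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),
        "bidi": unicodedata.bidirectional(ch),
        "isspace": ch.isspace(),
    }


if __name__ == "__main__":
    # Compare ordinary space, no-break space, and zero width space.
    for ch in (" ", "\u00a0", ZWSP):
        print(f"U+{ord(ch):04X}", describe(ch))
```

In practice this means code that splits or trims on whitespace may leave U+200B in place, so it has to be handled explicitly if it matters.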

Re: UTF-16 is not Unicode

2002-02-13 Thread Michael Everson
At 14:28 -0600 2002-02-12, David Starner wrote: > >What happens when a user is told to save in UTF-16? What about when two >users running different operating systems try to pass files about? And >why would Unicode be any clearer to a naive user than UTF-16? > >IMO, UTF-16 is as clear as Unicode, a

Re: UTF-16 is not Unicode

2002-02-13 Thread Lars Marius Garshol
* Marco Cimarosti | | Only if the user selects a menu like "Manual encoding settings", she | should be presented with a choice like "International (Unicode)", | that opposes to "Western (ISO 8859-1)", "Chinese, simplified (GB | 2312-80)", and so on. All entries should have a generic descriptive

Normalisation and case folding (was: IDNA comment)

2002-02-13 Thread David Hopwood
[Cross-posted from the IDN list; reply-to set to [EMAIL PROTECTED] Change it back for replies that relate specifically to IDN.] Mark Davis wrote: > >stringprep(NFC(x)) == stringprep(x) [does not always hold] > > This was brought up early in the Unicode 3.2 dev