RE: More about SCSU (was: Re: A UTF-8 based News Service)

2001-07-21 Thread Yves Arrouye
> > SCSU doesn't look very nice for me. The idea is OK but it's just > > too complicated. Various proposals of encodings differences or xors > > between consecutive characters are IMHO technically better: much > > simpler to implement and work as well. > > These differential schemes seem to b

Re: A UTF-8 based News Service

2001-07-13 Thread David Starner
From: Kevin Bracey <[EMAIL PROTECTED]> > Much as I love SCSU, and much as my web browser supports it, it's not the > sort of thing to start encouraging on the wire when there are already > existing standards to deal with this. Why not? It can be further compressed by currently existing mechanisms

Re: A UTF-8 based News Service

2001-07-13 Thread David Starner
From: Keld Jørn Simonsen <[EMAIL PROTECTED]> > UTF-16 is not just 2 bytes, it is sometimes 2 and sometimes 4 bytes. > IETF is recommending UTF-8 as the prime charset in all Internet protocols. Blah. For his purposes, UTF-16 is 2 bytes. The odds his newspaper will have significant quantities of no
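To put numbers on the "sometimes 2 and sometimes 4 bytes" point, here is a minimal sketch using Python's built-in codecs. The three sample characters are my own picks for illustration, not taken from the thread: an ASCII letter, an Ethiopic syllable in the BMP, and a supplementary-plane character.

```python
# Byte counts per character in UTF-8 and UTF-16 (big-endian, no BOM).
# The sample characters are illustrative assumptions, not from the thread.
samples = {
    "ASCII U+0061": "a",
    "Ethiopic U+1200 (BMP)": "\u1200",
    "Supplementary U+10330 (outside the BMP)": "\U00010330",
}

def encoded_sizes(ch):
    """Return (utf8_bytes, utf16_bytes) for a single character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-be"))

for label, ch in samples.items():
    u8, u16 = encoded_sizes(ch)
    print(f"{label}: UTF-8 {u8} bytes, UTF-16 {u16} bytes")
```

Ethiopic text sits entirely in the BMP, so for that material UTF-16 stays at 2 bytes per character while UTF-8 needs 3; the 4-byte UTF-16 case only shows up for supplementary characters.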

Re: More about SCSU (was: Re: A UTF-8 based News Service)

2001-07-13 Thread David Starner
From: <[EMAIL PROTECTED]> > None as far as I know, which sort of destroys the whole plan. It would sure > be nice if MSIE and Navigator started "quietly" supporting SCSU, in the same > way that they "quietly" (to the average user) began supporting UTF-8. If you want the code in Navigator, write

Re: More about SCSU (was: Re: A UTF-8 based News Service)

2001-07-13 Thread Rick McGowan
> Unfortunately, you don't hear much about SCSU, and in particular the Unicode > Consortium doesn't really seem to promote it much (although they may be > trying to avoid the "too many UTF's" syndrome). Probably that's one point. But also, SCSU is something that's a little more complicated to

RE: A UTF-8 based News Service

2001-07-13 Thread Ayers, Mike
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
>
> Raw UTF-8      4,382,592
> Zipped UTF-8   2,264,152 (52% of raw UTF-8)
> Raw SCSU       1,179,688 (27% of raw UTF-8)
> Zipped SCSU      104,316 (9% of raw SCSU, < 5% of zipped UTF-8)

The data set is truly pa
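The quoted percentages can be rechecked from the byte counts alone. A small sketch follows; the dictionary keys and the `percent` helper are names I made up for this check, and the figures are the ones quoted above.

```python
# Byte counts as quoted in the message above.
sizes = {
    "raw_utf8": 4_382_592,
    "zipped_utf8": 2_264_152,
    "raw_scsu": 1_179_688,
    "zipped_scsu": 104_316,
}

def percent(part, whole):
    """Size of `part` as a percentage of `whole`."""
    return 100.0 * part / whole

print(f"zipped UTF-8 / raw UTF-8:    {percent(sizes['zipped_utf8'], sizes['raw_utf8']):.1f}%")
print(f"raw SCSU / raw UTF-8:        {percent(sizes['raw_scsu'], sizes['raw_utf8']):.1f}%")
print(f"zipped SCSU / raw SCSU:      {percent(sizes['zipped_scsu'], sizes['raw_scsu']):.1f}%")
print(f"zipped SCSU / zipped UTF-8:  {percent(sizes['zipped_scsu'], sizes['zipped_utf8']):.1f}%")
```

The rounded results agree with the 52%, 27%, 9% and "< 5%" figures in the quoted table.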

Re: A UTF-8 based News Service

2001-07-13 Thread DougEwell2
In a message dated 2001-07-13 7:00:26 Pacific Daylight Time, [EMAIL PROTECTED] writes: > Sounds promising! How well does SCSU gzip? If gzip works anything like PKZIP, the answer is, very well indeed. This is because (using the explanation I have heard before) SCSU retargets Unicode text to

Re: More about SCSU (was: Re: A UTF-8 based News Service)

2001-07-13 Thread DougEwell2
In a message dated 2001-07-13 4:07:35 Pacific Daylight Time, [EMAIL PROTECTED] writes: > SCSU doesn't look very nice for me. The idea is OK but it's just > too complicated. Various proposals of encodings differences or xors > between consecutive characters are IMHO technically better: much >
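For readers wondering what an "xor between consecutive characters" scheme might look like, here is a rough sketch. It is not any of the proposals referred to in the thread, and it is not SCSU: the function names and the choice of a 7-bit variable-length integer for each delta are my own assumptions, made only to show the general idea.

```python
# Hypothetical illustration only: XOR each code point with the previous
# one and store the result as a little-endian base-128 varint.
def xor_delta_encode(text):
    out = bytearray()
    prev = 0
    for ch in text:
        cp = ord(ch)
        delta = cp ^ prev          # small when neighbours share high bits
        prev = cp
        while True:
            byte = delta & 0x7F
            delta >>= 7
            if delta:
                out.append(byte | 0x80)   # continuation bit set
            else:
                out.append(byte)          # last byte of this delta
                break
    return bytes(out)

def xor_delta_decode(data):
    chars = []
    prev = 0
    delta = 0
    shift = 0
    for byte in data:
        delta |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7                    # more bytes follow for this delta
        else:
            prev ^= delta                 # undo the XOR against the previous code point
            chars.append(chr(prev))
            delta = 0
            shift = 0
    return "".join(chars)

sample = "\u1230\u120b\u121d"             # three Ethiopic syllables (sample text is an assumption)
packed = xor_delta_encode(sample)
print(len(sample.encode("utf-8")), len(packed))
```

On this three-character sample the sketch produces 4 bytes against 9 for UTF-8, since only the first character pays for the block's high bits. That says nothing about how such a scheme compares with SCSU on real text, which is the point being argued in the thread.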

Re: A UTF-8 based News Service

2001-07-13 Thread Daniel Yacob
[EMAIL PROTECTED] wrote: > > As a test, I downloaded the first article on the page: > > http://unicode.ethiozena.net/Gazettas/Kibrit/Archives/1993/Hamle/05/Kibrit.051193.sera.html > > The article, dated 1993-05-11, has the formidable title: > Yesterday in the Ethiopian calendar :) > «

Re: More about SCSU (was: Re: A UTF-8 based News Service)

2001-07-13 Thread Marcin 'Qrczak' Kowalczyk
Fri, 13 Jul 2001 03:01:10 EDT, [EMAIL PROTECTED] <[EMAIL PROTECTED]> writes: > Unfortunately, you don't hear much about SCSU, and in particular > the Unicode Consortium doesn't really seem to promote it much > (although they may be trying to avoid the "too many UTF's" syndrome). SCSU doesn't look

Re: A UTF-8 based News Service

2001-07-13 Thread Keld Jørn Simonsen
On Fri, Jul 13, 2001 at 02:14:25AM +0100, David Starner wrote: > > As someone involved in the service I often wish there was some > > form of "compressed" Unicode encoding. The 3-byte penalty that > > Ethiopic bears under UTF-8 turns into higher bandwidth that web > > hosting services meter and c

Re: A UTF-8 based News Service

2001-07-13 Thread Kevin Bracey
In message <[EMAIL PROTECTED]> [EMAIL PROTECTED] wrote: > > Encoded in UTF-8, the file was 1891 bytes long. Converted into SCSU, it > dropped to 1121 bytes, which is 40% shorter than the UTF-8 version, better > than UTF-16, and probably better than any existing legacy encoding for

Re: More about SCSU (was: Re: A UTF-8 based News Service)

2001-07-13 Thread DougEwell2
In a message dated 2001-07-12 22:55:09 Pacific Daylight Time, [EMAIL PROTECTED] writes: >> SCSU is also registered as an IANA charset, although you are >> unlikely to find >> raw SCSU text on the Internet, due to its use of control >> characters (bytes below 0x20). > > And what browser suppo

Re: A UTF-8 based News Service

2001-07-13 Thread DougEwell2
In a message dated 2001-07-12 8:27:20 Pacific Daylight Time, [EMAIL PROTECTED] writes: > The Ethiopian News Headlines has relocated to a new server at > http://www.ethiozena.net/ and is making it easier than ever to > read news headlines in Unicode. A companion Unicode only server > is laun

RE: More about SCSU (was: Re: A UTF-8 based News Service)

2001-07-12 Thread Yves Arrouye
> SCSU is also registered as an IANA charset, although you are > unlikely to find > raw SCSU text on the Internet, due to its use of control > characters (bytes > below 0x20). And what browser supports SCSU, and what is that browser's reach in terms of population? Because that's usually what m

Re: A UTF-8 based News Service

2001-07-12 Thread David Starner
> As someone involved in the service I often wish there was some > form of "compressed" Unicode encoding. The 3-byte penalty that > Ethiopic bears under UTF-8 turns into higher bandwidth that web > hosting services meter and charge for by the megabyte. For a > popular site this soon makes UTF-8

More about SCSU (was: Re: A UTF-8 based News Service)

2001-07-12 Thread DougEwell2
I should have also mentioned that SCSU is fully supported by the programming toolkit ICU (International Components for Unicode), found at: http://oss.software.ibm.com/icu/ An Open Source project, ICU is available for free and comes with voluminous documentation. SCSU is also registered as

Re: A UTF-8 based News Service

2001-07-12 Thread DougEwell2
In a message dated 2001-07-12 8:27:20 Pacific Daylight Time, [EMAIL PROTECTED] writes: > As someone involved in the service I often wish there was some > form of "compressed" Unicode encoding. The 3-byte penalty that > Ethiopic bears under UTF-8 turns into higher bandwidth that web > hostin