[EMAIL PROTECTED] wrote:
> 
> As a test, I downloaded the first article on the page:
> 
> http://unicode.ethiozena.net/Gazettas/Kibrit/Archives/1993/Hamle/05/Kibrit.051193.sera.html
> 
> The article, dated 1993-05-11, has the formidable title:
> 

Yesterday in the Ethiopian calendar :) <insert favorite Y2K joke here>



> «p-t negaso gidada wedeTalyan kobelelu teblo yeteseraCew zegeba f`Sum Heset
> new» yeTalyan Embasi
> 

Titles (in <title> markup) remain transliterated, since a number of
browsers that support UTF-8 in the page display area do not support it
in the "title" area of the browser's application window.  Transliterated
Ethiopic actually fares better than UTF-8, since consonants can be a
single byte, syllables 2 bytes, and diphthongs 3.  On average a document
might "compress" with transliteration down to 53%.  Not so easy on the
eyes, but useful as a last resort.
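
To make the byte counts above concrete, here is a rough Python sketch.
It only relies on the fact that every code point in the Unicode Ethiopic
block (U+1200..U+137F) takes 3 bytes in UTF-8; the per-class
transliteration sizes are the ones quoted above, and the names
`translit_bytes` and `ratios` are made up for illustration.

```python
# Every code point in the Ethiopic block (U+1200..U+137F) encodes to
# 3 bytes in UTF-8; collect the distinct sizes to confirm that.
ETHIOPIC_BLOCK = range(0x1200, 0x1380)
utf8_sizes = {len(chr(cp).encode("utf-8")) for cp in ETHIOPIC_BLOCK}

# Transliterated sizes per character class, as stated in the message
# (assumption: one transliterated unit replaces one 3-byte character).
translit_bytes = {"consonant": 1, "syllable": 2, "diphthong": 3}

# Size of the transliteration relative to the UTF-8 form, per class.
ratios = {kind: n / 3 for kind, n in translit_bytes.items()}

for kind, ratio in ratios.items():
    print(f"{kind}: {ratio:.0%} of the UTF-8 size")
```

The 53% average would depend on the mix of character classes and on how
much plain ASCII (markup, spaces, punctuation) the document contains,
which this sketch does not model.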


>
> Encoded in UTF-8, the file was 1891 bytes long.  Converted into SCSU, it
> dropped to 1121 bytes, which is 40% shorter than the UTF-8 version, better
> than UTF-16, and probably better than any existing legacy encoding for
> Ethiopic.  SCSU is a Good Thing.
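
The saving quoted above can be checked with a couple of lines of Python.
The two byte counts are the ones from the message; the variable names
are only illustrative.

```python
utf8_bytes = 1891   # size of the article in UTF-8, as reported
scsu_bytes = 1121   # size after conversion to SCSU, as reported

# Fraction of the UTF-8 size that SCSU removes, as a percentage.
saving_pct = (1 - scsu_bytes / utf8_bytes) * 100
print(f"SCSU is {saving_pct:.1f}% shorter than UTF-8")
```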


Sounds promising!  How well does SCSU gzip?
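
One way the gzip question could be measured, as a rough Python sketch:
Python's standard library has no SCSU codec, so this only compares
UTF-8 and UTF-16; SCSU bytes produced by an external encoder could be
passed through `gzip.compress` the same way.  The helper name
`gzipped_sizes` and the sample text are placeholders, not the actual
article.

```python
import gzip

def gzipped_sizes(text, encodings=("utf-8", "utf-16-le")):
    """Return {encoding: (raw_bytes, gzipped_bytes)} for the given text."""
    sizes = {}
    for enc in encodings:
        raw = text.encode(enc)
        sizes[enc] = (len(raw), len(gzip.compress(raw)))
    return sizes

# Placeholder sample: one Ethiopic syllable (U+1290) repeated.  A real
# test would read the downloaded article instead of this string.
sample = "\u1290" * 500
sizes = gzipped_sizes(sample)

for enc, (raw, packed) in sizes.items():
    print(f"{enc}: {raw} bytes raw, {packed} bytes gzipped")
```

A repeated character compresses unrealistically well, so only numbers
taken from real documents would say anything about SCSU versus UTF-8
after gzip.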

/Daniel
