----- Original Message -----
From: "Adam M. Costello" <[EMAIL PROTECTED]>

> You're right. My estimate must have been based on an anomalous sample.
> Here are the counts for Genesis chapter 1:
>
> King James: 3167 letters
> Basic English: 3088 letters
> Chinese Union: 778 ideographs
> Korean Revised: 1201 Hangul
>
> references:
> http://www.ccim.org/bible/
> http://bible.wisenet.co.kr/
>
> So it's about 4.0 English letters per Chinese ideograph, and about 2.6
> English letters per Korean Hangul.

That must have been hard work. :-)

I know that natural Chinese sentences contain many single-character
verbs, pronouns, adjectives, adverbs, and interrogatives, while most
Chinese nouns are two or three characters long. Compound Chinese nouns
and business names are built from combinations of those nouns.

If we do a study of Chinese nouns and their corresponding English nouns
or transcriptions, the ratio of 4.0 may drop to 2.x, I guess. I believe
the ratio of 2.6 for Hangul sentences may also drop to the low 2.x range.

I will suggest a new source of Chinese nouns later. For the time being,
http://search.yahoo.com/bin/search?p=chinese+english will help.

Chinese participants are welcome to join this analysis.

Thanks.

Soobok Lee

> Each Korean Hangul takes about 2.9 octets in AMC-ACE-Z, which means a
> maximal Korean domain label (20 hangul) holds about as much information
> as a 52-letter English string, which is about 17% less information than
> a maximal English domain label (63 letters), and about 38% less
> information than a maximal Chinese domain label (19 ideographs).
>
> I now retract this statement:
>
> > Of all the languages I've looked at, Korean is by far the least dense
> > when encoded using AMC-ACE-Z.
>
> In light of the new data, I doubt that Korean is the least dense.
>
> AMC
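
For anyone who wants to redo the arithmetic with another text sample, here
is a minimal Python sketch of the ratio and label-capacity calculation
quoted above. The counts and the label limits (20 Hangul, 63 letters) are
taken from the message. Averaging the two English versions is my own
assumption, since the message does not say how the English figure was
chosen, and the function names are only illustrative.

```python
# Character counts for Genesis chapter 1, as quoted in the message.
COUNTS = {
    "King James": 3167,      # English letters
    "Basic English": 3088,   # English letters
    "Chinese Union": 778,    # ideographs
    "Korean Revised": 1201,  # Hangul
}


def letters_per_char(counts, target):
    """English letters per character of `target`.

    Assumption: the English figure is the mean of the two English versions.
    """
    english = (counts["King James"] + counts["Basic English"]) / 2
    return english / counts[target]


def equivalent_english_letters(max_chars, ratio):
    """Length of the English string that carries about the same information."""
    return max_chars * ratio


chinese_ratio = letters_per_char(COUNTS, "Chinese Union")
korean_ratio = letters_per_char(COUNTS, "Korean Revised")

# A maximal Korean label holds 20 Hangul; a maximal English label, 63 letters.
korean_label = equivalent_english_letters(20, round(korean_ratio, 1))
shortfall_vs_english = 1 - korean_label / 63

print(f"English letters per Chinese ideograph: {chinese_ratio:.1f}")
print(f"English letters per Korean Hangul:     {korean_ratio:.1f}")
print(f"Maximal Korean label ~ {korean_label:.0f} English letters")
print(f"Shortfall vs. 63-letter English label: {shortfall_vs_english:.0%}")
```

Replacing the values in COUNTS with counts from a noun list instead of
running text would show whether the ratios really fall toward 2.x.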
