On Tue, 21 Jul 2015 18:10:14 +0800 gfb hjjhjh <c933...@gmail.com> wrote:
> When you write text in modern Chinese, there will not be any break
> between different words, and thus if you segment characters according
> to the ideographic characters, what is grouped together would
> either be a clause or a sentence, or even a whole paragraph if you
> are handling some older text without punctuation.

I had another look at Chinese word breaking algorithms today and saw
that their practical purposes were mostly indexing and machine
translation. Consequently, I suspect that authors have little incentive
to mark word boundaries in the texts they originate. This differs from
the Thai situation, where marking word boundaries improves layout and
spell-checking.

Richard.