On Sun, May 06, 2001 at 09:11:07PM +0500, Dan Winship wrote:
> > incorrect utf8.  Ideally to assume that incorrect utf8 is latin1 since
> > that's what it will be most of the time.
> 
> As has been pointed out, "NOT!". See also the Chinese spam that just
> went to evolution-hackers.

In the case of invalid utf8, what we really should do is look at the
statistical distribution of N-grams in the text, make a probabilistic
assessment of what language the mail is written in, and then default
to the most-frequently-used locale/encoding for that language.

Yeah, that's it...
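
For what it's worth, the N-gram idea above could be sketched roughly
as below. This is only a toy under loud assumptions: the SAMPLES
table, the three (language, encoding) pairs, and the names ngrams,
score and decode_guessing are all made up for illustration, and a
real detector would need profiles trained on large corpora rather
than one sentence per language. It is not what Evolution does.

```python
from collections import Counter

# Hypothetical toy training text, keyed by (language, usual legacy encoding).
# One sentence per language is far too little for real use.
SAMPLES = {
    ("en", "iso-8859-1"): "the quick brown fox jumps over the lazy dog and then the cat",
    ("fr", "iso-8859-1"): "le café est très près de la forêt où l'été a été déjà fêté",
    ("ru", "koi8-r"): "это письмо написано по-русски и отправлено в список рассылки",
}


def ngrams(text, n=3):
    """Count the overlapping character n-grams of the lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


# One trigram profile per (language, encoding) candidate.
PROFILES = {key: ngrams(sample) for key, sample in SAMPLES.items()}


def score(text, profile):
    """Fraction of the text's trigrams that also occur in the profile."""
    grams = ngrams(text)
    total = sum(grams.values())
    if total == 0:
        return 0.0
    return sum(count for gram, count in grams.items() if gram in profile) / total


def decode_guessing(raw):
    """Decode bytes as UTF-8 if valid, else pick the best-scoring candidate."""
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        pass
    best = None
    for (lang, encoding), profile in PROFILES.items():
        # Both candidate codecs map every byte, so decoding cannot fail here.
        text = raw.decode(encoding)
        candidate_score = score(text, profile)
        if best is None or candidate_score > best[0]:
            best = (candidate_score, text, encoding)
    return best[1], best[2]
```

Feeding it the KOI8-R bytes of a short Russian phrase should pick
koi8-r, and the Latin-1 bytes of "déjà fêté" should pick iso-8859-1,
because the wrong decoding yields trigrams that match no profile.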

-JT

_______________________________________________
evolution-hackers maillist  -  [EMAIL PROTECTED]
http://lists.helixcode.com/mailman/listinfo/evolution-hackers
