Hi,

Eric Kow <[email protected]> writes:
>> No.  There's no need to tag.
>> 
>> UTF-8 can be detected automatically with 100% certainty in practice.  If
>> a string correctly decodes as UTF-8, then it's most certainly UTF-8.
>
> Could you explain in a bit more detail why this is the case?
>
> Are you saying that the probability of funny characters occurring only
> within UTF-8 compatible sequences like 110xxxxx 10xxxxxx is just so
> absurdly low (especially in practice) that we can get away with
> autodetection?
>
> Come to think of it, what's the harm if we mistakenly detect something
> as UTF-8 from time to time?  Maybe it's no worse than what we already
> do...

For what it's worth, I have done some UTF-8 detection based on a
strict UTF-8 decoder and I haven't seen a false positive yet (false
negatives are impossible by the nature of the test).
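
A minimal sketch of such a test, in Python, assuming the language's
built-in strict decoder is an acceptable stand-in for the decoder
mentioned above. The function name looks_like_utf8 is hypothetical and
only there for illustration:

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly under a strict UTF-8 decoder.

    Hypothetical helper for illustration; not taken from any real tool.
    """
    try:
        # errors="strict" makes any malformed sequence raise an exception
        # (stray continuation bytes, truncated or overlong sequences).
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# "cafe" with an e-acute, encoded two different ways.
utf8_bytes = "caf\u00e9".encode("utf-8")      # b'caf\xc3\xa9'
latin1_bytes = "caf\u00e9".encode("latin-1")  # b'caf\xe9'

looks_like_utf8(utf8_bytes)      # True: well-formed UTF-8
looks_like_utf8(latin1_bytes)    # False: 0xE9 starts a sequence that never completes
looks_like_utf8(b"plain ASCII")  # True: ASCII is a subset of UTF-8
looks_like_utf8(b"\xc0\xaf")     # False: overlong encoding is rejected

# A false positive needs non-UTF-8 text that happens to be well-formed,
# e.g. Latin-1 "A-tilde, copyright sign" is the same two bytes as UTF-8
# e-acute.
looks_like_utf8(b"\xc3\xa9")     # True, whichever encoding was intended
```

Real UTF-8 input always passes, so the test can only err by accepting
text in another encoding whose high bytes all happen to form valid
lead/continuation sequences, as in the last case.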

I am mildly in favour of auto-detecting UTF-8 (although I probably
haven't done enough research myself to make a strong case).

Yours,
   Petr.
_______________________________________________
darcs-users mailing list
[email protected]
http://lists.osuosl.org/mailman/listinfo/darcs-users
