Re: [darcs-users] [patch37] Store textual patch metadata encoded in UTF-8

Eric Kow Wed, 11 Nov 2009 02:40:58 -0800

Hi Juliusz,

Nice to hear from you.

On Wed, Nov 11, 2009 at 02:15:11 +0100, Juliusz Chroboczek wrote:
> > The patches make darcs differentiate between old-style and UTF-8 patches by
> > adding an 'Ignore-this: UTF-8' line to the patch log, and looking for that
> > line when interpreting the patch.
> 
> No.  There's no need to tag.
> 
> UTF-8 can be detected automatically with 100% certainty in practice.  If
> a string correctly decodes as UTF-8, then it's most certainly UTF-8.

Could you explain in a bit more detail why this is the case?

Are you saying that the probability of funny characters occurring only
within UTF-8 compatible sequences like 110xxxxx 10xxxxxx is just so
absurdly low (especially in practice) that we can get away with
autodetection?
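
For concreteness, here is a minimal sketch of the kind of autodetection
being discussed, written in Python. It is not darcs code: the name
looks_like_utf8 is hypothetical, and it assumes the test is simply
"does a strict UTF-8 decode succeed?".

```python
def looks_like_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as strict UTF-8."""
    try:
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# "café" in UTF-8: the é is the two-byte sequence C3 A9 (110xxxxx 10xxxxxx).
utf8_bytes = "café".encode("utf-8")

# "café" in Latin-1: the é is the single byte E9, a lead byte with no
# continuation bytes after it, so the strict decode fails.
latin1_bytes = "café".encode("latin-1")

# The Latin-1 text "Ã©" is also the bytes C3 A9, so it would be
# misdetected as UTF-8. This is the rare false-positive case.
ambiguous_bytes = "Ã©".encode("latin-1")

for sample in (utf8_bytes, latin1_bytes, ambiguous_bytes):
    print(sample, looks_like_utf8(sample))
```

Pure ASCII passes the check as well, which is harmless since ASCII
reads the same under either interpretation. A false positive needs
every non-ASCII byte in the Latin-1 text to fall into a well-formed
lead/continuation pattern, as in the third sample.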

Come to think of it, what's the harm if we mistakenly detect something
as UTF-8 from time to time?  Maybe it's no worse than what we already
do...

Thanks,

-- 
Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
PGP Key ID: 08AC04F9

_______________________________________________
darcs-users mailing list
[email protected]
http://lists.osuosl.org/mailman/listinfo/darcs-users
