Andy,

> If no characters between 128 and 255 are valid UTF-8, and they can
> never be valid UTF-8 characters, and are used by many encodings,
> why doesn't Fossil simply ignore them when they are committed?

I think Stephan said it poorly. A solitary byte in that range is never valid UTF-8, but UTF-8 represents every code point above 127 as a sequence of two to four bytes, all of them in the 128 to 255 range. Those sequences have a fixed structure: the lead byte announces how many bytes follow, and each following byte falls in the 128 to 191 range. So it is possible to tell whether a string of bytes in that range represents a valid UTF-8 sequence.
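To make that structure concrete, here is a rough Python sketch of such a check. It is only an illustration of the byte-pattern rules in the UTF-8 definition, not Fossil's actual code, and the function name is_valid_utf8 is made up for this example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data is a well-formed UTF-8 byte sequence.

    Hypothetical helper for illustration only; not taken from Fossil.
    """
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b < 0x80:
            # Plain ASCII: one byte, no high bit.
            i += 1
            continue
        # The lead byte decides how many continuation bytes follow, and
        # narrows the first one to rule out overlong forms, surrogates,
        # and code points above U+10FFFF.
        if 0xC2 <= b <= 0xDF:
            need, lo, hi = 1, 0x80, 0xBF
        elif 0xE0 <= b <= 0xEF:
            need = 2
            lo = 0xA0 if b == 0xE0 else 0x80
            hi = 0x9F if b == 0xED else 0xBF
        elif 0xF0 <= b <= 0xF4:
            need = 3
            lo = 0x90 if b == 0xF0 else 0x80
            hi = 0x8F if b == 0xF4 else 0xBF
        else:
            # A stray continuation byte (0x80..0xBF) or a byte that never
            # appears in UTF-8 (0xC0, 0xC1, 0xF5..0xFF).
            return False
        if i + need >= n:
            return False  # sequence is cut off at the end of the data
        if not lo <= data[i + 1] <= hi:
            return False
        for k in range(2, need + 1):
            if not 0x80 <= data[i + k] <= 0xBF:
                return False
        i += need + 1
    return True
```

With this, the two-byte UTF-8 encoding of "é" (0xC3 0xA9) passes, while the single Latin-1 byte 0xE9 for the same letter fails, because 0xE9 is a lead byte that promises two continuation bytes that never arrive.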

-- Shal

--
Shal Farley                                 s...@cheshireeng.com
Cheshire Engineering Corporation      http://www.CheshireEng.com
120 West Olive Avenue                            +1 626 303 1602
Monrovia, CA 91016                           FAX +1 626 303 1590
_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
