Werner LEMBERG <w...@gnu.org> writes:

>>> If we get an invalid UTF-8 sequence, I'm all for it.  But it is not
>>> too difficult to not get invalid sequences but still have wrong
>>> output.
>> 
>> Theoretically.  But it is impossible to write just a single
>> non-ASCII byte without hitting an invalid sequence since all
>> non-ASCII bytes must be part of multi-byte sequences.  Only
>> combinations of non-ASCII bytes can form valid utf-8 sequences, and
>> the probability of several of them being "just right" is not all
>> that high.
>
> For single-byte encodings, you are correct.  However, the probability
> is *much* higher if you consider legacy two-byte encodings for CJK
> scripts.

The probability of people accidentally writing two-byte encodings for
CJK scripts in an ASCII-based programming language and being totally
surprised by encoding issues is not all that high.  I also consider it
much, much more likely that somebody unused to encoding problems is
just trying to get a composer's name right in a Latin script than that
they are working with Chinese characters.  It is much easier to make
your computer produce a diacritical Latin letter foreign to you (say,
by using a Compose key) than to produce a Chinese character.

So I don't really see the point in giving up before trying.
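
To make the argument concrete, here is a minimal Python sketch of the
"try it and see" check.  It is only an illustration, not LilyPond code:
the helper name looks_like_utf8 and the sample name are made up for
this example.  It shows that a single Latin-1 byte fails strict UTF-8
validation, while a byte pair in the range used by legacy two-byte CJK
encodings can slip through, which is the rarer case discussed above.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if `data` decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError:
        return False
    return True


# A composer's name saved as Latin-1: the lone byte 0xE9 is a UTF-8 lead
# byte with no continuation bytes after it, so validation fails.
latin1_name = "Faur\u00e9".encode("latin-1")
latin1_detected = not looks_like_utf8(latin1_name)

# The same name saved as UTF-8 passes the check.
utf8_name = "Faur\u00e9".encode("utf-8")
utf8_accepted = looks_like_utf8(utf8_name)

# A byte pair in the range used by legacy two-byte CJK encodings such as
# GBK (lead 0xC4, trail 0xA3) happens to be well-formed UTF-8 as well,
# so this input would not be caught by validation alone.
cjk_pair = bytes([0xC4, 0xA3])
cjk_slips_through = looks_like_utf8(cjk_pair)
```

So the check catches the common Latin-script mistake and misses only
some of the two-byte cases, which is still better than not checking.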

-- 
David Kastrup

_______________________________________________
bug-lilypond mailing list
bug-lilypond@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-lilypond
