Kyotaro Horiguchi <horikyota....@gmail.com> writes:
> At Thu, 22 Apr 2021 23:17:19 -0400, Tom Lane <t...@sss.pgh.pa.us> wrote in 
>> Doesn't seem like a good idea, because that locks us into an assumption
>> that the downcasing conversion doesn't change the string's physical
>> length.  There are a lot of counterexamples to that :-(.  I'm not sure

> Mmm. I didn't know of that.

The two examples I know of offhand are in German (eszett "ß" downcases to
"ss") and Turkish (dotted "İ" downcases to "i", likewise dotless "I"
downcases to "ı"; one of each of those pairs is an ASCII letter, the
other is not).  Depending on which encoding is in use, these
transformations *could* be the same number of bytes, but they could
equally well not be.  There are probably other examples.
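
As a rough illustration of the encoding dependence for the Turkish pairs,
here is a small Python sketch (not PostgreSQL code). The pairs are
hard-coded rather than produced by a locale-aware lowercasing call, and
the helper name byte_lengths is made up for this example. It assumes
Python's "iso8859_9" codec as a stand-in for a single-byte Turkish
encoding.

```python
# Turkish upper/lower pairs, hard-coded so the result does not depend on
# the process locale (assumption: these are the mappings a Turkish locale
# would apply).
DOTTED_UPPER = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_LOWER = "\u0131"  # LATIN SMALL LETTER DOTLESS I
PAIRS = [(DOTTED_UPPER, "i"), ("I", DOTLESS_LOWER)]


def byte_lengths(upper, lower, encoding):
    """Return (bytes in upper form, bytes in lower form) for one encoding."""
    return len(upper.encode(encoding)), len(lower.encode(encoding))


for encoding in ("utf-8", "iso8859_9"):
    for upper, lower in PAIRS:
        before, after = byte_lengths(upper, lower, encoding)
        print(f"{encoding}: {upper!r} -> {lower!r}: {before} -> {after} bytes")
```

In UTF-8 one pair shrinks from two bytes to one and the other grows from
one byte to two, while in ISO 8859-9 both members of each pair occupy a
single byte.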

                        regards, tom lane

