On 7/16/2018 1:11 PM, Richard Damon wrote:

Many consider that UTF-32 is a variable-width encoding because of the combining 
characters. It can take multiple ‘codepoints’ to define what should be a single 
‘character’ for display.

I hope you realize that this is not the standard meaning of 'variable-width encoding', which is 'variable number of bytes for a codepoint'. By that definition, UTF-16 and UTF-8 are variable width, while UTF-32 is fixed width. If one expands the definition enough, ASCII is 'variable width' because 'fi' is two bytes, or more realistically, because <= and >= each take two characters instead of one (as they can in Unicode, where ≤ and ≥ are single codepoints!).

If one is using a broader definition than usual, it is clearer to say so.
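
To make the two definitions concrete, here is a minimal sketch in Python. The helper name encoded_widths and the sample string are my own choices for illustration, not anything from the earlier messages. It counts bytes per codepoint under each encoding, then shows the combining-character case from the quoted text.

```python
import unicodedata


def encoded_widths(text, encoding):
    """Return the number of bytes each codepoint of text occupies in encoding."""
    return [len(ch.encode(encoding)) for ch in text]


# Illustrative sample: 'A', U+00E9 (e with acute), U+20AC (euro sign), U+1F600 (emoji).
sample = "A\u00e9\u20ac\U0001F600"

# The "-le" codec variants are used so that no byte order mark is added.
utf8_widths = encoded_widths(sample, "utf-8")       # varies per codepoint
utf16_widths = encoded_widths(sample, "utf-16-le")  # 2, or 4 for a surrogate pair
utf32_widths = encoded_widths(sample, "utf-32-le")  # always 4

# The broader sense: one displayed character built from two codepoints.
decomposed = unicodedata.normalize("NFD", "\u00e9")  # 'e' followed by U+0301
decomposed_widths = encoded_widths(decomposed, "utf-32-le")

print(utf8_widths, utf16_widths, utf32_widths)
print(len(decomposed), decomposed_widths)
```

Under the standard definition only the first two encodings vary. The decomposed string is two codepoints, each still four bytes in UTF-32, which is why calling UTF-32 'variable width' needs the broader definition spelled out.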

--
Terry Jan Reedy


