On 9/11/2025 1:49 PM, Jim DeLaHunt via Unicode wrote:
On 2025-09-11 12:21, yitin--- via Unicode wrote:

https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-3/#G27288

What is the significance of using different letters (x,y,z,u)
for different bits?  I don't see any consistent pattern in
the naming.  https://www.rfc-editor.org/rfc/rfc3629 just
uses x for all of them.

What I like about Table 3-6's notation is that it shows how the bits in the various code units (labeled x, y, z, u) correspond to the bits in the scalar value.  See, for example, the scalar value in the final row:

000uuuuu zzzzyyyy yyxxxxxx

The right-hand part of that row shows that the 'u' bits are encoded in the first and second bytes, the 'z' bits in the second byte, the 'y' bits in the third byte, and the 'x' bits in the fourth byte.
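
To make that mapping concrete, here is a minimal Python sketch of the four-byte row. The function name `encode_four_byte` and the variable names are made up for illustration; they are not from the Core Spec or RFC 3629. It only handles scalar values in U+10000..U+10FFFF, and the byte patterns in the comments are the ones from the right-hand side of that row of Table 3-6.

```python
def encode_four_byte(scalar: int) -> bytes:
    """Encode a scalar value in U+10000..U+10FFFF as four UTF-8 bytes.

    Illustrative only: the name and structure are assumptions, chosen to
    mirror the letters used in the last row of Table 3-6.
    """
    if not 0x10000 <= scalar <= 0x10FFFF:
        raise ValueError("expected a scalar value in U+10000..U+10FFFF")

    # Split the scalar value 000uuuuu zzzzyyyy yyxxxxxx into its named fields.
    u = (scalar >> 16) & 0x1F  # uuuuu
    z = (scalar >> 12) & 0x0F  # zzzz
    y = (scalar >> 6) & 0x3F   # yyyyyy
    x = scalar & 0x3F          # xxxxxx

    return bytes([
        0xF0 | (u >> 2),               # 11110uuu: top three 'u' bits
        0x80 | ((u & 0x03) << 4) | z,  # 10uuzzzz: low two 'u' bits, then 'z'
        0x80 | y,                      # 10yyyyyy
        0x80 | x,                      # 10xxxxxx
    ])
```

For U+1F600 this gives the bytes F0 9F 98 80, the same as Python's built-in `chr(0x1F600).encode("utf-8")`.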

The table in section 3 of RFC3629 just shows ranges of scalar values, not the bit patterns within the scalar values. Thus it does not illustrate as much as the Core Spec illustrates.


I agree with Jim's discussion of how this adds readability.

A./

