Peter continued:

> Thanks for the clarification. I should say that the behaviour of NBSP 
> suddenly reverted to what it had been in previous versions of the 
> standard, although a perhaps inadvertent change was made in 4.0.0.

Even that is not correct.

The *Introduction* to UAX #14 was expanded by 3 paragraphs between
the Unicode 3.2.0 and Unicode 4.0.0 versions, in an attempt to
help explain the context of how a line break algorithm works, by
measuring lines and then seeking a locally optimal line break. In
that context, the issue of how compression or expansion of a line
works under justification was raised, and the author of UAX #14
added some explanatory qualifications regarding what spaces are
involved in the kinds of compression and expansion which can impact
line measurement and thus the choice of optimal line break positions.

That text omitted mention of NBSP as parallel to SPACE in that
context -- that was an oversight by the author and not caught in
editorial review. When it became clear that the paragraph in question
was being (erroneously) cited as proving that the intent of the
UTC was that NBSP be implemented as a fixed-width space, the
author acknowledged the oversight and quickly fixed the text.

There is *NO* decision on record, anywhere in the history of UTC
decision making, to make NBSP a fixed-width space.
 
> Nevertheless, there does seem to be a widespread misunderstanding that 
> NBSP is intended to be fixed width, and in many systems it is 
> implemented as such. Perhaps there is a need to clarify this further, 
> perhaps by reinstating text similar to what was in Unicode 3.0.

I didn't cite the parallel text from Unicode 4.0 along with the
Unicode 1.0, Unicode 2.0, and Unicode 3.0 text I quoted, for the
simple reason that it is almost word-for-word identical to
Unicode 3.0. There is no need to reinstate any text -- it was
unchanged and its intent was unchanged.

> 
> I take your point about the advantages of having the drafters of the 
> standard available to explain parts of the standard which are unclear. I 
> certainly wish we could do that with other texts that you allude to. But 
> there must also be controls here. If the text says "black", we can't 
> have the drafters saying that the text really means "white". They can 
> say that they made a mistake, and correct it in a new version, but there 
> are limits on how far they can reinterpret even a text which they wrote 
> themselves.

Of course. Exegesis provided above.

Now please stop claiming that the status of NBSP has changed,
either pre- or post-4.0.0.

That some implementations treat NBSP as fixed-width is a matter
of those implementations. Note that even SPACE is treated as
fixed-width by some implementations, and has a long history of
that. Any implementation that is mono-pitch has a fixed-width
SPACE, and that goes back to the dark prehistory of SPACE as
a Teletype character.

The Unicode Standard does not require that SPACE or NBSP be
fixed-width, nor does it preclude an implementation which,
for whatever reason (limitations of mechanical rendering,
font design, or simply aesthetics) treats them as fixed-width.

The point the standard is making is that the nominally
*fixed-width* space characters (U+2000..U+200A, U+3000) are,
by their very character identity, associated with particular
display widths. But even for those, as UAX #14 notes, there are
typographical practices which may result, for example, in
an ideographic space character being compressed or a
thin space character being expanded. What *matters* is that
the encoded content of the text be correctly specified in
an interoperable manner and that proper typographic practice
be followed to produce the rendered results that people desire.
The Unicode Standard provides a large number of space
characters to assist that. But if even this most elaborate
set of encoded space characters in the history of character
encoding standards does not suffice, then, as for TeX, you
always have the option to move to mark-up to get the desired
results.
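
For anyone who wants to see which characters are under discussion,
here is a minimal sketch in Python, using only the standard
`unicodedata` module. The names `CANDIDATES` and `describe_spaces`
are made up for illustration. It lists SPACE, NBSP, and the
U+2000..U+200A and U+3000 characters mentioned above, with their
character names and General_Category values. It looks only at
character identity and makes no claim about rendered width, which is
left to fonts and layout.

```python
import unicodedata

# Code points named in the text: SPACE, NBSP, U+2000..U+200A, U+3000.
# (Illustrative list only; other space characters exist.)
CANDIDATES = [0x0020, 0x00A0, *range(0x2000, 0x200B), 0x3000]


def describe_spaces(code_points=CANDIDATES):
    """Return (label, character name, general category) per code point."""
    rows = []
    for cp in code_points:
        ch = chr(cp)
        rows.append(
            (f"U+{cp:04X}", unicodedata.name(ch), unicodedata.category(ch))
        )
    return rows


if __name__ == "__main__":
    for label, name, category in describe_spaces():
        print(f"{label}  {category}  {name}")
```

Every character in that list has General_Category Zs (space
separator). The character names (EN SPACE, THIN SPACE, IDEOGRAPHIC
SPACE, and so on) are where the nominal widths show up; SPACE and
NO-BREAK SPACE carry no such indication.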

--Ken

