On 18/01/2019 20:09, Asmus Freytag via Unicode wrote:

Marcel,

about your many detailed *technical* questions about the history of character 
properties, I am afraid I have no specific recollection.

Other list members, many of whom are aware of how things happened, are welcome 
to join in. My questions are meant to be rather simple. Summing up the main 
ones:

1. Why does UTC ignore the need for a non-breakable thin space?
2. Why did UTC not declare PUNCTUATION SPACE non-breakable?

A less important piece of information would be how extensively typewriters 
with proportional advance width were used to write books ready for print.

Another question you do answer below:

French is not the only language that uses a space to group figures. In fact, I 
grew up with thousands separators being spaces, but in much of the existing 
publications or documents there was certainly a full (ordinary) space being 
used. Not surprisingly, because in those years documents were typewritten and 
even many books were simply reproduced from typescript.

When it comes to figures, there are two different types of spaces.

One is a space that has the same width as a digit and is used in the layout of lists. For example, if 
you have a leading currency symbol, you may want to have that lined up on the left and leave the 
digits representing the amounts "ragged". You would fill the intervening spaces with this 
"lining" space character and everything lines up.

That is exactly how I understood hot-metal typesetting of tables. What 
surprises me is that computerized layout works the same way instead of using 
tabulations and appropriate tab stops (left, right, centered, decimal [with all 
decimal separators lining up vertically]).

In lists like that, you can get away with not using a narrow thousands 
separator, because the overall context of the list indicates which digits 
belong together and form a number. Having a narrow space may still look nicer, 
but complicates the space fill between the symbol and the digits.

It does not, provided that all numbers have thousands separators, even if 
filling with spaces. It looks nicer because it’s more legible.
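
The list layout discussed above can be sketched as follows. This is only an 
illustration, assuming that the font gives FIGURE SPACE (U+2007) the advance 
width of a digit and that NARROW NO-BREAK SPACE (U+202F) serves as the group 
separator; the function name pad_amount is hypothetical. Missing digits are 
filled with U+2007 while the separators stay in place, so every row has the 
same number of digit-wide and separator-wide positions.

```python
FIGURE_SPACE = "\u2007"  # assumed to be as wide as a digit in the font used
NNBSP = "\u202f"         # NARROW NO-BREAK SPACE, used here as group separator


def pad_amount(n: int, width_digits: int, sep: str = NNBSP) -> str:
    """Right-align a non-negative integer in a column of width_digits digits.

    Missing leading digits become FIGURE SPACE; the group separators are
    kept in every row, so all rows share the same sequence of widths.
    Assumes width_digits >= number of digits in n.
    """
    digits = str(n).rjust(width_digits, FIGURE_SPACE)
    groups = []
    while digits:
        groups.insert(0, digits[-3:])  # take groups of three from the right
        digits = digits[:-3]
    return sep.join(groups)


for amount in (12, 1234, 987654):
    print("€" + FIGURE_SPACE + pad_amount(amount, 6))
```

Whether the columns really line up on screen or in print still depends on the 
font and the renderer, which this sketch cannot check.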

Now, for numbers in running text, using an ordinary space has multiple drawbacks. 
It's definitely less readable and, in digital representation, if you use 0020 
you don't communicate that this is part of a single number that's best not 
broken across lines.

Right.
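
A minimal sketch of the difference, using Python's textwrap module as a 
stand-in for a real line breaker. The assumption here is that textwrap breaks 
lines only at ASCII whitespace, so U+202F stays inside a word; a full 
implementation of the Unicode line breaking algorithm would consult the 
Line_Break property instead. The helper names group_digits and wrap_sentence 
are hypothetical.

```python
import textwrap

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE


def group_digits(n: int, sep: str = NNBSP) -> str:
    """Group the digits of an integer in threes, joined by sep."""
    return f"{n:,}".replace(",", sep)


def wrap_sentence(sep: str, width: int = 20):
    """Wrap a sample sentence whose number is grouped with sep."""
    text = f"The town has {group_digits(1234567, sep)} inhabitants"
    return textwrap.wrap(text, width=width)


# With U+0020 the number may be split across two lines;
# with U+202F it is kept together as one unbreakable unit.
print(wrap_sentence(" "))
print(wrap_sentence(NNBSP))
```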

The problem Unicode had is that it did not properly understand which of the two types of 
"numeric" spaces was represented by "figure space". (I remember that we had 
discussions on that during the early years, but that they were not really resolved and that we 
moved on to other issues, of which many were demanding attention).

You were discussing whether the thousands separator should have the width of a 
digit or the width of a period? Consistent with many other choices, the 
solution would have been to encode them both as non-breakable, all the more so 
since both were at hand, leaving the choice to the end-user.

Current practice in electronic publishing was to use a non-breakable thin 
space, Philippe Verdy reports. Did that information come in somehow?

ISO 31-0 was published in 1992, perhaps too late for Unicode. It is normally 
understood that the thousands separator should not have the width of a digit. 
The alleged reason is security. Though on a typewriter, as you state, there is 
scarcely any other option. By that time, all computerized text was fixed width, 
Philippe Verdy reports. On-screen, I figure, not in book print.

If you want to do the right thing you need:

(1) have a solution that works as intended for ALL languages using some form of 
blank as a thousands separator - solving only the French issue is not enough. 
We should not do this a language at a time.

That is how CLDR works. But as soon as that was set up, I started lobbying for 
support of all relevant locales at once:

https://unicode.org/cldr/trac/ticket/11423

https://unicode.org/pipermail/cldr-users/2018-September/000842.html

https://unicode.org/pipermail/cldr-users/2018-September/000843.html
and
https://unicode.org/cldr/trac/ticket/11423#comment:2

Do you have colleagues in Germany and other countries that can confirm whether 
their practice matches the French usage in all details, or whether there are 
differences? (Including different acceptability of fallback renderings...).

No, I don’t, but people may wish to read German Wikipedia:

https://de.wikipedia.org/wiki/Zifferngruppierung#Mit_dem_Tausendertrennzeichen

Shared in ticket #11423:
https://unicode.org/cldr/trac/ticket/11423#comment:15

(2) have a solution that works for lining figures as well as separators.

(3) have a solution that understands ALL uses of spaces that are narrower than normal 
space. Once a character exists in Unicode, people will use it on the basis of 
"closest fit" to make it do (approximately) what they want. Your proposal needs 
to address any issues that would be caused by reinterpreting a character more narrowly 
than it has been used. Only by comprehensively identifying ALL uses of comparable spaces 
in various languages and scripts can you hope to develop a solution that doesn't simply 
break all non-French text in favor of supporting French typography.

There is no such problem except that NNBSP has never worked properly in 
Mongolian. It was an encoding error, and that is the reason why to date, all 
font developers unanimously request the Mongolian Suffix Connector. That leaves 
the NNBSP for what it is consistently used as outside Mongolian: a non-breakable 
thin space, a kind of belated avatar of what PUNCTUATION SPACE should have been 
from the beginning.

Perhaps you see why this issue has languished for so long: getting it right is 
not a simple matter.

Still, it is as simple as not skipping PUNCTUATION SPACE when FIGURE SPACE was 
made non-breakable. Now we have ended up with a mutated Mongolian Space that 
does not work properly for Mongolian, but does for French and other 
Latin-script languages. It would do so even more if TUS were blunter, urging 
all foundries to update their whole catalogue soon.
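
For readers who want to check how these spaces are recorded in the Unicode 
Character Database, here is a small sketch. Python's unicodedata module does 
not expose the Line_Break property, so this looks at the decomposition tag 
instead, on the assumption that a <noBreak> tag reflects the non-breaking 
intent. The helper name compat_tag is hypothetical.

```python
import unicodedata

# NO-BREAK SPACE, FIGURE SPACE, PUNCTUATION SPACE, THIN SPACE,
# NARROW NO-BREAK SPACE
SPACES = (0x00A0, 0x2007, 0x2008, 0x2009, 0x202F)


def compat_tag(cp: int) -> str:
    """Return the decomposition tag (e.g. '<noBreak>') of a code point."""
    decomposition = unicodedata.decomposition(chr(cp))
    return decomposition.split()[0] if decomposition else ""


for cp in SPACES:
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}: {compat_tag(cp)}")
```

FIGURE SPACE and NNBSP carry the <noBreak> tag, whereas PUNCTUATION SPACE and 
THIN SPACE carry only <compat>.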

Marcel
