Re: Re: Re: NBSP supposed to stretch, right?
Festival season is over ... I checked it out: LaTeX does the same for the input of an explicit no-break space character. --Jörg Knappen Sent: Sunday, 22 December 2019 at 22:54 From: "Shriramana Sharma via Unicode" To: "Jörg Knappen" Cc: "Asmus Freytag", "UnicoDe List" Subject: Re: Re: NBSP supposed to stretch, right? So I was wondering whether TeX only does this to the ~ input character or the actual NBSP Unicode character too?
Re: Re: NBSP supposed to stretch, right?
So I was wondering whether TeX only does this to the ~ input character or the actual NBSP Unicode character too?
Re: Re: NBSP supposed to stretch, right?
Well, in TeX and LaTeX, the no-break space (indicated by the active character ~ in TeX input files) is stretchable and stretches like a normal inter-word space, so that all inter-word spaces in a line are equal. But multiple no-break spaces still add up to wider spaces in the output, unlike usual space tokens, which are collapsed to one space token. -- Jörg Knappen Sent: Tuesday, 17 December 2019 at 17:20 From: "Asmus Freytag via Unicode" To: unicode@unicode.org Subject: Re: NBSP supposed to stretch, right? On 12/17/2019 2:41 AM, Shriramana Sharma via Unicode wrote: On Tue 17 Dec, 2019, 16:09 QSJN 4 UKR via Unicode, <unicode@unicode.org> wrote: Agree. By the way, it is common practice to use multiple nbsp in a row to create a larger span. In my opinion, it is wrong to replace fixed width spaces with non-breaking spaces. Quote from Microsoft Typography Character design standards: «The no-break space is not the same character as the figure space. The figure space is not a character defined in most computer system's current code pages. In some fonts this character's width has been defined as equal to the figure width. This is an incorrect usage of the character no-break space.» Sorry but I don't understand how this addresses the issue I raised. You don't? In principle it may be true that NBSP is not fixed width, but show me software that doesn't treat it that way. In HTML, NBSP isn't subject to space collapse, therefore it's the go-to space character when you need some extra spacing that doesn't disappear. I bet, in many other environments it was typically the only "other" space character, so it ended up overloaded. My hunch is that it is too late at this point to try to promulgate a "clean" implementation of NBSP, because it would effectively change untold documents retroactively. So it would be a massively breaking change.
If you have a situation where you need really poor layout (wide inter-word spaces) to justify, the fact that an honorific in front of a name works more like it's part of the same word (because the NBSP doesn't stretch) would be the least of my worries. (Although, on lines where interword spaces are reduced a bit, I can see that becoming counter-intuitive.) If you only fix this in software for high-end typography, you'd still have the issue that things will behave differently if you export your (plain) text. And you would have the issue of what to do when you want fixed spaces to be non-breaking as well (is that ever needed?). A./
Re: NBSP supposed to stretch, right?
On 12/19/19, James Kass via Unicode wrote: > > There's a bug report for the LibreOffice application here... > https://bugs.documentfoundation.org/show_bug.cgi?id=41652 > ...which shows an interesting history of the situation. LOL two years ago almost to the date Shriramana Sharma seems to have already *quoted* the Unicode Standard on this (https://bugs.documentfoundation.org/show_bug.cgi?id=41652#c30): The Unicode standard document http://unicode.org/reports/tr14/ clearly states that: When expanding or compressing interword space according to common typographical practice, only the spaces marked by U+0020 SPACE and U+00A0 NO-BREAK SPACE are subject to compression, and only spaces marked by U+0020 SPACE, U+00A0 NO-BREAK SPACE, and occasionally spaces marked by U+2009 THIN SPACE are subject to expansion. All other space characters normally have fixed width. But we have some people there on that bug saying that: While Unicode is an important standard, it's only of secondary importance to an office suite. Its primary goal is *not* creating a reference conformant implementation of the standard; rather, it should use the standard to the extent it needs to serve its users most. which is a 😒 approach in my eyes but well, that's how the real world is on many things. Anyhow the above comment is continued as: And if legacy requires that some statements of standard be violated to keep existing documents intact, that should be that way, until a better design is invented and implemented, which would make possible to please both sides. This means option #1 I mentioned earlier and which seems to already have been discussed in the bug discussion: provide a per-document option or at least a Word-compatibility option as to how to treat NBSP. -- Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा 𑀰𑁆𑀭𑀻𑀭𑀫𑀡𑀰𑀭𑁆𑀫𑀸
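[Editorial illustration, not part of the original message.] The UAX #14 passage quoted above can be restated as two small sets. This is only a sketch of that one sentence; the names COMPRESSIBLE, EXPANDABLE, is_compressible and is_expandable are made up for the example and come from no library.

```python
# Restates the quoted UAX #14 sentence as data; illustrative names only.
SPACE = "\u0020"       # U+0020 SPACE
NBSP = "\u00A0"        # U+00A0 NO-BREAK SPACE
THIN_SPACE = "\u2009"  # U+2009 THIN SPACE

# "only the spaces marked by U+0020 SPACE and U+00A0 NO-BREAK SPACE
#  are subject to compression"
COMPRESSIBLE = frozenset({SPACE, NBSP})

# "... and occasionally spaces marked by U+2009 THIN SPACE are subject
#  to expansion" (the "occasionally" is not modelled here)
EXPANDABLE = frozenset({SPACE, NBSP, THIN_SPACE})


def is_compressible(ch: str) -> bool:
    """True if the quoted passage lists `ch` as subject to compression."""
    return ch in COMPRESSIBLE


def is_expandable(ch: str) -> bool:
    """True if the quoted passage lists `ch` as subject to expansion."""
    return ch in EXPANDABLE
```

Under this reading U+2007 FIGURE SPACE falls under "all other space characters" and is in neither set.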
Re: NBSP supposed to stretch, right?
On 2019-12-21 2:43 AM, Shriramana Sharma via Unicode wrote: Ohkay and that's very nice meaningful feedback from actual developer+user interaction. So the way I look at this going forward is that we have four options: 1) With the existing single NBSP character, provide a software option to either make it flexible or inflexible, but this preference should be stored as part of the document and not the application settings, else shared documents would not preserve the layout intended by the creator. 5) Update the applications to treat NBSP correctly. Process legacy data based on date/time stamp (or metadata) appropriately and offer users the option to update their legacy data algorithmically using proper non-stretching space characters such as FIGURE SPACE. - Options 1 and 5 have the advantage of not requiring the addition of yet more spacing characters to the Standard.
Re: NBSP supposed to stretch, right?
On 12/21/19, Richard Wordingham via Unicode wrote: > On Fri, 20 Dec 2019 17:25:17 +0530 > Shriramana Sharma via Unicode wrote: > >> I don't expect NBSP to ever disappear, because spaces disappear only >> at linebreaks, and NBSP simply doesn't stand at linebreaks. > > I can certainly imagine someone writing " ". You don't need to go so far. Even the Unicode characters can be entered: A0 0A (which makes for a nice smiley like pattern, two ears besides two eyes 😉). Obviously we are talking about *automatic* linebreaks. IIUC the point about NBSP is that *it itself* doesn't break, whereas SP breaks up and is *replaced* by a linebreak. Nobody said anything about manual linebreak characters *following* a space character, whether SP or NBSP or anything else. I also just tested and noticed something related: in my wordprocessor (LibreOffice Writer) when the cursor is near the end of a line and the horizontal space remaining on that line is less than the nominal advance width of the space, pressing space doesn't advance the cursor (or maybe it does and I don't see it) irrespective of whether the paragraph is left-aligned or justified, whereas inputting NBSP goes to the next line, pulling the word before it along with it. This is consistent with the current fixed-width NBSP behaviour of these wordprocessors. -- Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा 𑀰𑁆𑀭𑀻𑀭𑀫𑀡𑀰𑀭𑁆𑀫𑀸
Re: [EXTERNAL] Re: NBSP supposed to stretch, right?
On 12/21/19, Shriramana Sharma wrote: > 1) > > With the existing single NBSP character, provide a software option to > either make it flexible or inflexible, but this preference should be > stored as part of the document and not the application settings, else > shared documents would not preserve the layout intended by the > creator. One thing I forgot: are there any possibilities that *both* behaviours would be required in the same document? To my imagination, I who expect NBSP to be flexible won't use it between text and punctuation like those Word users, and probably they won't use it like me. -- Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा 𑀰𑁆𑀭𑀻𑀭𑀫𑀡𑀰𑀭𑁆𑀫𑀸
Re: [EXTERNAL] Re: NBSP supposed to stretch, right?
On 12/21/19, Murray Sargent wrote: > I checked with the Word team and they actually tried out stretching NBSP > back in 2015 in the "good client" mode. But customer feedback was negative. > The problem is that NBSP is used sometimes when stretching isn't wanted such > as between the end of a question and the question mark or in multi-word > trademarks or in italic expressions such as ad infinitum. Another example is > Text«quotation»moretext. One doesn't want the « and » to > be spaced apart from "quotation" for justification purposes. > > Conceivably Word should offer a special justification option to stretch > NBSP, but user feedback has revealed that it's not a good default option. Ohkay and that's very nice meaningful feedback from actual developer+user interaction. So the way I look at this going forward is that we have four options: 1) With the existing single NBSP character, provide a software option to either make it flexible or inflexible, but this preference should be stored as part of the document and not the application settings, else shared documents would not preserve the layout intended by the creator. 2) Consider that the non-stretching behaviour of wordprocessors (probably following MS Word) is correct, and encode a new NBFSP non-breaking flexible space. [I'm looking at that convenient hole at 2065.] DTP software like InDesign/TeX (and browsers like Firefox, though web content is assumed to be more fluid typographically) should then ideally conform to this and potentially break their users' documents (esp in the case of DTP). 3) Consider that the stretching behaviour of DTP software like InDesign is correct, and encode a new FWNBSP fixed-width non-breaking space [at 2065]. Wordprocessors should then ideally conform to this and potentially break their users' documents. 4) Leave alone the existing ambiguous behaviour of NBSP, and encode two new characters [Supplemental Punctuation has space at 2E50…] for NBFSP and FW-NBSP. 
Like the existing 2028 and 2029 Line and Paragraph Separators with the annotation: “may be used to represent this semantic unambiguously”. -- Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा 𑀰𑁆𑀭𑀻𑀭𑀫𑀡𑀰𑀭𑁆𑀫𑀸
Re: NBSP supposed to stretch, right?
On Fri, 20 Dec 2019 17:25:17 +0530 Shriramana Sharma via Unicode wrote: > So I never asked for NBSP to disappear. I said I want it to *stretch*. > And to my mind "stretch" means to become wider than one's normal > width. It doesn't include decreasing or disappearing width. Don't spaces sometimes shrink? I thought they did in some 'show codes' modes. > I don't expect NBSP to ever disappear, because spaces disappear only > at linebreaks, and NBSP simply doesn't stand at linebreaks. I can certainly imagine someone writing " ". Richard.
Re: NBSP supposed to stretch, right?
On 12/17/19, Asmus Freytag via Unicode wrote: > On 12/17/2019 2:41 AM, Shriramana Sharma via Unicode wrote: >> >> On Tue 17 Dec, 2019, 16:09 QSJN 4 UKR via Unicode, >> wrote: >>> >>> «The no-break space is not the same character as the figure space. The >>> figure space is not a character defined in most computer system's >>> current code pages. In some fonts this character's width has been >>> defined as equal to the figure width. This is an incorrect usage of >>> the character no-break space.» >> >> >> Sorry but I don't understand how this addresses the issue I raised. > > You don't? > > In principle it may be true that NBSP is not fixed width, but show me > software that doesn't treat it that way. > > In HTML, NBSP isn't subject to space collapse, therefore it's the go-to > space character when you need some extra spacing that doesn't disappear. So I never asked for NBSP to disappear. I said I want it to *stretch*. And to my mind "stretch" means to become wider than one's normal width. It doesn't include decreasing or disappearing width. I don't expect NBSP to ever disappear, because spaces disappear only at linebreaks, and NBSP simply doesn't stand at linebreaks. -- Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा 𑀰𑁆𑀭𑀻𑀭𑀫𑀡𑀰𑀭𑁆𑀫𑀸
Re: NBSP supposed to stretch, right?
From our colleague’s web site, http://jkorpela.fi/chars/spaces.html “On web browsers, no-break spaces tended to be non-adjustable, but modern browsers generally stretch them on justification.” Jukka Korpela then offers pointers about avoiding unwanted stretching, and continues: “The change in the treatment of no-break spaces, though inconvenient, is consistent with changes in CSS specifications. For example, clause 7 Spacing of CSS Text Module Level 3 (Editor’s Draft 24 Jan. 2019) defines the no-break space, but not the fixed-width spaces, as a word-separator character, stretchable on justification.” So it appears that there’s no interoperability problem with HTML. It seems that the widespread breakage which Asmus Freytag mentions is limited to legacy applications, such as Word, which persist in treating U+00A0 as the old “hard space”. It also appears that Microsoft tried and failed to correct the problem in Word. Perhaps they should try again. Meanwhile, in the absence of anything from Unicode more explicit than already recommended by the Standard, Shriramana Sharma might be well advised to continue to lobby the respective software people. As more applications migrate towards the correct treatment of U+00A0, they are probably already running into interoperability problems with Microsoft Word and may well have already implemented solutions.
Re: NBSP supposed to stretch, right?
On 2019-12-17 12:50 AM, Shriramana Sharma via Unicode wrote: I would have gone and filed this as a LibreOffice bug since that's the software I use most, but when I found this is a cross-software problem, I thought it would be best to have this discussed and documented here (and in a future version of the standard). There's a bug report for the LibreOffice application here... https://bugs.documentfoundation.org/show_bug.cgi?id=41652 ...which shows an interesting history of the situation. One issue is whether to be Unicode compliant or MS-Word compliant. MS-Word had apparently corrected the bug with Word 2013 but had reverted to the incorrect behavior by the time Word 2016 rolled out. On that page it's noted that applications like InDesign, Firefox, TeX, and QuarkXPress handle U+00A0 correctly.
Re: NBSP supposed to stretch, right?
U+0020 SPACE U+00A0 NO-BREAK SPACE These two characters are equal in every way except that one of them offers an opportunity for a line break and the other does not. If the above statement is true, then any conformant application must treat/process/display both characters identically. Responding to Asmus Freytag, > Now, if someone can show us that there are widespread implementations that > follow the above recommendation and have no interoperability issues with HTML > then I may change my tune. Can anyone show us that there are widespread implementations which would break if they started following the above recommendation? Quoting from this HTML basics page, http://www.htmlbasictutor.ca/non-breaking-space.htm “Some browsers will ignore beyond the first instance of the non-breaking space.” and “Not all browsers acknowledge the additional instances of the non-breaking space.” Fifteen or twenty years ago, we used NO-BREAK SPACE to indent paragraphs and to position text and graphics. Both of those uses are presently considered no-nos because some browsers collapse NBSPs and because there are proper ways now to accomplish these kinds of effects. The introduction of browsers which collapsed NBSP strings broke existing web pages. Perhaps the developers of those browsers decided that SPACE and NO-BREAK SPACE are indeed identical except for line breaking. Are there any modern mark-up language uses of SPACE vs NO-BREAK SPACE which would be broken if they follow the above recommendation?
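[Editorial illustration, not part of the original message.] The claim that the two characters differ only in offering a line-break opportunity can be sketched with a toy greedy wrapper. It is a deliberate simplification: it measures width in characters rather than glyph advances, it is not the UAX #14 algorithm, and the function name wrap is hypothetical.

```python
from typing import List


def wrap(text: str, width: int) -> List[str]:
    """Greedy wrap that breaks only at U+0020 SPACE, never at U+00A0."""
    lines: List[str] = []
    current = ""
    # str.split(" ") splits on U+0020 only, so NBSP stays inside a "word".
    for word in text.split(" "):
        candidate = word if not current else current + " " + word
        if len(candidate) <= width or not current:
            # Fits, or the line is empty and the overlong word must go here.
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines
```

With a width of 6, "Sri Veda Vyasa wrote" typed with ordinary spaces breaks after "Sri", whereas with U+00A0 between "Sri" and "Veda" the two stay on one (overfull) line.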
Re: NBSP supposed to stretch, right?
On 12/17/2019 5:49 PM, James Kass via Unicode wrote: Asmus Freytag wrote, > And any recommendation that is not compatible with what the overwhelming > majority of software has been doing should be ignored (or only enabled on > explicit user input). > > Otherwise, you'll just be advocating for a massively breaking change. It seems like the recommendations are already in place and the “overwhelming majority of software” is already disregarding them. so they are dead letter and should be deprecated... I don’t see the massively breaking change here. Are there any illustrations? If legacy text containing NO-BREAK SPACE characters is popped into a justifier, the worst thing that can happen is that the text will be correctly justified under a revised application. That’s not breaking anything, it’s fixing it. Unlike changing the font-face, font size, or page width (which often results in reformatting the text), the line breaks are calculated before justification occurs. If a string of NO-BREAK SPACE characters appears in an HTML file, the browser should proportionally adjust all of those space characters identically with the “normal” space characters. This should preserve the authorial intent. As for pre-Unicode usage of NO-BREAK SPACE, were there ever any explicit guidelines suggesting that the normal SPACE character should expand or contract for justification but that the NO-BREAK SPACE must not expand or contract?
Re: NBSP supposed to stretch, right?
Asmus Freytag wrote, > And any recommendation that is not compatible with what the overwhelming > majority of software has been doing should be ignored (or only enabled on > explicit user input). > > Otherwise, you'll just be advocating for a massively breaking change. It seems like the recommendations are already in place and the “overwhelming majority of software” is already disregarding them. I don’t see the massively breaking change here. Are there any illustrations? If legacy text containing NO-BREAK SPACE characters is popped into a justifier, the worst thing that can happen is that the text will be correctly justified under a revised application. That’s not breaking anything, it’s fixing it. Unlike changing the font-face, font size, or page width (which often results in reformatting the text), the line breaks are calculated before justification occurs. If a string of NO-BREAK SPACE characters appears in an HTML file, the browser should proportionally adjust all of those space characters identically with the “normal” space characters. This should preserve the authorial intent. As for pre-Unicode usage of NO-BREAK SPACE, were there ever any explicit guidelines suggesting that the normal SPACE character should expand or contract for justification but that the NO-BREAK SPACE must not expand or contract?
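[Editorial illustration, not part of the original message.] The point that justification happens after line breaking, and merely distributes leftover width over the spaces already on the line, can be sketched as below. The function justified_space_widths and its parameters (natural_space, slack, stretch_nbsp) are hypothetical names for this example; real justifiers also weigh shrinkability, letter-spacing and penalties, none of which is modelled.

```python
from typing import List

SPACE = "\u0020"  # U+0020 SPACE
NBSP = "\u00A0"   # U+00A0 NO-BREAK SPACE


def justified_space_widths(line: str, natural_space: float, slack: float,
                           stretch_nbsp: bool = True) -> List[float]:
    """Width given to each SPACE/NBSP in an already-broken line.

    `slack` is the extra width needed to fill the measure. It is shared
    equally among the stretchable spaces; any other space keeps its
    natural width.
    """
    stretchable = {SPACE, NBSP} if stretch_nbsp else {SPACE}
    spaces = [ch for ch in line if ch in (SPACE, NBSP)]
    count = sum(1 for ch in spaces if ch in stretchable)
    extra = slack / count if count else 0.0
    return [natural_space + (extra if ch in stretchable else 0.0)
            for ch in spaces]
```

For a line with one NBSP and two ordinary spaces, a natural space width of 3 and a slack of 6, stretch_nbsp=True gives three equal gaps of 5, while stretch_nbsp=False (the word-processor behaviour described in this thread) leaves the NBSP at 3 and widens the other two to 6 each.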
Re: NBSP supposed to stretch, right?
On 12/17/2019 11:31 AM, James Kass via Unicode wrote: So it follows that any justification operation should treat NO-BREAK SPACE and SPACE identically. And any recommendation that is not compatible with what the overwhelming majority of software has been doing should be ignored (or only enabled on explicit user input). Otherwise, you'll just be advocating for a massively breaking change. NBSP has been supported since way before Unicode. It's way past the point where we can legislate behavior other than the de-facto consensus among implementations. Now, if someone can show us that there are widespread implementations that follow the above recommendation and have no interoperability issues with HTML then I may change my tune. A./
Re: NBSP supposed to stretch, right?
On Tue, 17 Dec 2019 06:20:39 +0530 Shriramana Sharma via Unicode wrote: > Hello. I've just tested LibreOffice, Google Docs and MS Office on > Linux, Android and Windows, and it seems that NBSP doesn't get > stretched like the normal space character when justified alignment > requires it. > > Let me explain. I'm creating a document with the following text > typeset in 12 pt Lohit Tamil with justified alignment on an A5 page > with 0.5" margin all around: > > ஶ்ரீமத் மஹாபாரதம் என்பது நமது தேசத்தின் பெரும் இதிஹாஸமாகும். இதனை > இயற்றியவர் ஶ்ரீ வேத வ்யாஸர். அவரால் அனுக்ரஹிக்கப்பட்டவையான நூல்கள் பல. > > The screenshot > https://sites.google.com/site/jamadagni/files/temp/nbsp-not-expanding.png > may be useful to illustrate the situation. Readers may try such > similar sentences in any software/platform of their choice and report > as to what happens. > > Here the problem arises with the phrase ஶ்ரீ வேத வ்யாஸர். The word > ஶ்ரீ is an honorific applying to the following name of the sage வேத > வ்யாஸர், so it would seem unsightly to the reader if it goes to the > previous line, so I insert an NBSP between it and the name. (Isn't > there such a stylistic convention in English where Mr doesn't stand at > the end of a line? I don't know.) It's not widely taught in so far as it exists. I would avoid placing the word at the end in wide columns, just as I suppress line breaks in 'Figure 7' and '17 December', but I only apply it to short adjuncts. However, I would find the use of narrower spacing somewhere between acceptable and desirable. Thai has a similar rule, where there is generally no space between title and forename, but an obligatory space between forename and surname. To me, this is a continuation of the principle that line-breaks within phrases make them more difficult to understand. 
> However, the phrase is shortly followed by a long word > அனுக்ரஹிக்கப்பட்டவையான, which is too long to fit on the same line and > hence goes to the next line, thereby increasing the inter-word spacing > on its previous line significantly. But the NBSP after the honorific > doesn't stretch, making the word layout unsightly. The strategies to deal with this general problem in English are hyphenation and abandoning justification. In this particular case, your text would benefit from using Knuth's algorithm for justification. > IIUC, no-break space is just that: a space that doesn't permit a line > break. This says nothing about it being fixed width. > > Unicode 12.0 §2.3 on p 27 (55 of PDF) says: You're assuming that TUS is a standard. It's much more a collection of influential recommendations. Richard.
Re: NBSP supposed to stretch, right?
On 2019-12-17 10:37 AM, QSJN 4 UKR via Unicode wrote: Agree. By the way, it is common practice to use multiple nbsp in a row to create a larger span. In my opinion, it is wrong to replace fixed width spaces with non-breaking spaces. Quote from Microsoft Typography Character design standards: «The no-break space is not the same character as the figure space. The figure space is not a character defined in most computer system's current code pages. In some fonts this character's width has been defined as equal to the figure width. This is an incorrect usage of the character no-break space.» The mention of code pages made me suspect that this quote was from an archived older web page, but it's current. Here's the link: https://docs.microsoft.com/en-us/typography/develop/character-design-standards/whitespace Quoting from that same page, "Advance width rule : The advance width of the no-break space should be equal to the width of the space." So it follows that any justification operation should treat NO-BREAK SPACE and SPACE identically.
Re: NBSP supposed to stretch, right?
On 12/17/2019 2:41 AM, Shriramana Sharma via Unicode wrote: On Tue 17 Dec, 2019, 16:09 QSJN 4 UKR via Unicode, wrote: Agree. By the way, it is common practice to use multiple nbsp in a row to create a larger span. In my opinion, it is wrong to replace fixed width spaces with non-breaking spaces. Quote from Microsoft Typography Character design standards: «The no-break space is not the same character as the figure space. The figure space is not a character defined in most computer system's current code pages. In some fonts this character's width has been defined as equal to the figure width. This is an incorrect usage of the character no-break space.» Sorry but I don't understand how this addresses the issue I raised. You don't? In principle it may be true that NBSP is not fixed width, but show me software that doesn't treat it that way. In HTML, NBSP isn't subject to space collapse, therefore it's the go-to space character when you need some extra spacing that doesn't disappear. I bet, in many other environments it was typically the only "other" space character, so it ended up overloaded. My hunch is that it is too late at this point to try to promulgate a "clean" implementation of NBSP, because it would effectively change untold documents retroactively. So it would be a massively breaking change. If you have a situation where you need really poor layout (wide inter-word spaces) to justify, the fact that an honorific in front of a name works more like it's part of the same word (because the NBSP doesn't stretch) would be the least of my worries. (Although, on lines where interword spaces are reduced a bit, I can see that becoming counter-intuitive.) If you only fix this in software for high-end typography, you'd still have the issue that things will behave differently if you export your (plain) text. And you would have the issue of what to do when you want fixed spaces to be non-breaking as well (is that ever needed?). A./
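[Editorial illustration, not part of the original message.] The remark that NBSP "isn't subject to space collapse" in HTML can be approximated as follows. This is only a rough model of the collapsing step, assuming that runs of ASCII whitespace become one U+0020; it is not a full implementation of the HTML/CSS white-space rules, and collapse_spaces is a made-up name.

```python
import re

# ASCII whitespace only. Deliberately not r"\s": for str patterns in
# Python, \s also matches U+00A0, which would defeat the illustration.
_COLLAPSIBLE = re.compile(r"[ \t\n\r\f]+")


def collapse_spaces(text: str) -> str:
    """Collapse each run of ASCII whitespace to one U+0020; keep NBSP."""
    return _COLLAPSIBLE.sub(" ", text)
```

A run of three U+0020 characters comes out as one, while a run of three U+00A0 characters passes through unchanged, which is why NBSP became the "extra spacing that doesn't disappear".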
Re: NBSP supposed to stretch, right?
On Tue 17 Dec, 2019, 16:09 QSJN 4 UKR via Unicode, wrote: > Agree. > By the way, it is common practice to use multiple nbsp in a row to > create a larger span. In my opinion, it is wrong to replace fixed > width spaces with non-breaking spaces. > Quote from Microsoft Typography Character design standards: > «The no-break space is not the same character as the figure space. The > figure space is not a character defined in most computer system's > current code pages. In some fonts this character's width has been > defined as equal to the figure width. This is an incorrect usage of > the character no-break space.» > Sorry but I don't understand how this addresses the issue I raised.