Hello Branden.

G. Branden Robinson wrote in
 <20220912144641.q2r65kkfpiej4u2u@illithid>:
|At 2022-09-12T15:43:00+0200, Steffen Nurpmeso wrote:
|> I have problems with the UTF-8 device, it shows
|>
|>   on‐main‐loop‐tick
|> instead of
|>   on-main-loop-tock
|>
|> ie U+2010 instead of hyphen-minus U+002D.
|>
|> The above does not feel right, and searching is impossible!
|> I would expect U+2010 HYPHEN in hyphenation, but not as a regular
|> combiner aka delimiter joined words as are used very often in
|> German, for example.
|
|There are a few points to raise about this.  The first is a question.
|
|1. You don't expect a hyphenated word to use a hyphen?

This is not a hyphenated word.  In Germany, not only due to the
feminist aka suffragette movement, we even have lots of names like
that.  For example Annette von Droste-Hülshoff, 12 January 1797
until 24 May 1848.  You could hyphenate that, but then, at some
point, feminism comes to an end!  (For her it has different roots,
though.)

|2. This is not a "1.23"-specific issue as your subject lines suggests.
|
|$ groff --version | head -n 1
|GNU groff version 1.22.4

Ok.. this i did not know.  Until last week i was solely using
1.22.3, even if the system has 1.22.4 (just not for me).

 ...
|3. If you're secretly in a man page context but didn't disclose that,
|   then, yes, this is a change from groff 1.22.4.  The hyphen-minus,
|   neutral apostrophe, and grave accent no longer map differently for
|   man(7) and mdoc(7) than for any other macro package.  (\- still does

Oh.  While cycling i dimly recalled there was a discussion here,
but i did not truly follow it(?).

|   and there is no prospect of that changing, since there is no *roff
|   special character defined for the "ASCII hyphen-minus", and it is
|   essential to express this precise character in man pages.  These
|   issues have been discussed at some length on this mailing list over
|   the past three years.)

Really.  The above is just wrong, Branden.  Who said such?
You cannot use HYPHEN for the above.  Hyphen-minus itself,
less-than, greater-than, no-break space, LEFT-POINTING DOUBLE
ANGLE QUOTATION MARK, and that is only going until 0xAB.
Or standard names like IEEE Std 1003.1™-2017, IEEE Std
1003.1-2008, C-language, code-level, POSIX.1-2017, built-in, and
this is only the first page of that standard.
Or the ISO C17 standard: you search for "-" in the official PDF,
and you find it for Storage-class, absolute-value, floating-point,
type-generic, thread-specific, and more, and we are still in the
TOC.  No no -- no HYPHEN here!
These are _not_ hyphenated words.

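To make the search problem concrete, here is a minimal Python
sketch.  It is not groff: render_like_utf8_device() is a
hypothetical stand-in that simply assumes every typed hyphen-minus
comes out as U+2010, and tolerant_search() is likewise only an
illustration of how a search could accept either character.

```python
import re

HYPHEN_MINUS = "\u002d"  # what is typed in the source
HYPHEN = "\u2010"        # what shows up in the UTF-8 output


def render_like_utf8_device(text):
    # Assumption: a crude stand-in for the output mapping, not groff itself.
    return text.replace(HYPHEN_MINUS, HYPHEN)


def literal_search(needle, haystack):
    # Plain fixed-string search, byte for byte what the user typed.
    return needle in haystack


def tolerant_search(needle, haystack):
    # Let each typed hyphen-minus match either U+002D or U+2010.
    parts = []
    for ch in needle:
        if ch == HYPHEN_MINUS:
            parts.append("[\u002d\u2010]")
        else:
            parts.append(re.escape(ch))
    return re.search("".join(parts), haystack) is not None


typed = "on-main-loop-tick"
shown = render_like_utf8_device(typed)

print(literal_search(typed, shown))
print(tolerant_search(typed, shown))
print(len(typed.encode("utf-8")), len(shown.encode("utf-8")))
```

The fixed-string search for what was typed fails on the rendered
text, the tolerant one succeeds, and the rendered form is longer
in bytes because U+2010 takes three bytes in UTF-8.
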
If roff can make a difference in true hyphenation points (i had
to take a loooong look), then it could change a hyphen-minus on
the input side with a hyphen on the output side when it really
breaks a line at that point.  Otherwise hyphen-minus is the only
viable alternative.

Or look at the Unicode standard, where real great minds with
incredible multi-national professional life careers are involved:
get the official PDF (hr-hrm, i have not updated since Unicode
13..), combined words are separated with hyphen-minus, _not_
hyphen.  This is really wrong.

|4. "on-main-loop-tick" doesn't look a natural language word to me--it
|   looks like an identifier in a programming language (maybe some
|   dialect of Lisp).  If that is the case, those hyphens need to be
|   spelled "\-" in the source code.  This has always been true in man

Well, yes and no.  Hyphen is just everywhere in 1.23.

|   pages, going back to 1979.
|
|   Take
|   $ grep '\\-[A-Za-z]' ~/src/unix/v7/usr/man/man1/bc.1
|.B \-c

Yeees, well, i really had to look you know.  This is a language
and there was development and it was a lot of woolding.

  -.th MAIL I 10/25/72
  -.sh NAME
  -mail \*- send mail to another user

Who says it is not an evolution of the above?
Doug McIlroy is on this list, maybe he reads and knows.
Though he said something about NATO today, and that lying
aggressive Endsieg beast is definitely on the other side of the
road.

And by the way, you mention flags in the above.  Flags are
different, because often you want this to be a U+2013 EN DASH.
Ie, you want to make it _longer_ than a hyphen-minus.  Not super
short like a hyphen.  Imho.

 ...
|5. Searching is not impossible.
|   5a. Searching for a word that is broken and hyphenated across lines
|       is no more impossible than it always was.  On occasions when I
|       have to do this, I break out sed(1) or perl(1).

It is not hyphenated, Branden.

|   5b. Literals that might be of interest in man pages should be
|       entered with hyphenation suppressed in the input.  The groff man

Hey!
This is not rocket science or something.  I am happy if people at
least do _write_ manuals _at_all_.

|       pages in 1.23 do this much more conscientiously than in past
|       releases.  This is to avoid confusing users who might wonder if
|       a hyphen is to be interpreted literally or not.
|
|   5c. You can disable automatic hyphenation altogether when rendering
|       man pages.  See the '-rHY' option in groff_man(7).  This feature
|       has been around for many years.
|
|   5d. groff's mdoc(7) implementation did not recognize the `HY`
|       register in groff 1.22.4 and earlier.  It does now, though.
|
|   5e. For me, anyway, searching within less(1) using the pattern with
|       a dot where the hyphen goes works fine, even though there are 3
|       bytes in the input stream instead of one.  Evidently less(1) is

Fuzzy-search code-wise? ;)

|       smart enough.  For instance, I can match "line-ending" in the
|       roff(7) page while paging it with "groff -Tutf8 -man | less -R"
|       by entering "/line.ending" within less(1).
|
|I hope this clears some things up.

Certainly not for me.  Hyphen is good at the end of line when
a word is hyphenated, otherwise it is misplaced.  And using hyphen
to combine words is wrong.  En dash would look nice, i could
imagine.

Ciao,

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)