And troff comments appearing in html output as html comments is something I explicitly DON’T want happening. My comments are NOT intended to be part of the finished document in any form.

On Mon, Jan 24, 2022 at 21:09 G. Branden Robinson <g.branden.robin...@gmail.com> wrote:

> Hi Alex,
>
> At 2022-01-24T22:48:29+0100, Alejandro Colomar wrote:
> > Hi Branden,
> >
> > I'd like to see groff comments preserved in the HTML output (as HTML
> > comments).
> >
> > So, for `groff -T html ...`,
> >
> > .\" hello world
> >
> > would be transformed to
> >
> > <!-- hello world -->
> >
> > Sounds good?
>
> That's a bigger challenge than the other items you've raised so far
> (well, the grohtml relative inset thing, I can imagine being a real PITA
> to hammer out, but _conceptually_ it's easy).
>
> The problem is that troff(1) disposes of comments entirely very early in
> parsing. Importantly, they're stripped out of macro definitions before
> the definition is even stored.
>
> It's possible these issues could be overcome by converting comments into
> a device control command escape sequence (\X''), but there are quoting
> issues to consider (although _maybe_ my recent change to how characters
> in such escape sequence get mapped when being written to the
> device-independent output addresses that, or makes doing so easier[1],
> and possibly other matters I haven't thought of.
>
> So this one is a heavier lift, I think.
>
> Regards,
> Branden
>
> [1] commit 9d61b3d142842589b90d7eda0ed3270fbbf6166f
> Author: G. Branden Robinson <g.branden.robin...@gmail.com>
> Date: Fri Oct 1 19:20:25 2021 +1000
>
>     [troff]: Enable ASCII in device control escapes.
>
>     [troff]: Convert special character glyphs corresponding to Unicode
>     Basic Latin ("ASCII") code points to those code points when they
>     occur in device escapes. (They should be correct for IBM code page
>     1047 as well, but this is untested.) This is necessary for encoding
>     URLs in device control commands. Special character identifiers are
>     presumed to be the defaults documented in groff_char(7); this is a
>     design gap that we should consider addressing. (We don't have a way
>     to ask "is this the special character corresponding to Unicode
>     basic Latin code point X?")
>
>     * src/roff/troff/input.cpp (encode_char): Do it.
>
>     I'm not documenting this in NEWS as it feels like a pretty dusty
>     corner even though I'm about to leverage it for something of much
>     higher visibility.
>
>     Also see:
>     65737d48ad7e75353a67e4f408bb68bc5d5b0773
>     3d1988cabc90f3c4b0b0000bb4a809be61eeba3c
>     eb695ab2b5e2bae54afa102355c493bda6e29d3e

--
T. Kurt Bond, tkurtb...@gmail.com, https://tkurtbond.github.io