On Wed, Feb 22, 2023 at 08:26:19PM -0600, G. Branden Robinson wrote:
> Subject: groff for epub/e-books (was: groff 1.22.4 mandb 2.11.2 man -H tbl
> not rendered)
> > [And prefer "open" ePub, vs *proprietary* Adobe PS and PDF,
>
> PDF is an international standard these days,[1] though I wonder
> if it isn't a captive one by Adobe as "Office Open XML" is by
> Microsoft.[2]

Being a standard doesn't mean that Adobe's programmers don't make
mistakes when working on this proprietary code. When trying to work
with PDFs using open source software I often find that I need to use
pdftk or something similar to re-distill PDFs that don't behave
properly. The misbehaving files always come from Microsoft or Apple
platforms.

> > as it's zipped html readable in browser addons, or vim or
> > less(/open) if desperate. Get on it FSF!]
>
> I _do_ think ePub would make a good application for groff, and
> it's something I've given some thought to. One thing EPUBs
> often need to do is reflow and re-render the text, because
> someone make take a tablet or phone display and rotate it
> frequently. EPUBs _can_ do this, but in my experience, they
> often do it poorly.

Are there two different discussions going on here? I think I hear
some people talking about converting groff files to epub and others
perhaps hinting that groff should be the engine driving ebook
readers.

In my experience, going from arbitrary groff source files to the kind
of HTML code needed for epubs (especially the increasingly mandated
accessible epubs) is not worth the effort, unless the groff source
code is written in a highly structured way (i.e., essentially
following XML rules).

I would love it if the second possibility were to be undertaken. I
read a lot of epubs and they are rendered poorly, especially with
respect to hyphenation (if it even exists on the particular platform
you have), but also with respect to control of margins, line spacing
(e.g., if footnote numbers occur in a line), and whenever anything
other than straight, paragraphed text needs to be displayed. And even
pagination itself is usually terrible.
If I see extra white space at the bottom of a page, my subconscious
reading comprehension starts to wind down, assuming I've reached the
end of a chapter or section. Unnecessary and distracting white space
shows up a lot in epub rendering.

Since I still work in publishing, sales of epubs are very important
to me. In Canada, at least, and I think everywhere else, sales have
plateaued at less than 20% of a publisher's income. In many cases
publishers have stopped selling them actively because the encryption
process stymies too many users and takes up too much staff time for
customer service. In other cases, publishers have started boosting
the prices of epubs to just below that of print books because they
can't cover costs. For a long time, Amazon and Apple (in particular)
manipulated the market to keep prices low and keep the majority of
the sales income, but now their control over this is slipping and
prices are going up for everything other than pulp fiction.

> PDF apparently doesn't handle this well, which is one of the
> reason a bunch of "e-book" document formats popped up. I've
> been frustrated with every one I've encountered.

In the academic world, epubs render tables, references, footnotes,
figures, etc. in extremely awkward ways, and in many cases people
prefer PDFs for their increased readability, but on anything smaller
than about a 7-inch screen, PDFs can also be very painful. I still
prefer reading on paper and I still prefer typesetting for paper as
well.

> I have noticed that groff generally renders so fast on modern
> hardware that I'll wager that a "groff ePub" document could
> ship the document _source_ and an "ePub reader" for it would
> provide the entire groff rendering system. (For documents that
> are slow to render even with this approach, you could
> straightforwardly cache the intermediate output for each
> display orientation.)
> I don't see how this would require any
> architectural changes to groff itself, and would have many
> advantages, particularly for document source accessibility,
> archivability, preservation, and "share-alike" licensing
> properties.
>
> (So major publishers would probably hate it and oppose it with
> fury.)

Publishers wouldn't even notice. It's the manufacturers of ebook
readers who make this difficult.

One way to start, probably, would be to attach an e-ink screen to a
Raspberry Pi and run groff to display the HTML from the epub, or the
native groff document. That shouldn't be too hard.

It might be a lot easier if someone would convert groff into
libraries for something like Python. That would probably be more
efficient in handling the display than using pipes between processes.

I'd love to have a python-groff module. It would simplify a lot of
the document transformations I spend a lot of time on, e.g., XML to
PDF, plain text to PDF, etc.

	-- Steve

-- 
Steve Izma
-
Home: 35 Locust St., Kitchener, Ontario, Canada  N2H 1W6
E-mail: si...@golden.net  phone: 519-745-1313
cell (text only; not frequently checked): 519-998-2684

==
The most erroneous stories are those we think we know best – and
therefore never scrutinize or question.
-- Stephen Jay Gould, *Full House: The Spread of Excellence from
Plato to Darwin*, 1996