Re: [dev] reading an epub book with less: adventures in text processing

2024-03-11 Thread Viktor Grigorov
Rather late to the party and I've already forgotten the initial email. Nevertheless, I'll name the program I use most: epub2txt.[0] It's not perfect, but compared to calibre's ebook-convert and everything else I found written in C on GitHub, Codeberg, or GitLab, it's the best. A once-over with an

Re: [dev] reading an epub book with less: adventures in text processing

2024-03-11 Thread Κρακ Άουτ
On 2024-03-11 17:44 Greg Reagle wrote: > Now my next question is, what is the tool that does the *best* job of > turning a PDF book into a readable text document? Via html or > docbook or markdown or whatever--doesn't matter. My previous > experience trying things out to achieve this goal is

Re: [dev] reading an epub book with less: adventures in text processing

2024-03-11 Thread Greg Reagle
On Sat, Mar 9, 2024, at 1:15 PM, Greg Minshall wrote: > for some personal tastes/usage cases, this, using pandoc's `-t` > option, might be minor-ly simpler: > > man --local-file --pager 'less -ir' \ > <(pandoc --standalone -t man \ >

[dev] Re: reading an epub book with less: adventures in text processing

2024-03-11 Thread Greg Reagle
I think I finally figured it out! With help, of course, from my wise and helpful community. Thanks! And reading the man page for elinks. :> for direct viewing in less: pandoc -s -t html City_of_Truth-Morrow.epub | elinks -dump-color-mode 2 -force-html | less -ir to make a file to keep,

Re: [dev] Re: reading an epub book with less: adventures in text processing

2024-03-11 Thread Greg Reagle
On Sat, Mar 9, 2024, at 12:53 PM, LM wrote: > You could try modifying sdlbook or bard. It would be nice if either of these > offered keymapping functionality like some programming editors do. Thank you for telling me about these two programs. I had not heard of them.

Re: [dev] reading an epub book with less: adventures in text processing

2024-03-11 Thread Greg Reagle
On Sat, Mar 9, 2024, at 4:06 PM, Georg Lehner wrote: > Option 1: use w3m [snip] All great commands. Thank you. > The reason you lose formatting when saving from less(1) or w3m is that > these programs on purpose do not save the terminal control characters > which are doing the markup. Line
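
To illustrate the point about control characters doing the markup, here is a minimal sketch. It assumes the markup in question is nroff-style overstrike (bold written as character, backspace, character), which is what man(1) output typically uses; other tools may emit ANSI escape sequences instead, which this does not handle. The helper name strip_overstrike and the sample string are made up for the example.

```shell
# Assumption: the "markup" is nroff-style overstrike, where bold is
# written as char, backspace, char (and underline as '_', backspace, char).
bs=$(printf '\b')

# Build a small sample: "bold" in overstrike bold, followed by plain " text".
sample=$(printf 'b\bbo\bol\bld\bd text')

# Hypothetical helper: remove every "any character + backspace" pair,
# which leaves only the plain text behind.
strip_overstrike() {
    sed "s/.$bs//g"
}

plain=$(printf '%s\n' "$sample" | strip_overstrike)
printf '%s\n' "$plain"
```

Saving the raw bytes in $sample keeps the markup; saving $plain is roughly what you get when the control characters are dropped on purpose.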

Re: [dev] reading an epub book with less: adventures in text processing

2024-03-11 Thread Greg Reagle
On Sat, Mar 9, 2024, at 11:33 AM, Hiltjo Posthuma wrote: > Maybe mupdf/mutools or the Ghostscript tools or qpdf? Yes, thank you for this excellent advice. I tried "mutool convert", but I am more satisfied with pandoc's output, for both text and html output (from epub).

Re: [dev] [sbase] Defining scope of sbase and ubase

2024-03-11 Thread Roberto E. Vargas Caballero
Hi, After reading the opinions of the people in this thread, I think the best option is to merge the sbase and ubase repositories and to have a mechanism in the build system to select the set of tools to be included in the build. The main drawback of this is that the build system will be more