Hi, Gabriel!

On Sat, Feb 21, 2026 at 4:37 AM Gabriel Ellsworth <[email protected]> wrote:
>
> Here is my situation.
>
> I am trying to typeset a new edition of a public-domain book.
> I have a PDF that contains a scanned copy of a 20th century printing of this
> book (about 700 pages).
> My output will contain a bit of LilyPond output, but music notation will not
> be “the main actor” (to borrow Lucas’s very apt phrase below). I estimate
> that the book will be 97% text and 3% LilyPond.
> Based on past helpful input from this list, I suspect that LaTeX will be the
> best way to create this book.
> I have never used LaTeX before.
> I know almost nothing about how OCR software or AI works on the back end.

Have you looked at Project Gutenberg to see if they have a copy of this book?

https://www.gutenberg.org/

Typically they will have plain text files of the books in their catalog,
which serve as a great start for getting the document into LaTeX.

You might also check out the Internet Archive:

https://archive.org/

If you can find your book in one of these resources, the OCR will already
have been done (and most likely proofread).

If you get a plain text file and want some help in converting it to LaTeX,
I can probably spend some time on a messaging platform helping you get
started with the process.

HTH,

Carl
