Re: OCR to Transcribe Text PDF in LaTeX

stalnakergjjr Sat, 21 Feb 2026 08:15:52 -0800
There is this blog post, more than 10 years old:
https://globalblindspot.blogspot.com/2012/06/rudimentary-pdf-to-latex-conversion-in.html?m=1

"There is only love, and then oblivion. Love is all we have to set against hatred." (paraphrased) Ian McEwan

-------- Original message --------
From: Carl Sorensen <[email protected]>
Date: 2/21/26 9:03 AM (GMT-06:00)
To: Gabriel Ellsworth <[email protected]>
Cc: Lilypond-User Mailing List <[email protected]>
Subject: Re: OCR to Transcribe Text PDF in LaTeX

Hi, Gabriel!

On Sat, Feb 21, 2026 at 4:37 AM Gabriel Ellsworth <[email protected]> wrote:
>
> Here is my situation.
>
> I am trying to typeset a new edition of a public-domain book.
> I have a PDF that contains a scanned copy of a 20th century printing of this book (about 700 pages).
> My output will contain a bit of LilyPond output, but music notation will not be “the main actor” (to borrow Lucas’s very apt phrase below). I estimate that the book will be 97% text and 3% LilyPond.
> Based on past helpful input from this list, I suspect that LaTeX will be the best way to create this book.
> I have never used LaTeX before.
> I know almost nothing about how OCR software or AI works on the back end.

Have you looked at Project Gutenberg to see if they have a copy of this book?

https://www.gutenberg.org/

Typically they will have plain text files of the books in their catalog, which serve as a great start for getting the document into LaTeX.

You might also check out the Internet Archive:

https://archive.org/

If you can find your book in one of these resources, the OCR will already have been done (and most likely proofread).

If you get a plain text file, and want some help in converting it to LaTeX, I can probably spend some time on a messaging platform helping you get started with the process.

HTH,

Carl