[OCR] Extract text layer, fix errors, re-import?

Gilles Thu, 29 Aug 2024 12:09:39 -0700

Hello,

I noticed some typos in the text layer that OCR added to a "bitmap" PDF, i.e. one whose pages are actually scanned images.

I first tried opening the EPUB generated by ABBYY FineReader, but LibreOffice couldn't open it at all. Sigil could, after showing an error message, but it lacks a French dictionary to spell-check the text (as far as I can tell).

As an alternative, pdftotext or mutool (convert) can extract the text layer from such a PDF, but can they put it back after I have fixed the typos?
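
For reference, the extraction step I have in mind looks roughly like the sketch below. It is only a minimal illustration: the function names and the file names scan.pdf / scan.txt are made up for the example, and it assumes pdftotext (Poppler) or mutool (MuPDF) is installed and on the PATH. It covers extraction only, not the re-import I am asking about.

```python
import shutil
import subprocess


def extraction_command(pdf_path, txt_path, tool="pdftotext"):
    """Build the command line that dumps a PDF's text layer to a text file."""
    if tool == "pdftotext":
        # -layout asks pdftotext to keep the physical layout of the page
        return ["pdftotext", "-layout", pdf_path, txt_path]
    if tool == "mutool":
        # -F text selects plain-text output, -o names the output file
        return ["mutool", "convert", "-F", "text", "-o", txt_path, pdf_path]
    raise ValueError(f"unsupported tool: {tool}")


def extract_text_layer(pdf_path, txt_path, tool="pdftotext"):
    """Run the extraction; raises if the tool is missing or exits with an error."""
    if shutil.which(tool) is None:
        raise FileNotFoundError(f"{tool} is not installed or not on PATH")
    subprocess.run(extraction_command(pdf_path, txt_path, tool), check=True)
```

Calling extract_text_layer("scan.pdf", "scan.txt") would write the OCR text to scan.txt, where the typos can be fixed in any editor. The corrected file is still just plain text at that point, detached from the page positions stored in the PDF.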


Thank you.
