Re: [tesseract-ocr] How to overlay hocr output on original scanned pdf.

2018-09-18 Thread Monica
est results.

On Mon, Sep 17, 2018 at 10:08 AM Shree Devi Kumar wrote:
> I think PDF creation adds a text layer only and there isn't an option to add hOCR to it.
>
> @jbreiden can confirm.
>
> On Mon, Sep 17, 2018 at 6:10 PM,

Re: [tesseract-ocr] How to overlay hocr output on original scanned pdf.

2018-09-17 Thread Monica
I have tried this, but it shows the default behaviour. I think the default output is being overlaid on the PDF instead of the hOCR output.

On Mon, Sep 17, 2018 at 5:47 PM Monica wrote:
> Thanks Zdenko for your response.
> will "tesseract scannedFile.png scanned.pdf -l eng hocr pdf"

Re: [tesseract-ocr] How to overlay hocr output on original scanned pdf.

2018-09-17 Thread Monica
;
>
> On Mon, Sep 17, 2018 at 14:12, monica kumari wrote:
>
>> For OCRing a scanned PDF, it is first converted to an image format and then OCRed, which gives a temporary file in PDF/text format that is overlaid on the original scanned PDF.
>> I want the output format to

[tesseract-ocr] How to overlay hocr output on original scanned pdf.

2018-09-17 Thread monica kumari
For OCRing a scanned PDF, it is first converted to an image format and then OCRed, which gives a temporary file in PDF/text format that is overlaid on the original scanned PDF. I want the output format to be hOCR. For this, I ran the command "convert scannedFile.pdf scannedFile.png" and then "tesseract
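
For readers following the thread, the two steps described in this message can be sketched as a small Python helper that only builds the command lines. This is an illustration under stated assumptions, not the poster's actual script: the helper name `build_commands` and the `_ocr` output suffix are hypothetical, the input is assumed to be a single-page PDF, and ImageMagick's `convert` and `tesseract` are assumed to be on the PATH. Listing both the `hocr` and `pdf` config names asks Tesseract to write an hOCR file and a searchable PDF from the same run; it does not embed the hOCR inside the PDF.

```python
import shlex
from pathlib import Path
from typing import List


def build_commands(pdf_path: str, lang: str = "eng") -> List[List[str]]:
    """Build the convert + tesseract commands for one scanned PDF.

    Hypothetical helper: it only constructs argument lists and runs nothing.
    """
    pdf = Path(pdf_path)
    png = pdf.with_suffix(".png")
    # Use a distinct output base so tesseract's <base>.pdf does not
    # overwrite the original scan (the "_ocr" suffix is an assumption).
    out_base = pdf.with_name(pdf.stem + "_ocr")
    return [
        # Step 1: rasterize the scanned PDF with ImageMagick.
        ["convert", str(pdf), str(png)],
        # Step 2: OCR the image; the trailing "hocr" and "pdf" config names
        # request <base>.hocr and <base>.pdf respectively.
        ["tesseract", str(png), str(out_base), "-l", lang, "hocr", "pdf"],
    ]


if __name__ == "__main__":
    for cmd in build_commands("scannedFile.pdf"):
        print(shlex.join(cmd))
```

To actually run the steps, each list could be passed to `subprocess.run(cmd, check=True)`. A multi-page PDF would need extra handling, since `convert` then writes one image per page.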