Re: using tesseract hocr output to create a searchable PDF

2012-09-28 Thread Jeffrey Ratcliffe
On 27 September 2012 22:17, Guido wrote:
> Did you find any other solution than using tesseract and pdfbeads? What are
> your experiences so far?

If you are using Linux, try gscan2pdf.

Regards
Jeff

-- 
You received this message because you are subscribed to the Google Groups "tesseract-ocr" gr

Re: using tesseract hocr output to create a searchable PDF

2012-09-27 Thread Lahiru Himash Madusanka
I'm using the Quick PDF Library in my app to create PDFs.

Re: using tesseract hocr output to create a searchable PDF

2012-09-27 Thread Guido
Did you find any other solution than using tesseract and pdfbeads? What are your experiences so far? I am currently looking for a solution to a similar problem: when I scan documents at university, the only option is to save them as PDF. Afterwards, at home, I'd like to convert those files into s

Re: using tesseract hocr output to create a searchable PDF

2011-12-03 Thread Carlos
zdenko,

Thanks for the reply.

> You did not specify a language, but in the case of Python

I am pretty agnostic about language as long as it can run via the CLI on Linux; the OCR process is on the backend.

In case anyone else runs across this: I am an OCR noob so the past few days have been pretty

Re: using tesseract hocr output to create a searchable PDF

2011-11-30 Thread zdenko podobny
Just a remark: in 2008 Mihail Radu Solcan posted two articles [1], [2] about adding text to DjVu files. I am not sure whether such tools exist for PDF. Anyway, he used a box file for this task (hOCR was not available).

You did not specify a language, but in the case of Python, try to have a

using tesseract hocr output to create a searchable PDF

2011-11-29 Thread Carlos
Tesseract 3.01
hocr2pdf 0.8.5

My project has been using Tesseract to OCR documents for some time and we are really happy with the results. We have recently been asked to offer the documents in our system as searchable PDFs. My initial attempt has been to create a searchable PDF using the hocr ou