I reinstalled with another Tesseract version (tesseract-ocr-setup-3.05.00dev) and it works wel...
On Thursday, May 16, 2019 at 14:43:48 UTC+2, shree wrote:
>
> I just tested once again on my installation in Ubuntu; it works fine. See
> attached.
>
> Question: does converting the multipage TIFF to txt, hocr, alto, or tsv
> process all pages? In other words, is the problem related only to PDF?
>
> Try to OCR the TIFF I have attached to see whether that works for you.
>
> On Thu, May 16, 2019 at 5:47 PM András Jeszenkovits <[email protected]> wrote:
>
>> I downloaded the installer from here
>> (https://github.com/UB-Mannheim/tesseract/wiki).
>> I tried both versions:
>>
>> - tesseract-ocr-w32-setup-v4.1.0.20190314 (rc1)
>>   <https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w32-setup-v4.1.0.20190314.exe>
>>   (32 bit) and
>> - tesseract-ocr-w64-setup-v4.1.0.20190314 (rc1)
>>   <https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w64-setup-v4.1.0.20190314.exe>
>>   (64 bit), respectively.
>>
>> I just scanned 2 pages of handwritten data, only for conversion, not for OCR.
>> You can download the file from here:
>> https://drive.google.com/open?id=1aJXPDAcK5aRt6PPAEgNJhzLTIlvsYP4b
>>
>> On Thursday, May 16, 2019 at 13:08:45 UTC+2, zdenop wrote:
>>>
>>> Tesseract NEVER has a problem with multipage TIFF.
>>> If you do not share the image file, you are alone with your problems.
>>> Nobody can help you.
>>>
>>> Zdenko
>>>
>>> On Thu, May 16, 2019 at 12:59, András Jeszenkovits <[email protected]> wrote:
>>>
>>>> I cannot send you the TIFF because it contains sensitive company data.
>>>> But I tried scanning another one, and the result is still a 1-page PDF.
>>>> I think something is wrong with my Tesseract version or installation.
>>>>
>>>> On Thursday, May 16, 2019 at 11:11:00 UTC+2, zdenop wrote:
>>>>>
>>>>> So provide your TIFF for reproducing the problem.
>>>>>
>>>>> Zdenko
>>>>>
>>>>> On Thu, May 16, 2019 at 11:06, András Jeszenkovits <[email protected]> wrote:
>>>>>
>>>>>> The OS is Windows 10. I use Tesseract OCR engine v4.0.0.20190314. I
>>>>>> tried English, Russian, and Hungarian. I tried the 32-bit and 64-bit
>>>>>> versions, and I tried a JPG file too, with the same result (1-page PDF).
>>>>>>
>>>>>> On Thursday, May 16, 2019 at 10:31:33 UTC+2, shree wrote:
>>>>>>>
>>>>>>> What is your version of Tesseract? Which OS?
>>>>>>>
>>>>>>> Have you tried it with just one language?
>>>>>>>
>>>>>>> On Thu, May 16, 2019 at 1:32 PM András Jeszenkovits <[email protected]> wrote:
>>>>>>>
>>>>>>>> I thought so too, but Tesseract creates a one-page PDF.
>>>>>>>>
>>>>>>>> On Wednesday, May 15, 2019 at 17:29:36 UTC+2, shree wrote:
>>>>>>>>>
>>>>>>>>> tesseract In\SPTest.tif Out\Test --psm 3 -l rus+eng pdf
>>>>>>>>>
>>>>>>>>> This should be enough to create a multipage PDF from a multipage
>>>>>>>>> TIFF.
>>>>>>>>>
>>>>>>>>> On Wed, May 15, 2019 at 7:27 PM András Jeszenkovits <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Here: tesseract In\SPTest.tif Out\Test --psm 3 -l rus+eng *-c tessedit_page_number=-1* pdf
>>>>>>>>>>
>>>>>>>>>> On Wednesday, May 15, 2019 at 15:51:31 UTC+2, zdenop wrote:
>>>>>>>>>>>
>>>>>>>>>>> Why are you using tessedit_page_number?
>>>>>>>>>>>
>>>>>>>>>>> Zdenko
>>>>>>>>>>>
>>>>>>>>>>> On Wed, May 15, 2019 at 15:43, András Jeszenkovits <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hello!
>>>>>>>>>>>>
>>>>>>>>>>>> Can you help me with this problem? I'm testing the Tesseract
>>>>>>>>>>>> OCR engine. The input is a scanned multipage TIFF file. I tried
>>>>>>>>>>>> to create a PDF from it, but the result is always one page.
>>>>>>>>>>>> I used this command line:
>>>>>>>>>>>> tesseract In\Test.tif Out\TestOutput -l rus+eng -c tessedit_page_number=-1 pdf
>>>>>>>>>>>> I found an option that is supposed to create a multipage PDF,
>>>>>>>>>>>> "-c tessedit_page_number=-1", but it doesn't work. I also tried
>>>>>>>>>>>> to get txt output, but it contained text from the first page only.
>>>>>>>>>>>> Can you help me with that?
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>>>> Google Groups "tesseract-ocr" group.
>>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from
>>>>>>>>>>>> it, send an email to [email protected].
>>>>>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>>>>>>>>> To view this discussion on the web visit
>>>>>>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/9ffedebc-c7b1-4856-bf31-f438d8213d01%40googlegroups.com
>>>>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> ____________________________________________________________
>>>>>>>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
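One check the thread keeps circling back to is whether the input file really contains more than one page before blaming the Tesseract build. Below is a minimal sketch of such a check, not anything taken from Tesseract itself: it walks the image file directory (IFD) chain of a classic TIFF and counts the entries. The function name `count_tiff_pages` is made up for this illustration, BigTIFF is not handled, and the count includes every directory in the main chain, so a file that stores thumbnails there may report more "pages" than a viewer shows.

```python
import struct
import sys


def count_tiff_pages(data: bytes) -> int:
    """Count the IFDs (pages) in a classic TIFF given as raw bytes.

    Hypothetical helper for illustration; BigTIFF (magic 43) is rejected.
    """
    if len(data) < 8:
        raise ValueError("too short to be a TIFF")
    # Bytes 0-1 give the byte order: "II" = little-endian, "MM" = big-endian.
    if data[:2] == b"II":
        endian = "<"
    elif data[:2] == b"MM":
        endian = ">"
    else:
        raise ValueError("not a TIFF: bad byte-order mark")
    # Bytes 2-3 hold the magic number 42, bytes 4-7 the offset of the first IFD.
    magic, offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF is not handled here)")

    pages = 0
    seen = set()
    while offset != 0:
        if offset in seen:
            raise ValueError("IFD chain loops back on itself")
        seen.add(offset)
        # Each IFD starts with a 2-byte entry count, followed by 12-byte
        # entries and a 4-byte offset of the next IFD (0 ends the chain).
        (n_entries,) = struct.unpack_from(endian + "H", data, offset)
        pages += 1
        (offset,) = struct.unpack_from(endian + "I", data, offset + 2 + 12 * n_entries)
    return pages


if __name__ == "__main__":
    # Usage: python count_tiff_pages.py In\Test.tif
    with open(sys.argv[1], "rb") as fh:
        print(count_tiff_pages(fh.read()))
```

If this reports 1 for the scanned file, the scanner software wrote a single-page TIFF and the one-page PDF is expected. If it reports 2 or more, the input is fine and the installation is the more likely culprit.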

