Re: change hocr-pdf image resolution

2014-02-06 Thread universal reseller
What's the difference? Does TIFF support changing the resolution? On Thu, Feb 6, 2014 at 3:56 AM, santiago sanven...@gmail.com wrote: You can change the format to TIFF. On 05/02/2014 18:31, peiman F. uniresel...@gmail.com wrote: Hello, for good OCR output quality the image resolution must be up to

Re: hocr2pdf and arabic language

2014-02-06 Thread Jeff Breidenbach
I've merged Nick White's bugfix into hocr-tools. Thank you, Nick. I expect most people will instead use the native PDF support built into Tesseract henceforth, and I intend to focus most of my time and energy there. However, there is still some use for hocr-pdf, especially when working with

Re: hocr2pdf and arabic language

2014-02-06 Thread universal reseller
Do you know the 3.03 release time? -- -- You received this message because you are subscribed to the Google Groups tesseract-ocr group. To post to this group, send email to tesseract-ocr@googlegroups.com To unsubscribe from this group, send email to tesseract-ocr+unsubscr...@googlegroups.com For

Re: hocr2pdf and arabic language

2014-02-06 Thread Jeff Breidenbach
As for Arabic and other right-to-left scripts, please try using the new native PDF capability in Tesseract instead. It is significantly more sophisticated and I think it should work correctly.

Re: hocr2pdf and arabic language

2014-02-06 Thread Jeff Breidenbach
I don't know, it is up to Ray. My guess is quite soon. In any case, I just ran it on your example images, noticed a small problem, and fixed it. Thank you for providing them. I should also mention that there is no need to convert your binary images to JPEG when using Tesseract's native PDF

jTessBoxEditor out of memory

2014-02-06 Thread peiman F.
Hi, I used jTessBoxEditor to train some new fonts for the Arabic language and got an out-of-memory error in JRE 7. I have 10 different TIFF files for each font, and my boxes have up to 34000 words. When I tried to make one trained file for all fonts, it crashed easily, so I tried to make .traindata file

Re: jTessBoxEditor out of memory

2014-02-06 Thread Jimmy O'Regan
On 06/02/2014, peiman F. uniresel...@gmail.com wrote: Hi, I used jTessBoxEditor to train some new fonts for the Arabic language and got an out-of-memory error in JRE 7. I have 10 different TIFF files for each font, and my boxes have up to 34000 words. When I tried to make one trained file for all