Re: Customising Tesseract for character recognition

2012-10-14 Thread zdenko podobny
On Sat, Oct 13, 2012 at 10:47 PM, JVIyer jawant...@gmail.com wrote: "A lot of times I have seen fairly good number plate images being OCRed inaccurately. This could possibly be due to the word recognition stage. Has anyone found a way to disable the dictionary / word recognition?" Saurabh,

Re: Customising Tesseract for character recognition

2012-10-13 Thread JVIyer
"A lot of times I have seen fairly good number plate images being OCRed inaccurately. This could possibly be due to the word recognition stage. Has anyone found a way to disable the dictionary / word recognition?" Saurabh, have you been able to accomplish this? Could you kindly share your

Re: Customising Tesseract for character recognition

2012-02-20 Thread Aruna Devi
By looking at the output I got. My image has 6 rows and 12 columns, but my output had 12 rows and 6 columns, and everything was read starting from the right (it should have started from the left). On Feb 17, 6:24 pm, Andres andrej...@gmail.com wrote: Just out of curiosity, how did you find that? 2012/2/17 Aruna Devi

Re: Customising Tesseract for character recognition

2012-02-17 Thread Andres
Just out of curiosity, how did you find that? 2012/2/17 Aruna Devi arunadevia...@gmail.com I also wanted to know how to make Tesseract read my image horizontally. I have an image consisting of 6 rows. After training, I found that my image is read from the right side (it should be from the left) and also

Re: Customising Tesseract for character recognition

2012-02-16 Thread Aruna Devi
I also wanted to know how to make Tesseract read my image horizontally. I have an image consisting of 6 rows. After training, I found that my image is read from the right side (it should be from the left), and also that it goes down by column rather than by row. How can I solve this issue?

Re: Customising Tesseract for character recognition

2011-12-14 Thread Prachi Joshi
How do I set all these variables?

Re: Customising Tesseract for character recognition

2011-03-14 Thread Jose
Hi Dmitry, thanks for the help! In the end, what I did was modify the function that returns the result so that it includes the top location of the bounding box. I then get the following result: <value>x</value><top>y</top> <value>x1</value><top>y1</top> <value>x2</value><top>y2</top>

Re: Customising Tesseract for character recognition

2011-03-14 Thread Jose
Yes, I got the information from the result! I only modified how the result method prints the result, nothing more of course! I got the information from the bounding box of the result. I'm not modifying it any deeper than that.

Re: Customising Tesseract for character recognition

2011-03-14 Thread Dmitry Silaev
Ehmm... I don't get it. If you've succeeded in using iterators, it's at your full disposal to format the output in any way you want programmatically, isn't it? Warm regards, Dmitry Silaev On Mon, Mar 14, 2011 at 1:56 PM, Jose diox...@gmail.com wrote: *I only modify how the result is

Re: Customising Tesseract for character recognition

2011-03-14 Thread Jose
In future that will be my preferred approach! For the time being I just need a fast and easy solution. I know it's not the most beautiful approach, but I have touched very little of the Tesseract framework, so as not to break anything. I was just short of time and it was easier for me to modify the

Re: Customising Tesseract for character recognition

2011-03-13 Thread Jose
Hi Dmitry, sorry for the delay. I produced some samples; could you take a look at them? Regards, jose

Re: Customising Tesseract for character recognition

2011-03-13 Thread patrickq
Tesseract 3.00 gets this text 100% correct, including the smudged numbers at the bottom. See: http://www.scanbizcards.com/plate1.jpg http://www.scanbizcards.com/plate2.jpg (scanning was done with ScanBizCards on an iPhone - if you try it yourself with the app on Android or iPhone, please disable

Re: Customising Tesseract for character recognition

2011-03-13 Thread Jose
Hi Patrick, yes, the results are correct, but the format of the results is not! That's my trouble.

Re: Customising Tesseract for character recognition

2011-03-13 Thread patrickq
You expect way too much from Tesseract: it's not Tesseract's job to slice and dice the text according to various organizational requirements of applications - that's for the application to handle. You can get the coordinates of all characters and easily determine which ones are in what you
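The application-side regrouping suggested here could be sketched roughly as follows. This is only an illustration, not Tesseract API: WordBox, GroupIntoRows and row_tolerance are hypothetical names, and the sketch assumes the caller has already collected each recognised word together with the left and top coordinates of its bounding box.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical record: one recognised word plus the top-left corner of its box.
struct WordBox {
    std::string text;
    int left;
    int top;
};

// Groups words into rows by vertical position, then orders each row left to right.
// A word joins the current row when its top edge lies within row_tolerance pixels
// of the top edge of the first (topmost) word in that row.
std::vector<std::vector<WordBox>> GroupIntoRows(std::vector<WordBox> words,
                                                int row_tolerance) {
    assert(row_tolerance >= 0);
    // Visit words from the top of the page to the bottom.
    std::sort(words.begin(), words.end(),
              [](const WordBox& a, const WordBox& b) { return a.top < b.top; });

    std::vector<std::vector<WordBox>> rows;
    for (const WordBox& w : words) {
        if (!rows.empty() &&
            std::abs(w.top - rows.back().front().top) <= row_tolerance) {
            rows.back().push_back(w);
        } else {
            rows.push_back(std::vector<WordBox>{w});
        }
    }
    // Within each row, restore left-to-right reading order.
    for (std::vector<WordBox>& row : rows) {
        std::sort(row.begin(), row.end(),
                  [](const WordBox& a, const WordBox& b) { return a.left < b.left; });
    }
    return rows;
}
```

Calling GroupIntoRows(words, 10) on boxes that were emitted column by column would return them row by row. The tolerance is a guess that has to be tuned to the line height of the actual images.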

Re: Customising Tesseract for character recognition

2011-03-13 Thread Dmitry Silaev
Jose, I ran Tesseract revision 549 from the command line under Windows with no special config and got a segmentation which is almost correct. What language file do you use? I used the following command line: tesseract 3.tiff test3 -l eng with no pageseg_mode (-psm argument) as well as with it,

Re: Customising Tesseract for character recognition

2011-02-24 Thread Dmitry Silaev
I don't know if it's affordable for you, but imho decent results can only be achieved if you do segmentation yourself and then pass image fragments to Tesseract on a word-by-word basis. Problems may appear when you have words that are too short, however, as I can see, it's not your case. Long

Re: Customising Tesseract for character recognition

2011-02-24 Thread Jose
Dmitry, the recognition works; the only problem is the way it is parsing it... :S I think segmenting the images would be too painful! I only want to change the order in which it is displayed, or get the bounding boxes so that I know the x and y of each recognized word and can thereby organise the results

Re: Customising Tesseract for character recognition

2011-02-24 Thread Dmitry Silaev
Unfortunately not only text output order can suffer from Tess's segmentation, but also extents of some text fragments can be identified incorrectly (say one segmented row can span over two real rows, probably in partial way), and that in turn can lead to *completely* irrelevant recognition

Re: Customising Tesseract for character recognition

2011-02-24 Thread Jose
In my particular case it is just that the first word of each column is in one font and the other is in another, so instead of reading column by column it reads all the columns of the first row and then all the columns of the second row! My god, this is really hard to explain in English. I get an

Re: Customising Tesseract for character recognition

2011-02-24 Thread Dmitry Silaev
The best way to explain everything would be just to send your source image examples, describe what information you want to get from them and provide the community with the code snippets you use to interface with Tess. And please be as detailed as possible. Warm regards, Dmitry Silaev On Thu,

Re: Customising Tesseract for character recognition

2011-02-24 Thread Jose
OK, I'll try to do that this afternoon. Thank you for the help. Regards, jose

Re: Customising Tesseract for character recognition

2011-02-21 Thread Jose
Is there no other workaround? If I reduce the size of the white space between WORD1 and WORD2, then it all works fine! This space is making the OCR think it's another column. Is there no other way? Splitting the image into as many rows does not look really efficient.

Re: Customising Tesseract for character recognition

2011-02-21 Thread Jose
This is what the JPG looks like (the white space between the two words is quite big):
WORD1      WORD2
WORD1      WORD2
WORD1      WORD2
WORD1      WORD2
WORD1      WORD2
WORD1      WORD2
WORD1      WORD2
and it reads like: WORD1 WORD1 WORD1 WORD1 WORD1 WORD1 WORD2 WORD2 WORD2 WORD2 WORD2 WORD2 WORD2
any help would be

Re: Customising Tesseract for character recognition

2011-02-21 Thread Jose
Ok I'm recompiling now... I'll let you know when it's done! thanks for the help anyway :)

Re: Customising Tesseract for character recognition

2011-02-21 Thread Jose Granja
Hi, do you know how to force the page layout to be recognised as horizontal? That is exactly my issue! You'll make me the happiest person on earth. On 17 Feb 2011, at 04:48, Saurabh Gandhi saurabh...@gmail.com wrote: Hello everyone, I am currently using tesseract 3.x for license plate

Re: Customising Tesseract for character recognition

2011-02-18 Thread Sriranga(78yrsold)
"Customise the tesseract engine to recognize only the characters from A-Z, 0-9, . (dot), (space) by setting the character white-list" Kindly tell me the name of the folder in which the whitelist and blacklist are located. I want to use the same for Kannada scripts. -sriranga(78yrs) On

Re: Customising Tesseract for character recognition

2011-02-18 Thread Saurabh Gandhi
You can simply use this in your program just after init to set whitelist / blacklist: api.Init(argv[0], lang, &(argv[arg]), argc-arg, false); api.SetVariable("tessedit_char_whitelist", "ABCDEFGHIJKLMNOPQRSTUVWXYZ.0123456789"); -- Regards, Saurabh Gandhi On
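For command-line use, the same variable can usually be set without recompiling by putting it in a config file. A minimal sketch, assuming Tesseract 3.x and a hypothetical config file named plates placed in the tessdata/configs folder (the file name is made up for illustration; each line is a variable name followed by its value):

```
tessedit_char_whitelist ABCDEFGHIJKLMNOPQRSTUVWXYZ.0123456789
```

The config name would then be passed as the last command-line argument, for example: tesseract plate.tif out -l eng plates (plate.tif and out are placeholder names). A blacklist would be set the same way with tessedit_char_blacklist.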

Re: Customising Tesseract for character recognition

2011-02-18 Thread Saurabh Gandhi
Hello Jose, Setting the mode to PSM_SINGLE_BLOCK or PSM_SINGLE_LINE will not force horizontal reading. These modes will just assume that your input image itself is segmented and consists of just a single line. So, if you want horizontal reading you will have to segment your image and provide it

Re: Customising Tesseract for character recognition

2011-02-18 Thread Saurabh Gandhi
Did you try PSM_SINGLE_COLUMN? I think that is what you need. Could you try this and let us know how it behaves, please? PSM_SINGLE_COLUMN, /// Assume a single column of text of variable sizes. -- Regards, Saurabh Gandhi On Fri, Feb 18, 2011 at 4:29 PM, Jose diox...@gmail.com wrote: Is

Re: Customising Tesseract for character recognition

2011-02-18 Thread Saurabh Gandhi
Yes, that's right. -- Regards, Saurabh Gandhi On Fri, Feb 18, 2011 at 4:57 PM, Jose diox...@gmail.com wrote: OK, I'll try that! I have to modify this in tesseractmain.cpp, right? (I'm using command line execution.) I replace this line: api.SetPageSegMode(tesseract::PSM_AUTO); for