Hi Patrick,
Have you confirmed that this works, i.e. that it produces different output
for different "Page seg mode" values?
I tried several options but got the same output each time. I used a scan
of a 4-column magazine page as the input file.
Maybe I did something wrong, or maybe I do not understand what the result
should be...
I created a new config file (/usr/local/share/tessdata/tessconfigs/PSM)
containing the line:
tessedit_pageseg_mode 3
and then I ran:
$ /usr/local/bin/tesseract multicolumn.tif ouput_3 PSM
tesseract accepted the config file. (If I replace "3" with "PSM_SINGLE_LINE",
tesseract complains "variable not found: tessedit_pageseg_mode". The value
must be a number, as explained in ccmain/tesseractclass.cpp:
"Page seg mode: 0=auto, 1=col, 2=block, 3=line, 4=word, 6=char")
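To make the numeric requirement concrete, here is a minimal C++ sketch that pre-checks such a config line before it is handed to tesseract. Both helper names (PageSegModeName, ParsePageSegModeLine) are hypothetical and not part of tesseract, and this is not how tesseract itself parses config files. The number-to-name table is copied as-is from the help string quoted above, including "6=char".

```cpp
#include <sstream>
#include <string>

// Hypothetical helper, not part of tesseract: maps a mode number to the
// short name given in the help string from ccmain/tesseractclass.cpp.
// Returns nullptr for numbers the help string does not list.
inline const char* PageSegModeName(int mode) {
  switch (mode) {
    case 0: return "auto";
    case 1: return "col";
    case 2: return "block";
    case 3: return "line";
    case 4: return "word";
    case 6: return "char";  // copied from the help string as quoted
    default: return nullptr;
  }
}

// Hypothetical helper: reads one "name value" config line and stores the
// mode in *mode if the line sets tessedit_pageseg_mode to a whole number.
// Returns false for any other variable name or a non-numeric value.
inline bool ParsePageSegModeLine(const std::string& line, int* mode) {
  std::istringstream in(line);
  std::string name, value;
  if (!(in >> name >> value)) return false;
  if (name != "tessedit_pageseg_mode") return false;
  if (value.empty() || value.size() > 3) return false;
  for (char c : value) {
    if (c < '0' || c > '9') return false;  // e.g. "PSM_SINGLE_LINE"
  }
  *mode = std::stoi(value);
  return true;
}
```

With this check, "tessedit_pageseg_mode 3" is accepted and maps to "line", while "tessedit_pageseg_mode PSM_SINGLE_LINE" is rejected before tesseract ever sees it.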
When I add another line to the config file:
tessedit_dump_pageseg_images true
it produces images identical to the input image, even when I use different
"Page seg mode" values. (I expected it to create different images for
different modes.)
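To check objectively whether two runs really produced the same result, the output files can be compared byte by byte. A minimal C++ sketch follows; FilesIdentical is a hypothetical helper name, and the file names in the usage note are only examples.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: returns true only if both files can be opened and
// contain exactly the same bytes.
inline bool FilesIdentical(const std::string& path_a,
                           const std::string& path_b) {
  std::ifstream a(path_a, std::ios::binary);
  std::ifstream b(path_b, std::ios::binary);
  if (!a || !b) return false;  // a missing file counts as "not identical"
  std::istreambuf_iterator<char> it_a(a), it_b(b), end;
  while (it_a != end && it_b != end) {
    if (*it_a != *it_b) return false;
    ++it_a;
    ++it_b;
  }
  // Identical only if both files ended at the same position.
  return it_a == end && it_b == end;
}
```

For example, FilesIdentical("ouput_3.txt", "ouput_1.txt") would compare the text from the run above with the text from a second run made with mode 1 (the second file name is assumed here). The same call works for the dumped page segmentation images.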
Zd.
On 28.04.2010 17:08, patrickq wrote:
> Hi all,
>
> It stands to reason that you can achieve what you want by setting the
> segmentation mode. This is how we use that setting:
> myTess->SetPageSegMode(tesseract::PSM_AUTO);
>
> We use PSM_AUTO in our iPhone app (ScanBizCards) but for small images
> perhaps using another mode will achieve what you need. Here is the
> list of options:
>
> PSM_AUTO,           // Fully automatic page segmentation.
> PSM_SINGLE_COLUMN,  // Assume a single column of text of variable sizes.
> PSM_SINGLE_BLOCK,   // Assume a single uniform block of text. (Default.)
> PSM_SINGLE_LINE,    // Treat the image as a single text line.
> PSM_SINGLE_WORD,    // Treat the image as a single word.
> PSM_SINGLE_CHAR,    // Treat the image as a single character.
>
> Patrick
>
> On Apr 28, 9:56 am, zdenko podobny <[email protected]> wrote:
>
>> If you find out how to turn it off, please share this info ;-)
>>
>> Zd.
>>
>>
>>
>> On Sun, Apr 25, 2010 at 5:43 PM, Jan <[email protected]> wrote:
>>
>>> Thanks for the info, then I will try to change it in
>>> tesseractmain.cpp.
>>>
>>
>>> Jan
>>>
>>
>>> On 23 Apr., 09:38, zdenko podobny <[email protected]> wrote:
>>>
>>>> Hello,
>>>>
>>
>>>> http://code.google.com/p/tesseract-ocr/wiki/ReadMe, section Installation
>>>> Notes - 3.00 Prerelease:
>>>> In the executable, page layout analysis is enabled by default. You may
>>>> need to turn it off to process small images. No command-line control for this
>>>> yet. Sorry. See tesseractmain.cpp.
>>>>
>>
>>>> Zd.
>>>>
>>
>>>> On Wed, Apr 21, 2010 at 10:08 AM, Jan <[email protected]> wrote:
>>>>
>>>>> Hello,
>>>>> is it possible to use tesseract 3.0 without page layout analysis, or
>>>>> in one-column mode?
>>>>> Especially when using tesseract.exe?
>>>>> Thanks!!
>>>>>
>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google Groups
>>>>> "tesseract-ocr" group.
>>>>> To post to this group, send email to [email protected].
>>>>> To unsubscribe from this group, send email to
>>>>> [email protected].
>>>>> For more options, visit this group at
>>>>> http://groups.google.com/group/tesseract-ocr?hl=en.
>>>>>
>

