Re: Provide/visualize baseline info?

2012-06-21 Thread Dmitri Silaev
Yes, the time has come to reveal it ))

In short:

   - Install ScrollView
   - Run Tess with the "inter" and "segdemo" configs
   - Choose Modes->Show BL Norm Word from the main menu
   - Click a word in the image view. A new window opens
   - Pan (with scrollbars) and zoom (with mouse wheel) to see baselines and 
   outlines
   - Keep clicking words in the main image view to see their baselines
   
For details please refer to 
http://rdaemons.blogspot.com/2012/06/tesseract-ocr-interactive-debugging.html
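
As a rough illustration of the second step, here is a minimal Python sketch that only assembles the command line. The image name, output base and config order are taken from the command quoted elsewhere in this thread ("tesseract phototest.tif test1 segdemo inter"); the helper name scrollview_command is hypothetical, and nothing is executed.

```python
import shlex


def scrollview_command(image, output_base, configs=("segdemo", "inter")):
    """Build the argument list for an interactive Tesseract run.

    The config names come from this thread; the function itself is
    only an illustration and is not part of Tesseract.
    """
    return ["tesseract", image, output_base, *configs]


cmd = scrollview_command("phototest.tif", "test1")
# prints: tesseract phototest.tif test1 segdemo inter
print(shlex.join(cmd))
```

Passing configs=("matdemo", "inter") instead gives the second command line mentioned in the thread.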

Warm regards,
Dmitri Silaev
www.CustomOCR.com


On Thursday, June 21, 2012 10:28:13 AM UTC+4, uni wrote:
>
> Hi, I wanted to know: what is the procedure for running debug mode in
> tesseract 3.01?
> Also, were you able to find the answer to visualizing baselines?
> Please help!
>
> Awaiting your reply
> Uni
>
>

-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to tesseract-ocr@googlegroups.com
To unsubscribe from this group, send email to
tesseract-ocr+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en


Re: Provide/visualize baseline info?

2012-06-21 Thread uni
Hi, I wanted to know: what is the procedure for running debug mode in
tesseract 3.01?
Also, were you able to find the answer to visualizing baselines?
Please help!

Awaiting your reply
Uni


Re: Provide/visualize baseline info?

2011-02-08 Thread Dmitry Silaev
Sriranga,

I'm glad you've succeeded.

Thanks to Zdenko for his guiding thought. I was aware of this debugging
capability of Tess but, strangely, kept overlooking it.

I think the empty "test1.txt" is mostly normal. I noticed it too, but as far
as I can remember there were times when the file was filled with recognized
text. It probably depends on the actions you perform during the ScrollView
session, or on their specific sequence; I really have no time to
investigate.

I'll probably publish more tutorials on what can be done using ScrollView.
But for my own needs this tool adds almost no value ((

*** I'm still seeking somebody's help regarding this topic's subject. ***

Warm regards,
Dmitry Silaev





Re: Provide/visualize baseline info?

2011-02-08 Thread Sriranga(78yrsold)
Dmitry,

Congratulations! I successfully installed it on WinXP and tried it with
phototest.tif:
1st command line, "tesseract phototest.tif test1 segdemo inter", works well.
2nd command line, "tesseract phototest.tif test1 matdemo inter", works well.
However, the output file test1.txt is zero KB. Where did I make a mistake?
I have just tested Kannada script as well as Khem script: both displayed
correctly, and only the text file contains 0 KB.

It would be nice if screenshots of Modes, Display and the other menus were
reproduced for the benefit of newbies/users; this would help them view their
own language, not just English. It really is a boon to newbies like me.

It would be better if the forum owner published an extract of your blog in
the wiki section for the benefit of forum users.

With Warmest Regards,
-sriranga(78yrs)


Re: Provide/visualize baseline info?

2011-02-06 Thread Dmitry Silaev
Here are brief instructions on how to set up the Tesseract interactive
debug environment (ScrollView) on Windows:

   1. Make sure you have the Java Runtime Environment installed
   2. Download my home-brewed, single-archive installation suite from
   http://www.4shared.com/get/Z4gnbJdP/tess_debug.html
   3. Unpack the installation suite
   4. Run cmd.exe
   5. Change the working directory to where you've unpacked the installation
   suite
   6. Follow the instructions in
   http://code.google.com/p/tesseract-ocr/wiki/ViewerDebugging to run
   Tesseract+ScrollView from the command line

To keep this forum post to a reasonable size here in Google Groups, I placed
the more verbose and overall nicer-looking instructions in my blog at
http://rdaemons.blogspot.com/2011/02/tesseract-ocr-setting-up-interactive.html
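
Before step 6 it may help to confirm that the prerequisites are reachable. The sketch below is only an assumption-laden illustration in Python: it assumes that both "java" and "tesseract" are expected on PATH, which the instructions above do not state (the unpacked suite may instead be run from its own directory), and missing_tools is a hypothetical helper name.

```python
import shutil


def missing_tools(tools=("java", "tesseract")):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("java and tesseract were both found on PATH")
```

If tesseract is reported missing but exists in the unpacked directory, running the commands from that directory (step 5) is the intended route.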

Warm regards,
Dmitry Silaev



Re: Provide/visualize baseline info?

2011-02-05 Thread Sriranga(78yrsold)
Dear Dmitry,
Though it may or may not help me much, at least it will benefit users of
tesseract-ocr, for which forum users and newbies will be thankful to you.
With Warmest Regards,
-sriranga(78yrs)




Re: Provide/visualize baseline info?

2011-02-05 Thread Dmitry Silaev
Dear Sriranga,

I've just managed to start Tess's interactive visualizer. I don't really
know if it will help you much, but I can publish step-by-step instructions
on how to make it work. At least these instructions may help some of the
Tess community newbies. Most likely, I'll be able to publish them within
the next 24 hours.

However, it's not a workable solution for me. I am still in desperate need
to know if I can provide Tess with my own baseline info using some
high-level structures and methods. I'd welcome whatever information you may
have on this subject.

Warm regards,
Dmitry Silaev







Re: Provide/visualize baseline info?

2011-02-05 Thread Sriranga(78yrsold)
I tried to install on WinXP but failed. An extract of the cmd session is
reproduced below; please advise.
C:\>set JAVA_HOME=C:\jdk1.4

C:\>.\build.bat all (win32)
'.\build.bat' is not recognized as an internal or external command,
operable program or batch file.

C:\>
C:\>j:

J:\tesseract-ocr-3.01alpha-r527\java>.\build.bat all (win32)
Piccolo Build System
---
Building with classpath
C:\jdk1.4\lib\tools.jar;.\lib\ant.jar;.\lib\junit.jar;
Starting Ant...
The system cannot find the path specified.

J:\tesseract-ocr-3.01alpha-r527\java>
J:\tesseract-ocr-3.01alpha-r527\java>

Could you kindly tell me where I made a mistake?
With warmest regards,
-sriranga(78yrs)
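
The transcript above ends with "The system cannot find the path specified", and the thread never establishes the cause. One thing worth ruling out, purely as an assumption, is that JAVA_HOME (set to C:\jdk1.4 above) does not point at an existing JDK directory containing lib\tools.jar, the first classpath entry the build prints. A minimal Python sketch of that check follows; check_java_home is a hypothetical helper and is not part of Tesseract or the Piccolo build.

```python
import os


def check_java_home(java_home):
    """Return a list of problems found with a JAVA_HOME candidate."""
    problems = []
    if not java_home:
        problems.append("JAVA_HOME is not set")
        return problems
    if not os.path.isdir(java_home):
        problems.append(f"{java_home} is not an existing directory")
        return problems
    # tools.jar is the first classpath entry shown in the build output above.
    tools_jar = os.path.join(java_home, "lib", "tools.jar")
    if not os.path.isfile(tools_jar):
        problems.append(f"{tools_jar} is missing")
    return problems


for problem in check_java_home(os.environ.get("JAVA_HOME")):
    print(problem)
```

An empty result only means this particular guess is ruled out; the build script may be failing on some other path.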




Re: Provide/visualize baseline info?

2011-02-05 Thread Sriranga(78yrsold)
As per the wiki instructions on debug mode, on Windows the build process for
building ScrollView.jar is not defined. Instead, one copies piccolo-1.2.jar
and piccolox-1.2.jar to tesseract/java, which appears to be prescribed for
tesseract 2.04.
Will copying piccolo-1.2.jar and piccolox-1.2.jar to the tesseract/java
folder of tesseract 3.01alpha (r527) also work? For this purpose, does
picolo.java1.2 (compiled source, 4.3 MB) have to be downloaded for WinXP?
Kindly confirm, since I am not a programmer/developer.
With Regards,
-Sriranga(78yrs)





Re: Provide/visualize baseline info?

2011-02-05 Thread Zdenko Podobný
I am not sure if it helps you, but did you try debug mode
(http://code.google.com/p/tesseract-ocr/wiki/ViewerDebugging)?


Zd.


On 05.02.2011 01:33, daemon-s wrote:

Hi!

I train Tess using separate images for every text line. Recognition is
also run over single text line images. Recognition performs pretty
well; however, there are many errors that, I believe, are related to
misdetected baselines, whether during training or recognition I don't know.
These include:

" (double quote) detected as n
S detected as s (and vice versa)
V detected as v (and vice versa)
etc.

Is there any (preferably high-level) way to provide Tess with baseline
info? Or at least obtain baseline info from Tess in order to visualize
it further for debugging?

Thanks,
Dmitry


