Re: [GSoC 2014] Optical Character Recognition project - Introduction

2014-03-24 Thread DImuthu Upeksha
Hi John, I looked at the processTextPosition method in PDFTextStripper, but I couldn't understand what actually happens inside it. What should the input to that method be? In my case I have words with their bounding-box coordinates. How can I make that data compatible with the input of

[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread brijesh (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944868#comment-13944868 ] brijesh commented on PDFBOX-1994: - Hi, while testing the .jar, I am getting an exception,

[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread Timo Boehme (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944880#comment-13944880 ] Timo Boehme commented on PDFBOX-1994: - Looks like you are missing a number of

[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944897#comment-13944897 ] Tilman Hausherr commented on PDFBOX-1994: - Btw, the 2nd param of loadNonSeq can be

[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread brijesh (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944913#comment-13944913 ] brijesh commented on PDFBOX-1994: - Hi, tested with PDDocument doc =

[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944924#comment-13944924 ] Tilman Hausherr commented on PDFBOX-1994: - Could you add a writeln before and

[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread brijesh (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944932#comment-13944932 ] brijesh commented on PDFBOX-1994: - Yes, added println statements in between. That is why I

[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread brijesh (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944946#comment-13944946 ] brijesh commented on PDFBOX-1994: - Hi, I am using PDFBox 1.8 and Java 1.4 locally, it

[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944983#comment-13944983 ] Tilman Hausherr commented on PDFBOX-1994: - The problem is that as long as you

[jira] [Created] (PDFBOX-1996) PDSeparation separtion optimization

2014-03-24 Thread Dave Smith (JIRA)
Dave Smith created PDFBOX-1996: -- Summary: PDSeparation separtion optimization Key: PDFBOX-1996 URL: https://issues.apache.org/jira/browse/PDFBOX-1996 Project: PDFBox Issue Type: Improvement

[jira] [Updated] (PDFBOX-1996) PDSeparation separtion optimization

2014-03-24 Thread Dave Smith (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Smith updated PDFBOX-1996: --- Attachment: pdfbox.patch Patch that caches black and white values PDSeparation separtion

[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945386#comment-13945386 ] Tilman Hausherr commented on PDFBOX-1994: - What happens if you use the PDF app

[jira] [Updated] (PDFBOX-1997) CIE LAB item missing in rendering

2014-03-24 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-1997: Summary: CIE LAB item missing in rendering (was: CIA LAB item missing in rendering)

[jira] [Updated] (PDFBOX-1996) PDSeparation separation optimization

2014-03-24 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-1996: Summary: PDSeparation separation optimization (was: PDSeparation separtion optimization)

[jira] [Updated] (PDFBOX-1996) PDSeparation optimization

2014-03-24 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-1996: Summary: PDSeparation optimization (was: PDSeparation separation optimization)

[jira] [Commented] (PDFBOX-1996) PDSeparation optimization

2014-03-24 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945463#comment-13945463 ] Tilman Hausherr commented on PDFBOX-1996: - While I'm not the one who will commit

[jira] [Created] (PDFBOX-1998) PDF rendering with reversed colors

2014-03-24 Thread Tilman Hausherr (JIRA)
Tilman Hausherr created PDFBOX-1998: --- Summary: PDF rendering with reversed colors Key: PDFBOX-1998 URL: https://issues.apache.org/jira/browse/PDFBOX-1998 Project: PDFBox Issue Type: Bug

[jira] [Updated] (PDFBOX-1998) PDF rendering with reversed colors

2014-03-24 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-1998: Attachment: PDFBOX-1998.PDF PDF rendering with reversed colors

[jira] [Commented] (PDFBOX-1998) PDF rendering with reversed colors

2014-03-24 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945578#comment-13945578 ] Tilman Hausherr commented on PDFBOX-1998: - What I did notice: the PDF has this:

[jira] [Commented] (PDFBOX-1998) PDF rendering with reversed colors

2014-03-24 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945764#comment-13945764 ] Tilman Hausherr commented on PDFBOX-1998: - Heh heh :-) I found this in

[jira] [Comment Edited] (PDFBOX-1998) PDF rendering with reversed colors

2014-03-24 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945764#comment-13945764 ] Tilman Hausherr edited comment on PDFBOX-1998 at 3/24/14 11:08 PM:

[jira] [Commented] (PDFBOX-1996) PDSeparation optimization

2014-03-24 Thread John Hewson (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945852#comment-13945852 ] John Hewson commented on PDFBOX-1996: - Which type of function does your PDF use for

[jira] [Updated] (PDFBOX-1996) PDSeparation optimization

2014-03-24 Thread John Hewson (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Hewson updated PDFBOX-1996: Priority: Minor (was: Major) PDSeparation optimization -

[jira] [Commented] (PDFBOX-1998) PDF rendering with reversed colors

2014-03-24 Thread John Hewson (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945856#comment-13945856 ] John Hewson commented on PDFBOX-1998: - Looks good to me, the ImageMask is on the

[jira] [Updated] (PDFBOX-1999) JBIG2Filter - FlateDecoded Globals Table

2014-03-24 Thread Dave Smith (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Smith updated PDFBOX-1999: --- Attachment: pdfbox.patch Here is the fix... JBIG2Filter - FlateDecoded Globals Table

[jira] [Created] (PDFBOX-1999) JBIG2Filter - FlateDecoded Globals Table

2014-03-24 Thread Dave Smith (JIRA)
Dave Smith created PDFBOX-1999: -- Summary: JBIG2Filter - FlateDecoded Globals Table Key: PDFBOX-1999 URL: https://issues.apache.org/jira/browse/PDFBOX-1999 Project: PDFBox Issue Type: Bug

[jira] [Commented] (PDFBOX-1996) PDSeparation optimization

2014-03-24 Thread Dave Smith (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13946020#comment-13946020 ] Dave Smith commented on PDFBOX-1996: The pdf is not public. I can send it to you off