[jira] [Commented] (PDFBOX-2012) Extend CMAPEncodingEntry API

2014-04-02 Thread Philip Helger (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958578#comment-13958578 ] Philip Helger commented on PDFBOX-2012: --- These methods would be helpful to determin

[jira] [Commented] (PDFBOX-2013) Please extend PDTrueTypeFont API

2014-04-02 Thread Philip Helger (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958573#comment-13958573 ] Philip Helger commented on PDFBOX-2013: --- I'm trying to get the Unicode Text thingy don

[jira] [Commented] (PDFBOX-2002) Show deprecation in the build / fix deprecated calls / delete longtime deprecated stuff

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958498#comment-13958498 ] Tilman Hausherr commented on PDFBOX-2002: - Same for PDFTextStripper, deleted 2 me

[jira] [Commented] (PDFBOX-2002) Show deprecation in the build / fix deprecated calls / delete longtime deprecated stuff

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958495#comment-13958495 ] Tilman Hausherr commented on PDFBOX-2002: - Same for COSDictionary, 1 method has b

Re: PDFTextPositions

2014-04-02 Thread Sireesha Chilakamarri
Thank you, Alin. I appreciate your response. If you could help with some sample code when you are free, I might get a better idea of your explanation. Sireesha On Wed, Apr 2, 2014 at 4:31 PM, Alin Mazilu wrote: > Not that I know of. PDFBox provides mostly low level access to the PDF > form

Re: PDFTextPositions

2014-04-02 Thread Alin Mazilu
Not that I know of. PDFBox provides mostly low-level access to the PDF format. The only relatively easy way to do it would be to keep the TextPosition objects and also grab the text output of the PDFTextStripper. Then you can search the output (a String) for the position of the word you are looking fo
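
The approach described above (keep the per-character positions, search the extracted text as a String, then map each match back to its characters) can be sketched roughly as follows. This is only an illustration under stated assumptions, not PDFBox code: Glyph is a hypothetical stand-in for PDFBox's TextPosition, Box and WordBoxFinder are made-up names, and the sketch assumes exactly one glyph per character of the searched text. Real PDFTextStripper output also contains separators (spaces, line breaks) that have no TextPosition, so the index mapping would need adjusting there.

```java
import java.util.ArrayList;
import java.util.List;

public class WordBoxFinder {

    /** Hypothetical stand-in for PDFBox's TextPosition: one character and its box. */
    public static final class Glyph {
        public final char ch;
        public final float x, y, width, height;

        public Glyph(char ch, float x, float y, float width, float height) {
            this.ch = ch;
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    /** Hypothetical result type: bounding box of one matched word. */
    public static final class Box {
        public final float x, y, width, height;

        public Box(float x, float y, float width, float height) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    /** Returns one bounding box per occurrence of word in the glyph sequence. */
    public static List<Box> findWord(List<Glyph> glyphs, String word) {
        List<Box> hits = new ArrayList<>();
        if (word.isEmpty()) {
            return hits;
        }
        // Build the searchable text; index i of the text is glyph i (assumption).
        StringBuilder sb = new StringBuilder(glyphs.size());
        for (Glyph g : glyphs) {
            sb.append(g.ch);
        }
        String text = sb.toString();

        int from = 0;
        int idx;
        while ((idx = text.indexOf(word, from)) >= 0) {
            hits.add(union(glyphs.subList(idx, idx + word.length())));
            from = idx + 1; // continue the search after this match start
        }
        return hits;
    }

    /** Smallest box that encloses all glyphs in the run. */
    static Box union(List<Glyph> run) {
        float minX = Float.MAX_VALUE, minY = Float.MAX_VALUE;
        float maxX = -Float.MAX_VALUE, maxY = -Float.MAX_VALUE;
        for (Glyph g : run) {
            minX = Math.min(minX, g.x);
            minY = Math.min(minY, g.y);
            maxX = Math.max(maxX, g.x + g.width);
            maxY = Math.max(maxY, g.y + g.height);
        }
        return new Box(minX, minY, maxX - minX, maxY - minY);
    }
}
```

In a real PDFBox setup, the Glyph list would presumably be filled per page from the TextPosition objects collected by a PDFTextStripper subclass, with findWord called once per page so that each hit can be reported together with its page number.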

Re: PDFTextPositions

2014-04-02 Thread Sireesha Chilakamarri
Hi Alin, I am able to run the PrintTextLocations example. This gives me the location details for every character. Is there an easier way to get the coordinates of a word as a whole, instead of all its characters? To search for text, I used a method described in http://www.programming-free.com/20

Re: PDFTextPositions

2014-04-02 Thread Alin Mazilu
You have to extend the PDFTextStripper class and override the processTextPosition(...) method. From there the logic depends on you. You can also override the writePage() method to grab the charactersByArticle Vector and then you would look for your words in there by iterating over it. Basically in

PDFTextPositions

2014-04-02 Thread Sireesha Chilakamarri
Hi, I would like to search for a piece of text and obtain its position (X/Y/width/height). Suppose the text "Hello_World" appears at different locations and on different pages of the PDF document; I would like to see its X/Y/width/height for every occurrence. How do I achieve this? Thank you, Sire

[jira] [Commented] (PDFBOX-2002) Show deprecation in the build / fix deprecated calls / delete longtime deprecated stuff

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958246#comment-13958246 ] Tilman Hausherr commented on PDFBOX-2002: - Same for COSDictionary, 2 methods were

[jira] [Commented] (PDFBOX-2002) Show deprecation in the build / fix deprecated calls / delete longtime deprecated stuff

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958241#comment-13958241 ] Tilman Hausherr commented on PDFBOX-2002: - The methods in COSInteger have been de

[jira] [Updated] (PDFBOX-2002) Show deprecation in the build / fix deprecated calls / delete longtime deprecated stuff

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-2002: Summary: Show deprecation in the build / fix deprecated calls / delete longtime deprecated

[jira] [Commented] (PDFBOX-2013) Please extend PDTrueTypeFont API

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958133#comment-13958133 ] Tilman Hausherr commented on PDFBOX-2013: - It would be helpful if you'd write a f

Jenkins build is back to normal : PDFBox 1.8.x » Apache PDFBox #104

2014-04-02 Thread Apache Jenkins Server
See

Jenkins build is back to normal : PDFBox 1.8.x #104

2014-04-02 Thread Apache Jenkins Server
See

[jira] [Commented] (PDFBOX-1975) Improve TestImageIOUtils unit tests to check image resolution and compression

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958086#comment-13958086 ] Tilman Hausherr commented on PDFBOX-1975: - I added an extra test for ImageIOUtil.

[jira] [Updated] (PDFBOX-2013) Please extend PDTrueTypeFont API

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-2013: Issue Type: Wish (was: Bug) > Please extend PDTrueTypeFont API >

[jira] [Updated] (PDFBOX-2011) Please extend base class "Encoding" with 2 methods to access global name2char and char2name maps

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-2011: Issue Type: Wish (was: Bug) > Please extend base class "Encoding" with 2 methods to acces

[jira] [Updated] (PDFBOX-2012) Extend CMAPEncodingEntry API

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-2012: Issue Type: Wish (was: Bug) > Extend CMAPEncodingEntry API >

[jira] [Updated] (PDFBOX-2010) Please make "protected PDFont getDescendantFont()" public as it is in 2.0.0

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-2010: Issue Type: Wish (was: Bug) > Please make "protected PDFont getDescendantFont()" public a

Re: Apache PDFBox April 2014 board report due

2014-04-02 Thread DImuthu Upeksha
Hi all, Regarding GSoC project idea https://issues.apache.org/jira/browse/PDFBOX-1912 Proposal : [1] Currently implemented parts - JNI wrapper for Tesseract C++ OCR API [2] OCR plugin for PDFBox [3] [1] https://www.dropbox.com/s/z63xhmii4hdtivx/AbstractApachePDFBox-OpticalCharacterRecognition.p

[jira] [Comment Edited] (PDFBOX-2010) Please make "protected PDFont getDescendantFont()" public as it is in 2.0.0

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958026#comment-13958026 ] Tilman Hausherr edited comment on PDFBOX-2010 at 4/2/14 6:50 PM: --

[jira] [Resolved] (PDFBOX-2010) Please make "protected PDFont getDescendantFont()" public as it is in 2.0.0

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr resolved PDFBOX-2010. - Resolution: Fixed Fix Version/s: 1.8.5 Assignee: Tilman Hausherr Since i

[jira] [Comment Edited] (PDFBOX-1975) Improve TestImageIOUtils unit tests to check image resolution and compression

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13957942#comment-13957942 ] Tilman Hausherr edited comment on PDFBOX-1975 at 4/2/14 6:41 PM: --

[jira] [Commented] (PDFBOX-2012) Extend CMAPEncodingEntry API

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958007#comment-13958007 ] Tilman Hausherr commented on PDFBOX-2012: - For a start, I committed your optimiza

Build failed in Jenkins: PDFBox 1.8.x #103

2014-04-02 Thread Apache Jenkins Server
See Changes: [tilman] PDFBOX-2008: fix error in check of max generation number [tilman] PDFBOX-1975: Transferred the improved ImageIOUtils and its tests from the trunk, modified for the pre-refactoring API; modified the POM to get the

Build failed in Jenkins: PDFBox 1.8.x » Apache PDFBox #103

2014-04-02 Thread Apache Jenkins Server
See Changes: [tilman] PDFBOX-2008: fix error in check of max generation number [tilman] PDFBOX-1975: Transferred the improved ImageIOUtils and its tests from the trunk, modified for the pre-refactoring API; mod

[jira] [Created] (PDFBOX-2013) Please extend PDTrueTypeFont API

2014-04-02 Thread Philip Helger (JIRA)
Philip Helger created PDFBOX-2013: - Summary: Please extend PDTrueTypeFont API Key: PDFBOX-2013 URL: https://issues.apache.org/jira/browse/PDFBOX-2013 Project: PDFBox Issue Type: Bug

[jira] [Commented] (PDFBOX-2008) Off-by-one error in BaseParser.readGenerationNumber()

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13957983#comment-13957983 ] Tilman Hausherr commented on PDFBOX-2008: - The bug was introduced with the patch

[jira] [Resolved] (PDFBOX-2008) Off-by-one error in BaseParser.readGenerationNumber()

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr resolved PDFBOX-2008. - Resolution: Fixed > Off-by-one error in BaseParser.readGenerationNumber() >

[jira] [Created] (PDFBOX-2012) Extend CMAPEncodingEntry API

2014-04-02 Thread Philip Helger (JIRA)
Philip Helger created PDFBOX-2012: - Summary: Extend CMAPEncodingEntry API Key: PDFBOX-2012 URL: https://issues.apache.org/jira/browse/PDFBOX-2012 Project: PDFBox Issue Type: Bug Com

[jira] [Updated] (PDFBOX-2008) Off-by-one error in BaseParser.readGenerationNumber()

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-2008: Fix Version/s: 2.0.0 1.8.5 > Off-by-one error in BaseParser.readGenerat

[jira] [Updated] (PDFBOX-2008) Off-by-one error in BaseParser.readGenerationNumber()

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-2008: Affects Version/s: 2.0.0 1.8.5 > Off-by-one error in BaseParser.rea

[jira] [Commented] (PDFBOX-2008) Off-by-one error in BaseParser.readGenerationNumber()

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13957972#comment-13957972 ] Tilman Hausherr commented on PDFBOX-2008: - Although I wonder why you would have a

[jira] [Created] (PDFBOX-2011) Please extend base class "Encoding" with 2 methods to access global name2char and char2name maps

2014-04-02 Thread Philip Helger (JIRA)
Philip Helger created PDFBOX-2011: - Summary: Please extend base class "Encoding" with 2 methods to access global name2char and char2name maps Key: PDFBOX-2011 URL: https://issues.apache.org/jira/browse/PDFBOX-2011

[jira] [Created] (PDFBOX-2010) Please make "protected PDFont getDescendantFont()" public as it is in 2.0.0

2014-04-02 Thread Philip Helger (JIRA)
Philip Helger created PDFBOX-2010: - Summary: Please make "protected PDFont getDescendantFont()" public as it is in 2.0.0 Key: PDFBOX-2010 URL: https://issues.apache.org/jira/browse/PDFBOX-2010 Project

[jira] [Commented] (PDFBOX-1975) Improve TestImageIOUtils unit tests to check image resolution and compression

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13957942#comment-13957942 ] Tilman Hausherr commented on PDFBOX-1975: - I transferred the improved ImageIOUtil

[jira] [Updated] (PDFBOX-1975) Improve TestImageIOUtils unit tests to check image resolution and compression

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-1975: Affects Version/s: 1.8.5 > Improve TestImageIOUtils unit tests to check image resolution a

[jira] [Updated] (PDFBOX-1975) Improve TestImageIOUtils unit tests to check image resolution and compression

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-1975: Fix Version/s: 1.8.5 > Improve TestImageIOUtils unit tests to check image resolution and c

[jira] [Created] (PDFBOX-2009) PDFStreamEngine.processEncodedText incorrectly handling UTF-16 text with BOM FEFF

2014-04-02 Thread Philip Helger (JIRA)
Philip Helger created PDFBOX-2009: - Summary: PDFStreamEngine.processEncodedText incorrectly handling UTF-16 text with BOM FEFF Key: PDFBOX-2009 URL: https://issues.apache.org/jira/browse/PDFBOX-2009 P

[jira] [Updated] (PDFBOX-2009) PDFStreamEngine.processEncodedText incorrectly handling UTF-16 text with BOM FEFF

2014-04-02 Thread Philip Helger (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Helger updated PDFBOX-2009: -- Description: When there is a text print operation like Tj, the PDFStreamEngine.processEncod

[jira] [Created] (PDFBOX-2008) Off-by-one error in BaseParser.readGenerationNumber()

2014-04-02 Thread Christian S. (JIRA)
Christian S. created PDFBOX-2008: Summary: Off-by-one error in BaseParser.readGenerationNumber() Key: PDFBOX-2008 URL: https://issues.apache.org/jira/browse/PDFBOX-2008 Project: PDFBox Issue

Re: Apache PDFBox April 2014 board report due

2014-04-02 Thread Maruan Sahyoun
Hi, to unsubscribe please follow the information at http://pdfbox.apache.org/mailinglists.html BR Maruan Sahyoun On 02.04.2014 at 10:02, Somnath Jadhav wrote: > Hello, > > May I know how to unsubscribe from these alerts? > > I no longer need those alerts and I can't see any option for > u

Re: Apache PDFBox April 2014 board report due

2014-04-02 Thread Somnath Jadhav
Hello, May I know how to unsubscribe from these alerts? I no longer need those alerts and I can't see any option to unsubscribe. Please help. Regards, Somnath Jadhav, +91-9270153230 www.somnathjadhav.com On 2 April 2014 12:58, Timo Boehme wrote: > +1 with the GSoC additions. > > > Best, >

Re: Apache PDFBox April 2014 board report due

2014-04-02 Thread Timo Boehme
+1 with the GSoC additions. Best, Timo Am 30.03.2014 16:29, schrieb Andreas Lehmkuehler: Hi, find attached a quick draft of the board report we're expected to submit this month. @Johm, @Tilman Please add something about the GSoC status. Any further comments, objections or additions? T

[jira] [Commented] (PDFBOX-2007) Performance regression since PDFRenderer

2014-04-02 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13957412#comment-13957412 ] Tilman Hausherr commented on PDFBOX-2007: - Btw don't worry if the answer to the f