[jira] [Commented] (PDFBOX-3295) Improve parsing performance of object streams

2016-03-29 Thread JIRA
[ https://issues.apache.org/jira/browse/PDFBOX-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217505#comment-15217505 ] Andreas Lehmkühler commented on PDFBOX-3295: [~tilman] Thanks again for the f

Re: Custom glyphlist for text extraction

2016-03-29 Thread John Hewson
> On 30 Mar 2016, at 01:59, John Hewson wrote: > > > > -- John > >> On 29 Mar 2016, at 21:31, Daniel Persson wrote: >> >> Hi Maruan >> >> I extended the class to override that. Then again I extended the >> PDFStreamEngine because I required more extensive changes but the principle >> shou

Re: Custom glyphlist for text extraction

2016-03-29 Thread John Hewson
-- John > On 29 Mar 2016, at 21:31, Daniel Persson wrote: > > Hi Maruan > > I extended the class to override that. Then again I extended the > PDFStreamEngine because I required more extensive changes but the principle > should be sound. That's right but subclasses of PDFTextStreamEngine suc

[jira] [Comment Edited] (PDFBOX-3295) Improve parsing performance of object streams

2016-03-29 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216890#comment-15216890 ] Tilman Hausherr edited comment on PDFBOX-3295 at 3/29/16 9:46 PM: -

[jira] [Comment Edited] (PDFBOX-3295) Improve parsing performance of object streams

2016-03-29 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216890#comment-15216890 ] Tilman Hausherr edited comment on PDFBOX-3295 at 3/29/16 9:42 PM: -

[jira] [Reopened] (PDFBOX-3295) Improve parsing performance of object streams

2016-03-29 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr reopened PDFBOX-3295: - The build failed, and there are problems with several files: - PDFBOX-1907, which has 2 pages

Jenkins build became unstable: PDFBox 2.0.x » Apache PDFBox #7

2016-03-29 Thread Apache Jenkins Server
See - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org

Jenkins build became unstable: PDFBox 2.0.x #7

2016-03-29 Thread Apache Jenkins Server

Jenkins build is back to normal : PDFBox 1.8.x » PDFBox parent #562

2016-03-29 Thread Apache Jenkins Server

Jenkins build is back to normal : PDFBox 1.8.x #562

2016-03-29 Thread Apache Jenkins Server

Jenkins build became unstable: PDFBox-trunk » Apache PDFBox #2802

2016-03-29 Thread Apache Jenkins Server

Jenkins build became unstable: PDFBox-trunk #2802

2016-03-29 Thread Apache Jenkins Server

[jira] [Commented] (PDFBOX-3295) Improve parsing performance of object streams

2016-03-29 Thread JIRA
[ https://issues.apache.org/jira/browse/PDFBOX-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216808#comment-15216808 ] Andreas Lehmkühler commented on PDFBOX-3295: After eliminating the mentioned

[jira] [Resolved] (PDFBOX-3295) Improve parsing performance of object streams

2016-03-29 Thread JIRA
[ https://issues.apache.org/jira/browse/PDFBOX-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler resolved PDFBOX-3295. Resolution: Fixed > Improve parsing performance of object streams > ---

[jira] [Updated] (PDFBOX-3295) Improve parsing performance of object streams

2016-03-29 Thread JIRA
[ https://issues.apache.org/jira/browse/PDFBOX-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler updated PDFBOX-3295: --- Affects Version/s: 1.8.11 > Improve parsing performance of object streams > -

[jira] [Updated] (PDFBOX-3295) Improve parsing performance of object streams

2016-03-29 Thread JIRA
[ https://issues.apache.org/jira/browse/PDFBOX-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler updated PDFBOX-3295: --- Fix Version/s: 1.8.12 > Improve parsing performance of object streams > -

[jira] [Commented] (PDFBOX-3295) Improve parsing performance of object streams

2016-03-29 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216804#comment-15216804 ] ASF subversion and git services commented on PDFBOX-3295: - Commit

[jira] [Commented] (PDFBOX-3295) Improve parsing performance of object streams

2016-03-29 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216799#comment-15216799 ] ASF subversion and git services commented on PDFBOX-3295: - Commit

[jira] [Commented] (PDFBOX-3295) Improve parsing performance of object streams

2016-03-29 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216795#comment-15216795 ] ASF subversion and git services commented on PDFBOX-3295: - Commit

[jira] [Created] (PDFBOX-3295) Improve parsing performance of object streams

2016-03-29 Thread JIRA
Andreas Lehmkühler created PDFBOX-3295: -- Summary: Improve parsing performance of object streams Key: PDFBOX-3295 URL: https://issues.apache.org/jira/browse/PDFBOX-3295 Project: PDFBox Is

Re: Custom glyphlist for text extraction

2016-03-29 Thread Daniel Persson
Hi Maruan I extended the class to override that. Then again I extended the PDFStreamEngine because I required more extensive changes but the principle should be sound. best regards Daniel On Tue, Mar 29, 2016, 20:12 Maruan Sahyoun wrote: > Hi, > > I was wondering if we lost the capability to s

Custom glyphlist for text extraction

2016-03-29 Thread Maruan Sahyoun
Hi, I was wondering if we lost the capability to supply a custom glyph list file as discussed here: http://stackoverflow.com/questions/35972788/how-to-read-control-characters-in-a-pdf-using-java/36034529#36034529 PDFTextStreamEngine seems to have it hardcoded ["org/apache/pdfbox/resources/glyp

[jira] [Closed] (PDFBOX-3294) After loading pdf through PDFRenderer(pdf), We are trying to take the first page and convert that into preview image, but it is sometimes gets out of memory.

2016-03-29 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr closed PDFBOX-3294. --- Resolution: Not A Problem Closing (you can still comment), I was able to successfully run PDF

[jira] [Updated] (PDFBOX-3293) Font glyphs with overlapping paths not rendered correctly

2016-03-29 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-3293: Attachment: fontforge.png PDFBOX-3293.ttf PDFBOX-3293_reduce

RE: shading/relocating 1.8.x?

2016-03-29 Thread Allison, Timothy B.
Got it. That's what I had assumed. I'll hold off on opening truncated file issue(s) on PDFBox's JIRA... I opened TIKA-1912 to track this on our side. Thank you, again! Best, Tim -Original Message- From: Andreas Lehmkühler [mailto:andr...@lehmi.de] Sent: Tuesday, March 29,

RE: shading/relocating 1.8.x?

2016-03-29 Thread Andreas Lehmkühler
> "Allison, Timothy B." wrote on 28 March 2016 at 21:02: > > > Oh, wow, so it really might be possible without too much work? I'm more than > happy to supply examples. :) Oops, it isn't as simple as it sounds. If we simply swallow the exception, PDFBox most likely runs into an NPE. IMH

[jira] [Commented] (PDFBOX-922) True type PDFont subclass only supports WinAnsiEncoding (hardcoded!)

2016-03-29 Thread JIRA
[ https://issues.apache.org/jira/browse/PDFBOX-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215842#comment-15215842 ] Andreas Lehmkühler commented on PDFBOX-922: --- [~fibe] First of all, don't use alr

Re: PDFBox 2.1

2016-03-29 Thread Andreas Lehmkühler
> Maruan Sahyoun wrote on 29 March 2016 at 12:28: > > > Hi, > > now that PDFBox 2.0 is out, what about collecting ideas for 2.1? Could put that > on our website the same way we had the old ideas published. Good idea! > From my perspective: > - simplify creation of AcroForm fields > -

PDFBox 2.1

2016-03-29 Thread Maruan Sahyoun
Hi, now that PDFBox 2.0 is out, what about collecting ideas for 2.1? Could put that on our website the same way we had the old ideas published. From my perspective: - simplify creation of AcroForm fields - appearance generation for new AcroForm fields - rework/enhancement to the plain text formatte

[jira] [Commented] (PDFBOX-3284) Big Pdf parsing to text - Out of memory

2016-03-29 Thread Nicolas Daniels (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215729#comment-15215729 ] Nicolas Daniels commented on PDFBOX-3284: - Thanks for the investigations. I thin

[jira] [Commented] (PDFBOX-922) True type PDFont subclass only supports WinAnsiEncoding (hardcoded!)

2016-03-29 Thread Filip Bellander (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215705#comment-15215705 ] Filip Bellander commented on PDFBOX-922: I tried to update to 2.0.0 today. Now sud

[jira] [Commented] (PDFBOX-3294) After loading pdf through PDFRenderer(pdf), We are trying to take the first page and convert that into preview image, but it is sometimes gets out of memory.

2016-03-29 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215592#comment-15215592 ] Tilman Hausherr commented on PDFBOX-3294: - That is because your PDF is huge. The