Re: migrating Tika to 2.0.0

2015-07-07 Thread Tilman Hausherr
On 08.07.2015 at 04:19, Allison, Timothy B. wrote: There are two embedded/inline images (not regular attachments) that are processed by pdfbox app's ExtractImages. In 1.8.9, there's a tiff (lightbulb) and a jpeg (flag/fireworks). With trunk, there is a log warning saying that tiff isn't supp

[jira] [Commented] (PDFBOX-2857) Saving XFA document caused prompt saying Extended features has been disabled

2015-07-07 Thread Maruan Sahyoun (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618012#comment-14618012 ] Maruan Sahyoun commented on PDFBOX-2857: The error message is expected and that's

[jira] [Commented] (PDFBOX-2857) Saving XFA document caused prompt saying Extended features has been disabled

2015-07-07 Thread tmjee (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617983#comment-14617983 ] tmjee commented on PDFBOX-2857: --- Tried to modify a pdf with xfa through {code:borderStyle

[jira] [Updated] (PDFBOX-2857) Saving XFA document caused prompt saying Extended features has been disabled

2015-07-07 Thread tmjee (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tmjee updated PDFBOX-2857: -- Attachment: sample.pdf Sample of pdf file with xfa. > Saving XFA document caused prompt saying Extended featur

[jira] [Updated] (PDFBOX-2857) Saving XFA document caused prompt saying Extended features has been disabled

2015-07-07 Thread tmjee (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tmjee updated PDFBOX-2857: -- Description: Using the following code to read and write back a pdf (with xfa) caused the newly written pdf whe

[jira] [Commented] (PDFBOX-2857) Saving XFA document caused prompt saying Extended features has been disabled

2015-07-07 Thread tmjee (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617968#comment-14617968 ] tmjee commented on PDFBOX-2857: --- Also when we do {code:borderStyle=solid} doc.saveIncre

[jira] [Created] (PDFBOX-2857) Saving XFA document caused prompt saying Extended features has been disabled

2015-07-07 Thread tmjee (JIRA)
tmjee created PDFBOX-2857: - Summary: Saving XFA document caused prompt saying Extended features has been disabled Key: PDFBOX-2857 URL: https://issues.apache.org/jira/browse/PDFBOX-2857 Project: PDFBox

RE: PDFBox 1.8.10 release

2015-07-07 Thread Allison, Timothy B.
Had to dig into code to make sure that our extension of PDFTextStripper winds up calling the code that you are interested in. I think it does, so, yes, all we'd have to do is two builds, one with and one without the change. Should I make the change locally or do you plan to commit? Thank you!

RE: migrating Tika to 2.0.0

2015-07-07 Thread Allison, Timothy B.
There are two embedded/inline images (not regular attachments) that are processed by pdfbox app's ExtractImages. In 1.8.9, there's a tiff (lightbulb) and a jpeg (flag/fireworks). With trunk, there is a log warning saying that tiff isn't supported and then an empty tiff file and a jpeg. -O

[jira] [Assigned] (PDFBOX-2856) Markedly slower processing for particular file in 2.0.0-trunk vs 1.8.9

2015-07-07 Thread John Hewson (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Hewson reassigned PDFBOX-2856: --- Assignee: John Hewson > Markedly slower processing for particular file in 2.0.0-trunk vs 1.8

[jira] [Commented] (PDFBOX-2856) Markedly slower processing for particular file in 2.0.0-trunk vs 1.8.9

2015-07-07 Thread John Hewson (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617789#comment-14617789 ] John Hewson commented on PDFBOX-2856: - The slowness is due to the lack of inter-page

[jira] [Comment Edited] (PDFBOX-2856) Markedly slower processing for particular file in 2.0.0-trunk vs 1.8.9

2015-07-07 Thread John Hewson (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617789#comment-14617789 ] John Hewson edited comment on PDFBOX-2856 at 7/8/15 1:13 AM: -

[jira] [Commented] (PDFBOX-2370) Move caching outside of PDResources

2015-07-07 Thread John Hewson (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617786#comment-14617786 ] John Hewson commented on PDFBOX-2370: - We now use an Iterator for PDPage, which solve

Jenkins build is back to stable : PDFBox-trunk (JDK 1.6.0 unlimited security) #23

2015-07-07 Thread Apache Jenkins Server
See - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org

Jenkins build is back to stable : PDFBox-trunk (JDK 1.6.0 unlimited security) » Apache PDFBox #23

2015-07-07 Thread Apache Jenkins Server

Jenkins build is still unstable: PDFBox-trunk #2278

2015-07-07 Thread Apache Jenkins Server

Jenkins build is still unstable: PDFBox-trunk » Apache PDFBox #2278

2015-07-07 Thread Apache Jenkins Server

[jira] [Commented] (PDFBOX-2841) Make it easier to work with RadioButton Groups

2015-07-07 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617395#comment-14617395 ] ASF subversion and git services commented on PDFBOX-2841: - Commit

[jira] [Commented] (PDFBOX-2854) TTFSubsetter NoSuchElementException

2015-07-07 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617379#comment-14617379 ] Tilman Hausherr commented on PDFBOX-2854: - This fixes the first problem, but not

[jira] [Commented] (PDFBOX-2854) TTFSubsetter NoSuchElementException

2015-07-07 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617373#comment-14617373 ] ASF subversion and git services commented on PDFBOX-2854: - Commit

[jira] [Commented] (PDFBOX-2854) TTFSubsetter NoSuchElementException

2015-07-07 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617375#comment-14617375 ] ASF subversion and git services commented on PDFBOX-2854: - Commit

[jira] [Commented] (PDFBOX-2854) TTFSubsetter NoSuchElementException

2015-07-07 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617304#comment-14617304 ] Tilman Hausherr commented on PDFBOX-2854: - There are two problems. The first is t

Re: PDFBox 1.8.10 release

2015-07-07 Thread Tilman Hausherr
On 07.07.2015 at 19:16, Allison, Timothy B. wrote: Will create separate wrapper that relies solely on PDFTextStripper instead of what we currently do now. Results in a few days... This sounds like work. Isn't all that is needed to run a version before the change, one after the change, and d

Re: migrating Tika to 2.0.0

2015-07-07 Thread Tilman Hausherr
On 07.07.2015 at 21:39, Allison, Timothy B. wrote: Thank you, Andreas. I opened PDFBox-2856. How about tiffs not being handled by ExtractImages...is this expected? I also noticed that the tiff file is no longer extracted (2.0.0 logger says tiff not handled, but a tiff is extracted with 1

[jira] [Resolved] (PDFBOX-2855) Allow some flexibility for divergences from the standard on Seq vs Bag in DomXMPParser

2015-07-07 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved PDFBOX-2855. - Resolution: Won't Fix > Allow some flexibility for divergences from the standard on Seq vs Bag in

[jira] [Commented] (PDFBOX-2855) Allow some flexibility for divergences from the standard on Seq vs Bag in DomXMPParser

2015-07-07 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617246#comment-14617246 ] Tim Allison commented on PDFBOX-2855: - Got it. Sorry to hear it, but it makes sense.

RE: migrating Tika to 2.0.0

2015-07-07 Thread Allison, Timothy B.
Thank you, Andreas. I opened PDFBox-2856. How about tiffs not being handled by ExtractImages...is this expected? >I also noticed that the tiff file is no longer extracted (2.0.0 logger > says tiff not handled, but a tiff is extracted with 1.8.9). Is this expected? Thank you, again. Best,

[jira] [Updated] (PDFBOX-2856) Markedly slower processing for particular file in 2.0.0-trunk vs 1.8.9

2015-07-07 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated PDFBOX-2856: Summary: Markedly slower processing for particular file in 2.0.0-trunk vs 1.8.9 (was: Markedly slo

[jira] [Commented] (PDFBOX-2855) Allow some flexibility for divergences from the standard on Seq vs Bag in DomXMPParser

2015-07-07 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617244#comment-14617244 ] Tilman Hausherr commented on PDFBOX-2855: - Uh... yes. Yeah, you should keep your

[jira] [Updated] (PDFBOX-2856) Markedly slower processing for file in 2.0.0-trunk vs 1.8.9

2015-07-07 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated PDFBOX-2856: Attachment: testPDF_childAttachments.pdf triggering file > Markedly slower processing for file in

[jira] [Created] (PDFBOX-2856) Markedly slower processing for file in 2.0.0-trunk vs 1.8.9

2015-07-07 Thread Tim Allison (JIRA)
Tim Allison created PDFBOX-2856: --- Summary: Markedly slower processing for file in 2.0.0-trunk vs 1.8.9 Key: PDFBOX-2856 URL: https://issues.apache.org/jira/browse/PDFBOX-2856 Project: PDFBox I

[jira] [Commented] (PDFBOX-2855) Allow some flexibility for divergences from the standard on Seq vs Bag in DomXMPParser

2015-07-07 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617227#comment-14617227 ] Tim Allison commented on PDFBOX-2855: - Oh...Ok. I guess we'll have to keep our own c

[jira] [Commented] (PDFBOX-2855) Allow some flexibility for divergences from the standard on Seq vs Bag in DomXMPParser

2015-07-07 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617209#comment-14617209 ] Tilman Hausherr commented on PDFBOX-2855: - I doubt that this will be possible, xm

Re: migrating Tika to 2.0.0

2015-07-07 Thread Andreas Lehmkuehler
Hi, On 07.07.2015 at 18:59, Allison, Timothy B. wrote: All, As part of TIKA-1285, I updated Jeremy Anderson's original patch for our wrapper for PDFBox 2.0.0 on Tika. I'm having some problems running the unit tests because at least one of our files [0] is causing hefty resource utilizat

[jira] [Commented] (PDFBOX-19) Linearize command line tool

2015-07-07 Thread Abel Salgado Romero (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-19?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617192#comment-14617192 ] Abel Salgado Romero commented on PDFBOX-19: --- We have assessed it a couple of time

[jira] [Commented] (PDFBOX-19) Linearize command line tool

2015-07-07 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-19?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617094#comment-14617094 ] Tilman Hausherr commented on PDFBOX-19: --- Have you tried using "qpdf"? http://qpdf.sou

[jira] [Commented] (PDFBOX-19) Linearize command line tool

2015-07-07 Thread Abel Salgado Romero (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-19?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617069#comment-14617069 ] Abel Salgado Romero commented on PDFBOX-19: --- I know this is an old issue, but I w

[jira] [Created] (PDFBOX-2855) Allow some flexibility for divergences from the standard on Seq vs Bag in DomXMPParser

2015-07-07 Thread Tim Allison (JIRA)
Tim Allison created PDFBOX-2855: --- Summary: Allow some flexibility for divergences from the standard on Seq vs Bag in DomXMPParser Key: PDFBOX-2855 URL: https://issues.apache.org/jira/browse/PDFBOX-2855

RE: PDFBox 1.8.10 release

2015-07-07 Thread Allison, Timothy B.
Will create separate wrapper that relies solely on PDFTextStripper instead of what we currently do now. Results in a few days... Thank you, Tilman, for pinging me. :) -Original Message- From: Andreas Lehmkühler [mailto:andr...@lehmi.de] Sent: Thursday, July 02, 2015 2:24 AM To: dev@pdf

migrating Tika to 2.0.0

2015-07-07 Thread Allison, Timothy B.
All, As part of TIKA-1285, I updated Jeremy Anderson's original patch for our wrapper for PDFBox 2.0.0 on Tika. I'm having some problems running the unit tests because at least one of our files [0] is causing hefty resource utilization, which sends my laptop into paging. The parse does even

Re: rotation info of PDImage

2015-07-07 Thread Tilman Hausherr
On 07.07.2015 at 10:56, Manfred Pock wrote: Hi, is there a possibility that I can get the rotation of a PDImageObject? The rotation of the page is 90 degrees, and it seems that the embedded PDImage also has this rotation. How can I get this information from the PDImage-Obj? Not at all.

[jira] [Commented] (PDFBOX-2853) CCITT: Background is rendered as transparent color

2015-07-07 Thread Jakob Pyttlik (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616651#comment-14616651 ] Jakob Pyttlik commented on PDFBOX-2853: --- Thank you for looking into this - Unfortun

rotation info of PDImage

2015-07-07 Thread Manfred Pock
Hi, is there a possibility that I can get the rotation of a PDImageObject? The rotation of the page is 90 degrees, and it seems that the embedded PDImage also has this rotation. How can I get this information from the PDImage-Obj? Regards, Manfred -

Re: 2.0.0. RC was Re: PDFBox 2.0.0 release

2015-07-07 Thread Maruan Sahyoun
Hi, > On 07.07.2015 at 09:34, Andreas Lehmkühler wrote: > > Hi, > >> Andreas Lehmkühler wrote on 6 July 2015 at 11:55: >> >> >> Hi, >> >> >> I'd like to do a 2.0.0 release rather sooner than later and I guess I'm not >> the >> only one ;-) >> >> We are down to 24 issues mar

Jenkins build became unstable: PDFBox-trunk (JDK 1.6.0 unlimited security) » Apache PDFBox #22

2015-07-07 Thread Apache Jenkins Server

Jenkins build became unstable: PDFBox-trunk (JDK 1.6.0 unlimited security) #22

2015-07-07 Thread Apache Jenkins Server

Jenkins build is still unstable: PDFBox-trunk #2277

2015-07-07 Thread Apache Jenkins Server

Jenkins build is still unstable: PDFBox-trunk » Apache PDFBox #2277

2015-07-07 Thread Apache Jenkins Server

[jira] [Commented] (PDFBOX-2849) fix problems with setting existing AcroForm buttons

2015-07-07 Thread Maruan Sahyoun (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616352#comment-14616352 ] Maruan Sahyoun commented on PDFBOX-2849: it's already an abstraction as there is

[jira] [Commented] (PDFBOX-2854) TTFSubsetter NoSuchElementException

2015-07-07 Thread simon steiner (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616339#comment-14616339 ] simon steiner commented on PDFBOX-2854: --- Any font > TTFSubsetter NoSuchElementExce

Re: PDFBox 2.0.0 release

2015-07-07 Thread Andreas Lehmkühler
> Tilman Hausherr wrote on 6 July 2015 at 19:08: > > > Yes it would be great that the 2.0 version be released. Before the > opening of the new Berlin airport. > > IMO only the following issues are important for 2.0: > - PDFBOX-2301 - RandomAccessBuffer consumes too much memory -

2.0.0. RC was Re: PDFBox 2.0.0 release

2015-07-07 Thread Andreas Lehmkühler
Hi, > Andreas Lehmkühler wrote on 6 July 2015 at 11:55: > > > Hi, > > > I'd like to do a 2.0.0 release rather sooner than later and I guess I'm not > the > only one ;-) > > We are down to 24 issues marked with "Fix Version 2.0.0". > > @Assignees: please have a look at "your" is

[jira] [Comment Edited] (PDFBOX-2844) Printing has bigger margins than expected

2015-07-07 Thread John Hewson (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616314#comment-14616314 ] John Hewson edited comment on PDFBOX-2844 at 7/7/15 7:28 AM: -

[jira] [Comment Edited] (PDFBOX-2844) Printing has bigger margins than expected

2015-07-07 Thread John Hewson (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616314#comment-14616314 ] John Hewson edited comment on PDFBOX-2844 at 7/7/15 7:27 AM: -

[jira] [Commented] (PDFBOX-2844) Printing has bigger margins than expected

2015-07-07 Thread John Hewson (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616314#comment-14616314 ] John Hewson commented on PDFBOX-2844: - You've misunderstood... I'm saying that we sho

[jira] [Commented] (PDFBOX-2842) Overhaul font substitution

2015-07-07 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616305#comment-14616305 ] ASF subversion and git services commented on PDFBOX-2842: - Commit

[jira] [Commented] (PDFBOX-2184) Jenkins: CMMException: Invalid profile data

2015-07-07 Thread JIRA
[ https://issues.apache.org/jira/browse/PDFBOX-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616297#comment-14616297 ] Andreas Lehmkühler commented on PDFBOX-2184: [~jahewson] Do you port those ch

[jira] [Commented] (PDFBOX-2849) fix problems with setting existing AcroForm buttons

2015-07-07 Thread John Hewson (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616295#comment-14616295 ] John Hewson commented on PDFBOX-2849: - Oh, I see, that's annoying. I'm still not sure

[jira] [Comment Edited] (PDFBOX-2849) fix problems with setting existing AcroForm buttons

2015-07-07 Thread John Hewson (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616295#comment-14616295 ] John Hewson edited comment on PDFBOX-2849 at 7/7/15 7:05 AM: -