Re: PDF without stamps

2015-03-20 Thread Andreas Lehmkuehler
Hi, On 20.03.2015 at 16:02, Kevin Morin wrote: Hi guys, have you found a solution to fix this issue? Follow https://issues.apache.org/jira/browse/PDFBOX-2679 to stay up to date. BR Andreas BR Kevin On 03/03/2015 18:57, Tilman Hausherr wrote: Hi Kevin, The problem is that our annotation

Re: Looking for some guidance on using PDFBox to analyze page content

2015-03-20 Thread Peter Murray-Rust
We do a great deal of this and have created two downstream packages which consume the output of PDFBox: * https://bitbucket.org/petermr/pdf2svg/ (which translates the PDF into SVG) * https://bitbucket.org/petermr/svg2xml (which tries to convert the SVG into high-level constructs) There are roughl

Re: Looking for some guidance on using PDFBox to analyze page content

2015-03-20 Thread Tilman Hausherr
Yes, by analysing the content stream operators (e.g. "l", "c"), but you will have the problem that e.g. underlined text is a drawn font (which technically is also vector graphics) plus a line. And you won't be able to tell easily that this line is related to the font. Tilman On 20.03.2015
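As a rough illustration of the operator-based approach described above, here is a minimal sketch in plain Java. It does not use PDFBox: it assumes the page's content stream has already been obtained as text, and it tokenizes naively on whitespace, so it would miscount operators that appear inside string literals or inline image data. The class name `OperatorScan` and its methods are hypothetical. The operator names ("m", "l", "c", "v", "y", "h", "re" for path construction, "BT" for the start of a text object) come from the PDF specification.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of PDFBox.
final class OperatorScan {
    // Path construction operators defined by the PDF specification.
    private static final Set<String> PATH_OPS =
            new HashSet<>(Arrays.asList("m", "l", "c", "v", "y", "h", "re"));

    // Counts path construction operators in a whitespace-separated content stream.
    // Naive: does not skip string literals or inline image data.
    static int countPathOps(String content) {
        String trimmed = content.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        int count = 0;
        for (String token : trimmed.split("\\s+")) {
            if (PATH_OPS.contains(token)) {
                count++;
            }
        }
        return count;
    }

    // Returns true if the stream opens at least one text object (BT operator).
    static boolean hasTextObject(String content) {
        return Arrays.asList(content.trim().split("\\s+")).contains("BT");
    }

    public static void main(String[] args) {
        // Underlined text: a text object followed by a separately stroked line.
        String underlined = "BT /F1 12 Tf 72 700 Td (Hello) Tj ET 72 698 m 120 698 l S";
        System.out.println("path operators: " + countPathOps(underlined));
        System.out.println("has text object: " + hasTextObject(underlined));
    }
}
```

The sample stream shows the difficulty mentioned above: the underline is just a moveto, a lineto and a stroke, and nothing in the stream ties it to the preceding text object.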

Re: PDF without stamps

2015-03-20 Thread Kevin Morin
Hi guys, have you found a solution to fix this issue? BR Kevin On 03/03/2015 18:57, Tilman Hausherr wrote: Hi Kevin, The problem is that our annotation rendering doesn't work on rotated pages. If the "rotation 90" is removed at the page level, the annotations appear at the correct place in t

Re: Error on PDDocument.load

2015-03-20 Thread Kevin Morin
Hi, a little bump ;) Have a nice weekend. BR Kevin On 02/03/2015 16:19, Kevin Morin wrote: Hi, Andreas, you said in the issue that you have a solution in mind, did you succeed in fixing it or not? It seems that my users have a lot of files of this kind... Thanks BR Kevin On 11/02/2015 23:16

Re: Looking for some guidance on using PDFBox to analyze page content

2015-03-20 Thread Eliot Kimber
You can definitely analyze all the raster images in a PDF and get their format (as stored in the PDF data stream). Vector may be harder since PDF is fundamentally a drawing language and it may not be possible to reliably distinguish drawing commands that are just decorating a page or producing a t
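To make "format as stored in the PDF data stream" concrete: an image XObject's stored encoding is indicated by its stream filter name. The sketch below, in plain Java, maps the standard filter names from the PDF specification to a short description. How the filter name is read out of the document (for example through PDFBox) is not shown and is left as an assumption. The class `ImageFilterNames` and its `describe` method are hypothetical. A stream can also carry a chain of several filters, which this sketch does not handle.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox.
final class ImageFilterNames {
    // Standard stream filter names and the image encoding each one implies.
    private static final Map<String, String> FORMATS = new LinkedHashMap<>();
    static {
        FORMATS.put("DCTDecode", "JPEG");
        FORMATS.put("JPXDecode", "JPEG 2000");
        FORMATS.put("CCITTFaxDecode", "CCITT fax");
        FORMATS.put("JBIG2Decode", "JBIG2");
        FORMATS.put("FlateDecode", "raw samples, Flate-compressed");
        FORMATS.put("LZWDecode", "raw samples, LZW-compressed");
        FORMATS.put("RunLengthDecode", "raw samples, run-length encoded");
    }

    // Accepts a filter name with or without the leading slash of a PDF name object.
    static String describe(String filterName) {
        String key = filterName.startsWith("/") ? filterName.substring(1) : filterName;
        return FORMATS.getOrDefault(key, "unknown");
    }

    public static void main(String[] args) {
        System.out.println(describe("/DCTDecode"));
        System.out.println(describe("FlateDecode"));
    }
}
```

Only the first four filters correspond to a recognisable image file format. The remaining ones mean the samples are stored raw and merely compressed, so the "format" in that case is defined by the image dictionary (width, height, bits per component, colour space) rather than by the filter.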

Looking for some guidance on using PDFBox to analyze page content

2015-03-20 Thread Warren Gallagher
Greetings, Is there a means to determine if a page contains: * vector graphics * raster graphics (and what format) Regards, WARREN GALLAGHER - CTO warren.gallag...@apxconsult.com M: 613-791-4987 W: 613-262-2601 Advance Property eXposure Canada Inc. 1755 Woodward Drive, S