I have a customer that uses a LOT of PDF files. They currently have 2
files that are failing when we try to render them.
The same files can be viewed with Acrobat Reader or Foxit PDF with no
errors reported.
From Acrobat Reader file info:
PDF Producer: PDFOut V3.8 – build 201 – Oct 28 2022
OSName.OCG)) {
dict.setItem(COSName.TYPE, COSName.OCG);
}
PDOptionalContentGroup grp = new PDOptionalContentGroup(dict);
On 11/21/2023 10:52 PM, Andreas Lehmkühler wrote:
On 21.11.23 at 21:26, John Lussmyer wrote:
Ugh, formatting mess.
For more info, this is the "addOCGs:OCG
3 10:56 AM, John Lussmyer wrote:
I'm using PDFBox 3.0.0 to combine some PDF files. One of the files
uses an Optional Content Group.
Note that this code has been working just fine for many other files
both with and without OCG's.
For this file, I get this exception:
java.lang.IllegalArgumen
I'm using PDFBox 3.0.0 to combine some PDF files. One of the files uses
an Optional Content Group.
Note that this code has been working just fine for many other files both
with and without OCG's.
For this file, I get this exception:
java.lang.IllegalArgumentException: Provided dictionary is
On 11/8/2023 5:28 PM, Peter Wyatt wrote:
I would think supporting the following PDF 2.0 features is highly relevant,
given that other implementations are already generating PDF 2.0 files today
(see https://pdfa.org/supporting-pdf20/)
A bunch of useful suggestions elided..
What I REALLY
I doubt there is a way.
It's most likely that the signing code makes a MD5 checksum (or similar)
of the file when it is signed.
If the file is changed, checking the signing will re-calculate the
checksum and find that it is different. There isn't any info on what
changed, just that SOMETHING
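
The checksum argument above can be illustrated with a small, generic sketch. It uses only java.security.MessageDigest and is not PDFBox's or any PDF viewer's signature code. The choice of SHA-256 and the names ChecksumDemo, digest and isUnchanged are assumptions made for illustration. The point is that a digest comparison only answers "same or different" and says nothing about which bytes changed.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ChecksumDemo {
    /** Returns the SHA-256 digest of the given bytes (algorithm chosen for illustration). */
    static byte[] digest(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    /** True if the content still matches the digest recorded earlier. */
    static boolean isUnchanged(byte[] content, byte[] recordedDigest) {
        return MessageDigest.isEqual(digest(content), recordedDigest);
    }

    public static void main(String[] args) {
        byte[] original = "stand-in for the signed bytes".getBytes(StandardCharsets.UTF_8);
        byte[] recorded = digest(original);

        byte[] edited = original.clone();
        edited[0] ^= 1; // flip a single bit

        System.out.println("original unchanged: " + isUnchanged(original, recorded));
        System.out.println("edited unchanged:   " + isUnchanged(edited, recorded));
    }
}
```

Flipping even one bit makes the comparison fail, and the digest itself carries no hint about where the change happened.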
ile(MemoryUsageSetting.setupMixed(100)));
(I use it with tempFileOnly, but the rest are the same)
On Thu, Oct 5, 2023 at 9:50 PM John Lussmyer wrote:
I'm trying to update to the latest PDFBox 3.0.0.
The code was using a call to
loadPDF(file, MemoryUsageSetting.setupMixed(MB100)); // 100 MB
I see that that no lon
I'm trying to update to the latest PDFBox 3.0.0.
The code was using a call to
loadPDF(file, MemoryUsageSetting.setupMixed(MB100)); // 100 MB
I see that that no longer exists, but the only mention of it doesn't
seem to provide any info on how to configure an equivalent replacement?
Any
On 04.01.2023 19:22, John Lussmyer wrote:
I have a pdf with several Optional Content groups.
I can find their definitions in the Page/Resources/Properties dictionary, but I
don't see how they are enabled or disabled.
Where is that controlled?
This is below the document root, use P
I have a pdf with several Optional Content groups.
I can find their definitions in the Page/Resources/Properties dictionary, but I
don't see how they are enabled or disabled.
Where is that controlled?
I was able to get ahold of the customer's PDF file - but it (of course) works
just FINE for me on my system.
I have logs showing multiple identical failures for the customer - and lots of
other files succeeding.
I'd really like to test your possible fix - but first I have to figure out how
to
We are using PDFBox to render various PDF files in our product.
One customer is having issues due to PDFBox throwing a NullPointerException
when certain files are rendered. (No, I don't have copies of the files - yet)
Any ideas on what could cause this?
java.lang.NullPointerException: null
We have an app that can generate multi-page PDF Files. We recently ran into a
problem where the library we were using would keep ALL the pages in memory.
For a quick workaround we have it write out single-page PDF files, then use
PDFBox to combine them.
We recently found a bug in the way
On Sun Jan 23 10:02:08 PST 2022 rc...@pobox.com said:
>I am using PDFBox's PDFTextStripper.getText() for a particular kind of
>PDF file generated by a government agency, and the text I'm getting does
>not match that displayed by Acrobat Reader for the same files. The
>getText() calls occasionally
On Fri Jan 07 08:55:38 PST 2022 ke...@trumpetinc.com said:
>If you use the temporary file memory storage, it should be possible to work
>with very large files.
Thanks, I was hoping there was some way to deal with this case.
I just ran a quick test, generating a 2000 page PDF by placing a 1 page
I have a need to merge a couple thousand PDF's into one humongous PDF.
The old tool we use for PDF manipulation runs out of memory as it builds the
result PDF in memory, and only writes it out when done.
Can PDFBox do something more like streaming the output as it's built? or even
not load all
On Thu Sep 09 10:10:52 PDT 2021 thaush...@t-online.de said:
>In theory one could make separate rendering hints for fonts and for
>ordinary vectors, but that would be messy and hard to understand. (And
>who knows whether it will work for your file)
>
>I recommend that you try doing this yourself by
On Wed Sep 08 20:31:47 PDT 2021 thaush...@t-online.de said:
>Ooops, you didn't mention that you turned antialiasing off. The image
>looks as if interpolation was also turned off. If you set rendering
>hints you always have to set all the hints you need. Here's the default:
>
> private
said:
>On Wed Sep 08 12:20:59 PDT 2021 thaush...@t-online.de said:
>>On 08.09.2021 at 21:16, John Lussmyer wrote:
>>> Ok, just tried that - no change.
>>>
>>> We are currently trying PDFBox 3.0.0-RC1 - is that a problem?
>>
>>No, this is excell
On Wed Sep 08 12:20:59 PDT 2021 thaush...@t-online.de said:
>On 08.09.2021 at 21:16, John Lussmyer wrote:
>> Ok, just tried that - no change.
>>
>> We are currently trying PDFBox 3.0.0-RC1 - is that a problem?
>
>No, this is excellent; there will be a new release of
Ok, just tried that - no change.
We are currently trying PDFBox 3.0.0-RC1 - is that a problem?
On Wed Sep 08 11:55:56 PDT 2021 thaush...@t-online.de said:
>The default rendering is high quality over speed, although there is one
>obscure option you could try,
We are trying to switch to using PDFBox to create the thumbnail images of PDF
Pages in our application. (The older product we currently use fails on OS 11).
I'm running into a problem: if there is text on the page, the thumbnail image
makes it hard to make any sense at all of the text. (yes,
On Thu Nov 14 08:32:20 PST 2019 sahy...@fileaffairs.de said:
>well - PDF is not really easily streamable as
>
>- it's organized as a random access format
>- the reference table about the objects forming the PDF is at the end of the
>file so you have to read the last parts first and
>then move
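
To make the "read the last parts first" point concrete, here is a minimal sketch in plain Java (no PDFBox) that looks only at the tail of a file for the `startxref` keyword, which is followed by the byte offset of the cross-reference data. The class and method names (StartXrefDemo, parseStartXref, findStartXref) and the tail size are hypothetical. A real parser such as PDFBox handles many more cases, for example damaged files.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

class StartXrefDemo {
    /** Returns the offset that follows the last "startxref" keyword, or -1 if none is found. */
    static long parseStartXref(byte[] tail) {
        // ISO-8859-1 maps each byte to exactly one char, so indexes stay byte-accurate.
        String text = new String(tail, StandardCharsets.ISO_8859_1);
        int idx = text.lastIndexOf("startxref");
        if (idx < 0) {
            return -1;
        }
        int i = idx + "startxref".length();
        while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
            i++; // skip the line break after the keyword
        }
        int start = i;
        while (i < text.length() && Character.isDigit(text.charAt(i))) {
            i++; // consume the decimal offset
        }
        return i > start ? Long.parseLong(text.substring(start, i)) : -1;
    }

    /** Reads only the last tailSize bytes of the file, not the whole file. */
    static long findStartXref(String path, int tailSize) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long len = raf.length();
            int n = (int) Math.min(len, tailSize);
            byte[] tail = new byte[n];
            raf.seek(len - n); // jump straight to the end: random access
            raf.readFully(tail);
            return parseStartXref(tail);
        }
    }
}
```

For example, `StartXrefDemo.findStartXref("some.pdf", 1024)` (file name hypothetical) would return the offset a reader has to seek to next, which is why the format does not lend itself to front-to-back streaming.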
On Tue Oct 29 21:59:57 PDT 2019 thaush...@t-online.de said:
>IIRC tesseract can do this. Not as annotation, but as invisible font.
As far as I can tell, it does it the same way that other programs do.
It's added to the content stream, mixed with all the commands for positioning,
font size,
I have a bunch of PDF files that have had an OCR package run against them.
The problem is that it adds the text to the normal Page content, and tries to
position the recognized text at the location in the image it was found.
So the text is mixed with lots of positioning and other information.
I'd
On Mon Jun 27 14:34:03 PDT 2016 j...@jahewson.com said:
>Right, and if it was a leak then system.gc would not have fixed it.
That is only SOMETIMES true. I've run into "memory leaks" where the leak was
uncleared references to objects. So the old objects just hung around forever.
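
A minimal sketch of that kind of "leak", in plain Java and unrelated to PDFBox internals (LeakDemo, CACHE, process and release are made-up names): objects added to a long-lived collection stay strongly reachable, so no amount of garbage collection reclaims them until the reference is cleared.

```java
import java.util.ArrayList;
import java.util.List;

class LeakDemo {
    // Long-lived collection: everything added here stays strongly reachable.
    static final List<byte[]> CACHE = new ArrayList<>();

    /** Simulates work that forgets to remove its buffer from the cache afterwards. */
    static void process(int sizeBytes) {
        byte[] buffer = new byte[sizeBytes];
        CACHE.add(buffer); // the uncleared reference
    }

    /** Clearing the references is what actually makes the buffers collectable. */
    static void release() {
        CACHE.clear();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            process(1024);
        }
        System.gc(); // only a hint, and it cannot free objects that are still referenced
        System.out.println("buffers still held: " + CACHE.size());
        release();
        System.out.println("buffers held after release: " + CACHE.size());
    }
}
```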
--
Bobcats
On Sun Feb 14 12:15:12 PST 2016 bigal...@gmail.com said:
>Thank you for both your answers.
>
>The html is very appealing, but what I did not mention is that I'm working
>within a rather rigid IT environment.
>
>I won't be able to install a html server. So back to Java executable (which
>I can use)
>> Olaf
>>
>> > On 14.02.2016, at 21:40, Al Grant <bigal...@gmail.com> wrote:
>> >
>> > I would not have the permission rights to install a web server :(
>> > On 15/02/2016 9:27 am, "John Lussmyer" <cou...@casadelgato.com> wro
On Wed Feb 18 23:34:09 PST 2015 thaush...@t-online.de said:
Assuming you are using 1.8.8, put the ccitt stream into a PDStream
object, then call the PDCcitt constructor with that PDStream.
PDStream pd = new PDStream(doc, new ByteArrayInputStream(data), true);
Thanks, that worked!
unique for you? I'm just wondering if I should add such code
to the 2.0 or 2.1 version.
Tilman
On 19.02.2015 at 19:09, John Lussmyer wrote:
On Wed Feb 18 23:34:09 PST 2015 thaush...@t-online.de said:
Assuming you are using 1.8.8, put the ccitt stream into a PDStream
object, then call the PDCcitt
So, I have a block of data (byte[]) that represents a scanned image, compressed
using CCITTG4.
I'm new to PDFBox. (of course)
So far, I haven't been able to figure out how I can create a page that consists
of just that image.
All the examples want to read the image from a file, and decompress