How to flatedecode and find all acroform fields in a compressed PDF

Balaji Venkatamohan Tue, 19 May 2015 12:38:20 -0700

Hello,

I am using PDFBox 1.8.9 in a product that reads and writes the AcroForm
fields of a PDF. A potential customer has given us two versions of the same
PDF, one compressed and one not. The PDF is three pages long and contains a
number of AcroForm fields. Sorry, I cannot share the PDF because of
restrictions.


The uncompressed file is 1.67 MB and the compressed version is 27 KB.
I can open both PDFs in Reader XI, Foxit Reader, and several other readers,
and in each of them I can modify the AcroForm field values and save them.

When I open the uncompressed PDF with PDDocument.load(File f), I can read
and write its AcroForm fields using the API calls below:

PDDocumentCatalog docCatalog = document.getDocumentCatalog();
PDAcroForm acroForm = docCatalog.getAcroForm();

However, I do not see any AcroForm fields when I open the compressed PDF.

I opened the compressed PDF in Notepad and could see that the objects have
been compressed with the FlateDecode filter. The object below is from the
first page of the PDF:

2 0 obj
<</Filter/FlateDecode/Length 5675>>stream
....
....
....
endstream
endobj
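
To illustrate what the /FlateDecode filter in the object above means at the
byte level: FlateDecode data is zlib/deflate data, so the bytes between
"stream" and "endstream" can in principle be inflated with the JDK's
java.util.zip classes. The following is only a minimal sketch under
assumptions, not PDFBox code: the class name FlateSketch and the helpers
inflate/deflate are hypothetical, the raw stream bytes are assumed to have
been extracted already, and no predictor (/DecodeParms) is assumed. It also
does not locate any AcroForm fields by itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper class for illustration; not part of PDFBox.
final class FlateSketch {

    // Inflates zlib-wrapped deflate data, i.e. the raw bytes found between
    // "stream" and "endstream" of a /FlateDecode stream (assumed to be
    // extracted already, with no predictor applied).
    static byte[] inflate(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return out.toByteArray();
    }

    // Compresses bytes in the same zlib format; used here only to create
    // sample input for the demonstration in main.
    static byte[] deflate(byte[] plain) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream def = new DeflaterOutputStream(out)) {
            def.write(plain);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample content standing in for a real page stream.
        byte[] plain = "BT /F1 12 Tf (Hello) Tj ET"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] restored = inflate(deflate(plain));
        System.out.println(new String(restored, StandardCharsets.US_ASCII));
    }
}
```

The sketch round-trips sample data because the real PDF cannot be shared; with
a real file, the input to inflate would be exactly /Length bytes taken from
after the "stream" keyword and its end-of-line marker.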

I see a FlateFilter.java class with encode and decode methods, which are in
turn used by public methods in COSStream.java, but I am unable to connect
the dots and Flate-decode the PDF.

My question is: how do I Flate-decode a PDF so that I can find all the
AcroForm fields within it? Any help or pointers would be highly appreciated.

Thanks,
Balaji
