Not necessarily!
You could create a new class that inherits from PdfReader and overrides
RemoveUnusedObjects() (unless it's private... but I think it's public).
Yep, sure is.
So your version of RemoveUnusedObjects can do all that stuff I suggested
the first time around.
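A minimal sketch of the subclass idea follows. It does not use iText itself: StandInReader is a hypothetical stand-in for PdfReader, and its "starts with orphan" rule is only a placeholder for the real pruning logic. The method name is written in Java style here; the exact casing depends on which port you use. The point it illustrates is that the override can snapshot the objects, let the base class prune, and keep whatever disappeared.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for PdfReader: loads objects, then prunes during construction.
class StandInReader {
    protected final Map<Integer, String> objects = new TreeMap<>();

    StandInReader(Map<Integer, String> loaded) {
        objects.putAll(loaded);
        removeUnusedObjects(); // mirrors the thread: pruning happens at load time
    }

    // Placeholder rule: objects whose text starts with "orphan" count as unused.
    public int removeUnusedObjects() {
        int before = objects.size();
        objects.values().removeIf(v -> v.startsWith("orphan"));
        return before - objects.size();
    }
}

class RecordingReader extends StandInReader {
    // No initializer on purpose: the override runs inside the superclass
    // constructor, and a field initializer would run afterwards and wipe the map.
    private Map<Integer, String> removed;

    RecordingReader(Map<Integer, String> loaded) {
        super(loaded);
    }

    @Override
    public int removeUnusedObjects() {
        Map<Integer, String> before = new LinkedHashMap<>(objects); // snapshot
        int count = super.removeUnusedObjects();
        before.keySet().removeAll(objects.keySet()); // keep only what was pruned
        if (removed == null) {
            removed = new LinkedHashMap<>();
        }
        removed.putAll(before);
        return count;
    }

    Map<Integer, String> getRemoved() {
        return removed;
    }
}
```

Because the pruning call happens while the base constructor is still running, the subclass has to create its map lazily inside the override. The same caution would apply to a real PdfReader subclass if, as described in this thread, the removal is triggered during loading.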
--Mark Storer
Senior Software Engineer
Cardiff.com
import legalese.Disclaimer;
Disclaimer<Cardiff> DisCard = null;
Autonomy Corp., an HP Company
________________________________
From: WMJ [mailto:[email protected]]
Sent: Friday, October 07, 2011 6:49 PM
To: [email protected]
Subject: Re: [iText-questions] What are those unused objects in a PDF document
Hello,
Thank you for your reply. They are really filtered, since the
RemoveUnusedObjects method is called when a PDF document is loaded. Thus
I've got to hack the source code and modify the ReadPdf method in the
PdfReader class and comment out the calling statement.
________________________________
From: Mark Storer <[email protected]>
Subject: RE: [iText-questions] What are those unused objects in a PDF document
Filtered? No. Inaccessible from the root following
various indirect references? Yep. That's why they're unused.
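To make "inaccessible from the root" concrete, here is a minimal sketch of a reachability walk. It models a document as a plain map from object numbers to the object numbers they reference; the class and method names are hypothetical, and this is an illustration of the idea rather than iText's actual implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model: each object number maps to the object numbers it references.
class ReachabilitySketch {

    // Returns the object numbers that cannot be reached from the root.
    static Set<Integer> findUnused(Map<Integer, List<Integer>> refs, int root) {
        Set<Integer> reachable = new HashSet<>();
        Deque<Integer> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            int current = pending.pop();
            if (!reachable.add(current)) {
                continue; // already visited; also guards against reference cycles
            }
            for (int target : refs.getOrDefault(current, List.of())) {
                pending.push(target);
            }
        }
        Set<Integer> unused = new TreeSet<>(refs.keySet());
        unused.removeAll(reachable);
        return unused;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> refs = Map.of(
            1, List.of(2, 3),   // root -> two children
            2, List.of(4),
            3, List.of(),
            4, List.of(2),      // back-reference to the parent (a cycle)
            5, List.of(6),      // nothing reachable from the root points here
            6, List.of());
        System.out.println(findUnused(refs, 1)); // prints [5, 6]
    }
}
```

Note that objects 5 and 6 reference each other's neighborhood but are still unused: being referenced by another orphan does not help, only a path from the root counts.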
If you really don't want to touch the source, you should
be able to step through all the objects before and after
RemoveUnusedObjects, and dump the ones that were removed after the fact.
Here's some pseudocode.
    Map<Integer, String> allObjects
    for each indirect object in the reader
        allObjects.put(obj number, string representation);
    RemoveUnusedObjects();
    for each indirect object in the reader
        allObjects.remove(obj number);
    for each entry in allObjects
        System.out.println(entry.value);
For PdfArrays and PdfDictionaries, you should be able to
use the toString() of the underlying ArrayList or HashMap. For
PdfStreams, you probably just want to note that it's a stream unless you
really want to show all the binary data of some image/font/content
stream. The last might be interesting, but the binary gobbledegook of
images and fonts probably won't tell you much.
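The snapshot-and-compare approach can be sketched in plain Java as follows. This is a model, not iText code: a List stands in for a PdfArray, a Map for a PdfDictionary, and a byte[] for stream data, and the class and method names are hypothetical. Streams are only summarized by length, as suggested above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: List = PDF array, Map = PDF dictionary, byte[] = stream data.
class RemovedObjectDump {

    // Streams are summarized; everything else uses its own toString().
    static String describe(Object obj) {
        if (obj instanceof byte[]) {
            return "stream (" + ((byte[]) obj).length + " bytes)";
        }
        return String.valueOf(obj);
    }

    // Snapshot every object as text, keyed by object number.
    static Map<Integer, String> snapshot(Map<Integer, Object> objects) {
        Map<Integer, String> result = new LinkedHashMap<>();
        objects.forEach((number, obj) -> result.put(number, describe(obj)));
        return result;
    }

    // Entries in 'before' whose object number is missing from 'after'.
    static Map<Integer, String> removedBetween(Map<Integer, String> before, Map<Integer, ?> after) {
        Map<Integer, String> removed = new LinkedHashMap<>(before);
        removed.keySet().removeAll(after.keySet());
        return removed;
    }

    public static void main(String[] args) {
        Map<Integer, Object> objects = new LinkedHashMap<>();
        objects.put(1, Map.of("Type", "Catalog"));
        objects.put(2, List.of(0, 0, 612, 792));
        objects.put(3, new byte[2048]);

        Map<Integer, String> before = snapshot(objects);
        objects.remove(3); // stands in for the reader pruning unused objects
        removedBetween(before, objects).values().forEach(System.out::println);
        // prints: stream (2048 bytes)
    }
}
```

Taking the snapshot as strings up front, rather than holding on to the objects, means the dump still works even if the reader discards or mutates the objects during removal.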
--Mark Storer
________________________________
From: WMJ [mailto:[email protected]]
Sent: Thursday, October 06, 2011 4:32 PM
To: Post all your questions about iText here
Subject: Re: [iText-questions] What are those
unused objects in a PDFdocument
It seems that those unused objects are already
filtered when the document is loaded and the only way to inspect them is
to hack the source code.
________________________________
Subject: RE: [iText-questions] What are
those unused objects in a PDF document
You could hack the source to dump a
string representation of those objects to System.out prior to removing
them. Shouldn't be too hard.
--Mark Storer
________________________________
From: WMJ [mailto:[email protected]]
Sent: Thursday, October 06, 2011 2:17 AM
To:
[email protected]
Subject: [iText-questions] What are
those unused objects in a PDF document
Hello,
I see that there's a RemoveUnusedObjects method in the PdfReader class.
It seems that there can be some objects which are not referenced by
other objects and hence are "unused".
Some PDF files, after being processed by RemoveUnusedObjects, shrink
dramatically, to 30% of the original size.
I wonder what's inside those unused objects.
Is it possible to find out what is in those removed objects?
WMJ