Many thanks for this, Bruno - the code works well.  However, it does not
identify the indirect object I am looking to deal with!

The object I am interested in is the linearization dictionary (the one that
carries the /Linearized key).  This is added to the file when the file is
Saved As linearized but is not referenced anywhere in the file.  Hence my
trying the brute force method!

See below for an extract from the PDF file, the code, and its output.

The code returns null for object 24 (the linearization dictionary).

My question is simple - is there any way I can retrieve the linearization
dictionary as an object (direct or indirect)?  Or do I have to treat the
file as a stream, search for the '/Linearized' key and deal with its values
manually? [Note: it is no problem to do this, but it would be nicer to deal
with this dictionary as an object if possible.]
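For reference, the manual fallback mentioned above could look something like
the sketch below.  It is plain Java with no iText calls; the class name
LinearizationScanner and its parse method are made up for illustration.  It
assumes the dictionary sits entirely within the first 1024 bytes of the file
(which the PDF specification requires of a linearized file) and that it
contains no nested dictionaries.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: scans the head of a PDF for the linearization dictionary.
class LinearizationScanner {

    // "<objNum> <genNum> obj << ... >>" (assumes no nested dictionaries inside)
    private static final Pattern OBJECT =
        Pattern.compile("(\\d+)\\s+(\\d+)\\s+obj\\s*<<(.*?)>>", Pattern.DOTALL);

    // "/Key value", where value is either a [ ... ] array or a single token
    private static final Pattern ENTRY =
        Pattern.compile("/(\\w+)\\s*(\\[[^\\]]*\\]|[^/\\[\\]\\s]+)");

    // Returns the dictionary entries as strings, or an empty map if none is found.
    static Map<String, String> parse(byte[] head) {
        // ISO-8859-1 maps every byte to one char, so binary bytes cannot break decoding.
        int length = Math.min(head.length, 1024);
        String text = new String(head, 0, length, StandardCharsets.ISO_8859_1);

        Map<String, String> entries = new LinkedHashMap<>();
        Matcher object = OBJECT.matcher(text);
        while (object.find()) {
            String body = object.group(3);
            if (!body.contains("/Linearized")) {
                continue; // some other dictionary, keep looking
            }
            Matcher entry = ENTRY.matcher(body);
            while (entry.find()) {
                entries.put(entry.group(1), entry.group(2).trim());
            }
            break;
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java LinearizationScanner <file.pdf>");
            return;
        }
        // Reads the whole file for simplicity; parse() only looks at the first 1024 bytes.
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        Map<String, String> entries = parse(bytes);
        for (Map.Entry<String, String> e : entries.entrySet()) {
            System.out.println("/" + e.getKey() + " = " + e.getValue());
        }
    }
}
```

The values come back as raw strings (for example the /H array is returned
as the text between and including its brackets), so any numeric conversion
is left to the caller.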

PDF File extract
-----------------
%PDF-1.6
%âãÏÓ
24 0 obj
<</Linearized 1/L 9873/O 26/E 5001/N 1/T 9560/H [ 456 164]>>
endobj
...

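Code
-----
// Brute force: ask the reader for every object number in the xref table.
// 'reader' is the PdfReader already opened on the file.
StringBuilder sb = new StringBuilder ();
PdfObject obj;
for (int i = 0; i < reader.XrefSize; i++) {
  obj = reader.GetPdfObject (i);
  // Objects the reader cannot resolve come back as null and are skipped.
  if (obj != null) {
    sb.AppendLine ("obj  " + i + "  =  " + obj.ToString ());
  }
}
Console.WriteLine (sb.ToString ());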

Output
--------
...
obj  22  =  Dictionary
obj  23  =  Dictionary
obj  25  =  Dictionary of type: /Catalog
obj  26  =  Dictionary of type: /Page
...

Kind regards
William

From: 1T3XT BVBA [mailto:[email protected]] 
Sent: 26 September 2011 15:41
To: Post all your questions about iText here
Subject: Re: [iText-questions] Retrieving all Indirect Object

On 26/09/2011 16:33, William Bell wrote: 

Is there a way to retrieve an array or collection of all Indirect Objects in
a PDF document such that they can be stepped through one at a time?  For
example:

PdfReader reader = new PdfReader(file);
foreach (IndirectObject io in reader)
{
.......
}


So you want to inspect a PDF with brute force ;-)
That's indeed the best way to find out which objects are inside the PDF.
Because you can have objects that aren't referenced from anywhere.
They're just there, without any reason.

There's an example of the brute force approach on p442 of the book:
PdfObject object;
for (int i = 1; i < reader.getXrefSize(); i++) {
   object = reader.getPdfObject(i);
}
Note that there can be gaps in the numbering of the objects,
so you shouldn't be surprised if object returns null in many cases.
