It looks like you have all the bases covered. The only way you might be more thorough would be to iterate through all the indirect objects by object number:

    int numObjs = reader.getXrefSize();
    for (int i = 0; i < numObjs; ++i) {
        PdfObject curObj = reader.getPdfObject(i);
        if (curObj == null) {
            continue; // free xref entries (such as object 0) come back null
        }
        // no need to worry about indirect references this way
        // if curObj is a dict, traverse it as a dict
        // if it's an array, traverse it as an array
    }

You can then modify your traverse functions to ignore indirect references (because you either will visit them or already have). You don't need to worry about revisiting objects either, so you can ditch the "traversed" set. Following these suggestions (plus a little code cleanup), traversePdfDictionary would look like this:

    public static boolean traversePdfDictionary(final PdfDictionary dict, /*final Set traversed,*/ boolean containsJs) {
        /* if (traversed.contains(dict)) {
            return containsJs;
        } else {
            traversed.add(dict);
        }*/

        for (Object key : dict.getKeys()) {
            PdfObject data = (PdfObject) dict.get((PdfName) key);
            /** removed per above suggestion
            if (data instanceof PRIndirectReference) {
                data = resolveReference((PRIndirectReference) data);
            } **/

            if (PdfName.JS.equals(key)) {
                // you could use a single empty PdfString over and over here, throughout the entire document
                dict.put(PdfName.JS, new PdfString(""));
                containsJs = true;
            }
            // Parents are ALWAYS indirect references, so the object-by-object stepping will find them.
            // Nonetheless, here's some code cleanup to give you ideas in the future:
            /* (PdfName.PARENT.equals(key)) {
                if (data.isDictionary()) {
                    containsJs |= traversePdfDictionary((PdfDictionary) data, traversed, containsJs);
                } else if (data.isArray()) {
                    containsJs |= traversePdfArray((PdfArray) data, traversed, containsJs);
                }
            } */
        }
        return containsJs;
    }

Furthermore, the only reason I can think of to traverse arrays would be to stomp on the document-level JavaScript, but that can be done more directly:

    reader.getCatalog().getAsDictionary(PdfName.NAMES).remove(PdfName.JAVASCRIPT);

There's a potential NPE in there (/Names is optional), but I'll leave that as an exercise for the reader. You could also remove any "AA" entries (from pages and annotations).

Finally, you'll want to call reader.removeUnusedObjects() prior to stamper.close() (line 149-ish). Even if you don't change anything else, the value of a /JS can be a stream reference, so blanking the entry leaves that stream behind as an orphan. With all these other suggestions, you're looking at Quite A Few orphaned objects loitering around your file.

We end up sticking around 100k of boilerplate script into every PDF form we generate, plus a fair amount of overhead from the objects wrapping all that script (a test I just ran shows the total savings at 75kb). Without the call to removeUnusedObjects(), your program wouldn't noticeably change the size of the file when those scripts are in streams.

--Mark Storer
  Senior Software Engineer
  Cardiff.com

#include <disclaimer>
typedef std::Disclaimer<Cardiff> DisCard;

> -----Original Message-----
> From: Andrea Lombardoni [mailto:andrea.lombard...@oneoverzero.net]
> Sent: Friday, October 09, 2009 8:10 AM
> To: itext-questions@lists.sourceforge.net
> Subject: [iText-questions] PDF Javascript Stripper
>
> I just released a small project that uses iText: PDF Javascript Stripper
>
> It takes a PDF as input and tries to remove (better, nullify) all the
> Javascript code inside it.
>
> My company uses it to be sure that we do not produce/deliver PDF documents
> with malicious content.
>
> I hope it can be interesting to other people as well.
> You can download it here:
>
> https://sourceforge.net/projects/pdfjavascriptst/
>
> ------------------------------------------------------------------------------
> Come build with us! The BlackBerry(R) Developer Conference in SF, CA
> is the only developer event you need to attend this year. Jumpstart your
> developing skills, take BlackBerry mobile applications to market and stay
> ahead of the curve. Join us from November 9 - 12, 2009. Register now!
> http://p.sf.net/sfu/devconference
> _______________________________________________
> iText-questions mailing list
> iText-questions@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/itext-questions
>
> Buy the iText book: http://www.1t3xt.com/docs/book.php
> Check the site with examples before you ask questions: http://www.1t3xt.info/examples/
> You can also search the keywords list: http://1t3xt.info/tutorials/keywords/
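
For anyone who wants to try the object-by-object sweep from the reply without a PDF at hand, here is a minimal sketch on a toy object model. It is not iText code: the class JsSweepSketch, the Ref type, and the sweep* methods are all hypothetical names made up for illustration, with a Map standing in for a dictionary, a List for an array, and a list index for the object number. It only shows the shape of the idea: visit every top-level object once, descend into direct children, and skip indirect references because the outer loop reaches their targets anyway.

```java
import java.util.List;
import java.util.Map;

public class JsSweepSketch {

    /** Hypothetical stand-in for an indirect reference: just an object number. */
    public static final class Ref {
        public final int number;
        public Ref(int number) { this.number = number; }
    }

    /** Visits each top-level object exactly once, like looping over object numbers. */
    public static boolean sweepAll(List<Object> objects) {
        boolean found = false;
        for (Object obj : objects) {
            found |= sweepValue(obj); // null entries (free slots) fall through harmlessly
        }
        return found;
    }

    /** Dispatches on the kind of value; references and scalars are ignored. */
    @SuppressWarnings("unchecked")
    public static boolean sweepValue(Object value) {
        if (value instanceof Map) {
            return sweepDict((Map<String, Object>) value);
        }
        if (value instanceof List) {
            return sweepArray((List<Object>) value);
        }
        return false; // Ref, string, number or null: nothing to descend into
    }

    /** Blanks every "JS" entry and descends into direct (non-reference) children. */
    public static boolean sweepDict(Map<String, Object> dict) {
        boolean found = false;
        for (Map.Entry<String, Object> entry : dict.entrySet()) {
            if ("JS".equals(entry.getKey())) {
                entry.setValue(""); // replacing a value is safe while iterating
                found = true;
            } else {
                found |= sweepValue(entry.getValue());
            }
        }
        return found;
    }

    /** Descends into each direct element of an array. */
    public static boolean sweepArray(List<Object> array) {
        boolean found = false;
        for (Object item : array) {
            found |= sweepValue(item);
        }
        return found;
    }
}
```

Since references are never followed, any cycle (a /Parent pointing back up the tree, say) is cut at the reference, which is why no "traversed" set is needed here. In this sketch a /JS whose value is a reference is blanked in the dictionary while the referenced object stays in the list, which mirrors the orphaned-stream situation that removeUnusedObjects() is meant to clean up.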