Hi Mac,

you can use PDFTextStripper for this. It returns all the text from the pages.
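To get the text of one page at a time, you can call setStartPage(n) and setEndPage(n) on the PDFTextStripper before getText(document). Once you have the page text, picking the last name and building the file name is plain Java. Below is a rough sketch of that part only. It assumes each page contains a line like "Name: John Smith"; I do not know your real layout, so the pattern and the class and method names (PageFileNamer, lastName, fileName) are made up for illustration and will need adjusting.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PageFileNamer {
    // ASSUMPTION: every page has a line such as "Name: John Smith".
    // Change this pattern to match the real page layout.
    private static final Pattern NAME_LINE =
            Pattern.compile("(?m)^[ \\t]*Name:[ \\t]*(.+?)[ \\t]*$");

    // Returns the last whitespace-separated word on the "Name:" line, if any.
    static Optional<String> lastName(String pageText) {
        Matcher m = NAME_LINE.matcher(pageText);
        if (!m.find()) {
            return Optional.empty();
        }
        String fullName = m.group(1).trim();
        if (fullName.isEmpty()) {
            return Optional.empty();
        }
        String[] parts = fullName.split("\\s+");
        return Optional.of(parts[parts.length - 1]);
    }

    // Builds "<LastName>.pdf"; falls back to "page-<n>.pdf" when no name is found.
    // Characters other than letters, digits, '_' and '-' are replaced by '_'.
    static String fileName(String pageText, int pageNumber) {
        return lastName(pageText)
                .map(n -> n.replaceAll("[^\\p{L}\\p{N}_-]", "_") + ".pdf")
                .orElse("page-" + pageNumber + ".pdf");
    }

    public static void main(String[] args) {
        // In real use, pageText would come from PDFTextStripper.getText(document)
        // after setStartPage(n) and setEndPage(n).
        String pageText = "Employee record\nName: John Smith\nDept: Sales\n";
        System.out.println(fileName(pageText, 1));
        System.out.println(fileName("no name on this page", 2));
    }
}
```

The single-page documents themselves can be produced with PDFBox's Splitter class (split(document) returns one PDDocument per page) and written out with save(fileName). Note that two people with the same last name would end up with the same file name, so you may want to append the first name or the page number in that case.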
Best regards

Juraj Lonc
GI-BÓN, spol. s r.o.
Management Systems
Bratislavská 11
SK - 010 01 Žilina
Tel: +421-41-564 3437-8
Mobil: +421-907-815 147
Fax: +421-41-564 3439
e-mail: jl...@gi-bon.sk
homepage: http://www.gi-bon.sk

From: Mac P <pon...@hotmail.com>
To: pdfbox <users@pdfbox.apache.org>
Date: 01. 09. 2012 10:02
Subject: How can I manipulate text in PDF'd by using PDFBox

Hello Forum,

Is there any way to split a master PDF file consisting of many pages into separate pages based on the content or keywords in each page? Each page has the person's first and last name. I would like to grep the last name and write a script that separates each page and turns it into a new PDF file, with the last name as part of the file name instead of the sequential page number at the end of each file name.

I know PDFs are binary documents. Are there any tools to look up the last names and manipulate them that way?

Thanks
Mac