The concept of a page manager is a useful one, and it makes sense to me to group the functionality you suggest with what I called a page manager, which handles image reuse, line-breaking, and page-breaking. A new level of abstraction is necessary because some things have to be cached before they are written to the underlying stream: lines are cached while the line-breaking is being calculated, and pages are cached while the page-breaking is being calculated. Here is the PageManager code I submitted last week. It doesn't import pages from other PDFs, but if people decide to incorporate this code into PDFBox, then I think your functionality would belong on this same PageManager: https://issues.apache.org/jira/browse/PDFBOX-1527
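To make the caching idea concrete, here is a minimal sketch in plain Java. It is not the code attached to PDFBOX-1527 and it uses no PDFBox API. The class name `SketchPageManager`, its methods, and the character-count limit standing in for real font-width measurement are all assumptions for illustration, and a list of committed pages stands in for the underlying stream.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a two-level cache: words -> lines -> pages. */
class SketchPageManager {
    private final int maxCharsPerLine; // assumption: stand-in for measured text width
    private final int maxLinesPerPage; // assumption: stand-in for measured page height

    // Line cache: holds the current line until its break point is known.
    private final StringBuilder currentLine = new StringBuilder();
    // Page cache: holds finished lines until the page break is known.
    private final List<String> currentPage = new ArrayList<>();
    // Stand-in for the underlying stream: pages land here only when complete.
    private final List<List<String>> committedPages = new ArrayList<>();

    SketchPageManager(int maxCharsPerLine, int maxLinesPerPage) {
        this.maxCharsPerLine = maxCharsPerLine;
        this.maxLinesPerPage = maxLinesPerPage;
    }

    /** Adds text word by word, breaking the line when the next word will not fit. */
    void addText(String text) {
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // blank input yields one empty token
            }
            int needed = currentLine.length() == 0
                    ? word.length()
                    : currentLine.length() + 1 + word.length();
            if (needed > maxCharsPerLine && currentLine.length() > 0) {
                flushLine();
            }
            if (currentLine.length() > 0) {
                currentLine.append(' ');
            }
            currentLine.append(word);
        }
    }

    /** Moves the cached line into the page cache; breaks the page when it is full. */
    private void flushLine() {
        currentPage.add(currentLine.toString());
        currentLine.setLength(0);
        if (currentPage.size() == maxLinesPerPage) {
            flushPage();
        }
    }

    /** Commits the cached page to the stand-in for the underlying stream. */
    private void flushPage() {
        committedPages.add(new ArrayList<>(currentPage));
        currentPage.clear();
    }

    /** Flushes whatever is still cached and returns all committed pages. */
    List<List<String>> finish() {
        if (currentLine.length() > 0) {
            flushLine();
        }
        if (!currentPage.isEmpty()) {
            flushPage();
        }
        return committedPages;
    }
}
```

The point of the two caches is that nothing reaches the output until the break that ends it has been decided. An import method like the ones you propose could, under the same assumption, commit an imported page through the same path as a generated one.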
On Fri, Mar 8, 2013 at 4:52 AM, Maruan Sahyoun <sahy...@fileaffairs.de> wrote:
> Hi,
>
> currently there are several areas in pdfbox where pages are imported from
> pdfs and reused to form new content e.g. Overlay, OverlayPDF, PDFMerger,
> PDFSplit. Some of these do have their own ways to handle the actual import
> some do reuse utility classes. For overlay purposes we need an imported page
> as xObject for splitting that's not necessary.
>
> As I do not have a complete overview about the lib would it make sense to
> come up with something like a PageManager to handle these tasks e.g.
> PageManager.importPage(PDPage page), PageManager.importPage(PDDocument
> pdDocument, int pageNumber) … or is that not needed? Is a call to PDPage
> page.getContents() reliable to get the content stream or does it have to be
> done by iterating and copying the individual parts as has been done in
> OverlayPDF? Could that be enhanced? Shall we handle page imports always as
> xObjects?
>
> Thanks for your feedback on these open questions.
>
> Maruan Sahyoun

--
Glen K. Peterson
(828) 393-0081