Re: Handling page imports

2013-03-10 Thread Andreas Lehmkuehler

Hi,

On 10.03.2013 14:25, Glen Peterson wrote:

BTW I quickly looked at your contribution. You put a lot of effort into what 
was a completely missing part!


Thanks for taking the time to look, and for the compliment - you just
made my day!


PageManager I was talking about is more low level than yours which is more 
towards a LayoutManager


Ooh, good feedback.  Since your email, I'm planning to rename it.


A higher level API like yours could then rely on the low level API.  There 
might be some overlap though.


Yes, I could see it being completely separate, though there is a
strong dependency.  The dependency made me think that it belongs with
PDFBox, especially since the collection of features called iText
includes layout-manager functionality.

It occurs to me that you guys are getting ready for a release and
might not want to consider adding a whole new feature until you start
a new chunk of development.  Also, it really can be released

Correct, unfortunately it is too late and I don't want to postpone the release,
as a lot of people are waiting for it.


completely separately from PDFBox and you are currently breaking
PDFBox up into some smaller projects.  I'm thinking now of calling the
project com.planbase.pdf.LayoutManager or some such thing and hosting
it on GitHub under the Apache license.  That will let me track it in
source control and make it easier for me to move forward with it
without cluttering up your mailing list.  If people use it, it's just
not that hard for them to have to change a few imports to reflect it
being moved to a different project.

+1


You guys know it exists, and you know I'm excited about incorporating
it into PDFBox if you want it there.  But I think that for now, it's
probably best to consider it a separate project until we are all ready
to put the two together if we decide that's a good idea.  When I
actually make the move, I'll remove the code from the JIRA issue and
replace it with a link to the GitHub project.

I really appreciate your offer! There are a lot of people looking for such
features. I guess it would fit perfectly into our next major release, 2.0. Until
then you might want to have a look at our license agreements [1]. Your
contribution would be a substantial change to our codebase and we would ask you
to sign an ICLA/CCLA. Feel free to ask if anything is unclear.



Thanks!

-Glen K. Peterson



Thanks again for your offer and your interest in PDFBox!

BR
Andreas Lehmkühler

[1] http://www.apache.org/licenses/#clas


Handling page imports

2013-03-08 Thread Maruan Sahyoun
Hi,

currently there are several areas in pdfbox where pages are imported from pdfs 
and reused to form new content, e.g. Overlay, OverlayPDF, PDFMerger, PDFSplit. 
Some of these have their own ways to handle the actual import; some reuse 
utility classes. For overlay purposes we need an imported page as an xObject; 
for splitting that's not necessary.

As I do not have a complete overview of the lib: would it make sense to come 
up with something like a PageManager to handle these tasks, e.g. 
PageManager.importPage(PDPage page), PageManager.importPage(PDDocument 
pdDocument, int pageNumber) …, or is that not needed? Is a call to PDPage 
page.getContents() reliable to get the content stream, or does it have to be 
done by iterating and copying the individual parts, as has been done in 
OverlayPDF? Could that be enhanced? Shall we always handle page imports as 
xObjects?
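
A minimal sketch of how the two proposed importPage signatures might share one 
code path. Everything here is hypothetical: Page and Document are simplified 
stand-ins, not the PDFBox PDPage and PDDocument classes, and the 1-based page 
number is an assumption. Nothing like this exists in PDFBox as written.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: none of these types are PDFBox classes.
class PageImportSketch {

    /** Stand-in for a page: just the bytes of its content stream. */
    static final class Page {
        final byte[] content;
        Page(byte[] content) { this.content = content.clone(); }
    }

    /** Stand-in for a document: an ordered list of pages. */
    static final class Document {
        final List<Page> pages = new ArrayList<>();
    }

    /** Hypothetical low-level manager that owns the single import code path. */
    static final class PageManager {
        private final Document target;

        PageManager(Document target) { this.target = target; }

        /** Copies the page into the target document and returns the copy. */
        Page importPage(Page page) {
            Page copy = new Page(page.content);
            target.pages.add(copy);
            return copy;
        }

        /** Convenience overload; pageNumber is assumed to be 1-based. */
        Page importPage(Document source, int pageNumber) {
            if (pageNumber < 1 || pageNumber > source.pages.size()) {
                throw new IndexOutOfBoundsException("No page " + pageNumber);
            }
            return importPage(source.pages.get(pageNumber - 1));
        }
    }

    public static void main(String[] args) {
        Document source = new Document();
        source.pages.add(new Page(new byte[] {1, 2, 3}));

        Document target = new Document();
        PageManager manager = new PageManager(target);
        manager.importPage(source, 1);

        System.out.println("Pages in target: " + target.pages.size());
    }
}
```

With both overloads routed through importPage(Page), callers such as an overlay 
or a splitter would no longer need their own copying logic. Whether the copy 
becomes a regular page or an xObject is the open question above.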

Thanks for your feedback on these open questions.

Maruan Sahyoun


Re: Handling page imports

2013-03-08 Thread Glen Peterson
The concept of a page-manager is a useful one, and it makes sense to
me to group the functionality you suggest with the stuff I called a
page manager (handles reusing images, line-breaking, and
page-breaking).  A new level of abstraction (a page manager) is
necessary in order to cache some things before writing them to the
underlying stream (cache lines as the line-breaking is being
calculated, cache pages as the page-breaking is being calculated).
Here is the PageManager code I submitted last week.  It doesn't import
pages from other PDFs, but if people decide to incorporate this code
into PDFBox, then I think your functionality would belong on this same
PageManager:
https://issues.apache.org/jira/browse/PDFBOX-1527
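
To illustrate the caching idea mentioned above (hold a line back until the 
line break is known, then write it), here is a rough sketch. It is not the 
code from PDFBOX-1527: LineBreaker is a made-up name, width is measured in 
characters rather than font units, and a List stands in for the underlying 
stream.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: not the PageManager code attached to the JIRA issue.
class LineCacheSketch {

    /** Buffers words and emits a line only once the next word no longer fits. */
    static final class LineBreaker {
        private final int maxChars;
        private final List<String> out; // stand-in for the underlying stream
        private final StringBuilder current = new StringBuilder();

        LineBreaker(int maxChars, List<String> out) {
            this.maxChars = maxChars;
            this.out = out;
        }

        /** Adds a word; a word longer than maxChars gets its own overlong line. */
        void addWord(String word) {
            int needed = current.length() == 0
                    ? word.length()
                    : current.length() + 1 + word.length();
            if (needed > maxChars && current.length() > 0) {
                flush(); // the cached line is complete, so write it out
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(word);
        }

        /** Writes the cached partial line, if any, to the output. */
        void flush() {
            if (current.length() > 0) {
                out.add(current.toString());
                current.setLength(0);
            }
        }
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        LineBreaker breaker = new LineBreaker(10, lines);
        for (String word : "the quick brown fox".split(" ")) {
            breaker.addWord(word);
        }
        breaker.flush();
        lines.forEach(System.out::println);
    }
}
```

Page-breaking could follow the same pattern one level up: cache finished lines 
until the page is full, then write the page.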



--
Glen K. Peterson
(828) 393-0081


Re: Handling page imports

2013-03-08 Thread Maruan Sahyoun
Hi Glen,

thanks for your feedback. I was thinking along the lines of generalizing how to 
deal with page imports, so the PageManager I was talking about is more low level 
than yours, which is more towards a LayoutManager. If you look at Overlay.java, 
OverlayPDF.java … they all handle it slightly differently (as I did in some of our 
projects). It might also be possible to add functions to change the page order 
… A higher level API like yours could then rely on the low level API. There 
might be some overlap though. BTW I quickly looked at your contribution. You 
put a lot of effort into what was a completely missing part!

With kind regards - Maruan
