On 06.08.2008 14:45:23 Niall Pemberton wrote:
> On Tue, Aug 5, 2008 at 3:58 PM, Jeremias Maerki <[EMAIL PROTECTED]> wrote:
> > The outline is ok. As an alternative ..incubator/pdfbox could be moved
> > to ..incubator/pdfbox/main if we don't merge all three.
> >
> > As noted some time ago, XML Graphics Commons (XGC) has an XMP facility
> > that's more or less equivalent to JempBox, but it's not yet available as
> > a separate JAR (with only XMP stuff). Gut feeling is that XGC is
> > slightly stronger than JempBox but I'm biased. OTOH, I believe that XGC
> > has a few things that PDFBox could also use in time (like the image
> > loader framework [1][2]). Anyway, one of my priorities, now that PDFBox
> > is set up, is to write a set of Metadata adapters which enable XMP
> > reading/writing against XGC's XMP stuff and Adobe's XMP library. If in
> > any way possible, I'd really like to consolidate XMP handling inside ASF
> > projects. I'll do that as a proposal with code. Whether this is accepted
> > by the various communities is a different story. I do envision an Apache
> > Commons component. Please stop me if this is a silly idea. Of course,
> > XGC could also simply be reused but the XMP classes might not be as
> > complete as Adobe's XMP toolkit.
> >
> > [1] http://xmlgraphics.apache.org/commons/image-loader.html
> > [2] Using the image loader framework, PDFBox could do things like
> > loading SVG and WMF images (if FOP and Batik are in the classpath) and
> > embed them in the PDF. Or it can easily support embedding Barcodes when
> > I've written image converters for Barcode4J. MathML support with JEuclid
> > etc. etc. This stuff is pretty powerful.
> >
> > Concerning the font stuff: FOP has extensive font code that is destined
> > to move to XGC (when I finally get my affairs together to start it).
> > Batik, too, has code to read TrueType fonts. So, some overlap with FOP
> > is already there. FontBox adds to that.
> > That said, I'm not sure
> > consolidation is very easy since besides me I don't see many other
> > people who would help to push that. I have to choose my priorities
> > carefully. I'm stretched thin already.
> >
> > FontBox is sufficiently separate from the whole PDF topic that it makes
> > sense to keep it as a separate subproject. This encourages a clean
> > separation. If anyone can make use of FontBox outside the PDFBox context,
> > all the better.
> >
> > I'm currently leaning towards this: Consolidate XMP stuff from XGC and
> > JempBox into an Apache Commons component and discard JempBox. Or just
> > use XGC. Leave FontBox as subproject to PDFBox with a separate JAR as
> > dependency for PDFBox.
>
> If the desire is for JempBox to graduate to Apache Commons, then it
> would be best to raise this on the Commons dev list - the sooner the
> better. With my *commons* hat on, I imagine the main issues will be:
> 1) the JempBox committers being unknown by the commons team
> 2) Are the JempBox committers likely to stick around to continue to
>    support it
>
> If we discuss the possibility of JempBox graduating to Commons early
> on with Commons devs and invite anyone interested to monitor
> incubation here then I think it will help when the time comes to ask
> Commons to accept JempBox as a new Commons component.
>
> Also if graduation to Commons is likely, then this is a reason to hold
> off on a package rename for JempBox.

Hmm, I think I may have expressed myself badly. I apologize. What I was
proposing is to basically retire JempBox but use it as a reference for
what XMP schema adapters are needed for a metadata component that could
serve both PDFBox and the XML Graphics project (and potentially others
like Tika). I'd like to approach this in two layers:

1. Underlying XMP data model (i.e. Adobe's XMP toolkit or XML Graphics
   Commons' XMP package).
2. XMP namespace adapters (for Dublin Core, PDF/A etc. etc.) for which
   I'd write implementations against both Adobe's XMP toolkit and XGC's
   XMP stuff.

I would write this off-line (within the next two weeks if possible) and
then present it as a proposal (at the risk of doing something in vain).
Mostly this is just rewriting some of the code in XGC I already have and
decoupling the two layers a bit. Nothing big but pretty useful and
versatile in the end, I believe. If it helps I can certainly write a
proposal (without code) beforehand for the Commons Wiki. I've also
thought about just requesting a lab and doing it there. Feedback welcome.

> Niall
>
> > I'm eager to hear other opinions and ideas.
> >
> > On 05.08.2008 16:21:40 Jukka Zitting wrote:
> >> Hi,
> >>
> >> Just a quick outline of the SVN structure I came up with:
> >>
> >> * The main PDFBox codebase has its trunk,tags,branches structure right
> >> below https://svn.apache.org/repos/asf/incubator/pdfbox.
> >>
> >> * The FontBox codebase has a separate trunk,tags,branches structure
> >> below https://svn.apache.org/repos/asf/incubator/pdfbox/fontbox.
> >>
> >> * The JempBox codebase has a separate trunk,tags,branches structure
> >> below https://svn.apache.org/repos/asf/incubator/pdfbox/jempbox.
> >>
> >> Note that the FontBox and JempBox codebases still need to be cleaned
> >> for svn:eol-style settings, etc.
> >>
> >> Should we keep FontBox and JempBox as separate codebases or perhaps
> >> merge them into the main PDFBox codebase?
> >> In other words, are there
> >> many (potential) users for those projects outside PDFBox?
> >>
> >> If we keep FontBox and JempBox separate, then I guess we should also
> >> set up separate Jira projects for them and start planning for the
> >> respective initial org.apache.* releases.
> >>
> >> BR,
> >>
> >> Jukka Zitting
> >
> >
> > Jeremias Maerki
> >

Jeremias Maerki
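
For illustration only, the two-layer split described in the reply above (an underlying XMP data model plus namespace adapters on top of it) might look roughly like the sketch below. All type and method names here (XmpStore, InMemoryXmpStore, DublinCoreAdapter) are hypothetical and are not taken from JempBox, XML Graphics Commons or Adobe's XMP toolkit. The in-memory store merely stands in for a real backend, and dc:title is simplified to a plain string although XMP models it as a language alternative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Layer 1 (hypothetical): the smallest possible view of an XMP data model.
// A real implementation would delegate to XGC's XMP package or to
// Adobe's XMP toolkit instead of exposing plain strings.
interface XmpStore {
    String getProperty(String namespaceUri, String name);
    void setProperty(String namespaceUri, String name, String value);
}

// Stand-in backend so the sketch runs without any external library.
final class InMemoryXmpStore implements XmpStore {
    private final Map<String, String> values = new LinkedHashMap<>();

    @Override
    public String getProperty(String namespaceUri, String name) {
        return values.get(key(namespaceUri, name));
    }

    @Override
    public void setProperty(String namespaceUri, String name, String value) {
        values.put(key(namespaceUri, name), value);
    }

    // Combine namespace and local name into one map key.
    private static String key(String namespaceUri, String name) {
        return namespaceUri + "|" + name;
    }
}

// Layer 2 (hypothetical): a namespace adapter that knows the Dublin Core
// namespace and property names but nothing about the backend.
final class DublinCoreAdapter {
    static final String NS = "http://purl.org/dc/elements/1.1/";

    private final XmpStore store;

    DublinCoreAdapter(XmpStore store) {
        this.store = store;
    }

    // Simplification: real XMP stores dc:title as a language alternative.
    String getTitle() {
        return store.getProperty(NS, "title");
    }

    void setTitle(String title) {
        store.setProperty(NS, "title", title);
    }
}
```

With this shape, the same DublinCoreAdapter could be handed a different XmpStore implementation per backend, which is the decoupling the two layers are meant to provide.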
