Re: Metadata situation and XMP support in Tika

2012-04-13 Thread Ray Gauss II
Yeah, I think that was the original motivation behind the Metadata class, a simple set of common properties for devs. I prefer the stricter aliasing approach to the list as I think it will be easier for devs who aren't intimately familiar with the standard they're working with. If a dev is wor

RE: Metadata situation and XMP support in Tika

2012-04-13 Thread Joerg Ehrlich
Hi Ray, Using ExifTool as an external parser is a good idea. Currently at Adobe we also use the XMPFiles C++ library in our Java projects to read/write metadata, although not as a Tika parser (yet). But that is one idea for the future. And yes, we should definitely coordinate on the metadata enhanc

RE: Metadata situation and XMP support in Tika

2012-04-13 Thread Joerg Ehrlich
Hi Ray, Yes, that is pretty much what I would propose. Aliasing is one idea, or you could simply have a list like the ones at the end of IPTC class which simply reference the namespace properties. I haven't got a strong opinion here. And I'm with you that I don't really see the benefit of includ

Re: Metadata situation and XMP support in Tika

2012-04-13 Thread Ray Gauss II
For the IPTC example specifically, all properties are defined using their respective namespaces, but some are defined 'inline' while others are aliases to the referenced standard, e.g. Property KEYWORDS = DublinCore.DC_SUBJECT; If I'm understanding you correctly, your proposal is to do that
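
To make the 'inline' versus alias distinction above concrete, here is a minimal sketch. It does not use Tika's real classes: Property, DublinCore and IPTC below are hypothetical stand-ins written only for this illustration, and the CITY constant and its "photoshop:City" name are assumed examples of an inline definition, not taken from the thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a metadata property: just a namespaced name.
final class Property {
    final String name;
    Property(String name) { this.name = name; }
}

// Namespace-style holder (hypothetical, modelled on the example in the message).
final class DublinCore {
    static final Property DC_SUBJECT = new Property("dc:subject");
    private DublinCore() {}
}

// Standard-style holder that mixes the two definition styles being discussed.
final class IPTC {
    // Alias: reuses the very same Property object as the referenced namespace.
    static final Property KEYWORDS = DublinCore.DC_SUBJECT;
    // Inline: defined directly here (assumed example, not from the thread).
    static final Property CITY = new Property("photoshop:City");
    private IPTC() {}
}

class AliasSketch {
    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        // Write through the standard-specific alias...
        metadata.put(IPTC.KEYWORDS.name, "tika");
        // ...and read through the namespace constant: both use the same key.
        System.out.println(metadata.get(DublinCore.DC_SUBJECT.name));
    }
}
```

Because the alias is the same object as the namespace constant, a value stored via one name is found via the other, which is the convenience the aliasing approach is after.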

RE: Metadata situation and XMP support in Tika

2012-04-13 Thread Joerg Ehrlich
Hi, Looking at the current constants defined for the Metadata map, the interfaces do not follow a common pattern. They are organized in interfaces for specific namespaces like DublinCore or XMPDM; there are interfaces for specific standards like IPTC or CreativeCommons; and there are interfaces