Hi,

On Fri, Sep 23, 2011 at 3:06 PM, Ken Krugler <kkrugler_li...@transpac.com> wrote:
> On Sep 23, 2011, at 3:24am, Jukka Zitting wrote:
>> In any case it would still be good to map RDFa <meta> tags also to the
>> Metadata object. To do that properly (and to open the way to better
>> XMP integration, my favourite TODO item :-), we'll probably need to
>> extend the Metadata class to handle things like namespaces and
>> structured values.
>
> That's what I was afraid of :)
>
> My head starts to hurt when I have to deal with namespaces and RDF.

From the client perspective the Metadata class should still provide a
simple key-value interface for basic things, just like the Tika facade
hides the more powerful constructs of the Parser and Detector
interfaces under a simplified API. Of course the implementation side
would be more complex...

> So I think I'll just patch my local copy to do the Q&D thing, and wait for
> someone with more XML/RDF-fu to deal with it properly.

Until Someone (TM, :-) does that, I'd be very happy to see the simple
property=xxx mapping you described added to HtmlParser. It's obviously
an improvement to the way Tika currently works, and I don't see any
major backwards compatibility issues caused by starting with a simple
solution like that and later on migrating to a more complete RDF-based
metadata model.

BR,

Jukka Zitting
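
For illustration only, the simple property=xxx mapping discussed above could
look roughly like the sketch below. It is not HtmlParser's actual code: the
class MetaPropertyMapper and its map() method are hypothetical names, a plain
java.util.Map stands in for Tika's Metadata object, and the assumption is
that a <meta> tag with property="..." is treated the same way as one with
name="...", with no namespace handling at all.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not part of Tika. A plain Map stands in for Metadata.
public class MetaPropertyMapper {

    /**
     * Flattens <meta> tags into simple key-value pairs.
     * Each input map holds the attributes of one <meta> element.
     * The key is taken from "name" if present, otherwise from "property";
     * tags without a key or without "content" are skipped.
     */
    public static Map<String, String> map(List<Map<String, String>> metaTags) {
        Map<String, String> metadata = new LinkedHashMap<>();
        for (Map<String, String> attrs : metaTags) {
            String content = attrs.get("content");
            if (content == null) {
                continue; // nothing to store
            }
            String key = attrs.get("name");
            if (key == null) {
                key = attrs.get("property"); // RDFa-style <meta property="...">
            }
            if (key != null) {
                metadata.put(key, content);
            }
        }
        return metadata;
    }

    public static void main(String[] args) {
        List<Map<String, String>> tags = List.of(
            Map.of("name", "description", "content", "A sample page"),
            Map.of("property", "og:title", "content", "Sample title"));
        System.out.println(map(tags));
    }
}
```

Because the key is stored as an opaque string (for example "og:title"), a
later move to a namespace-aware, RDF-based model could reinterpret the prefix
without changing what simple key-value clients see.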