Hi Antoni,

> Chris Mattmann has written that it's necessary to
> strike a balance between functionality and over-bloating. From my own
> experience I can say that it is VERY difficult :).
Well, from my own experience I can tell you that it *is* difficult, but certainly doable. I've been working with different forms of metadata (Dublin Core, ISO 11179, RDF, OWL, etc.), been involved in international standards organizations (CCSDS, ISO) that are developing metadata standards, and worked on several projects dealing with metadata (Object Oriented Data Technology [OODT], Semantic Web for Earth and Environmental Terminology [SWEET]) in different domains (earth science, planetary science, space science, cancer research, etc.) for almost 7 years now.

Sure, there are a lot of standards, and people can talk about coming up with a one-size-fits-all, cookie-cutter library for these capabilities. However, I think it's important to understand that developing such libraries (rather than striking the balance) is, in my mind, the most difficult problem to tackle. In the end, all we can do as software developers, as people trying to standardize metadata, is to develop core libraries and functions that others can build upon for their own needs.

I don't think the Tika folks should be in the business of developing high-capability metadata libraries, because in the end, just as everyone is saying, those need to be tailored to a specific use case or domain. On the other hand, I think it's a much more attainable goal to come up with a simple, easy-to-use metadata library that folks who need higher-level capability (inference, multi-language support, representation, etc.) can build upon for their own needs. In other words, someone shouldn't have to rewrite the ability to have metadata keys with multiple values associated with them, along with ways to map between the keys. However, it's reasonable that someone may need to rewrite the ability to represent metadata in RDF (versus OWL), or to rewrite the ability to do language translation (e.g., using XMP versus Adobe's toolkit), that type of thing.
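To make the "simple core" idea concrete, here is a minimal sketch of what such a library might look like: metadata keys with multiple values, plus a way to map (alias) one key name onto another. This is a hypothetical illustration, not Tika's actual Metadata API; the class and method names (`SimpleMetadata`, `addAlias`, etc.) are my own invention for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a minimal metadata core:
// multi-valued keys plus simple key-name mapping.
public class SimpleMetadata {
    // each key holds a list of values, since metadata is often multi-valued
    private final Map<String, List<String>> data = new HashMap<>();
    // alias map, e.g. "dc:creator" -> "author"
    private final Map<String, String> aliases = new HashMap<>();

    /** Map one key name onto another (e.g. a Dublin Core name onto a local one). */
    public void addAlias(String from, String to) {
        aliases.put(from, to);
    }

    private String resolve(String key) {
        return aliases.getOrDefault(key, key);
    }

    /** Add a value under a key, resolving aliases first. */
    public void add(String key, String value) {
        data.computeIfAbsent(resolve(key), k -> new ArrayList<>()).add(value);
    }

    /** All values for a key (empty list if none). */
    public List<String> getValues(String key) {
        return data.getOrDefault(resolve(key), new ArrayList<>());
    }

    /** First value for a key, or null if none. */
    public String get(String key) {
        List<String> vs = getValues(key);
        return vs.isEmpty() ? null : vs.get(0);
    }

    public static void main(String[] args) {
        SimpleMetadata md = new SimpleMetadata();
        md.addAlias("dc:creator", "author");
        md.add("author", "Jane Doe");
        md.add("dc:creator", "John Smith"); // lands under "author" via the alias
        System.out.println(md.getValues("author"));
    }
}
```

Higher-level concerns (RDF vs. OWL representation, XMP serialization, inference) would then be layered on top of a core like this rather than baked into it.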
In any case, I'm happy to participate in any standardization efforts wearing my Tika hat, with the understanding that whatever gets developed needs to "fit in" the right place, be architected for extensibility, and have cognizance of what was done previously, what the gaps are, and why the gaps should be addressed. Thanks!

Cheers,
Chris

______________________________________________
Chris Mattmann, Ph.D.
[EMAIL PROTECTED]
Cognizant Development Engineer
Early Detection Research Network Project
Jet Propulsion Laboratory
Pasadena, CA
Office: 171-266B  Mailstop: 171-246
Disclaimer: The opinions presented within are my own and do not reflect those of NASA, JPL, or the California Institute of Technology.
