Hi Martin,

-----Original Message-----
From: Martin Desruisseaux <[email protected]>
Organization: Geomatys
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, February 26, 2014 1:44 AM
To: "[email protected]" <[email protected]>
Subject: Re: About DefaultFeature

>Hello Chris!
>
>On 26/02/14 07:59, Mattmann, Chris A (3980) wrote:
>> How about making DefaultFeature leverage Apache Tika's Metadata [1]
>> class? It's a key->multi-value structure, and uses Adobe XMP properties
>> to represent the value distribution.
>
>If I'm understanding right, Tika can work before Lucene in order to
>represent data from various sources (PDF, Office, TIFF, etc.) in a
>uniform way that Lucene can index, is that right?

Sure, but Tika isn't dependent at all on Lucene. Tika is a generalized
content detection and analysis library. One of its key components is a
generic, typed metadata container.

>
>Actually one of our guys made some experiments with Tika, and we have
>the feeling that it is a good match for the 'org.apache.sis.metadata'
>package. The SIS metadata classes were necessary for ISO 19115 / 19139
>support, but we should probably provide an adapter to Tika metadata for
>Lucene indexing. It could be done in a "sis-tika" module in order to
>keep the dependency in its dedicated module, as we did for "sis-netcdf"
>for instance.

Yep, I think it would be great for that portion of the package.

>
>The match may be less direct for Feature, since a Feature instance is
>not really like metadata but rather like a single row in a database
>table. In particular we will need to introduce later FeatureType,
>AttributeType and PropertyType for describing the "feature schema"
>(similar to declaring the columns of a database table). There is also a
>wish to follow the ISO 19109:2013 standard (which defines the
>above-cited types), and have classes that we can map to GML (Geographic
>Markup Language).

Yeah, I was just suggesting that the key->value portion of the
DefaultFeature may want to use Tika.
>
>I think we should create a JIRA task for a "sis-tika" module mapping
>metadata. Do you want me to do so? Mapping Feature could also be
>investigated, but this seems less obvious to me than metadata. Do you
>wish to elaborate on Feature-Tika mapping, or do we focus on metadata
>for now?

+1, this makes sense.

Cheers,
Chris

>
>    Cheers,
>
>        Martin

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-283, Mailstop: 171-246
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
