Hey Martin,

On 1/18/13 12:12 PM, "Martin Desruisseaux" <[email protected]> wrote:

>On 18/01/13 11:31, Adam Estrada wrote:
>> Spot on with Tika being an SIS dependency, Martin! The idea is to be
>> able to extract content from as many file formats as possible based on
>> their MIME types. GDAL provides the interface to a lot more geospatial
>> formats.
>
>We have the notion of a "data source" interface (not yet committed), and
>Tika or GDAL can be one of them. GeoTIFF, NetCDF, etc. are other data
>sources (we have some extra flexibility if we read NetCDF files directly
>rather than through GDAL, for instance, but we would do that only for the
>most important formats rather than duplicating the totality of GDAL).
>However, "data sources" appear downstream relative to metadata and other
>basic modules. A list of modules in approximate dependency order could be:
>
>  - utility
>  - metadata
>  - referencing
>  - geometry
>  - feature
>  - coverage
>  - data source   <-- Tika/GDAL can be plugged here
>  - styles
>  - renderer

+1, that makes sense to me. Note that I also believe there is another
dependency, from Tika to SIS (especially for the WKT parsing).

>
>I'm not sure if "filter" would be before or after "data source" - Johann
>Sorel would know better (I think he is watching this list, even if he
>hasn't sent any emails yet).

Come on Johann, come out and say hi! :)

>
>Actually, the "sis-metadata" module being built is not about arbitrary
>metadata, but rather about the "lingua franca" to be used in SIS for
>metadata. Many metadata models could be chosen for this purpose, but the
>proposed SIS approach is to select ISO standards as the lingua franca.
>All other sources of metadata would need to be converted to ISO 19115
>before being used in a source-independent way by all SIS modules. This
>is the purpose, for instance, of the NetCDF - ISO mapping mentioned in
>the previous email. This explains why "data source", which is where
>input/output happens, is so far away from metadata in the above
>dependency chain; all preceding modules define the models which will
>represent the data read by the data sources.

It would be great to use Tika to convert *insert format here* to ISO 19115
if possible.

>
>Obviously the XML (un)marshalling is an exception to what I just said,
>since it is defined straight in the core metadata module rather than as
>a data source. But we should have (I hope) few such exceptions. This
>exception exists for two reasons: 1) as a side effect of the way JAXB
>works (annotations straight in the source code), and 2) because while
>ISO 19115 would be the "lingua franca" for the conceptual model, XML is
>the "lingua franca" for the file format, at least at OGC/ISO/INSPIRE, so
>maybe it deserves that special treatment...

+1.

Cheers,
Chris

>
>     Martin
>
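
As a rough illustration of the layering discussed above (format-specific
data sources feeding one common metadata model), here is a minimal Python
sketch. Every name in it (CommonMetadata, DataSource, NetcdfSource,
FallbackSource, find_source) is hypothetical and is not SIS, Tika, or GDAL
API. The MIME type string and the attribute keys are assumptions chosen
for illustration only, and CommonMetadata is just a stand-in for the
ISO 19115 model.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class CommonMetadata:
    """Hypothetical stand-in for the ISO 19115 'lingua franca' model."""
    title: str
    extra: dict = field(default_factory=dict)


class DataSource(ABC):
    """Hypothetical plug-in point: one implementation per backend."""

    @abstractmethod
    def can_read(self, mime_type: str) -> bool:
        """Tell whether this source handles the given MIME type."""

    @abstractmethod
    def read_metadata(self, raw: dict) -> CommonMetadata:
        """Convert format-specific attributes to the common model."""


class NetcdfSource(DataSource):
    """Dedicated reader for one important format (assumed MIME type)."""

    def can_read(self, mime_type: str) -> bool:
        return mime_type == "application/x-netcdf"

    def read_metadata(self, raw: dict) -> CommonMetadata:
        # Map a format-specific attribute onto the common model.
        return CommonMetadata(title=raw.get("title", ""),
                              extra={"source": "netcdf"})


class FallbackSource(DataSource):
    """Stands in for a generic backend such as Tika or GDAL."""

    def can_read(self, mime_type: str) -> bool:
        return True  # accepts anything the dedicated sources declined

    def read_metadata(self, raw: dict) -> CommonMetadata:
        return CommonMetadata(title=raw.get("name", ""),
                              extra={"source": "fallback"})


def find_source(sources, mime_type: str) -> DataSource:
    """Return the first registered source that accepts the MIME type."""
    for source in sources:
        if source.can_read(mime_type):
            return source
    raise LookupError(f"no data source for {mime_type}")


# Dedicated sources are listed first, the generic fallback last.
SOURCES = [NetcdfSource(), FallbackSource()]
```

Downstream modules would only ever see CommonMetadata, which is the point
of placing the data sources after the model-defining modules in the
dependency chain.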
