+1 to making it work for vector formats too -- geospatial imagery was just the 
first notch to tackle... :)

Cheers,
Chris

On Feb 26, 2012, at 11:09 AM, Joe White wrote:

> Chris,
> One other thing occurred to me while looking at this.  All of the discussion 
> I've seen thus far revolves around geospatial imagery.  Has there been any 
> discussion about using Tika on any of the geospatial vector formats?  I would 
> think they would go hand in hand, and OGR recognizes many of them.
> 
> Joe
> 
> On Feb 26, 2012, at 1:10 PM, Mattmann, Chris A (388J) wrote:
> 
>> Hi Joe,
>> 
>> Awesome! Thanks for picking this up and getting interested in this work. 
>> Right now, the only use case we've had so far
>> is representing lats and lons (WGS84). It would be great to extract more 
>> information and come up with a policy for representing
>> more WKTs and so forth. We should probably start by coming up with a scheme 
>> for encoding the extracted information in the 
>> Tika metadata object and in its output XHTML. Do you have any ideas about 
>> how to do that? Right now in the existing patch
>> on TIKA-605, I simply intended to use the met object and its 
>> key-multi-value structure to represent the extracted information,
>> but to take advantage of streaming and of content handlers, we ought to 
>> encode this information in the output XHTML as well.
>> 
>> Thoughts?
>> 
>> Cheers,
>> Chris
>> 
>> On Feb 26, 2012, at 9:39 AM, Joe White wrote:
>> 
>>> Hi,
>>> I'm looking into implementing a bridge/link between Tika and GDAL so that 
>>> geospatial information can be saved from georeferenced images and vector 
>>> types.  One thing that I have noticed while going through the code is that 
>>> the code only defines geographic coordinate types, using latitudes and 
>>> longitudes.  Is this by design?  If GDAL is wrapped into Tika, and a 
>>> projected image is imported, are the geospatial extents meant to be held in 
>>> the metadata as geographic points, possibly as WGS 84?  
>>> 
>>> Thanks
>>> 
>>> Joe White
>> 
>> 
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Senior Computer Scientist
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 171-266B, Mailstop: 171-246
>> Email: chris.a.mattm...@nasa.gov
>> WWW:   http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Assistant Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 
> 
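
For illustration only, the key-multi-value idea discussed above can be sketched roughly as follows. This is a toy stand-in written in Python, not Tika's Metadata API or the TIKA-605 patch; the class name, the key names ("geo:lat", "geo:long", "geo:srs"), and the corner coordinates are all assumptions made up for the example. It only shows how several values under one key might be mirrored as XHTML meta elements.

```python
from collections import defaultdict
from xml.sax.saxutils import quoteattr


class MultiValueMetadata:
    """Toy key -> list-of-values store (hypothetical, not Tika's API)."""

    def __init__(self):
        self._values = defaultdict(list)

    def add(self, key, value):
        # Append rather than overwrite, so one key can hold several values.
        self._values[key].append(str(value))

    def get_values(self, key):
        # Return a copy; unknown keys yield an empty list.
        return list(self._values.get(key, []))

    def to_xhtml_meta(self):
        # Emit one <meta> element per (key, value) pair, in insertion order.
        lines = []
        for key, values in self._values.items():
            for value in values:
                lines.append(
                    f"<meta name={quoteattr(key)} content={quoteattr(value)}/>"
                )
        return "\n".join(lines)


# Hypothetical key names; the two corner points are made-up WGS 84 values.
met = MultiValueMetadata()
for lat, lon in [(34.0, -118.5), (34.5, -118.0)]:
    met.add("geo:lat", lat)
    met.add("geo:long", lon)
met.add("geo:srs", "WGS84")

print(met.to_xhtml_meta())
```

A projected image would need its extents reprojected to geographic coordinates before being stored this way; that step is left out of the sketch.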


