Chris,

One other thing occurred to me while looking at this. All of the discussion I've seen thus far revolves around geospatial imagery. Has there been any discussion about using Tika on any of the geospatial vector formats? I would think they would go hand in hand, and OGR recognizes many of them.
Joe

On Feb 26, 2012, at 1:10 PM, Mattmann, Chris A (388J) wrote:

> Hi Joe,
>
> Awesome! Thanks for picking this up and getting interested in this work.
> Right now, the only use case we've had so far is to represent lats and
> lons (WGS84). It would be great to extract more information and come up
> with a policy for representing more WKTs and so forth. We should probably
> start by coming up with a scheme for encoding the extracted information in
> the Tika metadata object and in its output XHTML. Do you have any ideas
> about how to do that? Right now in the existing patch on TIKA-605, I simply
> intended to use the met object and its key-multi-value structure to
> represent the extracted information, but to take advantage of streaming and
> of content handlers, we ought to encode this information in the output
> XHTML.
>
> Thoughts?
>
> Cheers,
> Chris
>
> On Feb 26, 2012, at 9:39 AM, Joe White wrote:
>
>> Hi,
>> I'm looking into implementing a bridge/link between Tika and GDAL so that
>> geospatial information can be saved from georeferenced images and vector
>> types. One thing that I have noticed while going through the code is that
>> the code only defines geographic coordinate types, using latitudes and
>> longitudes. Is this by design? If GDAL is wrapped into Tika, and a
>> projected image is imported, are the geospatial extents meant to be held in
>> the metadata as geographic points, possibly as WGS 84?
>>
>> Thanks
>>
>> Joe White
>
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: [email protected]
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
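
To make the encoding question in the quoted message concrete, here is a minimal sketch of a key-multi-value metadata holder whose entries are also emitted as XHTML `<meta>` elements. It is only an illustration under stated assumptions: `MultiValueMetadata`, `to_xhtml_meta`, and the key names `geo:extent-corner` and `geo:crs` are hypothetical and are not the Tika API or the TIKA-605 patch. It is written in Python purely for brevity.

```python
from collections import defaultdict
from xml.sax.saxutils import quoteattr


class MultiValueMetadata:
    """Hypothetical stand-in for a key-multi-value metadata object (not the Tika API)."""

    def __init__(self):
        self._values = defaultdict(list)

    def add(self, name, value):
        # Append rather than overwrite, so one key can carry several values.
        self._values[name].append(str(value))

    def get_values(self, name):
        # Return a copy; an unknown key yields an empty list.
        return list(self._values.get(name, []))

    def names(self):
        # Keys in the order they were first added.
        return list(self._values)


def to_xhtml_meta(metadata):
    """Render each (name, value) pair as one XHTML <meta> element."""
    lines = []
    for name in metadata.names():
        for value in metadata.get_values(name):
            lines.append(
                "<meta name=%s content=%s/>" % (quoteattr(name), quoteattr(value))
            )
    return "\n".join(lines)


met = MultiValueMetadata()
# Hypothetical key names and sample values; the thread does not settle on a scheme.
met.add("geo:extent-corner", "34.0 -119.0")  # "lat lon" of one corner
met.add("geo:extent-corner", "35.0 -118.0")  # "lat lon" of the opposite corner
met.add("geo:crs", "EPSG:4326")              # WGS 84
print(to_xhtml_meta(met))
```

Repeating a key is what lets a single extent carry more than one point. Whether corners belong in one repeated key or in separate min/max keys is exactly the kind of policy the thread leaves open.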
