Hi Tim,
Over in the SIS community [1], writing a driver for Hive or HBase that provides
spatial support a la PostGIS is something we've wanted to get around to, but
haven't yet. The goal of SIS is to be an ALv2-licensed spatial toolkit, with no
surprises [2]. If you are interested in contributing to the SIS community and
helping out, I'd certainly appreciate it. I'd equally appreciate hearing from
anyone in the Hive community who has time to help us write the Hive driver for
SIS.
We currently support point/radius and bbox QuadTree-based searches, and the
loading of GeoRSS data into the QuadTree index.
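
To make the point/radius and bbox searches concrete, here is a minimal QuadTree
sketch in Python. It is an illustration only, not the SIS implementation: the
class name, method names, the capacity and depth limits, and the planar
distance (degrees treated as Cartesian units, no great-circle math) are all
assumptions made for the example.

```python
import math


class QuadTree:
    """Hypothetical point quadtree for illustration; not the SIS API."""

    def __init__(self, min_x, min_y, max_x, max_y, capacity=4, depth=0, max_depth=8):
        self.min_x, self.min_y, self.max_x, self.max_y = min_x, min_y, max_x, max_y
        self.capacity = capacity    # points a leaf holds before it splits
        self.depth = depth
        self.max_depth = max_depth  # stops endless splitting on duplicate points
        self.points = []            # (x, y, item) tuples, only kept in leaves
        self.children = []          # empty for a leaf, four nodes otherwise

    def _contains(self, x, y):
        return self.min_x <= x <= self.max_x and self.min_y <= y <= self.max_y

    def insert(self, x, y, item):
        if not self._contains(x, y):
            return False
        if self.children:
            # any() stops at the first child that accepts the point,
            # so a point on a shared edge is stored exactly once.
            return any(child.insert(x, y, item) for child in self.children)
        self.points.append((x, y, item))
        if len(self.points) > self.capacity and self.depth < self.max_depth:
            self._split()
        return True

    def _split(self):
        mid_x = (self.min_x + self.max_x) / 2
        mid_y = (self.min_y + self.max_y) / 2
        boxes = [
            (self.min_x, self.min_y, mid_x, mid_y),
            (mid_x, self.min_y, self.max_x, mid_y),
            (self.min_x, mid_y, mid_x, self.max_y),
            (mid_x, mid_y, self.max_x, self.max_y),
        ]
        self.children = [
            QuadTree(*box, self.capacity, self.depth + 1, self.max_depth)
            for box in boxes
        ]
        # Push the points held by this node down into the new children.
        pending, self.points = self.points, []
        for x, y, item in pending:
            self.insert(x, y, item)

    def _search(self, min_x, min_y, max_x, max_y):
        # Prune any node whose box does not overlap the query box.
        if (max_x < self.min_x or min_x > self.max_x
                or max_y < self.min_y or min_y > self.max_y):
            return
        for x, y, item in self.points:
            if min_x <= x <= max_x and min_y <= y <= max_y:
                yield x, y, item
        for child in self.children:
            yield from child._search(min_x, min_y, max_x, max_y)

    def query_bbox(self, min_x, min_y, max_x, max_y):
        return [item for _, _, item in self._search(min_x, min_y, max_x, max_y)]

    def query_radius(self, cx, cy, r):
        # Use the circle's bounding box to prune, then filter by distance.
        candidates = self._search(cx - r, cy - r, cx + r, cy + r)
        return [item for x, y, item in candidates
                if math.hypot(x - cx, y - cy) <= r]


# Example data as (lng, lat, name); coordinates are approximate.
tree = QuadTree(-180, -90, 180, 90, capacity=2)
for lng, lat, name in [
    (12.57, 55.68, "Copenhagen"), (-118.14, 34.15, "Pasadena"),
    (-118.24, 34.05, "Los Angeles"), (151.21, -33.87, "Sydney"),
    (-0.13, 51.51, "London"), (2.35, 48.86, "Paris"),
]:
    tree.insert(lng, lat, name)

print(sorted(tree.query_bbox(-10, 40, 20, 60)))
print(sorted(tree.query_radius(-118.2, 34.1, 0.5)))
```

The radius search reuses the bbox search as a coarse filter, so both query
types share the same pruning logic.
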
Cheers,
Chris
[1] http://incubator.apache.org/sis/
[2] http://wiki.apache.org/incubator/SpatialProposal/
On Mar 16, 2012, at 2:21 AM, Tim Robertson wrote:
> Hi all,
>
> I need to perform a lot of "point in polygon" checks and want to use Hive
> (currently I mix Hive, Sqoop and PostGIS in an Oozie workflow to do this).
>
> In an ideal world, I would like to create a Hive table from a Shapefile
> containing polygons, and then do the likes of the following:
>
> SELECT p.id, pp.id FROM points p, polygons pp WHERE pp.contains(geom,
> toPoint(p.lat,p.lng))
>
> Has anyone done anything along these lines?
>
> Alternatively I am capable of doing a UDF that would read the shape file into
> memory and basically do a map side join using something like a slab
> decomposition technique. It is more limited but would meet my needs allowing
> e.g.:
>
> SELECT contains(p.lat,p.lng, '/data/shapefiles/countries.shp') FROM points;
>
> Before I start I thought I'd ask folks as I suspect people are doing this
> kind of thing on Hive by now (thinking FB and user profiling by political
> boundaries etc)
>
> I'd love to hear from anyone who's investigated this or could provide any
> advice.
>
> Thanks!
> Tim
>
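
For reference, the per-row test behind a contains(lat, lng, ...) call like the
one quoted above could look roughly like the sketch below. It uses plain ray
casting rather than the slab decomposition mentioned in the quoted message, it
is Python rather than Hive UDF code, and the function name and the
list-of-(lng, lat) polygon representation are assumptions for illustration.
Coordinates are treated as planar, and points lying exactly on an edge are not
handled consistently.

```python
def point_in_polygon(lng, lat, ring):
    """Ray-casting test. `ring` is a list of (lng, lat) vertices;
    repeating the first vertex at the end is optional."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Does the edge j -> i straddle the horizontal line through the point?
        if (yi > lat) != (yj > lat):
            # x-coordinate at which the edge crosses that horizontal line
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            # Count only crossings to the right of the point.
            if lng < x_cross:
                inside = not inside
        j = i
    return inside


# A made-up square "polygon" for demonstration.
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(point_in_polygon(5, 5, square))
print(point_in_polygon(15, 5, square))
```

An odd number of edge crossings means the point is inside. A real UDF would
load the polygons once per task and would likely index them (for example by
bounding box) before running this test on each row.
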
++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++