I'm pleased to announce that Ying Jiang has had the jena-spatial project [2] accepted into the Google Summer of Code.

Please welcome Ying Jiang [1].

        Andy


[1] Intro.  http://s.apache.org/cgS

[2] Submitted project description:

Background

GeoSPARQL [1] is a complete approach to spatial query, but it is complicated and directed more towards the specialist than the average linked data developer with SPARQL knowledge. In fact, not all spatial queries are complicated. The sweet spot is something simpler (and less capable) than GeoSPARQL, because the average web developer, even if writing SPARQL, isn't looking for a complete solution to all geospatial use cases. They are looking for something easier (= smaller). Many use cases are covered by providing a single facility, such as getting all objects within a given radius or within a given bounding box. For example, this query finds the places within 10 kilometers of Bristol, UK (which has a latitude/longitude of 51.46, 2.6).

SELECT ?placeName

{
   ?place spatial:query (51.46 2.6 10) .
   ?place rdfs:label ?placeName
}
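To make the "within a given radius" semantics concrete, here is a minimal, library-free sketch in Python. It is not how Lucene or ARQ evaluates the query; the function names, the sample places and the use of the haversine formula with a mean Earth radius of 6371 km are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(places, lat, lon, radius_km):
    """Return the names of places lying within radius_km of (lat, lon)."""
    return [name for name, (plat, plon) in places.items()
            if haversine_km(lat, lon, plat, plon) <= radius_km]


# Hypothetical sample data, using the same coordinate values as the query.
places = {
    "centre": (51.46, 2.6),
    "nearby": (51.50, 2.65),
    "far away": (52.46, 2.6),
}

print(within_radius(places, 51.46, 2.6, 10))
```

A spatial index answers the same question without scanning every place, which is the reason for delegating the work to Lucene.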

My mentor has developed an initial, experimental implementation, GeoARQ [3], as a proof of the idea; it uses Lucene's spatial capabilities to provide a spatial property function for ARQ. However, GeoARQ is not part of the formal Jena project, its internal design is quite old, and it does not play well with RDF datasets and update; it also lacks the assembler capabilities needed to integrate with Fuseki. To resolve these problems, I will develop an extension to Jena ARQ, called jena-spatial, which exploits the spatial capabilities of Lucene to create a fully integrated capability that we can add to the main download when it's stable. Jena provides a similar property function architecture in jena-text [2], so this project takes that concept and applies it to geospatial information.



Project Scopes and Approaches

I have already engaged with the Jena community by email over the past weeks, and I believe I understand the needs of the project and the commitments I am making to my mentor. Here are the project scopes and their approaches, summarized from the discussions with my mentor.



1. Spatial Data Indexer

First, I’ll design and develop a module for spatial information indexing. The indexer can read spatial data from Jena Model and Statement objects and transform them into Lucene Documents. The indexing process can be controlled by startIndexing(), finishIndexing(), abortIndexing() and close(). A command line tool (extending CmdARQ [4]) should be provided for reading spatial datasets from an assembler description and indexing them, with arguments like “[--desc | --dataset] assemblerPath”.
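The intended lifecycle of the four control methods can be sketched as follows. This is a hypothetical in-memory stand-in written in Python, not the jena-spatial indexer itself: the class name, the add() method and the rule that only a finished batch becomes visible are assumptions, and the snake_case methods merely mirror startIndexing(), finishIndexing(), abortIndexing() and close().

```python
class SketchSpatialIndexer:
    """Hypothetical in-memory stand-in for the proposed indexer lifecycle."""

    def __init__(self):
        self.committed = {}    # entity -> (lat, lon); what a query would see
        self._pending = None   # batch under construction, None when idle
        self._closed = False

    def start_indexing(self):
        if self._closed:
            raise RuntimeError("indexer is closed")
        if self._pending is not None:
            raise RuntimeError("indexing already started")
        self._pending = {}

    def add(self, entity, lat, lon):
        if self._pending is None:
            raise RuntimeError("call start_indexing() first")
        self._pending[entity] = (lat, lon)

    def finish_indexing(self):
        if self._pending is None:
            raise RuntimeError("no indexing in progress")
        self.committed.update(self._pending)  # make the batch visible
        self._pending = None

    def abort_indexing(self):
        self._pending = None                  # drop the batch, keep committed data

    def close(self):
        self._pending = None                  # an unfinished batch is discarded
        self._closed = True


indexer = SketchSpatialIndexer()
indexer.start_indexing()
indexer.add("http://example.org/place/a", 51.46, 2.6)
indexer.abort_indexing()       # place/a never becomes visible
indexer.start_indexing()
indexer.add("http://example.org/place/b", 51.50, 2.65)
indexer.finish_indexing()      # place/b is now visible
indexer.close()
```

Separating the pending batch from the committed data is one way to give abortIndexing() a clear meaning; a Lucene-backed version would presumably map this onto the index writer's commit and rollback.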



2. Spatial Property Functions

What kinds of spatial property functions should be developed in this project? GeoSPARQL seems to be a complete solution, but it is not necessary to go into full GeoSPARQL, which is too complicated for non-specialists. Geospatial modelling stretches to much more complicated relationships, but a lot of useful things can be done with less than the full geospatial model, which the average web developer does not have the time to take on board. For example, there is a simple vocabulary for expressing WGS84 information; there is also point information in WKT; and there is a spatial relations ontology for reference.

On the other hand, I've studied Lucene spatial to figure out which spatial relations can be implemented. The current release, Lucene 4.2.1, provides a high-level abstraction for spatial queries.

- SpatialOperation : an operation that compares a stored geometry to a supplied geometry, such as "IsWithin" and "Intersects". Not all of them are actually supported.

- SpatialStrategy : encapsulates an approach to indexing and searching based on shapes. Different implementations support different features. There are 3 implementations now: PointVectorStrategy, RecursivePrefixTreeStrategy and TermQueryPrefixTreeStrategy. For example, PointVectorStrategy supports only "IsWithin" for Rectangle or Circle, while RecursivePrefixTreeStrategy can calculate "Intersects" for any kind of Shape. I'd like to make full use of Lucene so that jena-spatial supports as many spatial relationships as possible. I can also create new SpatialOperations, such as "Northing/Westing", that are not available in Lucene.

To wrap things up, here are the jena-spatial property functions that I can do this summer:



2.1 ?A -> within -> B

A: Point Var

B: Rectangle or Circle

Approach: PointVectorStrategy directly supports this

Note: a circle can be specified in the query as (point, radius). We can treat Rectangle and Circle differently in this way:

?point :withinCircle  ( x,y,r ) .

?point :withinBox  ( x1,y1,x2,y2) .
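As an illustration of what the :withinBox argument list means, the check below tests a point against a box given by two opposite corners. It is a plain Python sketch, not the PointVectorStrategy implementation; the function name, the inclusive boundaries and the lack of dateline handling are assumptions.

```python
def within_box(x, y, x1, y1, x2, y2):
    """True if point (x, y) lies in the box with opposite corners (x1, y1) and (x2, y2).

    Boundaries are treated as inclusive, and the box is assumed not to
    cross the dateline (both are assumptions of this sketch).
    """
    lo_x, hi_x = min(x1, x2), max(x1, x2)  # accept the corners in either order
    lo_y, hi_y = min(y1, y2), max(y1, y2)
    return lo_x <= x <= hi_x and lo_y <= y <= hi_y


# Hypothetical sample points tested against the box (51.0, 2.0)-(52.0, 3.0).
print(within_box(51.46, 2.6, 51.0, 2.0, 52.0, 3.0))   # inside the box
print(within_box(53.00, 2.6, 51.0, 2.0, 52.0, 3.0))   # outside the box
```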



2.2 A -> nearby -> ?B, C

A: Point

B: Point Var

C: radius

Approach: RecursivePrefixTreeStrategy evaluates SpatialOperation.Intersects for the Circle with center A and radius C



2.3 A -> intersects -> ?B

A: any Shape

B: non-Point Shape Var

Approach: RecursivePrefixTreeStrategy directly supports this (note: not tested)



2.4 A -> intersects -> ?B

A: any Shape

B: Point Var

Approach: TermQueryPrefixTreeStrategy directly supports this



2.5 A -> northing/southing/easting/westing -> B

A: Point

B: Point

Approach: use rangeQuery, or have RecursivePrefixTreeStrategy evaluate SpatialOperation.Intersects against the Rectangle lying to the north/south/east/west
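One possible reading of the rectangle approach is sketched below: the region "north of" a point becomes the rectangle from that point's latitude up to the pole, and similarly for the other directions. This is an illustrative Python sketch only; the function names, the (min_lat, min_lon, max_lat, max_lon) layout and the decision to stop east/west at the ±180 meridian are assumptions, not part of the proposal.

```python
def direction_box(lat, lon, direction):
    """Return (min_lat, min_lon, max_lat, max_lon) for the region in `direction` of a point.

    East and west stop at the +/-180 meridian; no dateline wrap-around is
    attempted (an assumption of this sketch).
    """
    boxes = {
        "north": (lat, -180.0, 90.0, 180.0),
        "south": (-90.0, -180.0, lat, 180.0),
        "east": (-90.0, lon, 90.0, 180.0),
        "west": (-90.0, -180.0, 90.0, lon),
    }
    return boxes[direction]


def is_in_direction(candidate, origin, direction):
    """True if candidate (lat, lon) falls in the rectangle lying `direction` of origin."""
    min_lat, min_lon, max_lat, max_lon = direction_box(origin[0], origin[1], direction)
    return min_lat <= candidate[0] <= max_lat and min_lon <= candidate[1] <= max_lon


origin = (51.46, 2.6)  # hypothetical reference point
print(is_in_direction((52.0, 2.6), origin, "north"))
print(is_in_direction((51.46, 3.0), origin, "west"))
```

Once the direction is expressed as a rectangle, the query reduces to the same Intersects test used for the other property functions.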



2.6 A -> disjoint -> B

A: Rectangle

B: Rectangle

Approach: PointVectorStrategy directly supports this (note: it doesn't handle rectangles that cross the dateline).
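For reference, the disjointness test between two axis-aligned rectangles can be written in a few lines. The Python sketch below is not Lucene's code; the (min_x, min_y, max_x, max_y) tuple layout and the choice to treat touching edges as intersecting are assumptions, and like the approach above it does not handle rectangles that cross the dateline.

```python
def rects_disjoint(a, b):
    """True if rectangles a and b, each (min_x, min_y, max_x, max_y), share no point.

    Rectangles that merely touch along an edge or corner count as
    intersecting (an assumption of this sketch). No dateline handling.
    """
    a_min_x, a_min_y, a_max_x, a_max_y = a
    b_min_x, b_min_y, b_max_x, b_max_y = b
    return (a_max_x < b_min_x or b_max_x < a_min_x or   # separated horizontally
            a_max_y < b_min_y or b_max_y < a_min_y)     # separated vertically


print(rects_disjoint((0, 0, 1, 1), (2, 2, 3, 3)))      # far apart
print(rects_disjoint((0, 0, 2, 2), (1, 1, 3, 3)))      # overlapping
```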

From the users’ point of view, the most common use case is to query all places within a box or a circle centered on a given point. Therefore, I’d like to focus on 2.1 and 2.2, with the others marked as optional, to be implemented if time permits.



3. Jena Assembler Configuration with Fuseki Integration

I’ll provide a way to describe the Lucene spatial index with a Jena assembler description in configuration. For example, the user can have one field, mapping a property to a spatial index field. The Fuseki configuration simply points to the spatial dataset as the fuseki:dataset of the service. It should be possible to have a Fuseki server with spatial support by simply adding it to the build/classpath.
