comments inline

On Mon, Dec 3, 2018 at 5:14 PM Greg Albiston <galbis...@mail.com> wrote:
> Hi Marco,
>
> 1. As mentioned this shouldn't be too difficult to support.
>

indeed not difficult but needs a decision

you could try with the following geonames dataset

all-geonames_lotico.ttl.gz

>
> 2. Yes, the indexing, or rather caching, is in-memory, but it is
> on-demand. There shouldn't be any delay at start-up beyond what Jena
> needs to do. The cost comes during query execution. The key invariant
> data produced for solutions is retained for a short period of time (but
> can be configured to be retained until termination). Some regularly
> re-used info is always kept until termination (e.g. any spatial
> reference system transformation that has been requested).
>

the following will create and populate the TDB dataset

./geosparql-fuseki --loopback false --rdf_file ./lm.ttl --tdb TDB1

I presume this message refers to the creation of the spatial cache / index

6:05:46.685 INFO Applying GeoSPARQL Schema - Started
6:07:44.826 INFO Applying GeoSPARQL Schema - Completed

next time I can call TDB directly

./geosparql-fuseki --loopback false --tdb TDB1

6:08:38.665 INFO Applying GeoSPARQL Schema - Started
6:10:18.661 INFO Applying GeoSPARQL Schema - Completed

takes approximately 2m for a very small data set. the same fuseki with
tdb+jena-spatial restarts almost instantaneously even with reasonably
large data sets (see geonames).

> The main benefit of this is de-serialising geometry literals. The
> spatial relations arguments are between a pair of geometry literals, one
> of which is likely to be the same in the next solution, so keeping hold
> of both means in a lot of cases the de-serialisation can be avoided for
> one (and possibly both if still retained from a previous set of solutions).
>

might be a good idea to serialize the cache object of de-serialised
geometries to disk to speed up the boot process. maybe Andy could assist
or even align this with tdb

>
> The aim was to only do work that's needed, not do repeat work and to be
> generally quick (i.e.
> rely on JTS to be optimised for quick solutions
> between the geometry pairs and Jena to optimise queries). There are 24
> spatial relations and about half a dozen other functions so
> pre-computing every combination gets big quickly and produces data that
> users might not want/use.
>
> A rough check of most of the spatial relations only requires a bounding box
> intersection or type check, so negative results can be quickly
> discarded. I looked into caching and storing to file, but there just
> wasn't the benefit in my use case. It took longer to load up then
> execute than just execute from fresh and cache. Also, the spatial
> indexes implemented by JTS aren't designed/suited for the spatial
> relations. If there is a use-case that gets more benefit from
> pre-computing or storing between programme execution then I'm sure it
> can be adapted for, but in the context of GeoSPARQL this approach was
> effective.
>
> 3. If you could send me the dataset that causes these errors then I'll
> happily have a look into it.
>

you can use this simple list of point geometries here

http://www.lotico.com/lm.ttl.gz

this query will parse and execute

PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>

SELECT ?well
WHERE {
  ?well <http://www.wikidata.org/entity/P625> ?geometry .
  FILTER(geof:sfWithin(?geometry,"POLYGON((-10 50,2 50,2 55,-10 55,-10 50))"^^geo:wktLiteral))
} LIMIT 10

this one will parse and fail

PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>

SELECT ?well
WHERE {
  ?well <http://www.wikidata.org/entity/P625> ?geometry .
  FILTER(geof:sfWithin(?geometry,"POLYGON((-10 50,2 50,2 55,-10 55,-10 51))"^^geo:wktLiteral))
} LIMIT 10

warn/error messages

6:17:45.887 ERROR Points of LinearRing do not form a closed linestring - Illegal WKT literal: POLYGON((-10 50,2 50,2 55,-10 55,-10 51))
6:17:45.887 WARN General exception in (<http://www.opengis.net/def/function/geosparql/sfWithin> ?geometry "POLYGON((-10 50,2 50,2 55,-10 55,-10 51))"^^<http://www.opengis.net/ont/geosparql#wktLiteral>)
org.apache.jena.datatypes.DatatypeFormatException: Points of LinearRing do not form a closed linestring - Illegal WKT literal: POLYGON((-10 50,2 50,2 55,-10 55,-10 51))
        at io.github.galbiston.geosparql_jena.implementation.datatype.WKTDatatype.parse(WKTDatatype.java:109)
        at io.github.galbiston.geosparql_jena.implementation.GeometryWrapper.extract(GeometryWrapper.java:905)
        at io.github.galbiston.geosparql_jena.implementation.GeometryWrapper.extract(GeometryWrapper.java:834)
        at io.github.galbiston.geosparql_jena.geof.topological.GenericFilterFunction.exec(GenericFilterFunction.java:57)
        at io.github.galbiston.geosparql_jena.geof.topological.GenericFilterFunction.exec(GenericFilterFunction.java:42)
        at org.apache.jena.sparql.function.FunctionBase2.exec(FunctionBase2.java:55)
        at org.apache.jena.sparql.function.FunctionBase.exec(FunctionBase.java:63)
        at org.apache.jena.sparql.expr.E_Function.evalSpecial(E_Function.java:89)
        at org.apache.jena.sparql.expr.ExprFunctionN.eval(ExprFunctionN.java:100)
        at org.apache.jena.sparql.expr.ExprNode.isSatisfied(ExprNode.java:41)
        at org.apache.jena.sparql.engine.iterator.QueryIterFilterExpr.accept(QueryIterFilterExpr.java:49)
        at org.apache.jena.sparql.engine.iterator.QueryIterProcessBinding.hasNextBinding(QueryIterProcessBinding.java:69)
        at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
        at org.apache.jena.sparql.engine.iterator.QueryIterConvert.hasNextBinding(QueryIterConvert.java:58)
        at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
        at org.apache.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
        at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
        at org.apache.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:39)
        at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
        at org.apache.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:39)
        at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
        at org.apache.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:39)
        at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
        at org.apache.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:74)
        at org.apache.jena.sparql.engine.ResultSetCheckCondition.hasNext(ResultSetCheckCondition.java:55)
        at org.apache.jena.fuseki.servlets.SPARQL_Query.executeQuery(SPARQL_Query.java:350)
        at org.apache.jena.fuseki.servlets.SPARQL_Query.execute(SPARQL_Query.java:288)
        at org.apache.jena.fuseki.servlets.SPARQL_Query.executeWithParameter(SPARQL_Query.java:242)
        at org.apache.jena.fuseki.servlets.SPARQL_Query.perform(SPARQL_Query.java:217)
        at org.apache.jena.fuseki.servlets.ActionService.executeLifecycle(ActionService.java:183)
        at org.apache.jena.fuseki.servlets.ActionService.execCommonWorker(ActionService.java:98)
        at org.apache.jena.fuseki.servlets.ActionBase.doCommon(ActionBase.java:74)
        at org.apache.jena.fuseki.servlets.FusekiFilter.doFilter(FusekiFilter.java:73)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1340)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1242)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
        at org.eclipse.jetty.server.Server.handle(Server.java:503)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
        at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
        at java.base/java.lang.Thread.run(Thread.java:834)

>
> 4. The "geo:" prefix is the one used throughout the GeoSPARQL
> documentation, so has been used for consistency when needed. The code
> doesn't have a dependency on the "geo:" prefix, so there is no
> requirement on the user. It would probably cause more confusion to those
> following GeoSPARQL examples to not use the "geo:" prefix when necessary.
>

I know but it needs some discussion about re-purposing of prefixes here

> Thanks,
>
> Greg
>
> On 03/12/2018 15:46, Marco Neumann wrote:
> > Hi Greg, ok let's do it in the dev list first.
> >
> > 1. indeed the picking up of lat/long is a common if not the most common use
> > case for building a spatial index. last but not least to perform a
> > proximity search on 2D point geometries.
> > (I know that the ogc recommends a
> > transformation path with a sparql query to turn lat / long into a WKT
> > geometry datatype maybe we could provide this as a convenient option with
> > the release)
> >
> > 2. as far as I can see the spatial index in geosparql-jena is memory based.
> > it creates additional load time during server startup. Am I missing
> > something here, is there a file-based spatial index as well?
> >
> > 3. error handling is disruptive. since we are hitting the spatial index
> > first during query execution I am seeing a number of unpleasant side
> > effects with syntactically correct sparql but semantically incorrect
> > spatial queries. e.g.
> >
> > PREFIX geo: <http://www.opengis.net/ont/geosparql#>
> > PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
> >
> > SELECT ?well
> > WHERE {
> >   ?well <http://www.wikidata.org/entity/P625> ?geometry .
> >   FILTER(geof:sfWithin(?geometry,"POLYGON((-77 38,-77 0,0 38,0 0,0 0))"^^geo:wktLiteral))
> > } LIMIT 10
> >
> > 4. The re-use of the geo: prefix really isn't your problem I know but it
> > will create confusion. Wouldn't geosparql: be a better prefix for this? Is
> > the OGC now married to this prefix? It used to be
> > http://www.w3.org/2003/01/geo/wgs84_pos#
> >
> > and there is more to come...
> >
> > again thank you for working on this with your team Greg, much appreciated.
> >
> >
> > On Mon, Dec 3, 2018 at 2:15 PM Greg Albiston <galbis...@mail.com> wrote:
> >
> >> Hi Marco,
> >>
> >> I've had a look at the documentation for Jena Spatial and it would seem
> >> the main data change is the use of the Lat/Lon pairs.
> >> This doesn't comply with the GeoSPARQL standard so support for this
> >> would be a Jena extension.
> >>
> >> This could be accommodated by a property function to convert to a WKT
> >> Point literal with WGS84/CRS84 spatial reference.
> >> Users would then be able to use the result in query for any of the
> >> GeoSPARQL functions.
> >>
> >> Alternatively, the spatial relations could all have an extra property
> >> function defined, provide the conversion and hand over to the GeoSPARQL
> >> equivalent property function. This wouldn't take long to integrate as
> >> individual spatial relation property functions are very minimal.
> >>
> >> The other item that jumps out is the Jena spatial property functions.
> >>
> >> spatial:nearby, spatial:withinCircle, spatial:withinBox and
> >> spatial:intersectBox all seem to be variations of Simple Features
> >> spatial relations that are covered by GeoSPARQL. These property
> >> functions can be incorporated for backward compatibility but it's whether
> >> these should just be offered as the current Lat/Lon pairs or expanded to
> >> accept geometry literals (i.e. WKT, GML etc.)? The latter option
> >> shouldn't be hard to provide for the same reason as above.
> >>
> >> spatial:north, spatial:south, spatial:west and spatial:east are not in
> >> GeoSPARQL. Again it's a question of whether these should be provided more
> >> generally for WKT, GML geometry literals? There might need to be a bit
> >> of extra work handling both geographic and planar spatial reference
> >> systems, as Jena Spatial is only doing a spatial reference system.
> >>
> >> I don't think it would be too difficult to support the existing Jena
> >> Spatial functionality, at least based on the webpage
> >> (https://jena.apache.org/documentation/query/spatial-query.html), as an
> >> extension to what is provided by GeoSPARQL.
> >>
> >> Is there anything else that you were concerned about?
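
For reference, a rough sketch of the Lat/Lon to WKT Point conversion described above. It is written in Python for brevity and is not the geosparql-jena property function. The names `latlon_to_wkt_point`, `as_turtle_literal` and `WKT_LITERAL` are made up for illustration. It assumes the GeoSPARQL default of CRS84 when a wktLiteral carries no spatial reference URI, which means longitude comes first.

```python
# Datatype IRI for GeoSPARQL WKT literals (same namespace as the geo: prefix used in this thread).
WKT_LITERAL = "http://www.opengis.net/ont/geosparql#wktLiteral"


def latlon_to_wkt_point(lat, lon):
    """Return the lexical form of a WKT point for a latitude/longitude pair.

    Assumption: no SRS URI is written, so the literal is read as CRS84,
    where the axis order is longitude then latitude.
    """
    lat = float(lat)
    lon = float(lon)
    # Reject values outside the geographic range (NaN fails both checks too).
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    return f"POINT({lon} {lat})"


def as_turtle_literal(wkt):
    """Wrap a WKT lexical form as a typed literal in Turtle syntax."""
    return f'"{wkt}"^^<{WKT_LITERAL}>'


if __name__ == "__main__":
    print(as_turtle_literal(latlon_to_wkt_point(51.5, -0.12)))
```

The resulting literal could then be passed to any of the GeoSPARQL filter functions, in the same position as the POLYGON literals in the queries in this thread.
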
> >>
> >> Thanks,
> >>
> >> Greg
> >>
> >>
> >> On 03/12/2018 10:53, Marco Neumann wrote:
> >>> so I've had a look at this and while I think geosparql-jena is a very
> >>> welcomed contribution to the jena project I don't think we should rush with
> >>> the retirement of jena-spatial at this point as Greg's approach will
> >>> require users to make changes to their existing data.
> >>>
> >>> I will engage Greg on us...@jena.apache.org again to clarify a few things
> >>> and hopefully get more people involved in this conversation around spatial,
> >>> geosparql and jena.
> >>>
> >>>
> >>>
> >>> On Fri, Nov 30, 2018 at 1:23 PM Marco Neumann <marco.neum...@gmail.com>
> >>> wrote:
> >>>
> >>>> how quickly can you hook geosparql into the release?
> >>>>
> >>>> this would make lucene spatial obsolete in the next release. has Greg
> >>>> released performance benchmarks for his implementation? as I said I will
> >>>> take a look at it over the weekend when time permits.
> >>>>
> >>>> On Fri, Nov 30, 2018 at 11:02 AM Andy Seaborne <a...@apache.org> wrote:
> >>>>
> >>>>> We could retire jena-spatial immediately after 3.10.0 - given the Lucene
> >>>>> change that might be smoother, one release with updated dependencies.
> >>>>>
> >>>>> If that is the way forward, I think it is (mildly) better to take it out
> >>>>> of the Fuseki/Full build in 3.10.0.
> >>>>>
> >>>>> Andy
> >>>>>
> >>>>> On 29/11/2018 17:00, Marco Neumann wrote:
> >>>>>> I will have to look into that I guess since I am a frequent user of spatial
> >>>>>> data.
> >>>>>>
> >>>>>> why not go to 7.5? was there an incompatibility?
> >>>>>>
> >>>>>> On Thu 29. Nov 2018 at 16:53, Andy Seaborne <a...@apache.org> wrote:
> >>>>>>
> >>>>>>> Jena 3.10.0 would be around the end of the year. I'd like to make use of
> >>>>>>> Greg's GeoSPARQL project the "headline" item for the release and to
> >>>>>>> retire jena-spatial in 3.10.0 as an indication of this.
> >>>>>>>
> >>>>>>> Because retirement is a new process for the project, I'm sending this
> >>>>>>> first 3.10.0 message quite early to give us discussion time.
> >>>>>>>
> >>>>>>> == Retirements
> >>>>>>>
> >>>>>>> We have talked about this before but not actually done anything. See
> >>>>>>> separate thread for discussion on retirement process and for the first
> >>>>>>> modules:
> >>>>>>>
> >>>>>>> jena-spatial
> >>>>>>> jena-fuseki1
> >>>>>>> jena-csv
> >>>>>>>
> >>>>>>> == Headlines
> >>>>>>>
> >>>>>>> JENA-664 : GeoSPARQL support
> >>>>>>>
> >>>>>>> I'd like to make use of Greg's GeoSPARQL project the "headline" item for
> >>>>>>> the release and to retire jena-spatial in 3.10.0 as an indication of this.
> >>>>>>>
> >>>>>>> JENA-1621 : Lucene upgrade to 7.4
> >>>>>>> May need to reload lucene indexes.
> >>>>>>> (e.g. the lucene index was created originally with Lucene v5.x (prior
> >>>>>>> to Jena 3.3.0)). See Lucene upgrade tool.
> >>>>>>> https://lucene.apache.org/solr/guide/7_4/indexupgrader-tool.html
> >>>>>>>
> >>>>>>> JENA-1623 : Fuseki security
> >>>>>>> JENA-1627 : HTTP support
> >>>>>>> https://issues.apache.org/jira/browse/JENA-1623
> >>>>>>> http://jena.staging.apache.org/documentation/fuseki2/data-access-control
> >>>>>>>
> >>>>>>> == JIRA:
> >>>>>>>
> >>>>>>> 31 currently.
> >>>>>>>
> >>>>>>> https://s.apache.org/jena-3.10.0-jira
> >>>>>>>
> >>>>>>> == Updates
> >>>>>>>
> >>>>>>> Only plugins. JENA-1624
> >>>>>>>
> >>>>>>> surefire : 2.21.0 -> 2.22.1 (+ SUREFIRE-1588)
> >>>>>>> compiler : 3.7.0 -> 3.8.0
> >>>>>>> shade : 3.1.0 -> 3.2.0
> >>>>>>>
> >>>>>>> Andy
> >>>>>>>
> >>>> --
> >>>>
> >>>>
> >>>> ---
> >>>> Marco Neumann
> >>>> KONA
> >>>>
> >

--

---
Marco Neumann
KONA