Hi Lorenz,

Thanks for the info. I'll look into adding options to disable these
dynamic conversions between units and geometries, since they can
introduce large errors over long distances. Users could then choose
whether to switch them on.

Thanks,

Greg

On 24/02/2022 07:30, Lorenz Buehmann wrote:
Hi Greg,

thanks for providing such an informative answer.

On 23.02.22 17:59, Greg wrote:
Hi Lorenz,

Regarding your final point on the use of Euclidean distance for
geof:distance, this is derived from Requirement A.3.1.4 on page 38 of
the GeoSPARQL standard (quoted below). The definition of the distance
and other query functions follows that of the Simple Features standard
(ISO 19125-1). The Simple Features standard uses a two-dimensional
planar approach: the distance calculation is Euclidean, and Great Circle
is out of scope. Applying the distance function to non-planar SRS
coordinates is regarded as an acceptable error.
Ok, I see - now I'm understanding better.

The Jena implementation follows the GeoSPARQL standard by converting the
second Geometry Literal's coordinates to the first Geometry Literal's
SRS, if required.

A Great Circle distance filter function has been provided as
*spatialF:greatCircleGeom(...)*
(https://jena.apache.org/documentation/geosparql/). This is an extension
namespace for Jena as it is outside the GeoSPARQL standard.
Yep, as you can see from my query I did use spatialF:distance in the
end which maps to Haversine

Could you provide the WKT Geometry Literals returned by your query, so
that they can be tested directly for the asymmetry?

Sure, the data comes from Wikidata but here is a self-contained query
with just the WKT literals in the VALUES clause:


PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX uom: <http://www.opengis.net/def/uom/OGC/1.0/>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX spatialF: <http://jena.apache.org/function/spatial#>
PREFIX afn: <http://jena.apache.org/ARQ/function#>

SELECT * {
  VALUES ?wkt1 {
    "Point(11.416666666667 53.633333333333)"^^geo:wktLiteral
    "Point(11.575 48.1375)"^^geo:wktLiteral
  }
  VALUES ?wkt2 {
    "Point(11.416666666667 53.633333333333)"^^geo:wktLiteral
    "Point(11.575 48.1375)"^^geo:wktLiteral
  }
  #FILTER(?wkt1 != ?wkt2 && str(?wkt1) < str(?wkt2))
  BIND(geof:distance(?wkt1, ?wkt2, uom:kilometer) as ?d1)
  BIND(geof:distance(?wkt2, ?wkt1, uom:kilometer) as ?d2)
  BIND(abs(?d1 - ?d2) as ?diff_d1_d2)
  BIND(spatialF:distance(?wkt1, ?wkt2, uom:kilometer) as ?d_hav)
  BIND(afn:max(abs(?d1 - ?d_hav), abs(?d2 - ?d_hav)) as ?diff_eucl_hav)
}

Note, I commented out the filter because there must be a bug here:
FILTER(?wkt1 != ?wkt2) always leads to an error or false. Can somebody
verify this?

I also checked the source code: the raw Euclidean measure is indeed
the same for two points p1 and p2, but the post-processing that maps
the value to a unit like kilometers does more math and depends on the
starting longitude value.
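
The effect described above can be reproduced outside Jena with a small
Python sketch. It is not Jena's code: the conversion function, the
111.32 km-per-degree constant and the Earth radius are assumptions
chosen for illustration. The sketch takes the symmetric planar distance
in degrees between the two points from the VALUES clause and converts
it to kilometres with a factor that depends on the position of the
first argument (here, the cosine of its latitude), so swapping the
arguments changes the result.

```python
import math

# Assumed constants for illustration only (not taken from Jena).
KM_PER_DEGREE_AT_EQUATOR = 111.32
EARTH_RADIUS_KM = 6371.0


def planar_degrees(p, q):
    """Euclidean distance between two (lon, lat) points, in degrees."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def degrees_to_km_at(start, degrees):
    """Hypothetical conversion: scale by the length of one degree of
    longitude at the latitude of the starting point."""
    return degrees * KM_PER_DEGREE_AT_EQUATOR * math.cos(math.radians(start[1]))


def haversine_km(p, q):
    """Great-circle distance between two (lon, lat) points, in km."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# The two WKT points from the query, as (lon, lat).
p1 = (11.416666666667, 53.633333333333)
p2 = (11.575, 48.1375)

raw = planar_degrees(p1, p2)      # same value in both directions
d1 = degrees_to_km_at(p1, raw)    # conversion anchored at p1
d2 = degrees_to_km_at(p2, raw)    # conversion anchored at p2
d_hav = haversine_km(p1, p2)

print(f"raw={raw:.4f} deg  d1={d1:.1f} km  d2={d2:.1f} km  haversine={d_hav:.1f} km")
```

Under these assumptions the two converted values come out near 362.9 km
and 408.4 km, and the Haversine distance near 611.2 km, which is close
to the wd:Q1709 / wd:Q1726 row of the results reported in this thread.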



Thanks,

Greg

*A.3.1.4 /conf/geometry-extension/query-functions*

Requirement: /req/geometry-extension/query-functions
Implementations shall support geof:distance, geof:buffer,
geof:convexHull, geof:intersection, geof:union, geof:difference,
geof:symDifference, geof:envelope and geof:boundary as SPARQL extension
functions, consistent with the definitions of the corresponding
functions (distance, buffer, convexHull, intersection, difference,
symDifference, envelope and boundary respectively) in Simple Features
[ISO 19125-1].


On 23/02/2022 08:56, Lorenz Buehmann wrote:
Thanks both for your very helpful input - I'm still a GeoSPARQL novice
and trying to learn stuff and first of all just use the Jena
implementation as efficiently as possible.

On 21.02.22 15:22, Andy Seaborne wrote:


On 21/02/2022 09:07, Lorenz Buehmann wrote:
Any experience or comments so far?

Using SubsystemLifecycle, could make the conversions by

    GeoSPARQLOperations.convertGeoPredicates

extensible.

    Andy

But having coordinate location (P625), located on astronomical body
(P376) as properties of a thing, is dangerous because of monotonicity
in RDF:

   SELECT * { ?x wdt:P625 ?coords }

the association of P625 and P376 is lost.

Yep, I could simply omit the extra-terrestrial entities for now when
storing the GeoSPARQL-conformant triples in the separate graph -
clearly, this would need the full Wikidata dump, as qualifiers are not
contained in truthy.

As Marco pointed out, there is an ongoing discussion in the Wikidata
community:
https://www.wikidata.org/wiki/Wikidata:Property_proposal/planetary_coordinates



What is the range of P625? It is not "earth geometry" any more.
What if there is no P376 on ?x?

Wikidata doesn't really have a concept of range, or rather, it does
not make use of RDFS at all. It uses "property constraints", and if I
look at https://www.wikidata.org/wiki/Property:P625 they more or less
define some kind of domain,

"not being human or company or railway", and some weirder ones like
"not being a female given name" etc. I can't see any range, at least
not in a structured data format; maybe only in some discussion.

Currently, I'd treat absence of P376 as "on Earth" but that's just my
interpretation.


As with any n-ary-like relationship, the indirection keeps the
related properties together.

This is not unique to geo. Temperatures with units, for example.


---------------------------------

This brings me to another "issue" - or let's call it unexpected
behavior which for me is counter-intuitive:

I used geof:distance function and according to GeoSPARQL standard this
is defined as

Returns the shortest distance in units between any two Points in the
two geometric objects as calculated in the spatial reference system of
geom1.
so I'd expect some metric appropriate to the CRS in use, and CRS84 if
none is given. But according to the source code Jena implements just
the Euclidean distance; is this intended? Here is an example of a few
cities in Germany with their pairwise distances as well as the
Haversine distance:

PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX uom: <http://www.opengis.net/def/uom/OGC/1.0/>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX spatialF: <http://jena.apache.org/function/spatial#>
PREFIX afn: <http://jena.apache.org/ARQ/function#>

SELECT ?s ?o ?d1 ?d2 ?diff_d1_d2 ?d_hav ?diff_eucl_hav {
  VALUES ?s {wd:Q1709 wd:Q64 wd:Q1729 wd:Q1718 wd:Q1726}
  VALUES ?o {wd:Q1709 wd:Q64 wd:Q1729 wd:Q1718 wd:Q1726}
  ?s wdt:P625 ?wkt1 .
  ?o wdt:P625 ?wkt2 .
  FILTER(?s != ?o && str(?s) < str(?o))
  BIND(geof:distance(?wkt1, ?wkt2, uom:kilometer) as ?d1)
  BIND(geof:distance(?wkt2, ?wkt1, uom:kilometer) as ?d2)
  BIND(abs(?d1 - ?d2) as ?diff_d1_d2)
  BIND(spatialF:distance(?wkt1, ?wkt2, uom:kilometer) as ?d_hav)
  BIND(afn:max(abs(?d1 - ?d_hav), abs(?d2 - ?d_hav)) as ?diff_eucl_hav)
}
with result

+----------+----------+--------------+--------------+----------------------+--------------+----------------------+
|    s     |    o     |      d1      |      d2      |      diff_d1_d2      |    d_hav     |    diff_eucl_hav     |
+----------+----------+--------------+--------------+----------------------+--------------+----------------------+
| wd:Q1709 | wd:Q64   | 149.280218e0 | 153.202637e0 | 3.922419000000019e0  | 180.75785e0  | 31.477632e0          |
| wd:Q1709 | wd:Q1729 | 177.123944e0 | 188.077111e0 | 10.953167000000008e0 | 296.42569e0  | 119.30174599999998e0 |
| wd:Q1709 | wd:Q1718 | 345.13558e0  | 364.477344e0 | 19.341764000000012e0 | 412.752229e0 | 67.616649e0          |
| wd:Q1709 | wd:Q1726 | 362.915021e0 | 408.448278e0 | 45.53325699999999e0  | 611.210126e0 | 248.29510499999992e0 |
| wd:Q1729 | wd:Q64   | 197.116217e0 | 190.514338e0 | 6.601878999999997e0  | 235.639289e0 | 45.12495099999998e0  |
| wd:Q1718 | wd:Q64   | 469.456614e0 | 456.224537e0 | 13.232077000000004e0 | 475.626349e0 | 19.401812000000007e0 |
| wd:Q1718 | wd:Q1729 | 297.248804e0 | 298.880777e0 | 1.6319730000000163e0 | 298.493316e0 | 1.244511999999986e0  |
| wd:Q1718 | wd:Q1726 | 398.21636e0  | 424.395158e0 | 26.178797999999972e0 | 487.365165e0 | 89.14880499999998e0  |
| wd:Q1726 | wd:Q64   | 351.968792e0 | 320.94899e0  | 31.019802000000027e0 | 503.534544e0 | 182.585554e0         |
| wd:Q1726 | wd:Q1729 | 214.882078e0 | 202.734071e0 | 12.148007000000007e0 | 318.297549e0 | 115.563478e0         |
+----------+----------+--------------+--------------+----------------------+--------------+----------------------+



The result shows huge differences:

- within the Euclidean distance itself: it's not even symmetric? I
mean, we have differences of 45 km here
- between the Euclidean distance and the Haversine distance: in the
worst case we have a difference of 250 km

So my question is if this is just what we have to live with and if so
how would people be aware of even the asymmetry of this measure?

