Thanks both for your very helpful input - I'm still a GeoSPARQL novice, trying to learn and, first of all, just to use the Jena implementation as efficiently as possible.

On 21.02.22 15:22, Andy Seaborne wrote:


On 21/02/2022 09:07, Lorenz Buehmann wrote:
Any experience or comments so far?

Using SubsystemLifecycle, could make the conversions by

    GeoSPARQLOperations.convertGeoPredicates

extensible.

    Andy

But having coordinate location (P625), located on astronomical body (P376) as properties of a thing, is dangerous because of monotonicity in RDF:

   SELECT * { ?x wdt:P625 ?coords }

the association of P625 and P376 is lost.

Yep, I could simply omit the extra-terrestrial entities for now when storing the GeoSPARQL-conformant triples in the separate graph - clearly, this would need the full Wikidata dump, as qualifiers are not contained in the truthy dump.

As Marco pointed out, there is an ongoing discussion in the Wikidata community: https://www.wikidata.org/wiki/Wikidata:Property_proposal/planetary_coordinates


What is the range of P625? It is not "earth geometry" any more.
What if there is no P376 on ?x?

Wikidata doesn't really have a concept of range, or let's say they do not make use of RDFS at all. They use "property constraints", and if I look at https://www.wikidata.org/wiki/Property:P625 they more or less define some kind of domain: "not being human or company or railway" and some other, weirder ones like "not being a female given name" etc. I can't see any range, at least not in a structured data format; maybe in some discussion only.

Currently, I'd treat absence of P376 as "on Earth", but that's just my interpretation.


As with any n-ary-like relationship, the indirection keeps the related properties together.

This is not unique to geo. Temperatures with units for example


---------------------------------

This brings me to another "issue" - or let's call it unexpected behavior which for me is counter-intuitive:

I used the geof:distance function, and according to the GeoSPARQL standard it is defined as

Returns the shortest distance in units between any two Points in the two geometric
objects as calculated in the spatial reference system of geom1.
so I'd expect a metric that respects the CRS in use, and if none is given it should be CRS84. But according to the source code, Jena just implements the Euclidean distance. Is this intended? Here is an example of a few cities in Germany with their pairwise distances as well as the Haversine distance:

PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX uom: <http://www.opengis.net/def/uom/OGC/1.0/>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX spatialF: <http://jena.apache.org/function/spatial#>
PREFIX afn: <http://jena.apache.org/ARQ/function#>

SELECT ?s ?o ?d1 ?d2 ?diff_d1_d2 ?d_hav ?diff_eucl_hav {
  VALUES ?s {wd:Q1709 wd:Q64 wd:Q1729 wd:Q1718 wd:Q1726}
  VALUES ?o {wd:Q1709 wd:Q64 wd:Q1729 wd:Q1718 wd:Q1726}
  ?s wdt:P625 ?wkt1 .
  ?o wdt:P625 ?wkt2 .
  FILTER(?s != ?o && str(?s) < str(?o))
  BIND(geof:distance(?wkt1, ?wkt2, uom:kilometer) as ?d1)
  BIND(geof:distance(?wkt2, ?wkt1, uom:kilometer) as ?d2)
  BIND(abs(?d1 - ?d2) as ?diff_d1_d2)
  BIND(spatialF:distance(?wkt1, ?wkt2, uom:kilometer) as ?d_hav)
  BIND(afn:max(abs(?d1 - ?d_hav), abs(?d2 - ?d_hav)) as ?diff_eucl_hav)
}
with result

+----------+----------+--------------+--------------+----------------------+--------------+----------------------+
|    s     |    o     |      d1      |      d2      |      diff_d1_d2      |    d_hav     |    diff_eucl_hav     |
+----------+----------+--------------+--------------+----------------------+--------------+----------------------+
| wd:Q1709 | wd:Q64   | 149.280218e0 | 153.202637e0 | 3.922419000000019e0  | 180.75785e0  | 31.477632e0          |
| wd:Q1709 | wd:Q1729 | 177.123944e0 | 188.077111e0 | 10.953167000000008e0 | 296.42569e0  | 119.30174599999998e0 |
| wd:Q1709 | wd:Q1718 | 345.13558e0  | 364.477344e0 | 19.341764000000012e0 | 412.752229e0 | 67.616649e0          |
| wd:Q1709 | wd:Q1726 | 362.915021e0 | 408.448278e0 | 45.53325699999999e0  | 611.210126e0 | 248.29510499999992e0 |
| wd:Q1729 | wd:Q64   | 197.116217e0 | 190.514338e0 | 6.601878999999997e0  | 235.639289e0 | 45.12495099999998e0  |
| wd:Q1718 | wd:Q64   | 469.456614e0 | 456.224537e0 | 13.232077000000004e0 | 475.626349e0 | 19.401812000000007e0 |
| wd:Q1718 | wd:Q1729 | 297.248804e0 | 298.880777e0 | 1.6319730000000163e0 | 298.493316e0 | 1.244511999999986e0  |
| wd:Q1718 | wd:Q1726 | 398.21636e0  | 424.395158e0 | 26.178797999999972e0 | 487.365165e0 | 89.14880499999998e0  |
| wd:Q1726 | wd:Q64   | 351.968792e0 | 320.94899e0  | 31.019802000000027e0 | 503.534544e0 | 182.585554e0         |
| wd:Q1726 | wd:Q1729 | 214.882078e0 | 202.734071e0 | 12.148007000000007e0 | 318.297549e0 | 115.563478e0         |
+----------+----------+--------------+--------------+----------------------+--------------+----------------------+

The result shows huge differences in two respects:

- the Euclidean distance itself: it's not even symmetric? I mean, swapping the arguments changes the result by up to 45 km here
- the Euclidean distance versus the Haversine distance: in the worst case we have a difference of about 250 km

So my question is if this is just what we have to live with and if so how would people be aware of even the asymmetry of this measure?
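
One way such an asymmetry can arise is sketched below in Python. It is only an illustration under stated assumptions, not Jena's confirmed implementation: it assumes the planar distance is computed in degrees and then scaled to kilometres using the latitude of the first argument only. The function names, the 111.32 km-per-degree constant, and the approximate Berlin and Munich coordinates are all assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0088   # mean Earth radius (assumed constant)
KM_PER_DEGREE = 111.32        # approx. length of one degree at the equator (assumed)


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance on a sphere; symmetric in its two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def degree_euclid_km(lon1, lat1, lon2, lat2):
    """Hypothetical model: Euclidean distance in degrees, converted to km
    with a scale factor taken at the latitude of the FIRST point only."""
    degrees = math.hypot(lon2 - lon1, lat2 - lat1)
    return degrees * KM_PER_DEGREE * math.cos(math.radians(lat1))


# Approximate (lon, lat) pairs, not taken from the Wikidata query above.
BERLIN = (13.405, 52.52)
MUNICH = (11.575, 48.137)

d1 = degree_euclid_km(*BERLIN, *MUNICH)   # scaled at Berlin's latitude
d2 = degree_euclid_km(*MUNICH, *BERLIN)   # scaled at Munich's latitude
d_hav = haversine_km(*BERLIN, *MUNICH)

print(f"d1={d1:.1f} km  d2={d2:.1f} km  haversine={d_hav:.1f} km")
```

Because the scale factor depends on which point comes first, d1 and d2 differ in this model, while the Haversine value is the same in both directions. Whether this is the actual cause of the asymmetry in the table would need to be checked against the Jena source.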
