Thanks both for your very helpful input - I'm still a GeoSPARQL novice,
trying to learn and, first of all, to use the Jena implementation as
efficiently as possible.
On 21.02.22 15:22, Andy Seaborne wrote:
On 21/02/2022 09:07, Lorenz Buehmann wrote:
Any experience or comments so far?
Using SubsystemLifecycle, could make the conversions by
GeoSPARQLOperations.convertGeoPredicates
extensible.
Andy
But having coordinate location (P625), located on astronomical body
(P376) as properties of a thing, is dangerous because of monotonicity
in RDF:
SELECT * { ?x wdt:P625 ?coords }
the association of P625 and P376 is lost.
Yep, I could simply omit the extra-terrestrial entities for now when
storing the GeoSPARQL-conformant triples in the separate graph - clearly,
this would need the full Wikidata dump, as qualifiers are not contained
in the truthy dump.
As Marco pointed out, there is an ongoing discussion in the Wikidata
community:
https://www.wikidata.org/wiki/Wikidata:Property_proposal/planetary_coordinates
What is the range of P625? It is not "earth geometry" any more.
What if there is no P376 on ?x?
Wikidata doesn't really have a concept of range, or let's say it does not
make use of RDFS at all. It uses "property constraints", and if I look
at https://www.wikidata.org/wiki/Property:P625 they more or less define
some kind of domain, such as "not being human or company or railway", and
some weirder ones like "not being a female given name" etc. I can't see
any range, at least not in a structured data format; maybe only in some
discussion.
Currently, I'd treat absence of P376 as "on Earth", but that's just my
interpretation.
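
The "absence of P376 means on Earth" reading can be sketched in a few lines of Python. This is only an illustration under assumptions: the record type, the function name and the sample entries are hypothetical, and how the P376 value is obtained for each entity (qualifier or plain statement) is left open.

```python
from typing import Iterable, List, NamedTuple, Optional

EARTH = "Q2"  # Wikidata item for Earth


class CoordRecord(NamedTuple):
    """Hypothetical record: one P625 value plus the P376 value that goes with it."""
    entity: str          # identifier of the thing
    wkt: str             # coordinate as WKT text
    body: Optional[str]  # P376 value, or None when no P376 is present


def terrestrial(records: Iterable[CoordRecord]) -> List[CoordRecord]:
    """Keep records on Earth; a missing P376 is interpreted as 'on Earth'."""
    return [r for r in records if r.body is None or r.body == EARTH]


# Illustrative sample data only, not taken from Wikidata.
records = [
    CoordRecord("ex:noBody", "Point(13.4 52.5)", None),
    CoordRecord("ex:explicitEarth", "Point(11.6 48.1)", "Q2"),
    CoordRecord("ex:onMars", "Point(-133.8 18.65)", "Q111"),  # Q111 = Mars
]

kept = [r.entity for r in terrestrial(records)]
print(kept)
```

Only the records that pass this filter would then be written to the separate GeoSPARQL graph.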
As with any n-ary-like relationship, the indirection keeps the related
properties together.
This is not unique to geo. Temperatures with units for example
---------------------------------
This brings me to another "issue" - or let's call it unexpected behavior,
which I find counter-intuitive:
I used the geof:distance function, which the GeoSPARQL standard defines
as

  "Returns the shortest distance in units between any two Points in the
  two geometric objects as calculated in the spatial reference system of
  geom1."

so I'd expect a metric that takes the CRS in use into account, falling
back to CRS84 if none is given. But according to the source code, Jena
implements just the Euclidean distance. Is this intended? Here is an
example with a few cities in Germany, showing their pairwise distances as
well as the Haversine distance:
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX uom: <http://www.opengis.net/def/uom/OGC/1.0/>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX spatialF: <http://jena.apache.org/function/spatial#>
PREFIX afn: <http://jena.apache.org/ARQ/function#>

SELECT ?s ?o ?d1 ?d2 ?diff_d1_d2 ?d_hav ?diff_eucl_hav {
  VALUES ?s {wd:Q1709 wd:Q64 wd:Q1729 wd:Q1718 wd:Q1726}
  VALUES ?o {wd:Q1709 wd:Q64 wd:Q1729 wd:Q1718 wd:Q1726}
  ?s wdt:P625 ?wkt1 .
  ?o wdt:P625 ?wkt2 .
  # keep each unordered pair of cities only once
  FILTER(?s != ?o && str(?s) < str(?o))
  # geof:distance in both argument orders
  BIND(geof:distance(?wkt1, ?wkt2, uom:kilometer) as ?d1)
  BIND(geof:distance(?wkt2, ?wkt1, uom:kilometer) as ?d2)
  BIND(abs(?d1 - ?d2) as ?diff_d1_d2)
  # spatialF:distance for comparison (the Haversine column)
  BIND(spatialF:distance(?wkt1, ?wkt2, uom:kilometer) as ?d_hav)
  # larger deviation of either geof:distance result from ?d_hav
  BIND(afn:max(abs(?d1 - ?d_hav), abs(?d2 - ?d_hav)) as ?diff_eucl_hav)
}
with result
+----------+----------+--------------+--------------+----------------------+--------------+----------------------+
| s        | o        | d1           | d2           | diff_d1_d2           | d_hav        | diff_eucl_hav        |
+----------+----------+--------------+--------------+----------------------+--------------+----------------------+
| wd:Q1709 | wd:Q64   | 149.280218e0 | 153.202637e0 | 3.922419000000019e0  | 180.75785e0  | 31.477632e0          |
| wd:Q1709 | wd:Q1729 | 177.123944e0 | 188.077111e0 | 10.953167000000008e0 | 296.42569e0  | 119.30174599999998e0 |
| wd:Q1709 | wd:Q1718 | 345.13558e0  | 364.477344e0 | 19.341764000000012e0 | 412.752229e0 | 67.616649e0          |
| wd:Q1709 | wd:Q1726 | 362.915021e0 | 408.448278e0 | 45.53325699999999e0  | 611.210126e0 | 248.29510499999992e0 |
| wd:Q1729 | wd:Q64   | 197.116217e0 | 190.514338e0 | 6.601878999999997e0  | 235.639289e0 | 45.12495099999998e0  |
| wd:Q1718 | wd:Q64   | 469.456614e0 | 456.224537e0 | 13.232077000000004e0 | 475.626349e0 | 19.401812000000007e0 |
| wd:Q1718 | wd:Q1729 | 297.248804e0 | 298.880777e0 | 1.6319730000000163e0 | 298.493316e0 | 1.244511999999986e0  |
| wd:Q1718 | wd:Q1726 | 398.21636e0  | 424.395158e0 | 26.178797999999972e0 | 487.365165e0 | 89.14880499999998e0  |
| wd:Q1726 | wd:Q64   | 351.968792e0 | 320.94899e0  | 31.019802000000027e0 | 503.534544e0 | 182.585554e0         |
| wd:Q1726 | wd:Q1729 | 214.882078e0 | 202.734071e0 | 12.148007000000007e0 | 318.297549e0 | 115.563478e0         |
+----------+----------+--------------+--------------+----------------------+--------------+----------------------+
The result shows large differences:
- between the two Euclidean distances (d1 and d2) themselves: the measure
is not even symmetric, with differences of up to about 45 km here
- between the Euclidean distance and the Haversine distance: almost
250 km in the worst case
So my question is whether this is just what we have to live with and, if
so, how people are supposed to become aware of even the asymmetry of
this measure?
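
For comparison, a great-circle measure such as Haversine is symmetric by construction, so swapping the arguments cannot change the result. Below is a minimal Python sketch, assuming a spherical Earth with a mean radius of 6371 km. The function name and the radius constant are illustrative choices, and this is not claimed to be what spatialF:distance computes internally.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance in km between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine formula: only squared sines of the differences and the
    # product of the two latitude cosines appear, so argument order is irrelevant.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude along the equator: about 111.19 km on this sphere.
one_degree = haversine_km(0.0, 0.0, 1.0, 0.0)

# Illustrative points (lon, lat); the values are made up for the example.
a, b = (13.4, 52.5), (11.6, 48.1)
d_ab = haversine_km(*a, *b)
d_ba = haversine_km(*b, *a)
print(one_degree, d_ab, d_ba)
```

Running the same check on geof:distance, as the ?diff_d1_d2 column in the table does, is a quick way to spot the asymmetry.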