Dear Ruben,

LDF seems a very promising solution for building a reliable Linked Data 
production environment with high scalability at relatively low cost.

However, I'm not sure whether the solution works well for queries like the 
one discussed here (see below). It would be very interesting to learn how 
exactly such a query would be handled in an LDF client/server setting.

To me, a crucial point seems to be that I'm trying to look up a large number of 
distinct entities in two endpoints and join them. In the "real life" case 
discussed here, these are about 430,000 "economists" extracted from GND and 
about 320,000 "persons with GND id" from Wikidata. The join yields about 30,000 
Wikidata items, for which the German and English Wikipedia site links are 
required.
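
To make the shape of that join concrete, here is a minimal Python sketch with 
made-up toy data. The identifiers, the data layout and the function name 
`join_sitelinks` are illustrative assumptions, not the actual GND or Wikidata 
data model. It only shows the kind of hash join I have in mind, not how an LDF 
client would actually execute it.

```python
# Toy illustration of the join described above; all identifiers are made up.

def join_sitelinks(gnd_to_wd, sitelinks, languages=("en", "de")):
    """Join GND -> Wikidata mappings with Wikipedia site links.

    gnd_to_wd: dict mapping a GND URI to a Wikidata item URI
               (the skos:exactMatch links on the econ_pers side).
    sitelinks: iterable of (sitelink_url, wikidata_item, language) tuples
               (schema:about / schema:inLanguage on the Wikidata side).
    Returns a sorted list of (gnd_uri, sitelink_url) pairs.
    """
    # Build a hash index on the Wikidata item, so each site link is probed once.
    wd_to_gnd = {}
    for gnd, wd in gnd_to_wd.items():
        wd_to_gnd.setdefault(wd, []).append(gnd)

    result = []
    for url, wd, lang in sitelinks:
        # Keep only Wikipedia links in the requested languages.
        if lang not in languages or "wikipedia" not in url:
            continue
        for gnd in wd_to_gnd.get(wd, []):
            result.append((gnd, url))
    return sorted(result)


# Hypothetical sample data, far smaller than the real case.
gnd_to_wd = {"gnd:1": "wd:Q1", "gnd:2": "wd:Q2"}
sitelinks = [
    ("https://en.wikipedia.org/wiki/A", "wd:Q1", "en"),
    ("https://fr.wikipedia.org/wiki/A", "wd:Q1", "fr"),
    ("https://de.wikipedia.org/wiki/C", "wd:Q3", "de"),
]
pairs = join_sitelinks(gnd_to_wd, sitelinks)
print(pairs)
```

In this toy run only the English link for `wd:Q1` survives: the French link is 
filtered out by language, and `wd:Q3` has no GND counterpart.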

How could an LDF client get this information effectively?

Cheers, Joachim

> PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
> PREFIX schema: <http://schema.org/>
> #
> construct {
>    ?gnd schema:about ?sitelink .
> }
> where {
>    # the relevant wikidata items have already been 
>    # identified and loaded to the econ_pers endpoint in a 
>    # previous step
>    service <http://zbw.eu/beta/sparql/econ_pers/query> {
>      ?gnd skos:prefLabel [] ;
>           skos:exactMatch ?wd .
>      filter(contains(str(?wd), 'wikidata'))
>    }
>    ?sitelink schema:about ?wd ;
>              schema:inLanguage ?language .
>    filter (contains(str(?sitelink), 'wikipedia'))
>    filter (?language in ('en', 'de'))
> }
>

-----Original Message-----
From: Ruben Verborgh [mailto:ruben.verbo...@ugent.be] 
Sent: Thursday, 18 February 2016 14:02
To: wikidata@lists.wikimedia.org
Cc: Neubert, Joachim
Subject: Re: [Wikidata] Make federated queries possible / was: SPARQL CONSTRUCT 
results truncated

Dear all,

I don't mean to hijack the thread, but for federation purposes, you might be 
interested in a Triple Pattern Fragments interface [1]. TPF offers lower server 
cost to reach high availability, at the expense of slower queries and higher 
bandwidth [2]. This is possible because the client performs most of the query 
execution.
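
As a rough illustration of what "the client performs most of the query 
execution" means, the following Python sketch simulates it in memory. The 
`fragment` function stands in for a server that can only answer single triple 
patterns, and `evaluate` plays the client, which joins the patterns itself. 
Both names and the toy triples are assumptions for illustration only; a real 
client would fetch paged fragments over HTTP and would typically order the 
patterns by their estimated result counts.

```python
# In-memory stand-in for a TPF server: it answers single triple patterns only.
TRIPLES = [
    ("gnd:1", "skos:exactMatch", "wd:Q1"),
    ("gnd:2", "skos:exactMatch", "wd:Q2"),
    ("enwiki:A", "schema:about", "wd:Q1"),
    ("enwiki:A", "schema:inLanguage", "en"),
]


def fragment(s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def is_var(term):
    return term.startswith("?")


def evaluate(patterns, binding=None):
    """Client-side join: resolve the patterns one by one via fragment()."""
    binding = binding or {}
    if not patterns:
        return [binding]
    first, rest = patterns[0], patterns[1:]
    # Substitute already-bound variables; unbound variables become wildcards.
    lookup = [None if is_var(t) and t not in binding else binding.get(t, t)
              for t in first]
    solutions = []
    for triple in fragment(*lookup):
        new_binding = dict(binding)
        consistent = True
        for term, value in zip(first, triple):
            # Bind each variable, rejecting conflicting values.
            if is_var(term) and new_binding.setdefault(term, value) != value:
                consistent = False
                break
        if consistent:
            solutions.extend(evaluate(rest, new_binding))
    return solutions


query = [
    ("?gnd", "skos:exactMatch", "?wd"),
    ("?sitelink", "schema:about", "?wd"),
    ("?sitelink", "schema:inLanguage", "en"),
]
results = evaluate(query)
print(results)
```

This naive strategy issues one lookup per intermediate binding, which is where 
the slower queries and higher bandwidth come from, while each individual 
request stays cheap for the server.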

I noticed the Wikidata SPARQL endpoint has had an excellent track record so far 
(congratulations on this), so the TPF solution might not be necessary for 
server cost / availability reasons.

However, TPF is an excellent solution for federated queries. In (yet to be 
published) experiments, we have verified that the TPF client/server solution 
performs on par with state-of-the-art federation frameworks based on SPARQL 
endpoints for many simple and complex queries. Furthermore, there are no 
security problems such as the "open proxy" issue, because all federation is 
performed by the client.

You can see a couple of example queries here with other datasets:
- Works by writers born in Stockholm (VIAF and DBpedia - 
  http://bit.ly/writers-stockholm)
- Books by Swedish Nobel prize winners that are in the Harvard Library 
  (VIAF, DBpedia, Harvard - http://bit.ly/swedish-nobel-harvard)

It might be a quick win to set up a TPF interface on top of the existing SPARQL 
endpoint.
If you want any info, don't hesitate to ask.

Best,

Ruben

[1] http://linkeddatafragments.org/in-depth/
[2] http://linkeddatafragments.org/publications/iswc2014.pdf

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
