Re: [Wikidata] Can LDF scale?

2016-12-22 Thread Ruben Verborgh
Hi Markus, > I am not sure but would guess that my 1h20min query has not received much > more than 100MB of data. That might be possible (we'd need to verify), but then this means the other time was spent computing, which shows the query plan or execution was highly inefficient. So not an inhere

Re: [Wikidata] Can LDF scale? (Was: Linked data fragment enabled on the Query Service)

2016-12-22 Thread Ruben Verborgh
Hi Stas, > It is possible to have more horizontal-scale replication - i.e. adding > servers - of course, at the cost of hardware which inevitably raises the > question of budget - Since the number of non-empty TPFs per dataset is finite, just more caching should do, depending of course on the cha

Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-22 Thread Ruben Verborgh
Hi Stas, I found an important problem with the current configuration, that is likely a major factor in the performance loss we see. I noticed that https://query.wikidata.org/bigdata/ldf is served on HTTPS, and even with HTTP/2. However, the hypermedia controls inside of the message direct the ser

Re: [Wikidata] Can LDF scale? (Was: Linked data fragment enabled on the Query Service)

2016-12-22 Thread Ruben Verborgh
Hi Markus, > A thing I was wondering about while testing LDF is how this type of service > might behave under load. In the tests I am doing, my single browser issues > several 100,000 requests for a single query, at an average rate close to 100 > requests per second. This is one user. That's i

Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-22 Thread Ruben Verborgh
>> Finally on Chrome: 320 results in 3496.4s in my settings. > > I got the same result on Chrome, but it took 4600 sec here (1h 20min). Definitely a case of the query planner making a very wrong decision here. These are interesting examples we need to check when designing the new query engine. F

Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-22 Thread Ruben Verborgh
Probably a relevant question for Stas given the varying measurements: is the TPF server being fronted by a cache? If not, that might partly explain some of the things we're seeing. Here's an example NGINX config I use on fragments.dbpedia.org: https://gist.github.com/RubenVerborgh/6d4ac975f0f36b6d

Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-22 Thread Ruben Verborgh
Hi all, Thanks for all your feedback regarding the results and timings. I must say I'm quite surprised to see the high number of variations; the software already started 3 years ago, and has been battle-tested many times already, so such high deviations are highly unexpected. We haven't had simil

Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-22 Thread Ruben Verborgh
Hi Kingsley, >> will see a substantial increase in server costs >> when they try to host that same data as a public SPARQL HTTP service. > > Again subjective. No, that's not subjective, that's perfectly measurable. And that's exactly what we did in our research. The problem with the SPARQL prot

Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-21 Thread Ruben Verborgh
Hi Thad, > Looks like Jena is doing most of your heavy lifting in the Java client ? Absolutely, Jena is doing almost everything. However, Jena is built with certain assumptions that don't hold for querying over the public Web as with TPF, so it doesn't work as optimally as for other backends.

Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-21 Thread Ruben Verborgh
Hi Stas, > Also, I suspect this engine does not implement path > queries right It doesn't implement them at all; it reads the first predicate and ignores the rest. Right now, the engine implements: – BGPs – UNION – OPTIONAL – some FILTERs (https://github.com/LinkedDataFragments/Client.js/blob/m

Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-21 Thread Ruben Verborgh
Hi Markus, >> Any query is possible (given a completely implemented engine), >> but some of them just take a lot of time. > > Sorry, but this is really not what I am seeing. The queries I have tried all > failed entirely. They were not slow, they completed computation with no > results, time ou

Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-21 Thread Ruben Verborgh
Hi Markus, > It was clearly not built for interactive operation On the contrary, it is: imagine applications in the browser that react when each result comes in. Don't focus on the total time, focus on the results streaming in. Web querying takes time, especially in a federated setting. The whol

Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-21 Thread Ruben Verborgh
Hi Markus, > (I did read your paper ;-) Awesome :-) > Of course all the issues I observe might be due to implementation. Or due to the kind of queries. Some queries will always be hard; with SPARQL endpoints, you pay the price in server cost; with TPF, you pay the price in query time and bandwi

Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-21 Thread Ruben Verborgh
Hi Kingsley, > The Semantic Web community hasn't focused exclusively on query execution > speed. Let me clarify myself: the scientific SemWeb community mostly focused on speed, as is apparent from publications about SPARQL query execution (and, from personal experience, many researchers and revie

Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-21 Thread Ruben Verborgh
Hi Markus, > I got the 55sec time while in my office, connected via an Ethernet cable to > the university network. This is more or less directly wired up to the > backbone of the Internet, so bandwidth cannot be the issue here. The latency is the main culprit, not bandwidth. I also noted a big
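A rough back-of-the-envelope sketch of why latency rather than bandwidth can dominate: the request count and round-trip times below are made-up illustrative values, not measurements from this thread, and the model assumes the client issues requests strictly one after another.

```python
def sequential_query_time(num_requests: int, round_trip_s: float) -> float:
    """Lower bound on wall-clock time when each request waits for the previous one."""
    return num_requests * round_trip_s


# Hypothetical values for illustration only.
for rtt_ms in (10, 50, 150):
    total = sequential_query_time(10_000, rtt_ms / 1000)
    print(f"{rtt_ms} ms round trip -> {total:.0f} s for 10,000 requests")
```

Under this simplified model the total time grows linearly with the round-trip time, regardless of how much bandwidth is available.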

Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-21 Thread Ruben Verborgh
> Yes, the plan is to have a whitelist of SPARQL endpoints for which we > allow federation queries. I was going to post about it in January when > everybody's back from vacations :) I'm very curious to see how this will fare. Federation is where I think public SPARQL endpoints will fail, so it wil

Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-21 Thread Ruben Verborgh
Hi Markus, > The example query related to films returns 55 results, while on the official > endpoint it returns 128. This turned out to be due to an incomplete implementation of LANGMATCHES, which I have now fixed in the query engine (https://github.com/LinkedDataFragments/Client.js/commit/09c82

Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-21 Thread Ruben Verborgh
Hi Lucas, > I tried playing with it a bit and noticed an oddity in the JSON format: if > the predicate and object are both left unspecified, "P_" keys will sometimes > refer to full statement nodes and sometimes to truthy values. An example item > with not too many statements where you can witn

Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-21 Thread Ruben Verborgh
Hi Markus, Answering this as the LDF lead developer. > (1) The results do not seem to be correct. The example query related to films > returns 55 results, while on the official endpoint it returns 128. It seems > that this is not because of missing data, but because of wrong multiplicities > (

Re: [Wikidata] Linked data fragment enabled on the Query Service

2016-12-21 Thread Ruben Verborgh
bines data from Wikidata, DBpedia, and VIAF live from the Web to answer the question: which works were created by cubist painters? I hope you all have fun with these new query opportunities! Best, Ruben -- Ruben Verborgh Postdoctoral Researcher in Semantic Hypermedia Ghent University –

Re: [Wikidata] [wikicite-discuss] Entity tagging and fact extraction (from a scholarly publisher perspective)

2016-11-19 Thread Ruben Verborgh
> > Dario, I would love for the WDQS to support federated queries. I'm curious about how reliable this would become, and at what cost. In case you consider scenarios where clients perform federation, you might be interested to see that lightweight interfaces can outperform full SPARQL interfaces

[Wikidata] [CfP] ESWC2017 – still plenty of time left

2016-11-05 Thread Ruben Verborgh
ESWC2017 – 2nd CALL FOR PAPERS conference: 28 May to 1 June 2017 in Portorož, Slovenia submission: 7 December (abstract) and 14 December 2016 (paper) details: http://2017.eswc-conferences.org/call-papers ESWC is a major venue for discussing scientific results and innovations around semantic techn

[Wikidata] [CfP] ESWC2017 – Call for Events

2016-10-07 Thread Ruben Verborgh
ESWC2017 – CALL FOR CHALLENGES, TUTORIALS, WORKSHOPS conference: 28 May to 1 June 2017 in Portorož, Slovenia proposal submission: 18 November 2016 details: http://2017.eswc-conferences.org/ ESWC is a major venue for discussing the latest scientific results and technology innovations around semant

[Wikidata] [CfP] ESWC2017 – Call for Papers

2016-10-07 Thread Ruben Verborgh
ESWC2017 – CALL FOR PAPERS conference: 28 May to 1 June 2017 in Portorož, Slovenia submission: 7 December (abstract) and 14 December 2016 (paper) details: http://2017.eswc-conferences.org/call-papers ESWC is a major venue for discussing scientific results and innovations around semantic technolog

Re: [Wikidata] question on claim-filtered search/dump and works on a Wikidata subset search engine

2016-04-29 Thread Ruben Verborgh
> @ruben @stas I'm not very familiar with Linked Data Fragments, so any > additional links to get a better understanding of how this could help > address this usecase is welcome! The best way to get started is to try it: http://client.linkeddatafragments.org/ (note that text filtering is not part

Re: [Wikidata] question on claim-filtered search/dump and works on a Wikidata subset search engine

2016-04-28 Thread Ruben Verborgh
Hi Maxime, (@Tom, thanks for pinging me.) We have created a self-describing interface for literal search, which can be used for autocompletion. More details here: http://ruben.verborgh.org/publications/vanherwegen_iswc_2015/ Let me know if we can help you! Best, Ruben _

Re: [Wikidata] Make federated queries possible / was: SPARQL CONSTRUCT results truncated

2016-02-18 Thread Ruben Verborgh
Hi Joachim, > To me, a crucial point seems to be that I'm trying to look up a large number > of distinct entities in two endpoints and join them. In the "real life" case > discussed here, about 430.000 "economists" extracted from GND and about > 320.000 "persons with GND id" from wikidata. The

Re: [Wikidata] Make federated queries possible / was: SPARQL CONSTRUCT results truncated

2016-02-18 Thread Ruben Verborgh
Dear all, I don't mean to hijack the thread, but for federation purposes, you might be interested in a Triple Pattern Fragments interface [1]. TPF offers lower server cost to reach high availability, at the expense of slower queries and higher bandwidth [2]. This is possible because the client
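As a minimal sketch of the trade-off described above: a Triple Pattern Fragments server answers only single triple patterns, and the client performs the joins itself. The in-memory dataset, the `fragment` function, and the `works_by_movement` helper are all hypothetical stand-ins for illustration; a real client would fetch paged fragments over HTTP and plan its joins more carefully.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Tiny in-memory dataset standing in for a TPF server (hypothetical data).
DATA: List[Triple] = [
    ("picasso", "movement", "cubism"),
    ("braque", "movement", "cubism"),
    ("monet", "movement", "impressionism"),
    ("guernica", "creator", "picasso"),
    ("violin_and_candlestick", "creator", "braque"),
]


def fragment(s: Optional[str] = None, p: Optional[str] = None,
             o: Optional[str] = None) -> Iterator[Triple]:
    """Server side: answer one triple pattern; None acts as a wildcard."""
    for triple in DATA:
        if all(want is None or want == got for want, got in zip((s, p, o), triple)):
            yield triple


def works_by_movement(movement: str) -> List[str]:
    """Client side: join two patterns by issuing one request per binding."""
    works = []
    for artist, _, _ in fragment(p="movement", o=movement):
        for work, _, _ in fragment(p="creator", o=artist):
            works.append(work)
    return works


print(works_by_movement("cubism"))
```

The server only ever performs cheap pattern lookups, while the client issues one extra request per intermediate binding, which is where the slower queries and higher bandwidth come from.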