[Wikidata] Re: Wikidata Graph Split update

2024-06-17 Thread David Causse
Hi Egon, please see my answers inline. On Tue, Jun 11, 2024 at 4:39 PM Egon Willighagen wrote: > > Hi, thank you for the update. > > The email writes that "Queries that need federation will need to be > rewritten. You can ask for help to rewrite queries". > > Do you have guidelines on how to do

[Wikidata] WDQS lag incident - 2023/09/27

2023-09-28 Thread David Causse
between two of our Kafka clusters [1]. Such an incident should not have impacted WDQS, but it uncovered improper sandboxing of the WDQS updater test setup [2]. Sorry for the inconvenience. -- David Causse Software Engineer, Wikimedia Foundation 0: https://www.mediawiki.org/wiki/Manual:Maxlag_parameter 1

[Wikidata] Re: Help make this Property Query faster

2021-11-05 Thread David Causse
On Fri, Nov 5, 2021 at 3:46 PM Thomas Douillard wrote: > [...] > But a query planner/rewriter should be able to detect a pattern like « > filter lang() = "en" » to take advantage of such an index ? > > With how blazegraph works it is hard to apply filters on literals unless the data to filter is

[Wikidata] Re: Help make this Property Query faster

2021-11-05 Thread David Causse
ing this kind of query and aligning it (which > David Causse might know if that could be changed), essentially its metadata > about Wikidata (its available properties). > 2. it's 2.2 MB of data > > I think that Yi Liu's Wikidata Property Explorer service then might w

[Wikidata] Re: Documentation of the interaction between AUTO_LANGUAGE and JSON from WDQS?

2021-08-31 Thread David Causse
Hi Daniel, Sadly "[AUTO_LANGUAGE]" is a magic word managed by the GUI itself[1] (using JavaScript in the client browser), which explains why this query cannot be used as-is via the SPARQL endpoint. You can check by using a very simple query[2] and inspect the query sent to the SPARQL endpoint using y
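
Since the placeholder is only expanded by the GUI, a client that talks to the SPARQL endpoint directly has to substitute it before sending the query. A minimal sketch of that substitution, assuming a plain string replacement is enough; the helper name resolve_auto_language and the sample query are hypothetical and not taken from the thread:

```python
# Hypothetical helper: the WDQS GUI expands "[AUTO_LANGUAGE]" in the browser,
# so a script must do the equivalent substitution itself before sending
# the query to the SPARQL endpoint.
AUTO_LANGUAGE = "[AUTO_LANGUAGE]"


def resolve_auto_language(query: str, language: str = "en") -> str:
    """Replace the GUI-only placeholder with a concrete language code."""
    return query.replace(AUTO_LANGUAGE, language)


# Sample query for illustration only (not the query discussed in the thread).
query = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 5
"""

resolved = resolve_auto_language(query, "fr")
print(resolved)
```

The resolved string can then be sent to the endpoint with whatever HTTP client the script already uses.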

Re: [Wikidata] Differences in label searching with SPARQL and MediaWiki API

2020-08-07 Thread David Causse
Some answers inline, On Fri, Aug 7, 2020 at 6:07 PM Thad Guidry wrote: > Very nice David! > > 1. Does the MINUS actually utilize ElasticSearch indexes or just > Blazegraph? > > No, Elasticsearch is being used only during the call to the wikibase:mwapi SERVICE. Everything happening outside this c
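
To illustrate the split described in this reply, here is a rough sketch of a query in which only the wikibase:mwapi SERVICE block is answered by the search backend, while the MINUS clause outside it is evaluated by Blazegraph. The function name, the search term, and the MINUS pattern are assumptions for illustration; this is not the query behind the short link in the thread, and the term is interpolated without any escaping.

```python
def label_search_query(term: str, language: str = "en") -> str:
    """Build a SPARQL query that searches labels through the MediaWiki API.

    Only the SERVICE wikibase:mwapi block is resolved by the search backend;
    the MINUS clause after it is evaluated by the triple store.
    Note: `term` is inserted as-is, so it must not contain double quotes.
    """
    return f"""
SELECT ?item WHERE {{
  SERVICE wikibase:mwapi {{
    bd:serviceParam wikibase:endpoint "www.wikidata.org" ;
                    wikibase:api "EntitySearch" ;
                    mwapi:search "{term}" ;
                    mwapi:language "{language}" .
    ?item wikibase:apiOutputItem mwapi:item .
  }}
  # Hypothetical exclusion, evaluated outside the search call.
  MINUS {{ ?item wdt:P31 wd:Q5 . }}
}}
"""


example_query = label_search_query("Soriano")
print(example_query)
```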

Re: [Wikidata] Differences in label searching with SPARQL and MediaWiki API

2020-08-07 Thread David Causse
query to take this into consideration using the MINUS keyword: https://w.wiki/Yzt . Hope it helps, David. On Thu, Aug 6, 2020 at 11:26 PM Thad Guidry wrote: > Hi David Causse, > > Curious why https://www.wikidata.org/wiki/Q24033349 is not being returned > in the below SPARQL? >

Re: [Wikidata] WDQS outage - 2020/07/23

2020-07-27 Thread David Causse
On Sun, Jul 26, 2020 at 11:59 PM Kingsley Idehen wrote: > On 7/26/20 1:44 PM, Egon Willighagen wrote: > > What did I miss? > > > Hi, Thanks for looking into this. Note that we don't yet have a reproducible scenario to trigger the deadlock that caused the incident. Given the strong relations (ti

Re: [Wikidata] Wikimedia Commons Query Service (WCQS)

2020-07-24 Thread David Causse
On Fri, Jul 24, 2020 at 10:52 AM Thomas Pellissier Tanon < tho...@pellissier-tanon.fr> wrote: > > Is there a particular reason that schema:contentUrl was chosen to > point to URLs of the form < > https://upload.wikimedia.org/wikipedia/commons/q/q1/filename.jpg> rather > than

Re: [Wikidata] Differences in label searching with SPARQL and MediaWiki API

2020-07-13 Thread David Causse
On Sat, Jul 11, 2020 at 7:12 PM Thad Guidry wrote: > This query times out: > > SELECT ?item ?label > WHERE > { > ?item wdt:P31 ?instance ; > rdfs:label ?label ; > rdfs:label ?enLabel . > FILTER(CONTAINS(lcase(?label), "Soriano")). > FILTER(?instance != wd:Q5). > SERVICE wikibase:l

[Wikidata] Blank node deprecation in WDQS & Wikibase RDF model

2020-04-16 Thread David Causse
the Contact the development team (query service and search) wiki page[7]. Thanks! -- David Causse 0: https://phabricator.wikimedia.org/T244590 1: https://en.wikipedia.org/wiki/Blank_node 2: https://phabricator.wikimedia.org/T244341#5889997 3: https://phabricator.wikimedia.org/T

Re: [Wikidata] MWAPI internal server error when using search

2020-04-08 Thread David Causse
On Tue, Apr 7, 2020 at 8:16 PM Jan-Christoph Klie < k...@ukp.informatik.tu-darmstadt.de> wrote: > Hello, > > When we issue queries using the MWAPI [1] programmatically, we sometimes > get a 500 error. I would guess that happens in more than 30% of the cases. > The following SPARQL query causes the

Re: [Wikidata] [discovery-private] Indexing all item properties in ElasticSearch

2018-07-28 Thread David Causse
On Sat, Jul 28, 2018 at 2:02 AM Stas Malyshev wrote: > Hi! > > > The top 1000 > > is: > https://docs.google.com/spreadsheets/d/1E58W_t_o6vTNUAx_TG3ifW6-eZE4KJ2VGEaBX_74YkY/edit?usp=sharing > > This one is pretty interesting, how do I extract this data? It may be > useful independently of what we'

Re: [Wikidata] [discovery-private] Indexing all item properties in ElasticSearch

2018-07-27 Thread David Causse
On Fri, Jul 27, 2018 at 3:31 PM David Causse wrote: > What I'd try to avoid in general is indexing terms that have only doc > since they are pretty useless. > I meant: that have only *one* doc

Re: [Wikidata] [discovery-private] Indexing all item properties in ElasticSearch

2018-07-27 Thread David Causse
Hi, I think we already index way more than P31 and P279. For instance we have approximately 102,301,706 distinct values in the term lexicon for statement_keywords. Sadly I can't extract the list of unique PIDs used (we'd have to enable field_data on statement_keywords.property). The top 1000 is: