On Thu, Jul 9, 2020 at 4:52 PM Egon Willighagen <egon.willigha...@gmail.com>
wrote:

>
> Dear Guillaume,
>
> On Thu, Jul 9, 2020 at 3:23 PM Guillaume Lederrey <gleder...@wikimedia.org>
> wrote:
>
>> Some very preliminary analysis indicates that less than 2% of the queries
>> on WDQS generate more than 90% of the load. This is definitely something we
>> need to better understand.
>>
>
> Is the data behind that available? I wonder if I recognize any of the top
> 25 queries.
>

No, the data isn't publicly available. Queries can (and do) contain private
information, so we don't publish raw queries. We might publish a subset of
those queries at some point, but only after having reviewed them manually
to ensure they are clean.

> (I guess the top 2% can be simple queries run very many times, as well as
> hard queries rarely run, correct?)
>

The analysis at this point is just on individual queries, with no
aggregation of similar queries. This means that the queries in this 2% are
each very expensive on their own. We need to refine that analysis, and
aggregating similar queries is one of the things we should be working on.
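
To make "aggregation of similar queries" a bit more concrete, here is a
rough sketch of what such a step could look like. It is only an
illustration, not what we run: the query_shape() heuristic, the
aggregate_load() helper and the (query, seconds) log format are all
assumptions made up for this example, and the sample log is invented.

```python
import re
from collections import defaultdict


def query_shape(query):
    """Reduce a SPARQL query to a rough 'shape' (hypothetical heuristic).

    Literals, item IDs and bare numbers are replaced by placeholders so
    that queries differing only in those values end up in the same group.
    """
    shape = re.sub(r'"[^"]*"', '"?"', query)         # string literals
    shape = re.sub(r'\bwd:Q\d+\b', 'wd:Q?', shape)   # Wikidata item IDs
    shape = re.sub(r'\b\d+\b', '0', shape)           # bare numbers (e.g. LIMIT)
    return re.sub(r'\s+', ' ', shape).strip()        # whitespace differences


def aggregate_load(log):
    """Sum execution time per query shape, most expensive shape first.

    `log` is an iterable of (query, seconds) pairs; this format is an
    assumption for the sketch, not the real WDQS log format.
    """
    totals = defaultdict(float)
    for query, seconds in log:
        totals[query_shape(query)] += seconds
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Invented sample data, for illustration only.
sample_log = [
    ('SELECT ?x WHERE { ?x wdt:P31 wd:Q5 } LIMIT 10', 0.5),
    ('SELECT ?x WHERE { ?x wdt:P31 wd:Q146 }  LIMIT 20', 0.5),
    ('SELECT ?x WHERE { ?x wdt:P279 wd:Q5 }', 30.0),
]

for shape, seconds in aggregate_load(sample_log):
    print(f"{seconds:8.1f}s  {shape}")
```

With grouping like this, a cheap query that is run very many times would
show up as one expensive shape, which the current per-query analysis
cannot see. A real implementation would probably want to normalize on the
parsed query rather than with regular expressions.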


> Egon
>
>
> --
> Hi, do you like citation networks? Already 51% of all citations are
> available <https://i4oc.org/> for innovative new uses
> <https://twitter.com/hashtag/acs2ioc>. Join me in asking the American
> Chemical Society to join the Initiative for Open Citations too
> <https://www.change.org/p/asking-the-american-chemical-society-to-join-the-initiative-for-open-citations>.
>  SpringerNature,
> the RSC and many others already did <https://i4oc.org/#publishers>.
>
> -----
> E.L. Willighagen
> Department of Bioinformatics - BiGCaT
> Maastricht University (http://www.bigcat.unimaas.nl/)
> Homepage: http://egonw.github.com/
> Blog: http://chem-bla-ics.blogspot.com/
> PubList: https://www.zotero.org/egonw
> ORCID: 0000-0001-7542-0286 <http://orcid.org/0000-0001-7542-0286>
> ImpactStory: https://impactstory.org/u/egonwillighagen
> _______________________________________________
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>


-- 
Guillaume Lederrey
Engineering Manager, Search Platform
Wikimedia Foundation
UTC+1 / CET
