Dear wikidata list,

One of the key things we do as Wikidata people is go round the internet, hassling people to create nice identifiers for their things, with URIs and landing-pages that we can link to.

It brought me up quite short to realise that this applies to *us* too -- there is an important thing of ours that we haven't got a linkable identifier for, and that it would be useful to have a short linkable URL for: WDQS queries.


So here's the use-case:

[[User:PKM]] and I have been working with a new external project who want to build a "Gazetteer of Early Modern England and Wales" (EMEW) -- basically a historical GIS for 15th & 16th century England and Wales, able to plot things on this map:
  https://viaeregiae.org/index.php/map/?layers=l9001l0007

There's huge scope for collaboration with Wikidata, with deep linking both ways, as we both try to improve our coverage of C16-C17 England and Wales (expect WikiProject [[:d:WD:EMEW]] just as soon as we can get the pages made)

Something we realise we want to be able to do from the EMEW site is to let a user give it the URL of a WDQS SPARQL query, have the site send that to WDQS, get back a file of EMEW ids, and show the results on the EMEW map.
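To make that concrete, here is a rough Python sketch of what the EMEW side might do once it has the SPARQL text in hand. It assumes the query binds a variable called ?emew holding the EMEW id; that variable name, the function names and the User-Agent string are all made up for illustration. The endpoint URL and the SPARQL JSON results layout are the standard ones.

```python
import json
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def extract_ids(results: dict, var: str = "emew") -> list[str]:
    """Pull the values bound to `var` out of a SPARQL JSON results document."""
    bindings = results.get("results", {}).get("bindings", [])
    # Rows without a binding for `var` (e.g. from an OPTIONAL) are skipped.
    return [row[var]["value"] for row in bindings if var in row]


def fetch_emew_ids(sparql: str, var: str = "emew") -> list[str]:
    """Send `sparql` to WDQS and return the ids bound to `var` (needs network)."""
    params = urllib.parse.urlencode({"query": sparql, "format": "json"})
    # WDQS asks clients to identify themselves; this User-Agent is a placeholder.
    request = urllib.request.Request(
        WDQS_ENDPOINT + "?" + params,
        headers={"User-Agent": "emew-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return extract_ids(json.load(response), var)
```

Splitting the parsing from the fetching means the map code only ever sees a plain list of ids, however the query reached WDQS.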


Now of course that *can* be done with full WDQS query URLs. But they are horribly long and awkward.

What would be much nicer would be if, each time WDQS ran a query it hadn't seen before, it generated a new short identifier, which the UI would display and which could then be used to refer to the query.

So on the EMEW site, one would just put in the short identifier. The site would send it to WDQS with some appropriate URL-start to show it was a request for data rather than for a link, and back would come the results -- and all the user would have had to copy in was a short identifier.
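Purely to illustrate the idea, here is a toy in-memory version in Python of "new query in, short identifier out". Nothing like this exists in WDQS as far as I know; the class name, the method names and the "q1", "q2", ... numbering are all invented for the sketch, and a real service would need persistent storage.

```python
class QueryRegistry:
    """Toy in-memory store that gives each distinct query one short identifier."""

    def __init__(self) -> None:
        self._id_by_query: dict[str, str] = {}
        self._query_by_id: dict[str, str] = {}

    def register(self, sparql: str) -> str:
        """Return the identifier for `sparql`, issuing a new one if unseen."""
        # Only surrounding whitespace is ignored; otherwise the text must match exactly.
        key = sparql.strip()
        if key not in self._id_by_query:
            new_id = f"q{len(self._id_by_query) + 1}"
            self._id_by_query[key] = new_id
            self._query_by_id[new_id] = key
        return self._id_by_query[key]

    def lookup(self, query_id: str) -> str:
        """Return the query text for a previously issued identifier."""
        return self._query_by_id[query_id]
```

Running the same query twice gives back the same identifier, so the identifier names the query rather than any one run of it.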


Of course to some extent the URL-shortener does this, but there are some issues:

1) The maintainers point-blank refuse to let it allow URLs of more than 2000 characters. ( https://phabricator.wikimedia.org/T220703 ) Gnarly WDQS queries can often be longer than this, sometimes a lot longer.

2) The short URL could be for anything on any wiki site -- the EMEW site can't be sure that it corresponds to a SPARQL query.

3) The short URL needs to be adjusted, to turn it from a WDQS URL that's a link to the query in the GUI into a WDQS URL that's an external request for query results. This is not straightforward.
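For what it's worth, the adjustment in (3), together with a crude check for (2), might look something like the Python sketch below. It assumes the short URL has already been expanded to the full GUI link, and that the GUI link carries the percent-encoded query in the URL fragment (the part after the "#"). The function name is made up, and the sketch does not check that the fragment is valid SPARQL.

```python
from urllib.parse import unquote, urlencode, urlsplit

WDQS_HOST = "query.wikidata.org"
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def gui_url_to_endpoint_url(gui_url: str) -> str:
    """Turn a WDQS GUI link into a URL that requests the results as JSON."""
    parts = urlsplit(gui_url)
    # Issue (2): reject anything that is not a link to the query service.
    if parts.hostname != WDQS_HOST:
        raise ValueError(f"not a WDQS link: {gui_url!r}")
    # The GUI keeps the percent-encoded query text in the fragment.
    sparql = unquote(parts.fragment)
    if not sparql.strip():
        raise ValueError("no query found in the URL fragment")
    # Issue (3): re-encode the query as a parameter to the SPARQL endpoint.
    return WDQS_ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})
```

Every consuming site would have to carry a copy of this logic, which is part of why a proper identifier seems cleaner.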


A short identifier for a WDQS query would get round all these things.

It also might be one step forwards towards creating a place like Quarry (https://quarry.wmflabs.org/) where users could save their queries, share them, document them, see other people's shared queries, and come back to them later. But that's another ticket (https://phabricator.wikimedia.org/T104762 open since July 2015).

All I am suggesting, first, is an identifier.


One objection I can think of is that if identifiers were assigned automatically, without anyone having to request them, then people might be able to "spy" on what queries other people happened to be writing at any one time. I don't know how serious an objection this is - it doesn't seem to be a problem for Quarry - but it could largely be avoided if the query-number was hashed to make the sequence less predictable.

(Or alternatively, query-numbers could just be issued on request.)
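As a sketch of the hashing idea, assuming the service keeps a sequential query-number internally and holds a secret key, a keyed hash (HMAC) of the number gives an identifier that outsiders can't step through in order. The function name, the key and the 10-character length below are illustrative choices of mine, not anything WDQS does.

```python
import base64
import hashlib
import hmac


def opaque_id(query_number: int, secret: bytes, length: int = 10) -> str:
    """Derive a short, hard-to-guess identifier from a sequential query number."""
    message = str(query_number).encode("ascii")
    # Keyed hash: without `secret`, the id for number N+1 cannot be predicted.
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    # Base32 keeps the id to lowercase letters and the digits 2-7.
    return base64.b32encode(digest).decode("ascii").lower()[:length]
```

Because the hash is truncated, two numbers could in principle end up with the same identifier, so the service would still need a lookup table from identifier to query and would have to re-issue on a clash.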


Anyway, just putting this out here, for thoughts.

Best wishes to everybody,

   James.






_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
