Lucas_Werkmeister_WMDE added a comment.

@Krinkle thanks for your comments!

And this uses the same SPARQL endpoint at https://query.wikidata.org/ as for public queries?

Yup.

but I didn't know it was used e.g. when saving edits (assuming validation happens there).

No, constraint checks are currently a fairly “external” feature – they’re done when users visit an item, or after they’ve saved a statement, and only for users who have the checkConstraints gadget enabled. There are plans to enable the gadget by default (T173695), and to check constraints before saving, too (T168626), but we don’t plan to make that mandatory.

ConstraintCheck/SparqlHelper in Wikidata currently has a timeout of 5 seconds, presumably in order to support some of the more complex regular expressions and larger input texts.

That’s not really the main reason – we also use the query service for type checking: is some item an instance of some class or a subclass of it? We try to do this check in PHP at first, but if that takes too long (e.g. because the type hierarchy is too deep, too branched, or even circular), we bail out and ask the query service instead. For just the regexes, I’d be fine with a much shorter timeout.
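
As a rough illustration of that bail-out (in Python rather than the extension's PHP, and not the actual WikibaseQualityConstraints code), a bounded walk up the subclass hierarchy might look like the sketch below. The `SUBCLASS_OF` map, the `MAX_VISITED` limit and the `is_subclass_of` helper are all hypothetical; returning `None` stands for "give up and ask the query service instead".

```python
from collections import deque

# Hypothetical toy hierarchy: entity ID -> direct superclasses.
SUBCLASS_OF = {
    "Q1": ["Q2"],
    "Q2": ["Q3", "Q4"],
    "Q3": ["Q1"],  # circular on purpose
    "Q4": [],
}

MAX_VISITED = 1000  # assumed limit, not a real configuration value


def is_subclass_of(start, target, graph=SUBCLASS_OF, max_visited=MAX_VISITED):
    """Return True or False, or None if too many entities were visited."""
    seen = {start}  # guards against circular hierarchies
    queue = deque([start])
    while queue:
        if len(seen) > max_visited:
            return None  # bail out; the caller would fall back to the query service
        current = queue.popleft()
        if current == target:
            return True
        for parent in graph.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return False
```

The `seen` set keeps a circular hierarchy from looping forever, and the size limit covers the "too deep or too branched" case.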

Response times look fairly good in Graphite (source).

Wow, I didn’t know about that thing. Nice! (Looks like I can configure the graph by tweaking the URL even though visiting https://graphite.wikimedia.org/ gets me an authentication error :) )

The minutely p99 over the last 5 days ranges from 100-500ms… This is okay.

By the way, what’s the round-trip time for memcached assuming a cache hit? If it’s, dunno, 100 ms, then the patch is probably not worth the trouble :)

(And @Jonas, perhaps we can switch the average times in the Grafana board from mean to median? Those graphite stats look a lot less serious than the Grafana board.)

I wonder if something like a "simple" PHP or Python subprocess would work (something that runs preg_match or re.match, using tight firejail with cgroup restrictions, like other MediaWiki subprocesses).

Hehe, I’ve actually written a service like that :) we have a test system for constraints on Cloud VPS (wikidata-constraints.wmflabs.org), and it doesn’t have enough RAM to run Blazegraph, so I wrote minisparql, which just listens for regex SPARQL queries and checks them with preg_match. Of course, if we were to do something like this on Wikidata itself, we could skip the SPARQL wrapping. But I have no idea how much trouble it is to deploy a new service like that…
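
To make the idea concrete, here is a minimal Python sketch of such a shim. It is not the minisparql code: the query shape `SELECT (REGEX("text", "pattern") AS ?matches) {}` is an assumption, the `answer_regex_query` and `unescape` helpers are made up for illustration, and Python's `re` stands in for `preg_match`, so the pattern flavor differs slightly from PCRE.

```python
import re

# Assumed query shape; the queries a real service accepts may differ.
QUERY_RE = re.compile(
    r'^\s*SELECT\s*\(\s*REGEX\s*\(\s*'
    r'"((?:[^"\\]|\\.)*)"\s*,\s*"((?:[^"\\]|\\.)*)"'
    r'\s*\)\s*AS\s*\?matches\s*\)\s*\{\s*\}\s*$',
    re.IGNORECASE | re.DOTALL,
)

# Backslash escapes allowed inside SPARQL string literals.
ESCAPES = {"t": "\t", "n": "\n", "r": "\r", "b": "\b", "f": "\f",
           '"': '"', "'": "'", "\\": "\\"}


def unescape(literal):
    """Undo backslash escapes inside a SPARQL string literal."""
    return re.sub(
        r"\\(.)",
        lambda m: ESCAPES.get(m.group(1), m.group(1)),
        literal,
        flags=re.DOTALL,
    )


def answer_regex_query(query):
    """Return True/False for a recognised regex query, None otherwise."""
    parsed = QUERY_RE.match(query)
    if parsed is None:
        return None  # not a query this sketch understands
    text, pattern = (unescape(group) for group in parsed.groups())
    try:
        # SPARQL REGEX matches anywhere in the text unless the pattern is anchored.
        return re.search(pattern, text) is not None
    except re.error:
        return None  # invalid pattern
```

For example, `answer_regex_query('SELECT (REGEX("Q42", "^Q[1-9][0-9]*$") AS ?matches) {}')` returns `True`, while any other kind of query yields `None`.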

Or perhaps using LuaSandbox?

Yeah, the different pattern flavors are a problem… the situation is already unfortunate because WDQS doesn’t quite implement PCRE either, but at least it’s similar. Lua Patterns seem to be completely different. (We’re stuck with PCRE-like patterns because there’s also an external service, KrBot, which checks constraints daily and saves the results in WD:Database Reports/Constraint Violations, and we need to be (mostly) compatible with it.)


TASK DETAIL
https://phabricator.wikimedia.org/T173696
