https://bugzilla.wikimedia.org/show_bug.cgi?id=41529

--- Comment #2 from jeb...@gmail.com 2012-10-30 11:14:15 UTC ---
I have a more complete description somewhere, but for Wikipedia it can be
implemented as a "quote" tag function that also takes the URL of the referenced
site. In Wikidata it would be part of the reference object.

An easy way to do it is to first assume the quote is correct, but push a job
to the job queue if no result for it already exists in memcached. If a result
exists in memcached, the quote can be marked as valid or invalid right away. The
result will be cached for a day or two in memcached, then a new job will be
generated. When the job runs, it will check the external site.
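
A rough sketch of that flow, in Python rather than MediaWiki's PHP, just to
show the idea. The dict and deque are in-process stand-ins for memcached and
the job queue, and QuoteChecker, status, run_jobs and fetch_and_verify are
hypothetical names, not existing MediaWiki APIs.

```python
import time
from collections import deque

# "A day or two"; two days is an assumed value.
CACHE_TTL_SECONDS = 2 * 24 * 60 * 60


class QuoteChecker:
    def __init__(self, fetch_and_verify, clock=time.time):
        self._cache = {}        # (url, quote) -> (is_valid, expires_at); memcached stand-in
        self._queue = deque()   # job queue stand-in
        self._queued = set()    # keys that already have a pending job
        self._verify = fetch_and_verify  # hypothetical callable(url, quote) -> bool
        self._clock = clock

    def status(self, url, quote):
        """Return the cached verdict, or assume valid and queue a check."""
        key = (url, quote)
        entry = self._cache.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]
        # Missing or expired: queue one job and stay optimistic for now.
        if key not in self._queued:
            self._queued.add(key)
            self._queue.append(key)
        return True

    def run_jobs(self):
        """Run pending jobs; each one checks the external site and caches the result."""
        while self._queue:
            key = self._queue.popleft()
            self._queued.discard(key)
            url, quote = key
            expires_at = self._clock() + CACHE_TTL_SECONDS
            self._cache[key] = (self._verify(url, quote), expires_at)
```

Page rendering would only ever call status(), so it never waits on the
external site; run_jobs() stands in for the job runner.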

There should be a small set of markers that act as wildcards during testing,
mostly just square brackets (could need localization) that can contain
anything. During matching they will be replaced by a non-greedy dot-star (.*?).
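
Something like the following could do the conversion (Python sketch). It
assumes that any bracketed span in the quote, whatever it contains, is a
wildcard, and that everything else must match literally; quote_to_regex and
quote_matches are made-up names, and localized markers are not handled.

```python
import re

# Any [...] span in the quote, regardless of its content, is a wildcard.
_WILDCARD = re.compile(r"\[[^\]]*\]")


def quote_to_regex(quote):
    """Build a regex where bracketed spans become a non-greedy dot-star."""
    literal_parts = _WILDCARD.split(quote)
    # Escape the literal parts so characters like "." or "?" match themselves.
    pattern = ".*?".join(re.escape(part) for part in literal_parts)
    # DOTALL lets a wildcard span line breaks in the page text.
    return re.compile(pattern, re.DOTALL)


def quote_matches(quote, page_text):
    """True if the quote, with wildcards expanded, occurs in the page text."""
    return quote_to_regex(quote).search(page_text) is not None
```

For example, quote_matches("The [...] fox jumps", "The quick brown fox jumps
over the lazy dog") is true, while a quote that does not occur in the page
text gives false.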

Also, the requested page will need some cleanup, but it seems that fairly
simple regex-based scrubbing will be sufficient. Getting the raw text from a
page (screen scraping) isn't that uncommon for bots, and it is fairly simple.
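
As an illustration of how little scrubbing might be needed, here is a Python
sketch. The exact set of rules is my assumption (drop script/style blocks and
comments, strip tags, decode entities, collapse whitespace), and scrub_html is
a hypothetical name. A real page may need more than this.

```python
import html
import re

# Script and style blocks hold no quotable text, so drop them with their content.
_SCRIPT_STYLE = re.compile(
    r"<(script|style)\b[^>]*>.*?</\1\s*>", re.IGNORECASE | re.DOTALL
)
_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
_TAG = re.compile(r"<[^>]+>")
_WHITESPACE = re.compile(r"\s+")


def scrub_html(raw):
    """Reduce an HTML page to plain text with single spaces between words."""
    text = _SCRIPT_STYLE.sub(" ", raw)
    text = _COMMENT.sub(" ", text)
    # Tags become spaces, so a tag in the middle of a word will split it.
    text = _TAG.sub(" ", text)
    text = html.unescape(text)  # e.g. "&amp;" becomes "&"
    return _WHITESPACE.sub(" ", text).strip()
```

The quote would presumably need the same whitespace normalization before it
is matched against the scrubbed text.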
