Thanks Amir ..
https://github.com/wikimedia/parsoid/blob/master/tools/fetch_ve_nowiki_edits.js is a quick hack of a script that I pulled together back in Oct 2015 and used for a while to monitor the counts (and the actual incidents) of nowikis. The script could use a refresh and an update; it could even run some greps against the page to classify the nowikis into different types on different wikis.
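As a rough illustration of the "greps to classify" idea, here is a minimal sketch in Python (not the JavaScript of the script above). The category names, the regular expressions, and the function names are all hypothetical; they are loosely modelled on two of the causes listed in the quoted message below (letters right after an internal link, and two apostrophes typed in place of a double quote) and would certainly need tuning per wiki.

```python
import re
from collections import Counter

# Hypothetical categories and patterns, for illustration only.
# They are checked in order; the first match wins.
PATTERNS = [
    # "]]<nowiki/>" followed directly by a letter: text glued to an internal link.
    ("link_trail", re.compile(r"\]\]<nowiki\s*/>\w")),
    # Two apostrophes inside <nowiki>...</nowiki>, often typed instead of a double quote.
    ("double_apostrophe", re.compile(r"<nowiki>[^<]*''[^<]*</nowiki>")),
]


def classify_nowiki(wikitext: str) -> str:
    """Return a coarse category for the nowiki usage found in the given wikitext."""
    if "<nowiki" not in wikitext:
        return "none"
    for name, pattern in PATTERNS:
        if pattern.search(wikitext):
            return name
    return "other"


def count_categories(texts):
    """Tally the categories over an iterable of wikitext strings."""
    return Counter(classify_nowiki(text) for text in texts)


if __name__ == "__main__":
    samples = [
        "[[cat]]<nowiki/>s are nice",
        "he said <nowiki>''</nowiki>hello<nowiki>''</nowiki>",
        "<nowiki>{{foo}}</nowiki>",
        "plain text",
    ]
    for category, count in sorted(count_categories(samples).items()):
        print(category, count)
```

Anything that lands in "other" would still need a manual look at the diff; the point is only to pre-sort the obvious cases.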
Subbu.

On 05/18/2016 06:25 AM, Amir E. Aharoni wrote:
Hi,

There's a thing I've been doing for exactly one year now, and some people on this list may find it interesting: I've been counting how many article space edits in the Hebrew Wikipedia added a <nowiki> tag.

These tags are very rarely needed in articles, but they are often added in edits that go through Parsoid (VisualEditor and ContentTranslation). Experienced editors complained that the tags are added too frequently and that they have to fix them manually, so I started meticulously counting _how_ frequently, and also _why_ they are added, so I'd be able to report Parsoid / VE / ContentTranslation bugs with the hope of reducing it.

I did the counting by checking Recent Changes every day for edits tagged "nowiki" (the tag is added by a locally-defined AbuseFilter if a main space edit has a <nowiki> tag in the new text), and checking every diff.

The full analyzed and sorted results are at https://he.wikipedia.org/wiki/WP:VE/nowiki . I did my best to translate the most essential parts to English, but please ask me if you have any more questions.

A summary of findings:

* There are on average about 3000 article-space edits in the Hebrew Wikipedia per day.[1]
* There are on average about 450 edits with the VisualEditor tag in the Hebrew Wikipedia per day.[2]
* There are rarely more than 20 edits per day that have <nowiki>, and usually far fewer than that.
* The most common reason for the appearance of <nowiki> is writing two apostrophes ('') instead of a double quotation mark (").[3] It's remarkable how many people make this mistake, although it's possible that it's more common in the Hebrew language because of the peculiar ways in which quote characters are used in it and how they appear on common keyboards.
* The other most common reason is what I call "bad links" and "wrong links". Both involve letters added after internal links, with a <nowiki/> added immediately after the closing ']]'; for an explanation of the difference between "bad" and "wrong", see the linked page. Counted together, these two categories of errors are the most common cause for the appearance of <nowiki>.
* After the above reasons, the most common are vandalism (which I don't consider an issue in VisualEditor or Parsoid) and mistakes in the wiki syntax of template parameters.

As a result of this work I reported many Parsoid and VisualEditor bugs, and their excellent developers fixed a bunch: wiki syntax pasted in VisualEditor is now correctly auto-converted in a DWIM way; empty runs of <nowiki>'''</nowiki> are no longer created if somebody makes text bold but doesn't write anything; _some_ bugs related to ISBN and external link handling were fixed (though a few remain); and more.

Something similar was also being done in the French Wikipedia[4] for some time, but it has not been updated since August 2015 :(

I wish I could do it for other languages, but there's no chance that I'll find time for that. However, if anybody volunteers to do it for the Wikipedia in their language, I'll be very happy to help you get started. I'd be super-interested to know how it is in English, Spanish, Dutch, Polish, Czech, Russian, Hungarian, and any other language. It takes no more than 5 minutes per day with the volume of edits in Hebrew, but the time for other languages will probably be different.

P.S. I'm stupid, please correct my queries if they are wrong.
[1] select substring(rev_timestamp, 1, 8) rev_date, count(rev_id)
    from revision, page
    where page_id = rev_page
      and page_namespace = 0
      and rev_timestamp > 20160100000000
    group by rev_date
    order by rev_date;

[2] select substring(rev_timestamp, 1, 8) rev_date, count(rev_id)
    from revision, page, change_tag
    where page_id = rev_page
      and page_namespace = 0
      and rev_timestamp > 20160100000000
      and ct_tag = "visualeditor"
      and ct_rev_id = rev_id
    group by rev_date
    order by rev_date;

[3] https://phabricator.wikimedia.org/T106641

[4] https://fr.wikipedia.org/w/index.php?title=Wikip%C3%A9dia:%C3%89diteurVisuel/Avis/Nowiki&action=history

--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
“We're living in pieces, I want to live in peace.” – T. Moore
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l