Hi Everyone,

I am a GSoC student working with the Parsoid team [1]
<https://www.mediawiki.org/wiki/Parsoid> to build a Parsoid-based linter,
Linttrap [2]
<https://www.mediawiki.org/wiki/Parsoid/Linting/GSoC_2014_Application>.
Linttrap detects broken wikitext on wiki pages and also collects
statistics about certain wikitext usage patterns.

For this demo, Linttrap can detect four types of broken wikitext. Other
kinds of issues could be detected in the coming weeks and months:
 * Fostered wikitext, e.g. [3]
   <http://lintbridge.wmflabs.org/_html/issues/53a2fe1e94641d9101f8e8b2>
 * Missing end tag, e.g. [4]
   <http://lintbridge.wmflabs.org/_html/issues/53a2fca994641d9101f8e72a>
 * Missing start tag, e.g. [5]
   <http://lintbridge.wmflabs.org/_html/issues/53a2fdc394641d9101f8e850>
 * Stripped tags, e.g. [6]
   <http://lintbridge.wmflabs.org/_html/issues/53a2fd4394641d9101f8e819>

Linttrap also collects information about transclusions where multiple
templates are used together to construct a single DOM structure [7]
<http://lintbridge.wmflabs.org/_html/issues/53a3026794641d9101f8ea81>.
Here is our stats page [8] <http://lintbridge.wmflabs.org/stats>.

Once a page is parsed, Linttrap uses Parsoid's logging facility to send the
issues it found to a web service we call Lintbridge [9]
<http://lintbridge.wmflabs.org/>. Lintbridge is currently hosted on
Wikimedia Labs and uses MongoDB to store all the issues. It offers a REST
API that bots and other applications can use to fix the broken wikitext.
Linttrap itself uses this REST API to store issues in Lintbridge.
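
To make the Linttrap-to-Lintbridge step more concrete, here is a rough
Python sketch of what building such a request could look like. It is not
Linttrap's actual code, and the JSON field names (wiki, page, revision,
type, wikitext) are my assumptions for illustration only; the real payload
that POST /_api/add expects may differ. The sketch only builds the request
and does not send it.

```python
import json
from urllib.request import Request

# Route taken from this email; the host may not stay available.
ADD_URL = "http://lintbridge.wmflabs.org/_api/add"


def build_add_request(wiki, page, revision, issue_type, snippet, url=ADD_URL):
    """Build (but do not send) a POST request describing one lint issue.

    All payload field names below are hypothetical; check the real
    Lintbridge API before relying on them.
    """
    payload = {
        "wiki": wiki,
        "page": page,
        "revision": revision,
        "type": issue_type,
        "wikitext": snippet,
    }
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen() would send it.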

We have also built an HTML app on top of Lintbridge [10]
<http://lintbridge.wmflabs.org/_html/issues>. For now it is a basic app
meant to demonstrate Linttrap's abilities, but it is already quite useful
as it is. Feel free to browse the issues.

  * You can use the links in the table to filter the issues.
  * Click on an issue type to filter issues by that type.
  * You can filter issues by page too.
  * You can use Fix All to fix all issues on a page.
  * You can also use the filters on the top bar to filter by wiki and type.
  * Each issue shows information about the wiki, page, and revision on the
    left and the wikitext snippet on the right.

As a demo of this working prototype, we have collected issues by parsing
1000 pages picked from http://parsoid-tests.wikimedia.org/topfails
<http://parsoid-tests.wikimedia.org/topfails/0>. If you want to try the
JSON API, you can use the following routes:

GET /_api/issues : Show all issues
    (http://lintbridge.wmflabs.org/_api/issues)
GET /_api/issues/type/issue-type : Filter by issue type
    (http://lintbridge.wmflabs.org/_api/issues/type/fostered)
GET /_api/enwiki/issues : Filter by wiki, here enwiki
    (http://lintbridge.wmflabs.org/_api/enwiki/issues)

POST /_api/add : Add an issue to Lintbridge
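
For anyone who wants to call the GET routes above from a script, here is a
small Python sketch. The function names are made up for this example, and
I am not assuming anything about the layout of the JSON that comes back;
the sketch simply decodes whatever the service returns. The wiki and type
filters are kept separate because the routes listed above do not combine
them.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Host taken from this email; it may not stay available.
BASE_URL = "http://lintbridge.wmflabs.org"


def issues_url(wiki=None, issue_type=None, base=BASE_URL):
    """Build the URL for one of the three GET routes listed above."""
    if wiki and issue_type:
        raise ValueError("the listed routes filter by wiki or by type, not both")
    if wiki:
        return f"{base}/_api/{quote(wiki, safe='')}/issues"
    if issue_type:
        return f"{base}/_api/issues/type/{quote(issue_type, safe='')}"
    return f"{base}/_api/issues"


def fetch_issues(wiki=None, issue_type=None, timeout=10):
    """Fetch a route and decode its JSON body (structure not assumed)."""
    with urlopen(issues_url(wiki, issue_type), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, fetch_issues(issue_type="fostered") requests
/_api/issues/type/fostered, and fetch_issues(wiki="enwiki") requests
/_api/enwiki/issues.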
Feedback is welcome.

Thank you
Hardik Juneja

--
[1] Parsoid: https://www.mediawiki.org/wiki/Parsoid
[2] Linttrap:
https://www.mediawiki.org/wiki/Parsoid/Linting/GSoC_2014_Application
[3] Fostered wikitext example:
http://lintbridge.wmflabs.org/_html/issues/53a2fe1e94641d9101f8e8b2
[4] Missing end tag example:
http://lintbridge.wmflabs.org/_html/issues/53a2fca994641d9101f8e72a
[5] Missing start tag example:
http://lintbridge.wmflabs.org/_html/issues/53a2fdc394641d9101f8e850
[6] Stripped tag example:
http://lintbridge.wmflabs.org/_html/issues/53a2fd4394641d9101f8e819
[7] Mixed template example:
http://lintbridge.wmflabs.org/_html/issues/53a3026794641d9101f8ea81
[8] Stats Page : http://lintbridge.wmflabs.org/stats
[9] Lintbridge: http://lintbridge.wmflabs.org/
[10] HTML App: http://lintbridge.wmflabs.org/_html/issues
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
