On 07/23/2013 06:13 PM, John Vandenberg wrote:
> On Wed, Jul 24, 2013 at 9:02 AM, Subramanya Sastry
> <ssas...@wikimedia.org> wrote:
>> On 07/23/2013 05:28 PM, John Vandenberg wrote:
>>> On Wed, Jul 24, 2013 at 2:06 AM, Subramanya Sastry
>>> <ssas...@wikimedia.org> wrote:
>>>> Hi John and Risker,
>>>>
>>>> First off, I want to once again clarify that my intention in the
>>>> previous post was not to claim that VE/Parsoid is perfect. It was
>>>> that we've fixed enough bugs at this point that the most significant
>>>> remaining "bugs" (bugs, not missing features) that need fixing (and
>>>> are being fixed) are usability tweaks.
>>> How do you know that?  Have you performed automated tests on all
>>> Wikipedia content?  Or are you waiting for users to find these bugs?

>> http://parsoid.wmflabs.org:8001/stats
>>
>> This is the URL for our round-trip testing on 160K pages (20K each
>> from 8 wikipedias).
> Fantastic!  How frequently are those tests re-run?  Could you add a
> last-run-date on that page?

The tests are re-run after any batch of commits that we think should be regression-tested -- usually one or more times a day (when a lot of patches are being merged), or every few days (during periods of low activity). The last code update was Thursday.

http://parsoid.wmflabs.org:8001/commits gives you the list of commits (and the date when the code was updated).
http://parsoid.wmflabs.org:8001/topfails gives you individual test results for every tested page, for more detail.

Currently we are updating our round-trip testing infrastructure to gather performance numbers as well (this has been on the cards for a long time but never got the attention it needed). Marco is working on that part of our codebase as we speak; see https://bugzilla.wikimedia.org/show_bug.cgi?id=46659 and other related bugs.
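
To give a rough idea of the kind of measurement involved -- this is only a sketch of the general approach, not the actual instrumentation being added -- per-call timings could be collected like this (fn stands in for any parse or serialize entry point; the names are hypothetical):

    import time
    from typing import Any, Callable, Tuple

    def timed_call(fn: Callable[..., Any], *args: Any) -> Tuple[Any, float]:
        """Run one parse/serialize call and return (result, seconds)."""
        start = time.perf_counter()
        result = fn(*args)
        return result, time.perf_counter() - start

    # e.g. html, seconds = timed_call(wt2html, wikitext)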

We do not deploy to production before we have run round-trip tests on a subset of pages. Given how the tests are run, about 1000 pages is usually sufficient to detect serious regressions; sometimes we run on a larger subset.
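
For anyone unfamiliar with what a round-trip test checks, here is a minimal sketch of the idea (the wt2html/html2wt callables stand in for Parsoid's real entry points, which these names do not match; a clean round trip produces no diff):

    import difflib
    from typing import Callable, List

    def roundtrip_diff(wikitext: str,
                       wt2html: Callable[[str], str],
                       html2wt: Callable[[str], str]) -> List[str]:
        """Convert wikitext to HTML and back; return unified-diff lines.
        An empty list means the page round-trips cleanly."""
        html = wt2html(wikitext)
        wikitext2 = html2wt(html)
        return list(difflib.unified_diff(
            wikitext.splitlines(), wikitext2.splitlines(),
            "original", "round-tripped", lineterm=""))

A regression run then just flags pages whose diff is non-empty, or whose diff changed since the last run.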

> Was a regression testsuite built using the issues encountered during
> the last parser rewrite?

We also continually update a parser tests file (in the code repository) with minimized test cases based on regressions and odd wikitext usage. There are about 1100 tests so far, each run in 4 modes (wt2html, wt2wt, html2wt, html2html), plus 14000 randomly generated edits applied to the tests to mimic real edits and exercise our selective serializer. This is our first guard against bad code.
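
As a rough illustration of what those four modes check -- again a sketch with injected parser/serializer callables rather than the real test harness, and with plain string equality where the actual tests normalize output first:

    from typing import Callable, Dict

    def run_modes(wikitext: str, expected_html: str,
                  wt2html: Callable[[str], str],
                  html2wt: Callable[[str], str]) -> Dict[str, bool]:
        """Run one parser test case in all four modes; True means pass."""
        return {
            "wt2html":   wt2html(wikitext) == expected_html,
            "wt2wt":     html2wt(wt2html(wikitext)) == wikitext,
            "html2wt":   html2wt(expected_html) == wikitext,
            "html2html": wt2html(html2wt(expected_html)) == expected_html,
        }

The randomly generated edits mentioned above would then perturb the wikitext slightly before the round-trip runs, to check that the selective serializer only touches the edited parts of a page.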

Subbu.
