Re: [Wikitech-l] dirty diffs and VE

2013-07-25 Thread C. Scott Ananian
On Thu, Jul 25, 2013 at 2:19 PM, Subramanya Sastry wrote: > And, both Roan and Scott are correct. Pathway 2. would be a test of > external libraries (HTML5 and Domino, not just domino). And, we did have > bugs in the HTML5 parsing library we used (which I fixed based on reports > from Roan) a

Re: [Wikitech-l] dirty diffs and VE

2013-07-25 Thread Subramanya Sastry
On 07/25/2013 01:03 PM, Roan Kattouw wrote: On Wed, Jul 24, 2013 at 2:49 PM, C. Scott Ananian wrote: For what it's worth, both the DOM serialization-to-a-string and DOM parsing-from-a-string are done with the domino package. It has a substantial test suite of its own (originally from http://ww

Re: [Wikitech-l] dirty diffs and VE

2013-07-25 Thread Roan Kattouw
On Wed, Jul 24, 2013 at 2:49 PM, C. Scott Ananian wrote: > For what it's worth, both the DOM serialization-to-a-string and DOM > parsing-from-a-string are done with the domino package. It has a > substantial test suite of its own (originally from > http://www.w3.org/html/wg/wiki/Testing I believe

Re: [Wikitech-l] dirty diffs and VE

2013-07-24 Thread C. Scott Ananian
On Wed, Jul 24, 2013 at 11:20 AM, Subramanya Sastry wrote: > On 07/24/2013 09:58 AM, Roan Kattouw wrote: > >> There are a few things I wish it tested, but they're mostly about how it >> tests things rather than what data is collected. For instance, it would be >> nice if the round-trip tests could

Re: [Wikitech-l] dirty diffs and VE

2013-07-24 Thread Marc Ordinas i Llopis
On Wed, Jul 24, 2013 at 4:58 PM, Roan Kattouw wrote: > > Or just drop by #wikimedia-parsoid, I'm marcoil there. > > > The channel is #mediawiki-parsoid :) Yes, sorry… I hadn't had enough coffee :)

Re: [Wikitech-l] dirty diffs and VE

2013-07-24 Thread Subramanya Sastry
On 07/24/2013 09:58 AM, Roan Kattouw wrote: There are a few things I wish it tested, but they're mostly about how it tests things rather than what data is collected. For instance, it would be nice if the round-trip tests could round-trip from wikitext to HTML *string* and back, rather than to H

Re: [Wikitech-l] dirty diffs and VE

2013-07-24 Thread Roan Kattouw
On Wed, Jul 24, 2013 at 3:10 AM, Marc Ordinas i Llopis wrote: > As Subbu said, I'm currently working on improving the round-trip test > server, mostly on porting it from sqlite to MySQL but also on expanding the > stats kept (with things like performance, etc.). If you think of some other > data w

Re: [Wikitech-l] dirty diffs and VE

2013-07-24 Thread Marc Ordinas i Llopis
On Wed, Jul 24, 2013 at 1:55 AM, John Vandenberg wrote: > Could you provide a dump of the list of 24000 bustable pages? Split > by project? Each community could then investigate those pages for > broken tables, and more critically .. templates which emit broken > wikisyntax that is causing your

Re: [Wikitech-l] dirty diffs and VE

2013-07-23 Thread C. Scott Ananian
On Tue, Jul 23, 2013 at 7:55 PM, John Vandenberg wrote: > On Wed, Jul 24, 2013 at 9:02 AM, Subramanya Sastry > wrote: > > http://parsoid.wmflabs.org:8001/stats > > > > This is the url for our round trip testing on 160K pages (20K each from 8 > > wikipedias). > > Very minor point .. there are ~40

Re: [Wikitech-l] dirty diffs and VE

2013-07-23 Thread Subramanya Sastry
On 07/23/2013 06:55 PM, John Vandenberg wrote: On Wed, Jul 24, 2013 at 9:02 AM, Subramanya Sastry wrote: http://parsoid.wmflabs.org:8001/stats This is the url for our round trip testing on 160K pages (20K each from 8 wikipedias). Very minor point .. there are ~400 missing pages on the list; i

Re: [Wikitech-l] dirty diffs and VE

2013-07-23 Thread John Vandenberg
On Wed, Jul 24, 2013 at 9:02 AM, Subramanya Sastry wrote: > http://parsoid.wmflabs.org:8001/stats > > This is the url for our round trip testing on 160K pages (20K each from 8 > wikipedias). Very minor point .. there are ~400 missing pages on the list; is that intentional? ;-) One is 'Mos:time'

Re: [Wikitech-l] dirty diffs and VE

2013-07-23 Thread Subramanya Sastry
On 07/23/2013 06:02 PM, Subramanya Sastry wrote: On 07/23/2013 05:28 PM, John Vandenberg wrote: VE and Parsoid devs have put in a lot of effort to recognize broken wikitext source, fix it or isolate it, My point was that you don't appear to be doing analysis of how of all Wikipedia con

Re: [Wikitech-l] dirty diffs and VE

2013-07-23 Thread C. Scott Ananian
On Tue, Jul 23, 2013 at 7:24 PM, C. Scott Ananian wrote: > >> Was a regression testsuite built using the issues encountered during >> the last parser rewrite? >> > > Yes, mediawiki/core/tests/parser/parserTests.txt (which predates parsoid) > has been continuously updated throughout the development

Re: [Wikitech-l] dirty diffs and VE

2013-07-23 Thread C. Scott Ananian
On Tue, Jul 23, 2013 at 7:13 PM, John Vandenberg wrote: > > http://parsoid.wmflabs.org:8001/stats > > > > This is the url for our round trip testing on 160K pages (20K each from 8 > > wikipedias). > > Fantastic! How frequently are those tests re-run? Could you add a > last-run-date on that page

Re: [Wikitech-l] dirty diffs and VE

2013-07-23 Thread Subramanya Sastry
On 07/23/2013 06:13 PM, John Vandenberg wrote: On Wed, Jul 24, 2013 at 9:02 AM, Subramanya Sastry wrote: On 07/23/2013 05:28 PM, John Vandenberg wrote: On Wed, Jul 24, 2013 at 2:06 AM, Subramanya Sastry wrote: Hi John and Risker, First off, I do want to once again clarify that my intention

Re: [Wikitech-l] dirty diffs and VE

2013-07-23 Thread John Vandenberg
On Wed, Jul 24, 2013 at 9:02 AM, Subramanya Sastry wrote: > On 07/23/2013 05:28 PM, John Vandenberg wrote: >> >> On Wed, Jul 24, 2013 at 2:06 AM, Subramanya Sastry >> wrote: >>> >>> Hi John and Risker, >>> >>> First off, I do want to once again clarify that my intention in the >>> previous >>> po

Re: [Wikitech-l] dirty diffs and VE

2013-07-23 Thread Subramanya Sastry
On 07/23/2013 05:28 PM, John Vandenberg wrote: On Wed, Jul 24, 2013 at 2:06 AM, Subramanya Sastry wrote: Hi John and Risker, First off, I do want to once again clarify that my intention in the previous post was not to claim that VE/Parsoid is perfect. It was more that we've fixed sufficient b

Re: [Wikitech-l] dirty diffs and VE

2013-07-23 Thread C. Scott Ananian
On Tue, Jul 23, 2013 at 6:28 PM, John Vandenberg wrote: > On Wed, Jul 24, 2013 at 2:06 AM, Subramanya Sastry > wrote: > > Hi John and Risker, > > > > First off, I do want to once again clarify that my intention in the > previous > > post was not to claim that VE/Parsoid is perfect. It was more

[Wikitech-l] dirty diffs and VE

2013-07-23 Thread John Vandenberg
On Wed, Jul 24, 2013 at 2:06 AM, Subramanya Sastry wrote: > Hi John and Risker, > > First off, I do want to once again clarify that my intention in the previous > post was not to claim that VE/Parsoid is perfect. It was more that we've > fixed sufficient bugs at this point that the most significa