Re: [Wikitech-l] Diff algorithms: the shootout

2016-05-06 Thread Isarra Yos

Cool!

It's entirely possible that the shortcuts it takes correspond to what 
makes a more cohesive thing to present to the user - the same shortcuts 
in the diff implementation are what we want in the front-end when 
looking for meaningful changes anyway.


I mean, this is just random speculation, but it would be interesting if 
this is indeed the case.


On 21/04/16 20:53, Max Semenik wrote:

> All right, votes indicate that wikidiff3 is even better in quality, so here
> we go:
> https://gerrit.wikimedia.org/r/#/c/284003/ removes DairikiDiff. After it's
> merged, I plan to refactor this area further and work on improving diff
> quality now that we'll have 2 places to make changes instead of 3.
>
> On Mon, Apr 18, 2016 at 1:56 PM, Antoine Musso wrote:
>
> > On 16/04/2016 04:00, MZMcBride wrote:
> > > Is there a related Phabricator Maniphest task about this? I'm not sure I
> > > understand the motivation for making a switch. I would think that heavy
> > > diffs are a very small portion of traffic.
> >
> > An incentive would be for MediaWiki core to only have a single diff
> > system instead of two.
> >
> > As for the history, wikidiff3 was introduced in August 2008:
> >
> >  https://www.mediawiki.org/wiki/Special:Code/MediaWiki/38653
> >  commit e45cf2b8
> >
> > --
> > Antoine "hashar" Musso

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l








Re: [Wikitech-l] Diff algorithms: the shootout

2016-04-21 Thread Max Semenik
All right, votes indicate that wikidiff3 is even better in quality, so here
we go:
https://gerrit.wikimedia.org/r/#/c/284003/ removes DairikiDiff. After it's
merged, I plan to refactor this area further and work on improving diff
quality now that we'll have 2 places to make changes instead of 3.

On Mon, Apr 18, 2016 at 1:56 PM, Antoine Musso wrote:

> On 16/04/2016 04:00, MZMcBride wrote:
> > Is there a related Phabricator Maniphest task about this? I'm not sure I
> > understand the motivation for making a switch. I would think that heavy
> > diffs are a very small portion of traffic.
>
> An incentive would be for MediaWiki core to only have a single diff
> system instead of two.
>
> As for the history, wikidiff3 was introduced in August 2008:
>
>  https://www.mediawiki.org/wiki/Special:Code/MediaWiki/38653
>  commit e45cf2b8
>
>
> --
> Antoine "hashar" Musso



-- 
Best regards,
Max Semenik ([[User:MaxSem]])

Re: [Wikitech-l] Diff algorithms: the shootout

2016-04-18 Thread Antoine Musso
On 16/04/2016 04:00, MZMcBride wrote:
> Is there a related Phabricator Maniphest task about this? I'm not sure I
> understand the motivation for making a switch. I would think that heavy
> diffs are a very small portion of traffic.

An incentive would be for MediaWiki core to only have a single diff
system instead of two.

As for the history, wikidiff3 was introduced in August 2008:

 https://www.mediawiki.org/wiki/Special:Code/MediaWiki/38653
 commit e45cf2b8


-- 
Antoine "hashar" Musso



Re: [Wikitech-l] Diff algorithms: the shootout

2016-04-17 Thread Brian Wolff
On Friday, April 15, 2016, MZMcBride wrote:
> Max Semenik wrote:
>>Right now, MediaWiki has 2 pure-PHP engines to produce diffs (there's also
>>a native PHP extension wikidiff2, but we're not discussing it right now):
>>* DairikiDiff is what everybody uses, and
>>* Wikidiff3, an alternative implementation by Guy Van den Broeck that has
>>been around for 8 years but requires a configuration change
>>While less battle-tested, Wikidiff3 offers vastly improved performance on
>>heavy diffs compared to DairikiDiff. The price, however, is that it takes
>>certain shortcuts if the diff is too complex. I ran through 100K diffs
>>from English Wikipedia, and 6% of diffs were different. Lots of changes
>>were seemingly insignificant but I need your help with determining if
>>it's really so.
>
> Is there a related Phabricator Maniphest task about this? I'm not sure I
> understand the motivation for making a switch. I would think that heavy
> diffs are a very small portion of traffic.
>
> MZMcBride
>
>


I think optimizing the worst-case performance makes sense, especially if we
don't really lose anything in doing so.

To clarify, this is just for third parties, right? WMF uses wikidiff2.

--
-bawolff

Re: [Wikitech-l] Diff algorithms: the shootout

2016-04-17 Thread Andre Klapper
On Fri, 2016-04-15 at 21:00 -0500, MZMcBride wrote:
> Max Semenik wrote:
> > 
> > Right now, MediaWiki has 2 pure-PHP engines to produce diffs (there's also
> > a native PHP extension wikidiff2, but we're not discussing it right now):
> > * DairikiDiff is what everybody uses, and
> > * Wikidiff3, an alternative implementation by Guy Van den Broeck that has
> > been around for 8 years but requires a configuration change
> > While less battle-tested, Wikidiff3 offers vastly improved performance on
> > heavy diffs compared to DairikiDiff. The price, however, is that it takes
> > certain shortcuts if the diff is too complex. I ran through 100K diffs
> > from English Wikipedia, and 6% of diffs were different. Lots of changes
> > were seemingly insignificant but I need your help with determining if
> > it's really so.
> Is there a related Phabricator Maniphest task about this? I'm not sure I
> understand the motivation for making a switch. I would think that heavy
> diffs are a very small portion of traffic.

https://phabricator.wikimedia.org/T128896 looks related.

andre
-- 
Andre Klapper | Wikimedia Bugwrangler
http://blogs.gnome.org/aklapper/




Re: [Wikitech-l] Diff algorithms: the shootout

2016-04-15 Thread MZMcBride
Max Semenik wrote:
>Right now, MediaWiki has 2 pure-PHP engines to produce diffs (there's also
>a native PHP extension wikidiff2, but we're not discussing it right now):
>* DairikiDiff is what everybody uses, and
>* Wikidiff3, an alternative implementation by Guy Van den Broeck that has
>been around for 8 years but requires a configuration change
>While less battle-tested, Wikidiff3 offers vastly improved performance on
>heavy diffs compared to DairikiDiff. The price, however, is that it takes
>certain shortcuts if the diff is too complex. I ran through 100K diffs
>from English Wikipedia, and 6% of diffs were different. Lots of changes
>were seemingly insignificant but I need your help with determining if
>it's really so.
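
The comparison Max describes (run both engines over the same revision
pairs and count how often their output differs) can be sketched roughly
as below. This is only an illustration in Python, not MediaWiki code:
exact_engine, shortcut_engine, the max_lines threshold, disagreement_rate
and SAMPLE_PAIRS are all hypothetical stand-ins, and the actual
DairikiDiff and Wikidiff3 heuristics are not reproduced here.

```python
import difflib
from typing import Callable, Iterable, List, Tuple

Lines = List[str]
Engine = Callable[[Lines, Lines], List[str]]


def exact_engine(old: Lines, new: Lines) -> List[str]:
    """Hypothetical thorough engine: always runs a full line-level diff."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    ops: List[str] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Record removed lines, then added lines, for each changed block.
        ops.extend("-" + line for line in old[i1:i2])
        ops.extend("+" + line for line in new[j1:j2])
    return ops


def shortcut_engine(old: Lines, new: Lines, max_lines: int = 4) -> List[str]:
    """Hypothetical fast engine that gives up on "too complex" input.

    Above max_lines (an arbitrary threshold chosen for this sketch) it
    reports the whole text as removed and re-added instead of searching
    for matching lines.
    """
    if len(old) + len(new) <= max_lines:
        return exact_engine(old, new)
    if old == new:
        return []
    return ["-" + line for line in old] + ["+" + line for line in new]


def disagreement_rate(
    pairs: Iterable[Tuple[Lines, Lines]], a: Engine, b: Engine
) -> float:
    """Fraction of (old, new) pairs on which the two engines differ."""
    items = list(pairs)
    if not items:
        return 0.0
    differing = sum(1 for old, new in items if a(old, new) != b(old, new))
    return differing / len(items)


# Made-up sample data, not taken from English Wikipedia.
SAMPLE_PAIRS = [
    (["a", "b"], ["a", "c"]),            # small input: both engines agree
    (["a", "b", "c"], ["a", "b", "c"]),  # large but unchanged: both empty
    (["a", "b", "c"], ["a", "x", "c"]),  # large, one line changed: differ
    (["a", "b", "c"], ["x", "y", "z"]),  # large, fully rewritten: agree
]

if __name__ == "__main__":
    print(disagreement_rate(SAMPLE_PAIRS, exact_engine, shortcut_engine))
```

With these four sample pairs only the third one makes the engines
disagree, so the script prints 0.25. A raw rate like this says nothing
about whether the differing diffs are worse for a human reader, which is
the part that needed manual review.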

Is there a related Phabricator Maniphest task about this? I'm not sure I
understand the motivation for making a switch. I would think that heavy
diffs are a very small portion of traffic.

MZMcBride


