Re: [Wikitech-l] Diff algorithms: the shootout
Cool! It's entirely possible that the shortcuts it takes correspond to what makes a more cohesive diff to present to the user: the same shortcuts in the diff implementation may be what we want in the front-end when looking for meaningful changes anyway. This is just random speculation, but it would be interesting if it is indeed the case.

On 21/04/16 20:53, Max Semenik wrote:
> All right, votes indicate that wikidiff3 is even better in quality, so
> here we go: https://gerrit.wikimedia.org/r/#/c/284003/ removes
> DairikiDiff. After it's merged, I plan to refactor this area further and
> work on improving diff quality now that we'll have 2 places to make
> changes instead of 3.
>
> On Mon, Apr 18, 2016 at 1:56 PM, Antoine Musso wrote:
> > On 16/04/2016 04:00, MZMcBride wrote:
> > > Is there a related Phabricator Maniphest task about this? I'm not sure I
> > > understand the motivation for making a switch. I would think that heavy
> > > diffs are a very small portion of traffic.
> >
> > An incentive would be for MediaWiki core to only have a single diff
> > system instead of two.
> >
> > For the historic part, wikidiff3 got introduced in August 2008:
> >
> > https://www.mediawiki.org/wiki/Special:Code/MediaWiki/38653
> > commit e45cf2b8
> >
> > --
> > Antoine "hashar" Musso

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Diff algorithms: the shootout
All right, votes indicate that wikidiff3 is even better in quality, so here we go: https://gerrit.wikimedia.org/r/#/c/284003/ removes DairikiDiff. After it's merged, I plan to refactor this area further and work on improving diff quality now that we'll have 2 places to make changes instead of 3.

On Mon, Apr 18, 2016 at 1:56 PM, Antoine Musso wrote:
> On 16/04/2016 04:00, MZMcBride wrote:
> > Is there a related Phabricator Maniphest task about this? I'm not sure I
> > understand the motivation for making a switch. I would think that heavy
> > diffs are a very small portion of traffic.
>
> An incentive would be for MediaWiki core to only have a single diff
> system instead of two.
>
> For the historic part, wikidiff3 got introduced in August 2008:
>
> https://www.mediawiki.org/wiki/Special:Code/MediaWiki/38653
> commit e45cf2b8
>
> --
> Antoine "hashar" Musso

--
Best regards,
Max Semenik ([[User:MaxSem]])
Re: [Wikitech-l] Diff algorithms: the shootout
On 16/04/2016 04:00, MZMcBride wrote:
> Is there a related Phabricator Maniphest task about this? I'm not sure I
> understand the motivation for making a switch. I would think that heavy
> diffs are a very small portion of traffic.

An incentive would be for MediaWiki core to only have a single diff system instead of two.

For the historic part, wikidiff3 got introduced in August 2008:

https://www.mediawiki.org/wiki/Special:Code/MediaWiki/38653
commit e45cf2b8

--
Antoine "hashar" Musso
Re: [Wikitech-l] Diff algorithms: the shootout
On Friday, April 15, 2016, MZMcBride wrote:
> Max Semenik wrote:
>>Right now, MediaWiki has 2 pure-PHP engines to produce diffs (there's also
>>a native PHP extension wikidiff2, but we're not discussing it right now):
>>* DairikiDiff is what everybody uses, and
>>* Wikidiff3, an alternative implementation by Guy Van den Broeck that was
>>around for 8 years but required a configuration change
>>While less battle-tested, Wikidiff3 offers vastly improved performance on
>>heavy diffs compared to DairikiDiff. The price, however, is that it makes
>>certain shortcuts if the diff is too complex. I ran through 100K diffs
>>from English Wikipedia, and 6% of diffs were different. Lots of changes
>>were seemingly insignificant but I need your help with determining if
>>it's really so.
>
> Is there a related Phabricator Maniphest task about this? I'm not sure I
> understand the motivation for making a switch. I would think that heavy
> diffs are a very small portion of traffic.
>
> MZMcBride

I think optimizing the worst-case performance makes sense, especially if we don't really lose anything in doing so.

To clarify, this is just for third parties, right? WMF uses wikidiff2.

--
-bawolff
Re: [Wikitech-l] Diff algorithms: the shootout
On Fri, 2016-04-15 at 21:00 -0500, MZMcBride wrote:
> Max Semenik wrote:
> >
> > Right now, MediaWiki has 2 pure-PHP engines to produce diffs (there's also
> > a native PHP extension wikidiff2, but we're not discussing it right now):
> > * DairikiDiff is what everybody uses, and
> > * Wikidiff3, an alternative implementation by Guy Van den Broeck that was
> > around for 8 years but required a configuration change
> > While less battle-tested, Wikidiff3 offers vastly improved performance on
> > heavy diffs compared to DairikiDiff. The price, however, is that it makes
> > certain shortcuts if the diff is too complex. I ran through 100K diffs
> > from English Wikipedia, and 6% of diffs were different. Lots of changes
> > were seemingly insignificant but I need your help with determining if
> > it's really so.
> Is there a related Phabricator Maniphest task about this? I'm not sure I
> understand the motivation for making a switch. I would think that heavy
> diffs are a very small portion of traffic.

https://phabricator.wikimedia.org/T128896 looks related.

andre
--
Andre Klapper | Wikimedia Bugwrangler
http://blogs.gnome.org/aklapper/
Re: [Wikitech-l] Diff algorithms: the shootout
Max Semenik wrote:
>Right now, MediaWiki has 2 pure-PHP engines to produce diffs (there's also
>a native PHP extension wikidiff2, but we're not discussing it right now):
>* DairikiDiff is what everybody uses, and
>* Wikidiff3, an alternative implementation by Guy Van den Broeck that was
>around for 8 years but required a configuration change
>While less battle-tested, Wikidiff3 offers vastly improved performance on
>heavy diffs compared to DairikiDiff. The price, however, is that it makes
>certain shortcuts if the diff is too complex. I ran through 100K diffs
>from English Wikipedia, and 6% of diffs were different. Lots of changes
>were seemingly insignificant but I need your help with determining if
>it's really so.

Is there a related Phabricator Maniphest task about this? I'm not sure I understand the motivation for making a switch. I would think that heavy diffs are a very small portion of traffic.

MZMcBride
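
For readers who want to reproduce the kind of comparison described in the quoted message (run two diff engines over the same revision pairs and count how often their output differs), here is a minimal sketch. It is not MediaWiki code and not the script used for the 100K-diff run: Python's `difflib` stands in for an exact engine, `shortcut_diff` is a hypothetical engine that gives up on "complex" input, and every function name and the `max_lines` threshold are assumptions made purely for illustration.

```python
import difflib


def exact_diff(old_lines, new_lines):
    """Line diff via difflib; returns only the non-equal opcodes."""
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    return [tuple(op) for op in matcher.get_opcodes() if op[0] != "equal"]


def shortcut_diff(old_lines, new_lines, max_lines=3):
    """Hypothetical engine that takes a shortcut on large input.

    Above max_lines it reports one whole-text replacement instead of
    searching for a minimal diff. The threshold is illustrative only.
    """
    if max(len(old_lines), len(new_lines)) <= max_lines:
        return exact_diff(old_lines, new_lines)
    if old_lines == new_lines:
        return []
    return [("replace", 0, len(old_lines), 0, len(new_lines))]


def disagreement_rate(pairs, engine_a, engine_b):
    """Fraction of (old, new) pairs on which the two engines differ."""
    if not pairs:
        return 0.0
    differing = sum(
        1 for old, new in pairs if engine_a(old, new) != engine_b(old, new)
    )
    return differing / len(pairs)


if __name__ == "__main__":
    sample_pairs = [
        (["a"], ["b"]),
        (["a", "b"], ["a", "c"]),
        (["a", "b", "c", "d"], ["a", "b", "x", "d"]),
        (["a", "b", "c", "d"], ["a", "b", "c", "d"]),
    ]
    print(disagreement_rate(sample_pairs, exact_diff, shortcut_diff))
```

A raw disagreement rate only says that the outputs differ, not which one reads better, which is why the differing diffs still need human review as requested above.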