On 11/12/2016 03:58 PM, Jun Wu wrote:
Excerpts from Mads Kiilerich's message of 2016-11-03 22:34:15 +0100:
# HG changeset patch
# User Mads Kiilerich <mad...@unity3d.com>
# Date 1478208837 -3600
#      Thu Nov 03 22:33:57 2016 +0100
# Node ID be83c5f4ec8931cb7e771a80a3ef5e2042a005c1
# Parent  3e0216b2a0995cb21946bc13fb21391013332c57
bdiff: make sure we append to repeated lines instead of inserting into range

This will mitigate the symptoms that tests exposed in the previous changeset.

Arguably, we need similar handling for longer sequences of repeated lines...
I'm concerned about this. This patch looks like a workaround for the
simplest case. While you are aware it could have issues with "longer
sequences of repeated lines". And that's not that uncommon. There are
real-world JSON files like:

   [
     ...
     {
       "a": "b",
     },
     {
       "a": "b",
     },
     {
       "a": "b",
     },
     ...
   ]

Actually, those kinds of JSON files are part of the motivation of mpm's
f1ca249696ed. This series would surely affect them, while I'm not sure the
change is in a good way or not yet.

Note that after f1ca249696ed, there are still unsolved issues with some
large JSON files - we have seen (from user report) diff output being
"wrong", where human could generate much smaller diff easily.

If you are not in a hurry, I'd like to learn this area deeper before
commenting confidently. That may take me 1 to 2 weeks.

The test case I added was made to cover the same problem as your JSON example. I think v2 of my bdiff patch series addresses the concerns you mention in a reasonable way.

Sure, I would appreciate to get more thorough review and feedback and test cases. I page in and out and have no hurry ... except that Greg is making rapid progress in the same code ;-)

/Mads
_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Reply via email to