Re: [PATCH 1 of 2] bdiff: early pruning of common prefix before doing expensive computations

Gregory Szorc Thu, 17 Nov 2016 10:54:06 -0800

On Thu, Nov 17, 2016 at 9:16 AM, Mads Kiilerich <m...@kiilerich.com> wrote:


> # HG changeset patch
> # User Mads Kiilerich <mad...@unity3d.com>
> # Date 1479321935 -3600
> #      Wed Nov 16 19:45:35 2016 +0100
> # Node ID 7f39bccc1c96bffc83f3c6e774da57ecd38982a7
> # Parent  fe6a3576b863955a6c40ca46bd1d6c8e5384dedf
> bdiff: early pruning of common prefix before doing expensive computations
>
> It seems quite common that files don't change completely. New lines are
> often
> pretty much appended, and modifications will often only change a small
> section
> of the file which on average will be in the middle.
>
> There can thus be a big win by pruning a common prefix before starting the
> more
> expensive search for longest common substrings.
>
> Worst case, it will scan through a long sequence of similar bytes without
> encountering a newline. Splitlines will then have to do the same again ...
> twice for each side. If similar lines are found, splitlines will save the
> double iteration and hashing of the lines ... plus there will be less
> lines to
> find common substrings in.
>
> This change might in some cases make the algorith pick shorter or less
> optimal
> common substrings. We can't have the cake and eat it.
>
> This make hg --time bundle --base null -r 4.0 go from 14.5 to 15 s - a 3%
> increase.
>
> On mozilla-unified:
> perfbdiff -m 3041e4d59df2
> ! wall 0.053088 comb 0.060000 user 0.060000 sys 0.000000 (best of 100) to
> ! wall 0.024618 comb 0.020000 user 0.020000 sys 0.000000 (best of 116)
> perfbdiff 0e9928989e9c --alldata --count 10
> ! wall 0.702075 comb 0.700000 user 0.700000 sys 0.000000 (best of 15) to
> ! wall 0.579235 comb 0.580000 user 0.580000 sys 0.000000 (best of 18)
>

\o/

Thank you for implementing this! This is likely the single biggest "easy"
perf win in bdiff to be found.


>
> diff --git a/mercurial/bdiff_module.c b/mercurial/bdiff_module.c
> --- a/mercurial/bdiff_module.c
> +++ b/mercurial/bdiff_module.c
> @@ -61,12 +61,12 @@ nomem:
>
>  static PyObject *bdiff(PyObject *self, PyObject *args)
>

Implementing this in the Python module means that CFFI/PyPy won't be able
to realize the perf wins :(

How do you feel about implementing this in bdiff.c?


>  {
> -       char *sa, *sb, *rb;
> +       char *sa, *sb, *rb, *ia, *ib;
>         PyObject *result = NULL;
>         struct bdiff_line *al, *bl;
>         struct bdiff_hunk l, *h;
>         int an, bn, count;
> -       Py_ssize_t len = 0, la, lb;
> +       Py_ssize_t len = 0, la, lb, li = 0, lcommon = 0, lmax;
>         PyThreadState *_save;
>
>         l.next = NULL;
> @@ -80,8 +80,17 @@ static PyObject *bdiff(PyObject *self, P
>         }
>
>         _save = PyEval_SaveThread();
> -       an = bdiff_splitlines(sa, la, &al);
> -       bn = bdiff_splitlines(sb, lb, &bl);
> +
> +       lmax = la > lb ? lb : la;
> +       for (ia = sa, ib = sb;
> +            li < lmax && *ia == *ib;
> +            ++li, ++ia, ++ib)
> +               if (*ia == '\n')
> +                       lcommon = li + 1;
> +       /* we can almost add: if (li == lmax) lcommon = li; */
>

I can't get this to apply cleanly locally against the committed repo, so I
can't look at the assembly. But, my guess is there are still some
performance optimizations to be found.

If we compare words instead of bytes, this will require fewer loop
iterations and fewer executed instructions.

My work with bdiff_splitlines() tells me that the \n check inside the tight
loop is likely adding a bit of overhead. It might be worth backtracking
after first mismatch to find the last newline. Alternatively, if you
iterate on word boundaries, you can do a bitwise AND against 0x0a0a0a0a...
and compare against 0x0 and store the last known word offset containing a
newline. On 64 bit machines, the cost of that newline check is reduced by
8x.

Of course, this should be deferred to followup work.



> +
> +       an = bdiff_splitlines(sa + lcommon, la - lcommon, &al);
> +       bn = bdiff_splitlines(sb + lcommon, lb - lcommon, &bl);
>         if (!al || !bl)
>                 goto nomem;
>
> @@ -112,8 +121,8 @@ static PyObject *bdiff(PyObject *self, P
>         for (h = l.next; h; h = h->next) {
>                 if (h->a1 != la || h->b1 != lb) {
>                         len = bl[h->b1].l - bl[lb].l;
> -                       putbe32((uint32_t)(al[la].l - al->l), rb);
> -                       putbe32((uint32_t)(al[h->a1].l - al->l), rb + 4);
> +                       putbe32((uint32_t)(al[la].l + lcommon - al->l),
> rb);
> +                       putbe32((uint32_t)(al[h->a1].l + lcommon -
> al->l), rb + 4);
>                         putbe32((uint32_t)len, rb + 8);
>                         memcpy(rb + 12, bl[lb].l, len);
>                         rb += 12 + len;
> diff --git a/tests/test-bdiff.py.out b/tests/test-bdiff.py.out
> --- a/tests/test-bdiff.py.out
> +++ b/tests/test-bdiff.py.out
> @@ -34,10 +34,9 @@ showdiff(
>  showdiff(
>    'a\nb\nb\nb\nc\n.\nd\ne\n.\nf\n',
>    'a\nb\nb\na\nb\nb\nb\nc\n.\nb\nc\n.\nd\ne\nf\n'):
> - 0 0 '' -> 'a\nb\nb\n'
> - 'a\nb\nb\nb\nc\n.\n'
> - 12 12 '' -> 'b\nc\n.\n'
> - 'd\ne\n'
> + 'a\nb\nb\n'
> + 6 6 '' -> 'a\nb\nb\nb\nc\n.\n'
> + 'b\nc\n.\nd\ne\n'
>   16 18 '.\n' -> ''
>   'f\n'
>  done
> diff --git a/tests/test-check-code.t b/tests/test-check-code.t
> --- a/tests/test-check-code.t
> +++ b/tests/test-check-code.t
> @@ -15,6 +15,10 @@ New errors are not allowed. Warnings are
>    Skipping hgext/fsmonitor/pywatchman/msc_stdint.h it has no-che?k-code
> (glob)
>    Skipping hgext/fsmonitor/pywatchman/pybser.py it has no-che?k-code
> (glob)
>    Skipping i18n/polib.py it has no-che?k-code (glob)
> +  mercurial/bdiff_module.c:89:
> +   >   //printf("common prefix: %i next line: %i\n", li, llf);
> +   don't use //-style comments
>    Skipping mercurial/httpclient/__init__.py it has no-che?k-code (glob)
>    Skipping mercurial/httpclient/_readers.py it has no-che?k-code (glob)
>    Skipping mercurial/statprof.py it has no-che?k-code (glob)
> +  [1]
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>

_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Re: [PATCH 1 of 2] bdiff: early pruning of common prefix before doing expensive computations

Reply via email to