Re: [PATCH] spelling: fixes of non-dictionary words

2016-11-16 Thread Mads Kiilerich

On 11/16/2016 04:16 PM, Augie Fackler wrote:

On Nov 16, 2016, at 08:47, Mads Kiilerich  wrote:

On 10/18/2016 02:20 AM, Augie Fackler wrote:

On Mon, Oct 17, 2016 at 11:38:02PM +0200, Mads Kiilerich wrote:

# HG changeset patch
# User Mads Kiilerich 
# Date 1476739015 -7200
#  Mon Oct 17 23:16:55 2016 +0200
# Node ID efd5397e9da5b1d7e2c3353b0b06fc904651b150
# Parent  8a864844d5a0c34bdb24d2e098a0cd339e32e020
spelling: fixes of non-dictionary words

Queued, with some hunks discarded. I've tried to note the interesting ones.

Thanks for pushing.

You omitted prioritises -> prioritizes. I thought we used US english?

We do, but I'm bad at remembering which is which. I'd gladly do that cleanup, 
but only if you'll publish your tool for this (it'd be nice to have this happen 
on an ongoing basis somehow rather than have you drop in with a monster patch 
once a year).


Sure. I will clean it up and post it now when the fixes have landed.

It hasn't changed much since 
https://www.mercurial-scm.org/pipermail/mercurial-devel/2012-August/043846.html 
and I still don't think it is feasible to upstream it ;-)


/Mads
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] spelling: fixes of non-dictionary words

2016-11-16 Thread Mads Kiilerich

On 10/18/2016 02:20 AM, Augie Fackler wrote:

On Mon, Oct 17, 2016 at 11:38:02PM +0200, Mads Kiilerich wrote:

# HG changeset patch
# User Mads Kiilerich 
# Date 1476739015 -7200
#  Mon Oct 17 23:16:55 2016 +0200
# Node ID efd5397e9da5b1d7e2c3353b0b06fc904651b150
# Parent  8a864844d5a0c34bdb24d2e098a0cd339e32e020
spelling: fixes of non-dictionary words

Queued, with some hunks discarded. I've tried to note the interesting ones.


Thanks for pushing.

You omitted prioritises -> prioritizes. I thought we used US english?

/Mads
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 5 of 7 v2] bdiff: give slight preference to longest matches in the middle of the B side

2016-11-15 Thread Mads Kiilerich

On 11/15/2016 09:57 PM, Mads Kiilerich wrote:

# HG changeset patch
# User Mads Kiilerich 
# Date 1478626653 -3600
#  Tue Nov 08 18:37:33 2016 +0100
# Node ID 5ae7e8061a9671e8941a7a17e316254d228acf59
# Parent  5bb26f29b1509520ca3af4c540775cab50b4d6c0
bdiff: give slight preference to longest matches in the middle of the B side


This and the following patches can probably be rearranged and folded to 
give less churn. I would however appreciate to get another round of 
thorough review with this structure - that seems to me to give a more 
natural progression.


The benefit from these changes is mainly "better diffs". Better diffs do 
not necessarily compress better and there might be some small increases 
in actual diff size. I have not noticed any significant performance changes.


Greg, can you verify it doesn't impact your bdiff benchmarks in your 
environment in a bad way?


/Mads
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 5 of 5 rfc] bdiff: make sure we append to repeated lines instead of inserting into range

2016-11-15 Thread Mads Kiilerich

On 11/12/2016 03:58 PM, Jun Wu wrote:

Excerpts from Mads Kiilerich's message of 2016-11-03 22:34:15 +0100:

# HG changeset patch
# User Mads Kiilerich 
# Date 1478208837 -3600
#  Thu Nov 03 22:33:57 2016 +0100
# Node ID be83c5f4ec8931cb7e771a80a3ef5e2042a005c1
# Parent  3e0216b2a0995cb21946bc13fb21391013332c57
bdiff: make sure we append to repeated lines instead of inserting into range

This will mitigate the symptoms that tests exposed in the previous changeset.

Arguably, we need similar handling for longer sequences of repeated lines...

I'm concerned about this. This patch looks like a workaround for the
simplest case. While you are aware it could have issues with "longer
sequences of repeated lines". And that's not that uncommon. There are
real-world JSON files like:

   [
 ...
 {
   "a": "b",
 },
 {
   "a": "b",
 },
 {
   "a": "b",
 },
 ...
   ]

Actually, those kinds of JSON files are part of the motivation of mpm's
f1ca249696ed. This series would surely affect them, while I'm not sure the
change is in a good way or not yet.

Note that after f1ca249696ed, there are still unsolved issues with some
large JSON files - we have seen (from user report) diff output being
"wrong", where human could generate much smaller diff easily.

If you are not in a hurry, I'd like to learn this area deeper before
commenting confidently. That may take me 1 to 2 weeks.


The test case I added was made to cover the same problem as your JSON 
example. I think v2 of my bdiff patch series addresses the concerns you 
mention in a reasonable way.


Sure, I would appreciate to get more thorough review and feedback and 
test cases. I page in and out and have no hurry ... except that Greg is 
making rapid progress in the same code ;-)


/Mads
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 3 of 7 v2] bdiff: adjust criteria for getting optimal longest match in the A side middle

2016-11-15 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1478626653 -3600
#  Tue Nov 08 18:37:33 2016 +0100
# Node ID fbce612c9be26860338f6ba4ab5df819b6d79521
# Parent  0f97aaa1b3f5c25e02296c73e6e93d77a197910c
bdiff: adjust criteria for getting optimal longest match in the A side middle

We prefer matches closer to the middle to balance recursion, as introduced in
f1ca249696ed.

For ranges with uneven length, matches starting exactly in the middle should
have preference. That will be optimal for matches of length 1. We will thus
accept equality in the half check.

For ranges with even length, half was ceil'ed when calculated but we got the
preference for low matches from the 'less than half' check. To get the same
result as before when we also accept equality, floor it. Without that,
test-annotate.t would show some different (still correct but less optimal)
results.

This will change the heuristics. Tests shows a slightly different output - and
sometimes slightly smaller bundles.

The bundle size for 4.0 (hg bundle --base null -r 4.0 x.hg) happens to go from
22804885 to 22803824 bytes - an 0.005% reduction.

diff --git a/mercurial/bdiff.c b/mercurial/bdiff.c
--- a/mercurial/bdiff.c
+++ b/mercurial/bdiff.c
@@ -151,7 +151,7 @@ static int longest_match(struct bdiff_li
if (a2 - a1 > 3)
a1 = a2 - 3;
 
-   half = (a1 + a2) / 2;
+   half = (a1 + a2 - 1) / 2;
 
for (i = a1; i < a2; i++) {
/* skip all lines in b after the current block */
@@ -177,7 +177,7 @@ static int longest_match(struct bdiff_li
 
/* best match so far? we prefer matches closer
   to the middle to balance recursion */
-   if (k > mk || (k == mk && (i <= mi || i < half))) {
+   if (k > mk || (k == mk && (i <= mi || i <= half))) {
mi = i;
mj = j;
mk = k;
diff --git a/tests/test-bdiff.py b/tests/test-bdiff.py
--- a/tests/test-bdiff.py
+++ b/tests/test-bdiff.py
@@ -88,7 +88,7 @@ print("Diff 1 to 3 lines - preference fo
 showdiff('a\n', 'a\n' * 3)
 print("Diff 1 to 5 lines - preference for adding / removing at the end of 
sequences:")
 showdiff('a\n', 'a\n' * 5)
-print("Diff 3 to 1 lines - preference for adding / removing at the end of 
sequences:")
+print("Diff 3 to 1 lines - preference for balanced recursion:")
 showdiff('a\n' * 3, 'a\n')
-print("Diff 5 to 1 lines - this diff seems weird:")
+print("Diff 5 to 1 lines - preference for balanced recursion:")
 showdiff('a\n' * 5, 'a\n')
diff --git a/tests/test-bdiff.py.out b/tests/test-bdiff.py.out
--- a/tests/test-bdiff.py.out
+++ b/tests/test-bdiff.py.out
@@ -67,16 +67,17 @@ showdiff(
   'a\na\na\na\na\n'):
  'a\n'
  2 2 '' -> 'a\na\na\na\n'
-Diff 3 to 1 lines - preference for adding / removing at the end of sequences:
+Diff 3 to 1 lines - preference for balanced recursion:
 showdiff(
   'a\na\na\n',
   'a\n'):
+ 0 2 'a\n' -> ''
  'a\n'
- 2 6 'a\na\n' -> ''
-Diff 5 to 1 lines - this diff seems weird:
+ 4 6 'a\n' -> ''
+Diff 5 to 1 lines - preference for balanced recursion:
 showdiff(
   'a\na\na\na\na\n',
   'a\n'):
- 0 2 'a\n' -> ''
+ 0 4 'a\na\n' -> ''
  'a\n'
- 4 10 'a\na\na\n' -> ''
+ 6 10 'a\na\n' -> ''
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 2 of 7 v2] tests: explore some bdiff cases

2016-11-15 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1478626653 -3600
#  Tue Nov 08 18:37:33 2016 +0100
# Node ID 0f97aaa1b3f5c25e02296c73e6e93d77a197910c
# Parent  b7354208678098580f6e5f975d734c1e8dd9d846
tests: explore some bdiff cases

diff --git a/tests/test-bdiff.py b/tests/test-bdiff.py
--- a/tests/test-bdiff.py
+++ b/tests/test-bdiff.py
@@ -78,3 +78,17 @@ testfixws("", "", 1)
 testfixws("", "", 0)
 
 print("done")
+
+print("Odd diff for a trivial change:")
+showdiff(
+''.join('<%s\n-\n' % i for i in range(5)),
+''.join('>%s\n-\n' % i for i in range(5)))
+
+print("Diff 1 to 3 lines - preference for adding / removing at the end of 
sequences:")
+showdiff('a\n', 'a\n' * 3)
+print("Diff 1 to 5 lines - preference for adding / removing at the end of 
sequences:")
+showdiff('a\n', 'a\n' * 5)
+print("Diff 3 to 1 lines - preference for adding / removing at the end of 
sequences:")
+showdiff('a\n' * 3, 'a\n')
+print("Diff 5 to 1 lines - this diff seems weird:")
+showdiff('a\n' * 5, 'a\n')
diff --git a/tests/test-bdiff.py.out b/tests/test-bdiff.py.out
--- a/tests/test-bdiff.py.out
+++ b/tests/test-bdiff.py.out
@@ -42,3 +42,41 @@ showdiff(
  'f\n'
 done
 done
+Odd diff for a trivial change:
+showdiff(
+  '<0\n-\n<1\n-\n<2\n-\n<3\n-\n<4\n-\n',
+  '>0\n-\n>1\n-\n>2\n-\n>3\n-\n>4\n-\n'):
+ 0 8 '<0\n-\n<1\n' -> '>0\n'
+ '-\n'
+ 10 13 '<2\n' -> '>1\n'
+ '-\n'
+ 15 18 '<3\n' -> '>2\n'
+ '-\n'
+ 20 23 '<4\n' -> '>3\n'
+ '-\n'
+ 25 25 '' -> '>4\n-\n'
+Diff 1 to 3 lines - preference for adding / removing at the end of sequences:
+showdiff(
+  'a\n',
+  'a\na\na\n'):
+ 'a\n'
+ 2 2 '' -> 'a\na\n'
+Diff 1 to 5 lines - preference for adding / removing at the end of sequences:
+showdiff(
+  'a\n',
+  'a\na\na\na\na\n'):
+ 'a\n'
+ 2 2 '' -> 'a\na\na\na\n'
+Diff 3 to 1 lines - preference for adding / removing at the end of sequences:
+showdiff(
+  'a\na\na\n',
+  'a\n'):
+ 'a\n'
+ 2 6 'a\na\n' -> ''
+Diff 5 to 1 lines - this diff seems weird:
+showdiff(
+  'a\na\na\na\na\n',
+  'a\n'):
+ 0 2 'a\n' -> ''
+ 'a\n'
+ 4 10 'a\na\na\n' -> ''
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 6 of 7 v2] bdiff: give slight preference to appending lines

2016-11-15 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1479243409 -3600
#  Tue Nov 15 21:56:49 2016 +0100
# Node ID e022e995415bd48ffe6d4d03dc97b99dd547cde1
# Parent  5ae7e8061a9671e8941a7a17e316254d228acf59
bdiff: give slight preference to appending lines

[This change could be folded into the previous changeset to minimize the repo
churn ...]

The general preference to matches in the middle of bdiff ranges helps getting
balanced recursion and efficient computation. But, as previous changes have
shown, it might also give diffs that seems "obviously wrong".

To mitigate that: If the best match on the A side starts at the beginning of
the bdiff range, don't aim for the middle-most B side match but for the
earliest.

This will make the matches balanced (by both sides being "early") even though
the bisection will be less balanced. Still, this case only apply if the *best*
and middle-most match was fully unbalanced on the A side. Each recursion will
thus even in this worst case reduce the problem significantly and we are not
re-introducing the problem that was fixed in f1ca249696ed.

The bundle size for 4.0 (hg bundle --base null -r 4.0 x.hg) happens to go from
22806817 to 22807275 bytes - a 0.002% increase.

This make the recent test-bdiff.py changes give a more pretty output ... but
they no longer show that the recursion is around middle matches (because it in
these cases isn't).

diff --git a/mercurial/bdiff.c b/mercurial/bdiff.c
--- a/mercurial/bdiff.c
+++ b/mercurial/bdiff.c
@@ -188,7 +188,7 @@ static int longest_match(struct bdiff_li
/* same match but closer to half */
mi = i;
mj = j;
-   } else if (i == mi && mj > bhalf) {
+   } else if (i == mi && (mj > bhalf || i == a1)) {
/* same i but best earlier j */
mj = j;
}
diff --git a/tests/test-annotate.t b/tests/test-annotate.t
--- a/tests/test-annotate.t
+++ b/tests/test-annotate.t
@@ -91,8 +91,8 @@ annotate (JSON)
 annotate -n b
 
   $ hg annotate -n b
+  0: a
   1: a
-  0: a
   1: a
   3: b4
   3: b5
@@ -111,8 +111,8 @@ annotate --no-follow b
 annotate -nl b
 
   $ hg annotate -nl b
-  1:1: a
   0:1: a
+  1:2: a
   1:3: a
   3:4: b4
   3:5: b5
@@ -121,8 +121,8 @@ annotate -nl b
 annotate -nf b
 
   $ hg annotate -nf b
+  0 a: a
   1 a: a
-  0 a: a
   1 a: a
   3 b: b4
   3 b: b5
@@ -131,8 +131,8 @@ annotate -nf b
 annotate -nlf b
 
   $ hg annotate -nlf b
-  1 a:1: a
   0 a:1: a
+  1 a:2: a
   1 a:3: a
   3 b:4: b4
   3 b:5: b5
@@ -156,8 +156,8 @@ annotate -nlf b
 annotate after merge
 
   $ hg annotate -nf b
+  0 a: a
   1 a: a
-  0 a: a
   1 a: a
   3 b: b4
   4 b: c
@@ -166,8 +166,8 @@ annotate after merge
 annotate after merge with -l
 
   $ hg annotate -nlf b
-  1 a:1: a
   0 a:1: a
+  1 a:2: a
   1 a:3: a
   3 b:4: b4
   4 b:5: c
@@ -198,7 +198,7 @@ annotate after merge with -l
 annotate after rename merge
 
   $ hg annotate -nf b
-  1 a: a
+  0 a: a
   6 b: z
   1 a: a
   3 b: b4
@@ -209,7 +209,7 @@ annotate after rename merge
 annotate after rename merge with -l
 
   $ hg annotate -nlf b
-  1 a:1: a
+  0 a:1: a
   6 b:2: z
   1 a:3: a
   3 b:4: b4
@@ -226,7 +226,7 @@ Issue2807: alignment of line numbers wit
   $ echo more >> b
   $ hg ci -mmore -d '7 0'
   $ hg annotate -nlf b
-   1 a: 1: a
+   0 a: 1: a
6 b: 2: z
1 a: 3: a
3 b: 4: b4
@@ -240,15 +240,15 @@ Issue2807: alignment of line numbers wit
 linkrev vs rev
 
   $ hg annotate -r tip -n a
+  0: a
   1: a
-  0: a
   1: a
 
 linkrev vs rev with -l
 
   $ hg annotate -r tip -nl a
-  1:1: a
   0:1: a
+  1:2: a
   1:3: a
 
 Issue589: "undelete" sequence leads to crash
diff --git a/tests/test-bdiff.py b/tests/test-bdiff.py
--- a/tests/test-bdiff.py
+++ b/tests/test-bdiff.py
@@ -84,9 +84,9 @@ showdiff(
 ''.join('<%s\n-\n' % i for i in range(5)),
 ''.join('>%s\n-\n' % i for i in range(5)))
 
-print("Diff 1 to 3 lines - preference for balanced recursion:")
+print("Diff 1 to 3 lines - preference for appending:")
 showdiff('a\n', 'a\n' * 3)
-print("Diff 1 to 5 lines - preference for balanced recursion:")
+print("Diff 1 to 5 lines - preference for appending:")
 showdiff('a\n', 'a\n' * 5)
 print("Diff 3 to 1 lines - preference for balanced recursion:")
 showdiff('a\n' * 3, 'a\n')
diff --git a/tests/test-bdiff.py.out b/tests/test-bdiff.py.out
--- a/tests/test-bdiff.py.out
+++ b/tests/test-bdiff.py.out
@@ -56,20 +56,18 @@ showdiff(
  '-\n'
  20 23 '<4\n' -> '>4\n'
  '-\n'
-Diff 1 to 3 lines - preference for balanced recursion:

[PATCH 7 of 7 v2] bdiff: give slight preference to removing trailing lines

2016-11-15 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1479243409 -3600
#  Tue Nov 15 21:56:49 2016 +0100
# Node ID efa30442953a6e1f096f8833c4ee8047375600d6
# Parent  e022e995415bd48ffe6d4d03dc97b99dd547cde1
bdiff: give slight preference to removing trailing lines

[This change could be folded into the previous changeset to minimize the repo
churn ...]

Similar to the previous change, introduce an exception to the general
preference for matches in the middle of bdiff ranges: If the best match on the
B side starts at the beginning of the bdiff range, don't aim for the
middle-most A side match but for the earliest.

New (later) matches on the A side will only be considered better if the
corresponding match on the B side *not* is at the beginning of the range.
Thus, if the best (middle-most) match on the B side turns out to be at the
beginning of the range, the earliest match on the A side will be used.

The bundle size for 4.0 (hg bundle --base null -r 4.0 x.hg) happens to go from
22807275 to 22808120 bytes - a 0.004% increase.

diff --git a/mercurial/bdiff.c b/mercurial/bdiff.c
--- a/mercurial/bdiff.c
+++ b/mercurial/bdiff.c
@@ -184,7 +184,7 @@ static int longest_match(struct bdiff_li
mj = j;
mk = k;
} else if (k == mk) {
-   if (i > mi && i <= half) {
+   if (i > mi && i <= half && j > b1) {
/* same match but closer to half */
mi = i;
mj = j;
diff --git a/tests/test-bdiff.py b/tests/test-bdiff.py
--- a/tests/test-bdiff.py
+++ b/tests/test-bdiff.py
@@ -88,7 +88,7 @@ print("Diff 1 to 3 lines - preference fo
 showdiff('a\n', 'a\n' * 3)
 print("Diff 1 to 5 lines - preference for appending:")
 showdiff('a\n', 'a\n' * 5)
-print("Diff 3 to 1 lines - preference for balanced recursion:")
+print("Diff 3 to 1 lines - preference for removing trailing lines:")
 showdiff('a\n' * 3, 'a\n')
-print("Diff 5 to 1 lines - preference for balanced recursion:")
+print("Diff 5 to 1 lines - preference for removing trailing lines:")
 showdiff('a\n' * 5, 'a\n')
diff --git a/tests/test-bdiff.py.out b/tests/test-bdiff.py.out
--- a/tests/test-bdiff.py.out
+++ b/tests/test-bdiff.py.out
@@ -68,17 +68,15 @@ showdiff(
   'a\na\na\na\na\n'):
  'a\n'
  2 2 '' -> 'a\na\na\na\n'
-Diff 3 to 1 lines - preference for balanced recursion:
+Diff 3 to 1 lines - preference for removing trailing lines:
 showdiff(
   'a\na\na\n',
   'a\n'):
- 0 2 'a\n' -> ''
  'a\n'
- 4 6 'a\n' -> ''
-Diff 5 to 1 lines - preference for balanced recursion:
+ 2 6 'a\na\n' -> ''
+Diff 5 to 1 lines - preference for removing trailing lines:
 showdiff(
   'a\na\na\na\na\n',
   'a\n'):
- 0 4 'a\na\n' -> ''
  'a\n'
- 6 10 'a\na\n' -> ''
+ 2 10 'a\na\na\na\n' -> ''
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 1 of 7 v2] tests: make test-bdiff.py easier to maintain

2016-11-15 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1479243409 -3600
#  Tue Nov 15 21:56:49 2016 +0100
# Node ID b7354208678098580f6e5f975d734c1e8dd9d846
# Parent  e1d6aa0e4c3aed73e0dc523b8a8fd5f9fe23510a
tests: make test-bdiff.py easier to maintain

Add more stdout logging to help navigate the .out file.

diff --git a/tests/test-bdiff.py b/tests/test-bdiff.py
--- a/tests/test-bdiff.py
+++ b/tests/test-bdiff.py
@@ -11,13 +11,12 @@ def test1(a, b):
 if d:
 c = mpatch.patches(a, [d])
 if c != b:
-print("***", repr(a), repr(b))
-print("bad:")
-print(repr(c)[:200])
-print(repr(d))
+print("bad diff+patch result from\n  %r to\n  %r:" % (a, b))
+print("bdiff: %r" % d)
+print("patched: %r" % c[:200])
 
 def test(a, b):
-print("***", repr(a), repr(b))
+print("test", repr(a), repr(b))
 test1(a, b)
 test1(b, a)
 
@@ -43,13 +42,21 @@ test("a\nb", "a\nb")
 
 #issue1295
 def showdiff(a, b):
+print('showdiff(\n  %r,\n  %r):' % (a, b))
 bin = bdiff.bdiff(a, b)
 pos = 0
+q = 0
 while pos < len(bin):
 p1, p2, l = struct.unpack(">lll", bin[pos:pos + 12])
 pos += 12
-print(p1, p2, repr(bin[pos:pos + l]))
+if p1:
+print('', repr(a[q:p1]))
+print('', p1, p2, repr(a[p1:p2]), '->', repr(bin[pos:pos + l]))
 pos += l
+q = p2
+if q < len(a):
+print('', repr(a[q:]))
+
 showdiff("x\n\nx\n\nx\n\nx\n\nz\n", "x\n\nx\n\ny\n\nx\n\nx\n\nz\n")
 showdiff("x\n\nx\n\nx\n\nx\n\nz\n", "x\n\nx\n\ny\n\nx\n\ny\n\nx\n\nz\n")
 # we should pick up abbbc. rather than bc.de as the longest match
diff --git a/tests/test-bdiff.py.out b/tests/test-bdiff.py.out
--- a/tests/test-bdiff.py.out
+++ b/tests/test-bdiff.py.out
@@ -1,27 +1,44 @@
-*** 'a\nc\n\n\n\n' 'a\nb\n\n\n'
-*** 'a\nb\nc\n' 'a\nc\n'
-*** '' ''
-*** 'a\nb\nc' 'a\nb\nc'
-*** 'a\nb\nc\nd\n' 'a\nd\n'
-*** 'a\nb\nc\nd\n' 'a\nc\ne\n'
-*** 'a\nb\nc\n' 'a\nc\n'
-*** 'a\n' 'c\na\nb\n'
-*** 'a\n' ''
-*** 'a\n' 'b\nc\n'
-*** 'a\n' 'c\na\n'
-*** '' 'adjfkjdjksdhfksj'
-*** '' 'ab'
-*** '' 'abc'
-*** 'a' 'a'
-*** 'ab' 'ab'
-*** 'abc' 'abc'
-*** 'a\n' 'a\n'
-*** 'a\nb' 'a\nb'
-6 6 'y\n\n'
-6 6 'y\n\n'
-9 9 'y\n\n'
-0 0 'a\nb\nb\n'
-12 12 'b\nc\n.\n'
-16 18 ''
+test 'a\nc\n\n\n\n' 'a\nb\n\n\n'
+test 'a\nb\nc\n' 'a\nc\n'
+test '' ''
+test 'a\nb\nc' 'a\nb\nc'
+test 'a\nb\nc\nd\n' 'a\nd\n'
+test 'a\nb\nc\nd\n' 'a\nc\ne\n'
+test 'a\nb\nc\n' 'a\nc\n'
+test 'a\n' 'c\na\nb\n'
+test 'a\n' ''
+test 'a\n' 'b\nc\n'
+test 'a\n' 'c\na\n'
+test '' 'adjfkjdjksdhfksj'
+test '' 'ab'
+test '' 'abc'
+test 'a' 'a'
+test 'ab' 'ab'
+test 'abc' 'abc'
+test 'a\n' 'a\n'
+test 'a\nb' 'a\nb'
+showdiff(
+  'x\n\nx\n\nx\n\nx\n\nz\n',
+  'x\n\nx\n\ny\n\nx\n\nx\n\nz\n'):
+ 'x\n\nx\n\n'
+ 6 6 '' -> 'y\n\n'
+ 'x\n\nx\n\nz\n'
+showdiff(
+  'x\n\nx\n\nx\n\nx\n\nz\n',
+  'x\n\nx\n\ny\n\nx\n\ny\n\nx\n\nz\n'):
+ 'x\n\nx\n\n'
+ 6 6 '' -> 'y\n\n'
+ 'x\n\n'
+ 9 9 '' -> 'y\n\n'
+ 'x\n\nz\n'
+showdiff(
+  'a\nb\nb\nb\nc\n.\nd\ne\n.\nf\n',
+  'a\nb\nb\na\nb\nb\nb\nc\n.\nb\nc\n.\nd\ne\nf\n'):
+ 0 0 '' -> 'a\nb\nb\n'
+ 'a\nb\nb\nb\nc\n.\n'
+ 12 12 '' -> 'b\nc\n.\n'
+ 'd\ne\n'
+ 16 18 '.\n' -> ''
+ 'f\n'
 done
 done
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 5 of 7 v2] bdiff: give slight preference to longest matches in the middle of the B side

2016-11-15 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1478626653 -3600
#  Tue Nov 08 18:37:33 2016 +0100
# Node ID 5ae7e8061a9671e8941a7a17e316254d228acf59
# Parent  5bb26f29b1509520ca3af4c540775cab50b4d6c0
bdiff: give slight preference to longest matches in the middle of the B side

We already have a slight preference for matches close to the middle on the A
side. Now, do the same on the B side.

j is iterating the b range backwards and we thus accept a new j if the previous
match was in the upper half.

This makes the test-bhalf diff "correct". It obviously also gives more
preference to balanced recursion than to appending to sequences. That is kind
of correct, but will also unfortunately make some bundles bigger. No doubt, we
can also create examples where it will make them smaller ...

The bundle size for 4.0 (hg bundle --base null -r 4.0 x.hg) happens to go from
22803824 to 22806817 bytes - an 0.01% increase.

diff --git a/mercurial/bdiff.c b/mercurial/bdiff.c
--- a/mercurial/bdiff.c
+++ b/mercurial/bdiff.c
@@ -143,7 +143,7 @@ static int longest_match(struct bdiff_li
struct pos *pos,
 int a1, int a2, int b1, int b2, int *omi, int *omj)
 {
-   int mi = a1, mj = b1, mk = 0, i, j, k, half;
+   int mi = a1, mj = b1, mk = 0, i, j, k, half, bhalf;
 
/* window our search on large regions to better bound
   worst-case performance. by choosing a window at the end, we
@@ -152,6 +152,7 @@ static int longest_match(struct bdiff_li
a1 = a2 - 3;
 
half = (a1 + a2 - 1) / 2;
+   bhalf = (b1 + b2 - 1) / 2;
 
for (i = a1; i < a2; i++) {
/* skip all lines in b after the current block */
@@ -187,8 +188,8 @@ static int longest_match(struct bdiff_li
/* same match but closer to half */
mi = i;
mj = j;
-   } else if (i == mi) {
-   /* same i but earlier j */
+   } else if (i == mi && mj > bhalf) {
+   /* same i but best earlier j */
mj = j;
}
}
diff --git a/tests/test-annotate.t b/tests/test-annotate.t
--- a/tests/test-annotate.t
+++ b/tests/test-annotate.t
@@ -91,9 +91,9 @@ annotate (JSON)
 annotate -n b
 
   $ hg annotate -n b
+  1: a
   0: a
   1: a
-  1: a
   3: b4
   3: b5
   3: b6
@@ -111,8 +111,8 @@ annotate --no-follow b
 annotate -nl b
 
   $ hg annotate -nl b
+  1:1: a
   0:1: a
-  1:2: a
   1:3: a
   3:4: b4
   3:5: b5
@@ -121,9 +121,9 @@ annotate -nl b
 annotate -nf b
 
   $ hg annotate -nf b
+  1 a: a
   0 a: a
   1 a: a
-  1 a: a
   3 b: b4
   3 b: b5
   3 b: b6
@@ -131,8 +131,8 @@ annotate -nf b
 annotate -nlf b
 
   $ hg annotate -nlf b
+  1 a:1: a
   0 a:1: a
-  1 a:2: a
   1 a:3: a
   3 b:4: b4
   3 b:5: b5
@@ -156,9 +156,9 @@ annotate -nlf b
 annotate after merge
 
   $ hg annotate -nf b
+  1 a: a
   0 a: a
   1 a: a
-  1 a: a
   3 b: b4
   4 b: c
   3 b: b5
@@ -166,8 +166,8 @@ annotate after merge
 annotate after merge with -l
 
   $ hg annotate -nlf b
+  1 a:1: a
   0 a:1: a
-  1 a:2: a
   1 a:3: a
   3 b:4: b4
   4 b:5: c
@@ -198,7 +198,7 @@ annotate after merge with -l
 annotate after rename merge
 
   $ hg annotate -nf b
-  0 a: a
+  1 a: a
   6 b: z
   1 a: a
   3 b: b4
@@ -209,7 +209,7 @@ annotate after rename merge
 annotate after rename merge with -l
 
   $ hg annotate -nlf b
-  0 a:1: a
+  1 a:1: a
   6 b:2: z
   1 a:3: a
   3 b:4: b4
@@ -226,7 +226,7 @@ Issue2807: alignment of line numbers wit
   $ echo more >> b
   $ hg ci -mmore -d '7 0'
   $ hg annotate -nlf b
-   0 a: 1: a
+   1 a: 1: a
6 b: 2: z
1 a: 3: a
3 b: 4: b4
@@ -240,15 +240,15 @@ Issue2807: alignment of line numbers wit
 linkrev vs rev
 
   $ hg annotate -r tip -n a
+  1: a
   0: a
   1: a
-  1: a
 
 linkrev vs rev with -l
 
   $ hg annotate -r tip -nl a
+  1:1: a
   0:1: a
-  1:2: a
   1:3: a
 
 Issue589: "undelete" sequence leads to crash
diff --git a/tests/test-bdiff.py b/tests/test-bdiff.py
--- a/tests/test-bdiff.py
+++ b/tests/test-bdiff.py
@@ -79,14 +79,14 @@ testfixws("", "", 0)
 
 print("done")
 
-print("Odd diff for a trivial change:")
+print("Nice diff for a trivial change:")
 showdiff(
 ''.join('<%s\n-\n' % i for i in range(5)),
 ''.join('>%s\n-\n' % i for i in range(5)))
 
-print("Diff 1 to 3 lines - preference for adding / removing at the end of 
sequences:")
+print("Diff 1 to 3 lines - preference for balanced recursion:")
 showdiff('a\n', 'a\n' * 3)
-print("Diff 1 to 5 lines - preference for adding / removing at the end of 
sequences:")
+p

[PATCH 4 of 7 v2] bdiff: rearrange the "better longest match" code

2016-11-15 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1478626653 -3600
#  Tue Nov 08 18:37:33 2016 +0100
# Node ID 5bb26f29b1509520ca3af4c540775cab50b4d6c0
# Parent  fbce612c9be26860338f6ba4ab5df819b6d79521
bdiff: rearrange the "better longest match" code

This is primarily to make the code more managable and prepare for later
changes.

More specific assignments might also be slightly faster, even thought it also
might generate a bit more code.

diff --git a/mercurial/bdiff.c b/mercurial/bdiff.c
--- a/mercurial/bdiff.c
+++ b/mercurial/bdiff.c
@@ -177,10 +177,20 @@ static int longest_match(struct bdiff_li
 
/* best match so far? we prefer matches closer
   to the middle to balance recursion */
-   if (k > mk || (k == mk && (i <= mi || i <= half))) {
+   if (k > mk) {
+   /* a longer match */
mi = i;
mj = j;
mk = k;
+   } else if (k == mk) {
+   if (i > mi && i <= half) {
+   /* same match but closer to half */
+   mi = i;
+   mj = j;
+   } else if (i == mi) {
+   /* same i but earlier j */
+   mj = j;
+   }
}
}
}
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 5 of 5 rfc] bdiff: make sure we append to repeated lines instead of inserting into range

2016-11-07 Thread Mads Kiilerich

On 11/07/2016 03:40 PM, Pierre-Yves David wrote:



On 11/03/2016 10:34 PM, Mads Kiilerich wrote:

# HG changeset patch
# User Mads Kiilerich 
# Date 1478208837 -3600
#  Thu Nov 03 22:33:57 2016 +0100
# Node ID be83c5f4ec8931cb7e771a80a3ef5e2042a005c1
# Parent  3e0216b2a0995cb21946bc13fb21391013332c57
bdiff: make sure we append to repeated lines instead of inserting 
into range


This will mitigate the symptoms that tests exposed in the previous 
changeset.


small nits Would it be possible to have something patch-5-ish before 
patch 4 then to reduce the line change in patch 5?


No, not the way it is done here. They could be collapsed - I just 
separated them to make it clear why the extra complexity was necessary.


The last step could however perhaps be replaced by the mentioned "shift 
matching sequences" post processing step. That one could come first ... 
but it would arguably also go to far in the direction of making 
aesthetically pleasing diffs in code that "mostly" is used for our 
serialization format of changes. There could perhaps be another post 
processing step that optimize for size.


/Mads

___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 1 of 2] perf: add perfbdiff

2016-11-06 Thread Mads Kiilerich

On 11/06/2016 08:42 AM, Gregory Szorc wrote:

# HG changeset patch
# User Gregory Szorc 
# Date 1478414512 25200
#  Sat Nov 05 23:41:52 2016 -0700
# Node ID e65ef545a3dcb46fe0f1798e76ea17f9f9323452
# Parent  f01367faa792635ad2f7a6b175ae3252292b5121
perf: add perfbdiff

bdiff shows up a lot in profiling. I think it would be useful to have
a perf command that runs bdiff over and over so we can find hot spots.

diff --git a/contrib/perf.py b/contrib/perf.py
--- a/contrib/perf.py
+++ b/contrib/perf.py
@@ -25,6 +25,7 @@ import random
  import sys
  import time
  from mercurial import (
+bdiff,
  changegroup,
  cmdutil,
  commands,
@@ -746,6 +747,30 @@ def perffncacheencode(ui, repo, **opts):
  timer(d)
  fm.end()
  
+@command('perfbdiff', revlogopts + formatteropts, '-c|-m|FILE REV')


I would prefer to keep it simple and consistently use -r for specifying 
revisions.



+def perfbdiff(ui, repo, file_, rev=None, **opts):
+"""benchmark a bdiff between a revision and its delta parent"""
+if opts.get('changelog') or opts.get('manifest'):
+file_, rev = None, file_
+elif rev is None:
+raise error.CommandError('perfbdiff', 'invalid arguments')
+
+r = cmdutil.openrevlog(repo, 'perfbdiff', file_, opts)
+
+node = r.lookup(rev)
+rev = r.rev(node)


This might be a stupid lazy question that essential is a request for 
more clarity in code and docstring:

Why this back and forth between rev and node?
Must rev always be a filelog revision ... or is it a changelog revision 
which then is changed to revlog revision while reusing the variable 
name? Perhaps reduce confusion by using different names.



+dp = r.deltaparent(rev)


Should it also consider aggressivemergedeltas?

Or perhaps more generally: should it be possible to compare any two 
revisions - especially for filelogs, to also cover the use case of diff?



More important: patch 2 LGTM.

/Mads

___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 1 of 5 rfc] tests: explore some bdiff cases

2016-11-06 Thread Mads Kiilerich

On 11/06/2016 10:07 AM, Yuya Nishihara wrote:

On Thu, 03 Nov 2016 22:34:11 +0100, Mads Kiilerich wrote:

# HG changeset patch
# User Mads Kiilerich 
# Date 1478208837 -3600
#  Thu Nov 03 22:33:57 2016 +0100
# Node ID f6408efe0d0f4179fe6cc2b967164c1b4567f3d6
# Parent  d06c049695e6ad3219e7479c65ce98a2f123e878
tests: explore some bdiff cases

diff --git a/tests/test-bhalf.t b/tests/test-bhalf.t
new file mode 100644
--- /dev/null
+++ b/tests/test-bhalf.t

'#require no-pure' is necessary since we use difflib in pure.

The other changes in this series look good to me, but it's bdiff.c so I
don't queue them.


Thanks for reviewing and the positive feedback. I will try to polish it 
for "real" submission.


For the last patch, I wonder if it would be better to add a post 
processing step that - given all the chunks - try to shift/rotate all 
match sequences to be as early as possible (and thus deltas to be as 
late and "appending" as possible). That could give more readable diffs, 
especially when combined with heuristics for preferring chunks starting 
with the lowest amount of indentation.


One lesson from these changes seems to be that it is a problem that we 
use the same low level diff algorithm for revlog delta storage and 
bundles and for readable patch diffs. One idea that got mentioned at the 
latest sprint was to use zstandard for storage and "just" seed it with 
the "a" version of the file as dictionary and let it compress the "b" 
side. That might be a better long term solution.


More short term, I wonder how much we could gain from somehow teaching 
bdiff to consider both parents for each chunk instead of just using 
deltas from one side and store chunks from the other verbatim. I think 
that could make a significant difference for repositories with a lot of 
big merges in files or the manifest.


/Mads

___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 3 of 5 rfc] bdiff: rearrange the better longest match code

2016-11-03 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1478208837 -3600
#  Thu Nov 03 22:33:57 2016 +0100
# Node ID a3e3c7075c3c4b92e6b8c27e28bef7b2c061008d
# Parent  c593308da04e9144da01a08401d886a64985c74b
bdiff: rearrange the better longest match code

Primarily to make the code more managable and prepare for later changes.

More specific assignments might also be slightly faster, even thought it also
might generate a bit more code.

diff --git a/mercurial/bdiff.c b/mercurial/bdiff.c
--- a/mercurial/bdiff.c
+++ b/mercurial/bdiff.c
@@ -172,10 +172,20 @@ static int longest_match(struct bdiff_li
 
/* best match so far? we prefer matches closer
   to the middle to balance recursion */
-   if (k > mk || (k == mk && (i <= mi || i <= half))) {
+   if (k > mk) {
+   /* a longer match is always better */
mi = i;
mj = j;
mk = k;
+   } else if (k == mk) {
+   if (i > mi && i <= half) {
+   /* better i in first lower half */
+   mi = i;
+   mj = j;
+   } else if (i == mi) {
+   /* an earlier j is "better" */
+   mj = j;
+   }
}
}
}
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 5 of 5 rfc] bdiff: make sure we append to repeated lines instead of inserting into range

2016-11-03 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1478208837 -3600
#  Thu Nov 03 22:33:57 2016 +0100
# Node ID be83c5f4ec8931cb7e771a80a3ef5e2042a005c1
# Parent  3e0216b2a0995cb21946bc13fb21391013332c57
bdiff: make sure we append to repeated lines instead of inserting into range

This will mitigate the symptoms that tests exposed in the previous changeset.

Arguably, we need similar handling for longer sequences of repeated lines...

But also, we already have examples of how the heuristics handle other cases in
a similar way.

diff --git a/mercurial/bdiff.c b/mercurial/bdiff.c
--- a/mercurial/bdiff.c
+++ b/mercurial/bdiff.c
@@ -187,7 +187,7 @@ static int longest_match(struct bdiff_li
} else if (i == mi && findbetterb) {
/* better j in first upper half */
mj = j;
-   if (j <= bhalf)
+   if (j <= bhalf && !(j > 0 && k == 1 && 
b[j - 1].e == b[j].e))
findbetterb = 0;
}
}
diff --git a/tests/test-annotate.t b/tests/test-annotate.t
--- a/tests/test-annotate.t
+++ b/tests/test-annotate.t
@@ -91,8 +91,8 @@ annotate (JSON)
 annotate -n b
 
   $ hg annotate -n b
+  0: a
   1: a
-  0: a
   1: a
   3: b4
   3: b5
@@ -111,8 +111,8 @@ annotate --no-follow b
 annotate -nl b
 
   $ hg annotate -nl b
-  1:1: a
   0:1: a
+  1:2: a
   1:3: a
   3:4: b4
   3:5: b5
@@ -121,8 +121,8 @@ annotate -nl b
 annotate -nf b
 
   $ hg annotate -nf b
+  0 a: a
   1 a: a
-  0 a: a
   1 a: a
   3 b: b4
   3 b: b5
@@ -131,8 +131,8 @@ annotate -nf b
 annotate -nlf b
 
   $ hg annotate -nlf b
-  1 a:1: a
   0 a:1: a
+  1 a:2: a
   1 a:3: a
   3 b:4: b4
   3 b:5: b5
@@ -156,8 +156,8 @@ annotate -nlf b
 annotate after merge
 
   $ hg annotate -nf b
+  0 a: a
   1 a: a
-  0 a: a
   1 a: a
   3 b: b4
   4 b: c
@@ -166,8 +166,8 @@ annotate after merge
 annotate after merge with -l
 
   $ hg annotate -nlf b
-  1 a:1: a
   0 a:1: a
+  1 a:2: a
   1 a:3: a
   3 b:4: b4
   4 b:5: c
@@ -198,7 +198,7 @@ annotate after merge with -l
 annotate after rename merge
 
   $ hg annotate -nf b
-  1 a: a
+  0 a: a
   6 b: z
   1 a: a
   3 b: b4
@@ -209,7 +209,7 @@ annotate after rename merge
 annotate after rename merge with -l
 
   $ hg annotate -nlf b
-  1 a:1: a
+  0 a:1: a
   6 b:2: z
   1 a:3: a
   3 b:4: b4
@@ -226,7 +226,7 @@ Issue2807: alignment of line numbers wit
   $ echo more >> b
   $ hg ci -mmore -d '7 0'
   $ hg annotate -nlf b
-   1 a: 1: a
+   0 a: 1: a
6 b: 2: z
1 a: 3: a
3 b: 4: b4
@@ -240,15 +240,15 @@ Issue2807: alignment of line numbers wit
 linkrev vs rev
 
   $ hg annotate -r tip -n a
+  0: a
   1: a
-  0: a
   1: a
 
 linkrev vs rev with -l
 
   $ hg annotate -r tip -nl a
-  1:1: a
   0:1: a
+  1:2: a
   1:3: a
 
 Issue589: "undelete" sequence leads to crash
diff --git a/tests/test-bhalf.t b/tests/test-bhalf.t
--- a/tests/test-bhalf.t
+++ b/tests/test-bhalf.t
@@ -105,8 +105,8 @@ Explore some bdiff implementation edge c
   --- a/x
   +++ b/x
   @@ -1,1 +1,3 @@
+   a
   +a
-   a
   +a
   diff --git a/y b/y
   --- a/y
diff --git a/tests/test-commit-amend.t b/tests/test-commit-amend.t
--- a/tests/test-commit-amend.t
+++ b/tests/test-commit-amend.t
@@ -47,8 +47,8 @@ Amending changeset with changes in worki
   --- a/a  Thu Jan 01 00:00:00 1970 +
   +++ b/a  Thu Jan 01 00:00:00 1970 +
   @@ -1,1 +1,3 @@
+   a
   +a
-   a
   +a
   $ hg log
   changeset:   1:43f1ba15f28a
@@ -122,13 +122,13 @@ No changes, just a different message:
   uncompressed size of bundle content:
254 (changelog)
163 (manifests)
-   141  a
+   129  a
   saved backup bundle to 
$TESTTMP/.hg/strip-backup/74609c7f506e-1bfde511-amend-backup.hg (glob)
   1 changesets found
   uncompressed size of bundle content:
250 (changelog)
163 (manifests)
-   141  a
+   129  a
   adding branch
   adding changesets
   adding manifests
@@ -140,8 +140,8 @@ No changes, just a different message:
   --- a/a  Thu Jan 01 00:00:00 1970 +
   +++ b/a  Thu Jan 01 00:00:00 1970 +
   @@ -1,1 +1,3 @@
+   a
   +a
-   a
   +a
   $ hg log
   changeset:   1:1cd866679df8
@@ -266,13 +266,13 @@ then, test editing custom commit message
   uncompressed size of bundle content:
249 (changelog)
163 (manifests)
-   143  a
+   131  a
   saved backup bundle to 
$TESTTMP/.hg/strip-backup/5f357c7560ab-e7c84ade-amend-backup.hg (glob)
   1 changesets found
   uncompressed size of bundle content:
257 (changelog)
163 (manifests)
-   143  a
+   131  a
   adding branch
   adding changesets
   adding manifests
@@ -309,13 +309,13 @@ Same, but with changes in working dir (d
   uncompressed size of bundle content:
464 (changelog)
  

[PATCH 4 of 5 rfc] bdiff: give preference to longest matches in the middle of b side

2016-11-03 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1478208837 -3600
#  Thu Nov 03 22:33:57 2016 +0100
# Node ID 3e0216b2a0995cb21946bc13fb21391013332c57
# Parent  a3e3c7075c3c4b92e6b8c27e28bef7b2c061008d
bdiff: give preference to longest matches in the middle of b side

j is iterating the b range backwards and we thus use a flag to stop searching
for more-in-the-middle matches when the first hit at or before the middle has
been seen.

This makes the test-bhalf diff "correct". It obviously also gives more
preference to balanced recursion than to appending to sequences. That is kind
of correct, but will also unfortunately make some bundles bigger. No doubt, we
can also create examples where it will make them smaller ...

diff --git a/mercurial/bdiff.c b/mercurial/bdiff.c
--- a/mercurial/bdiff.c
+++ b/mercurial/bdiff.c
@@ -138,7 +138,7 @@ static int longest_match(struct bdiff_li
struct pos *pos,
 int a1, int a2, int b1, int b2, int *omi, int *omj)
 {
-   int mi = a1, mj = b1, mk = 0, i, j, k, half;
+   int mi = a1, mj = b1, mk = 0, i, j, k, half, bhalf, findbetterb;
 
/* window our search on large regions to better bound
   worst-case performance. by choosing a window at the end, we
@@ -147,6 +147,7 @@ static int longest_match(struct bdiff_li
a1 = a2 - 3;
 
half = (a1 + a2 - 1) / 2;
+   bhalf = (b1 + b2 - 1) / 2;
 
for (i = a1; i < a2; i++) {
/* skip all lines in b after the current block */
@@ -154,6 +155,7 @@ static int longest_match(struct bdiff_li
;
 
/* loop through all lines match a[i] in b */
+   findbetterb = 1;
for (; j >= b1; j = b[j].n) {
/* does this extend an earlier match? */
for (k = 1; j - k >= b1 && i - k >= a1; k++) {
@@ -182,9 +184,11 @@ static int longest_match(struct bdiff_li
/* better i in first lower half */
mi = i;
mj = j;
-   } else if (i == mi) {
-   /* an earlier j is "better" */
+   } else if (i == mi && findbetterb) {
+   /* better j in first upper half */
mj = j;
+   if (j <= bhalf)
+   findbetterb = 0;
}
}
}
diff --git a/tests/test-annotate.t b/tests/test-annotate.t
--- a/tests/test-annotate.t
+++ b/tests/test-annotate.t
@@ -91,9 +91,9 @@ annotate (JSON)
 annotate -n b
 
   $ hg annotate -n b
+  1: a
   0: a
   1: a
-  1: a
   3: b4
   3: b5
   3: b6
@@ -111,8 +111,8 @@ annotate --no-follow b
 annotate -nl b
 
   $ hg annotate -nl b
+  1:1: a
   0:1: a
-  1:2: a
   1:3: a
   3:4: b4
   3:5: b5
@@ -121,9 +121,9 @@ annotate -nl b
 annotate -nf b
 
   $ hg annotate -nf b
+  1 a: a
   0 a: a
   1 a: a
-  1 a: a
   3 b: b4
   3 b: b5
   3 b: b6
@@ -131,8 +131,8 @@ annotate -nf b
 annotate -nlf b
 
   $ hg annotate -nlf b
+  1 a:1: a
   0 a:1: a
-  1 a:2: a
   1 a:3: a
   3 b:4: b4
   3 b:5: b5
@@ -156,9 +156,9 @@ annotate -nlf b
 annotate after merge
 
   $ hg annotate -nf b
+  1 a: a
   0 a: a
   1 a: a
-  1 a: a
   3 b: b4
   4 b: c
   3 b: b5
@@ -166,8 +166,8 @@ annotate after merge
 annotate after merge with -l
 
   $ hg annotate -nlf b
+  1 a:1: a
   0 a:1: a
-  1 a:2: a
   1 a:3: a
   3 b:4: b4
   4 b:5: c
@@ -198,7 +198,7 @@ annotate after merge with -l
 annotate after rename merge
 
   $ hg annotate -nf b
-  0 a: a
+  1 a: a
   6 b: z
   1 a: a
   3 b: b4
@@ -209,7 +209,7 @@ annotate after rename merge
 annotate after rename merge with -l
 
   $ hg annotate -nlf b
-  0 a:1: a
+  1 a:1: a
   6 b:2: z
   1 a:3: a
   3 b:4: b4
@@ -226,7 +226,7 @@ Issue2807: alignment of line numbers wit
   $ echo more >> b
   $ hg ci -mmore -d '7 0'
   $ hg annotate -nlf b
-   0 a: 1: a
+   1 a: 1: a
6 b: 2: z
1 a: 3: a
3 b: 4: b4
@@ -240,15 +240,15 @@ Issue2807: alignment of line numbers wit
 linkrev vs rev
 
   $ hg annotate -r tip -n a
+  1: a
   0: a
   1: a
-  1: a
 
 linkrev vs rev with -l
 
   $ hg annotate -r tip -nl a
+  1:1: a
   0:1: a
-  1:2: a
   1:3: a
 
 Issue589: "undelete" sequence leads to crash
diff --git a/tests/test-bhalf.t b/tests/test-bhalf.t
--- a/tests/test-bhalf.t
+++ b/tests/test-bhalf.t
@@ -14,58 +14,48 @@ Diff of boring files:
   +++ b/this-is-the-filename
   @@ -1,31 +1,31 @@
   -once upon a time 1
-  -The quick brown fox jumps over the lazy dog
-  -once upon a time 2
-  -The quick brown fox jumps over the lazy dog
-  -once upon a time 3
-  -The quick brown fox jumps over the lazy dog
-  -once upon a 

[PATCH 1 of 5 rfc] tests: explore some bdiff cases

2016-11-03 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1478208837 -3600
#  Thu Nov 03 22:33:57 2016 +0100
# Node ID f6408efe0d0f4179fe6cc2b967164c1b4567f3d6
# Parent  d06c049695e6ad3219e7479c65ce98a2f123e878
tests: explore some bdiff cases

diff --git a/tests/test-bhalf.t b/tests/test-bhalf.t
new file mode 100644
--- /dev/null
+++ b/tests/test-bhalf.t
@@ -0,0 +1,140 @@
+A couple of test cases exploring the bdiff implementation.
+
+Diff of boring files:
+
+  $ hg init repo1
+  $ cd repo1
+  $ (for i in `seq 15`; do echo "once upon a time $i"; echo "The quick brown 
fox jumps over the lazy dog"; done; echo) > this-is-the-filename
+  $ hg add this-is-the-filename
+  $ hg ci -m "commit message commit message commit message commit message 
commit message commit message commit message commit message"
+  $ (for i in `seq 15`; do echo "twice upon a time $i"; echo "The quick brown 
fox jumps over the lazy dog"; done; echo) > this-is-the-filename
+  $ hg diff --git
+  diff --git a/this-is-the-filename b/this-is-the-filename
+  --- a/this-is-the-filename
+  +++ b/this-is-the-filename
+  @@ -1,31 +1,31 @@
+  -once upon a time 1
+  -The quick brown fox jumps over the lazy dog
+  -once upon a time 2
+  -The quick brown fox jumps over the lazy dog
+  -once upon a time 3
+  -The quick brown fox jumps over the lazy dog
+  -once upon a time 4
+  -The quick brown fox jumps over the lazy dog
+  -once upon a time 5
+  -The quick brown fox jumps over the lazy dog
+  -once upon a time 6
+  -The quick brown fox jumps over the lazy dog
+  -once upon a time 7
+  +twice upon a time 1
+   The quick brown fox jumps over the lazy dog
+  -once upon a time 8
+  -The quick brown fox jumps over the lazy dog
+  -once upon a time 9
+  -The quick brown fox jumps over the lazy dog
+  -once upon a time 10
+  +twice upon a time 2
+   The quick brown fox jumps over the lazy dog
+  -once upon a time 11
+  -The quick brown fox jumps over the lazy dog
+  -once upon a time 12
+  +twice upon a time 3
+   The quick brown fox jumps over the lazy dog
+  -once upon a time 13
+  +twice upon a time 4
+   The quick brown fox jumps over the lazy dog
+  -once upon a time 14
+  +twice upon a time 5
+   The quick brown fox jumps over the lazy dog
+  -once upon a time 15
+  +twice upon a time 6
+  +The quick brown fox jumps over the lazy dog
+  +twice upon a time 7
+  +The quick brown fox jumps over the lazy dog
+  +twice upon a time 8
+  +The quick brown fox jumps over the lazy dog
+  +twice upon a time 9
+  +The quick brown fox jumps over the lazy dog
+  +twice upon a time 10
+  +The quick brown fox jumps over the lazy dog
+  +twice upon a time 11
+  +The quick brown fox jumps over the lazy dog
+  +twice upon a time 12
+  +The quick brown fox jumps over the lazy dog
+  +twice upon a time 13
+  +The quick brown fox jumps over the lazy dog
+  +twice upon a time 14
+  +The quick brown fox jumps over the lazy dog
+  +twice upon a time 15
+   The quick brown fox jumps over the lazy dog
+   
+That's an odd diff for a trivial change!
+
+  $ hg ci -m "commit message commit message commit message commit message 
commit message commit message commit message commit message"
+  $ hg bundle --base null ../bundle.hg
+  2 changesets found
+  $ cd ..
+
+  $ f --size bundle.hg
+  bundle.hg: size=878
+
+Explore some bdiff implementation edge cases:
+
+  $ hg init repo2
+  $ cd repo2
+  $ cat << EOF >> x
+  > a
+  > EOF
+  $ cat << EOF >> y
+  > a
+  > a
+  > a
+  > EOF
+  $ cat << EOF >> z
+  > a
+  > a
+  > a
+  > a
+  > a
+  > EOF
+  $ hg ci -qAm0
+  $ cat << EOF > x
+  > a
+  > a
+  > a
+  > EOF
+  $ cat << EOF > y
+  > a
+  > EOF
+  $ cat << EOF > z
+  > a
+  > EOF
+  $ hg diff --git
+  diff --git a/x b/x
+  --- a/x
+  +++ b/x
+  @@ -1,1 +1,3 @@
+   a
+  +a
+  +a
+  diff --git a/y b/y
+  --- a/y
+  +++ b/y
+  @@ -1,3 +1,1 @@
+   a
+  -a
+  -a
+  diff --git a/z b/z
+  --- a/z
+  +++ b/z
+  @@ -1,5 +1,1 @@
+  -a
+   a
+  -a
+  -a
+  -a
+
+x and y shows the preference for adding / removing at the end of sequences ...
+z just seems weird.
+
+  $ cd ..
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 2 of 5 rfc] bdiff: adjust criteria for getting optimal longest match in the middle

2016-11-03 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1478208837 -3600
#  Thu Nov 03 22:33:57 2016 +0100
# Node ID c593308da04e9144da01a08401d886a64985c74b
# Parent  f6408efe0d0f4179fe6cc2b967164c1b4567f3d6
bdiff: adjust criteria for getting optimal longest match in the middle

We prefer matches closer to the middle to balance recursion, as introduced in
f1ca249696ed.

For ranges with uneven length, matches starting exactly in the middle should
have preference. That will be optimal for matches of length 1. We will thus
accept equality in the half check.

For ranges with even length, half was ceil'ed when calculated but we got the
preference for low matches from the 'less than half' check. To get the same
result as before when we also accept equality, floor it. Without that,
test-annotate.t would show some different (still correct but less optimal)
results.

This will change the heuristics. Tests shows a slightly different output - and
sometimes slightly smaller bundles.

diff --git a/mercurial/bdiff.c b/mercurial/bdiff.c
--- a/mercurial/bdiff.c
+++ b/mercurial/bdiff.c
@@ -146,7 +146,7 @@ static int longest_match(struct bdiff_li
if (a2 - a1 > 3)
a1 = a2 - 3;
 
-   half = (a1 + a2) / 2;
+   half = (a1 + a2 - 1) / 2;
 
for (i = a1; i < a2; i++) {
/* skip all lines in b after the current block */
@@ -172,7 +172,7 @@ static int longest_match(struct bdiff_li
 
/* best match so far? we prefer matches closer
   to the middle to balance recursion */
-   if (k > mk || (k == mk && (i <= mi || i < half))) {
+   if (k > mk || (k == mk && (i <= mi || i <= half))) {
mi = i;
mj = j;
mk = k;
diff --git a/tests/test-bhalf.t b/tests/test-bhalf.t
--- a/tests/test-bhalf.t
+++ b/tests/test-bhalf.t
@@ -33,20 +33,21 @@ Diff of boring files:
   -once upon a time 9
   -The quick brown fox jumps over the lazy dog
   -once upon a time 10
+  -The quick brown fox jumps over the lazy dog
+  -once upon a time 11
   +twice upon a time 2
The quick brown fox jumps over the lazy dog
-  -once upon a time 11
+  -once upon a time 12
   -The quick brown fox jumps over the lazy dog
-  -once upon a time 12
+  -once upon a time 13
   +twice upon a time 3
The quick brown fox jumps over the lazy dog
-  -once upon a time 13
+  -once upon a time 14
   +twice upon a time 4
The quick brown fox jumps over the lazy dog
-  -once upon a time 14
+  -once upon a time 15
   +twice upon a time 5
-   The quick brown fox jumps over the lazy dog
-  -once upon a time 15
+  +The quick brown fox jumps over the lazy dog
   +twice upon a time 6
   +The quick brown fox jumps over the lazy dog
   +twice upon a time 7
@@ -76,7 +77,7 @@ That's an odd diff for a trivial change!
   $ cd ..
 
   $ f --size bundle.hg
-  bundle.hg: size=878
+  bundle.hg: size=870
 
 Explore some bdiff implementation edge cases:
 
@@ -121,20 +122,21 @@ Explore some bdiff implementation edge c
   --- a/y
   +++ b/y
   @@ -1,3 +1,1 @@
+  -a
a
   -a
-  -a
   diff --git a/z b/z
   --- a/z
   +++ b/z
   @@ -1,5 +1,1 @@
   -a
+  -a
a
   -a
   -a
-  -a
 
-x and y shows the preference for adding / removing at the end of sequences ...
-z just seems weird.
+x shows the preference for adding at the end of sequences ...
+while y and z shows the preference for balanced recursion
+which is more efficient in stack and performance.
 
   $ cd ..
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 2 of 2] context: make sure __str__ works, also when there is no _changectx

2016-11-02 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1426800170 -3600
#  Thu Mar 19 22:22:50 2015 +0100
# Node ID 4b06a40809d781e9aba54718ec2fe1a232d81956
# Parent  90300200bc1fcaedcc6ab109574d08b01ece2853
context: make sure __str__ works, also when there is no _changectx

Before, it could crash when trying to print the wrong kind of object at the
wrong time.

diff --git a/mercurial/context.py b/mercurial/context.py
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -709,7 +709,10 @@ class basefilectx(object):
 return False
 
 def __str__(self):
-return "%s@%s" % (self.path(), self._changectx)
+try:
+return "%s@%s" % (self.path(), self._changectx)
+except error.LookupError:
+return "%s@???" % self.path()
 
 def __repr__(self):
 return "<%s %s>" % (type(self).__name__, str(self))
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 1 of 2] largefiles: clarify variable name holding file mode

2016-11-02 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476801939 -7200
#  Tue Oct 18 16:45:39 2016 +0200
# Node ID 90300200bc1fcaedcc6ab109574d08b01ece2853
# Parent  bb586966818986131068280bfd95fc96fbdaaa0d
largefiles: clarify variable name holding file mode

A follow-up to c01acee367ec.

'st' sounds like the whole stat result while 'mode' is a better name for the
actual file mode.

diff --git a/hgext/largefiles/lfcommands.py b/hgext/largefiles/lfcommands.py
--- a/hgext/largefiles/lfcommands.py
+++ b/hgext/largefiles/lfcommands.py
@@ -510,18 +510,21 @@ def updatelfiles(ui, repo, filelist=None
 lfdirstate.normal(lfile)
 update1 = 1
 
-# copy the state of largefile standin from the repository's
+# copy the exec mode of largefile standin from the repository's
 # dirstate to its state in the lfdirstate.
 rellfile = lfile
 relstandin = lfutil.standin(lfile)
 if wvfs.exists(relstandin):
+# exec is decided by the users permissions using mask 0o100
 standinexec = wvfs.stat(relstandin).st_mode & 0o100
-st = wvfs.stat(rellfile).st_mode
-if standinexec != st & 0o100:
-st &= ~0o111
+st = wvfs.stat(rellfile)
+mode = st.st_mode
+if standinexec != mode & 0o100:
+# first remove all X bits, then shift all R bits to X
+mode &= ~0o111
 if standinexec:
-st |= (st >> 2) & 0o111 & ~util.umask
-wvfs.chmod(rellfile, st)
+mode |= (mode >> 2) & 0o111 & ~util.umask
+wvfs.chmod(rellfile, mode)
 update1 = 1
 
 updated += update1
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] spelling: fixes of non-dictionary words

2016-11-02 Thread Mads Kiilerich

On 11/02/2016 03:41 PM, Augie Fackler wrote:

On Nov 2, 2016, at 10:32, Mads Kiilerich  wrote:

On 10/18/2016 02:20 AM, Augie Fackler wrote:

On Mon, Oct 17, 2016 at 11:38:02PM +0200, Mads Kiilerich wrote:

# HG changeset patch
# User Mads Kiilerich 
# Date 1476739015 -7200
#  Mon Oct 17 23:16:55 2016 +0200
# Node ID efd5397e9da5b1d7e2c3353b0b06fc904651b150
# Parent  8a864844d5a0c34bdb24d2e098a0cd339e32e020
spelling: fixes of non-dictionary words

Queued, with some hunks discarded. I've tried to note the interesting ones.

Can you push your queued version now?

I thought I already had. If it's not local, then it's gone. Do you want to just 
send a new one, or should I try and reconstruct what I had?


Changes don't "go away" with Mercurial so I would assume that you must 
have it locally somewhere. Digging it out is probably less work for you 
than the alternatives.


Alternatively, you are in a power position and can decide which 
alternative you want ;-)


/Mads
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] spelling: fixes of non-dictionary words

2016-11-02 Thread Mads Kiilerich

On 10/18/2016 02:20 AM, Augie Fackler wrote:

On Mon, Oct 17, 2016 at 11:38:02PM +0200, Mads Kiilerich wrote:

# HG changeset patch
# User Mads Kiilerich 
# Date 1476739015 -7200
#  Mon Oct 17 23:16:55 2016 +0200
# Node ID efd5397e9da5b1d7e2c3353b0b06fc904651b150
# Parent  8a864844d5a0c34bdb24d2e098a0cd339e32e020
spelling: fixes of non-dictionary words

Queued, with some hunks discarded. I've tried to note the interesting ones.


Can you push your queued version now?

/Mads
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 2 of 4 STABLE] help: replace selenic.com by mercurial-scm.org in man pages

2016-11-01 Thread Mads Kiilerich

On 11/01/2016 12:47 PM, FUJIWARA Katsunori wrote:

# HG changeset patch
# User FUJIWARA Katsunori 
# Date 1478000376 -32400
#  Tue Nov 01 20:39:36 2016 +0900
# Branch stable
# Node ID 4cc5a1e982f91d009e00068e9a0f876e03afbf45
# Parent  d9fd3d3b1233c6d420015f329906b9bed393d978
help: replace selenic.com by mercurial-scm.org in man pages

Source code repository and mailing list services have been already
migrated to mercurial-scm.org domain.

diff --git a/mercurial/help/hg-ssh.8.txt b/mercurial/help/hg-ssh.8.txt
--- a/mercurial/help/hg-ssh.8.txt
+++ b/mercurial/help/hg-ssh.8.txt
@@ -58,9 +58,9 @@ Resources
  "
  Main Web Site: https://mercurial-scm.org/


Hmm ... there is now a redirect from that to 
https://www.mercurial-scm.org/ .


This www is so last year and redundant - wouldn't it be better to just 
consistently get rid of it, have the main web site where we say it is, 
and enjoy shorter URLs?


/Mads

  
-Source code repository: http://selenic.com/hg

+Source code repository: https://www.mercurial-scm.org/repo/hg



___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH stable] httppeer: make __del__ access to self.urlopener more safe

2016-10-31 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1477917828 -3600
#  Mon Oct 31 13:43:48 2016 +0100
# Branch stable
# Node ID 13b1ef724b8c5c22f9234f62b9d9a583655ce87e
# Parent  69ffbbe73dd03df0d1a00bdb2bc083fdb73ede09
httppeer: make __del__ access to self.urlopener more safe

Some errors could in some cases show unfortunate scary and confusing warnings
from the httppeer delstructors:

  abort: nodename nor servname provided, or not known
  Exception AttributeError: "'httpspeer' object has no attribute 'urlopener'" 
in > ignored```

To mute that, take 7b15dd9125b3 to the next level and use getattr in __del__.

diff --git a/mercurial/httppeer.py b/mercurial/httppeer.py
--- a/mercurial/httppeer.py
+++ b/mercurial/httppeer.py
@@ -63,8 +63,9 @@ class httppeer(wireproto.wirepeer):
 self.requestbuilder = urlreq.request
 
 def __del__(self):
-if self.urlopener:
-for h in self.urlopener.handlers:
+urlopener = getattr(self, 'urlopener', None)
+if urlopener:
+for h in urlopener.handlers:
 h.close()
 getattr(h, "close_all", lambda : None)()
 
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH stable] tests: work around instability that caused test from 4999c12c526b to fail

2016-10-28 Thread Mads Kiilerich

On 10/28/2016 10:35 AM, Pierre-Yves David wrote:



On 10/18/2016 03:32 PM, Pierre-Yves David wrote:



On 10/18/2016 03:18 PM, Mads Kiilerich wrote:

On 10/18/2016 02:30 PM, Pierre-Yves David wrote:



On 10/18/2016 01:33 AM, Mads Kiilerich wrote:

# HG changeset patch
# User Mads Kiilerich 
# Date 1476746894 -7200
#  Tue Oct 18 01:28:14 2016 +0200
# Branch stable
# Node ID 548f82b480d086c7a551b025fb980cd70187c880
# Parent  328545c7d8a1044330b8a5bfbdd9c2ff08625d6a
tests: work around instability that caused test from 4999c12c526b to
fail


I'm not too sure of what is going on here, Can you elaborate?


I'm also not sure what is going on. I suddenly saw the new test I added
started to fail. Not in the actual test but in the setup code.
Apparently unrelated to other recent changes - the new test just
happened to expose it. Thus, I suggest this workaround for now.


I would be more comfortable if we had a better idea of what is going one
here. Can you have a deeper look?


Any news on this?


No. The patch still makes the setup part of my new test work reliably 
despite an apparent existing instability in the state left from the 
previous test. I haven't investigated further.


/Mads
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH stable] largefiles: handle that a found standin file doesn't exist when removing it

2016-10-27 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1477591593 -7200
#  Thu Oct 27 20:06:33 2016 +0200
# Branch stable
# Node ID 23f53561a8b6dcfbb35020df4d113fe34fec4c0e
# Parent  69ffbbe73dd03df0d1a00bdb2bc083fdb73ede09
largefiles: handle that a found standin file doesn't exist when removing it

I somehow ended up in a situation where hg crashed on an unlink I introduced in
328545c7d8a1.

I don't know how it happened and can't reproduce it. It seems like it only can
happen when the file is removed between the time of check in a working
directory context walk that finds a standin file, and the time of use when we
try to remove it because the corresponding largefile doesn't exist.

But better safe than sorry: replace the plain unlink with unlinkpath with
ignoremissing=True. That will also remove remaining empty directories, which
arguably is more correct.

diff --git a/hgext/largefiles/reposetup.py b/hgext/largefiles/reposetup.py
--- a/hgext/largefiles/reposetup.py
+++ b/hgext/largefiles/reposetup.py
@@ -217,7 +217,7 @@ def reposetup(ui, repo):
 # standin. Removing a file as a side effect of
 # running status is gross, but the alternatives (if
 # any) are worse.
-self.wvfs.unlink(standin)
+self.wvfs.unlinkpath(standin, ignoremissing=True)
 
 # Filter result lists
 result = list(result)
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


speling tool

2016-10-26 Thread Mads Kiilerich

On 10/26/2016 09:22 PM, timeless wrote:

Pierre-Yves David wrote:

+# the remainig bit and drop this line

remaining ;-)

-- I really should try to put together a version of my spelling tool
as a test to see what people think.

I'd envision it running w/ a stored cache of allowed words, each run
of the test would generate a new word list, diff it with the old one,
and complain about words that aren't in the old one and aren't in the
user's dictionary. It could also be adjusted to only complain about
words of length {x,y} instead of {1,Inf}...


That sounds kind of like 
https://www.mercurial-scm.org/pipermail/mercurial-devel/2012-August/043846.html 
(early version) that I use for patches like 
https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-October/089492.html 
.


We do however use a lot of weird words so it seems like unless we limit 
the scope a lot, there will inherently be more false hits than genuine 
catches.


/Mads
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH stable] revset: don't cache abstractsmartset min/max invocations infinitely

2016-10-26 Thread Mads Kiilerich

On 10/26/2016 02:31 PM, Yuya Nishihara wrote:

On Wed, 26 Oct 2016 14:09:17 +0200, Mads Kiilerich wrote:

Instead, I would perhaps prefer to introduce a 'gettercache' that works
on methods that only have self as parameter (and thus can set a simple
instance method as I do here)  ... or a 'methodcache' that would be like
cachefunc but store the cache on the first 'self' parameter.

That also seems fine. (I'm not a big fan of overwriting methods since we're
likely to create self->self cycle, but that wouldn't be the case here.)



I guess it could be slightly more gc efficient to use 'lambda v=v: v' to 
avoid having a reference to v in the outer scope that also has self ...


/Mads

___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH stable] revset: don't cache abstractsmartset min/max invocations infinitely

2016-10-26 Thread Mads Kiilerich

On 10/26/2016 01:21 PM, Yuya Nishihara wrote:

On Tue, 25 Oct 2016 18:57:26 +0200, Mads Kiilerich wrote:

# HG changeset patch
# User Mads Kiilerich 
# Date 1477414587 -7200
#  Tue Oct 25 18:56:27 2016 +0200
# Branch stable
# Node ID c2fe58cd4235fc6c8cabea882794303d620bec3a
# Parent  76c57e1fe79b0980b377b4f305635dea393d6315
revset: don't cache abstractsmartset min/max invocations infinitely

There was a "leak", apparently introduced in ab66c1dee405. When running:

 hg = hglib.open('repo')
 while True:
 hg.log("max(branch('default'))")

all filteredset instances from branch() would be cached indefinitely by the
@util.cachefunc annotation on the max() implementation.

Indeed. We've cached {self: v} pair every time max() was called.
Queued this for stable, thanks.


-@util.cachefunc
  def min(self):
  """return the minimum element in the set"""
-if self.fastasc is not None:
-for r in self.fastasc():
-return r
-raise ValueError('arg is an empty sequence')
-return min(self)
-
-@util.cachefunc
+if self.fastasc is None:
+v = min(self)
+else:
+for v in self.fastasc():
+break
+else:
+raise ValueError('arg is an empty sequence')
+self.min = lambda: v
+return v

I slightly prefer using propertycache() and wrap it by a function, which
is a common pattern seen in context.py, but that's just a nitpicking.


Exactly what do you mean with wrapping the property by a function? I 
don't see a clear pattern of that in context.py?


Instead, I would perhaps prefer to introduce a 'gettercache' that works 
on methods that only have self as parameter (and thus can set a simple 
instance method as I do here)  ... or a 'methodcache' that would be like 
cachefunc but store the cache on the first 'self' parameter.


/Mads

___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH stable] revset: don't cache abstractsmartset min/max invocations infinitely

2016-10-25 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1477414587 -7200
#  Tue Oct 25 18:56:27 2016 +0200
# Branch stable
# Node ID c2fe58cd4235fc6c8cabea882794303d620bec3a
# Parent  76c57e1fe79b0980b377b4f305635dea393d6315
revset: don't cache abstractsmartset min/max invocations infinitely

There was a "leak", apparently introduced in ab66c1dee405. When running:

hg = hglib.open('repo')
while True:
hg.log("max(branch('default'))")

all filteredset instances from branch() would be cached indefinitely by the
@util.cachefunc annotation on the max() implementation.

util.cachefunc seems dangerous as method decorator and is barely used elsewhere
in the code base. Instead, just open code caching by having the min/max
methods replace themselves with a plain lambda returning the result.

diff --git a/mercurial/revset.py b/mercurial/revset.py
--- a/mercurial/revset.py
+++ b/mercurial/revset.py
@@ -2924,23 +2924,29 @@ class abstractsmartset(object):
 """True if the set will iterate in topographical order"""
 raise NotImplementedError()
 
-@util.cachefunc
 def min(self):
 """return the minimum element in the set"""
-if self.fastasc is not None:
-for r in self.fastasc():
-return r
-raise ValueError('arg is an empty sequence')
-return min(self)
-
-@util.cachefunc
+if self.fastasc is None:
+v = min(self)
+else:
+for v in self.fastasc():
+break
+else:
+raise ValueError('arg is an empty sequence')
+self.min = lambda: v
+return v
+
 def max(self):
 """return the maximum element in the set"""
-if self.fastdesc is not None:
-for r in self.fastdesc():
-return r
-raise ValueError('arg is an empty sequence')
-return max(self)
+if self.fastdesc is None:
+return max(self)
+else:
+for v in self.fastdesc():
+break
+else:
+raise ValueError('arg is an empty sequence')
+self.max = lambda: v
+return v
 
 def first(self):
 """return the first element in the set (user iteration perspective)
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 3 of 3] dirstate: add debug message for debug.dirstate.delaywrite

2016-10-18 Thread Mads Kiilerich

On 10/18/2016 06:17 PM, Kevin Bullock wrote:

On Oct 18, 2016, at 10:19, Mads Kiilerich  wrote:

# HG changeset patch
# User Mads Kiilerich 
# Date 1476803223 -7200
#  Tue Oct 18 17:07:03 2016 +0200
# Node ID 9329389e9dc752652f7c820255ca5afa346c4e08
# Parent  cde3cae17cba67f80b9f1b41e5cc5fe3b87cd06f
dirstate: add debug message for debug.dirstate.delaywrite

Is this series related to the intermittent test failure you described in 
<https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-October/089501.html>?


No, I don't think so.

/Mads
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 1 of 3] largefiles: clarify variable name holding file mode

2016-10-18 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476801939 -7200
#  Tue Oct 18 16:45:39 2016 +0200
# Node ID 6d4322a34c2a543665d702fd10f38f1af6001715
# Parent  26089f5e3b51d0416b985ec78e7facdb3113aa48
largefiles: clarify variable name holding file mode

A follow-up to c01acee367ec.

'st' sounds like the whole stat result while 'mode' is a better name for the
actual file mode.

diff --git a/hgext/largefiles/lfcommands.py b/hgext/largefiles/lfcommands.py
--- a/hgext/largefiles/lfcommands.py
+++ b/hgext/largefiles/lfcommands.py
@@ -510,18 +510,21 @@ def updatelfiles(ui, repo, filelist=None
 lfdirstate.normal(lfile)
 update1 = 1
 
-# copy the state of largefile standin from the repository's
+# copy the exec mode of largefile standin from the repository's
 # dirstate to its state in the lfdirstate.
 rellfile = lfile
 relstandin = lfutil.standin(lfile)
 if wvfs.exists(relstandin):
+# exec is decided by the users permissions using mask 0o100
 standinexec = wvfs.stat(relstandin).st_mode & 0o100
-st = wvfs.stat(rellfile).st_mode
-if standinexec != st & 0o100:
-st &= ~0o111
+st = wvfs.stat(rellfile)
+mode = st.st_mode
+if standinexec != mode & 0o100:
+# first remove all X bits, then shift all R bits to X
+mode &= ~0o111
 if standinexec:
-st |= (st >> 2) & 0o111 & ~util.umask
-wvfs.chmod(rellfile, st)
+mode |= (mode >> 2) & 0o111 & ~util.umask
+wvfs.chmod(rellfile, mode)
 update1 = 1
 
 updated += update1
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 3 of 3] dirstate: add debug message for debug.dirstate.delaywrite

2016-10-18 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476803223 -7200
#  Tue Oct 18 17:07:03 2016 +0200
# Node ID 9329389e9dc752652f7c820255ca5afa346c4e08
# Parent  cde3cae17cba67f80b9f1b41e5cc5fe3b87cd06f
dirstate: add debug message for debug.dirstate.delaywrite

Show a message like:
  delaying dirstate write 0.305s to record that it was clean
which is kind of obscure but gives a hint that something is going on and helps
debugging. The functionality *is* obscure and I don't know a better way to
describe it briefly.

diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py
--- a/mercurial/dirstate.py
+++ b/mercurial/dirstate.py
@@ -756,6 +756,8 @@ class dirstate(object):
 clock = time.time()
 start = int(clock) - (int(clock) % delaywrite)
 end = start + delaywrite
+self._ui.debug('delaying dirstate write %0.3fs to record'
+   ' that it was clean\n' % (end - clock))
 time.sleep(end - clock)
 now = end # trust our estimate that the end is near now
 break
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 2 of 3] dirstate: fix debug.dirstate.delaywrite to use the new "now" after sleeping

2016-10-18 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476802355 -7200
#  Tue Oct 18 16:52:35 2016 +0200
# Node ID cde3cae17cba67f80b9f1b41e5cc5fe3b87cd06f
# Parent  6d4322a34c2a543665d702fd10f38f1af6001715
dirstate: fix debug.dirstate.delaywrite to use the new "now" after sleeping

It seems like the a regression has sneaked into debug.dirstate.delaywrite in
6c6b48aca328. It would sleep until no files were modified "now" any more, but
when writing the dirstate it would use the old "now" and still mark files as
'unset' instead of recording the timestamp that would make the file show up as
clean instead of unknown.

Instead of getting a new "now" from the file system, we trust the computed end
time as the new "now" and thus cause the actual modification time to be
writiten to the dirstate.

debug.dirstate.delaywrite is undocumented and only used in
test-largefiles-update.t . All tests seems to work fine for me without
debug.dirstate.delaywrite . Perhaps because it not really worked as intended
without the fix in this patch, and code and tests thus have evolved to do fine
without it? It could thus perhaps make sense to drop usage of this setting in
the tests. That could speed the test up a bit.

This functionality (or something very similar) can however apparently be very
convenient in setups where checking dirty-ness is expensive - such as when
using large files and have slow file filesystems or are CPU constrained. Now it
works and we can try it. (But ideally, for the largefile use case, it should
probably only delay lfdirstate writes - not ordinary dirstate.)

diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py
--- a/mercurial/dirstate.py
+++ b/mercurial/dirstate.py
@@ -757,6 +757,7 @@ class dirstate(object):
 start = int(clock) - (int(clock) % delaywrite)
 end = start + delaywrite
 time.sleep(end - clock)
+now = end # trust our estimate that the end is near now
 break
 
 st.write(parsers.pack_dirstate(self._map, self._copymap, self._pl, 
now))
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH stable] tests: work around instability that caused test from 4999c12c526b to fail

2016-10-18 Thread Mads Kiilerich

On 10/18/2016 02:30 PM, Pierre-Yves David wrote:



On 10/18/2016 01:33 AM, Mads Kiilerich wrote:

# HG changeset patch
# User Mads Kiilerich 
# Date 1476746894 -7200
#  Tue Oct 18 01:28:14 2016 +0200
# Branch stable
# Node ID 548f82b480d086c7a551b025fb980cd70187c880
# Parent  328545c7d8a1044330b8a5bfbdd9c2ff08625d6a
tests: work around instability that caused test from 4999c12c526b to 
fail


I'm not too sure of what is going on here, Can you elaborate?


I'm also not sure what is going on. I suddenly saw the new test I added 
started to fail. Not in the actual test but in the setup code. 
Apparently unrelated to other recent changes - the new test just 
happened to expose it. Thus, I suggest this workaround for now.


/Mads
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] spelling: fixes of non-dictionary words

2016-10-17 Thread Mads Kiilerich

On 10/18/2016 02:20 AM, Augie Fackler wrote:

diff --git a/contrib/check-code.py b/contrib/check-code.py
--- a/contrib/check-code.py
+++ b/contrib/check-code.py
@@ -463,7 +463,7 @@ def _preparepats():
  failandwarn = c[-1]
  for pats in failandwarn:
  for i, pseq in enumerate(pats):
-# fix-up regexes for multi-line searches
+# fix-up regexps for multi-line searches

I've always used singular "regex" plural "regexes" - I don't see
"regexp" much in my day-to-day life. Dropped all instances of this
correction for now.


Fine, as long as we can be consistent ;-)

Mercurial calls it "regexp" in .hgignore - we can't change that.

We have approximately twice as many uses of 'regexp' as of 'regex'.

The alternative to 'regexp' thus seems to be officially inconsistent.

/Mads
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH stable] tests: work around instability that caused test from 4999c12c526b to fail

2016-10-17 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476746894 -7200
#  Tue Oct 18 01:28:14 2016 +0200
# Branch stable
# Node ID 548f82b480d086c7a551b025fb980cd70187c880
# Parent  328545c7d8a1044330b8a5bfbdd9c2ff08625d6a
tests: work around instability that caused test from 4999c12c526b to fail

diff --git a/tests/test-largefiles-update.t b/tests/test-largefiles-update.t
--- a/tests/test-largefiles-update.t
+++ b/tests/test-largefiles-update.t
@@ -732,12 +732,16 @@ bit correctly on the platform being unaw
 
 #endif
 
+FIXME: At this point large2 seems to be fishy and cause up -c to fail
+"randomly" even though summary shows no changes. For now, just work around it:
+  $ rm large2 .hglf/large2
+
 Test a fatal error interrupting an update. Verify that status report dirty
 files correctly after an interrupted update. Also verify that checking all
 hashes reveals it isn't clean.
 
 Start with clean dirstates:
-  $ hg up -qcr "8^"
+  $ hg up -qCr "8^"
   $ sleep 1
   $ hg st
 Update standins without updating largefiles - large1 is modified and largeX is
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH] spelling: fixes of non-dictionary words

2016-10-17 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476739015 -7200
#  Mon Oct 17 23:16:55 2016 +0200
# Node ID efd5397e9da5b1d7e2c3353b0b06fc904651b150
# Parent  8a864844d5a0c34bdb24d2e098a0cd339e32e020
spelling: fixes of non-dictionary words

diff --git a/contrib/all-revsets.txt b/contrib/all-revsets.txt
--- a/contrib/all-revsets.txt
+++ b/contrib/all-revsets.txt
@@ -8,7 +8,7 @@
 #
 # Please update this file with any revsets you use for benchmarking a change so
 # that future contributors can easily find and retest it when doing further
-# modification. Feel free to highlight interresting variants if needed.
+# modification. Feel free to highlight interesting variants if needed.
 
 
 ## Revset from this section are all extracted from changelog when this file was
diff --git a/contrib/check-code.py b/contrib/check-code.py
--- a/contrib/check-code.py
+++ b/contrib/check-code.py
@@ -463,7 +463,7 @@ def _preparepats():
 failandwarn = c[-1]
 for pats in failandwarn:
 for i, pseq in enumerate(pats):
-# fix-up regexes for multi-line searches
+# fix-up regexps for multi-line searches
 p = pseq[0]
 # \s doesn't match \n
 p = re.sub(r'(?_GenericReview(argslist) 
 let s:origtabpagenr = tabpagenr()
 silent! exe 'tabedit ' . StrippedRelativeFilePath
 if exists('patchcmd')
-  " modelines in loaded files mess with diff comparision
+  " modelines in loaded files mess with diff comparison
   let s:keep_modeline=&modeline
   let &modeline=0
   silent! exe 'vert diffsplit ' . tmpname . '.file'
diff --git a/hgext/chgserver.py b/hgext/chgserver.py
--- a/hgext/chgserver.py
+++ b/hgext/chgserver.py
@@ -629,7 +629,7 @@ class chgunixservicehandler(object):
 
 def chgunixservice(ui, repo, opts):
 if repo:
-# one chgserver can serve multiple repos. drop repo infomation
+# one chgserver can serve multiple repos. drop repo information
 ui.setconfig('bundle', 'mainreporoot', '', 'repo')
 h = chgunixservicehandler(ui)
 return commandserver.unixforkingservice(ui, repo=None, opts=opts, 
handler=h)
diff --git a/hgext/fsmonitor/__init__.py b/hgext/fsmonitor/__init__.py
--- a/hgext/fsmonitor/__init__.py
+++ b/hgext/fsmonitor/__init__.py
@@ -563,7 +563,7 @@ def wrapsymlink(orig, source, link_name)
 pass
 
 class state_update(object):
-''' This context mananger is responsible for dispatching the state-enter
+''' This context manager is responsible for dispatching the state-enter
 and state-leave signals to the watchman service '''
 
 def __init__(self, repo, node, distance, partial):
diff --git a/hgext/histedit.py b/hgext/histedit.py
--- a/hgext/histedit.py
+++ b/hgext/histedit.py
@@ -1406,12 +1406,12 @@ def verifyactions(actions, state, ctxs):
% node.short(missing[0]))
 
 def adjustreplacementsfrommarkers(repo, oldreplacements):
-"""Adjust replacements from obsolescense markers
+"""Adjust replacements from obsolescence markers
 
 Replacements structure is originally generated based on
 histedit's state and does not account for changes that are
 not recorded there. This function fixes that by adding
-data read from obsolescense markers"""
+data read from obsolescence markers"""
 if not obsolete.isenabled(repo, obsolete.createmarkersopt):
 return oldreplacements
 
diff --git a/hgext/logtoprocess.py b/hgext/logtoprocess.py
--- a/hgext/logtoprocess.py
+++ b/hgext/logtoprocess.py
@@ -27,7 +27,7 @@ For example::
 
 would log the warning message and traceback of any failed command dispatch.
 
-Scripts are run asychronously as detached daemon processes; mercurial will
+Scripts are run asynchronously as detached daemon processes; mercurial will
 not ensure that they exit cleanly.
 
 """
diff --git a/hgext/rebase.py b/hgext/rebase.py
--- a/hgext/rebase.py
+++ b/hgext/rebase.py
@@ -1382,7 +1382,7 @@ def _computeobsoletenotrebased(repo, reb
 """return a mapping obsolete => successor for all obsolete nodes to be
 rebased that have a successors in the destination
 
-obsolete => None entries in the mapping indicate nodes with no succesor"""
+obsolete => None entries in the mapping indicate nodes with no successor"""
 obsoletenotrebased = {}
 
 # Build a mapping successor => obsolete nodes for the obsolete
diff --git a/i18n/ja.po b/i18n/ja.po
--- a/i18n/ja.po
+++ b/i18n/ja.po
@@ -6713,7 +6713,7 @@ msgid ""
 msgstr ""
 
 msgid ""
-"Scripts are run asychronously as detached daemon processes; mercurial will\n"
+"Scripts are run a

[PATCH] revset: optimize for destination() being "inefficient"

2016-10-17 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476726516 -7200
#  Mon Oct 17 19:48:36 2016 +0200
# Node ID 50a7bbe82bd3b63a934640b1480ee643cf14ec5f
# Parent  5cb830801855dbb63e98b948e355bc995d295bf3
revset: optimize for destination() being "inefficient"

destination() will scan through the whole subset and read extras for each
revision to get its source.

diff --git a/mercurial/revset.py b/mercurial/revset.py
--- a/mercurial/revset.py
+++ b/mercurial/revset.py
@@ -2594,7 +2594,7 @@ def _optimize(x, small):
 f = getsymbol(x[1])
 wa, ta = _optimize(x[2], small)
 if f in ('author', 'branch', 'closed', 'date', 'desc', 'file', 'grep',
- 'keyword', 'outgoing', 'user'):
+ 'keyword', 'outgoing', 'user', 'destination'):
 w = 10 # slow
 elif f in ('modifies', 'adds', 'removes'):
 w = 30 # slower
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH stable] largefiles: fix 'deleted' files sometimes persistently appearing with R status

2016-10-17 Thread Mads Kiilerich

On 10/17/2016 05:12 PM, Mads Kiilerich wrote:

# HG changeset patch
# User Mads Kiilerich 
# Date 1476717144 -7200
#  Mon Oct 17 17:12:24 2016 +0200
# Node ID bba9f6bba98f82f2403aac8dc656569562690472
# Parent  92414d57a05de39864aecc625e775c693d9edb51
largefiles: fix 'deleted' files sometimes persistently appearing with R status

A code snippet that has been around since largefiles was introduced was wrong:
Standins no longer found in lfdirstate has *not* been removed -
they have probably just been deleted ... or not created.

This wrong reporting did that 'up -C' didn't undo the change and didn't sync
the two dirstates.

Instead of reporting such files as removed, propagate the deletion to the
standin file and report the file as deleted.

diff --git a/hgext/largefiles/reposetup.py b/hgext/largefiles/reposetup.py
--- a/hgext/largefiles/reposetup.py
+++ b/hgext/largefiles/reposetup.py
@@ -164,8 +164,8 @@ def reposetup(ui, repo):
  # files from lfdirstate
  unsure, s = lfdirstate.status(match, [], False, listclean,
False)
-(modified, added, removed, clean) = (s.modified, s.added,
- s.removed, s.clean)
+(modified, added, removed, deleted, clean) = (
+s.modified, s.added, s.removed, s.deleted, s.clean)
  if parentworking:
  for lfile in unsure:
  standin = lfutil.standin(lfile)
@@ -206,14 +206,18 @@ def reposetup(ui, repo):
  removed = [lfile for lfile in removed
 if lfutil.standin(lfile) in ctx1]
  
-# Standins no longer found in lfdirstate has been

-# removed
+# Standins no longer found in lfdirstate have been deleted
  for standin in ctx1.walk(lfutil.getstandinmatcher(self)):
  lfile = lfutil.splitstandin(standin)
  if not match(lfile):
  continue
  if lfile not in lfdirstate:
-removed.append(lfile)
+deleted.append(lfile)
+# Sync "largefile has been removed" back to the

Hmm ... for clarity, this should have been "deleted"

/Mads
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH stable] largefiles: fix 'deleted' files sometimes persistently appearing with R status

2016-10-17 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476717144 -7200
#  Mon Oct 17 17:12:24 2016 +0200
# Node ID bba9f6bba98f82f2403aac8dc656569562690472
# Parent  92414d57a05de39864aecc625e775c693d9edb51
largefiles: fix 'deleted' files sometimes persistently appearing with R status

A code snippet that has been around since largefiles was introduced was wrong:
Standins no longer found in lfdirstate has *not* been removed -
they have probably just been deleted ... or not created.

This wrong reporting did that 'up -C' didn't undo the change and didn't sync
the two dirstates.

Instead of reporting such files as removed, propagate the deletion to the
standin file and report the file as deleted.

diff --git a/hgext/largefiles/reposetup.py b/hgext/largefiles/reposetup.py
--- a/hgext/largefiles/reposetup.py
+++ b/hgext/largefiles/reposetup.py
@@ -164,8 +164,8 @@ def reposetup(ui, repo):
 # files from lfdirstate
 unsure, s = lfdirstate.status(match, [], False, listclean,
   False)
-(modified, added, removed, clean) = (s.modified, s.added,
- s.removed, s.clean)
+(modified, added, removed, deleted, clean) = (
+s.modified, s.added, s.removed, s.deleted, s.clean)
 if parentworking:
 for lfile in unsure:
 standin = lfutil.standin(lfile)
@@ -206,14 +206,18 @@ def reposetup(ui, repo):
 removed = [lfile for lfile in removed
if lfutil.standin(lfile) in ctx1]
 
-# Standins no longer found in lfdirstate has been
-# removed
+# Standins no longer found in lfdirstate have been deleted
 for standin in ctx1.walk(lfutil.getstandinmatcher(self)):
 lfile = lfutil.splitstandin(standin)
 if not match(lfile):
 continue
 if lfile not in lfdirstate:
-removed.append(lfile)
+deleted.append(lfile)
+# Sync "largefile has been removed" back to the
+# standin. Removing a file as a side effect of
+# running status is gross, but the alternatives (if
+# any) are worse.
+self.wvfs.unlink(standin)
 
 # Filter result lists
 result = list(result)
@@ -237,7 +241,7 @@ def reposetup(ui, repo):
 normals = [[fn for fn in filelist
 if not lfutil.isstandin(fn)]
for filelist in result]
-lfstatus = (modified, added, removed, s.deleted, [], [],
+lfstatus = (modified, added, removed, deleted, [], [],
 clean)
 result = [sorted(list1 + list2)
   for (list1, list2) in zip(normals, lfstatus)]
diff --git a/tests/test-largefiles-update.t b/tests/test-largefiles-update.t
--- a/tests/test-largefiles-update.t
+++ b/tests/test-largefiles-update.t
@@ -751,30 +751,22 @@ added:
   $ hg up -Cr "8" --config extensions.crashupdatelfiles=../crashupdatelfiles.py
   [7]
 Check large1 content and status ... and that update will undo modifications:
-BUG: large is R
   $ cat large1
   large1 in #3
   $ hg st
   M large1
-  R largeX
+  ! largeX
   $ hg up -Cr .
   getting changed largefiles
-  1 largefiles updated, 0 removed
-  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  2 largefiles updated, 0 removed
+  2 files updated, 0 files merged, 0 files removed, 0 files unresolved
   $ cat large1
   manually modified before 'hg transplant --continue'
   $ hg st
-  R largeX
-Force largefiles rehashing and check again - which makes it realize that largeX
-not has been removed but just doesn't exist:
+Force largefiles rehashing and check that all changes have been caught by
+status and update:
   $ rm .hg/largefiles/dirstate
   $ hg st
-  ! largeX
-  $ hg up -Cr .
-  getting changed largefiles
-  1 largefiles updated, 0 removed
-  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
-  $ hg st
 
   $ cd ..
 
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: disrstate._check(link|exec) modifies repo dir mtime

2016-10-17 Thread Mads Kiilerich

On 10/17/2016 03:55 PM, Yuya Nishihara wrote:

Please correct me if I misread the patch.

  1. checkexec() on normal unix filesystem: creates 'checkisexec' 0755
  2. copy it to VFAT: 'checkisexec' 0644
  3. copy back from VFAT: 'checkisexec' 0644
  4. checkexec() returns False because no exec flag set



You are right. I had local changes that not had been posted. I will 
resend after 4.0 .


/Mads

___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: disrstate._check(link|exec) modifies repo dir mtime

2016-10-17 Thread Mads Kiilerich

On 10/16/2016 04:19 PM, Yuya Nishihara wrote on the users list:

On Sun, 16 Oct 2016 14:08:52 +0200, Mads Kiilerich wrote:

On 10/16/2016 12:29 PM, Yuya Nishihara wrote:

On Sat, 1 Oct 2016 11:35:40 -0700, Dorian Taylor wrote:

is there a reason why all this testing can’t go on in $REPO/.hg?

No idea why, but there was an attempt to move check* files to .hg/cache.

https://www.mercurial-scm.org/pipermail/mercurial-devel/2015-October/074885.html
https://www.mercurial-scm.org/pipermail/mercurial-devel/2015-November/074971.html

About that:

I don't think this is sufficient. This case can occur, for instance, if
someone naively copies a repo from one machine to another with a USB
stick. Or perhaps they've done a naive recursive chmod. Then we're
permanently stuck thinking our filesystem doesn't have exec bits. If
our fast-path files don't have the right permissions (for who knows
what reasons), then we have to do the entire slow path.

I don't see how copying or chmod can trigger wrong measurement of file
system capabilities. "Wrong" files will be removed again - just like in
the old slow path. The slow path will pretty much be the existing slow
path - it will not be slower than before but it will usually not be used.

Can we distinguish "no exec bit support" with "cache/checkisexec was chmod
by user" ? If a repository is copied through VFAT USB stick, exec bit could
be dropped from both checkisexec and checknoexec.



The basic idea (IIRC) in 
https://www.mercurial-scm.org/pipermail/mercurial-devel/2015-October/074887.html 
is that if we have one cached file that is exec and another that isn't, 
then we can be pretty sure that that file system supports X bit. (That 
assumption might be wrong for filesystems that cache the x bit but 
doesn't persist it - but in that case the repo will be screwed anyway.)


If we can't conclude anything from stat'ing existing files, we remove 
them and check the old way ... and if the result is positive, we leave 
the files behind so it can take the fast path next time.


I thus don't see what you are asking for that not already is taken care of.

/Mads

___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 1 of 2 stable] largefiles: test coverage of fatal interruption of update

2016-10-15 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476577598 -7200
#  Sun Oct 16 02:26:38 2016 +0200
# Node ID 2105975810c843d016ba930c44e63699af2703a6
# Parent  5cb830801855dbb63e98b948e355bc995d295bf3
largefiles: test coverage of fatal interruption of update

Test using existing changesets in a clean working directory, revealing problems
with files that don't show up as modified or do show up as removed when they
just not have been written yet.

diff --git a/tests/test-largefiles-update.t b/tests/test-largefiles-update.t
--- a/tests/test-largefiles-update.t
+++ b/tests/test-largefiles-update.t
@@ -144,6 +144,7 @@ Test that "hg merge" updates largefiles 
   large1 in #1
   $ cat .hglf/large1
   58e24f733a964da346e2407a2bee99d9001184f5
+  $ rm normal1.orig
 
 (merge non-existing largefiles from "other" via conflict prompt -
 make sure the following commit doesn't abort in a confusing way when trying to
@@ -243,6 +244,7 @@ Test that "hg rollback" restores status 
   ? largeY
   $ test -f .hglf/largeY
   [1]
+  $ rm largeY
 
 Test that "hg rollback" restores standins correctly
 
@@ -285,6 +287,7 @@ is not branch-tip)
   58e24f733a964da346e2407a2bee99d9001184f5
   $ cat .hglf/large2
   1deebade43c8c498a3c8daddac0244dc55d1331d
+  $ rm normalX
 
 Test that "hg status" shows status of largefiles correctly just after
 automated commit like rebase/transplant
@@ -598,6 +601,7 @@ it is aborted by conflict.
   58e24f733a964da346e2407a2bee99d9001184f5
   $ cat large1
   large1 in #1
+  $ rm normal1.orig
 
 Test that rebase updates standins for manually modified largefiles at
 the 1st commit of resuming.
@@ -728,6 +732,48 @@ bit correctly on the platform being unaw
 
 #endif
 
+Test a fatal error interrupting an update. lfdirstate doesn't realize that
+.hglf has been updated while the largefile hasn't. Status thus shows a clean
+state ... but rebuilding lfdirstate and checking all hashes reveals it isn't
+clean.
+
+Start with clean dirstates:
+  $ hg up -qcr "8^"
+  $ sleep 1
+  $ hg st
+Update standins without updating largefiles:
+  $ cat << EOF > ../crashupdatelfiles.py
+  > import hgext.largefiles.lfutil
+  > def getlfilestoupdate(oldstandins, newstandins):
+  >  raise SystemExit(7)
+  > hgext.largefiles.lfutil.getlfilestoupdate = getlfilestoupdate
+  > EOF
+  $ hg up -Cr "8" --config extensions.crashupdatelfiles=../crashupdatelfiles.py
+  [7]
+Check large1 content and status:
+BUG: largeX is R and large1 is not M and update does nothing
+  $ cat large1
+  large1 in #3
+  $ hg st
+  R largeX
+  $ hg up -Cr .
+  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ hg st
+  R largeX
+Force largefiles rehashing and check again - revealing modifications that
+update now can remove:
+  $ rm .hg/largefiles/dirstate
+  $ hg st
+  M large1
+  ! largeX
+  $ hg up -Cr .
+  getting changed largefiles
+  2 largefiles updated, 0 removed
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ hg st
+  $ cat large1
+  manually modified before 'hg transplant --continue'
+
   $ cd ..
 
 Test that "hg convert" avoids copying largefiles from the working
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 2 of 2 stable] largefiles: more safe handling of interruptions while updating modifications

2016-10-15 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476577785 -7200
#  Sun Oct 16 02:29:45 2016 +0200
# Node ID dcd190cb73a0740be570bdfbc1090066c38e1173
# Parent  2105975810c843d016ba930c44e63699af2703a6
largefiles: more safe handling of interruptions while updating modifications

Largefiles are fragile with the design where dirstate and lfdirstate must be
kept in sync.

To be less fragile, mark all clean largefiles as unsure ("normallookup") before
updating standins. After standins have been updated and we know exactly which
largefile standins actually was changed, mark the unchanged largefiles back to
clean ("normal").

This will make the failure mode more safe. If interrupted, the next command
will continue to perform extra hashing of all largefiles. That will do that all
largefiles that are out of sync with their standin will be marked dirty and
they will show up in status and can be cleaned with update --clean.

diff --git a/hgext/largefiles/overrides.py b/hgext/largefiles/overrides.py
--- a/hgext/largefiles/overrides.py
+++ b/hgext/largefiles/overrides.py
@@ -1385,7 +1385,8 @@ def mergeupdate(orig, repo, node, branch
 lfdirstate = lfutil.openlfdirstate(repo.ui, repo)
 unsure, s = lfdirstate.status(matchmod.always(repo.root,
 repo.getcwd()),
-  [], False, False, False)
+  [], False, True, False)
+oldclean = set(s.clean)
 pctx = repo['.']
 for lfile in unsure + s.modified:
 lfileabs = repo.wvfs.join(lfile)
@@ -1397,9 +1398,13 @@ def mergeupdate(orig, repo, node, branch
 lfutil.getexecutable(lfileabs))
 if (standin in pctx and
 lfhash == lfutil.readstandin(repo, lfile, '.')):
-lfdirstate.normal(lfile)
+oldclean.add(lfile)
 for lfile in s.added:
 lfutil.updatestandin(repo, lfutil.standin(lfile))
+# mark all clean largefiles as dirty, just in case the update gets
+# interrupted before largefiles and lfdirstate are synchronized
+for lfile in oldclean:
+lfdirstate.normallookup(lfile)
 lfdirstate.write()
 
 oldstandins = lfutil.getstandinsstate(repo)
@@ -1408,6 +1413,13 @@ def mergeupdate(orig, repo, node, branch
 
 newstandins = lfutil.getstandinsstate(repo)
 filelist = lfutil.getlfilestoupdate(oldstandins, newstandins)
+
+# to avoid leaving all largefiles as dirty and thus rehash them, mark
+# all the ones that didn't change as clean
+for lfile in oldclean.difference(filelist):
+lfdirstate.normal(lfile)
+lfdirstate.write()
+
 if branchmerge or force or partial:
 filelist.extend(s.deleted + s.removed)
 
diff --git a/tests/test-largefiles-update.t b/tests/test-largefiles-update.t
--- a/tests/test-largefiles-update.t
+++ b/tests/test-largefiles-update.t
@@ -732,16 +732,16 @@ bit correctly on the platform being unaw
 
 #endif
 
-Test a fatal error interrupting an update. lfdirstate doesn't realize that
-.hglf has been updated while the largefile hasn't. Status thus shows a clean
-state ... but rebuilding lfdirstate and checking all hashes reveals it isn't
-clean.
+Test a fatal error interrupting an update. Verify that status report dirty
+files correctly after an interrupted update. Also verify that checking all
+hashes reveals it isn't clean.
 
 Start with clean dirstates:
   $ hg up -qcr "8^"
   $ sleep 1
   $ hg st
-Update standins without updating largefiles:
+Update standins without updating largefiles - large1 is modified and largeX is
+added:
   $ cat << EOF > ../crashupdatelfiles.py
   > import hgext.largefiles.lfutil
   > def getlfilestoupdate(oldstandins, newstandins):
@@ -750,29 +750,31 @@ Update standins without updating largefi
   > EOF
   $ hg up -Cr "8" --config extensions.crashupdatelfiles=../crashupdatelfiles.py
   [7]
-Check large1 content and status:
-BUG: largeX is R and large1 is not M and update does nothing
+Check large1 content and status ... and that update will undo modifications:
+BUG: large is R
   $ cat large1
   large1 in #3
   $ hg st
+  M large1
   R largeX
   $ hg up -Cr .
-  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  getting changed largefiles
+  1 largefiles updated, 0 removed
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ cat large1
+  manually modified before 'hg transplant --continue'
   $ hg st
   R largeX
-Force largefiles rehashing and check again - revealing modifications that
-update now can remove:
+Force largefiles rehashing and check again - which makes it realize that largeX
+not has been removed but just doesn't exist:
   $ rm .hg/largefiles/dirstate
   $ hg st
-  M larg

[PATCH 1 of 3 v2] cmdutil: satisfy expections in dirstateguard.__del__, even if __init__ fails

2016-10-13 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476402795 -7200
#  Fri Oct 14 01:53:15 2016 +0200
# Node ID 426994acbf9c9b78aa6922a279dca4091b6508dc
# Parent  c0410814002f467c24ef07ce73850ba15b306f8e
cmdutil: satisfy expections in dirstateguard.__del__, even if __init__ fails

Python "delstructors" are terrible - this one because it assumed that __init__
had completed before it was called. That would not necessarily be the case if
the repository was read only or broken and saving the dirstate thus failed in
unexpected ways. That could give confusing warnings about missing '_active'
after failures.

To fix that, make sure all member variables are "declared" before doing
anything that possibly could fail. [Famous last words.]

diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
--- a/mercurial/cmdutil.py
+++ b/mercurial/cmdutil.py
@@ -3508,10 +3508,11 @@ class dirstateguard(object):
 
 def __init__(self, repo, name):
 self._repo = repo
+self._active = False
+self._closed = False
 self._suffix = '.backup.%s.%d' % (name, id(self))
 repo.dirstate.savebackup(repo.currenttransaction(), self._suffix)
 self._active = True
-self._closed = False
 
 def __del__(self):
 if self._active: # still active
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 2 of 3 v2] tests: add more test coverage of phase changes when pushing

2016-10-13 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1471734720 -7200
#  Sun Aug 21 01:12:00 2016 +0200
# Node ID 24a7ccfb932b134da24e58817991943c8bbd63fa
# Parent  426994acbf9c9b78aa6922a279dca4091b6508dc
tests: add more test coverage of phase changes when pushing

Prepare for test coverage of phase updates with future push --readonly option,
both with and without actually pushing changesets.

diff --git a/tests/test-phases-exchange.t b/tests/test-phases-exchange.t
--- a/tests/test-phases-exchange.t
+++ b/tests/test-phases-exchange.t
@@ -1182,10 +1182,40 @@ 2. Test that failed phases movement are 
   cannot lock source repo, skipping local public phase update
   [1]
   $ chmod -R +w .hg
-  $ hgph Upsilon
 
   $ cd ..
 
-  $ killdaemons.py
+#endif
 
-#endif
+Test that clone behaves like pull and doesn't
+publish changesets as plain push does
+
+  $ hg -R Upsilon phase -q --force --draft 2
+  $ hg clone -q Upsilon Pi -r 7
+  $ hgph Upsilon -r 'min(draft())'
+  o  2 draft a-C - 54acac6f23ab
+  |
+  ~
+
+  $ hg -R Upsilon push Pi -r 7
+  pushing to Pi
+  searching for changes
+  no changes found
+  [1]
+  $ hgph Upsilon -r 'min(draft())'
+  o  8 draft a-F - b740e3e5c05d
+  |
+  ~
+
+  $ hg -R Upsilon push Pi -r 8
+  pushing to Pi
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 1 changesets with 1 changes to 1 files
+
+  $ hgph Upsilon -r 'min(draft())'
+  o  9 draft a-G - 3e27b6f1eee1
+  |
+  ~
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 3 of 3 v2] repos: introduce '-R readonly:PATH' for doing local operations "read only"

2016-10-13 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476402795 -7200
#  Fri Oct 14 01:53:15 2016 +0200
# Node ID 491da143c2f310da732c2e1005e7f8394134a51d
# Parent  24a7ccfb932b134da24e58817991943c8bbd63fa
repos: introduce '-R readonly:PATH' for doing local operations "read only"

When used as 'readonly:.', this is pretty much like if the repository was owned
by another user and the current user didn't have write access to anything in
.hg .

Using this feature will for example allow multiple simultaneous pushes,
pushes without phase changes, and will provide a "safe" way to run commands ...
assuming this and our use of VFS is complete and correct.

The existing "API" for repository types could use some cleanup - it requires
modules with special undefined duck typing. This patch seems to do what is
needed.

The existing VFS class hierarchy has a "readonlyvfs" class, but whatever it is,
it doesn't seem suitable for this use; it doesn't seem to be a reusable class
or mixin.

diff --git a/mercurial/hg.py b/mercurial/hg.py
--- a/mercurial/hg.py
+++ b/mercurial/hg.py
@@ -31,6 +31,7 @@ from . import (
 merge as mergemod,
 node,
 phases,
+readonlyrepo,
 repoview,
 scmutil,
 sshpeer,
@@ -112,6 +113,7 @@ schemes = {
 'https': httppeer,
 'ssh': sshpeer,
 'static-http': statichttprepo,
+'readonly': lambda path: readonlyrepo,
 }
 
 def _peerlookup(path):
diff --git a/mercurial/readonlyrepo.py b/mercurial/readonlyrepo.py
new file mode 100644
--- /dev/null
+++ b/mercurial/readonlyrepo.py
@@ -0,0 +1,57 @@
+# readonlyrepo.py - a local repository class for mercurial that can't write
+# or lock the repository
+#
+# This software may be used and distributed according to the terms of the
+# GNU General Public License version 2 or any later version.
+
+from __future__ import absolute_import
+
+import errno
+
+from .i18n import _
+from . import (
+localrepo,
+scmutil,
+util,
+)
+
+# created, not thrown yet
+readonlyexception = IOError(errno.EACCES, _('this is a "readonly" repository'))
+
+# this seems to be a necessary part of the repository type API
+islocal = localrepo.islocal
+
+class readonlyvfs(scmutil.vfs):
+"""A VFS that only can be called with read modes - writing will fail with
+an IO error as if the user didn't have write access"""
+
+def __call__(self, path, mode='r', *args, **kw):
+if mode not in ('r', 'rb'):
+raise readonlyexception
+return super(readonlyvfs, self).__call__(path, mode, *args, **kw)
+
+class readonlyrepo(localrepo.localrepository):
+"""A repository that is local but read only, as if the user didn't have
+file system write access."""
+
+def __init__(self, baseui, path=None, create=False):
+# we know the "scheme" for path is "readonly" but do not want to extend
+# the file/bundle hack in the "url" parser - just strip it here
+assert path.startswith('readonly:'), path
+path = path[len('readonly:'):]
+
+super(readonlyrepo, self).__init__(baseui, path=path, create=False)
+
+assert self.vfs.__class__ is scmutil.vfs
+self.vfs.__class__ = readonlyvfs
+assert self.wvfs.__class__ is scmutil.vfs
+self.wvfs.__class__ = readonlyvfs
+
+def lock(self, wait=True):
+raise readonlyexception
+
+def wlock(self, wait=True):
+raise readonlyexception
+
+def instance(ui, path, create):
+return readonlyrepo(ui, util.urllocalpath(path), create=False)
diff --git a/tests/test-phases-exchange.t b/tests/test-phases-exchange.t
--- a/tests/test-phases-exchange.t
+++ b/tests/test-phases-exchange.t
@@ -1197,25 +1197,27 @@ publish changesets as plain push does
   |
   ~
 
-  $ hg -R Upsilon push Pi -r 7
+  $ hg -R readonly:Upsilon push Pi -r 7
   pushing to Pi
   searching for changes
   no changes found
+  cannot lock source repo, skipping local public phase update
   [1]
   $ hgph Upsilon -r 'min(draft())'
-  o  8 draft a-F - b740e3e5c05d
+  o  2 draft a-C - 54acac6f23ab
   |
   ~
 
-  $ hg -R Upsilon push Pi -r 8
+  $ hg -R readonly:Upsilon push Pi -r 8
   pushing to Pi
   searching for changes
   adding changesets
   adding manifests
   adding file changes
   added 1 changesets with 1 changes to 1 files
+  cannot lock source repo, skipping local public phase update
 
   $ hgph Upsilon -r 'min(draft())'
-  o  9 draft a-G - 3e27b6f1eee1
+  o  2 draft a-C - 54acac6f23ab
   |
   ~
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 2 of 2 v2] util: increase filechunkiter size to 128k

2016-10-13 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476402795 -7200
#  Fri Oct 14 01:53:15 2016 +0200
# Node ID 8909ce1593659bce0563cc66ca3fa34525daed65
# Parent  b7889580507c3d1d40e61904c7a2c2aba335c6cd
util: increase filechunkiter size to 128k

util.filechunkiter has been using a chunk size of 64k for more than 10 years,
also in years where Moore's law still was a law. It is probably ok to bump it
now and perhaps get a slight win in some cases.

Also, largefiles have been using 128k for a long time. Specifying that size
multiple times (or forgetting to do it) seems a bit stupid. Decreasing it to
64k also seems unfortunate.

Thus, we will set the default chunksize to 128k and use the default everywhere.

diff --git a/hgext/largefiles/lfutil.py b/hgext/largefiles/lfutil.py
--- a/hgext/largefiles/lfutil.py
+++ b/hgext/largefiles/lfutil.py
@@ -372,7 +372,7 @@ def hashfile(file):
 return ''
 hasher = hashlib.sha1('')
 with open(file, 'rb') as fd:
-for data in util.filechunkiter(fd, 128 * 1024):
+for data in util.filechunkiter(fd):
 hasher.update(data)
 return hasher.hexdigest()
 
diff --git a/hgext/largefiles/overrides.py b/hgext/largefiles/overrides.py
--- a/hgext/largefiles/overrides.py
+++ b/hgext/largefiles/overrides.py
@@ -1356,7 +1356,7 @@ def overridecat(orig, ui, repo, file1, *
   'downloaded')  % lf)
 path = lfutil.usercachepath(repo.ui, hash)
 with open(path, "rb") as fpin:
-for chunk in util.filechunkiter(fpin, 128 * 1024):
+for chunk in util.filechunkiter(fpin):
 fp.write(chunk)
 err = 0
 return err
diff --git a/hgext/largefiles/proto.py b/hgext/largefiles/proto.py
--- a/hgext/largefiles/proto.py
+++ b/hgext/largefiles/proto.py
@@ -134,7 +134,7 @@ def wirereposetup(ui, repo):
 length))
 
 # SSH streams will block if reading more than length
-for chunk in util.filechunkiter(stream, 128 * 1024, length):
+for chunk in util.filechunkiter(stream, limit=length):
 yield chunk
 # HTTP streams must hit the end to process the last empty
 # chunk of Chunked-Encoding so the connection can be reused.
diff --git a/mercurial/util.py b/mercurial/util.py
--- a/mercurial/util.py
+++ b/mercurial/util.py
@@ -1684,9 +1684,9 @@ class chunkbuffer(object):
 
 return ''.join(buf)
 
-def filechunkiter(f, size=65536, limit=None):
+def filechunkiter(f, size=131072, limit=None):
 """Create a generator that produces the data in the file size
-(default 65536) bytes at a time, up to optional limit (default is
+(default 131072) bytes at a time, up to optional limit (default is
 to read all data).  Chunks may be less than size bytes if the
 chunk is the last chunk in the file, or the file is a socket or
 some other type of file that sometimes reads less data than is
diff --git a/tests/test-largefiles-small-disk.t 
b/tests/test-largefiles-small-disk.t
--- a/tests/test-largefiles-small-disk.t
+++ b/tests/test-largefiles-small-disk.t
@@ -11,7 +11,7 @@ Test how largefiles abort in case the di
   > shutil.copyfileobj = copyfileobj
   > #
   > # this makes the rewritten code abort:
-  > def filechunkiter(f, size=65536, limit=None):
+  > def filechunkiter(f, size=131072, limit=None):
   > yield f.read(4)
   > raise IOError(errno.ENOSPC, os.strerror(errno.ENOSPC))
   > util.filechunkiter = filechunkiter
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 1 of 2 v2] largefiles: always use filechunkiter when iterating files

2016-10-13 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476267738 -7200
#  Wed Oct 12 12:22:18 2016 +0200
# Node ID b7889580507c3d1d40e61904c7a2c2aba335c6cd
# Parent  c0410814002f467c24ef07ce73850ba15b306f8e
largefiles: always use filechunkiter when iterating files

Before, we would sometimes use the default iterator over large files. That
iterator is line based and would add extra buffering and use odd chunk sizes
which could give some overhead.

copyandhash can't just apply a filechunkiter as it sometimes is passed a
genuine generator when downloading remotely.

diff --git a/hgext/largefiles/lfutil.py b/hgext/largefiles/lfutil.py
--- a/hgext/largefiles/lfutil.py
+++ b/hgext/largefiles/lfutil.py
@@ -231,7 +231,8 @@ def copyfromcache(repo, hash, filename):
 # don't use atomic writes in the working copy.
 with open(path, 'rb') as srcfd:
 with wvfs(filename, 'wb') as destfd:
-gothash = copyandhash(srcfd, destfd)
+gothash = copyandhash(
+util.filechunkiter(srcfd), destfd)
 if gothash != hash:
 repo.ui.warn(_('%s: data corruption in %s with hash %s\n')
  % (filename, path, gothash))
diff --git a/hgext/largefiles/localstore.py b/hgext/largefiles/localstore.py
--- a/hgext/largefiles/localstore.py
+++ b/hgext/largefiles/localstore.py
@@ -10,6 +10,7 @@
 from __future__ import absolute_import
 
 from mercurial.i18n import _
+from mercurial import util
 
 from . import (
 basestore,
@@ -42,7 +43,8 @@ class localstore(basestore.basestore):
 raise basestore.StoreError(filename, hash, self.url,
 _("can't get file locally"))
 with open(path, 'rb') as fd:
-return lfutil.copyandhash(fd, tmpfile)
+return lfutil.copyandhash(
+util.filechunkiter(fd), tmpfile)
 
 def _verifyfiles(self, contents, filestocheck):
 failed = False
diff --git a/hgext/largefiles/remotestore.py b/hgext/largefiles/remotestore.py
--- a/hgext/largefiles/remotestore.py
+++ b/hgext/largefiles/remotestore.py
@@ -118,7 +118,7 @@ class remotestore(basestore.basestore):
 raise NotImplementedError('abstract method')
 
 def _get(self, hash):
-'''Get file with the given hash from the remote store.'''
+'''Get a iterator for content with the given hash.'''
 raise NotImplementedError('abstract method')
 
 def _stat(self, hashes):
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH] demandimport: disable lazy import of __builtin__

2016-10-13 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476407019 -7200
#  Fri Oct 14 03:03:39 2016 +0200
# Node ID c0fd76f612735f0354102603ee630437208d8336
# Parent  c0410814002f467c24ef07ce73850ba15b306f8e
demandimport: disable lazy import of __builtin__

Demandimport uses the "try to import __builtin__, else use builtins" trick to
handle Python 3. External libraries and extensions might do something similar.
On Fedora 25 subversion-python-1.9.4-4.fc25.x86_64 will do just that (except
the opposite) ... and it failed all subversion convert tests because
demandimport was hiding that it didn't have builtins but should use
__builtin__.

The builtin module has already been imported when demandimport is loaded so
there is no point in trying to import it on demand. Just always ignore both
variants in demandimport.

diff --git a/mercurial/demandimport.py b/mercurial/demandimport.py
--- a/mercurial/demandimport.py
+++ b/mercurial/demandimport.py
@@ -291,6 +291,8 @@ ignore = [
 'sqlalchemy.events', # has import-time side effects (issue5085)
 # setuptools 8 expects this module to explode early when not on windows
 'distutils.msvc9compiler',
+'__builtin__',
+'builtins',
 ]
 
 if _pypy:
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 1 of 6] largefiles: when setting/clearing x bit on largefiles, don't change other bits

2016-10-13 Thread Mads Kiilerich

On 10/13/2016 05:57 PM, Pierre-Yves David wrote:


On 10/13/2016 04:13 PM, Yuya Nishihara wrote:

Okay, pruned the following changes from hg-committed:


Urg, direct pruning on hg-commmitted is problematic as it makes 
changesets disappear on the contributor see (see the related thread on 
the reviewers mailing list a while back and the drophack¹ extension 
(for reviewers only)).


Don't worry; I don't use evolve - this is one of the reasons ;-)

/Mads

___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 1 of 2] eol: do not wait on lack when writing cache

2016-10-13 Thread Mads Kiilerich

On 10/13/2016 05:07 PM, Pierre-Yves David wrote:



On 10/13/2016 03:21 PM, Mads Kiilerich wrote:

On 10/13/2016 01:53 PM, Pierre-Yves David wrote:

# HG changeset patch
# User Pierre-Yves David 
# Date 1476359131 -7200
#  Thu Oct 13 13:45:31 2016 +0200
# Node ID 88cc944830d0c1895e527d6ca13687f1d5e1c785
# Parent  747e546c561fbf34d07cd30013eaf42b0190bb3b
eol: do not wait on lack when writing cache

The cache writing process is properly catching and handling the case
where the
lock is unavailable. However, it fails to specify the lock can failed
to be
acquired when requesting it. This is now fixed.


Hmm.

*If* the user has write access to the repo and *could* lock the repo,
then it seems reasonable that it waits for the lock and does the right
thing. It would be unfortunate to bail out early and happily continue to
expose the less optimal state that read only users might have to deal 
with.


The change introduced by this changeset make the cache in line with 
how most of other cache works in Mercurial. 


A part of the problem here might be that it is unclear to me what 
happens with wait=False. I don't remember the details: will it continue 
without locking or will it raise? It would be nice to get that clarified 
in the localrepo docstrings.


Also, this "cache" is not so much a "store values so we don't have to 
compute them again" cache. It is more about recording how .hgeol looked 
like when file content from the repo was "cached" in the working 
directory. Without this, we have to invalidate the dirstate more often.


The lock don't really have time out so simple read only command could 
hold themself forever if make this call blocking. 


Why not "really"? There is the 10 minutes timeout?


Given than eol is probably not going to be user on pulling


Sorry, I don't understand that sentence.

/Mads
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 1 of 6] largefiles: when setting/clearing x bit on largefiles, don't change other bits

2016-10-13 Thread Mads Kiilerich

On 10/13/2016 03:58 PM, Yuya Nishihara wrote:

On Thu, 13 Oct 2016 15:46:25 +0200, Mads Kiilerich wrote:

On 10/13/2016 03:39 PM, Yuya Nishihara wrote:
The series looks good to me. Queued up to the patch 5, thanks. 

I have also looked into reworking it to increasing the default chunksize
to 128k everywhere - that seems a bit cleaner in hindsight. I will send
that version if you un-queue it again ;-)

Unqueue everything?



I have no changes to the first two so it would be great to keep them ... 
but I could also easily resend them - whatever is least trouble ;-)


/Mads

___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 1 of 6] largefiles: when setting/clearing x bit on largefiles, don't change other bits

2016-10-13 Thread Mads Kiilerich

On 10/13/2016 03:39 PM, Yuya Nishihara wrote:

On Sat, 08 Oct 2016 01:26:05 +0200, Mads Kiilerich wrote:

# HG changeset patch
# User Mads Kiilerich 
# Date 1475881180 -7200
#  Sat Oct 08 00:59:40 2016 +0200
# Node ID 96315a5833ed015acb7bd8f6d7f1e38db6fa9c50
# Parent  1779dde4c9ef97cb242f8d501655f236f66e5439
largefiles: when setting/clearing x bit on largefiles, don't change other bits

The series looks good to me. Queued up to the patch 5, thanks.


I have also looked into reworking it to increasing the default chunksize 
to 128k everywhere - that seems a bit cleaner in hindsight. I will send 
that version if you un-queue it again ;-)


/Mads

___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 1 of 2] eol: do not wait on lack when writing cache

2016-10-13 Thread Mads Kiilerich

On 10/13/2016 01:53 PM, Pierre-Yves David wrote:

# HG changeset patch
# User Pierre-Yves David 
# Date 1476359131 -7200
#  Thu Oct 13 13:45:31 2016 +0200
# Node ID 88cc944830d0c1895e527d6ca13687f1d5e1c785
# Parent  747e546c561fbf34d07cd30013eaf42b0190bb3b
eol: do not wait on lack when writing cache

The cache writing process is properly catching and handling the case where the
lock is unavailable. However, it fails to specify the lock can failed to be
acquired when requesting it. This is now fixed.


Hmm.

*If* the user has write access to the repo and *could* lock the repo, 
then it seems reasonable that it waits for the lock and does the right 
thing. It would be unfortunate to bail out early and happily continue to 
expose the less optimal state that read only users might have to deal with.


I thus don't think this change would make it less reliable ... and I 
don't see it solving a real problem.


(The next change for proper release is however +1.)

/Mads



diff --git a/hgext/eol.py b/hgext/eol.py
--- a/hgext/eol.py
+++ b/hgext/eol.py
@@ -335,7 +335,7 @@ def reposetup(ui, repo):
  
  wlock = None

  try:
-wlock = self.wlock()
+wlock = self.wlock(wait=False)
  for f in self.dirstate:
  if self.dirstate[f] != 'n':
  continue
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel



___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 2 of 3 v2] tests: add more test coverage of phase changes when pushing

2016-10-12 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1471734720 -7200
#  Sun Aug 21 01:12:00 2016 +0200
# Node ID 6affe6ed5cc5b945961fb8f1fe2dc18e711089c2
# Parent  8db96226f748ff80003c0964bc4ece3b6c3e6d12
tests: add more test coverage of phase changes when pushing

Prepare for test coverage of phase updates with future push --readonly option,
both with and without actually pushing changesets.

diff --git a/tests/test-phases-exchange.t b/tests/test-phases-exchange.t
--- a/tests/test-phases-exchange.t
+++ b/tests/test-phases-exchange.t
@@ -1182,10 +1182,40 @@ 2. Test that failed phases movement are 
   cannot lock source repo, skipping local public phase update
   [1]
   $ chmod -R +w .hg
-  $ hgph Upsilon
 
   $ cd ..
 
-  $ killdaemons.py
+#endif
 
-#endif
+Test that clone behaves like pull and doesn't
+publish changesets as plain push does
+
+  $ hg -R Upsilon phase -q --force --draft 2
+  $ hg clone -q Upsilon Pi -r 7
+  $ hgph Upsilon -r 'min(draft())'
+  o  2 draft a-C - 54acac6f23ab
+  |
+  ~
+
+  $ hg -R Upsilon push Pi -r 7
+  pushing to Pi
+  searching for changes
+  no changes found
+  [1]
+  $ hgph Upsilon -r 'min(draft())'
+  o  8 draft a-F - b740e3e5c05d
+  |
+  ~
+
+  $ hg -R Upsilon push Pi -r 8
+  pushing to Pi
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 1 changesets with 1 changes to 1 files
+
+  $ hgph Upsilon -r 'min(draft())'
+  o  9 draft a-G - 3e27b6f1eee1
+  |
+  ~
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 1 of 3 v2] cmdutil: satisfy expections in dirstateguard.__del__, even if __init__ fails

2016-10-12 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476317981 -7200
#  Thu Oct 13 02:19:41 2016 +0200
# Node ID 8db96226f748ff80003c0964bc4ece3b6c3e6d12
# Parent  b85fa6bf298be07804a74d8fdec0d19fdbc6d740
cmdutil: satisfy expections in dirstateguard.__del__, even if __init__ fails

Python "delstructors" are terrible - this one because it assumed that __init__
had completed before it was called. That would not necessarily be the case if
the repository was read only or broken and saving the dirstate thus failed in
unexpected ways. That could give confusing warnings about missing '_active'
after failures.

To fix that, make sure all member variables are "declared" before doing
anything that possibly could fail. [Famous last words.]

diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
--- a/mercurial/cmdutil.py
+++ b/mercurial/cmdutil.py
@@ -3502,10 +3502,11 @@ class dirstateguard(object):
 
 def __init__(self, repo, name):
 self._repo = repo
+self._active = False
+self._closed = False
 self._suffix = '.backup.%s.%d' % (name, id(self))
 repo.dirstate.savebackup(repo.currenttransaction(), self._suffix)
 self._active = True
-self._closed = False
 
 def __del__(self):
 if self._active: # still active
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 3 of 3 v2] repos: introduce '-R readonly:PATH' for doing local operations "read only"

2016-10-12 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476319116 -7200
#  Thu Oct 13 02:38:36 2016 +0200
# Node ID bc711933bb8cfacfe87d83fe945cf566d79439d8
# Parent  6affe6ed5cc5b945961fb8f1fe2dc18e711089c2
repos: introduce '-R readonly:PATH' for doing local operations "read only"

When used as 'readonly:.', this is pretty much like if the repository was owned
by another user and the current user didn't have write access to anything in
.hg .

Using this feature will for example allow multiple simultaneous pushes,
pushes without phase changes, and will provide a "safe" way to run commands ...
assuming this and our use of VFS is complete and correct.

The "API" for repository types could use some cleanup, but this seems to be one
way to do what is needed.

The existing VFS class hierarchy has a "readonlyvfs" class, but whatever it is,
it doesn't seem suitable for this use. It doesn't seem to be a reusable class
or mixin.

diff --git a/mercurial/hg.py b/mercurial/hg.py
--- a/mercurial/hg.py
+++ b/mercurial/hg.py
@@ -31,6 +31,7 @@ from . import (
 merge as mergemod,
 node,
 phases,
+readonlyrepo,
 repoview,
 scmutil,
 sshpeer,
@@ -112,6 +113,7 @@ schemes = {
 'https': httppeer,
 'ssh': sshpeer,
 'static-http': statichttprepo,
+'readonly': lambda path: readonlyrepo,
 }
 
 def _peerlookup(path):
diff --git a/mercurial/readonlyrepo.py b/mercurial/readonlyrepo.py
new file mode 100644
--- /dev/null
+++ b/mercurial/readonlyrepo.py
@@ -0,0 +1,57 @@
+# readonlyrepo.py - a local repository class for mercurial that can't write
+# or lock the repository
+#
+# This software may be used and distributed according to the terms of the
+# GNU General Public License version 2 or any later version.
+
+from __future__ import absolute_import
+
+import errno
+
+from .i18n import _
+from . import (
+localrepo,
+scmutil,
+util,
+)
+
+# created, not thrown yet
+readonlyexception = IOError(errno.EACCES, _('this is a "readonly" repository'))
+
+# this seems to be a necessary part of the repository type API
+islocal = localrepo.islocal
+
+class readonlyvfs(scmutil.vfs):
+"""A VFS that only can be called with read modes - writing will fail with
+an IO error as if the user didn't have write access"""
+
+def __call__(self, path, mode='r', *args, **kw):
+if mode not in ('r', 'rb'):
+raise readonlyexception
+return super(readonlyvfs, self).__call__(path, mode, *args, **kw)
+
+class readonlyrepo(localrepo.localrepository):
+"""A repository that is local but read only, as if the user didn't have
+file system write access."""
+
+def __init__(self, baseui, path=None, create=False):
+# we know the "scheme" for path is "readonly" but do not want to extend
+# the file/bundle hack in the "url" parser - just strip it here
+assert path.startswith('readonly:'), path
+path = path[len('readonly:'):]
+
+super(readonlyrepo, self).__init__(baseui, path=path, create=False)
+
+assert self.vfs.__class__ is scmutil.vfs
+self.vfs.__class__ = readonlyvfs
+assert self.wvfs.__class__ is scmutil.vfs
+self.wvfs.__class__ = readonlyvfs
+
+def lock(self, wait=True):
+raise readonlyexception
+
+def wlock(self, wait=True):
+raise readonlyexception
+
+def instance(ui, path, create):
+return readonlyrepo(ui, util.urllocalpath(path), create=False)
diff --git a/tests/test-phases-exchange.t b/tests/test-phases-exchange.t
--- a/tests/test-phases-exchange.t
+++ b/tests/test-phases-exchange.t
@@ -1197,25 +1197,27 @@ publish changesets as plain push does
   |
   ~
 
-  $ hg -R Upsilon push Pi -r 7
+  $ hg -R readonly:Upsilon push Pi -r 7
   pushing to Pi
   searching for changes
   no changes found
+  cannot lock source repo, skipping local public phase update
   [1]
   $ hgph Upsilon -r 'min(draft())'
-  o  8 draft a-F - b740e3e5c05d
+  o  2 draft a-C - 54acac6f23ab
   |
   ~
 
-  $ hg -R Upsilon push Pi -r 8
+  $ hg -R readonly:Upsilon push Pi -r 8
   pushing to Pi
   searching for changes
   adding changesets
   adding manifests
   adding file changes
   added 1 changesets with 1 changes to 1 files
+  cannot lock source repo, skipping local public phase update
 
   $ hgph Upsilon -r 'min(draft())'
-  o  9 draft a-G - 3e27b6f1eee1
+  o  2 draft a-C - 54acac6f23ab
   |
   ~
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 2 of 3] merge: only show "cannot merge flags for %s" warning if flags are different

2016-10-12 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476267738 -7200
#  Wed Oct 12 12:22:18 2016 +0200
# Node ID 29bd20c4999865fc19ca3f0344c2c1231a318b1c
# Parent  bc7af83150d0137d35d2cd9bf715fc0e3658cf8a
merge: only show "cannot merge flags for %s" warning if flags are different

diff --git a/mercurial/merge.py b/mercurial/merge.py
--- a/mercurial/merge.py
+++ b/mercurial/merge.py
@@ -475,7 +475,7 @@ class mergestate(object):
 flo = fco.flags()
 fla = fca.flags()
 if 'x' in flags + flo + fla and 'l' not in flags + flo + fla:
-if fca.node() == nullid:
+if fca.node() == nullid and flags != flo:
 if preresolve:
 self._repo.ui.warn(
 _('warning: cannot merge flags for %s\n') % afile)
diff --git a/tests/test-merge-types.t b/tests/test-merge-types.t
--- a/tests/test-merge-types.t
+++ b/tests/test-merge-types.t
@@ -337,7 +337,6 @@ h: l vs l, different
   merging a
   warning: cannot merge flags for b
   merging b
-  warning: cannot merge flags for bx
   merging bx
   warning: cannot merge flags for c
   merging d
@@ -400,7 +399,6 @@ h: l vs l, different
   merging a
   warning: cannot merge flags for b
   merging b
-  warning: cannot merge flags for bx
   merging bx
   warning: cannot merge flags for c
   merging d
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 1 of 3] tests: add test coverage of merging x flag without ancestor

2016-10-12 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476267738 -7200
#  Wed Oct 12 12:22:18 2016 +0200
# Node ID bc7af83150d0137d35d2cd9bf715fc0e3658cf8a
# Parent  b85fa6bf298be07804a74d8fdec0d19fdbc6d740
tests: add test coverage of merging x flag without ancestor

It is more noisy than necessary - we will fix that later.

diff --git a/tests/test-merge-types.t b/tests/test-merge-types.t
--- a/tests/test-merge-types.t
+++ b/tests/test-merge-types.t
@@ -307,6 +307,8 @@ h: l vs l, different
   $ echo 1 > a
   $ echo 1 > b
   $ chmod +x b
+  $ echo 1 > bx
+  $ chmod +x bx
   $ echo x > c
   $ chmod +x c
   $ echo 1 > d
@@ -321,6 +323,8 @@ h: l vs l, different
   $ hg up -qr0
   $ echo 2 > a
   $ echo 2 > b
+  $ echo 2 > bx
+  $ chmod +x bx
   $ echo x > c
   $ ln -s 2 d
   $ ln -s x e
@@ -333,6 +337,8 @@ h: l vs l, different
   merging a
   warning: cannot merge flags for b
   merging b
+  warning: cannot merge flags for bx
+  merging bx
   warning: cannot merge flags for c
   merging d
   warning: internal :merge cannot merge symlinks for d
@@ -345,29 +351,31 @@ h: l vs l, different
   warning: conflicts while merging h! (edit, then use 'hg resolve --mark')
   warning: conflicts while merging a! (edit, then use 'hg resolve --mark')
   warning: conflicts while merging b! (edit, then use 'hg resolve --mark')
-  3 files updated, 0 files merged, 0 files removed, 5 files unresolved
+  warning: conflicts while merging bx! (edit, then use 'hg resolve --mark')
+  3 files updated, 0 files merged, 0 files removed, 6 files unresolved
   use 'hg resolve' to retry unresolved file merges or 'hg update -C .' to 
abandon
   [1]
   $ hg resolve -l
   U a
   U b
+  U bx
   U d
   U f
   U h
   $ tellmeabout a
   a is a plain file with content:
-  <<<<<<< working copy: 0139c5610547 - test: 2
+  <<<<<<< working copy: 0c617753b41b - test: 2
   2
   ===
   1
-  >>>>>>> merge rev:97e29675e796  - test: 1
+  >>>>>>> merge rev:2e60aa20b912  - test: 1
   $ tellmeabout b
   b is a plain file with content:
-  <<<<<<< working copy: 0139c5610547 - test: 2
+  <<<<<<< working copy: 0c617753b41b - test: 2
   2
   ===
   1
-  >>>>>>> merge rev:97e29675e796  - test: 1
+  >>>>>>> merge rev:2e60aa20b912  - test: 1
   $ tellmeabout c
   c is a plain file with content:
   x
@@ -392,6 +400,8 @@ h: l vs l, different
   merging a
   warning: cannot merge flags for b
   merging b
+  warning: cannot merge flags for bx
+  merging bx
   warning: cannot merge flags for c
   merging d
   warning: internal :merge cannot merge symlinks for d
@@ -404,23 +414,24 @@ h: l vs l, different
   warning: conflicts while merging h! (edit, then use 'hg resolve --mark')
   warning: conflicts while merging a! (edit, then use 'hg resolve --mark')
   warning: conflicts while merging b! (edit, then use 'hg resolve --mark')
-  3 files updated, 0 files merged, 0 files removed, 5 files unresolved
+  warning: conflicts while merging bx! (edit, then use 'hg resolve --mark')
+  3 files updated, 0 files merged, 0 files removed, 6 files unresolved
   use 'hg resolve' to retry unresolved file merges or 'hg update -C .' to 
abandon
   [1]
   $ tellmeabout a
   a is a plain file with content:
-  <<<<<<< working copy: 97e29675e796  - test: 1
+  <<<<<<< working copy: 2e60aa20b912  - test: 1
   1
   ===
   2
-  >>>>>>> merge rev:0139c5610547 - test: 2
+  >>>>>>> merge rev:0c617753b41b - test: 2
   $ tellmeabout b
   b is an executable file with content:
-  <<<<<<< working copy: 97e29675e796  - test: 1
+  <<<<<<< working copy: 2e60aa20b912  - test: 1
   1
   ===
   2
-  >>>>>>> merge rev:0139c5610547 - test: 2
+  >>>>>>> merge rev:0c617753b41b - test: 2
   $ tellmeabout c
   c is an executable file with content:
   x
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 3 of 3] merge: clarify warning for (not) merging flags without ancestor

2016-10-12 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476267738 -7200
#  Wed Oct 12 12:22:18 2016 +0200
# Node ID 9588752fc3a6d2b5b1f40ddfc48965997270892d
# Parent  29bd20c4999865fc19ca3f0344c2c1231a318b1c
merge: clarify warning for (not) merging flags without ancestor

Give hints why it can't merge and what it will do instead.

diff --git a/mercurial/merge.py b/mercurial/merge.py
--- a/mercurial/merge.py
+++ b/mercurial/merge.py
@@ -478,7 +478,9 @@ class mergestate(object):
 if fca.node() == nullid and flags != flo:
 if preresolve:
 self._repo.ui.warn(
-_('warning: cannot merge flags for %s\n') % afile)
+_('warning: cannot merge flags for %s '
+  'without common ancestor - keeping local flags\n')
+% afile)
 elif flags == fla:
 flags = flo
 if preresolve:
diff --git a/tests/test-merge-types.t b/tests/test-merge-types.t
--- a/tests/test-merge-types.t
+++ b/tests/test-merge-types.t
@@ -335,10 +335,10 @@ h: l vs l, different
 
   $ hg merge
   merging a
-  warning: cannot merge flags for b
+  warning: cannot merge flags for b without common ancestor - keeping local 
flags
   merging b
   merging bx
-  warning: cannot merge flags for c
+  warning: cannot merge flags for c without common ancestor - keeping local 
flags
   merging d
   warning: internal :merge cannot merge symlinks for d
   warning: conflicts while merging d! (edit, then use 'hg resolve --mark')
@@ -397,10 +397,10 @@ h: l vs l, different
   $ hg up -Cqr1
   $ hg merge
   merging a
-  warning: cannot merge flags for b
+  warning: cannot merge flags for b without common ancestor - keeping local 
flags
   merging b
   merging bx
-  warning: cannot merge flags for c
+  warning: cannot merge flags for c without common ancestor - keeping local 
flags
   merging d
   warning: internal :merge cannot merge symlinks for d
   warning: conflicts while merging d! (edit, then use 'hg resolve --mark')
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 3 of 3] eol: on update, only re-check files if filtering changed

2016-10-09 Thread Mads Kiilerich

On 10/09/2016 04:45 PM, Pierre-Yves David wrote:



On 10/09/2016 04:19 PM, Mads Kiilerich wrote:

diff --git a/hgext/eol.py b/hgext/eol.py
--- a/hgext/eol.py
+++ b/hgext/eol.py
@@ -312,10 +312,15 @@ def reposetup(ui, repo):
 self._eolmatch = util.never
 return

+oldeol = None
 try:
 cachemtime = os.path.getmtime(self.join("eol.cache"))


This seems like it should live in the "cache/" directory. Any reason 
not to ?




I don't know. I guess it predates the cache directory. I would probably 
agree it now would be better if it lived there.


This patch is backwards compatible and doesn't change that. Changing the 
location would be a separate independent change - and apparently trivial 
and backwards compatible.



/Mads

___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 2 of 3] eol: store and reuse pattern matchers instead of creating in tight loop

2016-10-09 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476021282 -7200
#  Sun Oct 09 15:54:42 2016 +0200
# Node ID 5240a8c9b6289838592f24f4e41a1158a48b951a
# Parent  408ddba4689de432fe77e0ba8d27765b63719180
eol: store and reuse pattern matchers instead of creating in tight loop

More "right" and more efficient.

diff --git a/hgext/eol.py b/hgext/eol.py
--- a/hgext/eol.py
+++ b/hgext/eol.py
@@ -175,25 +175,27 @@ class eolfile(object):
 
 include = []
 exclude = []
+self.patterns = []
 for pattern, style in self.cfg.items('patterns'):
 key = style.upper()
 if key == 'BIN':
 exclude.append(pattern)
 else:
 include.append(pattern)
+m = match.match(root, '', [pattern])
+self.patterns.append((pattern, key, m))
 # This will match the files for which we need to care
 # about inconsistent newlines.
 self.match = match.match(root, '', [], include, exclude)
 
 def copytoui(self, ui):
-for pattern, style in self.cfg.items('patterns'):
-key = style.upper()
+for pattern, key, m in self.patterns:
 try:
 ui.setconfig('decode', pattern, self._decode[key], 'eol')
 ui.setconfig('encode', pattern, self._encode[key], 'eol')
 except KeyError:
 ui.warn(_("ignoring unknown EOL style '%s' from %s\n")
-% (style, self.cfg.source('patterns', pattern)))
+% (key, self.cfg.source('patterns', pattern)))
 # eol.only-consistent can be specified in ~/.hgrc or .hgeol
 for k, v in self.cfg.items('eol'):
 ui.setconfig('eol', k, v, 'eol')
@@ -203,10 +205,10 @@ class eolfile(object):
 for f in (files or ctx.files()):
 if f not in ctx:
 continue
-for pattern, style in self.cfg.items('patterns'):
-if not match.match(repo.root, '', [pattern])(f):
+for pattern, key, m in self.patterns:
+if not m(f):
 continue
-target = self._encode[style.upper()]
+target = self._encode[key]
 data = ctx[f].data()
 if (target == "to-lf" and "\r\n" in data
 or target == "to-crlf" and singlelf.search(data)):
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 1 of 3] eol: fix variable naming - call it _eolmatch instead of _eolfile

2016-10-09 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476020562 -7200
#  Sun Oct 09 15:42:42 2016 +0200
# Node ID 408ddba4689de432fe77e0ba8d27765b63719180
# Parent  dbcef8918bbdd8a64d9f79a37bcfa284a26f3a39
eol: fix variable naming - call it _eolmatch instead of _eolfile

It is not the file but a match object based on it.

diff --git a/hgext/eol.py b/hgext/eol.py
--- a/hgext/eol.py
+++ b/hgext/eol.py
@@ -305,9 +305,9 @@ def reposetup(ui, repo):
 return eol.match
 
 def _hgcleardirstate(self):
-self._eolfile = self.loadeol([None, 'tip'])
-if not self._eolfile:
-self._eolfile = util.never
+self._eolmatch = self.loadeol([None, 'tip'])
+if not self._eolmatch:
+self._eolmatch = util.never
 return
 
 try:
@@ -344,7 +344,7 @@ def reposetup(ui, repo):
 
 def commitctx(self, ctx, haserror=False):
 for f in sorted(ctx.added() + ctx.modified()):
-if not self._eolfile(f):
+if not self._eolmatch(f):
 continue
 fctx = ctx[f]
 if fctx is None:
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 3 of 3] eol: on update, only re-check files if filtering changed

2016-10-09 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1476021289 -7200
#  Sun Oct 09 15:54:49 2016 +0200
# Node ID 956e0c953cca045fc2bb57867b8c95f123d66841
# Parent  5240a8c9b6289838592f24f4e41a1158a48b951a
eol: on update, only re-check files if filtering changed

Before, update would mark all files as 'normallookup' in dirstate if .hgeol
changed so all files would get the new filtering applied. That takes some time
... and is pointless if the filtering for that file didn't change.

Instead, keep track of the old filtering and only check files where the
filtering is changed.

To keep the old filtering, change to write the applied .hgeol content to
.hg/eol.cache instead of just touching it. That change is backwards/forwards
compatible.

In a real world test, this takes an update that is changing .hgeol and 3
files from 12s to 4s - where the remaining eol overhead is 1-2s.

diff --git a/hgext/eol.py b/hgext/eol.py
--- a/hgext/eol.py
+++ b/hgext/eol.py
@@ -312,10 +312,15 @@ def reposetup(ui, repo):
 self._eolmatch = util.never
 return
 
+oldeol = None
 try:
 cachemtime = os.path.getmtime(self.join("eol.cache"))
 except OSError:
 cachemtime = 0
+else:
+olddata = self.vfs.read("eol.cache")
+if olddata:
+oldeol = eolfile(self.ui, self.root, olddata)
 
 try:
 eolmtime = os.path.getmtime(self.wjoin(".hgeol"))
@@ -324,17 +329,37 @@ def reposetup(ui, repo):
 
 if eolmtime > cachemtime:
 self.ui.debug("eol: detected change in .hgeol\n")
+
+hgeoldata = self.wvfs.read('.hgeol')
+neweol = eolfile(self.ui, self.root, hgeoldata)
+
 wlock = None
 try:
 wlock = self.wlock()
 for f in self.dirstate:
-if self.dirstate[f] == 'n':
-# all normal files need to be looked at
-# again since the new .hgeol file might no
-# longer match a file it matched before
-self.dirstate.normallookup(f)
-# Create or touch the cache to update mtime
-self.vfs("eol.cache", "w").close()
+if self.dirstate[f] != 'n':
+continue
+if oldeol is not None:
+if not oldeol.match(f) and not neweol.match(f):
+continue
+oldkey = None
+for pattern, key, m in oldeol.patterns:
+if m(f):
+oldkey = key
+break
+newkey = None
+for pattern, key, m in neweol.patterns:
+if m(f):
+newkey = key
+break
+if oldkey == newkey:
+continue
+# all normal files need to be looked at again since
+# the new .hgeol file specify a different filter
+self.dirstate.normallookup(f)
+# Write the cache to update mtime and cache .hgeol
+with self.vfs("eol.cache", "w") as f:
+f.write(hgeoldata)
 wlock.release()
 except error.LockUnavailable:
 # If we cannot lock the repository and clear the
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] import: abort instead of crashing when copy source does not exist (issue5375)

2016-10-08 Thread Mads Kiilerich

On 10/08/2016 06:11 PM, Ryan McElroy wrote:

# HG changeset patch
# User Ryan McElroy 
# Date 1475929618 25200
#  Sat Oct 08 05:26:58 2016 -0700
# Node ID 961569a2cbeebfd54c9369c6e36d03d4938aef38
# Parent  dbcef8918bbdd8a64d9f79a37bcfa284a26f3a39
import: abort instead of crashing when copy source does not exist (issue5375)

Previously, when a patch contained a move or copy from a source that did not
exist, `hg import` would crash. This patch changes import to raise a PatchError
with an explanantion of what is wrong with the patch to avoid the stack trace
and bad user experience.

diff --git a/mercurial/patch.py b/mercurial/patch.py
--- a/mercurial/patch.py
+++ b/mercurial/patch.py
@@ -1952,8 +1952,10 @@ def _applydiff(ui, fp, patcher, backend,
  data, mode = None, None
  if gp.op in ('RENAME', 'COPY'):
  data, mode = store.getfile(gp.oldpath)[:2]
-# FIXME: failing getfile has never been handled here
-assert data is not None
+if data is None:
+# This means that the old path does not exist
+raise PatchError(_("source file '%s' does not exist")
+   % gp.oldpath)
  if gp.mode:
  mode = gp.mode
  if gp.op == 'ADD':
diff --git a/tests/test-import.t b/tests/test-import.t
--- a/tests/test-import.t
+++ b/tests/test-import.t
@@ -1793,3 +1793,13 @@ repository when file not found for patch
1 out of 1 hunks FAILED -- saving rejects to file file1.rej
abort: patch failed to apply
[255]
+
+test import crash (issue5375)
+  $ cd ..
+  $ hg init repo
+  $ cd repo
+  $ printf "diff --git a/a b/b\nrename from a\nrename to b" | hg import -
+  applying patch from stdin
+  a not tracked!
+  abort: source file 'a' does not exist
+  [255]


Hmm.

For a patch modifying a non-existing file we already get:


unable to find 'f' for patching
1 out of 1 hunks FAILED -- saving rejects to file f.rej
patch failed, unable to continue (try -v)
patch failed, rejects left in working directory

$ cat f.rej
--- f
+++ f
@@ -1,1 +1,2 @@

+f

In the tested case, I would thus also expect it to leave a .rej file 
with the failing rename "hunk" while applying the rest of the patch 
(even though a pure rename arguably *doesn't* have any hunks).


BUT the logic around this check seems wrong. A rename or copy of a 
missing file should be handled exactly the same, no matter if it is a 
bare rename/copy or if it also modifies the file (= has a first hunk).


I don't know if it is better to give a not-entirely-correct abort than 
to fail with an assertion error, but I think it still deserves a 
FIXME/TODO.


(The iterhunks docstring seems wrong, for example regarding 'file' 
entries and firsthunk. Let's go find pmezard!)


/Mads
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 6 of 6] largefiles: PoC: replace largefile copies in working directory with symlinks

2016-10-07 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1475882684 -7200
#  Sat Oct 08 01:24:44 2016 +0200
# Node ID 36e7e42f266e5cc70535479e50445395402c7400
# Parent  2c25dc4a5a556abbebe09fe3c4eb0ff4c8fa0cd4
largefiles: PoC: replace largefile copies in working directory with symlinks

Updating between distant revisions can be expensive when working on multiple
divergent branches. The basic working directory update might be efficient but
copying the largefiles and hashing them will take a lot of bandwidth and thus
take time.

Usually, largefiles are copied from the store to the working directory to make
sure that any in-place modification of largefiles in the working directory
doesn't corrupt the store.

In a build farm, there will often be a lot of switching between branches, and
the risk of someone doing in-place modification of largefiles is quite small.
It might thus make sense to make a different trade-off.

The idea here is to make a symlink pointing at the storage file instead of
copying it. In some cases, this can make working copy updates NaN times faster.

To reduce the risk of corruption, storage files are made read-only.

To make sure that time-stamp based build systems manage a symlink change
correctly, storage files are touched when they are linked to.

This should also work on Windows IF running as admin or ordinary users have
been authorized to make symlinks (which probably in general not is a good
idea).

This is a proof of concept and doesn't contain or pass any tests. It should
perhaps turn into an extension.

diff --git a/hgext/largefiles/lfutil.py b/hgext/largefiles/lfutil.py
--- a/hgext/largefiles/lfutil.py
+++ b/hgext/largefiles/lfutil.py
@@ -10,6 +10,7 @@
 from __future__ import absolute_import
 
 import copy
+import errno
 import hashlib
 import os
 import platform
@@ -32,6 +33,20 @@ shortnameslash = shortname + '/'
 longname = 'largefiles'
 filechunkitersize = 128 * 1024
 
+if os.name == 'nt':
+import ctypes
+FILESYMLINK = 0
+_kernel32 = ctypes.windll.kernel32
+def symlink(src, dst):
+"""Create a symbolic link pointing to src named dst.
+Requires running as Admin ... or enabling in:
+Computer configuration > Windows Settings > Security Settings >
+Local Policies > User Rights Assignment > Create symbolic links
+"""
+_kernel32.CreateSymbolicLinkA(dst, os.path.normpath(src), FILESYMLINK)
+else:
+symlink = os.symlink
+
 # -- Private worker functions --
 
 def getminsize(ui, assumelfiles, opt, default=10):
@@ -227,18 +242,26 @@ def copyfromcache(repo, hash, filename):
 path = findfile(repo, hash)
 if path is None:
 return False
+# make storage file read-only before pointing at it - reduce risk of
+# corruption
+try:
+os.chmod(path, os.stat(path).st_mode &
+   ~(stat.S_IWUSR|stat.S_IWGRP|stat.S_IWOTH))
+except OSError as e:
+if e.errno != errno.EPERM:
+raise
+# no chmod access probably means that it is safe (but touch will fail)
+repo.ui.debug("can't chmod %s\n" % path)
 wvfs.makedirs(wvfs.dirname(wvfs.join(filename)))
-# The write may fail before the file is fully written, but we
-# don't use atomic writes in the working copy.
-with open(path, 'rb') as srcfd:
-with wvfs(filename, 'wb') as destfd:
-gothash = copyandhash(
-util.filechunkiter(srcfd, filechunkitersize), destfd)
-if gothash != hash:
-repo.ui.warn(_('%s: data corruption in %s with hash %s\n')
- % (filename, path, gothash))
-wvfs.unlink(filename)
-return False
+symlink(path, wvfs.join(filename))
+# touch the storage file so build systems see the file as modified
+try:
+os.utime(path, None) # touch
+except OSError as e:
+if e.errno != errno.EACCES:
+raise
+repo.ui.warn(_("can't touch %s - that might confuse build systems\n")
+ % path)
 return True
 
 def copytostore(repo, rev, file, uploaded=False):
diff --git a/hgext/largefiles/overrides.py b/hgext/largefiles/overrides.py
--- a/hgext/largefiles/overrides.py
+++ b/hgext/largefiles/overrides.py
@@ -1392,6 +1392,8 @@ def mergeupdate(orig, repo, node, branch
 lfileabs = repo.wvfs.join(lfile)
 if not repo.wvfs.exists(lfileabs):
 continue
+if repo.wvfs.islink(lfile):
+continue # never update standins from symlinks
 lfhash = lfutil.hashrepofile(repo, lfile)
 standin = lfutil.standin(lfile)
 lfutil.writestandin(repo, standin, lfhash,
diff --git a/hgext/largefiles/reposetup.py b/hgext/largefiles/reposetup.py
--- a/hgext/la

[PATCH 3 of 6] largefiles: use constant file chunk size instead of repeating 128 * 1024

2016-10-07 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1475881181 -7200
#  Sat Oct 08 00:59:41 2016 +0200
# Node ID 5d63b84517ab6ecff8695aff0748bef9695c241c
# Parent  87cea1040403001e660dd1c6b2e2d069d8a51d2e
largefiles: use constant file chunk size instead of repeating 128 * 1024

diff --git a/hgext/largefiles/lfutil.py b/hgext/largefiles/lfutil.py
--- a/hgext/largefiles/lfutil.py
+++ b/hgext/largefiles/lfutil.py
@@ -30,6 +30,7 @@ from mercurial import (
 shortname = '.hglf'
 shortnameslash = shortname + '/'
 longname = 'largefiles'
+filechunkitersize = 128 * 1024
 
 # -- Private worker functions --
 
@@ -371,7 +372,7 @@ def hashfile(file):
 return ''
 hasher = hashlib.sha1('')
 with open(file, 'rb') as fd:
-for data in util.filechunkiter(fd, 128 * 1024):
+for data in util.filechunkiter(fd, filechunkitersize):
 hasher.update(data)
 return hasher.hexdigest()
 
diff --git a/hgext/largefiles/overrides.py b/hgext/largefiles/overrides.py
--- a/hgext/largefiles/overrides.py
+++ b/hgext/largefiles/overrides.py
@@ -1356,7 +1356,8 @@ def overridecat(orig, ui, repo, file1, *
   'downloaded')  % lf)
 path = lfutil.usercachepath(repo.ui, hash)
 with open(path, "rb") as fpin:
-for chunk in util.filechunkiter(fpin, 128 * 1024):
+for chunk in util.filechunkiter(fpin,
+lfutil.filechunkitersize):
 fp.write(chunk)
 err = 0
 return err
diff --git a/hgext/largefiles/proto.py b/hgext/largefiles/proto.py
--- a/hgext/largefiles/proto.py
+++ b/hgext/largefiles/proto.py
@@ -134,7 +134,8 @@ def wirereposetup(ui, repo):
 length))
 
 # SSH streams will block if reading more than length
-for chunk in util.filechunkiter(stream, 128 * 1024, length):
+for chunk in util.filechunkiter(stream, lfutil.filechunkitersize,
+length):
 yield chunk
 # HTTP streams must hit the end to process the last empty
 # chunk of Chunked-Encoding so the connection can be reused.
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 4 of 6] largefiles: always use filechunkitersize when using filechunkiter

2016-10-07 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1475881182 -7200
#  Sat Oct 08 00:59:42 2016 +0200
# Node ID 8838011d6452cbe31608482cc9f1b81e00fb1ca5
# Parent  5d63b84517ab6ecff8695aff0748bef9695c241c
largefiles: always use filechunkitersize when using filechunkiter

(Alternatively, we could change the filechunkiter default, use the existing
default, or have a custom largefiles filechunkiter wrapper ...)

diff --git a/hgext/largefiles/lfutil.py b/hgext/largefiles/lfutil.py
--- a/hgext/largefiles/lfutil.py
+++ b/hgext/largefiles/lfutil.py
@@ -57,7 +57,7 @@ def link(src, dest):
 # if hardlinks fail, fallback on atomic copy
 with open(src, 'rb') as srcf:
 with util.atomictempfile(dest) as dstf:
-for chunk in util.filechunkiter(srcf):
+for chunk in util.filechunkiter(srcf, filechunkitersize):
 dstf.write(chunk)
 os.chmod(dest, os.stat(src).st_mode)
 
@@ -268,7 +268,7 @@ def copytostoreabsolute(repo, file, hash
 with open(file, 'rb') as srcf:
 with util.atomictempfile(storepath(repo, hash),
  createmode=repo.store.createmode) as dstf:
-for chunk in util.filechunkiter(srcf):
+for chunk in util.filechunkiter(srcf, filechunkitersize):
 dstf.write(chunk)
 linktousercache(repo, hash)
 
@@ -399,7 +399,7 @@ def hexsha1(data):
 """hexsha1 returns the hex-encoded sha1 sum of the data in the file-like
 object data"""
 h = hashlib.sha1()
-for chunk in util.filechunkiter(data):
+for chunk in util.filechunkiter(data, filechunkitersize):
 h.update(chunk)
 return h.hexdigest()
 
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 2 of 6] largefiles: use context for file closing

2016-10-07 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1475881181 -7200
#  Sat Oct 08 00:59:41 2016 +0200
# Node ID 87cea1040403001e660dd1c6b2e2d069d8a51d2e
# Parent  96315a5833ed015acb7bd8f6d7f1e38db6fa9c50
largefiles: use context for file closing

Make the code slightly smaller and safer (and more deeply indented).

diff --git a/hgext/largefiles/basestore.py b/hgext/largefiles/basestore.py
--- a/hgext/largefiles/basestore.py
+++ b/hgext/largefiles/basestore.py
@@ -91,15 +91,13 @@ class basestore(object):
 storefilename = lfutil.storepath(self.repo, hash)
 
 tmpname = storefilename + '.tmp'
-tmpfile = util.atomictempfile(tmpname,
-  createmode=self.repo.store.createmode)
-
-try:
-gothash = self._getfile(tmpfile, filename, hash)
-except StoreError as err:
-self.ui.warn(err.longmessage())
-gothash = ""
-tmpfile.close()
+with util.atomictempfile(tmpname,
+createmode=self.repo.store.createmode) as tmpfile:
+try:
+gothash = self._getfile(tmpfile, filename, hash)
+except StoreError as err:
+self.ui.warn(err.longmessage())
+gothash = ""
 
 if gothash != hash:
 if gothash != "":
diff --git a/hgext/largefiles/lfutil.py b/hgext/largefiles/lfutil.py
--- a/hgext/largefiles/lfutil.py
+++ b/hgext/largefiles/lfutil.py
@@ -54,10 +54,10 @@ def link(src, dest):
 util.oslink(src, dest)
 except OSError:
 # if hardlinks fail, fallback on atomic copy
-dst = util.atomictempfile(dest)
-for chunk in util.filechunkiter(open(src, 'rb')):
-dst.write(chunk)
-dst.close()
+with open(src, 'rb') as srcf:
+with util.atomictempfile(dest) as dstf:
+for chunk in util.filechunkiter(srcf):
+dstf.write(chunk)
 os.chmod(dest, os.stat(src).st_mode)
 
 def usercachepath(ui, hash):
@@ -264,11 +264,11 @@ def copytostoreabsolute(repo, file, hash
 link(usercachepath(repo.ui, hash), storepath(repo, hash))
 else:
 util.makedirs(os.path.dirname(storepath(repo, hash)))
-dst = util.atomictempfile(storepath(repo, hash),
-  createmode=repo.store.createmode)
-for chunk in util.filechunkiter(open(file, 'rb')):
-dst.write(chunk)
-dst.close()
+with open(file, 'rb') as srcf:
+with util.atomictempfile(storepath(repo, hash),
+ createmode=repo.store.createmode) as dstf:
+for chunk in util.filechunkiter(srcf):
+dstf.write(chunk)
 linktousercache(repo, hash)
 
 def linktousercache(repo, hash):
@@ -370,10 +370,9 @@ def hashfile(file):
 if not os.path.exists(file):
 return ''
 hasher = hashlib.sha1('')
-fd = open(file, 'rb')
-for data in util.filechunkiter(fd, 128 * 1024):
-hasher.update(data)
-fd.close()
+with open(file, 'rb') as fd:
+for data in util.filechunkiter(fd, 128 * 1024):
+hasher.update(data)
 return hasher.hexdigest()
 
 def getexecutable(filename):
diff --git a/hgext/largefiles/overrides.py b/hgext/largefiles/overrides.py
--- a/hgext/largefiles/overrides.py
+++ b/hgext/largefiles/overrides.py
@@ -883,11 +883,8 @@ def hgclone(orig, ui, opts, *args, **kwa
 
 # If largefiles is required for this repo, permanently enable it 
locally
 if 'largefiles' in repo.requirements:
-fp = repo.vfs('hgrc', 'a', text=True)
-try:
+with repo.vfs('hgrc', 'a', text=True) as fp:
 fp.write('\n[extensions]\nlargefiles=\n')
-finally:
-fp.close()
 
 # Caching is implicitly limited to 'rev' option, since the dest repo 
was
 # truncated at that point.  The user may expect a download count with
@@ -1339,30 +1336,28 @@ def overridecat(orig, ui, repo, file1, *
 m.visitdir = lfvisitdirfn
 
 for f in ctx.walk(m):
-fp = cmdutil.makefileobj(repo, opts.get('output'), ctx.node(),
- pathname=f)
-lf = lfutil.splitstandin(f)
-if lf is None or origmatchfn(f):
-# duplicating unreachable code from commands.cat
-data = ctx[f].data()
-if opts.get('decode'):
-data = repo.wwritedata(f, data)
-fp.write(data)
-else:
-hash = lfutil.readstandin(repo, lf, ctx.rev())
-if not lfutil.inusercache(repo.ui, hash):
-store = storefactory.openstore(repo)
-success, missing = store.get([(lf, hash)])
- 

[PATCH 1 of 6] largefiles: when setting/clearing x bit on largefiles, don't change other bits

2016-10-07 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1475881180 -7200
#  Sat Oct 08 00:59:40 2016 +0200
# Node ID 96315a5833ed015acb7bd8f6d7f1e38db6fa9c50
# Parent  1779dde4c9ef97cb242f8d501655f236f66e5439
largefiles: when setting/clearing x bit on largefiles, don't change other bits

It is only the X bit that it matters to copy from the standin to the largefile
in the working directory. While it generally doesn't do any harm to copy the
whole mode, it is also "wrong" to copy more than the X bit we care about. It
can make a difference if someone should try to handle largefiles differently,
such as marking them read-only.

Thus, do similar to what utils.setflags does and set the X bit where there are
R bits and obey umask.

diff --git a/hgext/largefiles/lfcommands.py b/hgext/largefiles/lfcommands.py
--- a/hgext/largefiles/lfcommands.py
+++ b/hgext/largefiles/lfcommands.py
@@ -515,9 +515,13 @@ def updatelfiles(ui, repo, filelist=None
 rellfile = lfile
 relstandin = lfutil.standin(lfile)
 if wvfs.exists(relstandin):
-mode = wvfs.stat(relstandin).st_mode
-if mode != wvfs.stat(rellfile).st_mode:
-wvfs.chmod(rellfile, mode)
+standinexec = wvfs.stat(relstandin).st_mode & 0o100
+st = wvfs.stat(rellfile).st_mode
+if standinexec != st & 0o100:
+st &= ~0o111
+if standinexec:
+st |= (st >> 2) & 0o111 & ~util.umask
+wvfs.chmod(rellfile, st)
 update1 = 1
 
 updated += update1
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 5 of 6] largefiles: always use filechunkiter when iterating files

2016-10-07 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1475881183 -7200
#  Sat Oct 08 00:59:43 2016 +0200
# Node ID 2c25dc4a5a556abbebe09fe3c4eb0ff4c8fa0cd4
# Parent  8838011d6452cbe31608482cc9f1b81e00fb1ca5
largefiles: always use filechunkiter when iterating files

Before, we would sometimes use the default iterator over large files. That
iterator is line based and would add extra buffering and use odd chunk sizes
which could give some overhead.

copyandhash can't just apply a filechunkiter as it sometimes is passed a
genuine generator when downloading remotely.

diff --git a/hgext/largefiles/lfutil.py b/hgext/largefiles/lfutil.py
--- a/hgext/largefiles/lfutil.py
+++ b/hgext/largefiles/lfutil.py
@@ -232,7 +232,8 @@ def copyfromcache(repo, hash, filename):
 # don't use atomic writes in the working copy.
 with open(path, 'rb') as srcfd:
 with wvfs(filename, 'wb') as destfd:
-gothash = copyandhash(srcfd, destfd)
+gothash = copyandhash(
+util.filechunkiter(srcfd, filechunkitersize), destfd)
 if gothash != hash:
 repo.ui.warn(_('%s: data corruption in %s with hash %s\n')
  % (filename, path, gothash))
diff --git a/hgext/largefiles/localstore.py b/hgext/largefiles/localstore.py
--- a/hgext/largefiles/localstore.py
+++ b/hgext/largefiles/localstore.py
@@ -10,6 +10,7 @@
 from __future__ import absolute_import
 
 from mercurial.i18n import _
+from mercurial import util
 
 from . import (
 basestore,
@@ -42,7 +43,8 @@ class localstore(basestore.basestore):
 raise basestore.StoreError(filename, hash, self.url,
 _("can't get file locally"))
 with open(path, 'rb') as fd:
-return lfutil.copyandhash(fd, tmpfile)
+return lfutil.copyandhash(
+util.filechunkiter(fd, lfutil.filechunkitersize), tmpfile)
 
 def _verifyfiles(self, contents, filestocheck):
 failed = False
diff --git a/hgext/largefiles/remotestore.py b/hgext/largefiles/remotestore.py
--- a/hgext/largefiles/remotestore.py
+++ b/hgext/largefiles/remotestore.py
@@ -118,7 +118,7 @@ class remotestore(basestore.basestore):
 raise NotImplementedError('abstract method')
 
 def _get(self, hash):
-'''Get file with the given hash from the remote store.'''
+'''Get a iterator for content with the given hash.'''
 raise NotImplementedError('abstract method')
 
 def _stat(self, hashes):
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 2 of 2] push: introduce --readonly to make push behave more like pull

2016-08-23 Thread Mads Kiilerich

On 08/23/2016 07:02 PM, Gregory Szorc wrote:
On Mon, Aug 22, 2016 at 6:14 PM, Mads Kiilerich <mailto:m...@kiilerich.com>> wrote:



Push has some advantages:
* I can use revsets to control what is pushed.
* Pushing is "the right model" for propagating changes -
polling/pulling is bad.
* For a master server with "slaves" it also has some advantages
that the master will call out to the slaves instead of allowing
them to connect.
* It's so weird to creating a ssh connection to a remote server
and pass the result of some local revset query, just to let it
connect back and make a pull.


And I think a pull based model is more robust for replication. And I 
think a pull based model where clients consume a log of "pull 
requests" that is asynchronous to the push operation is even more robust.


Right now, I would agree. Pull is more stable than push because of the 
hardcoded policy that pull is read only so there can be multiple 
parallel pulls from a repo, but only one locking push at a time from a 
repo. I want to fix that.


(Btw: AFAIK, that "scalability regression" came when phases were 
introduced. I want to give an option for working around that.)


If you are running async out of a queue anyway and have this read-only 
push, then I doubt there will be much significant difference in 
robustness between initiating the "exchange" from the master and from 
the remove server. So sure, if you want more logic than just the plain 
hgweb on the remote servers and already have a reliable distributed 
secure message queue, then pulling is just as fine as pushing (except 
for the missing revsets).


No single size fits all, and even though our setups are similar, they 
also have differences. I would love to look more into your solution. But 
now, I just need this missing piece to make the next step / improvement 
in our setup.


/Mads

___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 2 of 2] push: introduce --readonly to make push behave more like pull

2016-08-23 Thread Mads Kiilerich

On 08/23/2016 03:49 PM, Augie Fackler wrote:

I don't know if this feature I'm asking for most correctly is described as
non-locking and thus readonly and not modifying state or publishing, or if
it non-publishing and thus readonly and with no need for locking. It could
be described either way - I just want one of them ;-)

I think the latter is more sensible from a user perspective - the user
doesn't (ideally) know about or care about our locking requirements,
so the visible side effect to them (no phase changes) seems like the
thing to focus on from a UI level.


It is however (in the situation) very visible and annoying for the user 
that push is locking the source repo.


For many use cases, phase change (publish on push) is really not 
relevant. For example when you are propagating a mainline / upstream 
repo to a mirror and everything already has been published.


/Mads
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 2 of 2] push: introduce --readonly to make push behave more like pull

2016-08-22 Thread Mads Kiilerich

On 08/22/2016 04:25 PM, Augie Fackler wrote:

On Sun, Aug 21, 2016 at 01:33:14AM +0200, Mads Kiilerich wrote:

# HG changeset patch
# User Mads Kiilerich 
# Date 1471735620 -7200
#  Sun Aug 21 01:27:00 2016 +0200
# Branch stable
# Node ID 48f4ae36a49929c2389bfb75064f96d16159b2d0
# Parent  7880f56ca7495a2d0365a8280935eccd1e63f0c8
push: introduce --readonly to make push behave more like pull

Hm, interesting. To clarify the use case, you've got two local
repositories, and you want to move some draft changes between them,
but because (for example) push allows a revset, it's easier to push
than to pull?

So the goal here is to "push without publishing", and if the
destination supports drafts, you want to push the draft changes as
drafts, otherwise fail? Is that right?


My primary use case is to have a push hook that will push to other 
repositories - some of them local, some of them remote.


Push has some advantages:
* I can use revsets to control what is pushed.
* Pushing is "the right model" for propagating changes - polling/pulling 
is bad.
* For a master server with "slaves" it also has some advantages that the 
master will call out to the slaves instead of allowing them to connect.
* It's so weird to creating a ssh connection to a remote server and pass 
the result of some local revset query, just to let it connect back and 
make a pull.


But the problems are:
* The primary problem is the write lock that prevents multiple 
simultaneous pushes. Sudoing to another user is a odd workaround.
* When I'm pushing I'm in charge. Then it is "wrong" that it is 
hardcoded that the remote server decides how the push modifies state on 
my server.


These problems can be addressed with this flag.


I wonder if we should have --[no-]publish flags on push, wherein the
--no-publish form would make it easy to "push these draft changes but
only if it doesn't publish them" and the --publish form would make it
easier to "push and publish these, but don't make them public unless
the push succeeds." Thoughts?


Perhaps ... but that is also the opposite of what I'm asking for and 
would not solve my use case, right? I want to push as if the receiving 
repo was non-publishing. Also, I want it to not lock.


I don't know if this feature I'm asking for most correctly is described 
as non-locking and thus readonly and not modifying state or publishing, 
or if it non-publishing and thus readonly and with no need for locking. 
It could be described either way - I just want one of them ;-)


/Mads
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 1 of 2] tests: add more test coverage of phase changes when pushing

2016-08-20 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1471734720 -7200
#  Sun Aug 21 01:12:00 2016 +0200
# Branch stable
# Node ID 7880f56ca7495a2d0365a8280935eccd1e63f0c8
# Parent  5004ef47f437b3bafdd3800551c638a0bf4bb99b
tests: add more test coverage of phase changes when pushing

Prepare for test coverage of phase updates with future push --readonly option,
both with and without actually pushing changesets.

diff --git a/tests/test-phases-exchange.t b/tests/test-phases-exchange.t
--- a/tests/test-phases-exchange.t
+++ b/tests/test-phases-exchange.t
@@ -1189,10 +1189,40 @@ 2. Test that failed phases movement are 
   cannot lock source repo, skipping local public phase update
   [1]
   $ chmod -R +w .hg
-  $ hgph Upsilon
 
   $ cd ..
 
-  $ killdaemons.py
+#endif
 
-#endif
+Test that clone behaves like pull and doesn't
+publish changesets as plain push does
+
+  $ hg -R Upsilon phase -q --force --draft 2
+  $ hg clone -q Upsilon Pi -r 7
+  $ hgph Upsilon -r 'min(draft())'
+  o  2 draft a-C - 54acac6f23ab
+  |
+  ~
+
+  $ hg -R Upsilon push Pi -r 7
+  pushing to Pi
+  searching for changes
+  no changes found
+  [1]
+  $ hgph Upsilon -r 'min(draft())'
+  o  8 draft a-F - b740e3e5c05d
+  |
+  ~
+
+  $ hg -R Upsilon push Pi -r 8
+  pushing to Pi
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 1 changesets with 1 changes to 1 files
+
+  $ hgph Upsilon -r 'min(draft())'
+  o  9 draft a-G - 3e27b6f1eee1
+  |
+  ~
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 2 of 2] push: introduce --readonly to make push behave more like pull

2016-08-20 Thread Mads Kiilerich
# HG changeset patch
# User Mads Kiilerich 
# Date 1471735620 -7200
#  Sun Aug 21 01:27:00 2016 +0200
# Branch stable
# Node ID 48f4ae36a49929c2389bfb75064f96d16159b2d0
# Parent  7880f56ca7495a2d0365a8280935eccd1e63f0c8
push: introduce --readonly to make push behave more like pull

Push and pull are almost symmetric ... except when they are not.

For example, push makes it possible to use a revset to specify the revisions to
push. Push can also be more suitable for distributing changes to multiple
mirrors than pull is. Finally, push will by default publish draft changesets.

Often, it is correct and desirable that push tries to update phases and thus
locks the repository while pushing ... put that also means that there can't be
multiple concurrent pushes and that pushing from non-publishing repositories
depends on correct configuration of the target repository.

We thus introduce --readonly to prevent push from locking the source repository
and thus prevent it from publishing. Push with --readonly is thus more
symmetric to pull.

diff --git a/mercurial/commands.py b/mercurial/commands.py
--- a/mercurial/commands.py
+++ b/mercurial/commands.py
@@ -5878,6 +5878,7 @@ def pull(ui, repo, source="default", **o
 ('b', 'branch', [],
  _('a specific branch you would like to push'), _('BRANCH')),
 ('', 'new-branch', False, _('allow pushing a new branch')),
+('', 'readonly', False, _("no source repo locking or phase update")),
 ] + remoteopts,
 _('[-f] [-r REV]... [-e CMD] [--remotecmd CMD] [DEST]'))
 def push(ui, repo, dest=None, **opts):
@@ -5886,8 +5887,9 @@ def push(ui, repo, dest=None, **opts):
 Push changesets from the local repository to the specified
 destination.
 
-This operation is symmetrical to pull: it is identical to a pull
-in the destination repository from the current one.
+This operation is almost symmetrical to pull: it is identical to a pull
+in the destination repository from the current one, except that pushed
+changesets in draft phase are made public unless --readonly is used.
 
 By default, push will not allow creation of new heads at the
 destination, since multiple heads would make it unclear which head
@@ -5969,7 +5971,8 @@ def push(ui, repo, dest=None, **opts):
 pushop = exchange.push(repo, other, opts.get('force'), revs=revs,
newbranch=opts.get('new_branch'),
bookmarks=opts.get('bookmark', ()),
-   opargs=opts.get('opargs'))
+   opargs=opts.get('opargs'),
+   readonly=opts.get('readonly'))
 
 result = not pushop.cgresult
 
diff --git a/mercurial/exchange.py b/mercurial/exchange.py
--- a/mercurial/exchange.py
+++ b/mercurial/exchange.py
@@ -275,7 +275,7 @@ class pushoperation(object):
 """
 
 def __init__(self, repo, remote, force=False, revs=None, newbranch=False,
- bookmarks=()):
+ bookmarks=(), readonly=False):
 # repo we push from
 self.repo = repo
 self.ui = repo.ui
@@ -289,6 +289,8 @@ class pushoperation(object):
 self.bookmarks = bookmarks
 # allow push of new branch
 self.newbranch = newbranch
+# no locking of source repo and no publishing phase update
+self.readonly = readonly
 # did a local lock get acquired?
 self.locallocked = None
 # step already performed
@@ -379,7 +381,7 @@ bookmsgmap = {'update': (_("updating boo
 
 
 def push(repo, remote, force=False, revs=None, newbranch=False, bookmarks=(),
- opargs=None):
+ opargs=None, readonly=False):
 '''Push outgoing changesets (limited by revs) from a local
 repository to remote. Return an integer:
   - None means nothing to push
@@ -391,7 +393,7 @@ def push(repo, remote, force=False, revs
 if opargs is None:
 opargs = {}
 pushop = pushoperation(repo, remote, force, revs, newbranch, bookmarks,
-   **opargs)
+   readonly, **opargs)
 if pushop.remote.local():
 missing = (set(pushop.repo.requirements)
- pushop.remote.local().supported)
@@ -413,23 +415,26 @@ def push(repo, remote, force=False, revs
 raise error.Abort(_("destination does not support push"))
 # get local lock as we might write phase data
 localwlock = locallock = None
-try:
-# bundle2 push may receive a reply bundle touching bookmarks or other
-# things requiring the wlock. Take it now to ensure proper ordering.
-maypushback = pushop.ui.configbool('experimental', 'bundle2.pushback')
-if _canusebundle2(pushop)

Re: [PATCH 3 of 5] branchmap: acquires lock before writting the rev branch cache

2016-08-09 Thread Mads Kiilerich
On Sun, Aug 7, 2016 at 1:21 PM, Yuya Nishihara  wrote:

> On Sat, 06 Aug 2016 01:02:55 +0200, Pierre-Yves David wrote:
> > # HG changeset patch
> > # User Pierre-Yves David 
> > # Date 1470401836 -7200
> > #  Fri Aug 05 14:57:16 2016 +0200
> > # Node ID 84885aeec2e442d18dcb70fcce2d65dd9fafbf91
> > # Parent  fa3f05a4219c4ea6b9bc9682f258e4418b36c265
> > # EXP-Topic vfsward
> > branchmap: acquires lock before writting the rev branch cache
> >
> > We now attempt to acquire a lock and write the branch cache within that
> lock.
> > This would prevent cache corruption when multiple processes try to write
> the cache
> > at the same time.
>
> (+CC Durham, Mads)
>
> I wonder if revbranchcache is designed to not take a lock. 9347c15d8136
> says
> corruption can be recovered.
>

Yes, I think it was given as an initial design constraint for acceptance
upstream that the cache had to be updated on the fly by read only
operations without locking. Hindsight perspective might have changed mine
and others memory and opinion ;-)

In my opinion, the revision branch cache should be treated exactly like the
branch head cache and updated the same way. It might be expensive to update
initially, but not doing it will end up being more expensive.

Proper transaction support would however also require that the design
constraint of doing in-place (non-atomic non-append) updates got lifted.

The existing algorithm is not perfect lockfree, but I doubt it was possible
to come up with a test case where simultaneous writes would give a
corruption that not would be detected and recovered automatically.

This change looks like a fine small improvement that will reduce the risk.
But still, like most other write locks in Mercurial, it has the problem
that data is read and checked before taking the lock and then "leaks" into
the locked zone.

I wonder but haven't investigated: How will this handle the case where the
repo can't be locked because it is on a read only filesystem? Will it get
LockHeld (which would be a misleading exception name) and skip the writing?

/Mads
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


<    1   2