D8024: profiling: flush stdout before writing profile to stderr

2020-01-27 Thread spectral (Kyle Lippincott)
spectral created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  On py3, stdout and stderr appear to be buffered and this causes my command's
  output to be intermixed with the profiling output.

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D8024

AFFECTED FILES
  mercurial/profiling.py

CHANGE DETAILS

diff --git a/mercurial/profiling.py b/mercurial/profiling.py
--- a/mercurial/profiling.py
+++ b/mercurial/profiling.py
@@ -186,6 +186,7 @@
 self._output = None
 self._fp = None
 self._fpdoclose = True
+self._flushfp = None
 self._profiler = None
 self._enabled = enabled
 self._entered = False
@@ -246,6 +247,8 @@
 else:
 self._fpdoclose = False
 self._fp = self._ui.ferr
+# Ensure we've flushed fout before writing to ferr.
+self._flushfp = self._ui.fout
 
 if proffn is not None:
 pass
@@ -265,6 +268,7 @@
 def __exit__(self, exception_type, exception_value, traceback):
 propagate = None
 if self._profiler is not None:
+self._uiflush()
 propagate = self._profiler.__exit__(
 exception_type, exception_value, traceback
 )
@@ -280,3 +284,7 @@
 def _closefp(self):
 if self._fpdoclose and self._fp is not None:
 self._fp.close()
+
+def _uiflush(self):
+if self._flushfp:
+self._flushfp.flush()



To: spectral, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D8023: chg: read CHGORIG_ values from env and handle these appropriately

2020-01-27 Thread spectral (Kyle Lippincott)
spectral created this revision.
Herald added subscribers: mercurial-devel, mjpieters.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Python3.7+ will "coerce" the locale (specifically LC_CTYPE) in many 
situations,
  and this can cause chg to not start. My previous fix in D7550 
 did not cover all
  situations correctly, but hopefully this fix does.
  
  The C side of chg will set CHGORIG_LC_CTYPE in its environment before starting
  the command server and before calling setenv on the command server.
  
  When calculating the environment hash, we use the value from CHGORIG_LC_CTYPE 
to
  calculate the hash - intentionally ignoring the modifications that Python may
  have done during command server startup.
  
  When chg calls setenv on the command server, the command server will see
  CHGORIG_LC_CTYPE in the environment-to-set, and NOT modify LC_CTYPE to be the
  same as in the environment-to-set. This preserves the modifications that 
Python
  has done during startup. We'll still calculate the hash using the
  CHGORIG_LC_CTYPE variables appropriately, so we'll detect environment changes
  (even if they don't cause a change in the actual value). Example:
  
  - LC_CTYPE=invalid_1 chg cmd
- Py3.7 sets LC_CTYPE=C.UTF-8 on Linux
- CHGORIG_LC_CTYPE=1invalid_1
- Environment hash is as-if 'LC_CTYPE=invalid_1', even though it really is 
LC_CTYPE=C.UTF-8
  - LC_CTYPE=invalid_2 chg cmd
- Connect to the existing server, call setenv
- Calculate hash as-if 'LC_CTYPE=invalid_2', even though it is identical to 
the other command server (C.UTF-8)
  
  This isn't a huge issue in practice. It can cause two separate command servers
  that are functionally identical to be executed. This should not be considered 
an
  observable/intentional effect, and is something that may change in the future.
  
  This is hopefully a more future-proof fix than the original one in D7550 
: we
  won't have to worry about behavior changes (or incorrect readings of the 
current
  behavior) in future versions of Python. If more environment variables end up
  being modified, it's a simple one line fix in chg.c to also preserve those.
  
  Important Caveat: if something causes one of these variables to change 
*inside*
  the hg serve process, we're going to end up persisting that value. Example:
  
  - Command server starts up, Python sets LC_CTYPE=C.UTF-8
  - Some extension sets LC_CTYPE=en_US.UTF-8 in the environment
  - The next invocation of chg will call setenv, saying via CHGORIG_LC_CTYPE 
that the variable should not be in the environment
  - chgserver.py will preserve LC_CTYPE=en_US.UTF-8
  
  This is quite unlikely and would previously have caused a different problem:
  
  - Command server starts up, let's assume py2 and so LC_CTYPE is unmodified
  - Some extension sets LC_CTYPE=en_US.UTF-8 in the environment
  - The next invocation of chg will call setenv, saying LC_CTYPE shouldn't be 
in the environment
  - chgserver.py will say that the environment hash doesn't match and redirect 
chg to a new server
  - chg will create that server and use that, but it'll have an identical hash 
to the previous one (since at startup LC_CTYPE isn't modified by the extension 
yet). This should be fine, it'll then run the command like normal.
  - Every time chg is run, it restarts the command server due to this issue, 
slowing everything down :)

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D8023

AFFECTED FILES
  mercurial/chgserver.py
  tests/test-chg.t

CHANGE DETAILS

diff --git a/tests/test-chg.t b/tests/test-chg.t
--- a/tests/test-chg.t
+++ b/tests/test-chg.t
@@ -332,8 +332,7 @@
   /MM/DD HH:MM:SS (PID)> log -R cached
   /MM/DD HH:MM:SS (PID)> loaded repo into cache: $TESTTMP/cached (in  ...s)
 
-Test that chg works even when python "coerces" the locale (py3.7+, which is 
done
-by default if none of LC_ALL, LC_CTYPE, or LANG are set in the environment)
+Test that chg works even when python "coerces" the locale (py3.7+)
 
   $ cat > $TESTTMP/debugenv.py < from mercurial import encoding
@@ -345,11 +344,31 @@
   > for k in [b'LC_ALL', b'LC_CTYPE', b'LANG']:
   > v = encoding.environ.get(k)
   > if v is not None:
-  > ui.write(b'%s=%s\n' % (k, encoding.environ[k]))
+  > ui.write(b'debugenv: %s=%s\n' % (k, encoding.environ[k]))
   > EOF
-  $ LANG= LC_ALL= LC_CTYPE= chg \
-  >--config extensions.debugenv=$TESTTMP/debugenv.py debugenv
-  LC_ALL=
-  LC_CTYPE=C.UTF-8 (py37 !)
-  LC_CTYPE= (no-py37 !)
-  LANG=
+  $ CHGDEBUG= LANG= LC_ALL= LC_CTYPE= chg \
+  >--config extensions.debugenv=$TESTTMP/debugenv.py debugenv 2>&1 \
+  >| egrep 'debugenv|start'
+  chg: debug: * start cmdserver at * (glob)
+  debugenv: LC_ALL=
+  debugenv: LC_CTYPE=C.UTF-8 (py37 !)
+  debugenv: LC_CTYPE= (no-py37 !)
+  debugenv: LANG=
+(Should not trigger a command 

D8021: chg: switch to using global `environ` instead of envp from main

2020-01-27 Thread spectral (Kyle Lippincott)
spectral created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  My followup patch wants to modify the environment before we call setenv on the
  chg server process (to add items), and that won't be handled properly if we 
use
  envp (which isn't modified by putenv).

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D8021

AFFECTED FILES
  contrib/chg/chg.c
  contrib/chg/hgclient.c
  contrib/chg/hgclient.h

CHANGE DETAILS

diff --git a/contrib/chg/hgclient.h b/contrib/chg/hgclient.h
--- a/contrib/chg/hgclient.h
+++ b/contrib/chg/hgclient.h
@@ -25,6 +25,6 @@
   size_t argsize);
 int hgc_runcommand(hgclient_t *hgc, const char *const args[], size_t argsize);
 void hgc_attachio(hgclient_t *hgc);
-void hgc_setenv(hgclient_t *hgc, const char *const envp[]);
+void hgc_setenv(hgclient_t *hgc);
 
 #endif /* HGCLIENT_H_ */
diff --git a/contrib/chg/hgclient.c b/contrib/chg/hgclient.c
--- a/contrib/chg/hgclient.c
+++ b/contrib/chg/hgclient.c
@@ -26,6 +26,8 @@
 #include "procutil.h"
 #include "util.h"
 
+extern char **environ;
+
 enum { CAP_GETENCODING = 0x0001,
CAP_RUNCOMMAND = 0x0002,
/* cHg extension: */
@@ -644,12 +646,12 @@
  * @param envp  list of environment variables in "NAME=VALUE" format,
  *  terminated by NULL.
  */
-void hgc_setenv(hgclient_t *hgc, const char *const envp[])
+void hgc_setenv(hgclient_t *hgc)
 {
-   assert(hgc && envp);
+   assert(hgc && environ);
if (!(hgc->capflags & CAP_SETENV)) {
return;
}
-   packcmdargs(>ctx, envp, /*argsize*/ -1);
+   packcmdargs(>ctx, (const char**)environ, /*argsize*/ -1);
writeblockrequest(hgc, "setenv");
 }
diff --git a/contrib/chg/chg.c b/contrib/chg/chg.c
--- a/contrib/chg/chg.c
+++ b/contrib/chg/chg.c
@@ -30,6 +30,8 @@
 #define PATH_MAX 4096
 #endif
 
+extern char **environ;
+
 struct cmdserveropts {
char sockname[PATH_MAX];
char initsockname[PATH_MAX];
@@ -394,7 +396,7 @@
abortmsgerrno("failed to exec original hg");
 }
 
-int main(int argc, const char *argv[], const char *envp[])
+int main(int argc, const char *argv[])
 {
if (getenv("CHGDEBUG"))
enabledebugmsg();
@@ -429,7 +431,7 @@
hgc = connectcmdserver();
if (!hgc)
abortmsg("cannot open hg client");
-   hgc_setenv(hgc, envp);
+   hgc_setenv(hgc);
const char **insts = hgc_validate(hgc, argv + 1, argc - 1);
int needreconnect = runinstructions(, insts);
free(insts);



To: spectral, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D8022: chg: pass copies of some envvars so we can detect py37+ modifications

2020-01-27 Thread spectral (Kyle Lippincott)
spectral created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Python3.7+ will "coerce" some variables in the environment to different 
values.
  Currently this is just LC_CTYPE, which will be coerced to the first of
  ["C.UTF-8", "C.utf8", "UTF-8"] that works on the platform if LC_CTYPE is
  interpreted as "C" (such as by being set to "C", or having LC_ALL set to "C" 
or
  other mechanisms) or is set to an invalid value for the platform (such as
  "LC_CTYPE=UTF-8" on Linux).
  
  This is a second attempt at the fix I did in D7550 
. That fix was based off of an
  incorrect reading of the code in Python - I had misinterpreted a #if block 
that
  dealt with Android as applying in all situations, and missed some cases where
  this would happen and cause issues.

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D8022

AFFECTED FILES
  contrib/chg/chg.c

CHANGE DETAILS

diff --git a/contrib/chg/chg.c b/contrib/chg/chg.c
--- a/contrib/chg/chg.c
+++ b/contrib/chg/chg.c
@@ -203,6 +203,39 @@
return hgcmd;
 }
 
+static void putsidechannelenv(const char *envname)
+{
+   char *buf, *bufp;
+   const char *existing_val = getenv(envname);
+   size_t existing_val_len = 0;
+   size_t envname_len = strlen(envname);
+   size_t buf_len;
+
+   if (existing_val != NULL) {
+   existing_val_len = strlen(existing_val);
+   }
+
+/* 11: 8 for CHGORIG_, 1 for the =, 1 for the "exists" flag, 1 for the
+ * nul */
+   buf_len = envname_len + existing_val_len + 11;
+   bufp = buf = mallocx(buf_len);
+   strlcpy(bufp, "CHGORIG_", 8);
+   bufp += 8;  /* strlen("CHGORIG_") */
+   strlcpy(bufp, envname, envname_len);
+   bufp += envname_len;
+   *bufp++ = '=';
+   *bufp++ = (existing_val == NULL) ? '0' : '1';
+   if (existing_val != NULL && existing_val_len > 0) {
+   strlcpy(bufp, existing_val, existing_val_len);
+   bufp += existing_val_len;
+   }
+   *bufp = '\0';
+   if (putenv(buf) != 0) {
+   abortmsgerrno("failed to putenv for stored vars");
+}
+/* don't free `buf`, putenv does NOT copy the string! */
+}
+
 static void execcmdserver(const struct cmdserveropts *opts)
 {
const char *hgcmd = gethgcmd();
@@ -410,6 +443,14 @@
 "wrapper to chg. Alternatively, set $CHGHG to the "
 "path of real hg.");
 
+   char *coerce = getenv("PYTHONCOERCECLOCALE");
+   if (coerce == NULL || coerce[0] != '0') {
+   /* Py3 modifies the environment on execution, we need a side
+* channel to get an unmodified environment to `hg serve` for
+* some variables */
+   putsidechannelenv("LC_CTYPE");
+   }
+
if (isunsupported(argc - 1, argv + 1))
execoriginalhg(argv);
 



To: spectral, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7450: packaging: add support for PyOxidizer

2020-01-27 Thread mharbison72 (Matt Harbison)
mharbison72 added a comment.


  In D7450#118284 , @martinvonz 
wrote:
  
  > I'd be very happy to queue this as soon as @mharbison72's request for 
Windows has been addressed (I think that was all, right?).
  
  Yes.
  
  I'm still curious about the *.py-as-resources issue[1], but don't want this 
held up.
  
  [1] https://phab.mercurial-scm.org/D7450#118094

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7450/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7450

To: indygreg, #hg-reviewers, mharbison72
Cc: marmoute, mharbison72, martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: Fixing default behavior of bookmarks.

2020-01-27 Thread Valentin Gatien-Baron
Hi,

I've finally found some time to reply to this.

I couldn't find a place that explains concisely the behavior of
bookmarks, which I thought would be useful to think through this.  I
included some notes [1], if that's useful to someone else.

A tiny bit of context on our use of bookmarks. People put bookmarks on
the tip of their work in progress, and push them to a central server.
So that server sees a lot of bookmarks come, move, and go. People
manipulate not just the bookmarks they create, but also regularly
bookmarks holding work they have to review.


I don't think I have seen the problem you linked to about `hg update`,
but the description is terse, so I don't think I understand what is
the series of steps that leads to the bad behavior.

The problems we do get are as follows, from roughly most problematic to
least problematic (if we didn't work around them).

No way to request that a push must push a bookmark (or fail). `push
-r` pushes the revision pointed to by the bookmark (and may or may not
send the bookmark too), `push -B` overwrites whatever is on the other
side. This is https://phab.mercurial-scm.org/D6776 .
We work around that one by making `push -r bookmark` fail if the
bookmark can't be pushed, but I'd certainly prefer to have something
nice in hg itself.

`pull` creates an accumulation of bookmark locally: if a remote
contains bookmarks {a} at time t0, {b} at time t1, {c} at time t2
(which is expected for short lived bookmarks), any repo that pulled at
t0, t1 and t2 will have bookmarks {a, b, c} locally.
For people, this also means that if a bookmark moved sideways in the
remote, any pull from now on gets a message "created divergent
bookmark foo@default", whether they know anything about foo or not.
For tools (CI, code review, backups), this is merely awkward: you can
workaround this by replacing "hg pull" by something like "set -o
pipefail; rm -f .hg/bookmarks; hg pull | { grep -v "added bookmark" ||
true; }".
We work around this by having pull delete extra bookmarks
(specifically, any local bookmark that points to a non outgoing
revision). This probably only works in a centralized setup.

This notion of current-bookmark (as opposed to active-bookmark) is
awkward. `hg pull` interferes with `hg commit` for instance, because
`hg pull` may change the active bookmark into being merely current,
which `commit` does not consider. We have to account for this when we
pull in people's repositories, and in particular this behavior makes
me think that [share -B] (or the bookmarks-in-store requirement)
should be error prone (pull in one share, mess up commit in other
share).
No workaround there. I've contemplated making hg consider that the
active bookmark is at `wdir()` instead of `p1()`, such that pull
cannot move the active bookmark forward. That may be an improvement,
but wouldn't help with `share -B`. I think `git` handles this by
making it so that, in hg terminology, the clone knows about the active
bookmark of all shares.

`push` moves all bookmarks that can be moved forward. I don't know if
this is for consistency with `push` pushing all the heads by default,
but it sounds like a recipe for pushing unintended things.
We only push the active bookmark or bookmarks passed at the command
line (and we don't push all heads either).

Pushing a bookmark but no commits results in hg printing "no changes
found" and exiting code 1 (which is not specific to bookmarks, pushing
merely phases or obsolescence markers behaves the same). This confuses
people regularly (we may have some changes to this behavior, but I
think it's confusing even as described here). The exit 1 is annoying
for automation (can't trust that exit 1 means no commit, as rejections
from server hooks can also result in exit 1 IIRC).

In `pull` or `clone`, the "added this bookmark" or "updating that
bookmark" prints are extremely spammy (with many bookmarks, and
frequent changes), whether interactively or in logs of tools.
We remove all these prints (and in fact others like "searching for
changes" or "adding changesets").


Valentin




[1]
My notes about what hg does.

- hg pull
  - add all remote-only bookmarks locally
  - for bookmarks on both sides
- if remote rev is not pulled, do nothing
- if remote rev is pulled and is ahead of local, advance bookmark
- otherwise create conflicting bookmark
  - for local-only bookmarks, do nothing
  - the active bookmark may be advanced, becoming current-but-not-active
  - print the state of each bookmark
- hg commit
  - advances active bookmark
  - sometimes delete conflicting bookmark, on merge conflict
- hg merge
  - without -r, try to merge with the @default bookmark if any
- hg clone
  - copies all the bookmarks
  - print the state of each bookmark
- hg share
  - does not copy the bookmarks
  - but with -B, all shares share the bookmarks (or similarly, the
option to make bookmarks be part of the store)
- hg push
  - for all bookmarks that exist on both sides, if local rev 

D7450: packaging: add support for PyOxidizer

2020-01-27 Thread martinvonz (Martin von Zweigbergk)
martinvonz added a comment.


  I'd be very happy to queue this as soon as @mharbison72's request for Windows 
has been addressed (I think that was all, right?). We (Google) would like to 
use it for macOS packaging too. We may also end up using it on Linux so we 
don't break when the system `python3` changes from 3.7 of 3.8 (because we 
currently install `.so` files specific to a minor Python version, IIUC). Thanks 
a lot for your work on this, @indygreg. And thanks for reviewing and testing 
it, @mharbison72.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7450/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7450

To: indygreg, #hg-reviewers, mharbison72
Cc: marmoute, mharbison72, martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D8020: rust: remove an unnecessary set of parentheses

2020-01-27 Thread martinvonz (Martin von Zweigbergk)
martinvonz created this revision.
Herald added subscribers: mercurial-devel, kevincox.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  My build complained about this. I guess it started after I upgraded
  rustc.

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D8020

AFFECTED FILES
  rust/hg-core/src/dirstate/status.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/dirstate/status.rs 
b/rust/hg-core/src/dirstate/status.rs
--- a/rust/hg-core/src/dirstate/status.rs
+++ b/rust/hg-core/src/dirstate/status.rs
@@ -272,7 +272,7 @@
 
 pub fn status<'a: 'c, 'b: 'c, 'c>(
 dmap: &'a DirstateMap,
-matcher: &'b (impl Matcher),
+matcher: &'b impl Matcher,
 root_dir: impl AsRef + Sync + Send + Copy,
 list_clean: bool,
 last_normal_time: i64,



To: martinvonz, #hg-reviewers
Cc: kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D8019: rust-node: avoid meaningless read at the end of odd prefix

2020-01-27 Thread gracinet (Georges Racinet)
gracinet created this revision.
Herald added subscribers: mercurial-devel, kevincox.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  This should be heavily factored out by the CPU branch predictor
  anyway.

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D8019

AFFECTED FILES
  rust/hg-core/src/revlog/node.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/revlog/node.rs b/rust/hg-core/src/revlog/node.rs
--- a/rust/hg-core/src/revlog/node.rs
+++ b/rust/hg-core/src/revlog/node.rs
@@ -223,6 +223,7 @@
 /// This is also the `i`th hexadecimal digit in numeric form,
 /// also called a [nybble](https://en.wikipedia.org/wiki/Nibble).
 pub fn get_nybble(, i: usize) -> u8 {
+assert!(i < self.len());
 get_nybble(self.buf, i)
 }
 }



To: gracinet, #hg-reviewers
Cc: kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7794: rust-nodemap: generic NodeTreeVisitor

2020-01-27 Thread gracinet (Georges Racinet)
Closed by commit rHG796d05f3fa84: rust-nodemap: generic NodeTreeVisitor 
(authored by gracinet).
gracinet marked an inline comment as done.
This revision was automatically updated to reflect the committed changes.
This revision was not accepted when it landed; it landed in state "Needs 
Review".

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7794?vs=19634=19647

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7794/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7794

AFFECTED FILES
  rust/hg-core/src/revlog/nodemap.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/revlog/nodemap.rs 
b/rust/hg-core/src/revlog/nodemap.rs
--- a/rust/hg-core/src/revlog/nodemap.rs
+++ b/rust/hg-core/src/revlog/nodemap.rs
@@ -272,17 +272,82 @@
 ,
 prefix: NodePrefixRef<'p>,
 ) -> Result, NodeMapError> {
-let mut visit = self.len() - 1;
-for i in 0..prefix.len() {
-let nybble = prefix.get_nybble(i);
-match self[visit].get(nybble) {
-Element::None => return Ok(None),
-Element::Rev(r) => return Ok(Some(r)),
-Element::Block(idx) => visit = idx,
+for visit_item in self.visit(prefix) {
+if let Some(opt) = visit_item.final_revision() {
+return Ok(opt);
 }
 }
 Err(NodeMapError::MultipleResults)
 }
+
+fn visit<'n, 'p>(
+&'n self,
+prefix: NodePrefixRef<'p>,
+) -> NodeTreeVisitor<'n, 'p> {
+NodeTreeVisitor {
+nt: self,
+prefix: prefix,
+visit: self.len() - 1,
+nybble_idx: 0,
+done: false,
+}
+}
+}
+
+struct NodeTreeVisitor<'n, 'p> {
+nt: &'n NodeTree,
+prefix: NodePrefixRef<'p>,
+visit: usize,
+nybble_idx: usize,
+done: bool,
+}
+
+#[derive(Debug, PartialEq, Clone)]
+struct NodeTreeVisitItem {
+block_idx: usize,
+nybble: u8,
+element: Element,
+}
+
+impl<'n, 'p> Iterator for NodeTreeVisitor<'n, 'p> {
+type Item = NodeTreeVisitItem;
+
+fn next( self) -> Option {
+if self.done || self.nybble_idx >= self.prefix.len() {
+return None;
+}
+
+let nybble = self.prefix.get_nybble(self.nybble_idx);
+self.nybble_idx += 1;
+
+let visit = self.visit;
+let element = self.nt[visit].get(nybble);
+if let Element::Block(idx) = element {
+self.visit = idx;
+} else {
+self.done = true;
+}
+
+Some(NodeTreeVisitItem {
+block_idx: visit,
+nybble: nybble,
+element: element,
+})
+}
+}
+
+impl NodeTreeVisitItem {
+// Return `Some(opt)` if this item is final, with `opt` being the
+// `Revision` that it may represent.
+//
+// If the item is not terminal, return `None`
+fn final_revision() -> Option> {
+match self.element {
+Element::Block(_) => None,
+Element::Rev(r) => Some(Some(r)),
+Element::None => Some(None),
+}
+}
 }
 
 impl From> for NodeTree {



To: gracinet, #hg-reviewers, kevincox, martinvonz
Cc: martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7791: rust-nodemap: NodeMap trait with simplest implementation

2020-01-27 Thread gracinet (Georges Racinet)
Closed by commit rHGe52401a95b94: rust-nodemap: NodeMap trait with simplest 
implementation (authored by gracinet).
gracinet marked 2 inline comments as done.
This revision was automatically updated to reflect the committed changes.
This revision was not accepted when it landed; it landed in state "Needs 
Review".

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7791?vs=19631=19644

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7791/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7791

AFFECTED FILES
  rust/hg-core/src/revlog/node.rs
  rust/hg-core/src/revlog/nodemap.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/revlog/nodemap.rs 
b/rust/hg-core/src/revlog/nodemap.rs
--- a/rust/hg-core/src/revlog/nodemap.rs
+++ b/rust/hg-core/src/revlog/nodemap.rs
@@ -12,8 +12,88 @@
 //! Following existing implicit conventions, the "nodemap" terminology
 //! is used in a more abstract context.
 
-use super::Revision;
+use super::{
+Node, NodeError, NodePrefix, NodePrefixRef, Revision, RevlogIndex,
+};
 use std::fmt;
+use std::ops::Deref;
+
+#[derive(Debug, PartialEq)]
+pub enum NodeMapError {
+MultipleResults,
+InvalidNodePrefix(NodeError),
+/// A `Revision` stored in the nodemap could not be found in the index
+RevisionNotInIndex(Revision),
+}
+
+impl From for NodeMapError {
+fn from(err: NodeError) -> Self {
+NodeMapError::InvalidNodePrefix(err)
+}
+}
+
+/// Mapping system from Mercurial nodes to revision numbers.
+///
+/// ## `RevlogIndex` and `NodeMap`
+///
+/// One way to think about their relationship is that
+/// the `NodeMap` is a prefix-oriented reverse index of the `Node` information
+/// carried by a [`RevlogIndex`].
+///
+/// Many of the methods in this trait take a `RevlogIndex` argument
+/// which is used for validation of their results. This index must naturally
+/// be the one the `NodeMap` is about, and it must be consistent.
+///
+/// Notably, the `NodeMap` must not store
+/// information about more `Revision` values than there are in the index.
+/// In these methods, an encountered `Revision` is not in the index, a
+/// [`RevisionNotInIndex`] error is returned.
+///
+/// In insert operations, the rule is thus that the `NodeMap` must always
+/// be updated after the `RevlogIndex`
+/// be updated first, and the `NodeMap` second.
+///
+/// [`RevisionNotInIndex`]: enum.NodeMapError.html#variant.RevisionNotInIndex
+/// [`RevlogIndex`]: ../trait.RevlogIndex.html
+pub trait NodeMap {
+/// Find the unique `Revision` having the given `Node`
+///
+/// If no Revision matches the given `Node`, `Ok(None)` is returned.
+fn find_node(
+,
+index:  RevlogIndex,
+node: ,
+) -> Result, NodeMapError> {
+self.find_bin(index, node.into())
+}
+
+/// Find the unique Revision whose `Node` starts with a given binary prefix
+///
+/// If no Revision matches the given prefix, `Ok(None)` is returned.
+///
+/// If several Revisions match the given prefix, a [`MultipleResults`]
+/// error is returned.
+fn find_bin<'a>(
+,
+idx:  RevlogIndex,
+prefix: NodePrefixRef<'a>,
+) -> Result, NodeMapError>;
+
+/// Find the unique Revision whose `Node` hexadecimal string representation
+/// starts with a given prefix
+///
+/// If no Revision matches the given prefix, `Ok(None)` is returned.
+///
+/// If several Revisions match the given prefix, a [`MultipleResults`]
+/// error is returned.
+fn find_hex(
+,
+idx:  RevlogIndex,
+prefix: ,
+) -> Result, NodeMapError> {
+self.find_bin(idx, NodePrefix::from_hex(prefix)?.borrow())
+}
+}
 
 /// Low level NodeTree [`Blocks`] elements
 ///
@@ -110,9 +190,86 @@
 }
 }
 
+/// A 16-radix tree with the root block at the end
+pub struct NodeTree {
+readonly: Box + Send>,
+}
+
+/// Return `None` unless the `Node` for `rev` has given prefix in `index`.
+fn has_prefix_or_none<'p>(
+idx:  RevlogIndex,
+prefix: NodePrefixRef<'p>,
+rev: Revision,
+) -> Result, NodeMapError> {
+idx.node(rev)
+.ok_or_else(|| NodeMapError::RevisionNotInIndex(rev))
+.map(|node| {
+if prefix.is_prefix_of(node) {
+Some(rev)
+} else {
+None
+}
+})
+}
+
+impl NodeTree {
+/// Main working method for `NodeTree` searches
+///
+/// This partial implementation lacks special cases for NULL_REVISION
+fn lookup<'p>(
+,
+prefix: NodePrefixRef<'p>,
+) -> Result, NodeMapError> {
+let blocks: &[Block] = &*self.readonly;
+if blocks.is_empty() {
+return Ok(None);
+}
+let mut visit = blocks.len() - 1;
+for i in 0..prefix.len() {
+let nybble = prefix.get_nybble(i);
+match blocks[visit].get(nybble) {
+Element::None => return Ok(None),
+  

D7793: rust-nodemap: mutable NodeTree data structure

2020-01-27 Thread gracinet (Georges Racinet)
Closed by commit rHGa19331456d48: rust-nodemap: mutable NodeTree data structure 
(authored by gracinet).
This revision was automatically updated to reflect the committed changes.
This revision was not accepted when it landed; it landed in state "Needs 
Review".

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7793?vs=19633=19646

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7793/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7793

AFFECTED FILES
  rust/hg-core/src/revlog/nodemap.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/revlog/nodemap.rs 
b/rust/hg-core/src/revlog/nodemap.rs
--- a/rust/hg-core/src/revlog/nodemap.rs
+++ b/rust/hg-core/src/revlog/nodemap.rs
@@ -191,16 +191,31 @@
 }
 }
 
-/// A 16-radix tree with the root block at the end
+/// A mutable 16-radix tree with the root block logically at the end
+///
+/// Because of the append only nature of our node trees, we need to
+/// keep the original untouched and store new blocks separately.
+///
+/// The mutable root `Block` is kept apart so that we don't have to rebump
+/// it on each insertion.
 pub struct NodeTree {
 readonly: Box + Send>,
+growable: Vec,
+root: Block,
 }
 
 impl Index for NodeTree {
 type Output = Block;
 
 fn index(, i: usize) ->  {
-[i]
+let ro_len = self.readonly.len();
+if i < ro_len {
+[i]
+} else if i == ro_len + self.growable.len() {
+
+} else {
+[i - ro_len]
+}
 }
 }
 
@@ -222,12 +237,32 @@
 }
 
 impl NodeTree {
-fn len() -> usize {
-self.readonly.len()
+/// Initiate a NodeTree from an immutable slice-like of `Block`
+///
+/// We keep `readonly` and clone its root block if it isn't empty.
+fn new(readonly: Box + Send>) -> Self {
+let root = readonly
+.last()
+.map(|b| b.clone())
+.unwrap_or_else(|| Block::new());
+NodeTree {
+readonly: readonly,
+growable: Vec::new(),
+root: root,
+}
 }
 
+/// Total number of blocks
+fn len() -> usize {
+self.readonly.len() + self.growable.len() + 1
+}
+
+/// Implemented for completeness
+///
+/// A `NodeTree` always has at least the mutable root block.
+#[allow(dead_code)]
 fn is_empty() -> bool {
-self.len() == 0
+false
 }
 
 /// Main working method for `NodeTree` searches
@@ -237,9 +272,6 @@
 ,
 prefix: NodePrefixRef<'p>,
 ) -> Result, NodeMapError> {
-if self.is_empty() {
-return Ok(None);
-}
 let mut visit = self.len() - 1;
 for i in 0..prefix.len() {
 let nybble = prefix.get_nybble(i);
@@ -255,16 +287,18 @@
 
 impl From> for NodeTree {
 fn from(vec: Vec) -> Self {
-NodeTree {
-readonly: Box::new(vec),
-}
+Self::new(Box::new(vec))
 }
 }
 
 impl fmt::Debug for NodeTree {
 fn fmt(, f:  fmt::Formatter<'_>) -> fmt::Result {
-let blocks: &[Block] = &*self.readonly;
-write!(f, "readonly: {:?}", blocks)
+let readonly: &[Block] = &*self.readonly;
+write!(
+f,
+"readonly: {:?}, growable: {:?}, root: {:?}",
+readonly, self.growable, self.root
+)
 }
 }
 
@@ -365,7 +399,9 @@
 assert_eq!(
 format!("{:?}", nt),
 "readonly: \
- [{0: Rev(9)}, {0: Rev(0), 1: Rev(9)}, {0: Block(1), 1: Rev(1)}]"
+ [{0: Rev(9)}, {0: Rev(0), 1: Rev(9)}, {0: Block(1), 1: Rev(1)}], \
+ growable: [], \
+ root: {0: Block(1), 1: Rev(1)}",
 );
 }
 
@@ -374,7 +410,7 @@
 let mut idx: TestIndex = HashMap::new();
 pad_insert( idx, 1, "1234deadcafe");
 
-let nt = NodeTree::from(vec![block![1: Rev(1)]]);
+let nt = NodeTree::from(vec![block! {1: Rev(1)}]);
 assert_eq!(nt.find_hex(, "1")?, Some(1));
 assert_eq!(nt.find_hex(, "12")?, Some(1));
 assert_eq!(nt.find_hex(, "1234de")?, Some(1));
@@ -401,4 +437,25 @@
 assert_eq!(nt.find_hex(, "00"), Ok(Some(0)));
 assert_eq!(nt.find_hex(, "00a"), Ok(Some(0)));
 }
+
+#[test]
+fn test_mutated_find() -> Result<(), NodeMapError> {
+let mut idx = TestIndex::new();
+pad_insert( idx, 9, "012");
+pad_insert( idx, 0, "00a");
+pad_insert( idx, 2, "cafe");
+pad_insert( idx, 3, "15");
+pad_insert( idx, 1, "10");
+
+let nt = NodeTree {
+readonly: sample_nodetree().readonly,
+growable: vec![block![0: Rev(1), 5: Rev(3)]],
+root: block![0: Block(1), 1:Block(3), 12: Rev(2)],
+};
+assert_eq!(nt.find_hex(, "10")?, Some(1));
+assert_eq!(nt.find_hex(, "c")?, Some(2));
+assert_eq!(nt.find_hex(, "00")?, Some(0));
+assert_eq!(nt.find_hex(, "01")?, 

D7792: rust-nodemap: abstracting the indexing

2020-01-27 Thread gracinet (Georges Racinet)
Closed by commit rHG220d4d2e3185: rust-nodemap: abstracting the indexing 
(authored by gracinet).
gracinet marked an inline comment as done.
This revision was automatically updated to reflect the committed changes.
This revision was not accepted when it landed; it landed in state "Needs 
Review".

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7792?vs=19632=19645

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7792/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7792

AFFECTED FILES
  rust/hg-core/src/revlog/nodemap.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/revlog/nodemap.rs 
b/rust/hg-core/src/revlog/nodemap.rs
--- a/rust/hg-core/src/revlog/nodemap.rs
+++ b/rust/hg-core/src/revlog/nodemap.rs
@@ -17,6 +17,7 @@
 };
 use std::fmt;
 use std::ops::Deref;
+use std::ops::Index;
 
 #[derive(Debug, PartialEq)]
 pub enum NodeMapError {
@@ -195,6 +196,14 @@
 readonly: Box + Send>,
 }
 
+impl Index for NodeTree {
+type Output = Block;
+
+fn index(, i: usize) ->  {
+[i]
+}
+}
+
 /// Return `None` unless the `Node` for `rev` has given prefix in `index`.
 fn has_prefix_or_none<'p>(
 idx:  RevlogIndex,
@@ -213,6 +222,14 @@
 }
 
 impl NodeTree {
+fn len() -> usize {
+self.readonly.len()
+}
+
+fn is_empty() -> bool {
+self.len() == 0
+}
+
 /// Main working method for `NodeTree` searches
 ///
 /// This partial implementation lacks special cases for NULL_REVISION
@@ -220,14 +237,13 @@
 ,
 prefix: NodePrefixRef<'p>,
 ) -> Result, NodeMapError> {
-let blocks: &[Block] = &*self.readonly;
-if blocks.is_empty() {
+if self.is_empty() {
 return Ok(None);
 }
-let mut visit = blocks.len() - 1;
+let mut visit = self.len() - 1;
 for i in 0..prefix.len() {
 let nybble = prefix.get_nybble(i);
-match blocks[visit].get(nybble) {
+match self[visit].get(nybble) {
 Element::None => return Ok(None),
 Element::Rev(r) => return Ok(Some(r)),
 Element::Block(idx) => visit = idx,



To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7790: rust-node: handling binary Node prefix

2020-01-27 Thread martinvonz (Martin von Zweigbergk)
martinvonz added inline comments.

INLINE COMMENTS

> gracinet wrote in node.rs:226
> yes, indeed.
> 
> current callers from `nodemap` work either on full Nodes or (the visitor) by 
> explicitely bounding with `prefix.len()`, but it's safer to protect it 
> inconditonally.
> 
> I think a simple `assert!` is enough though: that's already what slicing does 
> anyway.

Yes, an `assert!` is probably what I had in mind. I've already queued this, so 
please send in a follow-up.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7790/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7790

To: gracinet, #hg-reviewers, martinvonz
Cc: martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7794: rust-nodemap: generic NodeTreeVisitor

2020-01-27 Thread gracinet (Georges Racinet)
gracinet added inline comments.
gracinet marked 2 inline comments as done.

INLINE COMMENTS

> martinvonz wrote in nodemap.rs:300
> would something like `block_index` be clearer?

I found it to be clearer for the emitter, but in the iterator implementation, 
it expresses what's to be done next, same as in many cases I've seeen in 
Mercurial

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7794/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7794

To: gracinet, #hg-reviewers, kevincox, martinvonz
Cc: martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7790: rust-node: handling binary Node prefix

2020-01-27 Thread gracinet (Georges Racinet)
gracinet added inline comments.

INLINE COMMENTS

> martinvonz wrote in node.rs:226
> Should we check here that `i < self.len()`? I'm especially thinking of the 
> case of an odd-length prefix where we would otherwise silently return 0.

yes, indeed.

current callers from `nodemap` work either on full Nodes or (the visitor) by 
explicitely bounding with `prefix.len()`, but it's safer to protect it 
inconditonally.

I think a simple `assert!` is enough though: that's already what slicing does 
anyway.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7790/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7790

To: gracinet, #hg-reviewers, martinvonz
Cc: martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7794: rust-nodemap: generic NodeTreeVisitor

2020-01-27 Thread martinvonz (Martin von Zweigbergk)
martinvonz added inline comments.
martinvonz accepted this revision.

INLINE COMMENTS

> nodemap.rs:300
> +prefix: NodePrefixRef<'p>,
> +visit: usize,
> +nybble_idx: usize,

would something like `block_index` be clearer?

> nodemap.rs:312
> +
> +impl<'n, 'p> Iterator for NodeTreeVisitor<'n, 'p> {
> +type Item = NodeTreeVisitItem;

can these lifetimes be anonymous?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7794/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7794

To: gracinet, #hg-reviewers, kevincox, martinvonz
Cc: martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7791: rust-nodemap: NodeMap trait with simplest implementation

2020-01-27 Thread martinvonz (Martin von Zweigbergk)
martinvonz added inline comments.

INLINE COMMENTS

> gracinet wrote in nodemap.rs:153
> Perhaps a better name would be better than this `has_` that indeed feels 
> boolean?
> 
> `check_prefix`?  `confirm` ?
> 
> Previous naming was `validate_candidate`, but that very same name is used at 
> the end of the series when it becomes involved with `NULL_NODE` etc. Then 
> this `has_prefix_or_none` becomes one step in the final verification.
> 
> Returning a `bool` would mean adding a closure to the callers. making it less 
> obvious that they are just chaining the maturation of the response.
> 
> I'm leaving unchanged for now, not sure what the best is. Still note that 
> this is a private function, there's no risk an external caller would be 
> confused by what it does.

Yes, `validate_candidate` sounds much better. Could you send a follow-up with a 
better name for it?

> gracinet wrote in nodemap.rs:205
> Removed

Still there? Or maybe it's a different one, but I think you can remove the 
`<'_>` from the `fm::Formatter<'_>`.

> nodemap.rs:76
> +/// error is returned.
> +fn find_bin<'a>(
> +,

can the explicit lifetime be dropped?

> nodemap.rs:199
> +/// Return `None` unless the `Node` for `rev` has given prefix in `index`.
> +fn has_prefix_or_none<'p>(
> +idx:  RevlogIndex,

can the explicit lifetime be dropped?

> nodemap.rs:219
> +/// This partial implementation lacks special cases for NULL_REVISION
> +fn lookup<'p>(
> +,

can the explicit lifetime be dropped?

> nodemap.rs:256
> +impl NodeMap for NodeTree {
> +fn find_bin<'a>(
> +,

can the explicit lifetime be dropped?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7791/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7791

To: gracinet, #hg-reviewers, kevincox
Cc: Alphare, martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7790: rust-node: handling binary Node prefix

2020-01-27 Thread gracinet (Georges Racinet)
Closed by commit rHG9896a8d0d3d2: rust-node: handling binary Node prefix 
(authored by gracinet).
This revision was automatically updated to reflect the committed changes.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7790?vs=19630=19643

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7790/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7790

AFFECTED FILES
  rust/hg-core/src/revlog.rs
  rust/hg-core/src/revlog/node.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/revlog/node.rs b/rust/hg-core/src/revlog/node.rs
--- a/rust/hg-core/src/revlog/node.rs
+++ b/rust/hg-core/src/revlog/node.rs
@@ -62,6 +62,7 @@
 #[derive(Debug, PartialEq)]
 pub enum NodeError {
 ExactLengthRequired(usize, String),
+PrefixTooLong(String),
 HexError(FromHexError, String),
 }
 
@@ -119,17 +120,119 @@
 }
 }
 
-impl From<(FromHexError, )> for NodeError {
-fn from(err_offender: (FromHexError, )) -> Self {
+impl> From<(FromHexError, T)> for NodeError {
+fn from(err_offender: (FromHexError, T)) -> Self {
 let (err, offender) = err_offender;
 match err {
 FromHexError::InvalidStringLength => {
 NodeError::ExactLengthRequired(
 NODE_NYBBLES_LENGTH,
-offender.to_string(),
+offender.as_ref().to_owned(),
 )
 }
-_ => NodeError::HexError(err, offender.to_string()),
+_ => NodeError::HexError(err, offender.as_ref().to_owned()),
+}
+}
+}
+
+/// The beginning of a binary revision SHA.
+///
+/// Since it can potentially come from an hexadecimal representation with
+/// odd length, it needs to carry around whether the last 4 bits are relevant
+/// or not.
+#[derive(Debug, PartialEq)]
+pub struct NodePrefix {
+buf: Vec,
+is_odd: bool,
+}
+
+impl NodePrefix {
+/// Convert from hexadecimal string representation
+///
+/// Similarly to `hex::decode`, can be used with Unicode string types
+/// (`String`, ``) as well as bytes.
+///
+/// To be used in FFI and I/O only, in order to facilitate future
+/// changes of hash format.
+pub fn from_hex(hex: impl AsRef<[u8]>) -> Result {
+let hex = hex.as_ref();
+let len = hex.len();
+if len > NODE_NYBBLES_LENGTH {
+return Err(NodeError::PrefixTooLong(
+String::from_utf8_lossy(hex).to_owned().to_string(),
+));
+}
+
+let is_odd = len % 2 == 1;
+let even_part = if is_odd { [..len - 1] } else { hex };
+let mut buf: Vec = Vec::from_hex(_part)
+.map_err(|e| (e, String::from_utf8_lossy(hex)))?;
+
+if is_odd {
+let latest_char = char::from(hex[len - 1]);
+let latest_nybble = latest_char.to_digit(16).ok_or_else(|| {
+(
+FromHexError::InvalidHexCharacter {
+c: latest_char,
+index: len - 1,
+},
+String::from_utf8_lossy(hex),
+)
+})? as u8;
+buf.push(latest_nybble << 4);
+}
+Ok(NodePrefix { buf, is_odd })
+}
+
+pub fn borrow() -> NodePrefixRef {
+NodePrefixRef {
+buf: ,
+is_odd: self.is_odd,
+}
+}
+}
+
+#[derive(Clone, Debug, PartialEq)]
+pub struct NodePrefixRef<'a> {
+buf: &'a [u8],
+is_odd: bool,
+}
+
+impl<'a> NodePrefixRef<'a> {
+pub fn len() -> usize {
+if self.is_odd {
+self.buf.len() * 2 - 1
+} else {
+self.buf.len() * 2
+}
+}
+
+pub fn is_prefix_of(, node: ) -> bool {
+if self.is_odd {
+let buf = self.buf;
+let last_pos = buf.len() - 1;
+node.data.starts_with(buf.split_at(last_pos).0)
+&& node.data[last_pos] >> 4 == buf[last_pos] >> 4
+} else {
+node.data.starts_with(self.buf)
+}
+}
+
+/// Retrieve the `i`th half-byte from the prefix.
+///
+/// This is also the `i`th hexadecimal digit in numeric form,
+/// also called a [nybble](https://en.wikipedia.org/wiki/Nibble).
+pub fn get_nybble(, i: usize) -> u8 {
+get_nybble(self.buf, i)
+}
+}
+
+/// A shortcut for full `Node` references
+impl<'a> From<&'a Node> for NodePrefixRef<'a> {
+fn from(node: &'a Node) -> Self {
+NodePrefixRef {
+buf: ,
+is_odd: false,
 }
 }
 }
@@ -188,4 +291,74 @@
 fn test_node_encode_hex() {
 assert_eq!(sample_node().encode_hex(), sample_node_hex());
 }
+
+#[test]
+fn test_prefix_from_hex() -> Result<(), NodeError> {
+assert_eq!(
+NodePrefix::from_hex("0e1")?,
+NodePrefix {
+buf: vec![14, 16],
+is_odd: true
+}
+);
+assert_eq!(
+

D7789: rust-revlog: a trait for the revlog index

2020-01-27 Thread gracinet (Georges Racinet)
Closed by commit rHG3fb39dc2e356: rust-revlog: a trait for the revlog index 
(authored by gracinet).
This revision was automatically updated to reflect the committed changes.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7789?vs=19629=19642

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7789/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7789

AFFECTED FILES
  rust/hg-core/src/revlog.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/revlog.rs b/rust/hg-core/src/revlog.rs
--- a/rust/hg-core/src/revlog.rs
+++ b/rust/hg-core/src/revlog.rs
@@ -40,3 +40,17 @@
 ParentOutOfRange(Revision),
 WorkingDirectoryUnsupported,
 }
+
+/// The Mercurial Revlog Index
+///
+/// This is currently limited to the minimal interface that is needed for
+/// the [`nodemap`](nodemap/index.html) module
+pub trait RevlogIndex {
+/// Total number of Revisions referenced in this index
+fn len() -> usize;
+
+/// Return a reference to the Node or `None` if rev is out of bounds
+///
+/// `NULL_REVISION` is not considered to be out of bounds.
+fn node(, rev: Revision) -> Option<>;
+}



To: gracinet, #hg-reviewers, martinvonz
Cc: martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7790: rust-node: handling binary Node prefix

2020-01-27 Thread martinvonz (Martin von Zweigbergk)
This revision is now accepted and ready to land.
martinvonz added inline comments.
martinvonz accepted this revision.

INLINE COMMENTS

> node.rs:103
> +
> +impl<'a> NodePrefixRef<'a> {
> +pub fn len() -> usize {

The lifetime parameter doesn't seem to be used, so make it anonymous? (I.e., 
remove it from `impl` and use `<'_>` on the type.

> node.rs:226
> +pub fn get_nybble(, i: usize) -> u8 {
> +get_nybble(self.buf, i)
> +}

Should we check here that `i < self.len()`? I'm especially thinking of the case 
of an odd-length prefix where we would otherwise silently return 0.

REPOSITORY
  rHG Mercurial

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7790/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7790

To: gracinet, #hg-reviewers, martinvonz
Cc: martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D8018: cmdutil: change check_incompatible_arguments() *arg to single iterable

2020-01-27 Thread martinvonz (Martin von Zweigbergk)
martinvonz created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  This makes it clearer on the call-sites that the first argument is
  special. Thanks to Yuya for the suggestion.

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D8018

AFFECTED FILES
  hgext/rebase.py
  hgext/releasenotes.py
  hgext/transplant.py
  mercurial/cmdutil.py
  mercurial/commands.py
  relnotes/next

CHANGE DETAILS

diff --git a/relnotes/next b/relnotes/next
--- a/relnotes/next
+++ b/relnotes/next
@@ -17,3 +17,6 @@
 
  * `hg.merge()` has lost its `abort` argument. Please call
`hg.abortmerge()` directly instead.
+
+ * The `*others` argument of `cmdutil.check_incompatible_arguments()`
+   changed from being varargs argument to being a single collection.
diff --git a/mercurial/commands.py b/mercurial/commands.py
--- a/mercurial/commands.py
+++ b/mercurial/commands.py
@@ -1228,7 +1228,7 @@
 
 action = cmdutil.check_at_most_one_arg(opts, b'delete', b'rename', b'list')
 if action:
-cmdutil.check_incompatible_arguments(opts, action, b'rev')
+cmdutil.check_incompatible_arguments(opts, action, [b'rev'])
 elif names or rev:
 action = b'add'
 elif inactive:
@@ -1236,7 +1236,9 @@
 else:
 action = b'list'
 
-cmdutil.check_incompatible_arguments(opts, b'inactive', b'delete', b'list')
+cmdutil.check_incompatible_arguments(
+opts, b'inactive', [b'delete', b'list']
+)
 if not names and action in {b'add', b'delete'}:
 raise error.Abort(_(b"bookmark name required"))
 
@@ -4847,7 +4849,7 @@
 abort = opts.get(b'abort')
 if abort and repo.dirstate.p2() == nullid:
 cmdutil.wrongtooltocontinue(repo, _(b'merge'))
-cmdutil.check_incompatible_arguments(opts, b'abort', b'rev', b'preview')
+cmdutil.check_incompatible_arguments(opts, b'abort', [b'rev', b'preview'])
 if abort:
 state = cmdutil.getunfinishedstate(repo)
 if state and state._opname != b'merge':
diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
--- a/mercurial/cmdutil.py
+++ b/mercurial/cmdutil.py
@@ -281,11 +281,11 @@
 return previous
 
 
-def check_incompatible_arguments(opts, first, *others):
+def check_incompatible_arguments(opts, first, others):
 """abort if the first argument is given along with any of the others
 
 Unlike check_at_most_one_arg(), `others` are not mutually exclusive
-among themselves.
+among themselves, and they're passed as a single collection.
 """
 for other in others:
 check_at_most_one_arg(opts, first, other)
diff --git a/hgext/transplant.py b/hgext/transplant.py
--- a/hgext/transplant.py
+++ b/hgext/transplant.py
@@ -761,12 +761,12 @@
 def checkopts(opts, revs):
 if opts.get(b'continue'):
 cmdutil.check_incompatible_arguments(
-opts, b'continue', b'branch', b'all', b'merge'
+opts, b'continue', [b'branch', b'all', b'merge']
 )
 return
 if opts.get(b'stop'):
 cmdutil.check_incompatible_arguments(
-opts, b'stop', b'branch', b'all', b'merge'
+opts, b'stop', [b'branch', b'all', b'merge']
 )
 return
 if not (
diff --git a/hgext/releasenotes.py b/hgext/releasenotes.py
--- a/hgext/releasenotes.py
+++ b/hgext/releasenotes.py
@@ -654,7 +654,7 @@
 opts = pycompat.byteskwargs(opts)
 sections = releasenotessections(ui, repo)
 
-cmdutil.check_incompatible_arguments(opts, b'list', b'rev', b'check')
+cmdutil.check_incompatible_arguments(opts, b'list', [b'rev', b'check'])
 
 if opts.get(b'list'):
 return _getadmonitionlist(ui, sections)
diff --git a/hgext/rebase.py b/hgext/rebase.py
--- a/hgext/rebase.py
+++ b/hgext/rebase.py
@@ -1011,10 +1011,10 @@
 action = cmdutil.check_at_most_one_arg(opts, b'abort', b'stop', 
b'continue')
 if action:
 cmdutil.check_incompatible_arguments(
-opts, action, b'confirm', b'dry_run'
+opts, action, [b'confirm', b'dry_run']
 )
 cmdutil.check_incompatible_arguments(
-opts, action, b'rev', b'source', b'base', b'dest'
+opts, action, [b'rev', b'source', b'base', b'dest']
 )
 cmdutil.check_at_most_one_arg(opts, b'confirm', b'dry_run')
 cmdutil.check_at_most_one_arg(opts, b'rev', b'source', b'base')
@@ -1028,7 +1028,7 @@
 if opts.get(b'auto_orphans'):
 disallowed_opts = set(opts) - {b'auto_orphans'}
 cmdutil.check_incompatible_arguments(
-opts, b'auto_orphans', *disallowed_opts
+opts, b'auto_orphans', disallowed_opts
 )
 
 userrevs = list(repo.revs(opts.get(b'auto_orphans')))



To: martinvonz, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list

D7993: merge: use check_incompatible_arguments() for --abort

2020-01-27 Thread martinvonz (Martin von Zweigbergk)
martinvonz added a comment.


  In D7993#118162 , @yuja wrote:
  
  >> +cmdutil.check_incompatible_arguments(opts, b'abort', b'rev', 
b'preview')
  >
  > It's a bit late, but I feel the arguments of 
`check_incompatible_arguments()`
  > is confusing since `b'abort'` isn't the same kind of arguments as the 
others.
  > I think `(opts, b'abort', [b'rev', b'preview'])` is more explicit.
  
  Good point. I've sent D8018 .

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7993/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7993

To: martinvonz, #hg-reviewers, pulkit
Cc: yuja, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7450: packaging: add support for PyOxidizer

2020-01-27 Thread mharbison72 (Matt Harbison)
mharbison72 added a comment.


  In D7450#118154 , @mharbison72 
wrote:
  
  > I think this can land after the Windows options are added.
  >
  >> So we don't need to worry about vendoring any Rust code to initially 
support PyOxidizer.
  >
  > Does that mean the parent of this can go away?
  
  Nevermind this.  It looks like `phabread --stack` doesn't pay attention to 
the fact that it was abandoned, so I didn't realize it.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7450/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7450

To: indygreg, #hg-reviewers, mharbison72
Cc: marmoute, mharbison72, martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7819: rust-nodemap: core implementation for shortest

2020-01-27 Thread gracinet (Georges Racinet)
gracinet marked an inline comment as done.
gracinet updated this revision to Diff 19640.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7819?vs=19639=19640

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7819/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7819

AFFECTED FILES
  rust/hg-core/src/revlog/node.rs
  rust/hg-core/src/revlog/nodemap.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/revlog/nodemap.rs 
b/rust/hg-core/src/revlog/nodemap.rs
--- a/rust/hg-core/src/revlog/nodemap.rs
+++ b/rust/hg-core/src/revlog/nodemap.rs
@@ -17,6 +17,7 @@
 RevlogIndex, NULL_REVISION,
 };
 
+use std::cmp::max;
 use std::fmt;
 use std::mem;
 use std::ops::Deref;
@@ -98,6 +99,42 @@
 ) -> Result, NodeMapError> {
 self.find_bin(idx, NodePrefix::from_hex(prefix)?.borrow())
 }
+
+/// Give the size of the shortest node prefix that determines
+/// the revision uniquely.
+///
+/// From a binary node prefix, if it is matched in the node map, this
+/// returns the number of hexadecimal digits that would had sufficed
+/// to find the revision uniquely.
+///
+/// Returns `None` if no `Revision` could be found for the prefix.
+///
+/// If several Revisions match the given prefix, a [`MultipleResults`]
+/// error is returned.
+fn unique_prefix_len_bin<'a>(
+,
+idx:  RevlogIndex,
+node_prefix: NodePrefixRef<'a>,
+) -> Result, NodeMapError>;
+
+/// Same as `unique_prefix_len_bin`, with the hexadecimal representation
+/// of the prefix as input.
+fn unique_prefix_len_hex(
+,
+idx:  RevlogIndex,
+prefix: ,
+) -> Result, NodeMapError> {
+self.unique_prefix_len_bin(idx, NodePrefix::from_hex(prefix)?.borrow())
+}
+
+/// Same as `unique_prefix_len_bin`, with a full `Node` as input
+fn unique_prefix_len_node(
+,
+idx:  RevlogIndex,
+node: ,
+) -> Result, NodeMapError> {
+self.unique_prefix_len_bin(idx, node.into())
+}
 }
 
 pub trait MutableNodeMap: NodeMap {
@@ -259,20 +296,24 @@
 fn validate_candidate<'p>(
 idx:  RevlogIndex,
 prefix: NodePrefixRef<'p>,
-rev: Option,
-) -> Result, NodeMapError> {
-if prefix.is_prefix_of(_NODE) {
-// NULL_REVISION always matches a prefix made only of zeros
+candidate: (Option, usize),
+) -> Result<(Option, usize), NodeMapError> {
+let (rev, steps) = candidate;
+if let Some(nz_nybble) = prefix.first_different_nybble(_NODE) {
+rev.map_or(Ok((None, steps)), |r| {
+has_prefix_or_none(idx, prefix, r)
+.map(|opt| (opt, max(steps, nz_nybble + 1)))
+})
+} else {
+// the prefix is only made of zeros; NULL_REVISION always matches it
 // and any other *valid* result is an ambiguity
 match rev {
-None => Ok(Some(NULL_REVISION)),
+None => Ok((Some(NULL_REVISION), steps + 1)),
 Some(r) => match has_prefix_or_none(idx, prefix, r)? {
-None => Ok(Some(NULL_REVISION)),
+None => Ok((Some(NULL_REVISION), steps + 1)),
 _ => Err(NodeMapError::MultipleResults),
 },
 }
-} else {
-rev.map_or(Ok(None), |r| has_prefix_or_none(idx, prefix, r))
 }
 }
 
@@ -356,13 +397,26 @@
 }
 
 /// Main working method for `NodeTree` searches
+///
+/// The first returned value is the result of analysing `NodeTree` data
+/// *alone*: whereas `None` guarantees that the given prefix is absent
+/// from the `NodeTree` data (but still could match `NULL_NODE`), with
+/// `Some(rev)`, it is to be understood that `rev` is the unique `Revision`
+/// that could match the prefix. Actually, all that can be inferred from
+/// the `NodeTree` data is that `rev` is the revision with the longest
+/// common node prefix with the given prefix.
+///
+/// The second returned value is the size of the smallest subprefix
+/// of `prefix` that would give the same result, i.e. not the
+/// `MultipleResults` error variant (again, using only the data of the
+/// `NodeTree`).
 fn lookup<'p>(
 ,
 prefix: NodePrefixRef<'p>,
-) -> Result, NodeMapError> {
-for visit_item in self.visit(prefix) {
+) -> Result<(Option, usize), NodeMapError> {
+for (i, visit_item) in self.visit(prefix).enumerate() {
 if let Some(opt) = visit_item.final_revision() {
-return Ok(opt);
+return Ok((opt, i + 1));
 }
 }
 Err(NodeMapError::MultipleResults)
@@ -607,6 +661,16 @@
 prefix: NodePrefixRef<'a>,
 ) -> Result, NodeMapError> {
 validate_candidate(idx, prefix.clone(), self.lookup(prefix)?)
+.map(|(opt, _shortest)| opt)
+}
+
+fn unique_prefix_len_bin<'a>(
+,
+idx:  RevlogIndex,
+

D7819: rust-nodemap: core implementation for shortest

2020-01-27 Thread gracinet (Georges Racinet)
gracinet added inline comments.
gracinet marked 3 inline comments as done.

INLINE COMMENTS

> kevincox wrote in nodemap.rs:337
> Can you describe the return value in the doc comment.

Done

> martinvonz wrote in nodemap.rs:575
> I'm fine with either. We'll expose it as "shortest" in the python API anyway, 
> I assume. Feel free to change as far as I'm concerned.

Okay, I've settled with `unique_prefix_len_bin` etc, keeping the convention 
that it's `something_bin`, `something_hex`, etc.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7819/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7819

To: gracinet, #hg-reviewers, kevincox
Cc: martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7796: rust-nodemap: input/output primitives

2020-01-27 Thread gracinet (Georges Racinet)
gracinet added inline comments.
gracinet marked 2 inline comments as done.

INLINE COMMENTS

> kevincox wrote in nodemap.rs:268
> I thought about this more and I think I am okay doing it this way. It seems 
> like this should be well defined as long as there is no padding. However on 
> that note I would want to add a check that there is no padding as expected. I 
> would also like to ensure that this fails to compile if there is ever 
> padding, even in release mode. I think this can be accomplished by something 
> like:
> 
>   let _: [u8; 4 * BLOCK_SIZE] = std::mem::transmute([Block::new(); 4]);
> 
> I would probably want to repeat this check near any code that is relying on 
> this invariant.

About a method to output to a writer: for the time being, we're avoiding doing 
I/O directly in Rust because most of it is shared with Python through layers 
that perform various sanitizations, lock management etc. That's why the 
simplest is to transfer bytes out.

Context note: the data structure (`Block`) is directly taken from the C 
version, which is non-persistent, but I believe that these 64 bytes are just 
some low-level programmer reflex. It's probably not a coincidence that IIRC 
that's also the length of cache lines on most current CPUs.

Anyway, the point of all of this is to get on disk without padding, so yes, we 
could implement our own non-padding by changing the definition of `Block` to 
`[u8; 64]`. In the end, we are forced to trust what's on disk anyway.

> kevincox wrote in nodemap.rs:272
> If someone panics between the `from_raw_parts` and `mem::forget` this is a 
> double free. Right now I think this can't happen but it is completely 
> possible that the code changes between then and now. I would prefer using 
> `Vec::from_raw_parts` to make sure that there is only one `Vec` alive with 
> that pointer at a time. This risks a leak in the face of panic but I think 
> that is much better.

Not sure to understand what you suggest here, since I'm already using 
`Vec::from_raw_parts` in that method. I couldn't find an `into_raw_parts`.
Anyway, the official example 
  has the 
`mem::forget` before  `Vec::from_raw_parts`. Would that be sufficient?

I agree that a leak upon some exceptional-cannot-happen condition is better 
than a double free.

Also, forgot to came back to that one in the latest phab update

> kevincox wrote in nodemap.rs:433
> This is a weird API. Why does new take a buffer and an offset? Can it take 
> either a slice or just remove the offset parameter and always have the buffer 
> start at 0? It is very strange to be passing in data that we own but isn't 
> ever used.

Owning some memory and not using the whole of it is anyway what happens when 
that memory region is just obtained from a mmap (which is the whole point of 
carrying theses `Deref` all around).
Technically, in a mmap, I suppose we actually only own the file descriptor and 
the resulting pointer, but we can't just convert it to an inert slice.

Anyway it's now confirmed that we won't be needing the offset finally, so I've 
removed it.  Some data at the end of the bytes slice may be ignored, too : it 
would be the result of aborted transactions.

Note: the Python layer will maintain the amount of validated blocks in a 
separate small file which is updated atomically. Any future reimplementation of 
this in Rust would have to do the same.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7796/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7796

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7794: rust-nodemap: generic NodeTreeVisitor

2020-01-27 Thread gracinet (Georges Racinet)
gracinet added inline comments.
gracinet marked 2 inline comments as done.

INLINE COMMENTS

> martinvonz wrote in nodemap.rs:264-269
> There will only be a valid result if this is a leaf, so it might be better to 
> combine `leaf` and `opt` into one `Option>` so the caller 
> cannot mistake a `opt=None` for "missing" when `leaf=false`.

I understand your concern, but `>` would defeat the 
purpose of not having a dead match arm to fill with a "can't happen" comment in 
the future `insert()`.

Fortunately, @kevincox suggestion of having `NodeTreeVisitor` emit  a `struct` 
provides us with the best of both worlds, since it can have methods.

It should be fully transparent performance-wise, I just hope that is really 
true.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7794/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7794

To: gracinet, #hg-reviewers, kevincox
Cc: martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7793: rust-nodemap: mutable NodeTree data structure

2020-01-27 Thread gracinet (Georges Racinet)
gracinet added inline comments.

INLINE COMMENTS

> kevincox wrote in nodemap.rs:158
> This strikes me as a weird name. The fact that it is an adjective rather than 
> a noun is a hint. Can you rename to answer "Growable what?"

In a previous version, I was calling it `mutable` (also an adjective, btw), but 
that became just wrong with the introduction of the separate root block. I 
don't have many ideas:

- `added_inner_blocks` ? actually, these are added on top of the readonly part 
and often mask some of the readonly blocks
- `added_non_root_blocks` ? Eww
- `mutable_inner_blocks` ?

I'm no native speaker, but the adjective thing doesn't deter me, I would expect 
it to be rather clear that it refers to blocks, given the type.
Open to suggestions, leaving as is for the time being

> kevincox wrote in nodemap.rs:249
> I would use 
> https://doc.rust-lang.org/std/fmt/struct.Formatter.html#method.debug_struct 
> for consistency unless you really want to avoid printing the struct name.

sadly, `Formatter.debug_struct` doesn't tale `?Sized` arguments (at least in 
our target 1.34 version):

 --> hg-core/src/revlog/nodemap.rs:298:32
  |
  298 | .field("readonly", readonly)
  | doesn't have a size known at 
compile-time
  |
  = help: the trait `std::marker::Sized` is not implemented for 
`[revlog::nodemap::Block]`

Kept it as it was for the time being, just dropped the unneeded  anonymous 
lifetime

Omitting the struct name was indeed on purpose, to make it more manageable in 
test data. That was useful in early versions. I would have been willing to drop 
it for the final form. Anyone using this for real data wouldn't care about just 
a few bytes at the beginning.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7793/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7793

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7792: rust-nodemap: abstracting the indexing

2020-01-27 Thread gracinet (Georges Racinet)
gracinet added inline comments.
gracinet marked 2 inline comments as done.

INLINE COMMENTS

> kevincox wrote in nodemap.rs:193
> I would add a `self.is_empty()` helper. It's good practice for anything that 
> has a `.len()`.

Sure

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7792/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7792

To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7791: rust-nodemap: NodeMap trait with simplest implementation

2020-01-27 Thread gracinet (Georges Racinet)
gracinet added inline comments.
gracinet marked 3 inline comments as done.

INLINE COMMENTS

> kevincox wrote in nodemap.rs:37
> Can you please add doc-comments for this? I find that documenting trait 
> methods is especially important.

Sure, indeed it's more important than with the `impl`.

> martinvonz wrote in nodemap.rs:150
> How about something like this then?
> 
>   type BlockSource = Box + Send>;
>   type ByteSource = Box + Send>;
> 
> I won't insist, so up to you if you think it helps.

Missed the type (not trait) alias suggestion. Maybe for the next update, then

> martinvonz wrote in nodemap.rs:153
> I understand that you picked this interface because it works well for the 
> caller, but it feel a little weird to always return `None` or `Some(rev)` 
> where `rev` is one of the function inputs. It's practically a boolean-valued 
> function. Do the callers get much harder to read if you actually make it 
> boolean-valued?

Perhaps a better name would be better than this `has_` that indeed feels 
boolean?

`check_prefix`?  `confirm` ?

Previous naming was `validate_candidate`, but that very same name is used at 
the end of the series when it becomes involved with `NULL_NODE` etc. Then this 
`has_prefix_or_none` becomes one step in the final verification.

Returning a `bool` would mean adding a closure to the callers. making it less 
obvious that they are just chaining the maturation of the response.

I'm leaving unchanged for now, not sure what the best is. Still note that this 
is a private function, there's no risk an external caller would be confused by 
what it does.

> kevincox wrote in nodemap.rs:205
> I don't think the `<'_>` is necessary.

Removed

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7791/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7791

To: gracinet, #hg-reviewers, kevincox
Cc: Alphare, martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7790: rust-node: handling binary Node prefix

2020-01-27 Thread gracinet (Georges Racinet)
gracinet added a comment.


  > Depends on the definition of NodePrefixRef. I don't think there would be 
any extra allocation if you define it like this:
  
  
  
pub enum NodePrefixRef<'a> {
Full(&'a Node),
Prefix(&'a NodePrefix),
}
  
  That's an interesting idea, another possibility would be to define a trait 
(essentially for `get_nybble`) and implement it for `` and ``. 
We'll see when we're pass preliminary tests, thanks.

INLINE COMMENTS

> martinvonz wrote in node.rs:79
> Why not use `(len + 1) / 2` as capacity?

Just thought 20 would be the simplest one-size-fit-all. With the preparations 
for different hash length (and potentially even somewhat dynamic), it's really 
obsolete now (switched to actual len based, yes).

> martinvonz wrote in node.rs:89
> Is this lifetime parameter needed?

ah yes indeed the compiler seems to be able to infer that `` and 
`Nodeprefix.buf` have the same lifteime, and that it must then be equal to the 
lifetime parameter of `NodePrefix` itself.

> martinvonz wrote in node.rs:136
> What does the `&*` do? Specifically, what's different if you drop that part?

These are conversions that need to be explicit when the compiler doesn't insert 
them magically. Usually, I try to avoid them, but it can happen that they 
become non necessary in the final form after some changes.

Not needed in the new form with a real struct.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7790/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7790

To: gracinet, #hg-reviewers
Cc: martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7819: rust-nodemap: core implementation for shortest

2020-01-27 Thread gracinet (Georges Racinet)
gracinet updated this revision to Diff 19639.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7819?vs=19331=19639

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7819/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7819

AFFECTED FILES
  rust/hg-core/src/revlog/node.rs
  rust/hg-core/src/revlog/nodemap.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/revlog/nodemap.rs 
b/rust/hg-core/src/revlog/nodemap.rs
--- a/rust/hg-core/src/revlog/nodemap.rs
+++ b/rust/hg-core/src/revlog/nodemap.rs
@@ -17,6 +17,7 @@
 RevlogIndex, NULL_REVISION,
 };
 
+use std::cmp::max;
 use std::fmt;
 use std::mem;
 use std::ops::Deref;
@@ -98,6 +99,42 @@
 ) -> Result, NodeMapError> {
 self.find_bin(idx, NodePrefix::from_hex(prefix)?.borrow())
 }
+
+/// Give the size of the shortest node prefix that determines
+/// the revision uniquely.
+///
+/// From a binary node prefix, if it is matched in the node map, this
+/// returns the number of hexadecimal digits that would had sufficed
+/// to find the revision uniquely.
+///
+/// Returns `None` if no `Revision` could be found for the prefix.
+///
+/// If several Revisions match the given prefix, a [`MultipleResults`]
+/// error is returned.
+fn unique_prefix_len_bin<'a>(
+,
+idx:  RevlogIndex,
+node_prefix: NodePrefixRef<'a>,
+) -> Result, NodeMapError>;
+
+/// Same as `unique_prefix_len_bin`, with the hexadecimal representation
+/// of the prefix as input.
+fn unique_prefix_len_hex(
+,
+idx:  RevlogIndex,
+prefix: ,
+) -> Result, NodeMapError> {
+self.unique_prefix_len_bin(idx, NodePrefix::from_hex(prefix)?.borrow())
+}
+
+/// Same as `unique_prefix_len_bin`, with a full `Node` as input
+fn unique_prefix_len_node(
+,
+idx:  RevlogIndex,
+node: ,
+) -> Result, NodeMapError> {
+self.unique_prefix_len_bin(idx, node.into())
+}
 }
 
 pub trait MutableNodeMap: NodeMap {
@@ -259,20 +296,24 @@
 fn validate_candidate<'p>(
 idx:  RevlogIndex,
 prefix: NodePrefixRef<'p>,
-rev: Option,
-) -> Result, NodeMapError> {
-if prefix.is_prefix_of(_NODE) {
-// NULL_REVISION always matches a prefix made only of zeros
+cand: (Option, usize),
+) -> Result<(Option, usize), NodeMapError> {
+let (rev, steps) = cand;
+if let Some(nz_nybble) = prefix.first_different_nybble(_NODE) {
+rev.map_or(Ok((None, steps)), |r| {
+has_prefix_or_none(idx, prefix, r)
+.map(|opt| (opt, max(steps, nz_nybble + 1)))
+})
+} else {
+// the prefix is only made of zeros; NULL_REVISION always matches it
 // and any other *valid* result is an ambiguity
 match rev {
-None => Ok(Some(NULL_REVISION)),
+None => Ok((Some(NULL_REVISION), steps + 1)),
 Some(r) => match has_prefix_or_none(idx, prefix, r)? {
-None => Ok(Some(NULL_REVISION)),
+None => Ok((Some(NULL_REVISION), steps + 1)),
 _ => Err(NodeMapError::MultipleResults),
 },
 }
-} else {
-rev.map_or(Ok(None), |r| has_prefix_or_none(idx, prefix, r))
 }
 }
 
@@ -356,13 +397,26 @@
 }
 
 /// Main working method for `NodeTree` searches
+///
+/// The first returned value is the result of analysing `NodeTree` data
+/// *alone*: whereas `None` guarantees that the given prefix is absent
+/// from the `NodeTree` data (but still could match `NULL_NODE`), with
+/// `Some(rev)`, it is to be understood that `rev` is the unique `Revision`
+/// that could match the prefix. Actually, all that can be inferred from
+/// the `NodeTree` data is that `rev` is the revision with the longest
+/// common node prefix with the given prefix.
+///
+/// The second returned value is the size of the smallest subprefix
+/// of `prefix` that would give the same result, i.e. not the
+/// `MultipleResults` error variant (again, using only the data of the
+/// `NodeTree`).
 fn lookup<'p>(
 ,
 prefix: NodePrefixRef<'p>,
-) -> Result, NodeMapError> {
-for visit_item in self.visit(prefix) {
+) -> Result<(Option, usize), NodeMapError> {
+for (i, visit_item) in self.visit(prefix).enumerate() {
 if let Some(opt) = visit_item.final_revision() {
-return Ok(opt);
+return Ok((opt, i + 1));
 }
 }
 Err(NodeMapError::MultipleResults)
@@ -607,6 +661,16 @@
 prefix: NodePrefixRef<'a>,
 ) -> Result, NodeMapError> {
 validate_candidate(idx, prefix.clone(), self.lookup(prefix)?)
+.map(|(opt, _shortest)| opt)
+}
+
+fn unique_prefix_len_bin<'a>(
+,
+idx:  RevlogIndex,
+prefix: NodePrefixRef<'a>,
+) -> Result, 

D7797: rust-nodemap: pure Rust example

2020-01-27 Thread gracinet (Georges Racinet)
gracinet edited the summary of this revision.
gracinet updated this revision to Diff 19637.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7797?vs=19046=19637

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7797/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7797

AFFECTED FILES
  rust/Cargo.lock
  rust/hg-core/Cargo.toml
  rust/hg-core/examples/nodemap/index.rs
  rust/hg-core/examples/nodemap/main.rs

CHANGE DETAILS

diff --git a/rust/hg-core/examples/nodemap/main.rs 
b/rust/hg-core/examples/nodemap/main.rs
new file mode 100644
--- /dev/null
+++ b/rust/hg-core/examples/nodemap/main.rs
@@ -0,0 +1,150 @@
+// Copyright 2019-2020 Georges Racinet 
+//
+// This software may be used and distributed according to the terms of the
+// GNU General Public License version 2 or any later version.
+
+extern crate clap;
+extern crate hg;
+extern crate memmap;
+
+use clap::*;
+use hg::revlog::node::*;
+use hg::revlog::nodemap::*;
+use hg::revlog::*;
+use memmap::MmapOptions;
+use rand::Rng;
+use std::fs::File;
+use std::io;
+use std::io::Write;
+use std::path::{Path, PathBuf};
+use std::str::FromStr;
+use std::time::Instant;
+
+mod index;
+use index::Index;
+
+fn mmap_index(repo_path: ) -> Index {
+let mut path = PathBuf::from(repo_path);
+path.extend([".hg", "store", "00changelog.i"].iter());
+Index::load_mmap(path)
+}
+
+fn mmap_nodemap(path: ) -> NodeTree {
+let file = File::open(path).unwrap();
+let mmap = unsafe { MmapOptions::new().map().unwrap() };
+let len = mmap.len();
+NodeTree::load_bytes(Box::new(mmap), len)
+}
+
+/// Scan the whole index and create the corresponding nodemap file at `path`
+fn create(index: , path: ) -> io::Result<()> {
+let mut file = File::create(path)?;
+let start = Instant::now();
+let mut nm = NodeTree::default();
+for rev in 0..index.len() {
+let rev = rev as Revision;
+nm.insert(index, index.node(rev).unwrap(), rev).unwrap();
+}
+eprintln!("Nodemap constructed in RAM in {:?}", start.elapsed());
+file.write(_readonly_and_added_bytes().1)?;
+eprintln!("Nodemap written to disk");
+Ok(())
+}
+
+fn query(index: , nm: , prefix: ) {
+let start = Instant::now();
+let res = nm.find_hex(index, prefix);
+println!("Result found in {:?}: {:?}", start.elapsed(), res);
+}
+
+fn bench(index: , nm: , queries: usize) {
+let len = index.len() as u32;
+let mut rng = rand::thread_rng();
+let nodes: Vec = (0..queries)
+.map(|_| {
+index
+.node((rng.gen::() % len) as Revision)
+.unwrap()
+.clone()
+})
+.collect();
+if queries < 10 {
+let nodes_hex: Vec =
+nodes.iter().map(|n| n.encode_hex()).collect();
+println!("Nodes: {:?}", nodes_hex);
+}
+let mut last: Option = None;
+let start = Instant::now();
+for node in nodes.iter() {
+last = nm.find_bin(index, node.into()).unwrap();
+}
+let elapsed = start.elapsed();
+println!(
+"Did {} queries in {:?} (mean {:?}), last was {:?} with result {:?}",
+queries,
+elapsed,
+elapsed / (queries as u32),
+nodes.last().unwrap().encode_hex(),
+last
+);
+}
+
+fn main() {
+let matches = App::new("Nodemap pure Rust example")
+.arg(
+Arg::with_name("REPOSITORY")
+.help("Path to the repository, always necessary for its index")
+.required(true),
+)
+.arg(
+Arg::with_name("NODEMAP_FILE")
+.help("Path to the nodemap file, independent of REPOSITORY")
+.required(true),
+)
+.subcommand(
+SubCommand::with_name("create")
+.about("Create NODEMAP_FILE by scanning repository index"),
+)
+.subcommand(
+SubCommand::with_name("query")
+.about("Query NODEMAP_FILE for PREFIX")
+.arg(Arg::with_name("PREFIX").required(true)),
+)
+.subcommand(
+SubCommand::with_name("bench")
+.about(
+"Perform #QUERIES random successful queries on 
NODEMAP_FILE")
+.arg(Arg::with_name("QUERIES").required(true)),
+)
+.get_matches();
+
+let repo = matches.value_of("REPOSITORY").unwrap();
+let nm_path = matches.value_of("NODEMAP_FILE").unwrap();
+
+let index = mmap_index(::new(repo));
+
+if let Some(_) = matches.subcommand_matches("create") {
+println!("Creating nodemap file {} for repository {}", nm_path, repo);
+create(, ::new(nm_path)).unwrap();
+return;
+}
+
+let nm = mmap_nodemap(::new(nm_path));
+if let Some(matches) = matches.subcommand_matches("query") {
+let prefix = matches.value_of("PREFIX").unwrap();
+println!(
+"Querying {} in nodemap 

D7798: rust-nodemap: special case for prefixes of NULL_NODE

2020-01-27 Thread gracinet (Georges Racinet)
gracinet updated this revision to Diff 19638.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7798?vs=19330=19638

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7798/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7798

AFFECTED FILES
  rust/hg-core/src/revlog/nodemap.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/revlog/nodemap.rs 
b/rust/hg-core/src/revlog/nodemap.rs
--- a/rust/hg-core/src/revlog/nodemap.rs
+++ b/rust/hg-core/src/revlog/nodemap.rs
@@ -13,7 +13,8 @@
 //! is used in a more abstract context.
 
 use super::{
-Node, NodeError, NodePrefix, NodePrefixRef, Revision, RevlogIndex,
+node::NULL_NODE, Node, NodeError, NodePrefix, NodePrefixRef, Revision,
+RevlogIndex, NULL_REVISION,
 };
 
 use std::fmt;
@@ -250,6 +251,31 @@
 })
 }
 
+/// validate that the candidate's node starts indeed with given prefix,
+/// and treat ambiguities related to `NULL_REVISION`.
+///
+/// From the data in the NodeTree, one can only conclude that some
+/// revision is the only one for a *subprefix* of the one being looked up.
+fn validate_candidate<'p>(
+idx:  RevlogIndex,
+prefix: NodePrefixRef<'p>,
+rev: Option,
+) -> Result, NodeMapError> {
+if prefix.is_prefix_of(_NODE) {
+// NULL_REVISION always matches a prefix made only of zeros
+// and any other *valid* result is an ambiguity
+match rev {
+None => Ok(Some(NULL_REVISION)),
+Some(r) => match has_prefix_or_none(idx, prefix, r)? {
+None => Ok(Some(NULL_REVISION)),
+_ => Err(NodeMapError::MultipleResults),
+},
+}
+} else {
+rev.map_or(Ok(None), |r| has_prefix_or_none(idx, prefix, r))
+}
+}
+
 impl NodeTree {
 /// Initiate a NodeTree from an immutable slice-like of `Block`
 ///
@@ -330,8 +356,6 @@
 }
 
 /// Main working method for `NodeTree` searches
-///
-/// This partial implementation lacks special cases for NULL_REVISION
 fn lookup<'p>(
 ,
 prefix: NodePrefixRef<'p>,
@@ -582,9 +606,7 @@
 idx:  RevlogIndex,
 prefix: NodePrefixRef<'a>,
 ) -> Result, NodeMapError> {
-self.lookup(prefix.clone()).and_then(|opt| {
-opt.map_or(Ok(None), |rev| has_prefix_or_none(idx, prefix, rev))
-})
+validate_candidate(idx, prefix.clone(), self.lookup(prefix)?)
 }
 }
 
@@ -713,8 +735,9 @@
 
 assert_eq!(nt.find_hex(, "0"), Err(MultipleResults));
 assert_eq!(nt.find_hex(, "01"), Ok(Some(9)));
-assert_eq!(nt.find_hex(, "00"), Ok(Some(0)));
+assert_eq!(nt.find_hex(, "00"), Err(MultipleResults));
 assert_eq!(nt.find_hex(, "00a"), Ok(Some(0)));
+assert_eq!(nt.find_hex(, "000"), Ok(Some(NULL_REVISION)));
 }
 
 #[test]
@@ -733,7 +756,8 @@
 };
 assert_eq!(nt.find_hex(, "10")?, Some(1));
 assert_eq!(nt.find_hex(, "c")?, Some(2));
-assert_eq!(nt.find_hex(, "00")?, Some(0));
+assert_eq!(nt.find_hex(, "00"), Err(MultipleResults));
+assert_eq!(nt.find_hex(, "000")?, Some(NULL_REVISION));
 assert_eq!(nt.find_hex(, "01")?, Some(9));
 Ok(())
 }



To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7796: rust-nodemap: input/output primitives

2020-01-27 Thread gracinet (Georges Racinet)
gracinet edited the summary of this revision.
gracinet updated this revision to Diff 19636.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7796?vs=19329=19636

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7796/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7796

AFFECTED FILES
  rust/hg-core/src/revlog/nodemap.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/revlog/nodemap.rs 
b/rust/hg-core/src/revlog/nodemap.rs
--- a/rust/hg-core/src/revlog/nodemap.rs
+++ b/rust/hg-core/src/revlog/nodemap.rs
@@ -17,8 +17,10 @@
 };
 
 use std::fmt;
+use std::mem;
 use std::ops::Deref;
 use std::ops::Index;
+use std::slice;
 
 #[derive(Debug, PartialEq)]
 pub enum NodeMapError {
@@ -175,6 +177,8 @@
 #[derive(Clone, PartialEq)]
 pub struct Block([RawElement; 16]);
 
+pub const BLOCK_SIZE: usize = mem::size_of::();
+
 impl Block {
 fn new() -> Self {
 Block([-1; 16])
@@ -262,6 +266,56 @@
 }
 }
 
+/// Create from an opaque bunch of bytes
+///
+/// The created `NodeTreeBytes` from `buffer`,
+/// of which exactly `amount` bytes are used.
+///
+/// - `buffer` could be derived from `PyBuffer` and `Mmap` objects.
+/// - `offset` allows for the final file format to include fixed data
+///   (generation number, behavioural flags)
+/// - `amount` is expressed in bytes, and is not automatically derived from
+///   `bytes`, so that a caller that manages them atomically can perform
+///   temporary disk serializations and still rollback easily if needed.
+///   First use-case for this would be to support Mercurial shell hooks.
+///
+/// panics if `buffer` is smaller than `amount`
+pub fn load_bytes(
+bytes: Box + Send>,
+amount: usize,
+) -> Self {
+NodeTree::new(Box::new(NodeTreeBytes::new(bytes, amount)))
+}
+
+/// Retrieve added `Block` and the original immutable data
+pub fn into_readonly_and_added(
+self,
+) -> (Box + Send>, Vec) {
+let mut vec = self.growable;
+let readonly = self.readonly;
+if readonly.last() != Some() {
+vec.push(self.root);
+}
+(readonly, vec)
+}
+
+/// Retrieve added `Blocks` as bytes, ready to be written to persistent
+/// storage
+pub fn into_readonly_and_added_bytes(
+self,
+) -> (Box + Send>, Vec) {
+let (readonly, vec) = self.into_readonly_and_added();
+let bytes = unsafe {
+Vec::from_raw_parts(
+vec.as_ptr() as *mut u8,
+vec.len() * BLOCK_SIZE,
+vec.capacity() * BLOCK_SIZE,
+)
+};
+mem::forget(vec);
+(readonly, bytes)
+}
+
 /// Total number of blocks
 fn len() -> usize {
 self.readonly.len() + self.growable.len() + 1
@@ -410,6 +464,38 @@
 }
 }
 
+pub struct NodeTreeBytes {
+buffer: Box + Send>,
+len_in_blocks: usize,
+}
+
+impl NodeTreeBytes {
+fn new(
+buffer: Box + Send>,
+amount: usize,
+) -> Self {
+assert!(buffer.len() >= amount);
+let len_in_blocks = amount / BLOCK_SIZE;
+NodeTreeBytes {
+buffer,
+len_in_blocks,
+}
+}
+}
+
+impl Deref for NodeTreeBytes {
+type Target = [Block];
+
+fn deref() -> &[Block] {
+unsafe {
+slice::from_raw_parts(
+().as_ptr() as *const Block,
+self.len_in_blocks,
+)
+}
+}
+}
+
 struct NodeTreeVisitor<'n, 'p> {
 nt: &'n NodeTree,
 prefix: NodePrefixRef<'p>,
@@ -786,4 +872,30 @@
 
 Ok(())
 }
+
+#[test]
+fn test_into_added_empty() {
+assert!(sample_nodetree().into_readonly_and_added().1.is_empty());
+assert!(sample_nodetree()
+.into_readonly_and_added_bytes()
+.1
+.is_empty());
+}
+
+#[test]
+fn test_into_added_bytes() -> Result<(), NodeMapError> {
+let mut idx = TestNtIndex::new();
+idx.insert(0, "1234")?;
+let mut idx = idx.commit();
+idx.insert(4, "cafe")?;
+let (_, bytes) = idx.nt.into_readonly_and_added_bytes();
+
+// only the root block has been changed
+assert_eq!(bytes.len(), BLOCK_SIZE);
+// big endian for -2
+assert_eq!([4..2 * 4], [255, 255, 255, 254]);
+// big endian for -6
+assert_eq!([12 * 4..13 * 4], [255, 255, 255, 250]);
+Ok(())
+}
 }



To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7795: rust-nodemap: insert method

2020-01-27 Thread gracinet (Georges Racinet)
gracinet updated this revision to Diff 19635.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7795?vs=19328=19635

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7795/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7795

AFFECTED FILES
  rust/hg-core/src/revlog/nodemap.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/revlog/nodemap.rs 
b/rust/hg-core/src/revlog/nodemap.rs
--- a/rust/hg-core/src/revlog/nodemap.rs
+++ b/rust/hg-core/src/revlog/nodemap.rs
@@ -15,6 +15,7 @@
 use super::{
 Node, NodeError, NodePrefix, NodePrefixRef, Revision, RevlogIndex,
 };
+
 use std::fmt;
 use std::ops::Deref;
 use std::ops::Index;
@@ -96,6 +97,15 @@
 }
 }
 
+pub trait MutableNodeMap: NodeMap {
+fn insert(
+ self,
+index: ,
+node: ,
+rev: Revision,
+) -> Result<(), NodeMapError>;
+}
+
 /// Low level NodeTree [`Blocks`] elements
 ///
 /// These are exactly as for instance on persistent storage.
@@ -292,6 +302,112 @@
 done: false,
 }
 }
+/// Return a mutable reference for `Block` at index `idx`.
+///
+/// If `idx` lies in the immutable area, then the reference is to
+/// a newly appended copy.
+///
+/// Returns (new_idx, glen, mut_ref) where
+///
+/// - `new_idx` is the index of the mutable `Block`
+/// - `mut_ref` is a mutable reference to the mutable Block.
+/// - `glen` is the new length of `self.growable`
+///
+/// Note: the caller wouldn't be allowed to query `self.growable.len()`
+/// itself because of the mutable borrow taken with the returned `Block`
+fn mutable_block( self, idx: usize) -> (usize,  Block, usize) {
+let ro_blocks = 
+let ro_len = ro_blocks.len();
+let glen = self.growable.len();
+if idx < ro_len {
+// TODO OPTIM I think this makes two copies
+self.growable.push(ro_blocks[idx].clone());
+(glen + ro_len,  self.growable[glen], glen + 1)
+} else if glen + ro_len == idx {
+(idx,  self.root, glen)
+} else {
+(idx,  self.growable[idx - ro_len], glen)
+}
+}
+
+/// Main insertion method
+///
+/// This will dive in the node tree to find the deepest `Block` for
+/// `node`, split it as much as needed and record `node` in there.
+/// The method then backtracks, updating references in all the visited
+/// blocks from the root.
+///
+/// All the mutated `Block` are copied first to the growable part if
+/// needed. That happens for those in the immutable part except the root.
+pub fn insert(
+ self,
+index: ,
+node: ,
+rev: Revision,
+) -> Result<(), NodeMapError> {
+let ro_len = ();
+
+let mut visit_steps: Vec<_> = self.visit(node.into()).collect();
+let read_nybbles = visit_steps.len();
+// visit_steps cannot be empty, since we always visit the root block
+let deepest = visit_steps.pop().unwrap();
+
+let (mut block_idx, mut block, mut glen) =
+self.mutable_block(deepest.block_idx);
+
+if let Element::Rev(old_rev) = deepest.element {
+let old_node = index
+.node(old_rev)
+.ok_or_else(|| NodeMapError::RevisionNotInIndex(old_rev))?;
+if old_node == node {
+return Ok(()); // avoid creating lots of useless blocks
+}
+
+// Looping over the tail of nybbles in both nodes, creating
+// new blocks until we find the difference
+let mut new_block_idx = ro_len + glen;
+let mut nybble = deepest.nybble;
+for nybble_pos in read_nybbles..node.nybbles_len() {
+block.set(nybble, Element::Block(new_block_idx));
+
+let new_nybble = node.get_nybble(nybble_pos);
+let old_nybble = old_node.get_nybble(nybble_pos);
+
+if old_nybble == new_nybble {
+self.growable.push(Block::new());
+block =  self.growable[glen];
+glen += 1;
+new_block_idx += 1;
+nybble = new_nybble;
+} else {
+let mut new_block = Block::new();
+new_block.set(old_nybble, Element::Rev(old_rev));
+new_block.set(new_nybble, Element::Rev(rev));
+self.growable.push(new_block);
+break;
+}
+}
+} else {
+// Free slot in the deepest block: no splitting has to be done
+block.set(deepest.nybble, Element::Rev(rev));
+}
+
+// Backtrack over visit steps to update references
+while let Some(visited) = visit_steps.pop() {
+let to_write = Element::Block(block_idx);
+if visit_steps.is_empty() {
+   

D7794: rust-nodemap: generic NodeTreeVisitor

2020-01-27 Thread gracinet (Georges Racinet)
gracinet edited the summary of this revision.
gracinet updated this revision to Diff 19634.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7794?vs=19327=19634

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7794/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7794

AFFECTED FILES
  rust/hg-core/src/revlog/nodemap.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/revlog/nodemap.rs 
b/rust/hg-core/src/revlog/nodemap.rs
--- a/rust/hg-core/src/revlog/nodemap.rs
+++ b/rust/hg-core/src/revlog/nodemap.rs
@@ -272,17 +272,82 @@
 ,
 prefix: NodePrefixRef<'p>,
 ) -> Result, NodeMapError> {
-let mut visit = self.len() - 1;
-for i in 0..prefix.len() {
-let nybble = prefix.get_nybble(i);
-match self[visit].get(nybble) {
-Element::None => return Ok(None),
-Element::Rev(r) => return Ok(Some(r)),
-Element::Block(idx) => visit = idx,
+for visit_item in self.visit(prefix) {
+if let Some(opt) = visit_item.final_revision() {
+return Ok(opt);
 }
 }
 Err(NodeMapError::MultipleResults)
 }
+
+fn visit<'n, 'p>(
+&'n self,
+prefix: NodePrefixRef<'p>,
+) -> NodeTreeVisitor<'n, 'p> {
+NodeTreeVisitor {
+nt: self,
+prefix: prefix,
+visit: self.len() - 1,
+nybble_idx: 0,
+done: false,
+}
+}
+}
+
+struct NodeTreeVisitor<'n, 'p> {
+nt: &'n NodeTree,
+prefix: NodePrefixRef<'p>,
+visit: usize,
+nybble_idx: usize,
+done: bool,
+}
+
+#[derive(Debug, PartialEq, Clone)]
+struct NodeTreeVisitItem {
+block_idx: usize,
+nybble: u8,
+element: Element,
+}
+
+impl<'n, 'p> Iterator for NodeTreeVisitor<'n, 'p> {
+type Item = NodeTreeVisitItem;
+
+fn next( self) -> Option {
+if self.done || self.nybble_idx >= self.prefix.len() {
+return None;
+}
+
+let nybble = self.prefix.get_nybble(self.nybble_idx);
+self.nybble_idx += 1;
+
+let visit = self.visit;
+let element = self.nt[visit].get(nybble);
+if let Element::Block(idx) = element {
+self.visit = idx;
+} else {
+self.done = true;
+}
+
+Some(NodeTreeVisitItem {
+block_idx: visit,
+nybble: nybble,
+element: element,
+})
+}
+}
+
+impl NodeTreeVisitItem {
+// Return `Some(opt)` if this item is final, with `opt` being the
+// `Revision` that it may represent.
+//
+// If the item is not terminal, return `None`
+fn final_revision() -> Option> {
+match self.element {
+Element::Block(_) => None,
+Element::Rev(r) => Some(Some(r)),
+Element::None => Some(None),
+}
+}
 }
 
 impl From> for NodeTree {



To: gracinet, #hg-reviewers, kevincox
Cc: martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7793: rust-nodemap: mutable NodeTree data structure

2020-01-27 Thread gracinet (Georges Racinet)
gracinet updated this revision to Diff 19633.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7793?vs=19326=19633

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7793/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7793

AFFECTED FILES
  rust/hg-core/src/revlog/nodemap.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/revlog/nodemap.rs 
b/rust/hg-core/src/revlog/nodemap.rs
--- a/rust/hg-core/src/revlog/nodemap.rs
+++ b/rust/hg-core/src/revlog/nodemap.rs
@@ -191,16 +191,31 @@
 }
 }
 
-/// A 16-radix tree with the root block at the end
+/// A mutable 16-radix tree with the root block logically at the end
+///
+/// Because of the append only nature of our node trees, we need to
+/// keep the original untouched and store new blocks separately.
+///
+/// The mutable root `Block` is kept apart so that we don't have to rebump
+/// it on each insertion.
 pub struct NodeTree {
 readonly: Box + Send>,
+growable: Vec,
+root: Block,
 }
 
 impl Index for NodeTree {
 type Output = Block;
 
 fn index(, i: usize) ->  {
-[i]
+let ro_len = self.readonly.len();
+if i < ro_len {
+[i]
+} else if i == ro_len + self.growable.len() {
+
+} else {
+[i - ro_len]
+}
 }
 }
 
@@ -222,12 +237,32 @@
 }
 
 impl NodeTree {
-fn len() -> usize {
-self.readonly.len()
+/// Initiate a NodeTree from an immutable slice-like of `Block`
+///
+/// We keep `readonly` and clone its root block if it isn't empty.
+fn new(readonly: Box + Send>) -> Self {
+let root = readonly
+.last()
+.map(|b| b.clone())
+.unwrap_or_else(|| Block::new());
+NodeTree {
+readonly: readonly,
+growable: Vec::new(),
+root: root,
+}
 }
 
+/// Total number of blocks
+fn len() -> usize {
+self.readonly.len() + self.growable.len() + 1
+}
+
+/// Implemented for completeness
+///
+/// A `NodeTree` always has at least the mutable root block.
+#[allow(dead_code)]
 fn is_empty() -> bool {
-self.len() == 0
+false
 }
 
 /// Main working method for `NodeTree` searches
@@ -237,9 +272,6 @@
 ,
 prefix: NodePrefixRef<'p>,
 ) -> Result, NodeMapError> {
-if self.is_empty() {
-return Ok(None);
-}
 let mut visit = self.len() - 1;
 for i in 0..prefix.len() {
 let nybble = prefix.get_nybble(i);
@@ -255,16 +287,18 @@
 
 impl From> for NodeTree {
 fn from(vec: Vec) -> Self {
-NodeTree {
-readonly: Box::new(vec),
-}
+Self::new(Box::new(vec))
 }
 }
 
 impl fmt::Debug for NodeTree {
 fn fmt(, f:  fmt::Formatter<'_>) -> fmt::Result {
-let blocks: &[Block] = &*self.readonly;
-write!(f, "readonly: {:?}", blocks)
+let readonly: &[Block] = &*self.readonly;
+write!(
+f,
+"readonly: {:?}, growable: {:?}, root: {:?}",
+readonly, self.growable, self.root
+)
 }
 }
 
@@ -365,7 +399,9 @@
 assert_eq!(
 format!("{:?}", nt),
 "readonly: \
- [{0: Rev(9)}, {0: Rev(0), 1: Rev(9)}, {0: Block(1), 1: Rev(1)}]"
+ [{0: Rev(9)}, {0: Rev(0), 1: Rev(9)}, {0: Block(1), 1: Rev(1)}], \
+ growable: [], \
+ root: {0: Block(1), 1: Rev(1)}",
 );
 }
 
@@ -374,7 +410,7 @@
 let mut idx: TestIndex = HashMap::new();
 pad_insert( idx, 1, "1234deadcafe");
 
-let nt = NodeTree::from(vec![block![1: Rev(1)]]);
+let nt = NodeTree::from(vec![block! {1: Rev(1)}]);
 assert_eq!(nt.find_hex(, "1")?, Some(1));
 assert_eq!(nt.find_hex(, "12")?, Some(1));
 assert_eq!(nt.find_hex(, "1234de")?, Some(1));
@@ -401,4 +437,25 @@
 assert_eq!(nt.find_hex(, "00"), Ok(Some(0)));
 assert_eq!(nt.find_hex(, "00a"), Ok(Some(0)));
 }
+
+#[test]
+fn test_mutated_find() -> Result<(), NodeMapError> {
+let mut idx = TestIndex::new();
+pad_insert( idx, 9, "012");
+pad_insert( idx, 0, "00a");
+pad_insert( idx, 2, "cafe");
+pad_insert( idx, 3, "15");
+pad_insert( idx, 1, "10");
+
+let nt = NodeTree {
+readonly: sample_nodetree().readonly,
+growable: vec![block![0: Rev(1), 5: Rev(3)]],
+root: block![0: Block(1), 1:Block(3), 12: Rev(2)],
+};
+assert_eq!(nt.find_hex(, "10")?, Some(1));
+assert_eq!(nt.find_hex(, "c")?, Some(2));
+assert_eq!(nt.find_hex(, "00")?, Some(0));
+assert_eq!(nt.find_hex(, "01")?, Some(9));
+Ok(())
+}
 }



To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list

D7791: rust-nodemap: NodeMap trait with simplest implementation

2020-01-27 Thread gracinet (Georges Racinet)
gracinet retitled this revision from "rust-nodemap: NodeMap trait with simplest 
implementor" to "rust-nodemap: NodeMap trait with simplest implementation".
gracinet updated this revision to Diff 19631.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7791?vs=19324=19631

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7791/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7791

AFFECTED FILES
  rust/hg-core/src/revlog/node.rs
  rust/hg-core/src/revlog/nodemap.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/revlog/nodemap.rs 
b/rust/hg-core/src/revlog/nodemap.rs
--- a/rust/hg-core/src/revlog/nodemap.rs
+++ b/rust/hg-core/src/revlog/nodemap.rs
@@ -12,8 +12,88 @@
 //! Following existing implicit conventions, the "nodemap" terminology
 //! is used in a more abstract context.
 
-use super::Revision;
+use super::{
+Node, NodeError, NodePrefix, NodePrefixRef, Revision, RevlogIndex,
+};
 use std::fmt;
+use std::ops::Deref;
+
+#[derive(Debug, PartialEq)]
+pub enum NodeMapError {
+MultipleResults,
+InvalidNodePrefix(NodeError),
+/// A `Revision` stored in the nodemap could not be found in the index
+RevisionNotInIndex(Revision),
+}
+
+impl From for NodeMapError {
+fn from(err: NodeError) -> Self {
+NodeMapError::InvalidNodePrefix(err)
+}
+}
+
+/// Mapping system from Mercurial nodes to revision numbers.
+///
+/// ## `RevlogIndex` and `NodeMap`
+///
+/// One way to think about their relationship is that
+/// the `NodeMap` is a prefix-oriented reverse index of the `Node` information
+/// carried by a [`RevlogIndex`].
+///
+/// Many of the methods in this trait take a `RevlogIndex` argument
+/// which is used for validation of their results. This index must naturally
+/// be the one the `NodeMap` is about, and it must be consistent.
+///
+/// Notably, the `NodeMap` must not store
+/// information about more `Revision` values than there are in the index.
+/// In these methods, an encountered `Revision` is not in the index, a
+/// [`RevisionNotInIndex`] error is returned.
+///
+/// In insert operations, the rule is thus that the `NodeMap` must always
+/// be updated after the `RevlogIndex`
+/// be updated first, and the `NodeMap` second.
+///
+/// [`RevisionNotInIndex`]: enum.NodeMapError.html#variant.RevisionNotInIndex
+/// [`RevlogIndex`]: ../trait.RevlogIndex.html
+pub trait NodeMap {
+/// Find the unique `Revision` having the given `Node`
+///
+/// If no Revision matches the given `Node`, `Ok(None)` is returned.
+fn find_node(
+,
+index:  RevlogIndex,
+node: ,
+) -> Result, NodeMapError> {
+self.find_bin(index, node.into())
+}
+
+/// Find the unique Revision whose `Node` starts with a given binary prefix
+///
+/// If no Revision matches the given prefix, `Ok(None)` is returned.
+///
+/// If several Revisions match the given prefix, a [`MultipleResults`]
+/// error is returned.
+fn find_bin<'a>(
+,
+idx:  RevlogIndex,
+prefix: NodePrefixRef<'a>,
+) -> Result, NodeMapError>;
+
+/// Find the unique Revision whose `Node` hexadecimal string representation
+/// starts with a given prefix
+///
+/// If no Revision matches the given prefix, `Ok(None)` is returned.
+///
+/// If several Revisions match the given prefix, a [`MultipleResults`]
+/// error is returned.
+fn find_hex(
+,
+idx:  RevlogIndex,
+prefix: ,
+) -> Result, NodeMapError> {
+self.find_bin(idx, NodePrefix::from_hex(prefix)?.borrow())
+}
+}
 
 /// Low level NodeTree [`Blocks`] elements
 ///
@@ -110,9 +190,86 @@
 }
 }
 
+/// A 16-radix tree with the root block at the end
+pub struct NodeTree {
+readonly: Box + Send>,
+}
+
+/// Return `None` unless the `Node` for `rev` has given prefix in `index`.
+fn has_prefix_or_none<'p>(
+idx:  RevlogIndex,
+prefix: NodePrefixRef<'p>,
+rev: Revision,
+) -> Result, NodeMapError> {
+idx.node(rev)
+.ok_or_else(|| NodeMapError::RevisionNotInIndex(rev))
+.map(|node| {
+if prefix.is_prefix_of(node) {
+Some(rev)
+} else {
+None
+}
+})
+}
+
+impl NodeTree {
+/// Main working method for `NodeTree` searches
+///
+/// This partial implementation lacks special cases for NULL_REVISION
+fn lookup<'p>(
+,
+prefix: NodePrefixRef<'p>,
+) -> Result, NodeMapError> {
+let blocks: &[Block] = &*self.readonly;
+if blocks.is_empty() {
+return Ok(None);
+}
+let mut visit = blocks.len() - 1;
+for i in 0..prefix.len() {
+let nybble = prefix.get_nybble(i);
+match blocks[visit].get(nybble) {
+Element::None => return Ok(None),
+Element::Rev(r) => return Ok(Some(r)),
+Element::Block(idx) => 

D7792: rust-nodemap: abstracting the indexing

2020-01-27 Thread gracinet (Georges Racinet)
gracinet updated this revision to Diff 19632.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7792?vs=19325=19632

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7792/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7792

AFFECTED FILES
  rust/hg-core/src/revlog/nodemap.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/revlog/nodemap.rs 
b/rust/hg-core/src/revlog/nodemap.rs
--- a/rust/hg-core/src/revlog/nodemap.rs
+++ b/rust/hg-core/src/revlog/nodemap.rs
@@ -17,6 +17,7 @@
 };
 use std::fmt;
 use std::ops::Deref;
+use std::ops::Index;
 
 #[derive(Debug, PartialEq)]
 pub enum NodeMapError {
@@ -195,6 +196,14 @@
 readonly: Box + Send>,
 }
 
+impl Index for NodeTree {
+type Output = Block;
+
+fn index(, i: usize) ->  {
+[i]
+}
+}
+
 /// Return `None` unless the `Node` for `rev` has given prefix in `index`.
 fn has_prefix_or_none<'p>(
 idx:  RevlogIndex,
@@ -213,6 +222,14 @@
 }
 
 impl NodeTree {
+fn len() -> usize {
+self.readonly.len()
+}
+
+fn is_empty() -> bool {
+self.len() == 0
+}
+
 /// Main working method for `NodeTree` searches
 ///
 /// This partial implementation lacks special cases for NULL_REVISION
@@ -220,14 +237,13 @@
 ,
 prefix: NodePrefixRef<'p>,
 ) -> Result, NodeMapError> {
-let blocks: &[Block] = &*self.readonly;
-if blocks.is_empty() {
+if self.is_empty() {
 return Ok(None);
 }
-let mut visit = blocks.len() - 1;
+let mut visit = self.len() - 1;
 for i in 0..prefix.len() {
 let nybble = prefix.get_nybble(i);
-match blocks[visit].get(nybble) {
+match self[visit].get(nybble) {
 Element::None => return Ok(None),
 Element::Rev(r) => return Ok(Some(r)),
 Element::Block(idx) => visit = idx,



To: gracinet, #hg-reviewers, kevincox
Cc: durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7790: rust-node: handling binary Node prefix

2020-01-27 Thread gracinet (Georges Racinet)
gracinet edited the summary of this revision.
gracinet updated this revision to Diff 19630.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7790?vs=19039=19630

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7790/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7790

AFFECTED FILES
  rust/hg-core/src/revlog.rs
  rust/hg-core/src/revlog/node.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/revlog/node.rs b/rust/hg-core/src/revlog/node.rs
--- a/rust/hg-core/src/revlog/node.rs
+++ b/rust/hg-core/src/revlog/node.rs
@@ -62,6 +62,7 @@
 #[derive(Debug, PartialEq)]
 pub enum NodeError {
 ExactLengthRequired(usize, String),
+PrefixTooLong(String),
 HexError(FromHexError, String),
 }
 
@@ -119,17 +120,119 @@
 }
 }
 
-impl From<(FromHexError, )> for NodeError {
-fn from(err_offender: (FromHexError, )) -> Self {
+impl> From<(FromHexError, T)> for NodeError {
+fn from(err_offender: (FromHexError, T)) -> Self {
 let (err, offender) = err_offender;
 match err {
 FromHexError::InvalidStringLength => {
 NodeError::ExactLengthRequired(
 NODE_NYBBLES_LENGTH,
-offender.to_string(),
+offender.as_ref().to_owned(),
 )
 }
-_ => NodeError::HexError(err, offender.to_string()),
+_ => NodeError::HexError(err, offender.as_ref().to_owned()),
+}
+}
+}
+
+/// The beginning of a binary revision SHA.
+///
+/// Since it can potentially come from an hexadecimal representation with
+/// odd length, it needs to carry around whether the last 4 bits are relevant
+/// or not.
+#[derive(Debug, PartialEq)]
+pub struct NodePrefix {
+buf: Vec,
+is_odd: bool,
+}
+
+impl NodePrefix {
+/// Convert from hexadecimal string representation
+///
+/// Similarly to `hex::decode`, can be used with Unicode string types
+/// (`String`, ``) as well as bytes.
+///
+/// To be used in FFI and I/O only, in order to facilitate future
+/// changes of hash format.
+pub fn from_hex(hex: impl AsRef<[u8]>) -> Result {
+let hex = hex.as_ref();
+let len = hex.len();
+if len > NODE_NYBBLES_LENGTH {
+return Err(NodeError::PrefixTooLong(
+String::from_utf8_lossy(hex).to_owned().to_string(),
+));
+}
+
+let is_odd = len % 2 == 1;
+let even_part = if is_odd { [..len - 1] } else { hex };
+let mut buf: Vec = Vec::from_hex(_part)
+.map_err(|e| (e, String::from_utf8_lossy(hex)))?;
+
+if is_odd {
+let latest_char = char::from(hex[len - 1]);
+let latest_nybble = latest_char.to_digit(16).ok_or_else(|| {
+(
+FromHexError::InvalidHexCharacter {
+c: latest_char,
+index: len - 1,
+},
+String::from_utf8_lossy(hex),
+)
+})? as u8;
+buf.push(latest_nybble << 4);
+}
+Ok(NodePrefix { buf, is_odd })
+}
+
+pub fn borrow() -> NodePrefixRef {
+NodePrefixRef {
+buf: ,
+is_odd: self.is_odd,
+}
+}
+}
+
+#[derive(Clone, Debug, PartialEq)]
+pub struct NodePrefixRef<'a> {
+buf: &'a [u8],
+is_odd: bool,
+}
+
+impl<'a> NodePrefixRef<'a> {
+pub fn len() -> usize {
+if self.is_odd {
+self.buf.len() * 2 - 1
+} else {
+self.buf.len() * 2
+}
+}
+
+pub fn is_prefix_of(, node: ) -> bool {
+if self.is_odd {
+let buf = self.buf;
+let last_pos = buf.len() - 1;
+node.data.starts_with(buf.split_at(last_pos).0)
+&& node.data[last_pos] >> 4 == buf[last_pos] >> 4
+} else {
+node.data.starts_with(self.buf)
+}
+}
+
+/// Retrieve the `i`th half-byte from the prefix.
+///
+/// This is also the `i`th hexadecimal digit in numeric form,
+/// also called a [nybble](https://en.wikipedia.org/wiki/Nibble).
+pub fn get_nybble(, i: usize) -> u8 {
+get_nybble(self.buf, i)
+}
+}
+
+/// A shortcut for full `Node` references
+impl<'a> From<&'a Node> for NodePrefixRef<'a> {
+fn from(node: &'a Node) -> Self {
+NodePrefixRef {
+buf: ,
+is_odd: false,
 }
 }
 }
@@ -188,4 +291,74 @@
 fn test_node_encode_hex() {
 assert_eq!(sample_node().encode_hex(), sample_node_hex());
 }
+
+#[test]
+fn test_prefix_from_hex() -> Result<(), NodeError> {
+assert_eq!(
+NodePrefix::from_hex("0e1")?,
+NodePrefix {
+buf: vec![14, 16],
+is_odd: true
+}
+);
+assert_eq!(
+NodePrefix::from_hex("0e1a")?,
+NodePrefix {
+  

D7789: rust-revlog: a trait for the revlog index

2020-01-27 Thread gracinet (Georges Racinet)
gracinet updated this revision to Diff 19629.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7789?vs=19038=19629

BRANCH
  default

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7789/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7789

AFFECTED FILES
  rust/hg-core/src/revlog.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/revlog.rs b/rust/hg-core/src/revlog.rs
--- a/rust/hg-core/src/revlog.rs
+++ b/rust/hg-core/src/revlog.rs
@@ -40,3 +40,17 @@
 ParentOutOfRange(Revision),
 WorkingDirectoryUnsupported,
 }
+
+/// The Mercurial Revlog Index
+///
+/// This is currently limited to the minimal interface that is needed for
+/// the [`nodemap`](nodemap/index.html) module
+pub trait RevlogIndex {
+/// Total number of Revisions referenced in this index
+fn len() -> usize;
+
+/// Return a reference to the Node or `None` if rev is out of bounds
+///
+/// `NULL_REVISION` is not considered to be out of bounds.
+fn node(, rev: Revision) -> Option<>;
+}



To: gracinet, #hg-reviewers, martinvonz
Cc: martinvonz, durin42, kevincox, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


mercurial@44180: 6 new changesets

2020-01-27 Thread Mercurial Commits
6 new changesets in mercurial:

https://www.mercurial-scm.org/repo/hg/rev/2b6632d64328
changeset:   44175:2b6632d64328
user:Martin von Zweigbergk 
date:Fri Jan 24 16:00:54 2020 -0800
summary: merge: check that there are no conflicts after --abort

https://www.mercurial-scm.org/repo/hg/rev/6744859ff3ee
changeset:   44176:6744859ff3ee
user:Martin von Zweigbergk 
date:Fri Jan 24 16:05:11 2020 -0800
summary: merge: call hg.abortmerge() directly and return early

https://www.mercurial-scm.org/repo/hg/rev/1850066f9e36
changeset:   44177:1850066f9e36
user:Martin von Zweigbergk 
date:Fri Jan 24 17:49:21 2020 -0800
summary: merge: don't auto-pick destination with `hg merge 'wdir()'`

https://www.mercurial-scm.org/repo/hg/rev/bd4f666b55a7
changeset:   44178:bd4f666b55a7
user:Martin von Zweigbergk 
date:Fri Jan 24 16:07:42 2020 -0800
summary: merge: drop now-unused "abort" argument from hg.merge()

https://www.mercurial-scm.org/repo/hg/rev/293cab97a766
changeset:   44179:293cab97a766
user:Martin von Zweigbergk 
date:Fri Jan 24 15:18:19 2020 -0800
summary: merge: replace a repo.lookup('.') by more typical repo['.'].node()

https://www.mercurial-scm.org/repo/hg/rev/d84420232492
changeset:   44180:d84420232492
bookmark:@
tag: tip
user:Martin von Zweigbergk 
date:Fri Jan 24 17:10:45 2020 -0800
summary: pathauditor: drop a redundant call to bytes.lower()

-- 
Repository URL: https://www.mercurial-scm.org/repo/hg
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7906: rebase: define base in only place in defineparents()

2020-01-27 Thread martinvonz (Martin von Zweigbergk)
martinvonz added inline comments.

INLINE COMMENTS

> pulkit wrote in rebase.py:1811
> I am not sure why we are doing `bases[i] = bases[0]` here? (TBH I didn't try 
> to understand the whole logic, mostly only the diff)

Doing it only for consistency with `newps`. `bases` and `newps` should be kept 
in sync (we don't do that in one other place, but that's pretty weird, IMO). So 
I think it's clearer this way even though it's unnecessary.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7906/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7906

To: martinvonz, #hg-reviewers
Cc: pulkit, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7906: rebase: define base in only place in defineparents()

2020-01-27 Thread pulkit (Pulkit Goyal)
pulkit added inline comments.

INLINE COMMENTS

> rebase.py:1811
> +newps[0], newps[i] = newps[i], newps[0]
> +bases[0], bases[i] = bases[i], bases[0]
> +

I am not sure why we are doing `bases[i] = bases[0]` here? (TBH I didn't try to 
understand the whole logic, mostly only the diff)

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7906/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7906

To: martinvonz, #hg-reviewers
Cc: pulkit, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: Nlnet funding for transitioning out of SHA-1

2020-01-27 Thread Pierre-Yves David



On 1/26/20 8:57 PM, Joerg Sonnenberger wrote:

On Wed, Jan 15, 2020 at 05:53:06PM +0100, Raphaël Gomès wrote:

Right now I can see the following high level steps

     - Update the core code to be able to deal with multiple hashing
functions
     - Update the network protocol to deal with multiple hashing functions
     - Update the on-disk format to deal with larger hashes
     - How to deal with backwards and forwards compatibility with regards to
both repositories and client/server (wire protocol changes, etc.)
     - How changing hashing functions impacts the user experience (from
additional steps to UI getting broken)
     - Help extensions to migrate if need be
     - Actually select a new hash function

Am I missing anything? How do you all feel about this?


I'd like to take a step back and start from the needs to be supported.

(1) It must be possible to create a new repository that uses a modern
cryptographic hash function both on-disk and when communicating with
other servers.

(2) It must be possible to migrate an existing repository without
invalidating references to commits. It should allow as much
interoperability with old wire clients as possible, at the very least pull
support and optionally write support. It should not be necessary to
support direct repository access from old clients. It must be possible
to schedule the migration on the timeframe of the user, not enforced as
part of an update.

The most important implication is that we need to get support for
secondary identifiers. That's the one big part currently missing in
revlog. With that in place, a lot of the other things follow as they can
transparently convert old and new node references.


I think it is important to not try to solve this questions while setting 
up a grant request. The grant request should prepare time to think and 
solves these questions, but that grant need to be finalized by Friday 
(and Raphaël is Off today, (happy birthday Raphaël)). So we should plan 
to solve theses questions, but not make it a pre-requires.


--
Pierre-Yves David
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7993: merge: use check_incompatible_arguments() for --abort

2020-01-27 Thread yuja (Yuya Nishihara)
yuja added a comment.


  > +cmdutil.check_incompatible_arguments(opts, b'abort', b'rev', 
b'preview')
  
  It's a bit late, but I feel the arguments of `check_incompatible_arguments()`
  is confusing since `b'abort'` isn't the same kind of arguments as the others.
  I think `(opts, b'abort', [b'rev', b'preview'])` is more explicit.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7993/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7993

To: martinvonz, #hg-reviewers, pulkit
Cc: yuja, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: D7993: merge: use check_incompatible_arguments() for --abort

2020-01-27 Thread Yuya Nishihara
> +cmdutil.check_incompatible_arguments(opts, b'abort', b'rev', b'preview')

It's a bit late, but I feel the arguments of `check_incompatible_arguments()`
is confusing since `b'abort'` isn't the same kind of arguments as the others.
I think `(opts, b'abort', [b'rev', b'preview'])` is more explicit.
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


mercurial@44174: 6 new changesets (5 on stable)

2020-01-27 Thread Mercurial Commits
6 new changesets (5 on stable) in mercurial:

https://www.mercurial-scm.org/repo/hg/rev/5d85e9ddc7b9
changeset:   44169:5d85e9ddc7b9
branch:  stable
parent:  44157:624fe53ce1e7
user:Matt Harbison 
date:Sat Jan 25 01:06:46 2020 -0500
summary: phabricator: fix a crash when submitting binaries (issue6260)

https://www.mercurial-scm.org/repo/hg/rev/0ab651b5f77c
changeset:   44170:0ab651b5f77c
branch:  stable
user:Matt Harbison 
date:Sat Jan 25 00:16:04 2020 -0500
summary: copyright: update to 2020

https://www.mercurial-scm.org/repo/hg/rev/8d653abed861
changeset:   44171:8d653abed861
branch:  stable
user:Gregory Szorc 
date:Fri Jan 24 20:21:53 2020 -0800
summary: wix: more robust normalization of RC version components

https://www.mercurial-scm.org/repo/hg/rev/2251b6cde170
changeset:   44172:2251b6cde170
branch:  stable
user:Gregory Szorc 
date:Fri Jan 24 20:24:29 2020 -0800
summary: wix: always normalize version string

https://www.mercurial-scm.org/repo/hg/rev/62111bc5ff87
changeset:   44173:62111bc5ff87
branch:  stable
user:Gregory Szorc 
date:Fri Jan 24 20:27:59 2020 -0800
summary: wix: use original version string for MSI filename

https://www.mercurial-scm.org/repo/hg/rev/75c2ca094d3a
changeset:   44174:75c2ca094d3a
bookmark:@
tag: tip
parent:  44168:1cb7ae9b0071
user:Martin von Zweigbergk 
date:Fri Jan 24 15:07:44 2020 -0800
summary: merge: use check_incompatible_arguments() for --abort

-- 
Repository URL: https://www.mercurial-scm.org/repo/hg
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7932: [RFC]debugbackups: introduce command to interact with strip backups

2020-01-27 Thread marmoute (Pierre-Yves David)
marmoute added a comment.


  Can we call it `debugbackupbundle` for clarify?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7932/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7932

To: pulkit, #hg-reviewers
Cc: marmoute, durin42, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7967: exchange: recognize changegroup3 bundles in `getbundlespec()`

2020-01-27 Thread marmoute (Pierre-Yves David)
marmoute added a comment.


  I can confirm the spec version number are different to the changegroup 
version number. For the rest. I'll try to have a look soon.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7967/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7967

To: mharbison72, #hg-reviewers
Cc: marmoute, pulkit, joerg.sonnenberger, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel