Re: New revlog format, plan page

2021-01-13 Thread Pierre-Yves David



On 1/11/21 4:14 PM, Joerg Sonnenberger wrote:

On Mon, Jan 11, 2021 at 01:12:30PM +0100, Pierre-Yves David wrote:

(1) Some of the current caches we have would fit well in such an index
* The hgtagsfnodes cache: taking 4 bytes to cache the `.hgtags` revision
number associated with a changelog revision. (This will require some
bookkeeping while adding/stripping),
* the `rbc-revs-v1`: using an integer (4 bytes) and an external list to store
the branch on which each revision is,
* (probably another 4 bytes to store the sub-branch/topic,)


I'd be reluctant to move them into the revlog. If anything, it would
call for a more variant friendly format specification.


I am not sure what you mean by "a more variant friendly format specification".


Ultimately, we
should figure out first how "hot" the various caches are before deciding
to tie them tighter to changelog.


The branch cache is pretty hot, as we often require this information, 
including to warm other caches and data. The tags one is not accessed 
that often, but the cache is very important for performance on large 
repositories, so we will keep needing it. Having it in the changelog index 
makes its lifetime much simpler.



Also, at the very least in the case of
rbc-revs-v1, it would also prevent some useful optimisations.


This is not different from the current situation, except that we no longer 
have to deal with a separate file with non-trivial cache validation and 
lifetime. If we want to speed up the "which revisions are in this 
branch" question, we will need some other index anyway, and that can 
come later.



When we
sort out the cache invalidation story, having a strict linear mapping of
32bit entries would make queries for all revisions of a given branch
easier than if it is part of a more complex data structure.


The data remains linear, it just has extra fixed-size data in between. 
This should not be a problem, should it?
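
A minimal sketch of that point, assuming a hypothetical index layout in which every fixed-size entry carries a 4-byte branch id at a fixed offset. The entry size and field offset below are made up for illustration and are not the actual revlog format; the scan stays a simple strided walk over the index.

```python
import struct

ENTRY_SIZE = 72       # hypothetical: 64-byte entry plus two 4-byte cache fields
BRANCH_OFFSET = 64    # hypothetical offset of the branch id inside an entry
BRANCH_FIELD = struct.Struct(">I")


def revs_on_branch(index_bytes, branch_id):
    """Yield the revision numbers whose branch field equals branch_id."""
    count = len(index_bytes) // ENTRY_SIZE
    for rev in range(count):
        offset = rev * ENTRY_SIZE + BRANCH_OFFSET
        (value,) = BRANCH_FIELD.unpack_from(index_bytes, offset)
        if value == branch_id:
            yield rev


def make_index(branch_ids):
    """Build a fake index with zeroed payloads, for demonstration only."""
    out = bytearray()
    for bid in branch_ids:
        entry = bytearray(ENTRY_SIZE)
        BRANCH_FIELD.pack_into(entry, BRANCH_OFFSET, bid)
        out += entry
    return bytes(out)


print(list(revs_on_branch(make_index([0, 1, 0, 2, 1]), 1)))
```

The walk touches more bytes than a dense array of 32-bit entries would, which is the trade-off being discussed; a dedicated per-branch index could still be added later.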



(2) Some cache key mechanism. Right now a lot of caches validate their
content using a (tip-rev, tip-node) pair. That pair is fragile as it does
not guarantee that the content before the tip is the same. We could have "some"
bytes that gather some kind of accumulated value from the previously added
nodes. It does not have to be too many bytes, as the (tip-node, tip-rev,
cache-key) should be good enough. We can probably build it using a series of
shift and xor of the hash we are adding.


See my mail from Dec, 14th. Having done a few more things in the mean
time, I'd add phases and obslog as cache keys on top and that's
something we don't handle well right now at all. At that point the
current invalidation strategy just becomes way too fragile.


I can't find said message, do you have a link?

Here, I am talking about cache content for the changelog only. 
I think we both agree that for content that depends on other things, keys 
for those other things need to be put to use.





--
Pierre-Yves David
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: New revlog format, plan page

2021-01-11 Thread Joerg Sonnenberger
On Mon, Jan 11, 2021 at 01:12:30PM +0100, Pierre-Yves David wrote:
> (1) Some of the current caches we have would fit well in such an index
> * The hgtagsfnodes cache: taking 4 bytes to cache the `.hgtags` revision
> number associated with a changelog revision. (This will require some
> bookkeeping while adding/stripping),
> * the `rbc-revs-v1`: using an integer (4 bytes) and an external list to store
> the branch on which each revision is,
> * (probably another 4 bytes to store the sub-branch/topic,)

I'd be reluctant to move them into the revlog. If anything, it would
call for a more variant friendly format specification. Ultimately, we
should figure out first how "hot" the various caches are before deciding
to tie them tighter to changelog. Also, at the very least in the case of
rbc-revs-v1, it would also prevent some useful optimisations. When we
sort out the cache invalidation story, having a strict linear mapping of
32bit entries would make queries for all revisions of a given branch
easier than if it is part of a more complex data structure.

> (2) Some cache key mechanism. Right now a lot of caches validate their
> content using a (tip-rev, tip-node) pair. That pair is fragile as it does
> not guarantee that the content before the tip is the same. We could have "some"
> bytes that gather some kind of accumulated value from the previously added
> nodes. It does not have to be too many bytes, as the (tip-node, tip-rev,
> cache-key) should be good enough. We can probably build it using a series of
> shift and xor of the hash we are adding.

See my mail from Dec, 14th. Having done a few more things in the mean
time, I'd add phases and obslog as cache keys on top and that's
something we don't handle well right now at all. At that point the
current invalidation strategy just becomes way too fragile.

Joerg


Re: New revlog format, plan page

2021-01-11 Thread Pierre-Yves David

I finally remembered a couple of things I thought we could move into an index.

(1) Some of the current caches we have would fit well in such an index
* The hgtagsfnodes cache: taking 4 bytes to cache the `.hgtags` revision 
number associated with a changelog revision. (This will require some 
bookkeeping while adding/stripping),
* the `rbc-revs-v1`: using an integer (4 bytes) and an external list to 
store the branch on which each revision is,
* (probably another 4 bytes to store the sub-branch/topic,)

(2) Some cache key mechanism. Right now a lot of caches validate their 
content using a (tip-rev, tip-node) pair. That pair is fragile as it 
does not guarantee that the content before the tip is the same. We could 
have "some" bytes that gather some kind of accumulated value from the 
previously added nodes. It does not have to be too many bytes, as the 
(tip-node, tip-rev, cache-key) should be good enough. We can probably 
build it using a series of shift and xor of the hash we are adding.
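
A minimal sketch of such an accumulated key, assuming a 64-bit key, a rotate-by-one "shift", and folding in only the first 8 bytes of each node. All of those choices are illustrative assumptions, not a proposed format, and the result is not meant to resist deliberate collisions.

```python
import hashlib

KEY_BITS = 64                      # assumed key width
MASK = (1 << KEY_BITS) - 1


def update_key(key, node):
    """Fold one node into the running key: rotate left by one bit, then xor."""
    rotated = ((key << 1) | (key >> (KEY_BITS - 1))) & MASK
    return rotated ^ int.from_bytes(node[:KEY_BITS // 8], "big")


def cache_key(nodes):
    """Accumulate the key over all nodes, in revision order."""
    key = 0
    for node in nodes:
        key = update_key(key, node)
    return key


# Two histories that share the same tip node but differ before it.
a, b, c, tip = (hashlib.sha1(x).digest() for x in (b"a", b"b", b"c", b"tip"))
print(cache_key([a, b, tip]) == cache_key([a, c, tip]))
```

The rotation is what makes the key depend on order: a plain xor of all nodes would give the same value for any permutation of the same set.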


Note that with this, the index is heavily biased toward the changelog. 
So it is probably worth having distinct formats: one meant for the changelog, 
where we can reclaim the bytes related to filelog/manifestlog (linkrev and 
unified-revlog ID).


On 1/5/21 4:38 PM, Raphaël Gomès wrote:

Hi all,

During the last (virtual) sprint, a lot of us spoke about the need for a 
format change of the revlog to overcome some of its limitations.


I've opened a very much draft plan page [1] to try to list all the 
things we want to do in that version and try to figure out an efficient 
new format.


I'm aware that the v2 is already planned, but I figured that we can just 
merge that (seemingly) paused effort and this new one.


I wish you all a nice 2021!
Raphaël

[1] https://www.mercurial-scm.org/wiki/RevlogV2Plan



--
Pierre-Yves David


Re: New revlog format, plan page

2021-01-07 Thread Pierre-Yves David



On 1/7/21 8:52 PM, Pierre-Yves David wrote:



On 1/5/21 7:33 PM, Joerg Sonnenberger wrote:

On Tue, Jan 05, 2021 at 04:38:20PM +0100, Raphaël Gomès wrote:
I've opened a very much draft plan page [1] to try to list all the things we
want to do in that version and try to figure out an efficient new format.


"No support for hash version"

I don't think that point really matters. The plan for the hash
migration allows them in theory to coexist fully on the revlog layer and
the main problems for mixing them are on the changeset/manifest layer
anyway. That is, any migration strategy will IMO rewrite all revlogs to
the newer hash anyway and only keep a secondary index for changesets and
maybe manifests.


I agree here, the hash used will likely be defined at repository level 
(or at least revlog level).




"No support for sidedata"

My big design level concern is that revlog ATM is optimized for fast
integer indexing and append-only storage. At least for some sidedata
use cases I have, that is an ill fit.


The current spirit for sidedata is to have


Looks like this sentence got interru…

The current spirit for sidedata is for it to contain computed data 
that is inherent to the changeset (or revision in general) and can be 
computed once and for all when the changeset is added.


The storage proposed in revlog v2 requires the data to be added at 
"revision addition time" but does not require the sidedata to be next to 
the changeset data. This simplifies operations that need the rest of the 
changegroup (manifest, filelog) to be added before computation.


It also means one could "update" the sidedata by "simply" rewriting the 
index.


I am sympathetic to a more generic storage for more volatile data. 
However, the current proposal is good enough for the current goal and a 
couple of others, and it is quite simple to implement. So the plan is to go with 
it for now.



"No support for unified revlog"

IMO this should be the driving feature. The biggest issue for me is that
it creates two challenges that didn't exist so far:
(1) Inter-file patches and how they interact with the wire protocol


I am not worried here: inter-file patches should be as simple as using 
a delta base pointing to the content of another file. And regarding the 
wire protocol, we are already very bad at dealing with deltas to 
non-parents, so we should be about as bad.



(2) Identical revisions stored in different places.


The broad plan of unified revlog is to store things using a pair of 
identifiers: a content hash (e.g. filenodeid) and a content identifier. The 
two main options for the content identifier are:


* using a hash of the target content (taking 32 bits, "expensive" to search),
* using some integer identifier and an associated side mapping for 
content → ID mapping (and over-the-wire translation to non-local 
identifiers).


The second option seems more time and space efficient, so I am leaning 
toward it.


Either way, I think similar content (i.e. same nodeid) should probably 
be stored twice in the index to keep the current properties; we can reuse the 
data segment, however. So uniqueness and indexing would happen using 
the (nodeid, contentid) pairs.
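
A toy in-memory sketch of that second option, assuming an integer content id handed out by a side mapping and index entries keyed by (nodeid, contentid). The class and method names are hypothetical; nothing here reflects an actual on-disk layout.

```python
class UnifiedIndex:
    """Toy model: one index entry per (nodeid, contentid), shared data segments."""

    def __init__(self):
        self._content_ids = {}   # content name (e.g. file path) -> integer id
        self._entries = {}       # (nodeid, contentid) -> (offset, length)
        self._segments = {}      # nodeid -> (offset, length) of the stored data
        self._data = bytearray()

    def content_id(self, name):
        """Return the integer id for name, allocating one if needed."""
        return self._content_ids.setdefault(name, len(self._content_ids))

    def add(self, name, nodeid, data):
        """Index a revision; the data is written only once per nodeid."""
        key = (nodeid, self.content_id(name))
        if key not in self._entries:
            if nodeid not in self._segments:
                self._segments[nodeid] = (len(self._data), len(data))
                self._data += data
            self._entries[key] = self._segments[nodeid]
        return key

    def read(self, name, nodeid):
        """Return the data for a known (name, nodeid); KeyError otherwise."""
        offset, length = self._entries[(nodeid, self._content_ids[name])]
        return bytes(self._data[offset:offset + length])


idx = UnifiedIndex()
idx.add("a.txt", b"node-1", b"hello")
idx.add("b.txt", b"node-1", b"hello")   # second index entry, same data segment
print(len(idx._entries), len(idx._data))
```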





"No support for larger files"

Supporting large revlog files is sensible and having a store for
design-challenged file systems might be necessary. Microsoft, I'm
looking at you. Otherwise the concern is space use in the revlog file
and RAM use during operations. I don't think the latter is as big an
issue now as it was 15 years ago, but the former is real. But it might
be a good point in time to just go for 64bit offsets by default...


Right now, offsets are 6 bytes, so we can use a revlog.d of up to 281 TB; that 
seems good enough. The main "limitation" is about the file content, 
currently limited to 4GB. Given that we hold these in RAM for now, I 
don't think we need to bump it. We can bump it when introducing smarter 
RAM handling for such files.
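
The arithmetic behind those two limits, as a quick check. The 6-byte offset is stated above; reading the 4GB content limit as a 4-byte length field is an assumption made for this sketch.

```python
OFFSET_BYTES = 6   # offset width stated above
LENGTH_BYTES = 4   # assumed width of the per-revision length field

max_data_file = 2 ** (8 * OFFSET_BYTES)   # bytes addressable with a 6-byte offset
max_revision = 2 ** (8 * LENGTH_BYTES)    # bytes describable with a 4-byte length

print(max_data_file / 10 ** 12)   # roughly 281.47 decimal terabytes
print(max_revision / 2 ** 30)     # 4.0 GiB
```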




--
Pierre-Yves David


Re: New revlog format, plan page

2021-01-07 Thread Pierre-Yves David



On 1/5/21 7:33 PM, Joerg Sonnenberger wrote:

On Tue, Jan 05, 2021 at 04:38:20PM +0100, Raphaël Gomès wrote:

I've opened a very much draft plan page [1] to try to list all the things we
want to do in that version and try to figure out an efficient new format.


"No support for hash version"

I don't think that point really matters. The plan for the hash
migration allows them in theory to coexist fully on the revlog layer and
the main problems for mixing them are on the changeset/manifest layer
anyway. That is, any migration strategy will IMO rewrite all revlogs to
the newer hash anyway and only keep a secondary index for changesets and
maybe manifests.


I agree here, the hash used will likely be defined at repository level 
(or at least revlog level).




"No support for sidedata"

My big design level concern is that revlog ATM is optimized for fast
integer indexing and append-only storage. At least for some sidedata
use cases I have, that is an ill fit.


The current spirit for sidedata is to have



"No support for unified revlog"

IMO this should be the driving feature. The biggest issue for me is that
it creates two challenges that didn't exist so far:
(1) Inter-file patches and how they interact with the wire protocol


I am not worried here: inter-file patches should be as simple as using 
a delta base pointing to the content of another file. And regarding the 
wire protocol, we are already very bad at dealing with deltas to 
non-parents, so we should be about as bad.



(2) Identical revisions stored in different places.


The broad plan of unified revlog is to store things using a pair of 
identifiers: a content hash (e.g. filenodeid) and a content identifier. The 
two main options for the content identifier are:


* using a hash of the target content (taking 32 bits, "expensive" to search),
* using some integer identifier and an associated side mapping for 
content → ID mapping (and over-the-wire translation to non-local 
identifiers).


The second option seems more time and space efficient, so I am leaning 
toward it.


Either way, I think similar content (i.e. same nodeid) should probably 
be stored twice in the index to keep the current properties; we can reuse the 
data segment, however. So uniqueness and indexing would happen using 
the (nodeid, contentid) pairs.





"No support for larger files"

Supporting large revlog files is sensible and having a store for
design-challenged file systems might be necessary. Microsoft, I'm
looking at you. Otherwise the concern is space use in the revlog file
and RAM use during operations. I don't think the latter is as big an
issue now as it was 15 years ago, but the former is real. But it might
be a good point in time to just go for 64bit offsets by default...


Right now, offsets are 6 bytes, so we can use a revlog.d of up to 281 TB; that 
seems good enough. The main "limitation" is about the file content, 
currently limited to 4GB. Given that we hold these in RAM for now, I 
don't think we need to bump it. We can bump it when introducing smarter 
RAM handling for such files.


--
Pierre-Yves David


Re: New revlog format, plan page

2021-01-07 Thread Josef 'Jeff' Sipek
On Thu, Jan 07, 2021 at 19:22:23 +0100, Joerg Sonnenberger wrote:
> On Thu, Jan 07, 2021 at 12:04:06PM -0500, Josef 'Jeff' Sipek wrote:
> > On Tue, Jan 05, 2021 at 19:33:36 +0100, Joerg Sonnenberger wrote:

... snip things I agree with and have nothing to add to ...

> > > "No support for sidedata"
> > > 
> > > My big design level concern is that revlog ATM is optimized for fast
> > > integer indexing and append-only storage.
> > 
> > This is an interesting point.  What *are* the most common revlog operations?
> > It probably varies between repos, but I suspect that they are mostly reads
> > rather than writes.  As a consequence, a good revlog format would optimize
> > for the common case (without making the less common cases completely suck).
> 
> The problem is that anything that needs inplace writes is a lot more
> difficult to get right for on-disk consistency and for concurrent
> read-access.

Yes, it definitely is harder.  Depending on the expected workloads and the
exact design, it may or may not be worth the effort.

> Normal revision data does not change, by design. That's
> quite different from any unversioned metadata. This can include
> signatures for example, it could include obsolescence data etc.
> Separating mutable and immutable data is a natural design choice.

Yes, this is a variant of the age old "separate hot and cold data".

If mutable and immutable data is considered independently, each type can be
stored in a different format - each optimized in its own way for the common
case.  This however will likely still require some form of transaction and
synchronization code to guarantee sufficiently atomic updates in the face of
errors.

> > hg already makes use of CBOR, so it'd be reasonable to use here - either for
> > the whole entry or just for parts of it.  For example, CBOR's integers are
> > encoded as 1 byte type, followed by 0, 1, 2, 4, or 8 byte integer.  Smaller
> > values use less space.  For example, values less than 2^32 use 1-5 bytes.
> 
> Needing a separate index from the index for efficient access would
> defeat the point of revlog being an index format in first place...

Variable length encoding can be constrained.  If for example, each entry is
padded to be a multiple of 16B, then access can still be relatively
efficient.  If having sparse revnums is ok, then it shouldn't even require
many (any?) changes to the "core" code.

(My favorite example from CPU instruction sets is the s390 instruction set -
it is variable length, but the length can only be 2, 4, or 6 bytes.)
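
A minimal sketch of that constrained variable-length idea, assuming a made-up entry layout (a 2-byte length prefix followed by the payload, zero-padded to a multiple of 16 bytes). The "sparse revnum" here is simply the entry's byte offset divided by 16, so an entry can be located directly from its number.

```python
ALIGN = 16   # every entry occupies a multiple of 16 bytes


def pad_entry(payload):
    """Prefix payload with a 2-byte length and zero-pad to a multiple of ALIGN."""
    raw = len(payload).to_bytes(2, "big") + payload
    return raw + b"\0" * (-len(raw) % ALIGN)


def iter_entries(buf):
    """Walk padded entries, yielding (slot, payload) with slot = offset // ALIGN."""
    pos = 0
    while pos < len(buf):
        size = int.from_bytes(buf[pos:pos + 2], "big")
        yield pos // ALIGN, buf[pos + 2:pos + 2 + size]
        pos += -(-(2 + size) // ALIGN) * ALIGN   # round entry size up to ALIGN


def read_slot(buf, slot):
    """Read one entry directly from its sparse number, without scanning."""
    pos = slot * ALIGN
    size = int.from_bytes(buf[pos:pos + 2], "big")
    return buf[pos + 2:pos + 2 + size]


buf = pad_entry(b"x" * 5) + pad_entry(b"y" * 20) + pad_entry(b"")
print([slot for slot, _ in iter_entries(buf)])
```

The second entry needs two 16-byte units, so the numbering skips a value after it; that gap is what "sparse revnums" would mean in this sketch.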

Jeff.


Re: New revlog format, plan page

2021-01-07 Thread Joerg Sonnenberger
On Thu, Jan 07, 2021 at 12:04:06PM -0500, Josef 'Jeff' Sipek wrote:
> On Tue, Jan 05, 2021 at 19:33:36 +0100, Joerg Sonnenberger wrote:
> > On Tue, Jan 05, 2021 at 04:38:20PM +0100, Raphaël Gomès wrote:
> > > I've opened a very much draft plan page [1] to try to list all the things 
> > > we
> > > want to do in that version and try to figure out an efficient new format.
> > 
> > "No support for hash version"
> > 
> > I don't think that point really matters. The plan for the hash
> > migration allows them in theory to coexist fully on the revlog layer and
> > the main problems for mixing them are on the changeset/manifest layer
> > anyway. That is, any migration strategy will IMO rewrite all revlogs to
> > the newer hash anyway and only keep a secondary index for changesets and
> > maybe manifests.
> 
> At the same time, I think it is sensible (and very useful when looking at a
> revlog without repo-level info) for revlogs to identify which hash they
> contain.  Either in some sort of revlog header or in each entry (if hash can
> vary between entries).

I plan the replacement hash to be tagged, so yes, they can be
individually distinguished. 

> 
> > "No support for sidedata"
> > 
> > My big design level concern is that revlog ATM is optimized for fast
> > integer indexing and append-only storage.
> 
> This is an interesting point.  What *are* the most common revlog operations?
> It probably varies between repos, but I suspect that they are mostly reads
> rather than writes.  As a consequence, a good revlog format would optimize
> for the common case (without making the less common cases completely suck).

The problem is that anything that needs inplace writes is a lot more
difficult to get right for on-disk consistency and for concurrent
read-access. Normal revision data does not change, by design. That's
quite different from any unversioned metadata. This can include
signatures for example, it could include obsolescence data etc.
Separating mutable and immutable data is a natural design choice.

> hg already makes use of CBOR, so it'd be reasonable to use here - either for
> the whole entry or just for parts of it.  For example, CBOR's integers are
> encoded as 1 byte type, followed by 0, 1, 2, 4, or 8 byte integer.  Smaller
> values use less space.  For example, values less than 2^32 use 1-5 bytes.

Needing a separate index from the index for efficient access would
defeat the point of revlog being an index format in first place...

Joerg


Re: New revlog format, plan page

2021-01-07 Thread Josef 'Jeff' Sipek
On Tue, Jan 05, 2021 at 19:33:36 +0100, Joerg Sonnenberger wrote:
> On Tue, Jan 05, 2021 at 04:38:20PM +0100, Raphaël Gomès wrote:
> > I've opened a very much draft plan page [1] to try to list all the things we
> > want to do in that version and try to figure out an efficient new format.
> 
> "No support for hash version"
> 
> I don't think that point really matters. The plan for the hash
> migration allows them in theory to coexist fully on the revlog layer and
> the main problems for mixing them are on the changeset/manifest layer
> anyway. That is, any migration strategy will IMO rewrite all revlogs to
> the newer hash anyway and only keep a secondary index for changesets and
> maybe manifests.

At the same time, I think it is sensible (and very useful when looking at a
revlog without repo-level info) for revlogs to identify which hash they
contain.  Either in some sort of revlog header or in each entry (if hash can
vary between entries).

> "No support for sidedata"
> 
> My big design level concern is that revlog ATM is optimized for fast
> integer indexing and append-only storage.

This is an interesting point.  What *are* the most common revlog operations?
It probably varies between repos, but I suspect that they are mostly reads
rather than writes.  As a consequence, a good revlog format would optimize
for the common case (without making the less common cases completely suck).

> At least for some sidedata use cases I have, that is an ill fit.

I actually have no idea what sidedata is, but I don't think it changes my
point about picking formats that match the workload :)

> "No support for unified revlog"
> 
> IMO this should be the driving feature.

Agreed (assuming that 'unified revlog' is just a placeholder name for 'a
storage scheme that uses less than O(n) files to store revision data').  I
always think twice before I move a file in a hg repo because I don't like
wasting disk space.  It's a stupid feeling, I know.

> The biggest issue for me is that
> it creates two challenges that didn't exist so far:
> (1) Inter-file patches and how they interact with the wire protocol
> (2) Identical revisions stored in different places.
> 
> "No support for larger files"
> 
> Supporting large revlog files is sensible and having a store for
> design-challenged file systems might be necessary. Microsoft, I'm
> looking at you. Otherwise the concern is space use in the revlog file
> and RAM use during operations. I don't think the latter is as big an
> issue now as it was 15 years ago, but the former is real. But it might
> be a good point in time to just go for 64bit offsets by default...

I'd *strongly* advocate for 64-bit offsets.  They pretty much let you forget
that there is a limit.  Storage is cheap.

If revlog entry size is a concern (e.g., it takes more than 1% of the size
of the data it is tracking), then maybe a variable encoding would be the way
to go.

hg already makes use of CBOR, so it'd be reasonable to use here - either for
the whole entry or just for parts of it.  For example, CBOR's integers are
encoded as 1 byte type, followed by 0, 1, 2, 4, or 8 byte integer.  Smaller
values use less space.  For example, values less than 2^32 use 1-5 bytes.
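
To make those sizes concrete, here is a small hand-written encoder for CBOR unsigned integers (major type 0 in RFC 8949). It is a standalone sketch that uses no CBOR library, and the function name is made up for this example.

```python
def cbor_encode_uint(value):
    """Encode a non-negative integer as a CBOR major type 0 item."""
    if value < 0 or value >= 1 << 64:
        raise ValueError("out of range for a CBOR unsigned integer")
    if value < 24:
        return bytes([value])   # small values live in the initial byte itself
    # Additional-information codes 24..27 announce a 1, 2, 4 or 8 byte integer.
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if value < 1 << (8 * size):
            return bytes([info]) + value.to_bytes(size, "big")


for n in (0, 23, 24, 255, 256, 65535, 65536, 2 ** 32 - 1, 2 ** 32):
    print(n, len(cbor_encode_uint(n)))
```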

A common alternative is LEB128 [1], which IIRC is used by git for something
internally.  It is however a bit more expensive to pack/unpack.
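
For comparison, a sketch of unsigned LEB128 as described on the page referenced as [1]: seven payload bits per byte, low-order group first, with the high bit set on every byte except the last. The function names are made up for this example.

```python
def uleb128_encode(value):
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # more groups follow
        else:
            out.append(byte)          # last group: high bit clear
            return bytes(out)


def uleb128_decode(data, pos=0):
    """Decode one value starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


print(uleb128_encode(624485).hex())
```

Decoding has to inspect every byte to find the end of the value, whereas the CBOR form gives the length in the first byte, which matches the remark about pack/unpack cost.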

Jeff.

[1] https://en.wikipedia.org/wiki/LEB128


Re: New revlog format, plan page

2021-01-06 Thread Joerg Sonnenberger
On Thu, Jan 07, 2021 at 01:05:34AM +, Johannes Totz wrote:
> Have we ever addressed the file duplication on hg-mv? .hg/store/data/ will
> end up with lots of duplicated data. That has always been my biggest gripe.

That's the unified revlog issue.

Joerg


Re: New revlog format, plan page

2021-01-06 Thread Johannes Totz

On 05/01/2021 15:38, Raphaël Gomès wrote:

Hi all,

During the last (virtual) sprint, a lot of us spoke about the need for a 
format change of the revlog to overcome some of its limitations.


I've opened a very much draft plan page [1] to try to list all the 
things we want to do in that version and try to figure out an efficient 
new format.


I'm aware that the v2 is already planned, but I figured that we can just 
merge that (seemingly) paused effort and this new one.


I wish you all a nice 2021!
Raphaël

[1] https://www.mercurial-scm.org/wiki/RevlogV2Plan


I haven't kept up to date with hg dev... sorry if this is a stupid question:

Have we ever addressed the file duplication on hg-mv? .hg/store/data/ 
will end up with lots of duplicated data. That has always been my 
biggest gripe.


Random idea I had:
store the *.d files as usual but add a layer of "segments" like *.d.0 
and *.d.1 and so on. So when one does a hg-mv, the initial oldname.d.0 
is hardlinked to newname.d.0 and any appends to oldname.d.0 would 
instead go to a new segment oldname.d.1 (same for newname.d.1) because 
that segment *.d.0 has >1 link count. And so on for subsequent segments.
That should work for local clones, moves, copies. But it will probably make a 
mess of the wire protocol.
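
A rough sketch of the segment idea on a filesystem that supports hard links. The helper names are hypothetical and this only models where appends land; it says nothing about how the index would reference the segments.

```python
import os
import tempfile


def active_segment(base):
    """Return the path of the segment that appends to `base` should go to."""
    n = 0
    while os.path.exists(f"{base}.{n + 1}"):
        n += 1
    last = f"{base}.{n}"
    if os.path.exists(last) and os.stat(last).st_nlink > 1:
        return f"{base}.{n + 1}"   # last segment is shared: start a new one
    return last


def append(base, data):
    """Append data to the current writable segment of `base`."""
    with open(active_segment(base), "ab") as fh:
        fh.write(data)


def move(old, new):
    """Model the move: hardlink every existing segment of `old` under `new`."""
    n = 0
    while os.path.exists(f"{old}.{n}"):
        os.link(f"{old}.{n}", f"{new}.{n}")
        n += 1


with tempfile.TemporaryDirectory() as tmp:
    old = os.path.join(tmp, "oldname.d")
    new = os.path.join(tmp, "newname.d")
    append(old, b"rev0")
    move(old, new)
    append(new, b"rev1")   # oldname.d.0 is shared, so this goes to newname.d.1
    print(sorted(os.listdir(tmp)))
```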




Re: New revlog format, plan page

2021-01-05 Thread Joerg Sonnenberger
On Tue, Jan 05, 2021 at 04:38:20PM +0100, Raphaël Gomès wrote:
> I've opened a very much draft plan page [1] to try to list all the things we
> want to do in that version and try to figure out an efficient new format.

"No support for hash version"

I don't think that point really matters. The plan for the hash
migration allows them in theory to coexist fully on the revlog layer and
the main problems for mixing them are on the changeset/manifest layer
anyway. That is, any migration strategy will IMO rewrite all revlogs to
the newer hash anyway and only keep a secondary index for changesets and
maybe manifests.

"No support for sidedata"

My big design level concern is that revlog ATM is optimized for fast
integer indexing and append-only storage. At least for some sidedata
use cases I have, that is an ill fit.

"No support for unified revlog"

IMO this should be the driving feature. The biggest issue for me is that
it creates two challenges that didn't exist so far:
(1) Inter-file patches and how they interact with the wire protocol
(2) Identical revisions stored in different places.

"No support for larger files"

Supporting large revlog files is sensible and having a store for
design-challenged file systems might be necessary. Microsoft, I'm
looking at you. Otherwise the concern is space use in the revlog file
and RAM use during operations. I don't think the latter is as big an
issue now as it was 15 years ago, but the former is real. But it might
be a good point in time to just go for 64bit offsets by default...

Joerg


New revlog format, plan page

2021-01-05 Thread Raphaël Gomès

Hi all,

During the last (virtual) sprint, a lot of us spoke about the need for a 
format change of the revlog to overcome some of its limitations.


I've opened a very much draft plan page [1] to try to list all the 
things we want to do in that version and try to figure out an efficient 
new format.


I'm aware that the v2 is already planned, but I figured that we can just 
merge that (seemingly) paused effort and this new one.


I wish you all a nice 2021!
Raphaël

[1] https://www.mercurial-scm.org/wiki/RevlogV2Plan
