On 28/06/2021 11:49, Raphaël Gomès wrote:
> Hello all,
>
> As you probably know my colleagues at Octobus and I have been working on
> a new version of the dirstate, and we're coming pretty close to
> something usable in production, so we need to freeze the format soon.

Hello again,

Together with the Rust implementation of the new status algorithm, this dirstate-v2 file format enables large performance improvements for `hg status` on large repositories.

We at Octobus are hoping to stabilize it very soon, after a few remaining changes, so that the format will no longer be experimental in the upcoming Mercurial 6.0 release. It will not yet be enabled by default, but future Mercurial versions will need to be compatible both ways with 6.0 when accessing a given local repository that uses dirstate-v2.


A short user guide (how to enable, upgrade, or downgrade) as well as detailed documentation of the file format can be found at:

https://www.mercurial-scm.org/repo/hg-committed/file/tip/mercurial/helptext/internals/dirstate-v2.txt

… or in a source repository by running `make local` then `./hg help internals.dirstate-v2`


The remaining format changes we have planned are:

* Add sub-second precision to stored file/symlink mtime, and share its location with that of directory mtime. (This part of the format is a bit of a mess right now since we’re in the middle of this change.)

* Maybe add a flag bit to allow marking files as "known modified at this mtime". `hg status` sometimes needs to read the contents of files in case of possible size-preserving changes. If there is indeed a change, currently this read is repeated every time status runs again. The new bit would record that result.

* Maybe add some node-specific or dirstate-wide flags or a "mode switch" to make the format and its storage of directory mtimes less tied to details of the current readdir-skipping optimization. (For example, a future version of Mercurial might want to add dirstate nodes for unknown and/or ignored files to skip readdir in more cases.)
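
To make the first two points more concrete, here is a minimal Python sketch of how a "known modified at this mtime" bit could avoid repeated content reads. It is only an illustration: `Entry`, `KNOWN_MODIFIED_AT_MTIME`, `is_modified` and the `(seconds, nanoseconds)` tuple are hypothetical names and representations, not the actual dirstate-v2 layout, and the sketch ignores details such as ambiguous mtimes.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical flag value; the real bit position is not decided here.
KNOWN_MODIFIED_AT_MTIME = 1 << 0


@dataclass
class Entry:
    """Simplified stand-in for the cached data of one tracked file."""
    size: int
    mtime: Tuple[int, int]  # (seconds, nanoseconds): sub-second precision
    flags: int = 0


def is_modified(
    entry: Entry,
    fs_size: int,
    fs_mtime: Tuple[int, int],
    contents_differ: Callable[[], bool],
) -> bool:
    """Return True if the file should be reported as modified.

    `contents_differ` stands for the expensive step of reading the file
    and comparing it with the version recorded in the parent changeset.
    """
    if fs_size != entry.size:
        # A different size is enough to know the file changed.
        return True
    if fs_mtime == entry.mtime:
        # Nothing happened since the last check: reuse the recorded result
        # instead of reading the file again.
        return bool(entry.flags & KNOWN_MODIFIED_AT_MTIME)
    # Same size but different mtime: possible size-preserving change,
    # so the contents have to be read once.
    modified = contents_differ()
    entry.mtime = fs_mtime
    entry.flags = KNOWN_MODIFIED_AT_MTIME if modified else 0
    return modified
```

Without the flag, the last branch would have no way to remember a positive result, so every later `hg status` would read the file again until it is committed or reverted.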


Non-format changes that we want to have in 6.0:

* Merge D11520 and the rest of that stack to have a Python implementation of the format, so that repositories that use it are usable when Rust extensions are not enabled. The Python implementation is slower: it adds on the order of 0.1 to 0.3 seconds to `hg status` commands that take 0.4 to 2.5 seconds with dirstate-v1 without Rust, on various repositories.

* Add configuration to either abort, warn, or silently continue when this slow code path is or would be used, and decide on its default. I’m personally inclined to at least not abort by default, since the slow path is not *horribly* slow.
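
As a rough illustration of the abort/warn/continue choice, the check could look something like the following Python sketch. The names `SlowPathPolicy`, `SlowPathError` and `check_slow_path` are made up for this example and are not existing Mercurial APIs or config options; the real option name and default are still to be decided.

```python
import warnings
from enum import Enum


class SlowPathPolicy(Enum):
    """Hypothetical values for the proposed configuration."""
    ABORT = "abort"
    WARN = "warn"
    ALLOW = "allow"


class SlowPathError(Exception):
    """Raised when the slow path is needed but the policy forbids it."""


def check_slow_path(policy: SlowPathPolicy, rust_available: bool) -> bool:
    """Return True if the slow pure-Python code path will be used."""
    if rust_available:
        # The fast implementation is available: nothing to decide.
        return False
    if policy is SlowPathPolicy.ABORT:
        raise SlowPathError("dirstate-v2 requires Rust extensions (slow path disabled)")
    if policy is SlowPathPolicy.WARN:
        warnings.warn("using the slower pure-Python dirstate-v2 implementation")
    # WARN and ALLOW both continue on the slow path.
    return True
```

With a default of `WARN` or `ALLOW`, a repository that uses dirstate-v2 stays usable without Rust, at the cost of the slowdown mentioned above.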


Please let us know if you have any questions or comments!

--
Simon Sapin
