Hi, all. This is a high-level mail in which I try to figure out the current status of the issue #525 work and what's left to land it in trunk and release it. Corrections and feedback welcome.

To remind everyone:

The purpose of this work is to reduce checkout sizes by optionally not having local pristine text-bases for WC files. In trees that have lots of large binary files, this can reduce disk usage by about half, so it really matters for some use cases. Also: in the long run, we want the user to be able to specify which files do and don't have pristines (but in the first release it can be a per-WC choice).
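
(For anyone who wants to check the "about half" figure against their own tree, here is a rough sketch. It assumes a working copy in the single-`.svn` layout, where pristines live under `.svn/pristine` at the WC root. The function name `pristine_share` is made up for illustration and is not part of Subversion.)

```shell
# Report how much of a working copy's disk usage is the pristine store.
# Assumption: pristines live in <wc-root>/.svn/pristine.
# "pristine_share" is a hypothetical helper, not an svn command.
pristine_share() {
  wc_root=${1:-.}
  # du -sk prints "<kilobytes><TAB><path>"; keep only the number.
  total_kb=$(du -sk "$wc_root" | cut -f1)
  pristine_kb=$(du -sk "$wc_root/.svn/pristine" | cut -f1)
  if [ "$total_kb" -eq 0 ]; then
    echo "pristine: 0 KB of 0 KB"
    return 0
  fi
  echo "pristine: ${pristine_kb} KB of ${total_kb} KB ($((100 * pristine_kb / total_kb))%)"
}
```

Run it as `pristine_share /path/to/wc`. In a tree dominated by large binaries, I'd expect the percentage to land near 50.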

Current status as I understand it:

First, Julian has written up a great description of how the feature works from a user's perspective:

https://svn.apache.org/viewvc/subversion/branches/pristines-on-demand-on-mwf/notes/i525/i525-user-guide.md?view=log

Based on that document, it looks to me like we still need some well-named knobs by which the user can control this feature. Right now, the command-line way looks like one of these:

 $ svn checkout --compatible-version=1.15
 $ svn upgrade --compatible-version=1.15

However, there's a "TODO" note that addresses this UI point:

> [TODO] We might change this so that upgrading to 1.15-compatible
> format and enabling "i525pod" are separate steps and the latter is
> optional.

I think we should implement that TODO before releasing the feature. Ideally, the new WC format would support the "pristines-on-demand" feature without forcing a given WC to be in p-o-d mode.

Right now, if I understand correctly, a WC can either be entirely in p-o-d mode or entirely in regular mode (i.e., the current default, with pristines always present for everything). In other words, in its first release, this feature would *not* allow users to specify that certain files in a WC should be p-o-d while other files are regular (but see the note "Now, a subtle point..." below about this). It's a whole-WC thing.

However, I think it's okay to release this feature that way, without support for selective per-file p-o-d, as long as the UI for per-WC toggling is clear (e.g., not a flag like "--compatible-version=1.15", which doesn't say anything about the actual behavior being toggled).

("Toggle" may be the wrong word here, as I believe we also don't yet have a way to bring a WC back from p-o-d to regular mode. Do we care about that for release?)

Now, a subtle point about this UI issue:

In the bright future, when we *do* support per-file specification of p-o-d-ness, there would be no need for a per-WC flag at all. Instead, users would specify that certain files should be p-o-d either by using client-side configuration options (e.g., all files larger than a given size, or having certain MIME type(s), are in p-o-d mode), or via command line actions to support explicit "hydrate" and "dehydrate" operations (these actions would either be top-level subcommands or options to existing commands -- we don't need to decide that detail now).
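
To make the configuration-driven variant concrete, here is a minimal sketch of what such a per-file rule could look like. Everything in it is hypothetical: the `wants_pod` function, the `POD_MIN_BYTES` threshold, and the MIME patterns are invented for illustration and do not correspond to any existing or planned Subversion option.

```shell
# Hypothetical per-file policy: a file is pristine-on-demand if it is at
# least POD_MIN_BYTES in size or its MIME type matches a configured pattern.
# None of these names exist in Subversion; this only illustrates the idea.
POD_MIN_BYTES=${POD_MIN_BYTES:-1048576}   # assumed default: 1 MiB

wants_pod() {
  # $1 = file size in bytes, $2 = MIME type (e.g. from svn:mime-type)
  size=$1
  mime=$2
  if [ "$size" -ge "$POD_MIN_BYTES" ]; then
    return 0
  fi
  case "$mime" in
    application/octet-stream|image/*|video/*) return 0 ;;
  esac
  return 1
}
```

With the assumed defaults, `wants_pod 5000000 text/plain` and `wants_pod 100 image/png` both succeed, while `wants_pod 100 text/plain` fails, so a small text file would keep its pristine.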

I guess what I'm saying is, if we are *close* to having the underlying WC support needed to support per-file selection of p-o-d-ness, then maybe it's better to go all the way and just finish that. *Then* people could simply upgrade their working copies as usual, with no immediate behavior change resulting from that upgrade, and this new feature would then be available to them. We would then offer...

 $ svn checkout --store-pristines=no
 $ svn upgrade --store-pristines=no

...as the gateways to the feature in the first release (so p-o-d-ness would apply to every file in the WC), and add selective UI in later releases, knowing that the underlying WC code already supports it. However, if that's a complex change in the WC code, then let's just release with whole-WC support and not delay.

Have I summarized the current status accurately?  Thoughts?

Please see also Julian's status email from April, which goes into more detail about which tests need updating, etc:

 https://lists.apache.org/thread/lm98og8jqonffcs250q5y3ft5r5qlmk5

      From: Julian Foad
        To: Daniel Shahaf
        Cc: Subversion Dev, Karl Fogel
   Subject: Re: A two-part vision for Subversion and large binary objects.
      Date: Tue, 5 Apr 2022 15:50:56 +0100
Message-ID: <70d88dc5-1558-422d-9986-42a2977a9...@getmailspring.com>

By the way, in that thread, Evgeny Kotkov -- whose initial work much of this is based on -- follows up with a patch that does a first-pass implementation of 'svn checkout --store-pristines=no' (by implementing a new persistent setting in wc.db).

Note that Julian and Daniel originally undertook this work as part of a contract with my company (which represents a consortium of companies interested in this feature). Mostly it was Julian writing new code and Daniel reviewing and writing tests, and I thank both of them for having gotten us this far.

The work went a bit over budget not through any fault of theirs, but because we ran into an unexpected snag having to do with the order of network operations in Subversion. TL;DR: even though in *theory* an operation can always know at the beginning which pristines it has locally and which ones it doesn't, Subversion's current client/server communication conventions don't take advantage of that information in the way we'd want. Instead, the client assumes pristines are present and sends up-front revision information to the server, causing the server to send responses that rely on those pristines being present. The whole way the client and server talk to each other is based on this; it's fixable, of course, but doing so is not simple and probably not just client-side. So the 'pristines-on-demand-on-mwf' branch takes a reasonable-but-not-perfect solution for now; the 'pristines-on-demand-issue4892' branch, which branches from it, improves the situation [1], but is not complete and needn't block release. (See [2] for deeper discussion.)

I'll talk privately with them about finishing this and the budget required to do so. I think we're close and would really like to see this feature released soon. (Note that we have merged the 'multi-wc-format' branch to trunk, in r1898187 on 2022-02-18. IIUC that was a necessary predecessor to everything else.)

We should be able to get there from here, right?

Best regards,
-Karl

[1] This command will give you some sense of the difference between those two branches:

$ svn diff https://svn.apache.org/repos/asf/subversion/branches/pristines-on-demand-on-mwf/notes/i525/i525-user-guide.md https://svn.apache.org/repos/asf/subversion/branches/pristines-on-demand-issue4892/notes/i525/i525-user-guide.md

[2] https://lists.apache.org/thread/mwo5zy14wlkbs8j4334zn0296dl472qd

      From: Evgeny Kotkov
        To: Julian Foad
        Cc: Subversion Dev
   Subject: Re: Issue #525/#4892: on only fetching the pristines we really need
      Date: Fri, 11 Mar 2022 18:23:55 +0300
Message-ID: <CAP_GPNhtNYttB-wCk-SYYAPevCx2Xb0AsLt-hq=nckqmf_u...@mail.gmail.com>
