Re: Getting to first release of pristines-on-demand feature (#525).
Evgeny Kotkov writes:
> Merged in https://svn.apache.org/r1905955
>
> I'm going to respond on the topic of SHA1 a bit later.

For the history: thread [1] proposes the `pristine-checksum-salt` branch that adds the infrastructure to support new pristine checksum kinds in the working copy and makes a switch to the dynamically-salted SHA1.

From the technical standpoint, I think that it would be better to release the first version of the pristines-on-demand feature having this branch merged, because now we rely on the checksum comparison to determine if a file has changed, and currently it's a checksum kind with known collisions.

At the same time, having that branch merged probably isn't a formal release blocker for the pristines-on-demand feature. Also, considering that the `pristine-checksum-salt` branch is currently vetoed by danielsh (presumably, for an indefinite period of time), I'd like to note that personally I have no objections to proceeding with a release of the pristines-on-demand feature without this branch.

[1] https://lists.apache.org/thread/xmd7x6bx2mrrbw7k5jr1tdmhhrlr9ljc

Regards,
Evgeny Kotkov
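To make the idea of a "dynamically-salted SHA1" concrete, here is a minimal Python sketch. It is not the code on the `pristine-checksum-salt` branch: the function names, the 16-byte salt length, and the choice to simply prepend the salt to the content are all assumptions made for illustration.

```python
import hashlib
import os


def new_salt(length: int = 16) -> bytes:
    """Return a fresh random salt (the length is an illustrative assumption)."""
    return os.urandom(length)


def salted_sha1(salt: bytes, content: bytes) -> str:
    """Hex SHA-1 of salt || content.

    Hypothetical helper: two inputs built to collide under plain SHA-1
    are not expected to collide once an unpredictable salt is prepended,
    because the attacker cannot pick the prefix in advance.
    """
    h = hashlib.sha1()
    h.update(salt)
    h.update(content)
    return h.hexdigest()


if __name__ == "__main__":
    salt = new_salt()
    print(salted_sha1(salt, b"working file contents\n"))
```

With an empty salt this reduces to plain SHA-1, so the salt has to be stored somewhere alongside the checksum for later comparisons to be possible.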
Re: Switching from SHA1 to a checksum type without known collisions in 1.15 working copy format (was: Re: Getting to first release of pristines-on-demand feature (#525).)
Evgeny Kotkov via dev wrote on Tue, Dec 20, 2022 at 11:14:00 +0300:
> [Moving discussion to a new thread]
>
> We currently have a problem that a working copy relies on the checksum type
> with known collisions (SHA1). A solution to that problem

Why is libsvn_wc's use of SHA-1 a problem? What's the scenario wherein Subversion will behave differently than it should?

> is to switch to a different checksum type without known collisions in
> one of the newer working copy formats.

Such as SHA-1 salted by NODES.LOCAL_RELPATH and NODES.WC_ID (or a per-wc UUID)?

> Since we plan on shipping a new working copy format in 1.15, this seems to
> be an appropriate moment of time to decide whether we'd also want to switch
> to a checksum type without known collisions in that new format.
>

What's the acceptance test we use for candidate checksum algorithms?

You say we should switch to a checksum algorithm that doesn't have known collisions, but, why should we require that? Consider the following 160-bit checksum algorithm:

1. If the input consists of 40 ASCII lowercase hex digits and nothing else, return the input.
2. Else, return the SHA-1 of the input.

This algorithm has a trivial first preimage attack. If a wc used this identity-then-sha1 algorithm instead of SHA-1, then… what?

> Below are the arguments for including a switch to a different checksum type
> in the working copy format for 1.15:
>
> 1) Since the "is the file modified?" check now compares checksums, leaving
>    everything as-is may be considered a regression, because it would
>    introduce additional cases where a working copy currently relies on
>    comparing checksums with known collisions.
>

Well, SHA-1 is still collision-free so long as one is not deliberately trying to use collisions, so this would only be a regression if we consider "Deliberately store files that have the same checksum" to be a use-case. Do we?

I recall we discussed this when shattered.io was announced, and we didn't rush to upgrade the checksums we use everywhere, so I guess back then we came to the conclusion that wasn't a use-case. (Of course we can change our opinion; that's just a datapoint, and there may be more, on both sides, in the old thread.)

I looked for the old thread and didn't find it. (I looked in the private@ archives too in case the thread was there.)

> 2) We already need a working copy format bump for the pristines-on-demand
>    feature. So using that format bump to solve the SHA1 issue might reduce
>    the overall number of required bumps for users (assuming that we'll still
>    need to switch from SHA1 at some point later).
>

Considering that 1.15 will support reading and writing both f31 and f32, the "overall number of required bumps" between 1.8 and trunk@HEAD is zero, meaning the proposed change can't reduce that number.

> 3) While the pristines-on-demand feature is not released, upgrading
>    with a switch to the new checksum type seems to be possible without
>    requiring a network fetch.

I infer the scenario in question here is upgrading a (say) pristineless wc to a newer format that supports a new checksum algorithm.

>    But if some of the pristines are optional, we lose the possibility
>    to rehash all contents in place. So we might find ourselves having
>    to choose between two worse alternatives of either requiring
>    a network fetch during upgrade or entirely prohibiting an upgrade
>    of working copies with optional pristines.

Why would we want to rehash everything in place? The 1.15→1.16 upgrade could simply leave pristineless files' checksums as SHA-1 until the next «svn up», just like «svnadmin upgrade» of FSFS doesn't retroactively add SHA-1 checksums to node-rev headers or "-file" or "-dir" indicators in the changed-paths section.

There may be yet other alternatives.

> Thoughts?

I'm not voting either -0 or +0 at this time.

Cheers,

Daniel
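The two-step "identity-then-sha1" algorithm described in the message above can be written down directly. The following Python sketch is only an illustration of that thought experiment; the function name is made up here, and nothing like it exists in Subversion.

```python
import hashlib
import re

# Exactly 40 lowercase ASCII hex digits and nothing else.
_HEX40 = re.compile(rb"[0-9a-f]{40}")


def identity_then_sha1(data: bytes) -> str:
    """160-bit 'checksum' from the thought experiment in the thread.

    1. If the input is 40 lowercase hex digits, return the input itself.
    2. Otherwise, return the SHA-1 of the input.
    """
    if _HEX40.fullmatch(data):
        return data.decode("ascii")
    return hashlib.sha1(data).hexdigest()


if __name__ == "__main__":
    # First preimage attack: to hit any target digest, feed in the digest.
    target = hashlib.sha1(b"some file contents").hexdigest()
    forged_input = target.encode("ascii")
    print(identity_then_sha1(forged_input) == target)
```

The forged input and the original contents now share a checksum even though they differ, which is the kind of deliberately constructed collision the question "then… what?" asks about.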
Re: Switching from SHA1 to a checksum type without known collisions in 1.15 working copy format (was: Re: Getting to first release of pristines-on-demand feature (#525).)
On 20.12.2022 09:14, Evgeny Kotkov wrote: 2) We already need a working copy format bump for the pristines-on-demand feature. So using that format bump to solve the SHA1 issue might reduce the overall number of required bumps for users (assuming that we'll still need to switch from SHA1 at some point later). Using a new hashing algorithm in the working copy is relatively simple. Making such a change backwards-compatible is not. It would be really nice if this could be done in a way that allows newer clients to still support older working copies without upgrading them; after all, we have the infrastructure for this in place now. -- Brane
Switching from SHA1 to a checksum type without known collisions in 1.15 working copy format (was: Re: Getting to first release of pristines-on-demand feature (#525).)
Karl Fogel writes:
> > While here, I would like to raise a topic of incorporating a switch from
> > SHA1 to a different checksum type (without known collisions) for the new
> > working copy format. This topic is relevant to the pristines-on-demand
> > branch, because the new "is the file modified?" check relies on the
> > checksum comparison, instead of comparing the contents of working and
> > pristine files.
> >
> > And so while I consider it to be out of the scope of the
> > pristines-on-demand branch, I think that we might want to evaluate if
> > this is something that should be a part of the next release.
>
> Good point. Maybe worth a new thread?

[Moving discussion to a new thread]

We currently have a problem that a working copy relies on the checksum type with known collisions (SHA1). A solution to that problem is to switch to a different checksum type without known collisions in one of the newer working copy formats.

Since we plan on shipping a new working copy format in 1.15, this seems to be an appropriate moment of time to decide whether we'd also want to switch to a checksum type without known collisions in that new format.

Below are the arguments for including a switch to a different checksum type in the working copy format for 1.15:

1) Since the "is the file modified?" check now compares checksums, leaving everything as-is may be considered a regression, because it would introduce additional cases where a working copy currently relies on comparing checksums with known collisions.

2) We already need a working copy format bump for the pristines-on-demand feature. So using that format bump to solve the SHA1 issue might reduce the overall number of required bumps for users (assuming that we'll still need to switch from SHA1 at some point later).

3) While the pristines-on-demand feature is not released, upgrading with a switch to the new checksum type seems to be possible without requiring a network fetch. But if some of the pristines are optional, we lose the possibility to rehash all contents in place. So we might find ourselves having to choose between two worse alternatives of either requiring a network fetch during upgrade or entirely prohibiting an upgrade of working copies with optional pristines.

Thoughts?

Thanks,
Evgeny Kotkov
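For readers who want to see why the checksum kind matters here, below is a rough Python sketch of an "is the file modified?" check that compares checksums instead of file contents. It is a simplification and not libsvn_wc code: the function names are invented, and a real working copy would also have to deal with keyword and end-of-line translation and with size or timestamp shortcuts, all of which are ignored here.

```python
import hashlib


def sha1_of_file(path: str, chunk_size: int = 65536) -> str:
    """Stream a file through SHA-1 and return the hex digest."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def is_modified(path: str, recorded_checksum: str) -> bool:
    """Hypothetical check: the file counts as modified only if its
    checksum differs from the one recorded for the pristine text.

    No pristine copy is needed on disk, but two different files with
    the same checksum would be reported as unmodified.
    """
    return sha1_of_file(path) != recorded_checksum
```

The last sentence of the docstring is the point of argument 1) above: once the comparison is checksum-only, a collision in the checksum kind translates directly into a missed modification.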
Re: Getting to first release of pristines-on-demand feature (#525).
On 13 Dec 2022, Evgeny Kotkov wrote:
> Evgeny Kotkov writes:
> Merged in https://svn.apache.org/r1905955

W00t!! Thank you, and Julian and Daniel and everyone who's contributed to this.

So... do we have a release manager? :-)
Re: Getting to first release of pristines-on-demand feature (#525).
Evgeny Kotkov writes: > I think that the `pristines-on-demand-on-mwf` branch is now ready for a > merge to trunk. I could do that, assuming there are no objections. Merged in https://svn.apache.org/r1905955 I'm going to respond on the topic of SHA1 a bit later. Thanks, Evgeny Kotkov
Re: Getting to first release of pristines-on-demand feature (#525).
Nathan Hartman wrote on Wed, Dec 07, 2022 at 20:29:11 -0500: > On Wed, Dec 7, 2022 at 12:11 PM Evgeny Kotkov via dev < > dev@subversion.apache.org> wrote: > > > > > I think that the `pristines-on-demand-on-mwf` branch is now ready for a > > merge to trunk. I could do that, assuming there are no objections. > > > > I'd like to echo what others have already said by saying a great big THANK > YOU, to all who have worked on this cool new feature so far! > > I used an earlier incarnation of this branch some months ago in real usage > scenarios with good results and looking at the recent commit emails as > they've happened everything looks sensible to me. > > I will try to run the full test suite in the next couple of days and > assuming the tests pass for me I'll use it as my daily driver to test the > real usage. Obviously I'll post here if I find anything... > > Meanwhile I'd like to say that on further thought and after reading Johan's > and Karl's feedback regarding the feature switch naming, I've come around > to the point of view that --store-pristine={yes|no} is a perfectly fine UI. > Well, if we're bikeshedding anyway, how about --backend-tweaks=without-pristines? We can support just two values for starters ("without pristines" and "with pristines"), and have the room to extend this in 1.16, similar to --trust-server-cert/--trust-server-cert-failures and --pre-1.4-compatible/--compatible-version. Similarly, a new config file section with one valid option might make sense if we anticipate adding more options to that section in the future. This way we avoid having the configuration split across two places. > Given that this is now the command line switch name, and since users are > given direct control over the pristinefulness of a WC, and we've been > calling this feature Pristines On Demand since its inception, I think we > should finally bless this as the official name of the feature. 
> > In the next couple of days I plan to update the staged 1.15 release notes, > which until now tentatively called it Bare Working Copies, to call it > Pristines On Demand and to complete the description there. > > Regarding the SHA hash question: > > While here, I would like to raise a topic of incorporating a switch from > > SHA1 to a different checksum type (without known collisions) for the new > > working copy format. This topic is relevant to the pristines-on-demand > > branch, because the new "is the file modified?" check relies on the > > checksum > > comparison, instead of comparing the contents of working and pristine > > files. > > > > And so while I consider it to be out of the scope of the > > pristines-on-demand > > branch, I think that we might want to evaluate if this is something that > > should be a part of the next release. > > > Is it feasible and would it be beneficial to somehow decouple the hash code > type from the wc format version? Asking because IIRC the need for a format > bump to change hashes was one of the reasons it wasn't done a few years ago. Maybe if we teach f32 to read /two/ new checksum kinds? E.g., if we teach f32 to read both SHA-512 and SHA-3, then even if 1.15 f32 writes SHA-512 by default, it will nevertheless be able to read f32 wc's with SHA-3 rows that 1.16 might create. svn_checksum_kind_t's possible values include svn_checksum_fnv1a_32, so I guess we already support reading wc.db's that use FNV-1a checksums? (Incidentally, f31 is new in 1.8 whereas svn_checksum_fnv1a_32 is new in 1.9.) Cheers, Daniel
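The suggestion of teaching one format to read more checksum kinds than it writes can be sketched as a small dispatch table. This Python fragment is illustrative only: the kind names, the `(kind, hexdigest)` tuple representation, and the choice of SHA-512 as the default are assumptions and say nothing about how wc.db actually stores checksums.

```python
import hashlib

# Kinds this hypothetical reader understands. It can verify all of them,
# even though it only ever writes DEFAULT_WRITE_KIND itself.
READABLE_KINDS = {
    "sha1": hashlib.sha1,
    "sha512": hashlib.sha512,
    "sha3_256": hashlib.sha3_256,
}
DEFAULT_WRITE_KIND = "sha512"


def compute(kind: str, content: bytes) -> tuple:
    """Return (kind, hexdigest) for content, or raise on an unknown kind."""
    try:
        constructor = READABLE_KINDS[kind]
    except KeyError:
        raise ValueError(f"unsupported checksum kind: {kind}") from None
    return kind, constructor(content).hexdigest()


def record(content: bytes) -> tuple:
    """Checksum as this client version would write it."""
    return compute(DEFAULT_WRITE_KIND, content)


def matches(recorded: tuple, content: bytes) -> bool:
    """Verify content against a recorded checksum of any readable kind."""
    kind, _digest = recorded
    return compute(kind, content) == recorded
```

Under this scheme a later client could switch its default to another readable kind without a format bump, since the older reader already knows how to verify rows written with it.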
Re: Getting to first release of pristines-on-demand feature (#525).
On Wed, Dec 7, 2022 at 12:11 PM Evgeny Kotkov via dev < dev@subversion.apache.org> wrote: > > I think that the `pristines-on-demand-on-mwf` branch is now ready for a > merge to trunk. I could do that, assuming there are no objections. I'd like to echo what others have already said by saying a great big THANK YOU, to all who have worked on this cool new feature so far! I used an earlier incarnation of this branch some months ago in real usage scenarios with good results and looking at the recent commit emails as they've happened everything looks sensible to me. I will try to run the full test suite in the next couple of days and assuming the tests pass for me I'll use it as my daily driver to test the real usage. Obviously I'll post here if I find anything... Meanwhile I'd like to say that on further thought and after reading Johan's and Karl's feedback regarding the feature switch naming, I've come around to the point of view that --store-pristine={yes|no} is a perfectly fine UI. Given that this is now the command line switch name, and since users are given direct control over the pristinefulness of a WC, and we've been calling this feature Pristines On Demand since its inception, I think we should finally bless this as the official name of the feature. In the next couple of days I plan to update the staged 1.15 release notes, which until now tentatively called it Bare Working Copies, to call it Pristines On Demand and to complete the description there. Regarding the SHA hash question: While here, I would like to raise a topic of incorporating a switch from > SHA1 to a different checksum type (without known collisions) for the new > working copy format. This topic is relevant to the pristines-on-demand > branch, because the new "is the file modified?" check relies on the > checksum > comparison, instead of comparing the contents of working and pristine > files. 
> > And so while I consider it to be out of the scope of the > pristines-on-demand > branch, I think that we might want to evaluate if this is something that > should be a part of the next release. Is it feasible and would it be beneficial to somehow decouple the hash code type from the wc format version? Asking because IIRC the need for a format bump to change hashes was one of the reasons it wasn't done a few years ago. Cheers, Nathan
Re: Getting to first release of pristines-on-demand feature (#525).
On 07 Dec 2022, Evgeny Kotkov wrote:
> The branch passes all tests in my Windows and Linux environments, in both
> --store-pristine=yes and =no modes.

FYI, it passes all tests here too (on Debian GNU/Linux, up-to-date 'testing' distro). Attached file has details; there were some XFAILs, but no FAILs.

Best regards,
-Karl

$ svn info | grep -E "^URL: "
URL: https://svn.apache.org/repos/asf/subversion/branches/pristines-on-demand-on-mwf
$ svn status
?       subversion/tests/libsvn_subr/task-test
$ time make check
[001/127] auth-test...success
[002/127] authz-test...success
[003/127] bit-array-test...success
[004/127] cache-test...success
[005/127] changes-test...success
[006/127] checksum-test...success
[007/127] client-test...success
[008/127] compat-test...success
[009/127] compress-test...success
[010/127] config-test...success
[011/127] conflict-data-test...success
[012/127] conflicts-test...success
[013/127] crypto-test...success
[014/127] db-test...success
[015/127] diff-diff3-test...success
[016/127] dirent_uri-test...success
[017/127] dump-load-test...success
[018/127] entries-compat-test...success
[019/127] error-code-test...success
[020/127] error-test...success
[021/127] filesize-test...success
[022/127] fs-base-test...success
[023/127] fs-fs-pack-test...success
[024/127] fs-fs-private-test..
Re: Getting to first release of pristines-on-demand feature (#525).
On 07 Dec 2022, Evgeny Kotkov wrote: Evgeny Kotkov writes: I think that the `pristines-on-demand-on-mwf` branch is now ready for a merge to trunk. I could do that, assuming there are no objections. +1, and thank you. Now, I haven't had time to do a real code review -- my manager hat gets tighter every year -- so my "+1" is mainly a sign of enthusiasm for the feature, and of general trust in our test suite and in everyone who has worked on this. https://svn.apache.org/repos/asf/subversion/branches/pristines-on-demand-on-mwf The branch includes the following: – Core implementation of the new mode where required pristines are fetched at the beginning of the operation. – A new --store-pristine=yes/no option for `svn checkout` that is persisted as a working copy setting. +1 to this UI. We can offer other gateways to this feature later, but this is a clean & simple way to start out. – An update for `svn info` to display the value of this new setting. Yay. – A standalone test harness that tests main operations in both --store-pristine modes and gets executed on every test run. – A new --store-pristine=yes/no option for the test suite that forces all tests to run with a specific pristine mode. Very nice. The branch passes all tests in my Windows and Linux environments, in both --store-pristine=yes and =no modes. W00t! While here, I would like to raise a topic of incorporating a switch from SHA1 to a different checksum type (without known collisions) for the new working copy format. This topic is relevant to the pristines-on-demand branch, because the new "is the file modified?" check relies on the checksum comparison, instead of comparing the contents of working and pristine files. And so while I consider it to be out of the scope of the pristines-on-demand branch, I think that we might want to evaluate if this is something that should be a part of the next release. Good point. Maybe worth a new thread? Best regards, -Karl
Re: Getting to first release of pristines-on-demand feature (#525).
Evgeny,

Thanks so much for your hard work in pushing this project forward! I don't think I can contribute much in getting this merged to trunk (from lack of C experience and lack of time to dig into the inner workings), but I hope it can be completed!

Kind regards,
Daniel Sahlberg

On Wed, 7 Dec 2022 at 18:10, Evgeny Kotkov via dev <dev@subversion.apache.org> wrote:
> Evgeny Kotkov writes:
>
> > > IMHO, once the tests are ready, we could merge it and release
> > > it to the world.
> >
> > Apart from the required test changes, there are some technical
> > TODOs that remain from the initial patch and should be resolved.
> > I'll try to handle them as well.
>
> I think that the `pristines-on-demand-on-mwf` branch is now ready for a
> merge to trunk. I could do that, assuming there are no objections.
>
> https://svn.apache.org/repos/asf/subversion/branches/pristines-on-demand-on-mwf
>
> The branch includes the following:
> – Core implementation of the new mode where required pristines are fetched
>   at the beginning of the operation.
> – A new --store-pristine=yes/no option for `svn checkout` that is persisted
>   as a working copy setting.
> – An update for `svn info` to display the value of this new setting.
> – A standalone test harness that tests main operations in both
>   --store-pristine modes and gets executed on every test run.
> – A new --store-pristine=yes/no option for the test suite that forces all
>   tests to run with a specific pristine mode.
>
> The branch passes all tests in my Windows and Linux environments, in both
> --store-pristine=yes and =no modes.
>
> While here, I would like to raise a topic of incorporating a switch from
> SHA1 to a different checksum type (without known collisions) for the new
> working copy format. This topic is relevant to the pristines-on-demand
> branch, because the new "is the file modified?" check relies on the
> checksum comparison, instead of comparing the contents of working and
> pristine files.
>
> And so while I consider it to be out of the scope of the
> pristines-on-demand branch, I think that we might want to evaluate if this
> is something that should be a part of the next release.
>
> Thanks,
> Evgeny Kotkov
Re: Getting to first release of pristines-on-demand feature (#525).
Evgeny Kotkov writes: > > IMHO, once the tests are ready, we could merge it and release > > it to the world. > > Apart from the required test changes, there are some technical > TODOs that remain from the initial patch and should be resolved. > I'll try to handle them as well. I think that the `pristines-on-demand-on-mwf` branch is now ready for a merge to trunk. I could do that, assuming there are no objections. https://svn.apache.org/repos/asf/subversion/branches/pristines-on-demand-on-mwf The branch includes the following: – Core implementation of the new mode where required pristines are fetched at the beginning of the operation. – A new --store-pristine=yes/no option for `svn checkout` that is persisted as a working copy setting. – An update for `svn info` to display the value of this new setting. – A standalone test harness that tests main operations in both --store-pristine modes and gets executed on every test run. – A new --store-pristine=yes/no option for the test suite that forces all tests to run with a specific pristine mode. The branch passes all tests in my Windows and Linux environments, in both --store-pristine=yes and =no modes. While here, I would like to raise a topic of incorporating a switch from SHA1 to a different checksum type (without known collisions) for the new working copy format. This topic is relevant to the pristines-on-demand branch, because the new "is the file modified?" check relies on the checksum comparison, instead of comparing the contents of working and pristine files. And so while I consider it to be out of the scope of the pristines-on-demand branch, I think that we might want to evaluate if this is something that should be a part of the next release. Thanks, Evgeny Kotkov
Re: Getting to first release of pristines-on-demand feature (#525).
On 29 Nov 2022, Johan Corveleyn wrote: My thanks also to the courageous people having developed this, and the gentle souls keeping the ball rolling :-). About the name: [...] FWIW, my vote still goes to --store-pristines={yes|no} Same here, FWIW. I understand the argument that this exposes an "implementation detail" that the user is supposed to not need to think about. But remember, the reason we developed this feature is because the user was *already* exposed to the existence of pristines: disk space usage by pristines is quite visible to the user -- that's the whole problem :-). So only users who already "see" pristines -- that is, who are already aware of the storage issue -- would go looking for this feature in the first place. So by the time they learn about the '--store-pristines' option, they're already being forced to deal with pristines as a concept, and the only question is whether the tool we give them to solve their problem will take advantage of that conceptual familiarity. So, +1 to "--store-pristines=foo". I prefer such an explicit option here, rather than vague ones that could cover many different things. Also, --optimize=X can easily be interpreted inversely as intended (for instance: when I have an optimal network, do I use --optimize=network?) Apart from {yes|no} the feature might grow other option values in the future ('size-based' or 'text-only', or maybe simply 'auto' if we come up with a good general strategy that works for 99% of the cases, the details of which we don't want to burden our users with). We could even, in some distant future, allow user-defined names that are specified in ~/.subversion/config by the user (using some syntax where the user can set configurable size limits or mime-types or whatever). I also agree with Johan's point here. One other suggestion: not a blocker of course, but a runtime-config-area default would be nice :-). 
Users might want to choose the same option all the time, without having to remember to add the option to their checkout command. Something like, in ~/.subversion/config:

  store-pristines-default={yes|no}

Later on, this might grow into more sophisticated local run-time config regarding pristines, but for now, providing this basic yes/no default is a good idea. For example, on machines where one is regularly checking out trees with huge files, one might set the default to "no".

Best regards,
-Karl
Re: Getting to first release of pristines-on-demand feature (#525).
My thanks also to the courageous people having developed this, and the gentle souls keeping the ball rolling :-).

About the name:

On Thu, Nov 24, 2022 at 3:57 PM Nathan Hartman wrote:
...
> Previously we got stuck trying to choose the user-facing name of this
> feature and its command line switches.
>
> Currently the CLI switch is --store-pristine={yes|no}.
>
> I'm okay with this, but for completeness I'll mention that earlier in
> the year there was a little bit of push back because pristines, up
> until now, have been an internal implementation detail that users
> needn't concern themselves with. (Except that they double the storage
> space...)
>
> I've been trying to think of something better for months now, and
> here's what I've come up with:
>
> --optimize=storage
> --optimize=network

FWIW, my vote still goes to --store-pristines={yes|no}

I prefer such an explicit option here, rather than vague ones that could cover many different things. Also, --optimize=X can easily be interpreted as the inverse of what is intended (for instance: when I have an optimal network, do I use --optimize=network?)

Apart from {yes|no} the feature might grow other option values in the future ('size-based' or 'text-only', or maybe simply 'auto' if we come up with a good general strategy that works for 99% of the cases, the details of which we don't want to burden our users with). We could even, in some distant future, allow user-defined names that are specified in ~/.subversion/config by the user (using some syntax where the user can set configurable size limits or mime-types or whatever).

One other suggestion: not a blocker of course, but a runtime-config-area default would be nice :-). Users might want to choose the same option all the time, without having to remember to add the option to their checkout command. Something like, in ~/.subversion/config:

  store-pristines-default={yes|no}

Just my 2 cents of course ...

--
Johan
Re: Getting to first release of pristines-on-demand feature (#525).
On Wed, Nov 23, 2022 at 9:53 AM Julian Foad wrote:
> Nathan, I see you replied enthusiastically and mentioned "I have much to
> say on both of these [TODOs] but I won't go into detail yet...". It
> seems to me it could be helpful to get that started sooner rather than
> later, too, if those issues still need hashing out.

Thanks for the nudge.

Previously we got stuck trying to choose the user-facing name of this feature and its command line switches.

Currently the CLI switch is --store-pristine={yes|no}.

I'm okay with this, but for completeness I'll mention that earlier in the year there was a little bit of push back because pristines, up until now, have been an internal implementation detail that users needn't concern themselves with. (Except that they double the storage space...)

I've been trying to think of something better for months now, and here's what I've come up with:

--optimize=storage
--optimize=network

Rationale:

* Self-documenting.

* Easy to explain: --optimize=storage saves storage space; --optimize=network reduces network accesses to the repository server.

* Users don't need to know about pristines. There aren't several levels of abstraction between the option name and why the user cares about it.

* Extensible. Maybe we can think of other ways to optimize for network bandwidth, for example.

The docs can give more user-facing explanation, including tradeoffs, which SVN operations are affected, and example scenarios to help users choose. It should be much easier to write -- and read -- than what we currently have at the draft release notes [1].

As for example scenarios, while the original premise was to save space on large files that don't change often, i525pod is also great in other situations, such as checking out a large source tree on a ramdrive (limited space), or on the same machine as the repo, or on a storage-limited embedded device. (I've tried i525pod in all 3 of these scenarios!)

Downsides:

* Admittedly, --optimize=network isn't the best name in all scenarios. Notably, this is a misnomer when the repository server is on the same machine as the working copy, but that might not matter because it's the default. (And I might suggest trying --optimize=storage in that scenario).

* If we ever want to do other cool things with pristines, such as an option to keep more locally cached history, these names won't be right for that.

* These option names haven't helped me come up with a better name for the feature itself.

There is an advantage to using --store-pristine={yes|no}: We don't need to rename the feature because Pristines On Demand and the CLI options are named similarly.

The disadvantage of --store-pristine={yes|no} is that the feature is more burdensome for us to explain and for others to learn about, especially from a non-technical standpoint. How would you explain this feature in a press release, or in a short blurb (or dare I say, tweet) about "What's new in Subversion 1.15?"

Some other possibilities that were discussed: I'll mention these for completeness but note that if --optimize=x is shot down, I'd rather use --store-pristine={yes|no} than any of these:

* Hydrate and dehydrate -- perhaps the terms that appear most in dev discussions. I don't recommend these in user-facing areas because they aren't self-documenting. Users can't deduce what these actually do for the user. Users might mistakenly think that their working files would be hydrated or dehydrated in some way. Users would have to learn about pristines to know what is being hydrated or dehydrated, eliminating any useful abstraction.

* "Bare working copies" -- the draft release notes [1] use this term tentatively to explain that "bare" working copies save storage by not caching "BASE" files. Unfortunately, "bare" and "BASE" differ by only one letter (and capitalization) and I feel like the explanation is too complicated and doesn't bring us closer to a good result.

* Briefly discussed: "local BASE" or "remote BASE" -- but that's a misnomer because there's no such thing as "remote" BASE.

Well, you've been warned that I have much to say. :-)

Cheers,
Nathan
Re: Getting to first release of pristines-on-demand feature (#525).
I'm glad to see you all picking up this project again. While working on this at the beginning of the year I turned on the pristines-on-demand mode in some of my own WCs such as my 'Documents' tree which includes lots of scanned paper docs. It works nicely for cases like this, and feels right, the pristine store being mostly unpopulated when the working files are mostly unchanging. I meant to check back with you during the year, how we should take it forward. The recent summary in this thread sounds about right. My own capacity to contribute is steadily decreasing. So, thank you, dev community: it's good to see people working together to make it happen. It would be pleasing to see this being brought to a satisfactory state and released. Nathan, I see you replied enthusiastically and mentioned "I have much to say on both of these [TODOs] but I won't go into detail yet...". It seems to me it could be helpful to get that started sooner rather than later, too, if those issues still need hashing out. - Julian
Re: Getting to first release of pristines-on-demand feature (#525).
On 16 Nov 2022, Evgeny Kotkov wrote:
> Apart from the required test changes, there are some technical TODOs that
> remain from the initial patch and should be resolved. I'll try to handle
> them as well.

Thank you!
Re: Getting to first release of pristines-on-demand feature (#525).
Karl Fogel writes:

> Thank you, Evgeny! Just to make sure I understand correctly --
> the status now on the 'pristines-on-demand-on-mwf' branch is:
>
> 1) One can do 'svn checkout --store-pristines=no' to get an
> entirely pristine-less working copy. In that working copy,
> individual files will be hydrated/dehydrated automagically on an
> as-needed basis.
>
> 2) There is no command to hydrate or dehydrate a particular file.
> Hydration and dehydration only happen as a side effect of other
> regular Subversion operations.
>
> 3) There is no way to rehydrate the entire working copy. E.g.,
> something like 'svn update --store-pristines=yes' or 'svn hydrate
> --depth=infinity' does not exist yet.
>
> 4) Likewise, there is no way to dehydrate an existing working copy
> that currently has its pristines (even if that working copy is at
> a high-enough version format to support pristinelessness). E.g.,
> something like 'svn update --store-pristines=no' or 'svn dehydrate
> --depth=infinity' does not exist yet.
>
> Is that all correct?

Yes, I believe that is correct.

> By the way, I do not think (2), (3), and (4) are blockers. Just
> (1) by itself is a huge step forward and solves issue #525;

+1 on keeping the scope of the feature to just (1) for now.

> IMHO, once the tests are ready, we could merge it and release
> it to the world.

Apart from the required test changes, there are some technical TODOs that remain from the initial patch and should be resolved. I'll try to handle them as well.

Thanks,
Evgeny Kotkov
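[Editorial illustration, not part of the original message.] The behaviour confirmed in point (1) above can be pictured with a toy model. This is only a sketch under stated assumptions: `ToyWorkingCopy`, `sync_pristines`, and the `fetch` callback are hypothetical names invented for illustration and do not correspond to any libsvn_wc API. It assumes that a pristine is kept only while its working file differs from BASE, and that "differs" is decided by comparing SHA-1 checksums.

```python
import hashlib
from typing import Callable, Dict


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 checksum of data as lowercase hex."""
    return hashlib.sha1(data).hexdigest()


class ToyWorkingCopy:
    """Toy model only; NOT how Subversion's working copy is implemented."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self.fetch = fetch                        # hypothetical "ask the repository" callback
        self.base_checksum: Dict[str, str] = {}   # path -> checksum of the BASE text
        self.working: Dict[str, bytes] = {}       # path -> current working file content
        self.pristines: Dict[str, bytes] = {}     # checksum -> locally stored pristine

    def checkout(self, path: str) -> None:
        # Record only the checksum; no pristine copy is stored at checkout.
        data = self.fetch(path)
        self.base_checksum[path] = sha1_hex(data)
        self.working[path] = data

    def is_modified(self, path: str) -> bool:
        # "Is the file modified?" is answered by comparing checksums.
        return sha1_hex(self.working[path]) != self.base_checksum[path]

    def sync_pristines(self) -> None:
        # Runs as a side effect of other operations, never on its own.
        needed = set()
        for path in self.working:
            if self.is_modified(path):
                checksum = self.base_checksum[path]
                needed.add(checksum)
                if checksum not in self.pristines:
                    self.pristines[checksum] = self.fetch(path)  # hydrate
        for checksum in list(self.pristines):
            if checksum not in needed:
                del self.pristines[checksum]                     # dehydrate


# Usage: the pristine store stays empty until a file is edited locally.
repo = {"a.txt": b"alpha\n", "b.txt": b"beta\n"}
wc = ToyWorkingCopy(fetch=repo.__getitem__)
wc.checkout("a.txt")
wc.checkout("b.txt")
wc.working["a.txt"] = b"alpha, edited\n"
wc.sync_pristines()
```

In this model the store holds exactly one pristine (for a.txt) after the edit, and it empties again once the edit is undone and `sync_pristines()` runs. That matches the observation elsewhere in this thread that the pristine store stays mostly unpopulated when working files rarely change.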
Re: Getting to first release of pristines-on-demand feature (#525).
On 15 Nov 2022, Evgeny Kotkov wrote:
> Evgeny Kotkov writes:
> > Perhaps we could transition into that state by committing the patch
> > and maybe re-evaluate things from there. I could do that, assuming
> > no objections, of course.
>
> Committed the patch in https://svn.apache.org/r1905324
>
> I'll try to handle the related tasks in the near future.

Thank you, Evgeny! Just to make sure I understand correctly -- the status now on the 'pristines-on-demand-on-mwf' branch is:

1) One can do 'svn checkout --store-pristines=no' to get an entirely pristine-less working copy. In that working copy, individual files will be hydrated/dehydrated automagically on an as-needed basis.

2) There is no command to hydrate or dehydrate a particular file. Hydration and dehydration only happen as a side effect of other regular Subversion operations.

3) There is no way to rehydrate the entire working copy. E.g., something like 'svn update --store-pristines=yes' or 'svn hydrate --depth=infinity' does not exist yet.

4) Likewise, there is no way to dehydrate an existing working copy that currently has its pristines (even if that working copy is at a high-enough version format to support pristinelessness). E.g., something like 'svn update --store-pristines=no' or 'svn dehydrate --depth=infinity' does not exist yet.

Is that all correct?

By the way, I do not think (2), (3), and (4) are blockers. Just (1) by itself is a huge step forward and solves issue #525; IMHO, once the tests are ready, we could merge it and release it to the world.

Best regards,
-Karl
Re: Getting to first release of pristines-on-demand feature (#525).
Evgeny Kotkov writes:

> Perhaps we could transition into that state by committing the patch
> and maybe re-evaluate things from there. I could do that, assuming
> no objections, of course.

Committed the patch in https://svn.apache.org/r1905324

I'll try to handle the related tasks in the near future.

Thanks,
Evgeny Kotkov
Re: Getting to first release of pristines-on-demand feature (#525).
Karl Fogel writes:

> By the way, in that thread, Evgeny Kotkov -- whose initial work
> much of this is based on -- follows up with a patch that does a
> first-pass implementation of 'svn checkout --store-pristines=no'
> (by implementing a new persistent setting in wc.db).

Perhaps we could transition into that state by committing the patch and maybe re-evaluate things from there. I could do that, assuming no objections, of course.

Thanks,
Evgeny Kotkov
Re: Getting to first release of pristines-on-demand feature (#525).
On Sat, Nov 5, 2022 at 6:13 PM Karl Fogel wrote:
>
> Hi, all. This is a high-level mail in which I try to figure out
> the current status of the issue #525 work and what's left to land
> it in trunk and release it. Corrections and feedback welcome.

Thanks for the overview and the work already done to make this possible!

The P-O-D feature itself works. What's left to do for a first release, IMHO:

(1) Decide on user-facing names for the feature and its command line switch(es).

(2) Resolve the [TODO] that Karl mentions (decoupling the compatible version switch from the i525pod switch).

Though there are many other possible enhancements, some of them touched upon in Karl's message, I think these two items are the only really crucial ones for a first release.

I have much to say on both of these but I won't go into detail yet because that would hijack the thread away from the high-level topic of: what remains to be done for initial viable product? I'd like to give others a chance to respond before we dive down the rabbit hole. :-) It's better if each of the above becomes a thread devoted to that topic.

I'll point out that some initial release note text was drafted at [1].

Cheers,
Nathan

[1] https://subversion-staging.apache.org/docs/release-notes/1.15.html#bare-working-copies