Re: A two-part vision for Subversion and large binary objects.

Karl Fogel Thu, 13 Jan 2022 20:38:50 -0800

On 12 Jan 2022, Julian Foad wrote:

No reason to upgrade an old WC until someone actually wants an optional pristine.
In principle, and what we ideally desire, agreed. Here I was just saying what this branch does as it is now, before being combined with the multi-wc-format work, which we're told is needed to accommodate what we desire. (I'll be looking into exactly what this means and whether avoiding WC database changes and using on-disk pristine presence alone is a feasible (perhaps even superior) alternative, as I mentioned.)


Gotcha -- understood.

Also, the point of this feature is not to remove pristines for all unmodified files. It's to make it possible for users to specify certain circumstances (generally involving large file size!) in which the pristine should be omitted *for certain files*.
Understood. That ability (to pick and choose which files it applies to, by some client-side config) will need to be added. I'll be looking into it.


Cool.

Here's an example use case, in case it helps:

At our company, we have a separate repository for large binary assets, which we call our "bigdata" repository. However, not everything in that repository is a giant multi-gigabyte blob -- there are also some README files, etc. For those small files, one naturally wants the pristines, because 'svn diff' is useful locally on them. But generally no one wants pristines for the large binary blob files.

So if we have client-side configuration that can specify "no pristine" based on some combination of one or more of...


 - file size
 - repository of origin
 - path and/or basename
 - svn:mime-type property (if present)
 - some custom property

...then any developer will be able to get whatever behavior they need given their local storage constraints.

Different people may make different choices based on available local storage. A developer with a lot of local disk space might set her no-pristine size threshold to, say, 5 GB, and thus at least preserve 'svn revert' ability for those files (I don't think 'svn diff' would be useful for any of the files, though there might be exceptions even to that). Meanwhile, another developer with less disk space might choose 100 MB as the limit.
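To make the idea concrete, here is a minimal sketch in Python of how such a client-side rule might be evaluated. Nothing in it is existing Subversion configuration or API: the names `FileInfo`, `NoPristinePolicy`, and `omit_pristine`, the example URL, and the choice to require every configured criterion to match are all assumptions made for illustration. The custom-property criterion is left out to keep it short.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import Optional, Sequence

MB = 1024 ** 2
GB = 1024 ** 3


@dataclass(frozen=True)
class FileInfo:
    """Facts a client could know about one versioned file (hypothetical)."""
    path: str
    size: int                         # size in bytes
    repos_root: str                   # repository of origin
    mime_type: Optional[str] = None   # svn:mime-type, if present


@dataclass(frozen=True)
class NoPristinePolicy:
    """Hypothetical client-side rule. Unset criteria are not tested."""
    min_size: Optional[int] = None
    repos_roots: Sequence[str] = ()
    path_globs: Sequence[str] = ()
    mime_globs: Sequence[str] = ()

    def omit_pristine(self, f: FileInfo) -> bool:
        # With nothing configured, keep every pristine (today's behavior).
        if (self.min_size is None and not self.repos_roots
                and not self.path_globs and not self.mime_globs):
            return False
        # Assumption: all configured criteria must match (logical AND).
        if self.min_size is not None and f.size < self.min_size:
            return False
        if self.repos_roots and f.repos_root not in self.repos_roots:
            return False
        if self.path_globs and not any(fnmatch(f.path, g) for g in self.path_globs):
            return False
        if self.mime_globs and not any(
                fnmatch(f.mime_type or "", g) for g in self.mime_globs):
            return False
        return True


# Two developers, same repository, different local disk budgets.
BIGDATA = "https://svn.example.com/bigdata"   # made-up URL
roomy = NoPristinePolicy(min_size=5 * GB, repos_roots=(BIGDATA,))
tight = NoPristinePolicy(min_size=100 * MB, repos_roots=(BIGDATA,))

readme = FileInfo("foo/README", 2_000, BIGDATA)
blob = FileInfo("foo/scan.tif", 2 * GB, BIGDATA, "image/tiff")

for name, policy in (("roomy", roomy), ("tight", tight)):
    print(name, policy.omit_pristine(readme), policy.omit_pristine(blob))
```

In this sketch the README keeps its pristine under both policies, while the 2 GB blob keeps its pristine only for the developer with the 5 GB threshold. Whether criteria should combine with AND, OR, or an ordered rule list is a design question this sketch does not settle.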

This is why I feel so strongly that the UI needs to be entirely client side -- only the client side has the information needed to make the appropriate decision.

By the way, I'll give some more details about our setup, since it involves a nice trick: our bigdata repository tree is a sparse mirror of our regular internal corporate repository tree, which is also in SVN because Subversion's path-based authz is so great when you have different clients / contractors / employees / partners all having access to different things. (Note that our source code is in public Git repositories -- it's all open source. The stuff I'm talking about here is not source code, though that doesn't matter for this discussion.)

Having the two SVN repositories be parallel means that we can use the same authn file and authz file for both :-). So if a person has access to customer Foo's area in the regular repository, then by default they also have access to the bigdata assets for Foo. (In the few cases where different access is needed, we just create an extra subdirectory and update the authz spec accordingly.)

Most people never need the binary assets, and so they don't pay those checkout or storage costs -- they never fetch from bigdata. But a few people *do* need access to the bigdata, and for some of it the checkout totals can run to hundreds of gigabytes, so avoiding pristines is not just a nice benefit but rather usually a necessity.

All this is just our use case, of course. I don't mean that it is more important than other use cases others may present. I just wanted to give some concreteness to the discussions. I hope others will post with their scenarios.

Best regards,
-Karl
