On Mon, Mar 2, 2015 at 6:30 AM, Richard Hipp <d...@sqlite.org> wrote:
> Ben Pollack's essay at
> http://bitquabit.com/post/unorthodocs-abandon-your-dvcs-and-return-to-sanity/
> succinctly points up some of the problems with DVCS versus centralized
> VCS (like subversion).  Much further discussion occurs on the various
> news aggregator sites.
>
> So I was thinking, could Fossil 2.0 be enhanced in ways to support
> scaling to the point where it works on really massive projects?
>
> The key idea would be to relax the requirement that each client load
> the entire history of the project.  Instead, a clone would only load a

Git can do this with shallow clones, and it's a relatively new
feature.  The really nice thing would be to load whatever is needed on
demand, or to perform certain operations (e.g., producing annotated
sources, viewing history, ...) on the server.
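
To make "load whatever is needed on demand" concrete, here is a
minimal Python sketch of a local store that goes to the server only
for artifacts it does not already hold.  Everything in it is
hypothetical: LazyStore, fetch_remote, and the SERVER dict are
stand-ins I made up, not Fossil or git APIs, and a real client would
speak the sync protocol rather than read a dict.

```python
from typing import Callable, Dict


class LazyStore:
    """Local artifact cache that asks the server only for what is missing."""

    def __init__(self, fetch_remote: Callable[[str], bytes]) -> None:
        self._fetch_remote = fetch_remote  # hypothetical network call
        self._local: Dict[str, bytes] = {}
        self.remote_fetches = 0  # counts round trips to the server

    def get(self, artifact_id: str) -> bytes:
        # Fetch on first use, then keep the artifact locally, so the
        # repository grows only as old content is actually requested.
        if artifact_id not in self._local:
            self._local[artifact_id] = self._fetch_remote(artifact_id)
            self.remote_fetches += 1
        return self._local[artifact_id]


# Stand-in for the server: a plain dict instead of a real sync protocol.
SERVER = {"a1": b"check-in from 2010", "b2": b"latest check-in"}
store = LazyStore(SERVER.__getitem__)
```

Asking for the same artifact twice costs one round trip; the second
request is served from the local copy.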

> limited amount of history (a month, a year, perhaps even just the most
> recent check-in).  This would make cloning much faster and the
> resulting clone much smaller.  Missing content could be downloaded
> from the server on an as-needed basis.  So, for example, if the user
> does "fossil update trunk:2010-01-01" then the local client would
> first have to go back to the server to fetch content from 2010.  The
> additional content would be added to the local repository.  And so the
> repository would still grow.  But it grows only on an as-needed basis
> rather than starting out at full size.  And in the common case where
> the developer never needs to look at any content over a few months
> old, the growth is limited.
>
> By downloading the meta-data that is currently computed locally by
> "rebuild", many operations on older content, such as timelines or
> search, could be performed even without having the data present.  In
> the "bsd-src.fossil" repository, the content is 78% of the repository
> file and the meta-data is the other 22%.  So a clone that stored only
> the most recent content together with all metadata might be about
> 1/4th the size of a full clone.  For even greater savings, perhaps the
> metadata could be time-limited, though not as severely as the content.
> So perhaps the clone would only initialize to the last month of
> content and the last five years of metadata.
>
> For "wide" repositories (such as bsd-src) that hold many thousands of
> files in a single check-out, Fossil could be enhanced to allow
> cloning, checkout, and commit of just a small slice of the entire
> tree.  So, for example, a clone might hold just the bin/ subdirectory
> of bsd-src containing just 56 files, rather than all 147720 files of a
> complete check-out.  Fossil should be able to do everything it
> normally does with just this subset, including commit changes, except
> that on new manifests generated by the commit, the R-card would have
> to be omitted since the entire tree is necessary to compute the
> R-card.  But the R-card is optional already, controlled by the
> "repo-cksum" setting, which is turned off in bsd-src, so there would
> be no loss in functionality.

Yes, this would be very nice, though a BSD would probably need
significant build system rototilling before developers could work on
isolated portions of the code with only partial clones.
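
As a rough illustration of what selecting a slice could look like,
the Python sketch below filters a list of paths down to one
subdirectory.  The function name slice_paths and the sample file
names are made up for the example; this is not how Fossil stores or
filters manifests.

```python
from typing import Iterable, List


def slice_paths(paths: Iterable[str], prefix: str) -> List[str]:
    """Return the paths that lie inside the directory `prefix`."""
    # Normalise to exactly one trailing slash so that "bin" does not
    # also match a sibling directory such as "binutils".
    root = prefix.rstrip("/") + "/"
    return [p for p in paths if p.startswith(root)]


# Illustrative file names only, not taken from the real bsd-src tree.
manifest = [
    "bin/cat/cat.c",
    "bin/ls/ls.c",
    "binutils/ld.c",
    "sys/kern/init_main.c",
]
bin_only = slice_paths(manifest, "bin/")
```

A client holding only bin_only cannot compute anything that depends
on the whole tree, which is why the quoted proposal drops the R-card
for such commits.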

> The sync protocol would need to be greatly enhanced to support this
> functionality.  Also, the schema for the meta-data, which currently is
> an implementation detail, would need to become part of the interface.
> Exposing the meta-data as interface would have been unthinkable a few
> years ago, but at this point we have accumulated enough experience
> about what is needed in the meta-data to perhaps make exposing its
> design a reasonable alternative.

Exposing the metadata would be one of the best things Fossil could do,
IMO, once it's ready.
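
To sketch why that would be useful: with the metadata in a documented
SQLite schema, a client or third-party tool could answer timeline
queries without holding any file content.  The table and column names
below (checkin_meta, uuid, mtime, user, comment) and the rows are
invented for the example; they are not Fossil's actual schema.

```python
import sqlite3

# In-memory database standing in for a metadata-only clone.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE checkin_meta ("
    " uuid TEXT PRIMARY KEY, mtime TEXT, user TEXT, comment TEXT)"
)
db.executemany(
    "INSERT INTO checkin_meta VALUES (?, ?, ?, ?)",
    [
        ("abc1", "2010-01-01", "alice", "old import"),
        ("def2", "2015-02-28", "bob", "fix build"),
        ("9f03", "2015-03-01", "carol", "update docs"),
    ],
)


def timeline(conn: sqlite3.Connection, since: str) -> list:
    """Return (mtime, user, comment) rows on or after `since`, newest first."""
    # ISO-8601 date strings sort correctly as plain text.
    rows = conn.execute(
        "SELECT mtime, user, comment FROM checkin_meta"
        " WHERE mtime >= ? ORDER BY mtime DESC",
        (since,),
    )
    return rows.fetchall()


recent = timeline(db, "2015-01-01")
```

The query touches only metadata rows, so it works the same whether or
not the content of those check-ins has been downloaded.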

Nico
--
_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
