> Note that keeping a single ref per issue is problematic, because ref
> scalability is one of git's weak points. In particular, a git pull
> always exchanges a list of all refs in the repository, so if you have
> thousands of issues in thousands of refs, git pulls become slower.

I've played with a BuGit repository holding a copy of debbugs.gnu.org
(about 30K issues, IIRC) and indeed I found a few performance problems,
but it still seemed very usable at that scale.  It did indicate that it
probably wouldn't scale as-is to bugs.debian.org and its >1M issues.
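
To put rough numbers on the ref-list cost mentioned in the quote, here is
a back-of-the-envelope sketch.  It assumes the original (pre-v2) wire
format, where each advertised ref is one pkt-line: a 4-byte length, a
40-character SHA-1, a space, the ref name and a newline.  The
"refs/issues/<n>" naming is hypothetical, and capabilities and peeled
tags are ignored.

```python
def advert_bytes(ref_names):
    """Approximate size of a ref advertisement, one pkt-line per ref."""
    # 4 (length prefix) + 40 (hex SHA-1) + 1 (space) + name + 1 (newline)
    return sum(4 + 40 + 1 + len(name) + 1 for name in ref_names)


def issue_refs(count):
    """Hypothetical naming scheme: one ref per issue under refs/issues/."""
    return (f"refs/issues/{n}" for n in range(1, count + 1))


for count in (30_000, 1_000_000):
    size = advert_bytes(issue_refs(count))
    print(f"{count} issue refs: about {size / 1e6:.1f} MB per advertisement")
```

The estimate only covers the advertisement itself; it says nothing about
the time Git spends reading or comparing refs on either side.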

> I suppose that gcing refs for fixed issues would help avoid this, but
> active projects tend to accumulate a lot of issues.

The way I was thinking of attacking this was by keeping issues in
a separate remote repository (so as to separate the performance of
"pull" on issues from that of "pull" on code), as well as moving old
issues to an "archive" repository.

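A minimal sketch of what that split could look like in terms of plain Git
commands, built here as argument lists.  The remote names "issues" and
"archive" and the "refs/issues/" namespace are assumptions made for
illustration, not what BuGit actually uses.

```python
def fetch_issues_cmd(remote="issues", namespace="refs/issues"):
    """Fetch only the issue refs, from the dedicated issues remote."""
    refspec = f"+{namespace}/*:{namespace}/*"
    return ["git", "fetch", remote, refspec]


def archive_issue_cmds(issue_ref, active="issues", archive="archive"):
    """Copy one issue ref to the archive remote, then drop it from the
    active remote (a push with an empty source deletes the remote ref)."""
    return [
        ["git", "push", archive, f"{issue_ref}:{issue_ref}"],
        ["git", "push", active, f":{issue_ref}"],
    ]


print(" ".join(fetch_issues_cmd()))
for cmd in archive_issue_cmds("refs/issues/1234"):
    print(" ".join(cmd))
```

Each list can be handed to subprocess.run(cmd, check=True) inside a
clone that has both remotes configured.  The code remote then only ever
advertises code refs, and the active issues remote shrinks as old issues
are archived.
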
> (The git devs may have some plans to fix this, which require a new
> version of the git protocol.)

If someone wants to design something like git-dit or BuGit which can
handle 1M issues in a single repository, it might be a good idea to
replace the "tree of refs inside refs/" with a "tree of commits", so
that you only need a single ref (or a few) giving access to the tree of
commits, which you then traverse to find the actual issues.
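
The "tree of commits" idea can be modelled with a toy, in-memory object
store.  This is only one possible shape for it, not a design taken from
git-dit or BuGit: the Store class, the "index"/"issue" message
conventions, the fan-out of 16 and the ref name "refs/issues/index" are
all made up for the sketch.

```python
import hashlib
from bisect import bisect_right
from dataclasses import dataclass

FANOUT = 16  # hypothetical: maximum number of parents per index commit


@dataclass(frozen=True)
class Commit:
    message: str
    parents: tuple = ()  # object ids of the parent commits


class Store:
    """Toy content-addressed store standing in for a Git repository."""

    def __init__(self):
        self.objects = {}  # object id -> Commit
        self.refs = {}     # ref name -> object id

    def put(self, commit):
        data = repr((commit.message, commit.parents)).encode()
        oid = hashlib.sha1(data).hexdigest()
        self.objects[oid] = commit
        return oid


def build_index(store, heads):
    """Build index commits over {issue number: head commit id}.

    Each index commit lists the lowest key under each parent in its
    message, so a lookup can choose which parent to follow.
    """
    if not heads:
        raise ValueError("no issues to index")
    level = sorted(heads.items())
    while True:
        nxt = []
        for i in range(0, len(level), FANOUT):
            group = level[i:i + FANOUT]
            keys = " ".join(str(key) for key, _ in group)
            parents = tuple(oid for _, oid in group)
            nxt.append((group[0][0], store.put(Commit("index " + keys, parents))))
        level = nxt
        if len(level) == 1:
            return level[0][1]


def find_issue(store, root, number):
    """Walk from the root index commit down to one issue's head commit.

    Returns (commit id or None, number of parent links followed).
    """
    oid, hops = root, 0
    while True:
        commit = store.objects[oid]
        if not commit.message.startswith("index "):
            match = commit.message.startswith(f"issue {number}:")
            return (oid if match else None), hops
        keys = [int(key) for key in commit.message.split()[1:]]
        pos = bisect_right(keys, number) - 1
        if pos < 0:
            return None, hops
        oid, hops = commit.parents[pos], hops + 1


store = Store()
heads = {n: store.put(Commit(f"issue {n}: example")) for n in range(1, 1001)}
store.refs["refs/issues/index"] = build_index(store, heads)
found, hops = find_issue(store, store.refs["refs/issues/index"], 537)
print(found == heads[537], hops, len(store.refs))
```

The repository then advertises a single ref no matter how many issues it
holds, and a lookup follows a number of parent links that grows with the
logarithm of the issue count.  The price is that updating one issue means
rewriting every index commit on the path from that issue up to the root.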

My desire with BuGit was to try and move as much as possible of the work
to Git (e.g. let Git take care of merging), so dealing with such
scalability issues is pretty far down the list of priorities for BuGit.


        Stefan