On Wed, Dec 27, 2017 at 12:08:07AM -0800, 
antonio.poma...@external.thalesgroup.com wrote:

> I am trying to find some official information about the limit size of a 
> repository. As I have read git has a limitation of 2GB for each repo 
> because if you increment the size it starts to decrease the overall 
> performance. But this limitation is mainly explained in the web based 
> servers like GitHub or Bitbucket Cloud. So my main question is if this also 
> happens *when you use a self hosted server* like Bitbucket Server or GitHub 
> enterprise. However, what I am trying to find is official documentation 
> that explains this.

It's a bit unclear what you're asking, for a number of reasons.

First, the statement "if you increment the size it starts to decrease
the overall performance" defines an operational speed as a function of
some "size" -- while not saying anything about what the acceptable
threshold of that speed is.  And certainly, the speed of operations
depends heavily on the hardware.

Second, the "size" bit of the statement cited above is vague, too.
Which size does it refer to? The overall cumulative size of all the
objects kept in a Git repository, or the size of just one (typical?
biggest?) object which is part of a commit?

Third, the term "self-hosted" unfortunately does not tell much either.
Here's why. First, there exists "the" Git, which is the reference
implementation developed over there at the <git at vger.kernel.org>
mailing list and conveniently available via <https://git-scm.com> and
from <https://github.com/git/git>. Git is written in C and primarily
targets POSIX-compatible systems. Next, there exists its "official"
ongoing port to Windows, known as "Git for Windows". While it's very
closely related to "vanilla" Git, it's still a separate project for a
number of reasons (see below for one of them). Next, there are various
implementations of "Git" -- as a set of wire protocols and repository
format -- developed as separate projects. A very visible example is JGit
which is a sort-of compatible Git implementation in Java.
So, to discuss any "self-hosted solution" other than the plain stock vanilla
Git implementation, we must first figure out what that particular solution
uses under its hood to talk to Git clients and manage its repositories.


Now let's have another stab at all of this.

To store objects in a repository, vanilla Git uses a combination of
plain ("loose") files, each of which stores the compressed data of a
single object (the contents of a checked-in file, or another kind of
data), and the so-called pack files, which are efficiently compressed
archives, indexed for fast access.
AFAIK, the limits Git has with regard to these files come from the
underlying filesystem, and AFAIK pack files are kept under some size
(4 GiB?) each intentionally.
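
To make the "which size?" question concrete, here is a rough Python
sketch that measures the two kinds of storage separately. It assumes the
usual on-disk layout (loose objects in two-hex-digit directories under
objects/, packs as *.pack files under objects/pack/); the function name
object_store_sizes is made up for illustration, and the numbers are
compressed on-disk sizes, not the sizes of the checked-in files.

```python
import os
import re

# Loose objects live in two-hex-digit "fan-out" directories under objects/.
_FANOUT = re.compile(r"^[0-9a-f]{2}$")


def object_store_sizes(git_dir):
    """Return (loose_bytes, pack_bytes, largest_pack_bytes) for git_dir/objects.

    Hypothetical helper: it only adds up file sizes on disk and does not
    parse any Git data structures.
    """
    objects = os.path.join(git_dir, "objects")
    loose = packs = largest_pack = 0
    if not os.path.isdir(objects):
        return loose, packs, largest_pack

    # Sum every file found in the fan-out directories (loose objects).
    for entry in os.listdir(objects):
        path = os.path.join(objects, entry)
        if _FANOUT.match(entry) and os.path.isdir(path):
            for name in os.listdir(path):
                loose += os.path.getsize(os.path.join(path, name))

    # Sum the *.pack files; the *.idx index files are deliberately skipped.
    pack_dir = os.path.join(objects, "pack")
    if os.path.isdir(pack_dir):
        for name in os.listdir(pack_dir):
            if name.endswith(".pack"):
                size = os.path.getsize(os.path.join(pack_dir, name))
                packs += size
                largest_pack = max(largest_pack, size)

    return loose, packs, largest_pack
```

Calling object_store_sizes(".git") in a work tree gives the cumulative
figures; stock Git reports similar numbers via "git count-objects -v".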

Working with commits is another story. AFAIK Git has two code paths to
handle "small" objects and "large" objects; in the latter case it
switches to a so-called streaming API essentially allowing it to handle
insanely large objects.
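
The difference between the two approaches can be illustrated with a
small Python sketch. This is not Git's actual streaming API, only the
general idea: the first function needs the whole content in memory, the
second reads the file in fixed-size chunks. Both compute the ID of a
blob the way Git does with its default SHA-1 object format (hash of
"blob <size>", a NUL byte, then the content). The function names and the
1 MiB chunk size are arbitrary choices for the example.

```python
import hashlib
import os


def blob_id_in_memory(data):
    """Hash a blob whose entire content is already held in memory."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def blob_id_streaming(path, chunk_size=1 << 20):
    """Hash a file as a blob, reading at most chunk_size bytes at a time."""
    size = os.path.getsize(path)
    digest = hashlib.sha1(b"blob %d\0" % size)
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

With the streaming variant the memory use stays bounded by chunk_size no
matter how big the file is. In stock Git the cut-over point between the
two paths is, as far as I know, governed by core.bigFileThreshold.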

One twist to this comes from Git for Windows. Historically, due to what
may be thought of as a design short-sightedness, sizes of the objects
are encoded in Git using a C data type which is 64-bit these days on all
platforms except Windows, where it's still 32-bit. This leads to a
practical limit of 2 GiB on the size of the objects Git is able to
manipulate in memory; see [1] for more background.
But this supposedly does not apply to JGit running on Windows.
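
The arithmetic behind such limits is easy to check. The Python sketch
below only models integer widths; it does not inspect Git itself, and
the helper names are made up for the example. A 32-bit field tops out
just below 4 GiB when unsigned and just below 2 GiB when signed, so
which figure applies in practice depends on how a given code path treats
the value (an assumption on my side; see [1] for the real details).

```python
GIB = 1024 ** 3


def max_value(bits, signed=False):
    """Largest integer representable in a C integer type of the given width."""
    return 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1


def fits(size_bytes, bits, signed=False):
    """True if an object size can be stored in such a field without overflow."""
    return 0 <= size_bytes <= max_value(bits, signed)


# A 3 GiB object: representable as unsigned 32-bit, but not as signed 32-bit.
three_gib_unsigned_ok = fits(3 * GIB, 32)
three_gib_signed_ok = fits(3 * GIB, 32, signed=True)

# A 5 GiB object needs a 64-bit field either way.
five_gib_32_ok = fits(5 * GIB, 32)
five_gib_64_ok = fits(5 * GIB, 64)
```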


As you can see, it's next to impossible to answer your question as stated
because too much information is missing, and the possible answer highly
depends on the actual hosting solution and the environment it's running
in. I think Mark Waite put forward a good practical answer, while mine
can be considered as touching on the more theoretical/background side of
this stuff.

1. https://public-inbox.org/git/a660460d-b294-5113-bfaf-d98bcf99b...@gmail.com/
