On Wed, Jun 1, 2016 at 4:00 PM, Duy Nguyen <pclo...@gmail.com> wrote:
> On Tue, May 31, 2016 at 8:18 PM, Christian Couder
> <christian.cou...@gmail.com> wrote:
>>>> [3] 
>>>> http://thread.gmane.org/gmane.comp.version-control.git/202902/focus=203020
>>>
>>> This points to  https://github.com/peff/git/commits/jk/external-odb
>>> which is dead. Jeff, do you still have it somewhere, or is it not
>>> worth looking at anymore?
>>
>> I have rebased, fixed and improved it a bit. I added write support for
>> blobs. But the result is not very clean right now.
>> I was going to send a RFC patch series after cleaning the result, but
>> as you ask, here are some links to some branches:
>>
>> - https://github.com/chriscool/git/commits/gl-external-odb3 (the
>> updated patches from Peff, plus 2 small patches from me)
>> - https://github.com/chriscool/git/commits/gl-external-odb7 (the same
>> as above, plus a number of WIP patches to add blob write support)
>
> Thanks. I had a super quick look. It would be nice if you could give a
> high level overview on this (if you're going to spend a lot more time on it).

Sorry about the late answer.

Here is a new series after some cleanup:

https://github.com/chriscool/git/commits/gl-external-odb12

The high-level overview of the patch series, which I would like to
send really soon now, could go like this:

---
Git can store its objects only in the form of loose objects in
separate files or packed objects in a pack file.
To be able to better handle some kinds of objects, for example big
blobs, it would be nice if Git could store its objects in other object
databases (ODB).

To do that, this patch series makes it possible to register commands,
using "odb.<odbname>.command" config variables, to access external
ODBs. Each specified command will then be called in the following ways:

  - "<command> have": the command should output the sha1, size and
    type of all the objects the external ODB contains, one object per
    line.
  - "<command> get <sha1>": the command should read the content of
    the object corresponding to <sha1> from the external ODB and
    output it on stdout.
  - "<command> put <sha1> <size> <type>": the command should read an
    object from stdin and store it in the external ODB.

This RFC patch series does not address the following important parts
of a complete solution:

  - There is no way to transfer external ODB content using Git.
  - No real external ODB has been interfaced with Git. The tests use
    another git repo in a separate directory for this purpose, which
    is probably useless in the real world.
---
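
To make the three invocations above more concrete, here is a minimal
sketch of what a helper command could look like. It is not taken from
the patch series: the storage layout (one "<sha1>-<type>" file per
object in a directory), the ODB_STORE environment variable, the
space-separated "have" output, and the assumption that "put" receives
the raw object content on stdin are all made up for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical external ODB helper; a sketch, not the series' code."""
import os
import sys

# Assumption: objects are kept as "<sha1>-<type>" files in this directory.
STORE = os.environ.get("ODB_STORE", "/tmp/external-odb")


def have(store):
    """Return one "sha1 size type" line per stored object (format assumed)."""
    lines = []
    for name in sorted(os.listdir(store)):
        sha1, obj_type = name.split("-", 1)
        size = os.path.getsize(os.path.join(store, name))
        lines.append(f"{sha1} {size} {obj_type}")
    return lines


def get(store, sha1):
    """Return the raw content of the object named by sha1."""
    for name in os.listdir(store):
        if name.startswith(sha1 + "-"):
            with open(os.path.join(store, name), "rb") as f:
                return f.read()
    raise KeyError(sha1)


def put(store, sha1, size, obj_type, data):
    """Store data under sha1 after checking it has the announced size."""
    if len(data) != size:
        raise ValueError("size mismatch")
    with open(os.path.join(store, f"{sha1}-{obj_type}"), "wb") as f:
        f.write(data)


def main(argv):
    """Dispatch "have", "get <sha1>" and "put <sha1> <size> <type>"."""
    os.makedirs(STORE, exist_ok=True)
    cmd = argv[1]
    if cmd == "have":
        for line in have(STORE):
            print(line)
    elif cmd == "get":
        sys.stdout.buffer.write(get(STORE, argv[2]))
    elif cmd == "put":
        put(STORE, argv[2], int(argv[3]), argv[4], sys.stdin.buffer.read())
    else:
        sys.exit(f"unknown command: {cmd}")


# Only act as a command when a subcommand was actually given.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv)
```

With the series applied, such a script would presumably be registered
with something like "git config odb.magic.command /path/to/helper.py",
where "magic" is just an example ODB name.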

> One random thought, maybe it's better to have a daemon for external
> odb right from the start (one for all odbs, or one per odb, I don't
> know). It could do fancy stuff like object caching if necessary, and
> it can avoid high cost handshake (e.g. via tls) every time a git
> process runs and gets one object. Reducing process spawn would
> definitely receive a big cheer from Windows crowd.

The caching could be done inside Git and I am not sure it's worth
optimizing this now.
It could also make it more difficult to write support for an external
ODB if we required a daemon.
Maybe later we can add support for "odb.<odbname>.daemon" if we think
that this is worth it.

> Any thought on object streaming support?

No, I didn't think about this. In fact, I am not sure what this means.

> It could be a big deal (might
> affect some design decisions).

Could you elaborate on this?

> I would also think about how pack v4
> fits in this (e.g. how a tree walker can still walk fast, a big
> promise of pack v4; I suppose if you still maintain "pack" concept
> over external odb then it might work). Not that it really matters.
> Pack v4 is the future, but the future can never be "today" :)

Sorry, I haven't really followed pack v4 and I forgot what it is about.

Thanks,
Christian.