On 7/31/2017 5:02 PM, Jonathan Tan wrote:
Besides review changes, this patch set now includes my rewritten
lazy-loading sha1_file patch, so you can now do this (excerpted from one
of the tests):

     test_create_repo server
     test_commit -C server 1 1.t abcdefgh
     HASH=$(git hash-object server/1.t)
     test_create_repo client
     test_must_fail git -C client cat-file -p "$HASH"
     git -C client config core.repositoryformatversion 1
     git -C client config extensions.lazyobject \
         "\"$TEST_DIRECTORY/t0410/lazy-object\" \"$(pwd)/server/.git\""
     git -C client cat-file -p "$HASH"

with fsck still working. Also, there is no need for a list of promised
blobs, and the long-running process protocol is being used.
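For readers unfamiliar with the framing, Git's long-running process protocol exchanges pkt-lines: each packet starts with a four-digit hexadecimal length that counts the four length bytes themselves, and "0000" is a flush packet. The following is only a rough sketch of that framing in Python. The helper names (pkt_line, read_pkt) are made up for illustration, and the handshake and the actual lazy-object messages from this patch set are not reproduced here.

```python
import io

FLUSH_PKT = b"0000"          # flush packet: a length field of zero, no payload
MAX_PKT_LEN = 65520          # largest packet Git accepts, header included


def pkt_line(payload: bytes) -> bytes:
    """Frame a payload as one pkt-line (hypothetical helper name)."""
    length = len(payload) + 4            # the length field counts itself
    if length > MAX_PKT_LEN:
        raise ValueError("payload too large for a single pkt-line")
    return b"%04x" % length + payload


def read_pkt(stream):
    """Read one pkt-line; return None for a flush packet (hypothetical helper)."""
    header = stream.read(4)
    if len(header) != 4:
        raise EOFError("stream ended inside a pkt-line header")
    length = int(header, 16)
    if length == 0:
        return None                      # flush packet
    if length < 4:
        raise ValueError("invalid pkt-line length")
    payload = stream.read(length - 4)
    if len(payload) != length - 4:
        raise EOFError("stream ended inside a pkt-line payload")
    return payload


# Round trip: one data packet followed by a flush packet.
wire = pkt_line(b"hello\n") + FLUSH_PKT
stream = io.BytesIO(wire)
first = read_pkt(stream)
second = read_pkt(stream)
```

Here the six-byte payload "hello\n" is framed with the header "000a" (6 + 4 = 10), and the second read sees the flush packet that ends the message.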

Changes from v1:
  - added last patch that supports lazy loading
  - clarified documentation in "introduce lazyobject extension" patch
    (following Junio's comments [1])

As listed in the changes above, I have rewritten my lazy-loading
sha1_file patch to no longer use the list of promises. Also, I have
added documentation about the protocol used to (hopefully) the
appropriate places.

Glad to see the removal of the promises. Given the ongoing conversation, I'm interested to see how you are detecting locally created objects vs. those downloaded from a server.


This is a minimal implementation, hopefully enough of a foundation to be
built upon. In particular, I haven't added the environment variable to
suppress lazy loading, and the lazy loading protocol only supports one
object at a time.

We can add multiple object support to the protocol when we get to the point that we have code that will utilize it.


Other work
----------

This differs slightly from Ben Peart's patch [2] in that the
lazy-loading functionality is provided through a configured shell
command instead of a hook shell script. I envision commands like "git
clone", in the future, needing to pre-configure lazy loading, and I
think that it will be less surprising to the user if "git clone" wrote a
default configuration instead of a default hook.

This was on my "todo" list to investigate as I've been told it can enable people to use taskset to set CPU affinity and get some significant performance wins. I'd be interested to see if it actually helps here at all.


This also differs from Christian Couder's patch set [3] that implements a
larger-scale object database, in that (i) my patch set does not support
putting objects into external databases, and (ii) my patch set requires
the lazy loader to make the objects available in the local repo, instead
of allowing the objects to only be stored in the external database.

This is the model we're using today so I'm confident it will meet our requirements.


[1] https://public-inbox.org/git/xmqqzibpn1zh....@gitster.mtv.corp.google.com/
[2] https://public-inbox.org/git/20170714132651.170708-2-benpe...@microsoft.com/
[3] https://public-inbox.org/git/20170620075523.26961-1-chrisc...@tuxfamily.org/

Jonathan Tan (5):
   environment, fsck: introduce lazyobject extension
   fsck: support refs pointing to lazy objects
   fsck: support referenced lazy objects
   fsck: support lazy objects as CLI argument
   sha1_file: support loading lazy objects

  Documentation/Makefile                             |   1 +
  Documentation/gitattributes.txt                    |  54 ++--------
  Documentation/gitrepository-layout.txt             |   3 +
  .../technical/long-running-process-protocol.txt    |  50 +++++++++
  Documentation/technical/repository-version.txt     |  23 +++++
  Makefile                                           |   1 +
  builtin/cat-file.c                                 |   2 +
  builtin/fsck.c                                     |  25 ++++-
  cache.h                                            |   4 +
  environment.c                                      |   1 +
  lazy-object.c                                      |  80 +++++++++++++++
  lazy-object.h                                      |  12 +++
  object.c                                           |   7 ++
  object.h                                           |  13 +++
  setup.c                                            |   7 +-
  sha1_file.c                                        |  44 +++++---
  t/t0410-lazy-object.sh                             | 113 +++++++++++++++++++++
  t/t0410/lazy-object                                | 102 +++++++++++++++++++
  18 files changed, 478 insertions(+), 64 deletions(-)
  create mode 100644 Documentation/technical/long-running-process-protocol.txt
  create mode 100644 lazy-object.c
  create mode 100644 lazy-object.h
  create mode 100755 t/t0410-lazy-object.sh
  create mode 100755 t/t0410/lazy-object
