Re: fetch-any-blob / ref-in-want proposal

2017-07-24 Thread Jonathan Tan
On Sun, 23 Jul 2017 09:41:50 +0300 Orgad Shaneh <org...@gmail.com> wrote: > Hi, > > Jonathan Tan proposed a design and a patch series for requesting a > specific ref on fetch 4 months ago[1]. > > Is there any progress with this? > > - Orgad > > [1] >

Re: [RFC PATCH v2 2/4] promised-object, fsck: introduce promised objects

2017-07-21 Thread Jonathan Tan
On Fri, 21 Jul 2017 12:24:52 -0400 Ben Peart wrote: > Today we have 3.5 million objects * 30 bytes per entry = 105 MB of > promises. Given the average developer only hydrates 56K files (2 MB > promises) that is 103 MB to download that no one will ever need. We > would like
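
The figures quoted above can be checked with a few lines of C. This is only a sketch of the arithmetic: `promise_list_bytes` is a hypothetical helper, not anything from the patch series, and the entry counts and the 30-byte entry size are taken from the quoted message.

```c
#include <stdint.h>

/*
 * Hypothetical helper: size in bytes of a promise list holding
 * `entries` entries of `bytes_per_entry` bytes each.
 */
uint64_t promise_list_bytes(uint64_t entries, uint64_t bytes_per_entry)
{
	return entries * bytes_per_entry;
}
```

With 3,500,000 entries at 30 bytes each this gives 105,000,000 bytes (105 MB); with 56,000 entries it gives 1,680,000 bytes, which the message rounds to 2 MB, leaving roughly 103 MB of promises that a typical developer would never use.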

Re: [RFC PATCH v2 4/4] sha1_file: support promised object hook

2017-07-20 Thread Jonathan Tan
On Thu, 20 Jul 2017 16:58:16 -0400 Ben Peart wrote: > >> This is meant as a temporary measure to ensure that all Git commands > >> work in such a situation. Future patches will update some commands to > >> either tolerate promised objects (without invoking the hook) or be

Re: [RFC PATCH v2 2/4] promised-object, fsck: introduce promised objects

2017-07-20 Thread Jonathan Tan
On Thu, 20 Jul 2017 15:58:51 -0400 Ben Peart <peart...@gmail.com> wrote: > On 7/19/2017 8:21 PM, Jonathan Tan wrote: > > Currently, Git does not support repos with very large numbers of objects > > or repos that wish to minimize manipulation of certain blobs (for

Re: [RFC PATCH v2 2/4] promised-object, fsck: introduce promised objects

2017-07-20 Thread Jonathan Tan
On Thu, 20 Jul 2017 11:07:29 -0700 Stefan Beller wrote: > > + if (fsck_promised_objects()) { > > + error("Errors found in promised object list"); > > + errors_found |= ERROR_PROMISED_OBJECT; > > + } > > This got me thinking: It is an

Re: [RFC PATCH v2 1/4] object: remove "used" field from struct object

2017-07-19 Thread Jonathan Tan
On Wed, 19 Jul 2017 17:36:39 -0700 Stefan Beller <sbel...@google.com> wrote: > On Wed, Jul 19, 2017 at 5:21 PM, Jonathan Tan <jonathanta...@google.com> > wrote: > > The "used" field in struct object is only used by builtin/fsck. Remove > > that field and m

[RFC PATCH v2 3/4] sha1-array: support appending unsigned char hash

2017-07-19 Thread Jonathan Tan
In a subsequent patch, sha1_file will need to append object names in the form of "unsigned char *" to oid arrays. Teach sha1-array support for that. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- sha1-array.c | 7 +++ sha1-array.h | 1 + 2 files changed, 8 ins

[RFC PATCH v2 4/4] sha1_file: support promised object hook

2017-07-19 Thread Jonathan Tan
ng support for the most commonly used commands, but is not tolerable now for repos that exclude a large amount of objects. Helped-by: Ben Peart <benpe...@microsoft.com> Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Documentation/config.txt | 8 + Docum

[RFC PATCH v2 2/4] promised-object, fsck: introduce promised objects

2017-07-19 Thread Jonathan Tan
" based on only the information available to promised objects, without requiring the object itself. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Documentation/technical/repository-version.txt | 6 ++ Makefile | 1 +

[RFC PATCH v2 1/4] object: remove "used" field from struct object

2017-07-19 Thread Jonathan Tan
The "used" field in struct object is only used by builtin/fsck. Remove that field and modify builtin/fsck to use a flag instead. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/fsck.c | 24 ++-- object.c | 1 - object.h | 2 +-

[RFC PATCH v2 0/4] Partial clone: promised objects (not only blobs)

2017-07-19 Thread Jonathan Tan
us to investigate multiple promise lists in the first place. [1] https://public-inbox.org/git/20170718222848.1453-1-jonathanta...@google.com/ Jonathan Tan (4): object: remove "used" field from struct object promised-object, fsck: introduce promised objects sha1-array: support

Re: [RFC PATCH 2/3] sha1-array: support appending unsigned char hash

2017-07-19 Thread Jonathan Tan
On Tue, 11 Jul 2017 15:06:11 -0700 Stefan Beller <sbel...@google.com> wrote: > On Tue, Jul 11, 2017 at 12:48 PM, Jonathan Tan <jonathanta...@google.com> > wrote: > > In a subsequent patch, sha1_file will need to append object names in the > > form of "unsigne

Re: [RFC PATCH 1/3] promised-blob, fsck: introduce promised blobs

2017-07-19 Thread Jonathan Tan
On Tue, 11 Jul 2017 15:02:09 -0700 Stefan Beller wrote: > Here I wondered what this file looks like, in a later patch you > add documentation: > > +objects/promisedblob:: > + This file records the sha1 object names and sizes of promised > + blobs. > + >

[PATCH] sha1_file: use access(), not lstat(), if possible

2017-07-19 Thread Jonathan Tan
In sha1_loose_object_info(), use access() (indirectly invoked through has_loose_object()) instead of lstat() if we do not need the on-disk size, as it should be faster on Windows [1]. [1] https://public-inbox.org/git/alpine.DEB.2.21.1.1707191450570.4193@virtualbox/ Signed-off-by: Jonathan Tan

[PATCH] fsck: remove redundant parse_tree() invocation

2017-07-18 Thread Jonathan Tan
way from object-refs to fsck_walk", 2008-02-25). The same issue existed in that commit. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Here's a code cleanup. I noticed this while looking at modifying fsck. --- builtin/fsck.c | 13 + 1 file changed, 1 insertion(+), 12

Re: [PATCH v5 8/8] sha1_file: refactor has_sha1_file_with_flags

2017-07-18 Thread Jonathan Tan
On Tue, 18 Jul 2017 12:30:46 +0200 Christian Couder <christian.cou...@gmail.com> wrote: > On Thu, Jun 22, 2017 at 2:40 AM, Jonathan Tan <jonathanta...@google.com> > wrote: > > > diff --git a/sha1_file.c b/sha1_file.c > > index bf6b64ec8..778f01d92 10064

Re: [PATCH v2 1/1] sha1_file: Add support for downloading blobs on demand

2017-07-17 Thread Jonathan Tan
On Mon, 17 Jul 2017 16:09:17 -0400 Ben Peart wrote: > > Is this change meant to ensure that Git code that operates on loose > > objects directly (bypassing storage-agnostic functions such as > > sha1_object_info_extended() and has_sha1_file()) still work? If yes, > > this

Re: [PATCH v2 1/1] sha1_file: Add support for downloading blobs on demand

2017-07-17 Thread Jonathan Tan
About the difference between this patch and my patch set [1], besides the fact that this patch does not spawn separate processes for each missing object, which does seem like an improvement to me, this patch (i) does not use a list of promised objects (but instead communicates with the hook for

Re: [RFC PATCH 1/3] promised-blob, fsck: introduce promised blobs

2017-07-13 Thread Jonathan Tan
On Wed, 12 Jul 2017 13:29:11 -0400 Jeff Hostetler wrote: > My primary concern is scale and managing the list of objects over time. > > My fear is that this list will be quite large. If we only want to omit > the very large blobs, then maybe not. But if we want to

Re: [PATCH 1/3] trailers: create struct trailer_opts

2017-07-12 Thread Jonathan Tan
On Wed, 12 Jul 2017 15:46:44 +0200 Paolo Bonzini wrote: > -static void print_all(FILE *outfile, struct list_head *head, int trim_empty) > +static void print_all(FILE *outfile, struct list_head *head, > + struct trailer_opts *opts) This can be "const struct

Re: [PATCH 2/3] trailers: export action enums and corresponding lookup functions

2017-07-12 Thread Jonathan Tan
On Wed, 12 Jul 2017 15:46:45 +0200 Paolo Bonzini wrote: > -static struct conf_info default_conf_info; > +static struct conf_info default_conf_info = { > + .where = WHERE_END, > + .if_exists = EXISTS_ADD_IF_DIFFERENT_NEIGHBOR, > + .if_missing = MISSING_ADD, > +}; I'm

Re: [PATCH 3/3] interpret-trailers: add options for actions

2017-07-12 Thread Jonathan Tan
On Wed, 12 Jul 2017 15:46:46 +0200 Paolo Bonzini wrote: > +static int option_parse_where(const struct option *opt, > + const char *arg, int unset) > +{ > + enum action_where *where = opt->value; > + > + if (unset) > + return 0; > + >

Re: [RFC/PATCH v4 00/49] Add initial experimental external ODB support

2017-07-12 Thread Jonathan Tan
On Tue, 20 Jun 2017 09:54:34 +0200 Christian Couder wrote: > Git can store its objects only in the form of loose objects in > separate files or packed objects in a pack file. > > To be able to better handle some kind of objects, for example big > blobs, it would be

[RFC PATCH 3/3] sha1_file: add promised blob hook support

2017-07-11 Thread Jonathan Tan
tolerable in the future if we have batching support for the most commonly used commands, but is not tolerable now for repos that exclude a large amount of blobs. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Documentation/config.txt | 8 Documentation/gitrepository

[RFC PATCH 2/3] sha1-array: support appending unsigned char hash

2017-07-11 Thread Jonathan Tan
In a subsequent patch, sha1_file will need to append object names in the form of "unsigned char *" to oid arrays. Teach sha1-array support for that. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- sha1-array.c | 7 +++ sha1-array.h | 1 + 2 files changed, 8 ins

[RFC PATCH 1/3] promised-blob, fsck: introduce promised blobs

2017-07-11 Thread Jonathan Tan
; functions for creating and modifying that file will be introduced in later patches. A repository that is missing a blob but has that blob promised is not considered to be in error, so also teach fsck this. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Makefile

[RFC PATCH 0/3] Partial clone: promised blobs (formerly "missing blobs")

2017-07-11 Thread Jonathan Tan
@google.com/ [2] Documentation/technical/repository-version.txt Jonathan Tan (3): promised-blob, fsck: introduce promised blobs sha1-array: support appending unsigned char hash sha1_file: add promised blob hook support Documentation/config.txt | 8 ++ Documentation/

Re: speeding up git pull from a busy gerrit instance over a slow link?

2017-06-30 Thread Jonathan Tan
On Fri, 30 Jun 2017 14:28:15 +0200 Noel Grandin wrote: > - > snippet of packet trace > --- > > 14:20:45.705091 pkt-line.c:80 packet:fetch< > c5b026801c729ab37e2af6a610f31ca2e28b51fe > refs/changes/99/29099/2 >

RFC: Missing blob hook might be invoked infinitely recursively

2017-06-29 Thread Jonathan Tan
As some of you may know, I'm currently working on support for partial clones/fetches in Git (where blobs above a user-specified size threshold are not downloaded - only their names and sizes are downloaded). To do this, the client repository needs to be able to download blobs at will whenever it

Re: [PATCH v4 6/8] sha1_file: improve sha1_object_info_extended

2017-06-26 Thread Jonathan Tan
On Mon, Jun 26, 2017 at 10:28 AM, Junio C Hamano <gits...@pobox.com> wrote: > Jonathan Tan <jonathanta...@google.com> writes: > >> On Sat, Jun 24, 2017 at 5:45 AM, Jeff King <p...@peff.net> wrote: >>> On Mon, Jun 19, 2017 at 06:03:13PM -0700, Jonathan Tan

Re: What's cooking in git.git (Jun 2017, #07; Sat, 24)

2017-06-26 Thread Jonathan Tan
On Sat, 24 Jun 2017 16:25:13 -0700 Junio C Hamano wrote: > * jt/unify-object-info (2017-06-21) 8 commits > - sha1_file: refactor has_sha1_file_with_flags > - sha1_file: do not access pack if unneeded > - sha1_file: improve sha1_object_info_extended > - sha1_file: refactor

Re: [PATCH v4 6/8] sha1_file: improve sha1_object_info_extended

2017-06-26 Thread Jonathan Tan
On Sat, Jun 24, 2017 at 5:45 AM, Jeff King <p...@peff.net> wrote: > On Mon, Jun 19, 2017 at 06:03:13PM -0700, Jonathan Tan wrote: > >> Subject: [PATCH v4 6/8] sha1_file: improve sha1_object_info_extended >> Improve sha1_object_info_extended() by supporting addition

Re: [PATCH v4 7/8] sha1_file: do not access pack if unneeded

2017-06-26 Thread Jonathan Tan
On Sat, Jun 24, 2017 at 1:39 PM, Jeff King wrote: > On Sat, Jun 24, 2017 at 11:41:39AM -0700, Junio C Hamano wrote: > If we are open to writing anything, then I think it should follow the > same pointer-to-data pattern that the rest of the struct does. I.e., > declare the extra

Re: [PATCH 1/3] list-objects: add filter_blob to traverse_commit_list

2017-06-22 Thread Jonathan Tan
On Thu, 22 Jun 2017 14:45:26 -0700 Jonathan Tan <jonathanta...@google.com> wrote: > On Thu, 22 Jun 2017 20:36:13 + > Jeff Hostetler <g...@jeffhostetler.com> wrote: > > > From: Jeff Hostetler <jeffh...@microsoft.com> > > > > In preparation for pa

Re: [PATCH 2/3] pack-objects: WIP add max-blob-size filtering

2017-06-22 Thread Jonathan Tan
On Thu, 22 Jun 2017 20:36:14 + Jeff Hostetler wrote: > +static signed long max_blob_size = -1; FYI Junio suggested "blob-max-bytes" when he looked at my patch [1]. [1] https://public-inbox.org/git/xmqqmv9ryoym@gitster.mtv.corp.google.com/ [snip] > +/* > + *

Re: [PATCH 1/3] list-objects: add filter_blob to traverse_commit_list

2017-06-22 Thread Jonathan Tan
On Thu, 22 Jun 2017 20:36:13 + Jeff Hostetler wrote: > From: Jeff Hostetler > > In preparation for partial/sparse clone/fetch where the > server is allowed to omit large/all blobs from the packfile, > teach traverse_commit_list() to take a

[PATCH v5 5/8] sha1_file: refactor read_object

2017-06-21 Thread Jonathan Tan
these mechanisms would be a great help to maintainability. Therefore, consolidate them by extending sha1_object_info_extended() to support the functionality needed, and then modifying read_object() to use sha1_object_info_extended(). Signed-off-by: Jonathan Tan <jonathanta...@google.

[PATCH v5 7/8] sha1_file: do not access pack if unneeded

2017-06-21 Thread Jonathan Tan
formance improvement. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- sha1_file.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/sha1_file.c b/sha1_file.c index b6bc02f09..bf6b64ec8 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2977,12 +2977,16 @@ static int sh

[PATCH v5 6/8] sha1_file: improve sha1_object_info_extended

2017-06-21 Thread Jonathan Tan
Improve sha1_object_info_extended() by supporting additional flags. This allows has_sha1_file_with_flags() to be modified to use sha1_object_info_extended() in a subsequent patch. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- cache.h | 4 sha1_file.

[PATCH v5 4/8] sha1_file: move delta base cache code up

2017-06-21 Thread Jonathan Tan
In a subsequent patch, packed_object_info() will be modified to use the delta base cache, so move the relevant code to before packed_object_info(). Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- sha1_file.c | 220 ++--

[PATCH v5 3/8] sha1_file: rename LOOKUP_REPLACE_OBJECT

2017-06-21 Thread Jonathan Tan
to parse_sha1_header_extended() since commit 46f0344 ("sha1_file: support reading from a loose object of unknown type", 2015-05-03), but that has had no effect since that commit. Therefore this patch also removes this flag from that invocation. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- bu

[PATCH v5 8/8] sha1_file: refactor has_sha1_file_with_flags

2017-06-21 Thread Jonathan Tan
has_sha1_file_with_flags() implements many mechanisms in common with sha1_object_info_extended(). Make has_sha1_file_with_flags() a convenience function for sha1_object_info_extended() instead. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/fetch.c

[PATCH v5 2/8] sha1_file: rename LOOKUP_UNKNOWN_OBJECT

2017-06-21 Thread Jonathan Tan
object_info_extended() supports both flags. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/cat-file.c | 2 +- cache.h| 3 ++- sha1_file.c| 4 ++-- 3 files changed, 5 insertions(+), 4 deletions(-) diff --git a/builtin/cat-file.c b/builtin/cat-file.c index 4

[PATCH v5 1/8] sha1_file: teach packed_object_info about typename

2017-06-21 Thread Jonathan Tan
p" which could only represent valid types. Some relatively complex manipulations were added to avoid breaking packed_object_info() without modifying it, but it is much easier to just teach packed_object_info() about the new field. Therefore, teach packed_object_info() as described above.

[PATCH v5 0/8] Improvements to sha1_file

2017-06-21 Thread Jonathan Tan
1_object_info_extended() compare "< 0". - patch 7 - Rewrote patch to make sha1_object_info_extended() accept NULL struct object_info pointer. - patch 8 - Made has_sha1_file_with_flags send NULL instead of blank struct object_info. Jonathan Tan (8): sha1_file: teach packe

Re: [PATCH v3 19/20] repository: enable initialization of submodules

2017-06-21 Thread Jonathan Tan
On Tue, 20 Jun 2017 12:19:50 -0700 Brandon Williams wrote: > Introduce 'repo_submodule_init()' which performs initialization of a > 'struct repository' as a submodule of another 'struct repository'. > > The resulting submodule can be in one of three states: > > 1. The

Re: [PATCH v3 15/20] repository: add index_state to struct repo

2017-06-21 Thread Jonathan Tan
On Tue, 20 Jun 2017 12:19:46 -0700 Brandon Williams wrote: > +int repo_read_index(struct repository *repo) > +{ > + if (!repo->index) > + repo->index = xcalloc(1, sizeof(struct index_state)); sizeof(*repo->index)? [snip] > + /* Repository's in-memory
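
The review comment above points at the common `sizeof(*ptr)` idiom, which keeps the allocation size tied to the pointer's type even if that type later changes. A minimal sketch, assuming stand-in struct definitions and using standard `calloc()` in place of Git's `xcalloc()`; `repo_ensure_index` is a hypothetical name:

```c
#include <stdlib.h>

/* Stand-in types for illustration only; not Git's real definitions. */
struct index_state { unsigned int cache_nr; };
struct repository { struct index_state *index; };

/*
 * Lazily allocate repo->index, sizing the allocation by the pointee
 * rather than by spelling out the type name.
 * Returns 0 on success, -1 if the allocation failed.
 */
int repo_ensure_index(struct repository *repo)
{
	if (!repo->index)
		repo->index = calloc(1, sizeof(*repo->index));
	return repo->index ? 0 : -1;
}
```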

Re: [PATCH v3 20/20] ls-files: use repository object

2017-06-21 Thread Jonathan Tan
On Tue, 20 Jun 2017 12:19:51 -0700 Brandon Williams wrote: > -static void show_ce_entry(const struct index_state *istate, > - const char *tag, const struct cache_entry *ce) > +static void show_ce(struct repository *repo, struct dir_struct *dir, > +

Re: [PATCH v4 2/8] sha1_file: rename LOOKUP_UNKNOWN_OBJECT

2017-06-21 Thread Jonathan Tan
On Wed, 21 Jun 2017 10:22:38 -0700 Junio C Hamano <gits...@pobox.com> wrote: > Jonathan Tan <jonathanta...@google.com> writes: > > > The LOOKUP_UNKNOWN_OBJECT flag was introduced in commit 46f0344 > > ("sha1_file: support reading from a loose object of unkn

Re: [PATCHv2] submodules: overhaul documentation

2017-06-20 Thread Jonathan Tan
Thanks, this looks like a good explanation. Some more nits, but overall I feel like I understand this and have learned something from it. On Tue, Jun 20, 2017 at 3:56 PM, Stefan Beller wrote: > +A submodule is another Git repository tracked inside a repository. > +The tracked

Re: [PATCH v3 10/20] path: convert do_git_path to take a 'struct repository'

2017-06-20 Thread Jonathan Tan
On Tue, 20 Jun 2017 12:19:41 -0700 Brandon Williams wrote: > +static void do_git_path(const struct repository *repo, > + const struct worktree *wt, struct strbuf *buf, > const char *fmt, va_list args) > { > int gitdir_len; > -

Re: [PATCH v3 05/20] environment: place key repository state in the_repository

2017-06-20 Thread Jonathan Tan
On Tue, 20 Jun 2017 12:19:36 -0700 Brandon Williams wrote: > Migrate 'git_dir', 'git_common_dir', 'git_object_dir', 'git_index_file', > 'git_graft_file', and 'namespace' to be stored in 'the_repository'. > > Signed-off-by: Brandon Williams > --- > cache.h

Re: [PATCH v3 04/20] repository: introduce the repository object

2017-06-20 Thread Jonathan Tan
On Tue, 20 Jun 2017 12:19:35 -0700 Brandon Williams wrote: > Introduce the repository object 'struct repository' which can be used to > hold all state pertaining to a git repository. > > Some of the benefits of object-ifying a repository are: > > 1. Make the code base more

Re: [PATCH 22/26] diff.c: color moved lines differently

2017-06-20 Thread Jonathan Tan
I just glanced through this file, because it seems similar to the versions I have previously reviewed. I'll skip patches 23 onwards in this round of review because (i) I would be happy if just patches 1-22 were included in the tree and (ii) those patches might end up changing anyway because of

Re: [PATCH 15/26] submodule.c: migrate diff output to use emit_diff_symbol

2017-06-20 Thread Jonathan Tan
On Mon, 19 Jun 2017 19:48:05 -0700 Stefan Beller wrote: > As the submodule process is no longer attached to the same stdout as > the superprojects process, we need to pass coloring explicitly. I found this confusing - what difference does the stdout make? If they were the

Re: [PATCH 11/26] diff.c: emit_diff_symbol learns DIFF_SYMBOL_FILEPAIR

2017-06-20 Thread Jonathan Tan
On Mon, 19 Jun 2017 19:48:01 -0700 Stefan Beller wrote: > @@ -676,6 +677,14 @@ static void emit_diff_symbol(struct diff_options *o, > enum diff_symbol s, > } > emit_line(o, context, reset, line, len); > break; > + case

Re: [RFC/PATCH] submodules: overhaul documentation

2017-06-20 Thread Jonathan Tan
On Wed, 7 Jun 2017 11:53:54 -0700 Stefan Beller wrote: [snip] > +DESCRIPTION > +--- > + > +A submodule is another Git repository tracked in a subdirectory of your > +repository. The tracked repository has its own history, which does not > +interfere with the history

[PATCH v4 8/8] sha1_file: refactor has_sha1_file_with_flags

2017-06-19 Thread Jonathan Tan
has_sha1_file_with_flags() implements many mechanisms in common with sha1_object_info_extended(). Make has_sha1_file_with_flags() a convenience function for sha1_object_info_extended() instead. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/fetch.c

[PATCH v4 6/8] sha1_file: improve sha1_object_info_extended

2017-06-19 Thread Jonathan Tan
Improve sha1_object_info_extended() by supporting additional flags. This allows has_sha1_file_with_flags() to be modified to use sha1_object_info_extended() in a subsequent patch. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- cache.h | 4 sha1_file.

[PATCH v4 7/8] sha1_file: do not access pack if unneeded

2017-06-19 Thread Jonathan Tan
will make use of this optimization. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- cache.h | 1 + sha1_file.c | 17 + streaming.c | 1 + 3 files changed, 15 insertions(+), 4 deletions(-) diff --git a/cache.h b/cache.h index 7cf2ca466..2e1cc3fe2 100644 --- a/c

[PATCH v4 4/8] sha1_file: move delta base cache code up

2017-06-19 Thread Jonathan Tan
In a subsequent patch, packed_object_info() will be modified to use the delta base cache, so move the relevant code to before packed_object_info(). Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- sha1_file.c | 220 ++--

[PATCH v4 5/8] sha1_file: refactor read_object

2017-06-19 Thread Jonathan Tan
these mechanisms would be a great help to maintainability. Therefore, consolidate them by extending sha1_object_info_extended() to support the functionality needed, and then modifying read_object() to use sha1_object_info_extended(). Signed-off-by: Jonathan Tan <jonathanta...@google.

[PATCH v4 0/8] Improvements to sha1_file

2017-06-19 Thread Jonathan Tan
oids accessing the pack in certain situations, but this optimization requires checking a lot of fields. Let me know what you think. Jonathan Tan (8): sha1_file: teach packed_object_info about typename sha1_file: rename LOOKUP_UNKNOWN_OBJECT sha1_file: rename LOOKUP_REPLACE_OBJECT sha

[PATCH v4 1/8] sha1_file: teach packed_object_info about typename

2017-06-19 Thread Jonathan Tan
p" which could only represent valid types. Some relatively complex manipulations were added to avoid breaking packed_object_info() without modifying it, but it is much easier to just teach packed_object_info() about the new field. Therefore, teach packed_object_info() as described above.

[PATCH v4 2/8] sha1_file: rename LOOKUP_UNKNOWN_OBJECT

2017-06-19 Thread Jonathan Tan
le flag that invokes this feature, and move it closer to the declaration of sha1_object_info_extended(). Also add documentation for this flag. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/cat-file.c | 2 +- cache.h| 3 ++- sha1_file.c| 4 ++-- 3 files c

[PATCH v4 3/8] sha1_file: rename LOOKUP_REPLACE_OBJECT

2017-06-19 Thread Jonathan Tan
to parse_sha1_header_extended() since commit 46f0344 ("sha1_file: support reading from a loose object of unknown type", 2015-05-03), but that has had no effect since that commit. Therefore this patch also removes this flag from that invocation. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- bu

Re: Behavior of 'git fetch' for commit hashes

2017-06-19 Thread Jonathan Tan
On Mon, 19 Jun 2017 10:49:36 -0700 Jonathan Tan <jonathanta...@google.com> wrote: > On Mon, 19 Jun 2017 12:09:28 + > <eero.aalto...@vaisala.com> wrote: > > > For version 2.13.3 Firstly, exactly which version of Git doesn't work? I'm assuming 2.13.1 (as writt

Re: Behavior of 'git fetch' for commit hashes

2017-06-19 Thread Jonathan Tan
On Mon, 19 Jun 2017 12:09:28 + wrote: > For version 2.7.4 > = > Git exits with exit code 1. > > However, if I first do 'git fetch ', then 'git fetch will > also work > > * branch-> FETCH_HEAD I suspect that what is happening is that 'git

Re: [WIP v2 2/2] pack-objects: support --blob-max-bytes

2017-06-15 Thread Jonathan Tan
On Thu, 15 Jun 2017 16:28:24 -0400 Jeff Hostetler wrote: > I agree with Peff here. I've been working on my partial/narrow/sparse > clone/fetch ideas since my original RFC and have come to the conclusion > that the server can do the size limiting efficiently, but we

[PATCH v3 2/4] sha1_file: move delta base cache code up

2017-06-15 Thread Jonathan Tan
In a subsequent patch, packed_object_info() will be modified to use the delta base cache, so move the relevant code to before packed_object_info(). Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- sha1_file.c | 220 ++--

[PATCH v3 0/4] Improvements to sha1_file

2017-06-15 Thread Jonathan Tan
on patches 1-3 to go into the tree. (Patch 4 is a work in progress, and is here just to demonstrate the effectiveness of the refactoring.) Jonathan Tan (4): sha1_file: teach packed_object_info about typename sha1_file: move delta base cache code up sha1_file: consolidate storage-agnostic object fns

[PATCH v3 1/4] sha1_file: teach packed_object_info about typename

2017-06-15 Thread Jonathan Tan
p" which could only represent valid types. Some relatively complex manipulations were added to avoid breaking packed_object_info() without modifying it, but it is much easier to just teach packed_object_info() about the new field. Therefore, teach packed_object_info() as described above.

[PATCH v3 4/4] sha1_file, fsck: add missing blob support

2017-06-15 Thread Jonathan Tan
sing blobs (for example, repos that only want to exclude large blobs), and might be tolerable in the future if we have batching support for the most commonly used commands, but is not tolerable now for repos that exclude a large amount of blobs. Signed-off-by: Jonathan Tan <jonathanta.

[PATCH v3 3/4] sha1_file: consolidate storage-agnostic object fns

2017-06-15 Thread Jonathan Tan
oose files found). Therefore, consolidate only the other 2 functions by extending sha1_object_info_extended() to support the functionality needed, and then modifying read_object() to use sha1_object_info_extended(). Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- cache.h | 1 +

Re: [PATCH v2 4/4] sha1_file, fsck: add missing blob support

2017-06-15 Thread Jonathan Tan
A reroll is coming soon, but there is an interesting discussion point here so I'll reply to this e-mail first. On Thu, 15 Jun 2017 11:34:45 -0700 Junio C Hamano <gits...@pobox.com> wrote: > Jonathan Tan <jonathanta...@google.com> writes: > > > +struct missing_blob_m

Re: [PATCH v2 3/4] sha1_file: consolidate storage-agnostic object fns

2017-06-15 Thread Jonathan Tan
On Thu, 15 Jun 2017 10:50:46 -0700 Junio C Hamano <gits...@pobox.com> wrote: > Jonathan Tan <jonathanta...@google.com> writes: > > > Looking at the 3 primary functions (sha1_object_info_extended, > > read_object, has_sha1_file_with_flags), they independ

Re: [PATCHv5 04/17] diff: introduce more flexible emit function

2017-06-13 Thread Jonathan Tan
On Tue, 13 Jun 2017 16:41:57 -0700 Stefan Beller <sbel...@google.com> wrote: > On Tue, Jun 13, 2017 at 2:54 PM, Jonathan Tan <jonathanta...@google.com> > wrote: > > - could this be called emit() instead? > > Despite having good IDEs available some (includin

Re: [RFC/PATCH] builtin/blame: darken redundant line information

2017-06-13 Thread Jonathan Tan
On Mon, 12 Jun 2017 19:31:51 -0700 Stefan Beller wrote: > When using git-blame lots of lines contain redundant information, for > example in hunks that consist of multiple lines, the metadata (commit name, > author, timezone) are repeated. A reader may not be interested in

Re: [PATCH] diff.c: color moved lines differently

2017-06-13 Thread Jonathan Tan
On Wed, 31 May 2017 17:24:29 -0700 Stefan Beller wrote: > When a patch consists mostly of moving blocks of code around, it can > be quite tedious to ensure that the blocks are moved verbatim, and not > undesirably modified in the move. To that end, color blocks that are >

Re: [PATCHv5 16/17] diff: buffer all output if asked to

2017-06-13 Thread Jonathan Tan
On Wed, 24 May 2017 14:40:35 -0700 Stefan Beller wrote: > diff --git a/diff.h b/diff.h > index 85948ed65a..fad1258556 100644 > --- a/diff.h > +++ b/diff.h > @@ -115,6 +115,42 @@ enum diff_submodule_format { > DIFF_SUBMODULE_INLINE_DIFF > }; > > +/* > + * This struct

Re: [PATCHv5 04/17] diff: introduce more flexible emit function

2017-06-13 Thread Jonathan Tan
On Wed, 24 May 2017 14:40:23 -0700 Stefan Beller wrote: > Currently, diff output is written either through the emit_line_0 > function or through the FILE * in struct diff_options directly. To > make it easier to teach diff to buffer its output (which will be done > in a

[PATCH v2 4/4] sha1_file, fsck: add missing blob support

2017-06-13 Thread Jonathan Tan
that only want to exclude large blobs), and might be tolerable in the future if we have batching support for the most commonly used commands, but is not tolerable now for repos that exclude a large amount of blobs. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Document

[PATCH v2 0/4] Improvements to sha1_file

2017-06-13 Thread Jonathan Tan
to sha1_object_info_extended() instead of the now gone get_object(). As before, I would like review on patches 1-3 to go into the tree. (Patch 4 is a work in progress, and is here just to demonstrate the effectiveness of the refactoring.) Jonathan Tan (4): sha1_file: teach packed_object_info about typename

[PATCH v2 3/4] sha1_file: consolidate storage-agnostic object fns

2017-06-13 Thread Jonathan Tan
that has_sha1_file_with_flags() does not try cached storage, whereas the other 2 functions do - this functionality is preserved. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- cache.h | 7 +++ sha1_file.c | 143 +++- 2 files chang

[PATCH v2 1/4] sha1_file: teach packed_object_info about typename

2017-06-13 Thread Jonathan Tan
p" which could only represent valid types. Some relatively complex manipulations were added to avoid breaking packed_object_info() without modifying it, but it is much easier to just teach packed_object_info() about the new field. Therefore, teach packed_object_info() as described above.

[PATCH v2 2/4] sha1_file: move delta base cache code up

2017-06-13 Thread Jonathan Tan
In a subsequent patch, packed_object_info() will be modified to use the delta base cache, so move the relevant code to before packed_object_info(). Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- sha1_file.c | 226 +++-

Re: [PATCH v2 00/32] repository object

2017-06-12 Thread Jonathan Tan
On Mon, 12 Jun 2017 12:11:21 -0700 Brandon Williams <bmw...@google.com> wrote: > On 06/12, Jonathan Tan wrote: > > On Sat, 10 Jun 2017 02:07:12 -0400 > > Jeff King <p...@peff.net> wrote: > > > > > I do agree that "pass just what the su

Re: [RFC PATCH 2/4] sha1_file: extract type and size from object_info

2017-06-12 Thread Jonathan Tan
On Sat, 10 Jun 2017 03:01:33 -0400 Jeff King <p...@peff.net> wrote: > On Fri, Jun 09, 2017 at 12:23:24PM -0700, Jonathan Tan wrote: > > > Looking at the 3 primary functions (sha1_object_info_extended, > > read_object, has_sha1_file_with_flags), they independently

Re: [PATCH v2 00/32] repository object

2017-06-12 Thread Jonathan Tan
On Sat, 10 Jun 2017 17:43:29 -0700 Brandon Williams wrote: > I disagree with a few points of what jonathan said (mostly about > removing the config from the repo object, as I like the idea of nothing > knowing about a 'config_set' object) and I think this problem could be >

Re: [PATCH v2 00/32] repository object

2017-06-12 Thread Jonathan Tan
On Sat, 10 Jun 2017 02:07:12 -0400 Jeff King wrote: > I do agree that "pass just what the sub-function needs" is a good rule > of thumb. But the reason that these are globals in the first place is > that there are a ton of them, and they are used at the lowest levels of > call

Re: [PATCH v2 00/32] repository object

2017-06-09 Thread Jonathan Tan
On Thu, 8 Jun 2017 16:40:28 -0700 Brandon Williams wrote: > When I sent out my RFC series there seemed to be a lot of interest but I > haven't seen many people jump to review this series. Despite lack of review I > wanted to get out another version which includes some

[RFC PATCH 2/4] sha1_file: extract type and size from object_info

2017-06-09 Thread Jonathan Tan
ruct object_info", making them additional parameters in sha1_object_info_extended (and related functions) instead. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/cat-file.c | 29 +++- builtin/pack-objects.c | 5 ++-- cache.h|

[RFC PATCH 3/4] sha1_file: consolidate storage-agnostic object fns

2017-06-09 Thread Jonathan Tan
the other 2 functions do - this functionality is preserved. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- sha1_file.c | 294 ++-- 1 file changed, 165 insertions(+), 129 deletions(-) diff --git a/sha1_file.c b/sha1_file.c

[RFC PATCH 4/4] sha1_file, fsck: add missing blob support

2017-06-09 Thread Jonathan Tan
to exclude large blobs), and might be tolerable in the future if we have batching support for the most commonly used commands, but is not tolerable now for repos that exclude a large amount of blobs. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Documentation/

[RFC PATCH 0/4] Improvements to sha1_file

2017-06-09 Thread Jonathan Tan
in patches 2-3. I am hoping for reviews on patches 1-3 to be included into the tree. [1] https://public-inbox.org/git/20170426221346.25337-1-jonathanta...@google.com/ Jonathan Tan (4): sha1_file: teach packed_object_info about typename sha1_file: extract type and size from object_info sha1_file

[RFC PATCH 1/4] sha1_file: teach packed_object_info about typename

2017-06-09 Thread Jonathan Tan
p" which could only represent valid types. Some relatively complex manipulations were added to avoid breaking packed_object_info() without modifying it, but it is much easier to just teach packed_object_info() about the new field. Therefore, teach packed_object_info() as described above.

Re: [WIP v2 0/2] Modifying pack objects to support --blob-max-bytes

2017-06-05 Thread Jonathan Tan
Thanks for your comments. On Fri, 2 Jun 2017 18:16:45 -0400 Jeff King <p...@peff.net> wrote: > On Fri, Jun 02, 2017 at 12:38:43PM -0700, Jonathan Tan wrote: > > > > Do we need to future-proof the output format so that we can later > > > use 32-byte hash? The i

[WIP v2 2/2] pack-objects: support --blob-max-bytes

2017-06-02 Thread Jonathan Tan
try_from_bitmap(), has not been modified - thus packing in the presence of bitmaps still packs all blobs regardless of size. See the documentation update in this commit for the rationale. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Documentation/git-pack-objects.txt | 19 +++

[WIP v2 1/2] pack-objects: rename want_.* to ignore_.*

2017-06-02 Thread Jonathan Tan
ranteed to be in to_pack. [1] For the purposes of pack_objects, a blob is a Git special file if it appears in a to-be-packed tree with a filename beginning with ".git". Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/pack-objects.c | 56 +

[WIP v2 0/2] Modifying pack objects to support --blob-max-bytes

2017-06-02 Thread Jonathan Tan
it to the hashmap. And then the other name that begins > with ".git" is later discovered to point at the same blob, what > happens? Would we need to unregister it from the hashmap elsewhere > in the code? [snip] > Ah, is this where the "unregistering" happens? Yes.
