Re: [PATCH 3/3] archive: allow archive over HTTP(S) with proto v2
Hi,

Josh Steadmon wrote:

> Subject: archive: allow archive over HTTP(S) with proto v2

It's interesting how little this has to touch the client.

> Signed-off-by: Josh Steadmon
> ---
>  builtin/archive.c  | 8 +++-
>  http-backend.c     | 10 +-
>  transport-helper.c | 5 +++--
>  3 files changed, 19 insertions(+), 4 deletions(-)
[...]
> --- a/builtin/archive.c
> +++ b/builtin/archive.c
> @@ -87,7 +87,13 @@ static int run_remote_archiver(int argc, const char **argv,
>  		status = packet_reader_read(&reader);
>  		if (status != PACKET_READ_FLUSH)
>  			die(_("git archive: expected a flush"));
> -	}
> +	} else if (version == protocol_v2 &&
> +		   starts_with(transport->url, "http"))

As Stefan noticed, this starts_with test seems a bit too loose. For
example, what happens if I try an scp-style SSH URL like
http.example.com:path/to/repo, a local path like http/foo/bar, or a
custom protocol like httplikebutbetter://path/to/repo (honest question:
I haven't tried)?

> +		/*
> +		 * Commands over HTTP require two requests, so there's an
> +		 * additional server response to parse.
> +		 */
> +		discover_version(&reader);

Can this be made consistent with the non-http case?

The original capabilities (/info/refs) response told us what protocol
version the server wants to use, which means that a hypothetical
protocol v3 could use a completely different request format for the
followup commands: so could the server omit the protocol version in the
v2 /git-upload-archive response?

Alternatively, if we want to include the protocol version again, could
we do that in stateful protocols as well?

Related question: what should happen if the two responses declare
different protocol versions? Should we diagnose that as a protocol
error?

[...]
> --- a/http-backend.c
> +++ b/http-backend.c
> @@ -32,6 +32,7 @@ struct rpc_service {
>  static struct rpc_service rpc_service[] = {
>  	{ "upload-pack", "uploadpack", 1, 1 },
>  	{ "receive-pack", "receivepack", 0, -1 },
> +	{ "upload-archive", "uploadarchive", 1, 1 },

shell.c orders these in almost-alphabetical order (receive-pack,
upload-pack, upload-archive). I guess they should both use actual
alphabetical order? (If you agree, then please feel free to do that in
a separate patch.)

[...]
> @@ -637,6 +638,12 @@ static void service_rpc(struct strbuf *hdr, char *service_name)
>  	struct rpc_service *svc = select_service(hdr, service_name);
>  	struct strbuf buf = STRBUF_INIT;
>
> +	if (!strcmp(service_name, "git-upload-archive")) {
> +		/* git-upload-archive doesn't need --stateless-rpc */

This comment doesn't seem actionable. Can it say why? E.g. "[...]
because an upload-archive command always involves a single round-trip".
Or alternatively, I think it's fine to omit the comment.

> +		argv[1] = ".";
> +		argv[2] = NULL;
> +	}
[...]
> @@ -713,7 +720,8 @@ static struct service_cmd {
>  	{"GET", "/objects/pack/pack-[0-9a-f]{40}\\.idx$", get_idx_file},
>
>  	{"POST", "/git-upload-pack$", service_rpc},
> -	{"POST", "/git-receive-pack$", service_rpc}
> +	{"POST", "/git-receive-pack$", service_rpc},
> +	{"POST", "/git-upload-archive$", service_rpc},

Same comment about services seeming to be in a randomish order.

[...]
> --- a/transport-helper.c
> +++ b/transport-helper.c
> @@ -605,7 +605,8 @@ static int process_connect_service(struct transport *transport,
>  		ret = run_connect(transport, &cmdbuf);
>  	} else if (data->stateless_connect &&
>  		   (get_protocol_version_config() == protocol_v2) &&

(not about this patch) These parens don't help --- they make it harder
for me to read, especially with the new parens to try to match them up
with.
> -		   !strcmp("git-upload-pack", name)) {
> +		   (!strcmp("git-upload-pack", name) ||
> +		    !strcmp("git-upload-archive", name))) {

A part of me wonders about the wasted cycles comparing to "git-upload-"
twice, but (1) it is tiny relative to actually serving the request and
(2) if we're lucky, the compiler (or a compiler of the future) inlines
the strcmp call and could optimize it out.

[...]
> @@ -639,7 +640,7 @@ static int connect_helper(struct transport *transport, const char *name,
>
>  	/* Get_helper so connect is inited. */
>  	get_helper(transport);
> -	if (!data->connect)
> +	if (!data->connect && !data->stateless_connect)
>  		die(_("operation not supported by protocol"));

I don't understand this part. Can you explain it further (possibly by
putting it in its own patch)?

Thanks for a pleasant read,
Jonathan
Re: [PATCH 2/3] archive: implement protocol v2 archive command
Hi,

Josh Steadmon wrote:

> This adds a new archive command for protocol v2. The command expects
> arguments in the form "argument X" which are passed unmodified to
> git-upload-archive--writer.
>
> This command works over the file://, Git, and SSH transports. HTTP
> support will be added in a separate patch.
>
> Signed-off-by: Josh Steadmon
> ---
>  builtin/archive.c        | 45 +++-
>  builtin/upload-archive.c | 44 ---
>  t/t5000-tar-tree.sh      |  5 +
>  3 files changed, 77 insertions(+), 17 deletions(-)

I like the diffstat. :)

Can this include some docs in Documentation/technical/ as well, to help
other implementors to understand the protocol so they can interoperate?

Thanks,
Jonathan
Re: Add proto v2 archive command with HTTP support
Hi,

Josh Steadmon wrote:

> This series adds a new protocol v2 command for archiving, and allows
> this command to work over HTTP(S). This was previously discussed in [1].
> I've CCed everyone who participated in that discussion.

Yay! Getting ready to read it now.

For the future, "git format-patch --cover-letter" does a few things that
can be nice for this kind of opening message:

 - it lists the patches in the series, and a diffstat
 - it puts [PATCH 0/3] in the subject line so people know what to expect

Thanks,
Jonathan

> [1]: https://public-inbox.org/git/CANq=j3tk7qebjoc7vnwkh4+wbnibmjjp5yukd9te5naywuk...@mail.gmail.com/
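The cover-letter workflow described above can be tried in a scratch repository; nothing here is specific to this series:

```shell
# Demo: --cover-letter generates a [PATCH 0/N] message containing the
# shortlog and diffstat of the whole series, ready to edit.
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q repo && cd repo
git config user.email you@example.com
git config user.name "You"
for i in 1 2 3; do
    echo "change $i" >file$i.txt
    git add file$i.txt
    git commit -qm "commit $i"
done
# Format the last three commits, plus a cover letter, into outgoing/.
git format-patch --cover-letter -3 -o outgoing/
# Creates outgoing/0000-cover-letter.patch plus one file per commit.
ls outgoing/
```

The generated 0000-cover-letter.patch has "[PATCH 0/3] *** SUBJECT HERE ***" as its subject; the sender fills in the subject and blurb before mailing.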
[PATCH v8 7/7] list-objects-filter: implement filter tree:0
Teach list-objects the "tree:0" filter which allows for filtering
out all tree and blob objects (unless other objects are explicitly
specified by the user). The purpose of this patch is to allow smaller
partial clones.

The name of this filter - tree:0 - does not explicitly specify that
it also filters out all blobs, but this should not cause much confusion
because blobs are not at all useful without the trees that refer to
them.

I also considered only:commits as a name, but this is inaccurate because
it suggests that annotated tags are omitted, but actually they are
included.

The name "tree:0" allows later filtering based on depth, i.e. "tree:1"
would filter out all but the root tree and blobs. In order to avoid
confusion between 0 and capital O, the documentation was worded in a
somewhat round-about way that also hints at this future improvement to
the feature.

Signed-off-by: Matthew DeVore
---
 Documentation/rev-list-options.txt     |  5 +++
 list-objects-filter-options.c          | 14 ++++
 list-objects-filter-options.h          |  1 +
 list-objects-filter.c                  | 49 ++++++
 t/t5317-pack-objects-filter-objects.sh | 28 +++
 t/t5616-partial-clone.sh               | 38 ++++
 t/t6112-rev-list-filters-objects.sh    | 12 +++
 7 files changed, 147 insertions(+)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 7b273635d..5f1672913 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -731,6 +731,11 @@ the requested refs.
 +
 The form '--filter=sparse:path=<path>' similarly uses a sparse-checkout
 specification contained in <path>.
++
+The form '--filter=tree:<depth>' omits all blobs and trees whose depth
+from the root tree is >= <depth> (minimum depth if an object is located
+at multiple depths in the commits traversed). Currently, only <depth>=0
+is supported, which omits all blobs and trees.
 
 --no-filter::
 	Turn off any previous `--filter=` argument.
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index c0e2bd6a0..14f251de4 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -50,6 +50,20 @@ static int gently_parse_list_objects_filter(
 			return 0;
 		}
 
+	} else if (skip_prefix(arg, "tree:", &v0)) {
+		unsigned long depth;
+		if (!git_parse_ulong(v0, &depth) || depth != 0) {
+			if (errbuf) {
+				strbuf_init(errbuf, 0);
+				strbuf_addstr(
+					errbuf,
+					_("only 'tree:0' is supported"));
+			}
+			return 1;
+		}
+		filter_options->choice = LOFC_TREE_NONE;
+		return 0;
+
 	} else if (skip_prefix(arg, "sparse:oid=", &v0)) {
 		struct object_context oc;
 		struct object_id sparse_oid;

diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h
index a61f8..af64e5c66 100644
--- a/list-objects-filter-options.h
+++ b/list-objects-filter-options.h
@@ -10,6 +10,7 @@ enum list_objects_filter_choice {
 	LOFC_DISABLED = 0,
 	LOFC_BLOB_NONE,
 	LOFC_BLOB_LIMIT,
+	LOFC_TREE_NONE,
 	LOFC_SPARSE_OID,
 	LOFC_SPARSE_PATH,
 	LOFC__COUNT /* must be last */

diff --git a/list-objects-filter.c b/list-objects-filter.c
index 5f8b1a002..09b2b05d5 100644
--- a/list-objects-filter.c
+++ b/list-objects-filter.c
@@ -79,6 +79,54 @@ static void *filter_blobs_none__init(
 	return d;
 }
 
+/*
+ * A filter for list-objects to omit ALL trees and blobs from the traversal.
+ * Can OPTIONALLY collect a list of the omitted OIDs.
+ */
+struct filter_trees_none_data {
+	struct oidset *omits;
+};
+
+static enum list_objects_filter_result filter_trees_none(
+	enum list_objects_filter_situation filter_situation,
+	struct object *obj,
+	const char *pathname,
+	const char *filename,
+	void *filter_data_)
+{
+	struct filter_trees_none_data *filter_data = filter_data_;
+
+	switch (filter_situation) {
+	default:
+		BUG("unknown filter_situation: %d", filter_situation);
+
+	case LOFS_BEGIN_TREE:
+	case LOFS_BLOB:
+		if (filter_data->omits)
+			oidset_insert(filter_data->omits, &obj->oid);
+		return LOFR_MARK_SEEN; /* but not LOFR_DO_SHOW (hard omit) */
+
+	case LOFS_END_TREE:
+		assert(obj->type == OBJ_TREE);
+		return LOFR_ZERO;
+
+	}
+}
+
+static void* filter_trees_none__init(
+	struct oidset *omitted,
+	struct list_objects_filter_options *filter_options,
+	filter_object_fn *filter_fn,
+	filter_free_fn *filter_free_fn)
+{
+	struct filter_trees_none_data *d = xcalloc(1, sizeof(*d));
+	d->omits = omitted;
+
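Independent of the implementation details above, the effect of the new filter is easy to check from the command line. This sketch assumes a git build that includes this series (the tree:0 filter later shipped in mainline git):

```shell
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q .
git config user.email you@example.com
git config user.name "You"
echo hello >greeting.txt
git add greeting.txt
git commit -qm "initial commit"
# Unfiltered walk lists three objects: commit, root tree, blob.
git rev-list --objects HEAD | wc -l
# tree:0 omits every tree and blob reached through a commit,
# leaving only the commit itself.
git rev-list --objects --filter=tree:0 HEAD | wc -l
```

The first count is 3 and the second is 1, matching the commit message: all blobs and trees are omitted, while commits (and annotated tags, if any) are kept.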
[PATCH v8 1/7] list-objects: store common func args in struct
This will make utility functions easier to create, as done by the next patch. Signed-off-by: Matthew DeVore --- list-objects.c | 158 +++-- 1 file changed, 74 insertions(+), 84 deletions(-) diff --git a/list-objects.c b/list-objects.c index c99c47ac1..584518a3f 100644 --- a/list-objects.c +++ b/list-objects.c @@ -12,20 +12,25 @@ #include "packfile.h" #include "object-store.h" -static void process_blob(struct rev_info *revs, +struct traversal_context { + struct rev_info *revs; + show_object_fn show_object; + show_commit_fn show_commit; + void *show_data; + filter_object_fn filter_fn; + void *filter_data; +}; + +static void process_blob(struct traversal_context *ctx, struct blob *blob, -show_object_fn show, struct strbuf *path, -const char *name, -void *cb_data, -filter_object_fn filter_fn, -void *filter_data) +const char *name) { struct object *obj = >object; size_t pathlen; enum list_objects_filter_result r = LOFR_MARK_SEEN | LOFR_DO_SHOW; - if (!revs->blob_objects) + if (!ctx->revs->blob_objects) return; if (!obj) die("bad blob object"); @@ -41,21 +46,21 @@ static void process_blob(struct rev_info *revs, * may cause the actual filter to report an incomplete list * of missing objects. */ - if (revs->exclude_promisor_objects && + if (ctx->revs->exclude_promisor_objects && !has_object_file(>oid) && is_promisor_object(>oid)) return; pathlen = path->len; strbuf_addstr(path, name); - if (!(obj->flags & USER_GIVEN) && filter_fn) - r = filter_fn(LOFS_BLOB, obj, - path->buf, >buf[pathlen], - filter_data); + if (!(obj->flags & USER_GIVEN) && ctx->filter_fn) + r = ctx->filter_fn(LOFS_BLOB, obj, + path->buf, >buf[pathlen], + ctx->filter_data); if (r & LOFR_MARK_SEEN) obj->flags |= SEEN; if (r & LOFR_DO_SHOW) - show(obj, path->buf, cb_data); + ctx->show_object(obj, path->buf, ctx->show_data); strbuf_setlen(path, pathlen); } @@ -81,26 +86,21 @@ static void process_blob(struct rev_info *revs, * the link, and how to do it. 
Whether it necessarily makes * any sense what-so-ever to ever do that is another issue. */ -static void process_gitlink(struct rev_info *revs, +static void process_gitlink(struct traversal_context *ctx, const unsigned char *sha1, - show_object_fn show, struct strbuf *path, - const char *name, - void *cb_data) + const char *name) { /* Nothing to do */ } -static void process_tree(struct rev_info *revs, +static void process_tree(struct traversal_context *ctx, struct tree *tree, -show_object_fn show, struct strbuf *base, -const char *name, -void *cb_data, -filter_object_fn filter_fn, -void *filter_data) +const char *name) { struct object *obj = >object; + struct rev_info *revs = ctx->revs; struct tree_desc desc; struct name_entry entry; enum interesting match = revs->diffopt.pathspec.nr == 0 ? @@ -133,14 +133,14 @@ static void process_tree(struct rev_info *revs, } strbuf_addstr(base, name); - if (!(obj->flags & USER_GIVEN) && filter_fn) - r = filter_fn(LOFS_BEGIN_TREE, obj, - base->buf, >buf[baselen], - filter_data); + if (!(obj->flags & USER_GIVEN) && ctx->filter_fn) + r = ctx->filter_fn(LOFS_BEGIN_TREE, obj, + base->buf, >buf[baselen], + ctx->filter_data); if (r & LOFR_MARK_SEEN) obj->flags |= SEEN; if (r & LOFR_DO_SHOW) - show(obj, base->buf, cb_data); + ctx->show_object(obj, base->buf, ctx->show_data); if (base->len) strbuf_addch(base, '/'); @@ -157,29 +157,25 @@ static void process_tree(struct rev_info *revs, } if (S_ISDIR(entry.mode)) - process_tree(revs, + process_tree(ctx, lookup_tree(the_repository, entry.oid), -
[PATCH v8 3/7] list-objects: always parse trees gently
If parsing fails when revs->ignore_missing_links and
revs->exclude_promisor_objects are both false, we print the OID anyway
in the die("bad tree object...") call, so any message printed by
parse_tree_gently() is superfluous.

Signed-off-by: Matthew DeVore
---
 list-objects.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/list-objects.c b/list-objects.c
index ccc529e5e..f9b51db7a 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -143,8 +143,6 @@ static void process_tree(struct traversal_context *ctx,
 	struct rev_info *revs = ctx->revs;
 	int baselen = base->len;
 	enum list_objects_filter_result r = LOFR_MARK_SEEN | LOFR_DO_SHOW;
-	int gently = revs->ignore_missing_links ||
-		     revs->exclude_promisor_objects;
 
 	if (!revs->tree_objects)
 		return;
@@ -152,7 +150,7 @@ static void process_tree(struct traversal_context *ctx,
 		die("bad tree object");
 	if (obj->flags & (UNINTERESTING | SEEN))
 		return;
-	if (parse_tree_gently(tree, gently) < 0) {
+	if (parse_tree_gently(tree, 1) < 0) {
 		if (revs->ignore_missing_links)
 			return;
-- 
2.19.0.397.gdd90340f6a-goog
[PATCH v8 5/7] revision: mark non-user-given objects instead
Currently, list-objects.c incorrectly treats all root trees of commits as USER_GIVEN. Also, it would be easier to mark objects that are non-user-given instead of user-given, since the places in the code where we access an object through a reference are more obvious than the places where we access an object that was given by the user. Resolve these two problems by introducing a flag NOT_USER_GIVEN that marks blobs and trees that are non-user-given, replacing USER_GIVEN. (Only blobs and trees are marked because this mark is only used when filtering objects, and filtering of other types of objects is not supported yet.) This fixes a bug in that git rev-list behaved differently from git pack-objects. pack-objects would *not* filter objects given explicitly on the command line and rev-list would filter. This was because the two commands used a different function to add objects to the rev_info struct. This seems to have been an oversight, and pack-objects has the correct behavior, so I added a test to make sure that rev-list now behaves properly. 
Signed-off-by: Matthew DeVore fixup of 6defd70de --- list-objects.c | 31 + revision.c | 1 - revision.h | 11 -- t/t6112-rev-list-filters-objects.sh | 10 ++ 4 files changed, 37 insertions(+), 16 deletions(-) diff --git a/list-objects.c b/list-objects.c index 243192af5..7a1a0929d 100644 --- a/list-objects.c +++ b/list-objects.c @@ -53,7 +53,7 @@ static void process_blob(struct traversal_context *ctx, pathlen = path->len; strbuf_addstr(path, name); - if (!(obj->flags & USER_GIVEN) && ctx->filter_fn) + if ((obj->flags & NOT_USER_GIVEN) && ctx->filter_fn) r = ctx->filter_fn(LOFS_BLOB, obj, path->buf, >buf[pathlen], ctx->filter_data); @@ -120,17 +120,19 @@ static void process_tree_contents(struct traversal_context *ctx, continue; } - if (S_ISDIR(entry.mode)) - process_tree(ctx, -lookup_tree(the_repository, entry.oid), -base, entry.path); + if (S_ISDIR(entry.mode)) { + struct tree *t = lookup_tree(the_repository, entry.oid); + t->object.flags |= NOT_USER_GIVEN; + process_tree(ctx, t, base, entry.path); + } else if (S_ISGITLINK(entry.mode)) process_gitlink(ctx, entry.oid->hash, base, entry.path); - else - process_blob(ctx, -lookup_blob(the_repository, entry.oid), -base, entry.path); + else { + struct blob *b = lookup_blob(the_repository, entry.oid); + b->object.flags |= NOT_USER_GIVEN; + process_blob(ctx, b, base, entry.path); + } } } @@ -171,7 +173,7 @@ static void process_tree(struct traversal_context *ctx, } strbuf_addstr(base, name); - if (!(obj->flags & USER_GIVEN) && ctx->filter_fn) + if ((obj->flags & NOT_USER_GIVEN) && ctx->filter_fn) r = ctx->filter_fn(LOFS_BEGIN_TREE, obj, base->buf, >buf[baselen], ctx->filter_data); @@ -185,7 +187,7 @@ static void process_tree(struct traversal_context *ctx, if (!failed_parse) process_tree_contents(ctx, tree, base); - if (!(obj->flags & USER_GIVEN) && ctx->filter_fn) { + if ((obj->flags & NOT_USER_GIVEN) && ctx->filter_fn) { r = ctx->filter_fn(LOFS_END_TREE, obj, base->buf, >buf[baselen], ctx->filter_data); @@ -301,8 +303,11 @@ 
static void do_traverse(struct traversal_context *ctx) * an uninteresting boundary commit may not have its tree * parsed yet, but we are not going to show them anyway */ - if (get_commit_tree(commit)) - add_pending_tree(ctx->revs, get_commit_tree(commit)); + if (get_commit_tree(commit)) { + struct tree *tree = get_commit_tree(commit); + tree->object.flags |= NOT_USER_GIVEN; + add_pending_tree(ctx->revs, tree); + } ctx->show_commit(commit, ctx->show_data); if (ctx->revs->tree_blobs_in_commit_order) diff --git a/revision.c b/revision.c index de4dce600..72d48a17f 100644 --- a/revision.c +++ b/revision.c @@ -175,7 +175,6 @@ static void add_pending_object_with_path(struct rev_info *revs,
[PATCH v8 4/7] rev-list: handle missing tree objects properly
Previously, we assumed only blob objects could be missing. This patch makes rev-list handle missing trees like missing blobs. The --missing=* and --exclude-promisor-objects flags now work for trees as they already do for blobs. This is demonstrated in t6112. Signed-off-by: Matthew DeVore --- builtin/rev-list.c | 11 --- list-objects.c | 11 +-- revision.h | 15 + t/t0410-partial-clone.sh | 45 ++ t/t5317-pack-objects-filter-objects.sh | 13 t/t6112-rev-list-filters-objects.sh| 17 ++ 6 files changed, 105 insertions(+), 7 deletions(-) diff --git a/builtin/rev-list.c b/builtin/rev-list.c index 5b07f3f4a..49d6deed7 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -6,6 +6,7 @@ #include "list-objects.h" #include "list-objects-filter.h" #include "list-objects-filter-options.h" +#include "object.h" #include "object-store.h" #include "pack.h" #include "pack-bitmap.h" @@ -209,7 +210,8 @@ static inline void finish_object__ma(struct object *obj) */ switch (arg_missing_action) { case MA_ERROR: - die("missing blob object '%s'", oid_to_hex(>oid)); + die("missing %s object '%s'", + type_name(obj->type), oid_to_hex(>oid)); return; case MA_ALLOW_ANY: @@ -222,8 +224,8 @@ static inline void finish_object__ma(struct object *obj) case MA_ALLOW_PROMISOR: if (is_promisor_object(>oid)) return; - die("unexpected missing blob object '%s'", - oid_to_hex(>oid)); + die("unexpected missing %s object '%s'", + type_name(obj->type), oid_to_hex(>oid)); return; default: @@ -235,7 +237,7 @@ static inline void finish_object__ma(struct object *obj) static int finish_object(struct object *obj, const char *name, void *cb_data) { struct rev_list_info *info = cb_data; - if (obj->type == OBJ_BLOB && !has_object_file(>oid)) { + if (!has_object_file(>oid)) { finish_object__ma(obj); return 1; } @@ -373,6 +375,7 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) init_revisions(, prefix); revs.abbrev = DEFAULT_ABBREV; revs.commit_format = CMIT_FMT_UNSPECIFIED; + 
revs.do_not_die_on_missing_tree = 1; /* * Scan the argument list before invoking setup_revisions(), so that we diff --git a/list-objects.c b/list-objects.c index f9b51db7a..243192af5 100644 --- a/list-objects.c +++ b/list-objects.c @@ -143,6 +143,7 @@ static void process_tree(struct traversal_context *ctx, struct rev_info *revs = ctx->revs; int baselen = base->len; enum list_objects_filter_result r = LOFR_MARK_SEEN | LOFR_DO_SHOW; + int failed_parse; if (!revs->tree_objects) return; @@ -150,7 +151,9 @@ static void process_tree(struct traversal_context *ctx, die("bad tree object"); if (obj->flags & (UNINTERESTING | SEEN)) return; - if (parse_tree_gently(tree, 1) < 0) { + + failed_parse = parse_tree_gently(tree, 1); + if (failed_parse) { if (revs->ignore_missing_links) return; @@ -163,7 +166,8 @@ static void process_tree(struct traversal_context *ctx, is_promisor_object(>oid)) return; - die("bad tree object %s", oid_to_hex(>oid)); + if (!revs->do_not_die_on_missing_tree) + die("bad tree object %s", oid_to_hex(>oid)); } strbuf_addstr(base, name); @@ -178,7 +182,8 @@ static void process_tree(struct traversal_context *ctx, if (base->len) strbuf_addch(base, '/'); - process_tree_contents(ctx, tree, base); + if (!failed_parse) + process_tree_contents(ctx, tree, base); if (!(obj->flags & USER_GIVEN) && ctx->filter_fn) { r = ctx->filter_fn(LOFS_END_TREE, obj, diff --git a/revision.h b/revision.h index 007278cc1..5910613cb 100644 --- a/revision.h +++ b/revision.h @@ -126,6 +126,21 @@ struct rev_info { line_level_traverse:1, tree_blobs_in_commit_order:1, + /* +* Blobs are shown without regard for their existence. +* But not so for trees: unless exclude_promisor_objects +* is set and the tree in question is a promisor object; +* OR ignore_missing_links is set, the revision walker +* dies with a "bad tree object HASH" message when +* encountering a missing tree. For callers that can +* handle missing trees
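The --missing=* behavior that this patch extends to trees can be demonstrated for the blob case with stock git; here a loose object is deleted by hand to simulate a partial repository (a sketch, not one of the series' tests):

```shell
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q .
git config user.email you@example.com
git config user.name "You"
echo data >file.txt
git add file.txt
git commit -qm "initial"
# Simulate a missing object by deleting the blob's loose-object file.
oid=$(git rev-parse HEAD:file.txt)
rm ".git/objects/$(echo "$oid" | cut -c1-2)/$(echo "$oid" | cut -c3-)"
# Without --missing=* this walk would die; with --missing=print the
# missing object is reported with a leading '?' instead.
git rev-list --objects --missing=print HEAD | grep "^?"
```

With this series applied, deleting a tree's loose object instead of a blob's yields the same "?<oid>" report rather than "bad tree object".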
[PATCH v8 2/7] list-objects: refactor to process_tree_contents
This will be used in a follow-up patch to reduce indentation needed when invoking the logic conditionally. i.e. rather than: if (foo) { while (...) { /* this is very indented */ } } we will have: if (foo) process_tree_contents(...); Signed-off-by: Matthew DeVore --- list-objects.c | 68 ++ 1 file changed, 41 insertions(+), 27 deletions(-) diff --git a/list-objects.c b/list-objects.c index 584518a3f..ccc529e5e 100644 --- a/list-objects.c +++ b/list-objects.c @@ -94,6 +94,46 @@ static void process_gitlink(struct traversal_context *ctx, /* Nothing to do */ } +static void process_tree(struct traversal_context *ctx, +struct tree *tree, +struct strbuf *base, +const char *name); + +static void process_tree_contents(struct traversal_context *ctx, + struct tree *tree, + struct strbuf *base) +{ + struct tree_desc desc; + struct name_entry entry; + enum interesting match = ctx->revs->diffopt.pathspec.nr == 0 ? + all_entries_interesting : entry_not_interesting; + + init_tree_desc(, tree->buffer, tree->size); + + while (tree_entry(, )) { + if (match != all_entries_interesting) { + match = tree_entry_interesting(, base, 0, + >revs->diffopt.pathspec); + if (match == all_entries_not_interesting) + break; + if (match == entry_not_interesting) + continue; + } + + if (S_ISDIR(entry.mode)) + process_tree(ctx, +lookup_tree(the_repository, entry.oid), +base, entry.path); + else if (S_ISGITLINK(entry.mode)) + process_gitlink(ctx, entry.oid->hash, + base, entry.path); + else + process_blob(ctx, +lookup_blob(the_repository, entry.oid), +base, entry.path); + } +} + static void process_tree(struct traversal_context *ctx, struct tree *tree, struct strbuf *base, @@ -101,10 +141,6 @@ static void process_tree(struct traversal_context *ctx, { struct object *obj = >object; struct rev_info *revs = ctx->revs; - struct tree_desc desc; - struct name_entry entry; - enum interesting match = revs->diffopt.pathspec.nr == 0 ? 
- all_entries_interesting: entry_not_interesting; int baselen = base->len; enum list_objects_filter_result r = LOFR_MARK_SEEN | LOFR_DO_SHOW; int gently = revs->ignore_missing_links || @@ -144,29 +180,7 @@ static void process_tree(struct traversal_context *ctx, if (base->len) strbuf_addch(base, '/'); - init_tree_desc(, tree->buffer, tree->size); - - while (tree_entry(, )) { - if (match != all_entries_interesting) { - match = tree_entry_interesting(, base, 0, - >diffopt.pathspec); - if (match == all_entries_not_interesting) - break; - if (match == entry_not_interesting) - continue; - } - - if (S_ISDIR(entry.mode)) - process_tree(ctx, -lookup_tree(the_repository, entry.oid), -base, entry.path); - else if (S_ISGITLINK(entry.mode)) - process_gitlink(ctx, entry.oid->hash, base, entry.path); - else - process_blob(ctx, -lookup_blob(the_repository, entry.oid), -base, entry.path); - } + process_tree_contents(ctx, tree, base); if (!(obj->flags & USER_GIVEN) && ctx->filter_fn) { r = ctx->filter_fn(LOFS_END_TREE, obj, -- 2.19.0.397.gdd90340f6a-goog
[PATCH v8 6/7] list-objects-filter: use BUG rather than die
In some cases in this file, BUG makes more sense than die. In such
cases, we get there from a coding error rather than a user error.

'return' has been removed following some instances of BUG since BUG
does not return.

Signed-off-by: Matthew DeVore
---
 list-objects-filter.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/list-objects-filter.c b/list-objects-filter.c
index a0ba78b20..5f8b1a002 100644
--- a/list-objects-filter.c
+++ b/list-objects-filter.c
@@ -44,8 +44,7 @@ static enum list_objects_filter_result filter_blobs_none(
 
 	switch (filter_situation) {
 	default:
-		die("unknown filter_situation");
-		return LOFR_ZERO;
+		BUG("unknown filter_situation: %d", filter_situation);
 
 	case LOFS_BEGIN_TREE:
 		assert(obj->type == OBJ_TREE);
@@ -102,8 +101,7 @@ static enum list_objects_filter_result filter_blobs_limit(
 
 	switch (filter_situation) {
 	default:
-		die("unknown filter_situation");
-		return LOFR_ZERO;
+		BUG("unknown filter_situation: %d", filter_situation);
 
 	case LOFS_BEGIN_TREE:
 		assert(obj->type == OBJ_TREE);
@@ -208,8 +206,7 @@ static enum list_objects_filter_result filter_sparse(
 
 	switch (filter_situation) {
 	default:
-		die("unknown filter_situation");
-		return LOFR_ZERO;
+		BUG("unknown filter_situation: %d", filter_situation);
 
 	case LOFS_BEGIN_TREE:
 		assert(obj->type == OBJ_TREE);
@@ -389,7 +386,7 @@ void *list_objects_filter__init(
 	assert((sizeof(s_filters) / sizeof(s_filters[0])) == LOFC__COUNT);
 
 	if (filter_options->choice >= LOFC__COUNT)
-		die("invalid list-objects filter choice: %d",
+		BUG("invalid list-objects filter choice: %d",
 		    filter_options->choice);
 
 	init_fn = s_filters[filter_options->choice];
-- 
2.19.0.397.gdd90340f6a-goog
[PATCH v8 0/7] filter: support for excluding all trees and blobs
Things seem to have settled down in terms of responses, so here is a
re-roll, some of the changes being Junio's suggestions:

 - show a more helpful error if a positive integer is given after
   "tree:"
 - added a test for an issue that this patchset inadvertently fixes:
   git rev-list would filter objects given explicitly on the
   command-line, but it should not.
 - improved documentation on the NOT_USER_GIVEN flag in revision.h

Matthew DeVore (7):
  list-objects: store common func args in struct
  list-objects: refactor to process_tree_contents
  list-objects: always parse trees gently
  rev-list: handle missing tree objects properly
  revision: mark non-user-given objects instead
  list-objects-filter: use BUG rather than die
  list-objects-filter: implement filter tree:0

 Documentation/rev-list-options.txt     |   5 +
 builtin/rev-list.c                     |  11 +-
 list-objects-filter-options.c          |  14 ++
 list-objects-filter-options.h          |   1 +
 list-objects-filter.c                  |  60 ++-
 list-objects.c                         | 232 +
 revision.c                             |   1 -
 revision.h                             |  26 ++-
 t/t0410-partial-clone.sh               |  45 +
 t/t5317-pack-objects-filter-objects.sh |  41 +
 t/t5616-partial-clone.sh               |  38 
 t/t6112-rev-list-filters-objects.sh    |  39 +
 12 files changed, 389 insertions(+), 124 deletions(-)

-- 
2.19.0.397.gdd90340f6a-goog
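The t5616 additions in this series exercise partial clone end to end. Assuming a git build that includes the series (tree:0 later shipped in mainline git), a minimal local reproduction over the file:// transport looks like this:

```shell
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q server
git -C server config user.email you@example.com
git -C server config user.name "You"
echo content >server/file.txt
git -C server add file.txt
git -C server commit -qm "initial"
# The server side must opt in to serving filtered fetches.
git -C server config uploadpack.allowfilter 1
# tree:0 + --no-checkout: the clone fetches commits but no trees or
# blobs up front; missing objects are fetched lazily on demand.
git clone -q --no-checkout --filter=tree:0 "file://$dir/server" client
git -C client cat-file -t HEAD
```

The final command reports "commit", confirming the commit history is present even though the trees and blobs behind it were filtered out of the initial transfer.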
Re: [PATCH 3/3] sequencer: use read_author_script()
On Wed, Sep 12, 2018 at 6:11 AM Phillip Wood wrote:
> Use the new function to read the author script, updating
> read_env_script() and read_author_ident(). This means we now have a
> single code path that reads the author script and uses sq_dequote() to
> dequote it. This fixes potential problems with user edited scripts
> as read_env_script() which did not track quotes properly.
> [...]
> Signed-off-by: Phillip Wood
> ---
>  /*
>   * Read a list of environment variable assignments (such as the author-script
>   * file) into an environment block. Returns -1 on error, 0 otherwise.
>   */

According to this comment, this function is capable of parsing a file
of arbitrary "NAME=Value" lines, and indeed the original code does just
that, but...

>  static int read_env_script(struct argv_array *env)
>  {
> +	char *name, *email, *date;
>
> -	if (strbuf_read_file(&script, rebase_path_author_script(), 256) <= 0)
> +	if (read_author_script(rebase_path_author_script(),
> +			       &name, &email, &date, 0))

...the new implementation is able to handle only GIT_AUTHOR_NAME,
GIT_AUTHOR_EMAIL, and GIT_AUTHOR_DATE, in exactly that order. This
seems like a pretty serious (and possibly buggy) change of behavior,
and makes the function much less useful (in general). Is it true that
it will only ever be used for files containing that limited set of
names? If so, the behavior change deserves mention in the commit
message, the function comment needs updating, and the function itself
probably ought to be renamed.
> +	strbuf_addstr(&script, "GIT_AUTHOR_NAME=");
> +	strbuf_addstr(&script, name);
> +	argv_array_push(env, script.buf);
> +	strbuf_reset(&script);
> +	strbuf_addstr(&script, "GIT_AUTHOR_EMAIL=");
> +	strbuf_addstr(&script, email);
> +	argv_array_push(env, script.buf);
> +	strbuf_reset(&script);
> +	strbuf_addstr(&script, "GIT_AUTHOR_DATE=");
> +	strbuf_addstr(&script, date);
> +	argv_array_push(env, script.buf);
> +	strbuf_release(&script);

Mentioned earlier[1], this can all collapse down to:

    argv_array_pushf(env, "GIT_AUTHOR_NAME=%s", name);
    argv_array_pushf(env, "GIT_AUTHOR_EMAIL=%s", email);
    argv_array_pushf(env, "GIT_AUTHOR_DATE=%s", date);

However, it's unfortunate that this manual and hard-coded
reconstruction is needed at all. If you restructure the factoring of
this patch series, such ugliness can be avoided altogether. For
instance, the series could be structured like this:

1. Introduce a general-purpose function for reading a file containing
   arbitrary "NAME=Value" lines (not caring about specific key names or
   their order) and returning them in some data structure (perhaps via
   'string_list' as parse_key_value_squoted() in patch 2/3 does).

2. Build read_author_script() atop #1, making it expect and extract
   GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL, and GIT_AUTHOR_DATE (possibly in
   that order, or possibly not if we don't care).

3. Retrofit existing parsers to call one of those two functions (this
   step may happen over several patches).

So, for instance, read_env_script() would call the generic parser from
#1, whereas sequencer.c:read_author_ident() would call the more
specific parser from #2.

More below...
> @@ -790,54 +771,25 @@ static char *get_author(const char *message)
>  /* Read author-script and return an ident line (author <email> timestamp) */
>  static const char *read_author_ident(struct strbuf *buf)
>  {
> -	const char *keys[] = {
> -		"GIT_AUTHOR_NAME=", "GIT_AUTHOR_EMAIL=", "GIT_AUTHOR_DATE="
> -	};
> -	if (strbuf_read_file(buf, rebase_path_author_script(), 256) <= 0)
> +	if (read_author_script(rebase_path_author_script(),
> +			       &name, &email, &date, 0))
>  		return NULL;
> -	/* dequote values and construct ident line in-place */
> -	for (in = buf->buf; i < 3 && in - buf->buf < buf->len; i++) {
> -		if (!skip_prefix(in, keys[i], (const char **)&in)) {
> -			warning(_("could not parse '%s' (looking for '%s')"),
> -				rebase_path_author_script(), keys[i]);
> -			return NULL;
> -		}
> -		if (!sq_dequote(in)) {
> -			warning(_("bad quoting on %s value in '%s'"),
> -				keys[i], rebase_path_author_script());
> -			return NULL;
> -		}
> -	}
> -	if (i < 3) {
> -		warning(_("could not parse '%s' (looking for '%s')"),
> -			rebase_path_author_script(), keys[i]);
> -		return NULL;
> -	}

The parsing code being thrown away here does a better job of diagnosing problems (thus helping the user figure out what went wrong) than the new shared parser introduced by patch 2/3. The shared function only ever reports a generic "unable to parse", whereas the above code gets specific, saying that it was looking for a particular key or that quoting was broken. I'd have expected the new shared parser to diagnose problems at least as specifically as the code it replaces.
Re: [PATCH 2/3] add read_author_script() to libgit
On Wed, Sep 12, 2018 at 6:11 AM Phillip Wood wrote:

> Add read_author_script() to sequencer.c based on the implementation in
> builtin/am.c and update read_am_author_script() to use
> read_author_script(). The sequencer code that reads the author script
> will be updated in the next commit.
>
> Signed-off-by: Phillip Wood
> ---
> diff --git a/builtin/am.c b/builtin/am.c
> @@ -305,39 +279,16 @@ static int parse_key_value_squoted(char *buf, struct string_list *list)
> static int read_am_author_script(struct am_state *state)
> {

This function returns 'int'...

> 	const char *filename = am_path(state, "author-script");
> +	if (read_author_script(filename, &state->author_name,
> +			       &state->author_email, &state->author_date, 1))
> +		exit(128);

...but only ever exit()s...

> +	return 0;

...or returns 0.

> }

It's a little disturbing that it now invokes exit() directly rather than calling die(), since die() may involve extra functionality before actually exiting. What if, instead of exit()ing directly, you drop the conditional and instead return the value of read_author_script():

    return read_author_script(...);

and let the caller deal with the zero or non-zero return value as usual? (True, you'd get two error messages upon failure rather than just one, but that's not necessarily a bad thing.)

A possibly valid response is that such a change is outside the scope of this series, since the original code shared this odd architecture of sometimes returning 0, sometimes -1, and sometimes die()ing.
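To illustrate the suggested shape with a compilable toy (the stub and all names here are made up, nothing is from builtin/am.c): the helper simply hands its callee's status back up, leaving it to the caller to die() or recover.

```c
#include <string.h>

/* Hypothetical stand-in; the real code fills state->author_* from a file. */
struct am_state_sketch {
	const char *author_name;
};

/* Stub for read_author_script(): -1 on error, 0 on success. */
static int read_author_script_stub(const char *filename,
				   struct am_state_sketch *state)
{
	if (strcmp(filename, "author-script"))
		return -1;	/* "file" not found */
	state->author_name = "A U Thor";
	return 0;
}

/*
 * The suggested shape: no exit(128) inside the helper; propagate the
 * status so the caller can die() (with its usual cleanup) as it sees fit.
 */
static int read_am_author_script_sketch(const char *filename,
					struct am_state_sketch *state)
{
	return read_author_script_stub(filename, state);
}
```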
Re: [PATCH v5 1/5] eoie: add End of Index Entry (EOIE) extension
Ben Peart writes:

> diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
> index 39133bcbc8..f613dd72e3 100755
> --- a/t/t1700-split-index.sh
> +++ b/t/t1700-split-index.sh
> @@ -7,6 +7,7 @@ test_description='split index mode tests'
>  # We need total control of index splitting here
>  sane_unset GIT_TEST_SPLIT_INDEX
>  sane_unset GIT_FSMONITOR_TEST
> +export GIT_TEST_DISABLE_EOIE=true
>
>  test_expect_success 'enable split index' '
>  	git config splitIndex.maxPercentChange 100 &&

It is safer to squash the following in; we may want to revisit the decision test-lint makes on this issue later, though.

-- >8 --
Subject: [PATCH] SQUASH???

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#export
specifies how "export name[=word]" ought to work, but writing
"name=word; export name" instead is not so much more cumbersome, and
some older shells that do not understand the former do grok the latter.
test-lint also recommends spelling it this way.
---
 t/t1700-split-index.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index f613dd72e3..dab97c2187 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -7,7 +7,7 @@ test_description='split index mode tests'
 # We need total control of index splitting here
 sane_unset GIT_TEST_SPLIT_INDEX
 sane_unset GIT_FSMONITOR_TEST
-export GIT_TEST_DISABLE_EOIE=true
+GIT_TEST_DISABLE_EOIE=true; export GIT_TEST_DISABLE_EOIE

 test_expect_success 'enable split index' '
 	git config splitIndex.maxPercentChange 100 &&
-- 
2.19.0
[PATCH v2] linear-assignment: fix potential out of bounds memory access
Currently the 'compute_assignment()' function may read memory out of bounds, even if used correctly. Namely, this happens when we only have one column. In that case we try to calculate the initial minimum cost using '!j1' as the column in the reduction transfer code. That in turn causes us to try and get the cost from column 1 in the cost matrix, which does not exist, and thus results in an out-of-bounds memory read.

In the original paper [1], the example code initializes that minimum cost to "infinite". We could emulate something similar by setting the minimum cost to INT_MAX, which would result in the same minimum cost as the current algorithm, as we'd always go into the if condition at least once, except when we only have one column, and column_count thus equals 1. If column_count does equal 1, the condition in the loop would always be false, and we'd end up with a minimum of INT_MAX, which may lead to integer overflows later in the algorithm.

For a column count of 1, however, we do not even really need to go through the whole algorithm. A column count of 1 means that there are no possible assignments, and we can just zero out the column2row and row2column arrays and return early from the function, while keeping the reduction transfer part of the function the same as it is currently.

Another solution would be to just not call the 'compute_assignment()' function from the range-diff code in this case; however, it's better to make the compute_assignment() function more robust, so future callers don't run into this potential problem.

Note that the test only fails under valgrind on Linux, but the same command has been reported to segfault on Mac OS.

[1]: Jonker, R., & Volgenant, A. (1987). A shortest augmenting path algorithm for dense and sparse linear assignment problems. Computing, 38(4), 325-340.
Reported-by: ryenus
Helped-by: Derrick Stolee
Signed-off-by: Thomas Gummerer
---
 linear-assignment.c   | 6 ++++++
 t/t3206-range-diff.sh | 5 +++++
 2 files changed, 11 insertions(+)

diff --git a/linear-assignment.c b/linear-assignment.c
index 9b3e56e283..ecffc09be6 100644
--- a/linear-assignment.c
+++ b/linear-assignment.c
@@ -19,6 +19,12 @@ void compute_assignment(int column_count, int row_count, int *cost,
 	int *free_row, free_count = 0, saved_free_count, *pred, *col;
 	int i, j, phase;
 
+	if (column_count < 2) {
+		memset(column2row, 0, sizeof(int) * column_count);
+		memset(row2column, 0, sizeof(int) * row_count);
+		return;
+	}
+
 	memset(column2row, -1, sizeof(int) * column_count);
 	memset(row2column, -1, sizeof(int) * row_count);
 	ALLOC_ARRAY(v, column_count);
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index 2237c7f4af..fb4c13a84a 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -142,4 +142,9 @@ test_expect_success 'changed message' '
 	test_cmp expected actual
 '
 
+test_expect_success 'no commits on one side' '
+	git commit --amend -m "new message" &&
+	git range-diff master HEAD@{1} HEAD
+'
+
 test_done
-- 
2.19.0.397.gdd90340f6a
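As a standalone illustration of the out-of-bounds index (not part of the patch; the helper names are mine): the cost matrix holds column_count * row_count entries, addressed as column + column_count * row, so with a single column j1 can only be 0, '!j1' is 1, and the access can land past the end of the matrix.

```c
/* Mirrors the layout of the COST(column, row) indexing in the algorithm. */
static int cost_index(int column, int row, int column_count)
{
	return column + column_count * row;
}

/*
 * Returns 1 when the "initial minimum" read at COST(!j1, row) with
 * j1 == 0 would index at or past the end of the cost matrix, which
 * holds column_count * row_count entries.
 */
static int initial_min_reads_out_of_bounds(int column_count, int row_count)
{
	int j1 = 0;	/* the only possible column when column_count == 1 */

	return cost_index(!j1, row_count - 1, column_count) >=
	       column_count * row_count;
}
```

With column_count == 1 and row_count == 3 the matrix has 3 entries (indices 0..2), but COST(1, 2) computes index 3 — exactly the read the early return in the patch avoids.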
Re: [PATCH] linear-assignment: fix potential out of bounds memory access (was: Re: Git 2.19 Segmentation fault 11 on macOS)
On 09/12, Johannes Schindelin wrote:
> Hi Thomas,
>
> [quickly, as I will go back to a proper vacation after this]

Sorry about interrupting your vacation, enjoy wherever you are! :)

> On Wed, 12 Sep 2018, Thomas Gummerer wrote:
>
> > diff --git a/linear-assignment.c b/linear-assignment.c
> > index 9b3e56e283..7700b80eeb 100644
> > --- a/linear-assignment.c
> > +++ b/linear-assignment.c
> > @@ -51,8 +51,8 @@ void compute_assignment(int column_count, int row_count, int *cost,
> > 	else if (j1 < -1)
> > 		row2column[i] = -2 - j1;
> > 	else {
> > -		int min = COST(!j1, i) - v[!j1];
> > -		for (j = 1; j < column_count; j++)
> > +		int min = INT_MAX;
>
> I am worried about this, as I tried very hard to avoid integer overruns.

Ah, fair enough, now I think I understand where the calculation of the initial value of min comes from, thanks!

> Wouldn't it be possible to replace the `else {` by an appropriate `else if
> (...) { ... } else {`? E.g. `else if (column_count < 2)` or some such?

Yes, I think that would be possible. However, if we're already special-casing "column_count < 2", I think we might as well just exit early before running through the whole algorithm in that case. If there's only one column, there are no commits that can be assigned to each other, as there is only the one.

We could also just not call 'compute_assignment' in the first place if column_count == 1, however I'd rather make the function safer to call, just in case we find it useful for something else in the future.

Will send an updated patch in a bit.
> Ciao, > Dscho > > > + for (j = 0; j < column_count; j++) > > if (j != j1 && min > COST(j, i) - v[j]) > > min = COST(j, i) - v[j]; > > v[j1] -= min; > > diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh > > index 2237c7f4af..fb4c13a84a 100755 > > --- a/t/t3206-range-diff.sh > > +++ b/t/t3206-range-diff.sh > > @@ -142,4 +142,9 @@ test_expect_success 'changed message' ' > > test_cmp expected actual > > ' > > > > +test_expect_success 'no commits on one side' ' > > + git commit --amend -m "new message" && > > + git range-diff master HEAD@{1} HEAD > > +' > > + > > test_done > > -- > > 2.19.0.397.gdd90340f6a > > > >
Re: [PATCH v1] read-cache: add GIT_TEST_INDEX_VERSION support
Thomas Gummerer writes:

> Thanks, I do think this is a good idea. I do however share Ævar's
> concern in https://public-inbox.org/git/87h8itkz2h@evledraar.gmail.com/.
> I have TEST_GIT_INDEX_VERSION=4 set in my config.mak since quite a
> long time, and had I missed this thread, I would all of a sudden not
> run the test suite with index format 4 anymore without any notice.
>
> I think the suggestion of erroring out if TEST_GIT_INDEX_VERSION is
> set would be useful in this case (and probably others in which you're
> renaming these variables).

I am not enthused by "you are using an old variable---we fail your build/test". The assumption is that people leave config.mak lying around regardless of how new/old the vintage of Git they are building and testing. I do not think you'd want to adjust your config.mak as you switch between 'maint' and 'next'.

I think it is OK to make it error only if the old one is set without the new one. Then people can have _both_ set to the same value during the period in which the topic sails through 'pu' down to 'next' down to 'master', after seeing a failure once while building and testing 'pu'.

> Btw, I think it would be nice to have all these renaming/documenting
> variables for the test suite patches in one series, so they are easier
> to look at with more context.

Yeah, even though these three were posted as independent changes, their additions to t/README inevitably conflicted with each other.
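A sketch of that transition check (illustrative only, using the index-version variables as the example): fail only when the old variable is set and the new one is not, so a config.mak carrying both values keeps working while the topic graduates.

```shell
# check_renamed_var <old-value> <new-value>
# Hypothetical helper for test-lib.sh: returns 1 (and complains) only
# when the old variable has a value but the new one does not.
check_renamed_var () {
	old_value=$1
	new_value=$2
	if test -n "${old_value:+isset}" && test -z "${new_value:+isset}"
	then
		echo >&2 "error: TEST_GIT_INDEX_VERSION was renamed to GIT_TEST_INDEX_VERSION"
		return 1
	fi
	return 0
}
```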
Re: [PATCH v2] builtin/remote: quote remote name on error to display empty name
Junio C Hamano writes: > Have you run "make test" with this change? > > I expect at least 5505.10 to fail without adjustment. For now, I'll queue this on top, and if this turns out to be sufficient, I will squash it in. t/t5505-remote.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/t/t5505-remote.sh b/t/t5505-remote.sh index 241e6a319d..d2a2cdd453 100755 --- a/t/t5505-remote.sh +++ b/t/t5505-remote.sh @@ -145,7 +145,7 @@ test_expect_success 'remove remote protects local branches' ' test_expect_success 'remove errors out early when deleting non-existent branch' ' ( cd test && - echo "fatal: No such remote: foo" >expect && + echo "fatal: No such remote: '\''foo'\''" >expect && test_must_fail git remote rm foo 2>actual && test_i18ncmp expect actual ) @@ -173,7 +173,7 @@ test_expect_success 'remove remote with a branch without configured merge' ' test_expect_success 'rename errors out early when deleting non-existent branch' ' ( cd test && - echo "fatal: No such remote: foo" >expect && + echo "fatal: No such remote: '\''foo'\''" >expect && test_must_fail git remote rename foo bar 2>actual && test_i18ncmp expect actual )
Re: [PATCH v1] read-cache: add GIT_TEST_INDEX_VERSION support
On 09/13, Ben Peart wrote: > > > On 9/12/2018 6:31 PM, Thomas Gummerer wrote: > > On 09/12, Ben Peart wrote: > > > Teach get_index_format_default() to support running the test suite > > > with specific index versions. In particular, this enables the test suite > > > to be run using index version 4 which is not the default so gets less > > > testing. > > > > I found this commit message slightly misleading. Running the test > > suite with specific index versions is already supported, by defining > > TEST_GIT_INDEX_VERSION in 'config.mak'. What we're doing here is > > introduce an additional environment variable that can also be used to > > set the index format in tests. > > > > Even setting TEST_GIT_INDEX_VERSION=4 in the environment does run the > > test suite with index-v4. Admittedly the name is a bit strange > > compared to our usual GIT_TEST_* environment variable names, and it > > should probably be documented better (it's only documented in the > > Makefile currently), but I'm not sure we should introduce another > > environment variable for this purpose? > > TEST_GIT_INDEX_VERSION enables the testing I was looking for but you're > right, it isn't well documented and the atypical naming and implementation > don't help either. > > I checked the documentation and code but didn't see any way to test the V4 > index code path so wrote this patch. I wonder if we can improve the > discoverability of TEST_GIT_INDEX_VERSION through better naming and > documentation. > > How about this as an alternative? Thanks, I do think this is a good idea. I do however share Ævar's concern in https://public-inbox.org/git/87h8itkz2h@evledraar.gmail.com/. I have TEST_GIT_INDEX_VERSION=4 set in my config.mak since quite a long time, and had I missed this thread, I would all of a sudden not run the test suite with index format 4 anymore without any notice. 
I think the suggestion of erroring out if TEST_GIT_INDEX_VERSION is set would be useful in this case (and probably others in which you're renaming these variables). Not sure how many people it would affect (and most of those would probably read the mailing list), but it's not a big change either. Btw, I think it would be nice to have all these renaming/documenting variables for the test suite patches in one series, so they are easier to look at with more context. > > diff --git a/Makefile b/Makefile > index 5a969f5830..9e84ef02f7 100644 > --- a/Makefile > +++ b/Makefile > @@ -400,7 +400,7 @@ all:: > # (defaults to "man") if you want to have a different default when > # "git help" is called without a parameter specifying the format. > # > -# Define TEST_GIT_INDEX_VERSION to 2, 3 or 4 to run the test suite > +# Define GIT_TEST_INDEX_VERSION to 2, 3 or 4 to run the test suite > # with a different indexfile format version. If it isn't set the index > # file format used is index-v[23]. > # > @@ -2599,8 +2599,8 @@ endif > ifdef GIT_INTEROP_MAKE_OPTS > @echo GIT_INTEROP_MAKE_OPTS=\''$(subst ','\'',$(subst > ','\'',$(GIT_INTEROP_MAKE_OPTS)))'\' >>$@+ > endif > -ifdef TEST_GIT_INDEX_VERSION > - @echo TEST_GIT_INDEX_VERSION=\''$(subst ','\'',$(subst > ','\'',$(TEST_GIT_INDEX_VERSION)))'\' >>$@+ > +ifdef GIT_TEST_INDEX_VERSION > + @echo GIT_TEST_INDEX_VERSION=\''$(subst ','\'',$(subst > ','\'',$(GIT_TEST_INDEX_VERSION)))'\' >>$@+ > > endif > @if cmp $@+ $@ >/dev/null 2>&1; then $(RM) $@+; else mv $@+ $@; fi > > diff --git a/t/test-lib.sh b/t/test-lib.sh > index 44288cbb59..31698c01c4 100644 > --- a/t/test-lib.sh > +++ b/t/test-lib.sh > @@ -134,9 +134,9 @@ export EDITOR > GIT_TRACE_BARE=1 > export GIT_TRACE_BARE > > -if test -n "${TEST_GIT_INDEX_VERSION:+isset}" > +if test -n "${GIT_TEST_INDEX_VERSION:+isset}" > then > - GIT_INDEX_VERSION="$TEST_GIT_INDEX_VERSION" > + GIT_INDEX_VERSION="$GIT_TEST_INDEX_VERSION" > export GIT_INDEX_VERSION > fi > > diff --git a/t/README 
b/t/README
> index 9028b47d92..f872638a78 100644
> --- a/t/README
> +++ b/t/README
> @@ -315,10 +315,14 @@ packs on demand. This normally only happens when the
>  object size is over 2GB. This variable forces the code path on any
>  object larger than <n> bytes.
>
> -GIT_TEST_OE_DELTA_SIZE=<n> exercises the uncomon pack-objects code
> +GIT_TEST_OE_DELTA_SIZE=<n> exercises the uncommon pack-objects code
>  path where deltas larger than this limit require extra memory
>  allocation for bookkeeping.
>
> +GIT_TEST_INDEX_VERSION=<n> exercises the index read/write code path
> +for the index version specified. Can be set to any valid version
> +but the non-default version 4 is probably the most beneficial.
> +
>  Naming Tests
>  ------------
Re: [PATCH v2] builtin/remote: quote remote name on error to display empty name
Shulhan writes: > When adding new remote name with empty string, git will print the > following error message, > > fatal: '' is not a valid remote name\n > > But when removing remote name with empty string as input, git shows the > empty string without quote, > > fatal: No such remote: \n > > To make these error messages consistent, quote the name of the remote > that we tried and failed to find. > > Signed-off-by: Shulhan > Reviewed-by: Junio C Hamano > --- > builtin/remote.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) Have you run "make test" with this change? I expect at least 5505.10 to fail without adjustment.
Re: [PATCH v1] preload-index: update GIT_FORCE_PRELOAD_TEST support
Ben Peart writes: > Rename GIT_FORCE_PRELOAD_TEST to GIT_TEST_PRELOAD for consistency with the > other GIT_TEST_ special setups and properly document its use. > > Signed-off-by: Ben Peart > --- > Among the two unrelated changes that are not mentioned in the proposed log message, the change to the test to use sane_unset is probably OK, but replacing a call to getenv() with git_env_bool() without making sure necessary header file(s) are included would break the build. I _think_ config.h is the header to include; I didn't try it, though. Thanks. > Notes: > Base Ref: v2.19.0 > Web-Diff: https://github.com/benpeart/git/commit/dcd201b920 > Checkout: git fetch https://github.com/benpeart/git git-test-preload-v1 > && git checkout dcd201b920 > > preload-index.c | 2 +- > t/README| 3 +++ > t/t7519-status-fsmonitor.sh | 4 ++-- > 3 files changed, 6 insertions(+), 3 deletions(-) > > diff --git a/preload-index.c b/preload-index.c > index 71cd2437a3..cc8a7333c2 100644 > --- a/preload-index.c > +++ b/preload-index.c > @@ -84,7 +84,7 @@ static void preload_index(struct index_state *index, > return; > > threads = index->cache_nr / THREAD_COST; > - if ((index->cache_nr > 1) && (threads < 2) && > getenv("GIT_FORCE_PRELOAD_TEST")) > + if ((index->cache_nr > 1) && (threads < 2) && > git_env_bool("GIT_TEST_PRELOAD", 0)) > threads = 2; > if (threads < 2) > return; > diff --git a/t/README b/t/README > index 9028b47d92..73fb09560f 100644 > --- a/t/README > +++ b/t/README > @@ -319,6 +319,9 @@ GIT_TEST_OE_DELTA_SIZE= exercises the uncomon > pack-objects code > path where deltas larger than this limit require extra memory > allocation for bookkeeping. > > +GIT_TEST_PRELOAD= exercises the preload-index code path by > +overriding the minimum number of cache entries required per thread. 
> + > Naming Tests > > > diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh > index 756beb0d8e..9b703d49b5 100755 > --- a/t/t7519-status-fsmonitor.sh > +++ b/t/t7519-status-fsmonitor.sh > @@ -245,9 +245,9 @@ do > git config core.preloadIndex $preload_val && > if test $preload_val = true > then > - GIT_FORCE_PRELOAD_TEST=$preload_val; export > GIT_FORCE_PRELOAD_TEST > + GIT_TEST_PRELOAD=$preload_val; export GIT_TEST_PRELOAD > else > - unset GIT_FORCE_PRELOAD_TEST > + sane_unset GIT_TEST_PRELOAD > fi > ' > > > base-commit: 1d4361b0f344188ab5eec6dcea01f61a3a3a1670
Re: [PATCH v2 1/1] commit-reach: properly peel tags
Derrick Stolee writes:

>> +	if (!from_one || from_one->type != OBJ_COMMIT) {
>> +		/* no way to tell if this is reachable by
>> +		 * looking at the ancestry chain alone, so
>> +		 * leave a note to ourselves not to worry about
>> +		 * this object anymore.
>> +		 */
>> +		from->objects[i].item->flags |= assign_flag;
>> +		continue;
>> +	}
>> +
>> +	list[nr_commits] = (struct commit *)from_one;
>> +	if (parse_commit(list[nr_commits]) ||
>> +	    list[nr_commits]->generation < min_generation)
>> +		return 0; /* is this a leak? */

> Of course, after sending v2, I see this comment. This is a leak of
> 'list' and should be fixed.
>
> Not only is it a leak here, it is also a leak in the 'cleanup'
> section. I'll squash the following into v3, but I'll let v2 simmer for
> review before rerolling.
>
> diff --git a/commit-reach.c b/commit-reach.c
> index 4048a2132a..c457d8d85f 100644
> --- a/commit-reach.c
> +++ b/commit-reach.c
> @@ -569,8 +569,11 @@ int can_all_from_reach_with_flag(struct object_array *from,
>
>  	list[nr_commits] = (struct commit *)from_one;
>  	if (parse_commit(list[nr_commits]) ||
> -	    list[nr_commits]->generation < min_generation)
> -		return 0; /* is this a leak? */
> +	    list[nr_commits]->generation < min_generation) {
> +		result = 0;
> +		goto cleanup;
> +	}
> +
>  	nr_commits++;
>  }
>
> @@ -623,6 +626,7 @@ int can_all_from_reach_with_flag(struct object_array *from,
>  		clear_commit_marks(list[i], RESULT);
>  		clear_commit_marks(list[i], assign_flag);
>  	}
> +	free(list);
>  	return result;
>  }

With this, commit marks are cleared even when we do the "early return", but only for the objects that appear in the resulting list[]. Because the for() loop in the last hunk iterates over list[], those non-commit objects that are smudged with assign_flag but never made it to list[] will be left smudged when this function returns (this is true even when there is no early return). Is that intended?
Re: [PATCH v1] read-cache: add GIT_TEST_INDEX_VERSION support
Ævar Arnfjörð Bjarmason writes: > On Wed, Sep 12 2018, Ben Peart wrote: > >> -GIT_TEST_OE_DELTA_SIZE= exercises the uncomon pack-objects code >> +GIT_TEST_OE_DELTA_SIZE= exercises the uncommon pack-objects code >> path where deltas larger than this limit require extra memory >> allocation for bookkeeping. > > If re-rolled maybe better as a leading "fix a typo" patch. Yeah, an easy to miss change that is unrelated to the topic is better made as a separate patch.
Re: [PATCH 1/2] fetch-object: provide only one fetching function
Jonathan Tan writes: >> Instead of explaining why the new convention is better to justify >> (2), the above three lines handwave by saying "more flexible" >> twice. We should do better. >> >> fetch-object: unify fetch_object[s] functions >> >> There are fetch_object() and fetch_objects() helpers in >> fetch-object.h; as the latter takes "struct oid_array", >> the former cannot be made into a thin wrapper around the >> latter without an extra allocation and set-up cost. >> >> Update fetch_objects() to take an array of "struct >> object_id" and number of elements in it as separate >> parameters, remove fetch_object(), and adjust all existing >> callers of these functions to use the new fetch_objects(). >> >> perhaps? > > Thanks - your explanation is much clearer than mine. Let me know if you > want a reroll (or if you can update the commit message yourself, that's > fine too). If there is no other change needed for either of the patches, I do not mind rewording the 1/2 myself to save a round-trip. Thanks.
Re: [PATCH] sequencer: fix --allow-empty-message behavior, make it smarter
Elijah Newren writes:

> This patch cleanly applies to both 2.19.0 and pu. There are some related
> code cleanups that I'd like to make, but doing that cleanup conflicts with
> the various rewrite-rebase-in-C topics sitting in pu; since those are
> fairly lengthy, I really don't want to cause problems there, but I think
> SZEDER really wants this 2.19.0 regression fix before 2.20.0 and thus
> before those other topics.

Oh, absolutely. Materials for the 2.19.x maintenance track can and should jump over other topics for 2.20 and later. Thanks for being considerate.

> @@ -899,7 +899,7 @@ static int run_git_commit(const char *defmsg, struct replay_opts *opts,
>  	if ((flags & ALLOW_EMPTY))
>  		argv_array_push(&cmd.args, "--allow-empty");
>
> -	if (opts->allow_empty_message)
> +	if (!(flags & EDIT_MSG))
>  		argv_array_push(&cmd.args, "--allow-empty-message");

Hmph. I briefly wondered if an alternative logic would be better: if and only if the original commit being rewritten is empty, then we allow the result to be empty. But looking at EDIT_MSG would be more explicit and probably is a better idea. That would allow you to abort a reword of a commit whose message is empty.

The reason why I thought about the alt logic is because I am worried about a use case where

    $ GIT_EDITOR=: git rebase|cherry-pick ...

is used to say "I do not want you to go interactive, when the only interaction needed from me is to edit the message---I am perfectly happy with the messages of commits being replayed or ones you come up with as the default". Because "--allow-empty-message" tends to make things _less_ interactive, the worry is unfounded.
Re: [PATCH 7/9] submodule: fetch in submodules git directory instead of in worktree
On Wed, Sep 12, 2018 at 11:36 AM Junio C Hamano wrote:
>
> Stefan Beller writes:
>
> > This patch started as a refactoring to make 'get_next_submodule' more
> > readable, but upon doing so, I realized that git-fetch actually doesn't
> > need to be run in the worktree. So let's run it in the git dir instead.
>
> It may be clear to the author but not clear to the reader of the
> above paragraph that "worktree", "fetch" and "git dir" all refer to
> the recursively invoked operation that updates the submodules
> repository. s/git-fetch/"git fetch" for the submodule/ should be
> sufficient to help the readers.
>
> > That should pave the way towards fetching submodules that are currently
> > not checked out.
>
> Very good.
>
> > +static void prepare_submodule_repo_env_in_gitdir(struct argv_array *out)
> > +{
> > +	prepare_submodule_repo_env_no_git_dir(out);
> > +	argv_array_pushf(out, "%s=.", GIT_DIR_ENVIRONMENT);
> > +}
> > +
> > /* Helper function to display the submodule header line prior to the full
> >  * summary output. If it can locate the submodule objects directory it will
> >  * attempt to lookup both the left and right commits and put them into the
> > @@ -1227,6 +1233,27 @@ static int get_fetch_recurse_config(const struct submodule *submodule,
> > 	return spf->default_option;
> > }
> >
> > +static const char *get_submodule_git_dir(struct repository *r, const char *path)
> > +{
> > +	struct repository subrepo;
> > +	const char *ret;
> > +
> > +	if (repo_submodule_init(&subrepo, r, path)) {
> > +		/* no entry in .gitmodules? */
> > +		struct strbuf gitdir = STRBUF_INIT;
> > +		strbuf_repo_worktree_path(&gitdir, r, "%s/.git", path);
> > +		if (repo_init(&subrepo, gitdir.buf, NULL)) {
> > +			strbuf_release(&gitdir);
> > +			return NULL;
> > +		}
>
> This is for the modern "absorbed" layout? Do we get a notice and
> encouragement to migrate from the historical layout, or there is no
> need to (e.g. the migration happens automatically in some other
> codepaths)?
No, the absorbed layout would also be handled by repo_submodule_init. I wrote a patch once to migrate repo_submodule_init to take a "struct submodule *" instead of a path as the third argument, which would fall in line with this patch as well; I'll dig it up.

Historically git-fetch supported repositories that are not submodules (but have a gitlink and a working tree in place) as well. That is covered here (see the /* no entry in .gitmodules? */ comment).

> > -	strbuf_release(&submodule_path);
> > -	strbuf_release(&submodule_git_dir);
>
> But if it is a leak, it is easily plugged by freeing git_dir here, I
> think.

Thanks.
Re: [PATCH v5 11/12] t1407: make hash size independent
On Thu, Sep 13 2018, brian m. carlson wrote: > - $RWT for-each-reflog | cut -c 42- | sort >actual && > + $RWT for-each-reflog | cut -d" " -f 2- | sort >actual && Aside from hash size issues, this just makes the tests easier to read. Thanks!
Re: [PATCH v2 10/11] multi-pack-index: report progress during 'verify'
On Thu, Sep 13 2018, Derrick Stolee via GitGitGadget wrote: > + progress = start_progress(_("Verifying object offsets"), > m->num_objects); I think in the spirit of my "commit-graph {write,verify}: add progress output" it would be better to say: "Verifying multi-pack-index object offsets" I.e. both to make it clearer what's actually going on (without much added verbosity), and also so that when the user turns on this feature it's clear what gc / fsck etc. are doing for that feature in particular, as with the commit-graph.
Re: [PATCH v1] fsmonitor: update GIT_TEST_FSMONITOR support
On Thu, Sep 13 2018, Ben Peart wrote:

> diff --git a/config.c b/config.c
> index 3461993f0a..3555c63f28 100644
> --- a/config.c
> +++ b/config.c
> @@ -2278,7 +2278,7 @@ int git_config_get_max_percent_split_change(void)
>  int git_config_get_fsmonitor(void)
>  {
>  	if (git_config_get_pathname("core.fsmonitor", &core_fsmonitor))
> -		core_fsmonitor = getenv("GIT_FSMONITOR_TEST");
> +		core_fsmonitor = getenv("GIT_TEST_FSMONITOR");
>
>  	if (core_fsmonitor && !*core_fsmonitor)
>  		core_fsmonitor = NULL;
> diff --git a/t/README b/t/README
> index 9028b47d92..545438c820 100644
> --- a/t/README
> +++ b/t/README
> @@ -319,6 +319,10 @@ GIT_TEST_OE_DELTA_SIZE= exercises the uncomon pack-objects code
>  path where deltas larger than this limit require extra memory
>  allocation for bookkeeping.
>
> +GIT_TEST_FSMONITOR=$PWD/t7519/fsmonitor-all exercises the fsmonitor
> +code path for utilizing a file system monitor to speed up detecting
> +new or changed files.
> +
>  Naming Tests

I've seen this & will watch out for it, but still, when I'm updating to "next" in a couple of months I may not be tracking the exact state of the integration of this patch, and then running with GIT_FSMONITOR_TEST=... will suddenly be a noop. So maybe something like this in test-lib.sh as well (or directly in config.c):

    if test -n "$GIT_FSMONITOR_TEST"
    then
        echo "The GIT_FSMONITOR_TEST variable has been renamed to GIT_TEST_FSMONITOR"
        exit 1
    fi

Maybe I'm being too nitpicky and there's only two of us who run the tests with this anyway, and we can deal with it. It just rubs me the wrong way that we have a test mode that silently stops being picked up because a command-line option or env variable got renamed, especially since we've had it for 4 stable releases, and especially since it's so easy for us to avoid that confusion (just die), vs. potential time wasted downstream (wondering why fsmonitor stuff broke on $SOME_OS even though we're testing for it during package build...).
Re: [PATCH v1] read-cache: add GIT_TEST_INDEX_VERSION support
On Wed, Sep 12 2018, Ben Peart wrote: > -GIT_TEST_OE_DELTA_SIZE= exercises the uncomon pack-objects code > +GIT_TEST_OE_DELTA_SIZE= exercises the uncommon pack-objects code > path where deltas larger than this limit require extra memory > allocation for bookkeeping. If re-rolled maybe better as a leading "fix a typo" patch. I was reading this in a mobile client earlier and couldn't see at a glance (small screen+wrapping) what the change was here, so that sort of nitpicking isn't purely theoretical :) > +GIT_TEST_INDEX_VERSION= exercises the index read/write code path > +for the index version specified. Can be set to any valid version > +but the non-default version 4 is probably the most beneficial. > + I'm not familiar with this and haven't dug, so I don't know this: are any values of 1..4 OK? 0..4? Would be better to say that here...
Re: [PATCH 2/3] archive: implement protocol v2 archive command
On Wed, Sep 12 2018, Stefan Beller wrote: > On Tue, Sep 11, 2018 at 10:36 PM Josh Steadmon wrote: >> +*/ >> + status = packet_reader_read(); >> + } >> + if (status != PACKET_READ_DELIM) >> + die(_("upload-archive: expected delim packet")); > > This is upload-archive, which is a low level plumbing command > (see the main man page of git for an explanation of that category), > so we do not translate the error/die() calls. Besides, this is executed > on the server, which might have a different locale than the requesting > client? > > Would asking for a setlocale() on the server side be an unreasonable > feature request for the capabilities (in a follow up patch, and then not > just for archive but also fetch/push, etc.)? This would be very nice to have, but as you suggest in some follow-up change. I think though that instead of doing setlocale() it would be better to pass some flag saying we're operating in a machine-readable mode, and then we'd (as part of the protocol definition) say we're going to emit GIT_ERR_UPLOAD_ARCHIVE_EXPECTED_DELIM_PACKET or whatever. Advantages of doing that over a server-side setlocale(): 1) Purely for translation purposes, users can update to a newer client to get new translations, even though they're talking to an old server. 2) Again, only for translation purposes, servers may not have the appropriate locales generated and/or linked to libgettext. 3) Ditto, some clients that aren't git.git may want/need to emit different translation messages to their consumers than what we have, think some GUI client / Emacs magit etc. whose UI is different from ours. 4) Aside from translation purposes, getting a machine-readable "push/pull" etc. mode would be very handy. E.g. now you need to parse stderr to see why exactly your push failed (hook denied, or non-fast-forward, or non-fast-forward where there was a lock race condition? ...). I also wonder if something like #4 wouldn't complement something like the proposed structured logging[1]. I.e. 
even though we'd like to run git.git, and present exactly the message to the user we do now, we might want to run in such a machine-readable mode under the hood when talking to the server so we can log exactly how the push went / how it failed for the purposes of aggregation. 1. https://public-inbox.org/git/20180713165621.52017-2-...@jeffhostetler.com/
[PATCH v2] read-cache: update TEST_GIT_INDEX_VERSION support
Rename TEST_GIT_INDEX_VERSION to GIT_TEST_INDEX_VERSION for consistency with the other GIT_TEST_ special setups and properly document its use. Signed-off-by: Ben Peart --- Notes: Base Ref: v2.19.0 Web-Diff: https://github.com/benpeart/git/commit/e26ccb9004 Checkout: git fetch https://github.com/benpeart/git git-test-index-version-v2 && git checkout e26ccb9004 ### Interdiff (v1..v2): diff --git a/Makefile b/Makefile index 5a969f5830..9e84ef02f7 100644 --- a/Makefile +++ b/Makefile @@ -400,7 +400,7 @@ all:: # (defaults to "man") if you want to have a different default when # "git help" is called without a parameter specifying the format. # -# Define TEST_GIT_INDEX_VERSION to 2, 3 or 4 to run the test suite +# Define GIT_TEST_INDEX_VERSION to 2, 3 or 4 to run the test suite # with a different indexfile format version. If it isn't set the index # file format used is index-v[23]. # @@ -2599,8 +2599,8 @@ endif ifdef GIT_INTEROP_MAKE_OPTS @echo GIT_INTEROP_MAKE_OPTS=\''$(subst ','\'',$(subst ','\'',$(GIT_INTEROP_MAKE_OPTS)))'\' >>$@+ endif -ifdef TEST_GIT_INDEX_VERSION - @echo TEST_GIT_INDEX_VERSION=\''$(subst ','\'',$(subst ','\'',$(TEST_GIT_INDEX_VERSION)))'\' >>$@+ +ifdef GIT_TEST_INDEX_VERSION + @echo GIT_TEST_INDEX_VERSION=\''$(subst ','\'',$(subst ','\'',$(GIT_TEST_INDEX_VERSION)))'\' >>$@+ endif @if cmp $@+ $@ >/dev/null 2>&1; then $(RM) $@+; else mv $@+ $@; fi diff --git a/read-cache.c b/read-cache.c index d140ce9989..7b1354d759 100644 --- a/read-cache.c +++ b/read-cache.c @@ -1570,45 +1570,26 @@ static unsigned int get_index_format_default(void) char *envversion = getenv("GIT_INDEX_VERSION"); char *endp; int value; - unsigned int version = -1; + unsigned int version = INDEX_FORMAT_DEFAULT; - if (envversion) { - version = strtoul(envversion, , 10); - if (*endp || - version < INDEX_FORMAT_LB || INDEX_FORMAT_UB < version) { - warning(_("GIT_INDEX_VERSION set, but the value is invalid.\n" - "Using version %i"), INDEX_FORMAT_DEFAULT); - version = INDEX_FORMAT_DEFAULT; 
- } - } - - if (version == -1) { - if (!git_config_get_int("index.version", )) { + if (!envversion) { + if (!git_config_get_int("index.version", )) version = value; if (version < INDEX_FORMAT_LB || INDEX_FORMAT_UB < version) { warning(_("index.version set, but the value is invalid.\n" "Using version %i"), INDEX_FORMAT_DEFAULT); - version = INDEX_FORMAT_DEFAULT; - } + return INDEX_FORMAT_DEFAULT; } + return version; } - if (version == -1) { - envversion = getenv("GIT_TEST_INDEX_VERSION"); - if (envversion) { version = strtoul(envversion, , 10); if (*endp || version < INDEX_FORMAT_LB || INDEX_FORMAT_UB < version) { - warning(_("GIT_TEST_INDEX_VERSION set, but the value is invalid.\n" + warning(_("GIT_INDEX_VERSION set, but the value is invalid.\n" "Using version %i"), INDEX_FORMAT_DEFAULT); version = INDEX_FORMAT_DEFAULT; } - } - } - - if (version == -1) - version = INDEX_FORMAT_DEFAULT; - return version; } diff --git a/t/README b/t/README index f872638a78..fb6359b17b 100644 --- a/t/README +++ b/t/README @@ -320,8 +320,7 @@ path where deltas larger than this limit require extra memory allocation for bookkeeping. GIT_TEST_INDEX_VERSION= exercises the index read/write code path -for the index version specified. Can be set to any valid version -but the non-default version 4 is probably the most beneficial. +for the index version specified. Can be set to any valid version. Naming Tests diff --git a/t/test-lib.sh b/t/test-lib.sh index 44288cbb59..31698c01c4 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -134,9 +134,9 @@ export EDITOR GIT_TRACE_BARE=1 export GIT_TRACE_BARE -if test -n "${TEST_GIT_INDEX_VERSION:+isset}" +if test -n "${GIT_TEST_INDEX_VERSION:+isset}" then - GIT_INDEX_VERSION="$TEST_GIT_INDEX_VERSION" + GIT_INDEX_VERSION="$GIT_TEST_INDEX_VERSION" export GIT_INDEX_VERSION fi ### Patches Makefile | 6 +++--- t/README | 5 - t/test-lib.sh | 4 ++-- 3 files changed, 9 insertions(+), 6 deletions(-) diff --git a/Makefile b/Makefile
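The interdiff above appears to have lost its '&' characters in transit (e.g. `strtoul(envversion, , 10)` should presumably read `strtoul(envversion, &endp, 10)`, and `git_config_get_int("index.version", )` should take `&value`). A self-contained sketch of the GIT_INDEX_VERSION parsing the refactored function performs, with the addresses restored and the same constants assumed:

```c
#include <stdlib.h>

#define INDEX_FORMAT_LB 2
#define INDEX_FORMAT_UB 4
#define INDEX_FORMAT_DEFAULT 3

/*
 * Parse a GIT_INDEX_VERSION-style string; fall back to the default on
 * an unset/empty value, trailing garbage, or an out-of-range version.
 * (The real code also emits a warning() in the fallback case.)
 */
static unsigned int parse_index_version(const char *envversion)
{
	char *endp;
	unsigned int version;

	if (!envversion || !*envversion)
		return INDEX_FORMAT_DEFAULT;
	version = strtoul(envversion, &endp, 10);
	if (*endp || version < INDEX_FORMAT_LB || INDEX_FORMAT_UB < version)
		return INDEX_FORMAT_DEFAULT;
	return version;
}
```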
Re: [PATCH v2 1/1] contrib: add coverage-diff script
On 9/13/2018 1:40 PM, Junio C Hamano wrote: "Derrick Stolee via GitGitGadget" writes: + then + line_num=$(($line_num + 1)) + fi + fi + done I have a feeling that a single Perl script instead of a shell loop that runs many grep and awk as subprocesses performs better even on Windows, and it would be more readable and maintainable. perl -e ' my $line_num; while (<>) { # Hunk header? Grab the beginning in postimage. if (/^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@/) { $line_num = $1; next; } # Have we seen a hunk? Ignore "diff --git" etc. next unless defined $line_num; # Deleted line? Ignore. if (/^-/) { next; } # Show only the line number of added lines. if (/^\+/) { print "$line_num\n"; } # Either common context or added line appear in # the postimage. Count it. $line_num++; } ' or something like that, given that you seem to only need line numbers in new_lines.txt anyway? Thanks for the deep dive here, especially with the perl assistance. I've never written any perl, but it seems like the right tool here. I'll have time to revisit this next week. Thanks, -Stolee
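For comparison, the hunk-header parsing that Junio's Perl does with a regex looks roughly like this in C (hand-rolled rather than regex-based; a sketch to illustrate the same idea, not a drop-in for the contrib script):

```c
#include <stdlib.h>
#include <string.h>

/*
 * Given a line like "@@ -15,3 +42,7 @@ ...", return the starting line
 * number of the postimage (42 here), or -1 if the line is not a hunk
 * header.
 */
static long postimage_start(const char *line)
{
	const char *p = line;
	char *endp;
	long n;

	if (strncmp(p, "@@ -", 4))
		return -1;
	p = strchr(p + 4, '+');	/* skip over the preimage range */
	if (!p)
		return -1;
	n = strtol(p + 1, &endp, 10);
	if (endp == p + 1)
		return -1;	/* no digits after '+' */
	return n;
}
```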
Re: [PATCH 1/2] fetch-object: provide only one fetching function
> Instead of explaining why the new convention is better to justify > (2), the above three lines handwave by saying "more flexible" > twice. We should do better. > > fetch-object: unify fetch_object[s] functions > > There are fetch_object() and fetch_objects() helpers in > fetch-object.h; as the latter takes "struct oid_array", > the former cannot be made into a thin wrapper around the > latter without an extra allocation and set-up cost. > > Update fetch_objects() to take an array of "struct > object_id" and number of elements in it as separate > parameters, remove fetch_object(), and adjust all existing > callers of these functions to use the new fetch_objects(). > > perhaps? Thanks - your explanation is much clearer than mine. Let me know if you want a reroll (or if you can update the commit message yourself, that's fine too).
Re: [PATCH v1] fsmonitor: update GIT_TEST_FSMONITOR support
Ben Peart writes: > Rename GIT_FSMONITOR_TEST to GIT_TEST_FSMONITOR for consistency with the > other GIT_TEST_ special setups and properly document its use. Makes sense. Thanks for such an attention to detail. > > Signed-off-by: Ben Peart > --- > > Notes: > Base Ref: v2.19.0 > Web-Diff: https://github.com/benpeart/git/commit/311484a684 > Checkout: git fetch https://github.com/benpeart/git git-test-fsmonitor-v1 > && git checkout 311484a684 > > config.c| 2 +- > t/README| 4 > t/t1700-split-index.sh | 2 +- > t/t7519-status-fsmonitor.sh | 2 +- > 4 files changed, 7 insertions(+), 3 deletions(-) > > diff --git a/config.c b/config.c > index 3461993f0a..3555c63f28 100644 > --- a/config.c > +++ b/config.c > @@ -2278,7 +2278,7 @@ int git_config_get_max_percent_split_change(void) > int git_config_get_fsmonitor(void) > { > if (git_config_get_pathname("core.fsmonitor", _fsmonitor)) > - core_fsmonitor = getenv("GIT_FSMONITOR_TEST"); > + core_fsmonitor = getenv("GIT_TEST_FSMONITOR"); > > if (core_fsmonitor && !*core_fsmonitor) > core_fsmonitor = NULL; > diff --git a/t/README b/t/README > index 9028b47d92..545438c820 100644 > --- a/t/README > +++ b/t/README > @@ -319,6 +319,10 @@ GIT_TEST_OE_DELTA_SIZE= exercises the uncomon > pack-objects code > path where deltas larger than this limit require extra memory > allocation for bookkeeping. > > +GIT_TEST_FSMONITOR=$PWD/t7519/fsmonitor-all exercises the fsmonitor > +code path for utilizing a file system monitor to speed up detecting > +new or changed files. 
> + > Naming Tests > > > diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh > index b3b4d83eaf..f6a856f24c 100755 > --- a/t/t1700-split-index.sh > +++ b/t/t1700-split-index.sh > @@ -6,7 +6,7 @@ test_description='split index mode tests' > > # We need total control of index splitting here > sane_unset GIT_TEST_SPLIT_INDEX > -sane_unset GIT_FSMONITOR_TEST > +sane_unset GIT_TEST_FSMONITOR > > test_expect_success 'enable split index' ' > git config splitIndex.maxPercentChange 100 && > diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh > index 756beb0d8e..d77012ea6d 100755 > --- a/t/t7519-status-fsmonitor.sh > +++ b/t/t7519-status-fsmonitor.sh > @@ -8,7 +8,7 @@ test_description='git status with file system watcher' > # To run the entire git test suite using fsmonitor: > # > # copy t/t7519/fsmonitor-all to a location in your path and then set > -# GIT_FSMONITOR_TEST=fsmonitor-all and run your tests. > +# GIT_TEST_FSMONITOR=fsmonitor-all and run your tests. > # > > # Note, after "git reset --hard HEAD" no extensions exist other than 'TREE' > > base-commit: 1d4361b0f344188ab5eec6dcea01f61a3a3a1670
[PATCH v2 01/11] multi-pack-index: add 'verify' verb
From: Derrick Stolee The multi-pack-index builtin writes multi-pack-index files, and uses a 'write' verb to do so. Add a 'verify' verb that checks this file matches the contents of the pack-indexes it replaces. The current implementation is a no-op, but will be extended in small increments in later commits. Signed-off-by: Derrick Stolee --- Documentation/git-multi-pack-index.txt | 10 ++ builtin/multi-pack-index.c | 4 +++- midx.c | 13 + midx.h | 1 + t/t5319-multi-pack-index.sh| 8 5 files changed, 35 insertions(+), 1 deletion(-) diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt index 1f97e79912..f7778a2c85 100644 --- a/Documentation/git-multi-pack-index.txt +++ b/Documentation/git-multi-pack-index.txt @@ -27,6 +27,10 @@ write:: When given as the verb, write a new MIDX file to `/packs/multi-pack-index`. +verify:: + When given as the verb, verify the contents of the MIDX file + at `/packs/multi-pack-index`. + EXAMPLES @@ -43,6 +47,12 @@ $ git multi-pack-index write $ git multi-pack-index --object-dir write --- +* Verify the MIDX file for the packfiles in the current .git folder. 
++ +--- +$ git multi-pack-index verify +--- + SEE ALSO diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c index 2633efd95d..fca70f8e4f 100644 --- a/builtin/multi-pack-index.c +++ b/builtin/multi-pack-index.c @@ -5,7 +5,7 @@ #include "midx.h" static char const * const builtin_multi_pack_index_usage[] = { - N_("git multi-pack-index [--object-dir=] write"), + N_("git multi-pack-index [--object-dir=] (write|verify)"), NULL }; @@ -42,6 +42,8 @@ int cmd_multi_pack_index(int argc, const char **argv, if (!strcmp(argv[0], "write")) return write_midx_file(opts.object_dir); + if (!strcmp(argv[0], "verify")) + return verify_midx_file(opts.object_dir); die(_("unrecognized verb: %s"), argv[0]); } diff --git a/midx.c b/midx.c index f3e8dbc108..b253bed517 100644 --- a/midx.c +++ b/midx.c @@ -928,3 +928,16 @@ void clear_midx_file(const char *object_dir) free(midx); } + +int verify_midx_error; + +int verify_midx_file(const char *object_dir) +{ + struct multi_pack_index *m = load_multi_pack_index(object_dir, 1); + verify_midx_error = 0; + + if (!m) + return 0; + + return verify_midx_error; +} diff --git a/midx.h b/midx.h index a210f1af2a..ce80b91c68 100644 --- a/midx.h +++ b/midx.h @@ -43,5 +43,6 @@ int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, i int write_midx_file(const char *object_dir); void clear_midx_file(const char *object_dir); +int verify_midx_file(const char *object_dir); #endif diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 6f56b38674..1c4e0e6d31 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -150,6 +150,10 @@ test_expect_success 'write midx with twelve packs' ' compare_results_with_midx "twelve packs" +test_expect_success 'verify multi-pack-index success' ' + git multi-pack-index verify --object-dir=$objdir +' + test_expect_success 'repack removes multi-pack-index' ' test_path_is_file $objdir/pack/multi-pack-index && git repack -adf && @@ -214,4 +218,8 @@ 
test_expect_success 'force some 64-bit offsets with pack-objects' ' midx_read_expect 1 63 5 objects64 " large-offsets" ' +test_expect_success 'verify multi-pack-index with 64-bit offsets' ' + git multi-pack-index verify --object-dir=objects64 +' + test_done -- gitgitgadget
[PATCH v2 10/11] multi-pack-index: report progress during 'verify'
From: Derrick Stolee When verifying a multi-pack-index, the only action that takes significant time is checking the object offsets. For example, to verify a multi-pack-index containing 6.2 million objects in the Linux kernel repository takes 1.3 seconds on my machine. 99% of that time is spent looking up object offsets in each of the packfiles and comparing them to the multi-pack-index offset. Add a progress indicator for that section of the 'verify' verb. Signed-off-by: Derrick Stolee --- midx.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/midx.c b/midx.c index 47e7e6113a..4d4c930522 100644 --- a/midx.c +++ b/midx.c @@ -7,6 +7,7 @@ #include "object-store.h" #include "sha1-lookup.h" #include "midx.h" +#include "progress.h" #define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */ #define MIDX_VERSION 1 @@ -940,6 +941,7 @@ static void midx_report(const char *fmt, ...) int verify_midx_file(const char *object_dir) { uint32_t i; + struct progress *progress = NULL; struct multi_pack_index *m = load_multi_pack_index(object_dir, 1); verify_midx_error = 0; @@ -971,6 +973,7 @@ int verify_midx_file(const char *object_dir) i, oid_to_hex(), oid_to_hex(), i + 1); } + progress = start_progress(_("Verifying object offsets"), m->num_objects); for (i = 0; i < m->num_objects; i++) { struct object_id oid; struct pack_entry e; @@ -995,7 +998,10 @@ int verify_midx_file(const char *object_dir) if (m_offset != p_offset) midx_report(_("incorrect object offset for oid[%d] = %s: %"PRIx64" != %"PRIx64), i, oid_to_hex(), m_offset, p_offset); + + display_progress(progress, i + 1); } + stop_progress(); return verify_midx_error; } -- gitgitgadget
[PATCH v2 04/11] multi-pack-index: verify packname order
From: Derrick Stolee The final check we make while loading a multi-pack-index is that the packfile names are in lexicographical order. Make this error be a die() instead. In order to test this condition, we need multiple packfiles. Earlier in t5319-multi-pack-index.sh, we tested the interaction with 'git repack' but this limits us to one packfile in our object dir. Move these repack tests until after the 'verify' tests. Signed-off-by: Derrick Stolee --- midx.c | 6 ++ t/t5319-multi-pack-index.sh | 10 ++ 2 files changed, 12 insertions(+), 4 deletions(-) diff --git a/midx.c b/midx.c index 8b054b39ab..e655a15aed 100644 --- a/midx.c +++ b/midx.c @@ -157,12 +157,10 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local cur_pack_name += strlen(cur_pack_name) + 1; - if (i && strcmp(m->pack_names[i], m->pack_names[i - 1]) <= 0) { - error(_("multi-pack-index pack names out of order: '%s' before '%s'"), + if (i && strcmp(m->pack_names[i], m->pack_names[i - 1]) <= 0) + die(_("multi-pack-index pack names out of order: '%s' before '%s'"), m->pack_names[i - 1], m->pack_names[i]); - goto cleanup_fail; - } } return m; diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index c54b6e7188..01a3cd6b00 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -181,6 +181,11 @@ MIDX_BYTE_CHUNK_COUNT=6 MIDX_HEADER_SIZE=12 MIDX_BYTE_CHUNK_ID=$MIDX_HEADER_SIZE MIDX_BYTE_CHUNK_OFFSET=$(($MIDX_HEADER_SIZE + 4)) +MIDX_NUM_CHUNKS=5 +MIDX_CHUNK_LOOKUP_WIDTH=12 +MIDX_OFFSET_PACKNAMES=$(($MIDX_HEADER_SIZE + \ +$MIDX_NUM_CHUNKS * $MIDX_CHUNK_LOOKUP_WIDTH)) +MIDX_BYTE_PACKNAME_ORDER=$(($MIDX_OFFSET_PACKNAMES + 2)) test_expect_success 'verify bad version' ' corrupt_midx_and_verify $MIDX_BYTE_VERSION "\00" $objdir \ @@ -212,6 +217,11 @@ test_expect_success 'verify invalid chunk offset' ' "invalid chunk offset (too large)" ' +test_expect_success 'verify packnames out of order' ' + corrupt_midx_and_verify $MIDX_BYTE_PACKNAME_ORDER "z" 
$objdir \ + "pack names out of order" +' + test_expect_success 'repack removes multi-pack-index' ' test_path_is_file $objdir/pack/multi-pack-index && git repack -adf && -- gitgitgadget
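The byte positions these corruption tests poke at are computed by hand in shell. The same arithmetic in C, assuming the layout the test assumes (a 12-byte header followed by a chunk lookup table of 12-byte rows — 4-byte chunk id plus 8-byte offset — with the terminating row counted among the five):

```c
#include <stdint.h>

#define MIDX_HEADER_SIZE 12
#define MIDX_CHUNK_LOOKUP_WIDTH 12	/* 4-byte chunk id + 8-byte offset */

/*
 * Byte offset of the pack-names chunk, given the number of rows in the
 * chunk lookup table (terminating row included, per the test's count).
 */
static uint32_t midx_packnames_offset(uint32_t lookup_rows)
{
	return MIDX_HEADER_SIZE + lookup_rows * MIDX_CHUNK_LOOKUP_WIDTH;
}
```

With the five rows the test assumes this gives 72, so MIDX_BYTE_PACKNAME_ORDER lands at byte 74 — two bytes into the first pack name.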
[PATCH v2 09/11] multi-pack-index: verify object offsets
From: Derrick Stolee The 'git multi-pack-index verify' command must verify the object offsets stored in the multi-pack-index are correct. There are two ways the offset chunk can be incorrect: the pack-int-id and the object offset. Replace the BUG() statement with a die() statement, now that we may hit a bad pack-int-id during a 'verify' command on a corrupt multi-pack-index, and it is covered by a test. Signed-off-by: Derrick Stolee --- midx.c | 29 - t/t5319-multi-pack-index.sh | 27 +++ 2 files changed, 55 insertions(+), 1 deletion(-) diff --git a/midx.c b/midx.c index 80094c02a7..47e7e6113a 100644 --- a/midx.c +++ b/midx.c @@ -197,7 +197,8 @@ int prepare_midx_pack(struct multi_pack_index *m, uint32_t pack_int_id) struct strbuf pack_name = STRBUF_INIT; if (pack_int_id >= m->num_packs) - BUG("bad pack-int-id"); + die(_("bad pack-int-id: %u (%u total packs"), + pack_int_id, m->num_packs); if (m->packs[pack_int_id]) return 0; @@ -970,5 +971,31 @@ int verify_midx_file(const char *object_dir) i, oid_to_hex(), oid_to_hex(), i + 1); } + for (i = 0; i < m->num_objects; i++) { + struct object_id oid; + struct pack_entry e; + off_t m_offset, p_offset; + + nth_midxed_object_oid(, m, i); + if (!fill_midx_entry(, , m)) { + midx_report(_("failed to load pack entry for oid[%d] = %s"), + i, oid_to_hex()); + continue; + } + + if (open_pack_index(e.p)) { + midx_report(_("failed to load pack-index for packfile %s"), + e.p->pack_name); + break; + } + + m_offset = e.offset; + p_offset = find_pack_entry_one(oid.hash, e.p); + + if (m_offset != p_offset) + midx_report(_("incorrect object offset for oid[%d] = %s: %"PRIx64" != %"PRIx64), + i, oid_to_hex(), m_offset, p_offset); + } + return verify_midx_error; } diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index a968b9a959..828c240389 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -176,6 +176,7 @@ test_expect_success 'verify bad signature' ' ' HASH_LEN=20 +NUM_OBJECTS=74 
MIDX_BYTE_VERSION=4 MIDX_BYTE_OID_VERSION=5 MIDX_BYTE_CHUNK_COUNT=6 @@ -192,6 +193,10 @@ MIDX_OID_FANOUT_WIDTH=4 MIDX_BYTE_OID_FANOUT_ORDER=$((MIDX_OFFSET_OID_FANOUT + 250 * $MIDX_OID_FANOUT_WIDTH + 1)) MIDX_OFFSET_OID_LOOKUP=$(($MIDX_OFFSET_OID_FANOUT + 256 * $MIDX_OID_FANOUT_WIDTH)) MIDX_BYTE_OID_LOOKUP=$(($MIDX_OFFSET_OID_LOOKUP + 16 * $HASH_LEN)) +MIDX_OFFSET_OBJECT_OFFSETS=$(($MIDX_OFFSET_OID_LOOKUP + $NUM_OBJECTS * $HASH_LEN)) +MIDX_OFFSET_WIDTH=8 +MIDX_BYTE_PACK_INT_ID=$(($MIDX_OFFSET_OBJECT_OFFSETS + 16 * $MIDX_OFFSET_WIDTH + 2)) +MIDX_BYTE_OFFSET=$(($MIDX_OFFSET_OBJECT_OFFSETS + 16 * $MIDX_OFFSET_WIDTH + 6)) test_expect_success 'verify bad version' ' corrupt_midx_and_verify $MIDX_BYTE_VERSION "\00" $objdir \ @@ -243,6 +248,16 @@ test_expect_success 'verify oid lookup out of order' ' "oid lookup out of order" ' +test_expect_success 'verify incorrect pack-int-id' ' + corrupt_midx_and_verify $MIDX_BYTE_PACK_INT_ID "\07" $objdir \ + "bad pack-int-id" +' + +test_expect_success 'verify incorrect offset' ' + corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\07" $objdir \ + "incorrect object offset" +' + test_expect_success 'repack removes multi-pack-index' ' test_path_is_file $objdir/pack/multi-pack-index && git repack -adf && @@ -310,4 +325,16 @@ test_expect_success 'verify multi-pack-index with 64-bit offsets' ' git multi-pack-index verify --object-dir=objects64 ' +NUM_OBJECTS=63 +MIDX_OFFSET_OID_FANOUT=$((MIDX_OFFSET_PACKNAMES + 54)) +MIDX_OFFSET_OID_LOOKUP=$((MIDX_OFFSET_OID_FANOUT + 256 * $MIDX_OID_FANOUT_WIDTH)) +MIDX_OFFSET_OBJECT_OFFSETS=$(($MIDX_OFFSET_OID_LOOKUP + $NUM_OBJECTS * $HASH_LEN)) +MIDX_OFFSET_LARGE_OFFSETS=$(($MIDX_OFFSET_OBJECT_OFFSETS + $NUM_OBJECTS * $MIDX_OFFSET_WIDTH)) +MIDX_BYTE_LARGE_OFFSET=$(($MIDX_OFFSET_LARGE_OFFSETS + 3)) + +test_expect_success 'verify incorrect 64-bit offset' ' + corrupt_midx_and_verify $MIDX_BYTE_LARGE_OFFSET "\07" objects64 \ + "incorrect object offset" +' + test_done -- gitgitgadget
[PATCH v2 08/11] multi-pack-index: fix 32-bit vs 64-bit size check
From: Derrick Stolee When loading a 64-bit offset, we intend to check that off_t can store the resulting offset. However, the condition accidentally checks the 32-bit offset to see if it is smaller than a 64-bit value. Fix it, and this will be covered by a test in the 'git multi-pack-index verify' command in a later commit. Signed-off-by: Derrick Stolee --- midx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/midx.c b/midx.c index 06d5cfc826..80094c02a7 100644 --- a/midx.c +++ b/midx.c @@ -236,7 +236,7 @@ static off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos) offset32 = get_be32(offset_data + sizeof(uint32_t)); if (m->chunk_large_offsets && offset32 & MIDX_LARGE_OFFSET_NEEDED) { - if (sizeof(offset32) < sizeof(uint64_t)) + if (sizeof(off_t) < sizeof(uint64_t)) die(_("multi-pack-index stores a 64-bit offset, but off_t is too small")); offset32 ^= MIDX_LARGE_OFFSET_NEEDED; -- gitgitgadget
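To make the fixed condition concrete: the 32-bit offset slot either holds the offset directly or, with its top bit (MIDX_LARGE_OFFSET_NEEDED, which I read as 1<<31 in midx.c) set, holds an index into the large-offset chunk. A sketch of the decode, including the corrected sizeof(off_t) check (the pre-fix code compared sizeof(offset32), which is always 4, against 8):

```c
#include <stdint.h>
#include <sys/types.h>

#define MIDX_LARGE_OFFSET_NEEDED 0x80000000u

/*
 * Decode a multi-pack-index object offset.  Returns -1 when a large
 * offset is flagged but cannot be resolved (no table, index out of
 * range, or off_t too small to hold a 64-bit value).
 */
static int64_t decode_midx_offset(uint32_t offset32,
				  const uint64_t *large_offsets,
				  uint32_t num_large)
{
	if (!(offset32 & MIDX_LARGE_OFFSET_NEEDED))
		return offset32;
	if (sizeof(off_t) < sizeof(uint64_t))
		return -1;	/* the check the patch above fixes */
	offset32 ^= MIDX_LARGE_OFFSET_NEEDED;
	if (!large_offsets || offset32 >= num_large)
		return -1;
	return (int64_t)large_offsets[offset32];
}
```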
[PATCH v2 05/11] multi-pack-index: verify missing pack
From: Derrick Stolee Signed-off-by: Derrick Stolee --- midx.c | 16 t/t5319-multi-pack-index.sh | 5 + 2 files changed, 21 insertions(+) diff --git a/midx.c b/midx.c index e655a15aed..a02b19efc1 100644 --- a/midx.c +++ b/midx.c @@ -926,13 +926,29 @@ void clear_midx_file(const char *object_dir) int verify_midx_error; +static void midx_report(const char *fmt, ...) +{ + va_list ap; + verify_midx_error = 1; + va_start(ap, fmt); + vfprintf(stderr, fmt, ap); + fprintf(stderr, "\n"); + va_end(ap); +} + int verify_midx_file(const char *object_dir) { + uint32_t i; struct multi_pack_index *m = load_multi_pack_index(object_dir, 1); verify_midx_error = 0; if (!m) return 0; + for (i = 0; i < m->num_packs; i++) { + if (prepare_midx_pack(m, i)) + midx_report("failed to load pack in position %d", i); + } + return verify_midx_error; } diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 01a3cd6b00..0a566afb05 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -222,6 +222,11 @@ test_expect_success 'verify packnames out of order' ' "pack names out of order" ' +test_expect_success 'verify packnames out of order' ' + corrupt_midx_and_verify $MIDX_BYTE_PACKNAME_ORDER "a" $objdir \ + "failed to load pack" +' + test_expect_success 'repack removes multi-pack-index' ' test_path_is_file $objdir/pack/multi-pack-index && git repack -adf && -- gitgitgadget
[PATCH v2 11/11] fsck: verify multi-pack-index
From: Derrick Stolee When core.multiPackIndex is true, we may have a multi-pack-index in our object directory. Add calls to 'git multi-pack-index verify' at the end of 'git fsck' if so. Signed-off-by: Derrick Stolee --- builtin/fsck.c | 18 ++ t/t5319-multi-pack-index.sh | 13 - 2 files changed, 30 insertions(+), 1 deletion(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 63c8578cc1..06eb421720 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -848,5 +848,23 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) } } + if (!git_config_get_bool("core.multipackindex", ) && i) { + struct child_process midx_verify = CHILD_PROCESS_INIT; + const char *midx_argv[] = { "multi-pack-index", "verify", NULL, NULL, NULL }; + + midx_verify.argv = midx_argv; + midx_verify.git_cmd = 1; + if (run_command(_verify)) + errors_found |= ERROR_COMMIT_GRAPH; + + prepare_alt_odb(the_repository); + for (alt = the_repository->objects->alt_odb_list; alt; alt = alt->next) { + midx_argv[2] = "--object-dir"; + midx_argv[3] = alt->path; + if (run_command(_verify)) + errors_found |= ERROR_COMMIT_GRAPH; + } + } + return errors_found; } diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 828c240389..bd8e841b81 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -160,12 +160,17 @@ corrupt_midx_and_verify() { DATA="${2:-\0}" && OBJDIR=$3 && GREPSTR="$4" && + COMMAND="$5" && + if test -z "$COMMAND" + then + COMMAND="git multi-pack-index verify --object-dir=$OBJDIR" + fi && FILE=$OBJDIR/pack/multi-pack-index && chmod a+w $FILE && test_when_finished mv midx-backup $FILE && cp $FILE midx-backup && printf "$DATA" | dd of="$FILE" bs=1 seek="$POS" conv=notrunc && - test_must_fail git multi-pack-index verify --object-dir=$OBJDIR 2>test_err && + test_must_fail $COMMAND 2>test_err && grep -v "^+" test_err >err && test_i18ngrep "$GREPSTR" err } @@ -258,6 +263,12 @@ test_expect_success 'verify incorrect offset' ' "incorrect object offset" ' 
+test_expect_success 'git-fsck incorrect offset' ' + corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\07" $objdir \ + "incorrect object offset" \ + "git -c core.multipackindex=true fsck" +' + test_expect_success 'repack removes multi-pack-index' ' test_path_is_file $objdir/pack/multi-pack-index && git repack -adf && -- gitgitgadget
[PATCH v2 07/11] multi-pack-index: verify oid lookup order
From: Derrick Stolee Signed-off-by: Derrick Stolee --- midx.c | 11 +++ t/t5319-multi-pack-index.sh | 8 2 files changed, 19 insertions(+) diff --git a/midx.c b/midx.c index dfd26b4d74..06d5cfc826 100644 --- a/midx.c +++ b/midx.c @@ -959,5 +959,16 @@ int verify_midx_file(const char *object_dir) i, oid_fanout1, oid_fanout2, i + 1); } + for (i = 0; i < m->num_objects - 1; i++) { + struct object_id oid1, oid2; + + nth_midxed_object_oid(, m, i); + nth_midxed_object_oid(, m, i + 1); + + if (oidcmp(, ) >= 0) + midx_report(_("oid lookup out of order: oid[%d] = %s >= %s = oid[%d]"), + i, oid_to_hex(), oid_to_hex(), i + 1); + } + return verify_midx_error; } diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 47a54e138d..a968b9a959 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -175,6 +175,7 @@ test_expect_success 'verify bad signature' ' "multi-pack-index signature" ' +HASH_LEN=20 MIDX_BYTE_VERSION=4 MIDX_BYTE_OID_VERSION=5 MIDX_BYTE_CHUNK_COUNT=6 @@ -189,6 +190,8 @@ MIDX_BYTE_PACKNAME_ORDER=$(($MIDX_OFFSET_PACKNAMES + 2)) MIDX_OFFSET_OID_FANOUT=$(($MIDX_OFFSET_PACKNAMES + 652)) MIDX_OID_FANOUT_WIDTH=4 MIDX_BYTE_OID_FANOUT_ORDER=$((MIDX_OFFSET_OID_FANOUT + 250 * $MIDX_OID_FANOUT_WIDTH + 1)) +MIDX_OFFSET_OID_LOOKUP=$(($MIDX_OFFSET_OID_FANOUT + 256 * $MIDX_OID_FANOUT_WIDTH)) +MIDX_BYTE_OID_LOOKUP=$(($MIDX_OFFSET_OID_LOOKUP + 16 * $HASH_LEN)) test_expect_success 'verify bad version' ' corrupt_midx_and_verify $MIDX_BYTE_VERSION "\00" $objdir \ @@ -235,6 +238,11 @@ test_expect_success 'verify oid fanout out of order' ' "oid fanout out of order" ' +test_expect_success 'verify oid lookup out of order' ' + corrupt_midx_and_verify $MIDX_BYTE_OID_LOOKUP "\00" $objdir \ + "oid lookup out of order" +' + test_expect_success 'repack removes multi-pack-index' ' test_path_is_file $objdir/pack/multi-pack-index && git repack -adf && -- gitgitgadget
[PATCH v2 03/11] multi-pack-index: verify corrupt chunk lookup table
From: Derrick Stolee Signed-off-by: Derrick Stolee --- midx.c | 3 +++ t/t5319-multi-pack-index.sh | 13 + 2 files changed, 16 insertions(+) diff --git a/midx.c b/midx.c index ec78254bb6..8b054b39ab 100644 --- a/midx.c +++ b/midx.c @@ -100,6 +100,9 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local uint64_t chunk_offset = get_be64(m->data + MIDX_HEADER_SIZE + 4 + MIDX_CHUNKLOOKUP_WIDTH * i); + if (chunk_offset >= m->data_len) + die(_("invalid chunk offset (too large)")); + switch (chunk_id) { case MIDX_CHUNKID_PACKNAMES: m->chunk_pack_names = m->data + chunk_offset; diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index e04b5f43a2..c54b6e7188 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -178,6 +178,9 @@ test_expect_success 'verify bad signature' ' MIDX_BYTE_VERSION=4 MIDX_BYTE_OID_VERSION=5 MIDX_BYTE_CHUNK_COUNT=6 +MIDX_HEADER_SIZE=12 +MIDX_BYTE_CHUNK_ID=$MIDX_HEADER_SIZE +MIDX_BYTE_CHUNK_OFFSET=$(($MIDX_HEADER_SIZE + 4)) test_expect_success 'verify bad version' ' corrupt_midx_and_verify $MIDX_BYTE_VERSION "\00" $objdir \ @@ -199,6 +202,16 @@ test_expect_success 'verify extended chunk count' ' "terminating multi-pack-index chunk id appears earlier than expected" ' +test_expect_success 'verify missing required chunk' ' + corrupt_midx_and_verify $MIDX_BYTE_CHUNK_ID "\01" $objdir \ + "missing required" +' + +test_expect_success 'verify invalid chunk offset' ' + corrupt_midx_and_verify $MIDX_BYTE_CHUNK_OFFSET "\01" $objdir \ + "invalid chunk offset (too large)" +' + test_expect_success 'repack removes multi-pack-index' ' test_path_is_file $objdir/pack/multi-pack-index && git repack -adf && -- gitgitgadget
[PATCH v2 02/11] multi-pack-index: verify bad header
From: Derrick Stolee When verifying if a multi-pack-index file is valid, we want the command to fail to signal an invalid file. Previously, we wrote an error to stderr and continued as if we had no multi-pack-index. Now, die() instead of error(). Add tests that check corrupted headers in a few ways: * Bad signature * Bad file version * Bad hash version * Truncated hash count * Extended hash count Signed-off-by: Derrick Stolee --- midx.c | 18 +-- t/t5319-multi-pack-index.sh | 46 - 2 files changed, 51 insertions(+), 13 deletions(-) diff --git a/midx.c b/midx.c index b253bed517..ec78254bb6 100644 --- a/midx.c +++ b/midx.c @@ -76,24 +76,18 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local m->local = local; m->signature = get_be32(m->data); - if (m->signature != MIDX_SIGNATURE) { - error(_("multi-pack-index signature 0x%08x does not match signature 0x%08x"), + if (m->signature != MIDX_SIGNATURE) + die(_("multi-pack-index signature 0x%08x does not match signature 0x%08x"), m->signature, MIDX_SIGNATURE); - goto cleanup_fail; - } m->version = m->data[MIDX_BYTE_FILE_VERSION]; - if (m->version != MIDX_VERSION) { - error(_("multi-pack-index version %d not recognized"), + if (m->version != MIDX_VERSION) + die(_("multi-pack-index version %d not recognized"), m->version); - goto cleanup_fail; - } hash_version = m->data[MIDX_BYTE_HASH_VERSION]; - if (hash_version != MIDX_HASH_VERSION) { - error(_("hash version %u does not match"), hash_version); - goto cleanup_fail; - } + if (hash_version != MIDX_HASH_VERSION) + die(_("hash version %u does not match"), hash_version); m->hash_len = MIDX_HASH_LEN; m->num_chunks = m->data[MIDX_BYTE_NUM_CHUNKS]; diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 1c4e0e6d31..e04b5f43a2 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -154,6 +154,51 @@ test_expect_success 'verify multi-pack-index success' ' git multi-pack-index verify --object-dir=$objdir ' +# 
usage: corrupt_midx_and_verify +corrupt_midx_and_verify() { + POS=$1 && + DATA="${2:-\0}" && + OBJDIR=$3 && + GREPSTR="$4" && + FILE=$OBJDIR/pack/multi-pack-index && + chmod a+w $FILE && + test_when_finished mv midx-backup $FILE && + cp $FILE midx-backup && + printf "$DATA" | dd of="$FILE" bs=1 seek="$POS" conv=notrunc && + test_must_fail git multi-pack-index verify --object-dir=$OBJDIR 2>test_err && + grep -v "^+" test_err >err && + test_i18ngrep "$GREPSTR" err +} + +test_expect_success 'verify bad signature' ' + corrupt_midx_and_verify 0 "\00" $objdir \ + "multi-pack-index signature" +' + +MIDX_BYTE_VERSION=4 +MIDX_BYTE_OID_VERSION=5 +MIDX_BYTE_CHUNK_COUNT=6 + +test_expect_success 'verify bad version' ' + corrupt_midx_and_verify $MIDX_BYTE_VERSION "\00" $objdir \ + "multi-pack-index version" +' + +test_expect_success 'verify bad OID version' ' + corrupt_midx_and_verify $MIDX_BYTE_OID_VERSION "\02" $objdir \ + "hash version" +' + +test_expect_success 'verify truncated chunk count' ' + corrupt_midx_and_verify $MIDX_BYTE_CHUNK_COUNT "\01" $objdir \ + "missing required" +' + +test_expect_success 'verify extended chunk count' ' + corrupt_midx_and_verify $MIDX_BYTE_CHUNK_COUNT "\07" $objdir \ + "terminating multi-pack-index chunk id appears earlier than expected" +' + test_expect_success 'repack removes multi-pack-index' ' test_path_is_file $objdir/pack/multi-pack-index && git repack -adf && @@ -191,7 +236,6 @@ test_expect_success 'multi-pack-index in an alternate' ' compare_results_with_midx "with alternate (remote midx)" - # usage: corrupt_data [] corrupt_data () { file=$1 -- gitgitgadget
[PATCH v2 06/11] multi-pack-index: verify oid fanout order
From: Derrick Stolee Signed-off-by: Derrick Stolee --- midx.c | 9 + t/t5319-multi-pack-index.sh | 8 2 files changed, 17 insertions(+) diff --git a/midx.c b/midx.c index a02b19efc1..dfd26b4d74 100644 --- a/midx.c +++ b/midx.c @@ -950,5 +950,14 @@ int verify_midx_file(const char *object_dir) midx_report("failed to load pack in position %d", i); } + for (i = 0; i < 255; i++) { + uint32_t oid_fanout1 = ntohl(m->chunk_oid_fanout[i]); + uint32_t oid_fanout2 = ntohl(m->chunk_oid_fanout[i + 1]); + + if (oid_fanout1 > oid_fanout2) + midx_report(_("oid fanout out of order: fanout[%d] = %"PRIx32" > %"PRIx32" = fanout[%d]"), + i, oid_fanout1, oid_fanout2, i + 1); + } + return verify_midx_error; } diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 0a566afb05..47a54e138d 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -186,6 +186,9 @@ MIDX_CHUNK_LOOKUP_WIDTH=12 MIDX_OFFSET_PACKNAMES=$(($MIDX_HEADER_SIZE + \ $MIDX_NUM_CHUNKS * $MIDX_CHUNK_LOOKUP_WIDTH)) MIDX_BYTE_PACKNAME_ORDER=$(($MIDX_OFFSET_PACKNAMES + 2)) +MIDX_OFFSET_OID_FANOUT=$(($MIDX_OFFSET_PACKNAMES + 652)) +MIDX_OID_FANOUT_WIDTH=4 +MIDX_BYTE_OID_FANOUT_ORDER=$((MIDX_OFFSET_OID_FANOUT + 250 * $MIDX_OID_FANOUT_WIDTH + 1)) test_expect_success 'verify bad version' ' corrupt_midx_and_verify $MIDX_BYTE_VERSION "\00" $objdir \ @@ -227,6 +230,11 @@ test_expect_success 'verify packnames out of order' ' "failed to load pack" ' +test_expect_success 'verify oid fanout out of order' ' + corrupt_midx_and_verify $MIDX_BYTE_OID_FANOUT_ORDER "\01" $objdir \ + "oid fanout out of order" +' + test_expect_success 'repack removes multi-pack-index' ' test_path_is_file $objdir/pack/multi-pack-index && git repack -adf && -- gitgitgadget
[PATCH v2 00/11] Add 'git multi-pack-index verify' command
The multi-pack-index file provides faster lookups in repos with many packfiles by duplicating the information from multiple pack-indexes into a single file. This series allows us to verify a multi-pack-index using 'git multi-pack-index verify' and 'git fsck' (when core.multiPackIndex is true). The pattern for the tests is similar to that found in t5318-commit-graph.sh. During testing, I found a bug in how we check for the size of off_t (we are not actually checking off_t, but instead uint32_t). See "multi-pack-index: fix 32-bit vs 64-bit size check". Thanks to Ævar [1], I added a commit that provides progress updates when checking object offsets. Based on ds/multi-pack-index [1] https://public-inbox.org/git/20180904202729.13900-1-ava...@gmail.com/T/#u Derrick Stolee (11): multi-pack-index: add 'verify' verb multi-pack-index: verify bad header multi-pack-index: verify corrupt chunk lookup table multi-pack-index: verify packname order multi-pack-index: verify missing pack multi-pack-index: verify oid fanout order multi-pack-index: verify oid lookup order multi-pack-index: fix 32-bit vs 64-bit size check multi-pack-index: verify object offsets multi-pack-index: report progress during 'verify' fsck: verify multi-pack-index Documentation/git-multi-pack-index.txt | 10 ++ builtin/fsck.c | 18 builtin/multi-pack-index.c | 4 +- midx.c | 113 midx.h | 1 + t/t5319-multi-pack-index.sh| 136 - 6 files changed, 262 insertions(+), 20 deletions(-) base-commit: 6a22d521260f86dff8fe6f23ab329cebb62ba4f0 Published-As: https://github.com/gitgitgadget/git/releases/tags/pr-34%2Fderrickstolee%2Fmidx%2Fverify-v2 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-34/derrickstolee/midx/verify-v2 Pull-Request: https://github.com/gitgitgadget/git/pull/34 Range-diff vs v1: 1: 8dc38afe2b ! 
1: d8ffd84d67 multi-pack-index: add 'verify' verb @@ -47,7 +47,7 @@ static char const * const builtin_multi_pack_index_usage[] = { - N_("git multi-pack-index [--object-dir=] write"), -+ N_("git multi-pack-index [--object-dir=] [write|verify]"), ++ N_("git multi-pack-index [--object-dir=] (write|verify)"), NULL }; 2: 787e1fb616 ! 2: 9590895830 multi-pack-index: verify bad header @@ -61,10 +61,10 @@ +# usage: corrupt_midx_and_verify +corrupt_midx_and_verify() { -+ POS=$1 -+ DATA="${2:-\0}" -+ OBJDIR=$3 -+ GREPSTR="$4" ++ POS=$1 && ++ DATA="${2:-\0}" && ++ OBJDIR=$3 && ++ GREPSTR="$4" && + FILE=$OBJDIR/pack/multi-pack-index && + chmod a+w $FILE && + test_when_finished mv midx-backup $FILE && 3: b385aa2abf = 3: 2448173844 multi-pack-index: verify corrupt chunk lookup table 4: 37ee24c82b = 4: 947241bfdc multi-pack-index: verify packname order 5: b747da415c = 5: 4058867380 multi-pack-index: verify missing pack 6: 58e5c09468 = 6: ea1c522702 multi-pack-index: verify oid fanout order 7: b21772d054 = 7: 511791de91 multi-pack-index: verify oid lookup order 8: b08d3f0055 = 8: 210649bf83 multi-pack-index: fix 32-bit vs 64-bit size check 9: e1498aea45 ! 9: ef20193d59 multi-pack-index: verify object offsets @@ -21,7 +21,8 @@ if (pack_int_id >= m->num_packs) - BUG("bad pack-int-id"); -+ die(_("bad pack-int-id")); ++ die(_("bad pack-int-id: %u (%u total packs"), ++ pack_int_id, m->num_packs); if (m->packs[pack_int_id]) return 0; 10: acf8cfd632 = 10: 29ebc17161 multi-pack-index: report progress during 'verify' 11: 09d16aff20 ! 
11: 406c88b456 fsck: verify multi-pack-index @@ -40,14 +40,14 @@ --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ - DATA="${2:-\0}" - OBJDIR=$3 - GREPSTR="$4" -+ COMMAND="$5" + DATA="${2:-\0}" && + OBJDIR=$3 && + GREPSTR="$4" && ++ COMMAND="$5" && + if test -z "$COMMAND" + then + COMMAND="git multi-pack-index verify --object-dir=$OBJDIR" -+ fi ++ fi && FILE=$OBJDIR/pack/multi-pack-index && chmod a+w $FILE && test_when_finished mv midx-backup $FILE && -- gitgitgadget
[PATCH v1] preload-index: update GIT_FORCE_PRELOAD_TEST support
Rename GIT_FORCE_PRELOAD_TEST to GIT_TEST_PRELOAD for consistency with the other GIT_TEST_ special setups and properly document its use. Signed-off-by: Ben Peart --- Notes: Base Ref: v2.19.0 Web-Diff: https://github.com/benpeart/git/commit/dcd201b920 Checkout: git fetch https://github.com/benpeart/git git-test-preload-v1 && git checkout dcd201b920 preload-index.c | 2 +- t/README| 3 +++ t/t7519-status-fsmonitor.sh | 4 ++-- 3 files changed, 6 insertions(+), 3 deletions(-) diff --git a/preload-index.c b/preload-index.c index 71cd2437a3..cc8a7333c2 100644 --- a/preload-index.c +++ b/preload-index.c @@ -84,7 +84,7 @@ static void preload_index(struct index_state *index, return; threads = index->cache_nr / THREAD_COST; - if ((index->cache_nr > 1) && (threads < 2) && getenv("GIT_FORCE_PRELOAD_TEST")) + if ((index->cache_nr > 1) && (threads < 2) && git_env_bool("GIT_TEST_PRELOAD", 0)) threads = 2; if (threads < 2) return; diff --git a/t/README b/t/README index 9028b47d92..73fb09560f 100644 --- a/t/README +++ b/t/README @@ -319,6 +319,9 @@ GIT_TEST_OE_DELTA_SIZE= exercises the uncomon pack-objects code path where deltas larger than this limit require extra memory allocation for bookkeeping. +GIT_TEST_PRELOAD= exercises the preload-index code path by +overriding the minimum number of cache entries required per thread. + Naming Tests diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh index 756beb0d8e..9b703d49b5 100755 --- a/t/t7519-status-fsmonitor.sh +++ b/t/t7519-status-fsmonitor.sh @@ -245,9 +245,9 @@ do git config core.preloadIndex $preload_val && if test $preload_val = true then - GIT_FORCE_PRELOAD_TEST=$preload_val; export GIT_FORCE_PRELOAD_TEST + GIT_TEST_PRELOAD=$preload_val; export GIT_TEST_PRELOAD else - unset GIT_FORCE_PRELOAD_TEST + sane_unset GIT_TEST_PRELOAD fi ' base-commit: 1d4361b0f344188ab5eec6dcea01f61a3a3a1670 -- 2.18.0.windows.1
RE: [Question] Signature calculation ignoring parts of binary files
On September 13, 2018 1:52 PM, Junio C Hamano wrote:
> Junio C Hamano writes:
>
> > "Randall S. Becker" writes:
> >
> >> The scenario is slightly different.
> >> 1. Person A gives me a new binary file-1 with fingerprint A1. This
> >> goes into git unchanged.
> >> 2. Person B gives me binary file-2 with fingerprint B2. This does not
> >> go into git yet.
> >> 3. We attempt a git diff between the committed file-1 and uncommitted
> >> file-2 using a textconv implementation that strips what we don't need to
> >> compare.
> >> 4. If file-1 and file-2 have no difference when textconv is used,
> >> file-2 is not added and not committed. It is discarded with impunity,
> >> never to be seen again, although we might whine a lot at the user for
> >> attempting to put file-2 in - but that's not git's issue.
> >
> > You are forgetting that Git is a distributed version control system,
> > aren't you? Person A and B can introduce their "moral equivalent but
> > bytewise different" copies to their repository under the same object
> > name, and you can pull from them--what happens?
> >
> > It is fundamental that one object name given to Git identifies one
> > specific byte sequence contained in an object uniquely. Once you
> > broke that, you no longer have Git.
>
> Having said all that, if you want to keep the original with frills but somehow
> give these bytewise different things that reduce to the same essence (e.g.
> when passed thru a filter like textconv), I suspect a better approach might be
> to store both the "original" and the result of passing the "original" through
> the filter in the object database. In the above example, you'll get two
> "original" objects from person A and person B, plus one "canonical" object
> that are bytewise different from either of these two originals, but what they
> reduce to when you use the filter on them.
>
> Then you record the fact that to derive the "essence" object, you can reduce
> either person A's or person B's "original" through the filter, perhaps by
> using "git notes" attached to the "essence" object, recording the object
> names of these originals (the reason why using notes in this direction is
> because you can mechanically determine which "essence" object any given
> "original" object reduces to---it is just the matter of passing it through
> the filter. But there can be more than one "original" that reduces to the
> same "essence").

I like that idea. It turns the reduced object into a contract. Thanks.
Re: [PATCH v2 1/1] contrib: add coverage-diff script
On Thu, Sep 13, 2018 at 10:56 AM Derrick Stolee via GitGitGadget wrote:
> There have been a few bugs in recent patches what would have been caught
> if the test suite covered those blocks (including a few of mine). I want
> to work towards a "sensible" amount of coverage on new topics. In my opinion,
> this means that any logic should be covered, but the 'die()' blocks in error
> cases do not need to be covered.

The bit about die() blocks is perhaps a bit too general. While it's true
that some die()'s signal very unlikely (or near-impossible) conditions,
others are merely reporting invalid user or other input to the program.
The latter category is often very much worth testing, as the number of
test_must_fail() invocations in the test suite shows. 68a6b3a1bd
(worktree: teach 'move' to override lock when --force given twice,
2018-08-28), which was highlighted in your cover letter, provides a good
example of legitimately testing that a die() is covered.

So, perhaps the above can be toned-down a bit by saying something like:
...but die() blocks covering very unlikely (or near-impossible)
situations may not warrant coverage.

> It is important to not measure the coverage of the codebase by what old code
> is not covered.
Re: [Question] Signature calculation ignoring parts of binary files
Junio C Hamano writes:

> "Randall S. Becker" writes:
>
>> The scenario is slightly different.
>> 1. Person A gives me a new binary file-1 with fingerprint A1. This goes into
>> git unchanged.
>> 2. Person B gives me binary file-2 with fingerprint B2. This does not go
>> into git yet.
>> 3. We attempt a git diff between the committed file-1 and uncommitted file-2
>> using a textconv implementation that strips what we don't need to compare.
>> 4. If file-1 and file-2 have no difference when textconv is used, file-2 is
>> not added and not committed. It is discarded with impunity, never to be seen
>> again, although we might whine a lot at the user for attempting to put
>> file-2 in - but that's not git's issue.
>
> You are forgetting that Git is a distributed version control system,
> aren't you? Person A and B can introduce their "moral equivalent
> but bytewise different" copies to their repository under the same
> object name, and you can pull from them--what happens?
>
> It is fundamental that one object name given to Git identifies one
> specific byte sequence contained in an object uniquely. Once you
> broke that, you no longer have Git.

Having said all that, if you want to keep the original with frills but
somehow give these bytewise different things that reduce to the same
essence (e.g. when passed thru a filter like textconv), I suspect a better
approach might be to store both the "original" and the result of passing
the "original" through the filter in the object database. In the above
example, you'll get two "original" objects from person A and person B,
plus one "canonical" object that are bytewise different from either of
these two originals, but what they reduce to when you use the filter on
them.

Then you record the fact that to derive the "essence" object, you can
reduce either person A's or person B's "original" through the filter,
perhaps by using "git notes" attached to the "essence" object, recording
the object names of these originals (the reason why using notes in this
direction is because you can mechanically determine which "essence" object
any given "original" object reduces to---it is just the matter of passing
it through the filter. But there can be more than one "original" that
reduces to the same "essence").
[PATCH v1] fsmonitor: update GIT_TEST_FSMONITOR support
Rename GIT_FSMONITOR_TEST to GIT_TEST_FSMONITOR for consistency with the other GIT_TEST_ special setups and properly document its use. Signed-off-by: Ben Peart --- Notes: Base Ref: v2.19.0 Web-Diff: https://github.com/benpeart/git/commit/311484a684 Checkout: git fetch https://github.com/benpeart/git git-test-fsmonitor-v1 && git checkout 311484a684 config.c| 2 +- t/README| 4 t/t1700-split-index.sh | 2 +- t/t7519-status-fsmonitor.sh | 2 +- 4 files changed, 7 insertions(+), 3 deletions(-) diff --git a/config.c b/config.c index 3461993f0a..3555c63f28 100644 --- a/config.c +++ b/config.c @@ -2278,7 +2278,7 @@ int git_config_get_max_percent_split_change(void) int git_config_get_fsmonitor(void) { if (git_config_get_pathname("core.fsmonitor", _fsmonitor)) - core_fsmonitor = getenv("GIT_FSMONITOR_TEST"); + core_fsmonitor = getenv("GIT_TEST_FSMONITOR"); if (core_fsmonitor && !*core_fsmonitor) core_fsmonitor = NULL; diff --git a/t/README b/t/README index 9028b47d92..545438c820 100644 --- a/t/README +++ b/t/README @@ -319,6 +319,10 @@ GIT_TEST_OE_DELTA_SIZE= exercises the uncomon pack-objects code path where deltas larger than this limit require extra memory allocation for bookkeeping. +GIT_TEST_FSMONITOR=$PWD/t7519/fsmonitor-all exercises the fsmonitor +code path for utilizing a file system monitor to speed up detecting +new or changed files. 
+ Naming Tests diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh index b3b4d83eaf..f6a856f24c 100755 --- a/t/t1700-split-index.sh +++ b/t/t1700-split-index.sh @@ -6,7 +6,7 @@ test_description='split index mode tests' # We need total control of index splitting here sane_unset GIT_TEST_SPLIT_INDEX -sane_unset GIT_FSMONITOR_TEST +sane_unset GIT_TEST_FSMONITOR test_expect_success 'enable split index' ' git config splitIndex.maxPercentChange 100 && diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh index 756beb0d8e..d77012ea6d 100755 --- a/t/t7519-status-fsmonitor.sh +++ b/t/t7519-status-fsmonitor.sh @@ -8,7 +8,7 @@ test_description='git status with file system watcher' # To run the entire git test suite using fsmonitor: # # copy t/t7519/fsmonitor-all to a location in your path and then set -# GIT_FSMONITOR_TEST=fsmonitor-all and run your tests. +# GIT_TEST_FSMONITOR=fsmonitor-all and run your tests. # # Note, after "git reset --hard HEAD" no extensions exist other than 'TREE' base-commit: 1d4361b0f344188ab5eec6dcea01f61a3a3a1670 -- 2.18.0.windows.1
Re: [PATCH v2 1/1] contrib: add coverage-diff script
"Derrick Stolee via GitGitGadget" writes: > From: Derrick Stolee > > We have coverage targets in our Makefile for using gcov to display line > coverage based on our test suite. The way I like to do it is to run: > > make coverage-test > make coverage-report > > This leaves the repo in a state where every X.c file that was covered has > an X.c.gcov file containing the coverage counts for every line, and "#" > at every uncovered line. > > There have been a few bugs in recent patches what would have been caught > if the test suite covered those blocks (including a few of mine). I want > to work towards a "sensible" amount of coverage on new topics. In my opinion, > this means that any logic should be covered, but the 'die()' blocks in error > cases do not need to be covered. > > It is important to not measure the coverage of the codebase by what old code > is not covered. To help, I created the 'contrib/coverage-diff.sh' script. > After creating the coverage statistics at a version (say, 'topic') you can > then run > > contrib/coverage-diff.sh base topic > > to see the lines added between 'base' and 'topic' that are not covered by the > test suite. The output uses 'git blame -c' format so you can find the commits > responsible and view the line numbers for quick access to the context. > > Signed-off-by: Derrick Stolee > --- > contrib/coverage-diff.sh | 63 > 1 file changed, 63 insertions(+) > create mode 100755 contrib/coverage-diff.sh > > diff --git a/contrib/coverage-diff.sh b/contrib/coverage-diff.sh > new file mode 100755 > index 00..0f226f038c > --- /dev/null > +++ b/contrib/coverage-diff.sh > @@ -0,0 +1,63 @@ > +#!/bin/sh > + > +# Usage: 'contrib/coverage-diff.sh > +# Outputs a list of new lines in version2 compared to version1 that are > +# not covered by the test suite. Assumes you ran > +# 'make coverage-test coverage-report' from root first, so .gcov files exist. 
I presume that it is "from root first while you have a checkout of
version2, so .gcov files for version2 exist in the working tree"?

> +V1=$1
> +V2=$2
> +
> +diff_lines () {
> +	while read line
> +	do
> +		if echo $line | grep -q -e "^@@ -([0-9]+)(,[0-9]+)? \\+([0-9]+)(,[0-9]+)? @@.*"

As you know you are reading from diff output, you do not have to be so
strict in this if condition. It's not like you try to carefully reject a
line that began with "@@" but did not match this pattern in the
corresponding else block.

"read line" does funny things to backslashes and whitespaces in the
payload ("read -r line" sometimes helps), and echoing $line without
quoting will destroy its contents here and in the line below (but you do
not care because all you care about is if it begins with a dash).

> +		then
> +			line_num=$(echo $line \
> +				| awk 'match($0, "@@ -([0-9]+)(,[0-9]+)? \\+([0-9]+)(,[0-9]+)? @@.*", m) { print m[3] }')
> +		else
> +			echo "$line_num:$line"
> +			if ! echo $line | grep -q -e "^-"

If a patch removes a line with a solitary 'n' on it, possibly followed by
a SP and something else, such a line would say "-n something else", and
some implementation of "echo -n something else" would do what this line
does not expect.

A safer way to do this, as you only care if it is a deletion line, is to do

	case "$line" in
	-*)	;;
	*)	line_num=$(( $line_num + 1 ));;
	esac

Also you can make the echoing of "$line_num:$line" above conditional,
which will allow you to lose "grep ':+'" in the pipeline that consumes
output from this thing, e.g.

	case "$line" in
	+*)	echo "$line_num:$line";;
	esac

iff you must write this in shell (but see below).

> +			then
> +				line_num=$(($line_num + 1))
> +			fi
> +		fi
> +	done

I have a feeling that a single Perl script instead of a shell loop that
runs many grep and awk as subprocesses performs better even on Windows,
and it would be more readable and maintainable.

	perl -e '
		my $line_num;
		while (<>) {
			# Hunk header? Grab the beginning in postimage.
			if (/^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@/) {
				$line_num = $1;
				next;
			}
			# Have we seen a hunk? Ignore "diff --git" etc.
			next unless defined $line_num;

			# Deleted line? Ignore.
			if (/^-/) {
				next;
			}
			# Show only the line number of added lines.
			if (/^\+/) {
				print "$line_num\n";
			}
			# Either common context or added line appear in
			# the postimage. Count it.
			$line_num++;
		}
	'

or something like that, given that you seem to only need line
Re: [PATCH v5 00/12] Hash-independent tests (part 3)
On 9/13/2018 1:17 AM, brian m. carlson wrote:
> This is the next in the series of improvements to make tests
> hash-independent. A range-diff is below.
>
> Changes from v4:
> * Add local statements to the &&-chain.
> * Fix a typo in the local statement.
> * Add a helpful comment about why test_detect_hash is hard-coded to SHA-1.

The range-diff and these changes since v4 are reasonable. I'm happy with
the current version.

I'm looking forward to this change getting merged down, as I also want to
use 'test_oid rawsz' for the 'git multi-pack-index verify' tests that are
currently under review (v2 imminent).

Thanks,
-Stolee
Re: [PATCH 3/3] archive: allow archive over HTTP(S) with proto v2
Josh Steadmon writes: > Signed-off-by: Josh Steadmon > --- > builtin/archive.c | 8 +++- > http-backend.c | 10 +- > transport-helper.c | 5 +++-- > 3 files changed, 19 insertions(+), 4 deletions(-) > > diff --git a/builtin/archive.c b/builtin/archive.c > index 73831887d..5fa75b3f7 100644 > --- a/builtin/archive.c > +++ b/builtin/archive.c > @@ -87,7 +87,13 @@ static int run_remote_archiver(int argc, const char **argv, > status = packet_reader_read(); > if (status != PACKET_READ_FLUSH) > die(_("git archive: expected a flush")); > - } > + } else if (version == protocol_v2 && > +starts_with(transport->url, "http")) > + /* > + * Commands over HTTP require two requests, so there's an > + * additional server response to parse. > + */ > + discover_version(); What should happen if the version discovered here is different from v2 or the capabilities offered are different from what we saw before? Perhaps we need some sanity checks, as we on this side of the connection know we are making two requests, and may even end up talking with an instance of "upload-archive" that is different from the one we talked with earlier. 
> diff --git a/http-backend.c b/http-backend.c > index 458642ef7..d62d583c7 100644 > --- a/http-backend.c > +++ b/http-backend.c > @@ -32,6 +32,7 @@ struct rpc_service { > static struct rpc_service rpc_service[] = { > { "upload-pack", "uploadpack", 1, 1 }, > { "receive-pack", "receivepack", 0, -1 }, > + { "upload-archive", "uploadarchive", 1, 1 }, > }; > > static struct string_list *get_parameters(void) > @@ -637,6 +638,12 @@ static void service_rpc(struct strbuf *hdr, char > *service_name) > struct rpc_service *svc = select_service(hdr, service_name); > struct strbuf buf = STRBUF_INIT; > > + if (!strcmp(service_name, "git-upload-archive")) { > + /* git-upload-archive doesn't need --stateless-rpc */ > + argv[1] = "."; > + argv[2] = NULL; > + } > + > strbuf_reset(); > strbuf_addf(, "application/x-git-%s-request", svc->name); > check_content_type(hdr, buf.buf); > @@ -713,7 +720,8 @@ static struct service_cmd { > {"GET", "/objects/pack/pack-[0-9a-f]{40}\\.idx$", get_idx_file}, > > {"POST", "/git-upload-pack$", service_rpc}, > - {"POST", "/git-receive-pack$", service_rpc} > + {"POST", "/git-receive-pack$", service_rpc}, > + {"POST", "/git-upload-archive$", service_rpc}, > }; > > static int bad_request(struct strbuf *hdr, const struct service_cmd *c) > diff --git a/transport-helper.c b/transport-helper.c > index 143ca008c..b4b96fc89 100644 > --- a/transport-helper.c > +++ b/transport-helper.c > @@ -605,7 +605,8 @@ static int process_connect_service(struct transport > *transport, > ret = run_connect(transport, ); > } else if (data->stateless_connect && > (get_protocol_version_config() == protocol_v2) && > -!strcmp("git-upload-pack", name)) { > +(!strcmp("git-upload-pack", name) || > + !strcmp("git-upload-archive", name))) { > strbuf_addf(, "stateless-connect %s\n", name); > ret = run_connect(transport, ); > if (ret) > @@ -639,7 +640,7 @@ static int connect_helper(struct transport *transport, > const char *name, > > /* Get_helper so connect is inited. 
*/ > get_helper(transport); > - if (!data->connect) > + if (!data->connect && !data->stateless_connect) > die(_("operation not supported by protocol")); This is somewhat curious. The upload-pack going over HTTP also is triggered by the same "stateless-connect" remote helper command, as we just saw in the previous hunk, and that support is not new. Why do we need this change then? What's different between the "data" that is obtained by get_helper(transport) for driving upload-pack and upload-archive? Presumably upload-pack was working without this change because it also sets the connect bit on, and upload-archive does not work without this change because it does not? Why do these two behave differently? > if (!process_connect_service(transport, name, exec))
Re: [PATCH v2 1/1] commit-reach: properly peel tags
On 9/13/2018 12:10 PM, Derrick Stolee via GitGitGadget wrote: From: Derrick Stolee The can_all_from_reach_with_flag() algorithm was refactored in 4fbcca4e "commit-reach: make can_all_from_reach... linear" but incorrectly assumed that all objects provided were commits. During a fetch negotiation, ok_to_give_up() in upload-pack.c may provide unpeeled tags to the 'from' array. The current code creates a segfault. Add a direct call to can_all_from_reach_with_flag() in 'test-tool reach' and add a test in t6600-test-reach.sh that demonstrates this segfault. Correct the issue by peeling tags when investigating the initial list of objects in the 'from' array. Signed-off-by: Jeff King Signed-off-by: Derrick Stolee --- commit-reach.c| 33 ++--- t/helper/test-reach.c | 22 +- t/t6600-test-reach.sh | 30 -- 3 files changed, 71 insertions(+), 14 deletions(-) diff --git a/commit-reach.c b/commit-reach.c index 86715c103c..4048a2132a 100644 --- a/commit-reach.c +++ b/commit-reach.c @@ -544,20 +544,39 @@ int can_all_from_reach_with_flag(struct object_array *from, { struct commit **list = NULL; int i; + int nr_commits; int result = 1; ALLOC_ARRAY(list, from->nr); + nr_commits = 0; for (i = 0; i < from->nr; i++) { - list[i] = (struct commit *)from->objects[i].item; + struct object *from_one = from->objects[i].item; - if (parse_commit(list[i]) || - list[i]->generation < min_generation) - return 0; + if (!from_one || from_one->flags & assign_flag) + continue; + + from_one = deref_tag(the_repository, from_one, +"a from object", 0); + if (!from_one || from_one->type != OBJ_COMMIT) { + /* no way to tell if this is reachable by +* looking at the ancestry chain alone, so +* leave a note to ourselves not to worry about +* this object anymore. +*/ + from->objects[i].item->flags |= assign_flag; + continue; + } + + list[nr_commits] = (struct commit *)from_one; + if (parse_commit(list[nr_commits]) || + list[nr_commits]->generation < min_generation) + return 0; /* is this a leak? 
*/

Of course, after sending v2, I see this comment. This is a leak of 'list'
and should be fixed. Not only is it a leak here, it is also a leak in the
'cleanup' section. I'll squash the following into v3, but I'll let v2
simmer for review before rerolling.

diff --git a/commit-reach.c b/commit-reach.c
index 4048a2132a..c457d8d85f 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -569,8 +569,11 @@ int can_all_from_reach_with_flag(struct object_array *from,
 
 		list[nr_commits] = (struct commit *)from_one;
 		if (parse_commit(list[nr_commits]) ||
-		    list[nr_commits]->generation < min_generation)
-			return 0; /* is this a leak? */
+		    list[nr_commits]->generation < min_generation) {
+			result = 0;
+			goto cleanup;
+		}
+
 		nr_commits++;
 	}
 
@@ -623,6 +626,7 @@
 		clear_commit_marks(list[i], RESULT);
 		clear_commit_marks(list[i], assign_flag);
 	}
+	free(list);
 	return result;
 }
Re: [PATCH 2/3] archive: implement protocol v2 archive command
Josh Steadmon writes: > +static int do_v2_command_and_cap(int out) > +{ > + packet_write_fmt(out, "command=archive\n"); > + /* Capability list would go here, if we had any. */ > + packet_delim(out); > +} > + > static int run_remote_archiver(int argc, const char **argv, > const char *remote, const char *exec, > const char *name_hint) > @@ -32,6 +41,7 @@ static int run_remote_archiver(int argc, const char **argv, > struct remote *_remote; > struct packet_reader reader; > enum packet_read_status status; > + enum protocol_version version; > > _remote = remote_get(remote); > if (!_remote->url[0]) > @@ -41,6 +51,11 @@ static int run_remote_archiver(int argc, const char **argv, > > packet_reader_init(, fd[0], NULL, 0, PACKET_READ_CHOMP_NEWLINE); > > + version = discover_version(); The original version of upload-archive that is correctly running on the other end sends either NACK (unable to spawn) or ACK (ready to serve) to us without waiting for us to speak first, so peeking that with this discover_version() is a safe thing to do. > + if (version == protocol_v2) > + do_v2_command_and_cap(fd[1]); > + With proto v2, "server capabilities" have already been collected in server_capabilities_v2 array in discover_version(). We are to pick and ask the capabilities in that function and respond. Right now we do not need to do much, as we saw that very thin implementation of that function above. > /* >* Inject a fake --format field at the beginning of the >* arguments, with the format inferred from our output And then after that, both the original and updated protocol lets us send the archive format and arguments (like revs and pathspecs), followed by a flush packet... > @@ -56,22 +71,24 @@ static int run_remote_archiver(int argc, const char > **argv, > packet_write_fmt(fd[1], "argument %s\n", argv[i]); > packet_flush(fd[1]); ... which is a piece of code shared between the protocol versions that ends here. 
> - status = packet_reader_read(); > - > - if (status == PACKET_READ_FLUSH) > - die(_("git archive: expected ACK/NAK, got a flush packet")); > - if (strcmp(reader.buffer, "ACK")) { > - if (starts_with(reader.buffer, "NACK ")) > - die(_("git archive: NACK %s"), reader.buffer + 5); > - if (starts_with(reader.buffer, "ERR ")) > - die(_("remote error: %s"), reader.buffer + 4); > - die(_("git archive: protocol error")); > + if (version == protocol_v0) { > + status = packet_reader_read(); > + > + if (status == PACKET_READ_FLUSH) > + die(_("git archive: expected ACK/NAK, got a flush > packet")); > + if (strcmp(reader.buffer, "ACK")) { > + if (starts_with(reader.buffer, "NACK ")) > + die(_("git archive: NACK %s"), reader.buffer + > 5); > + if (starts_with(reader.buffer, "ERR ")) > + die(_("remote error: %s"), reader.buffer + 4); > + die(_("git archive: protocol error")); > + } > + > + status = packet_reader_read(); > + if (status != PACKET_READ_FLUSH) > + die(_("git archive: expected a flush")); > } The original protocol lets upload-archive to report failure to spawn the writer backend process and lets us act on it. We do not need a similar support in the updated protocol and instead can jump right into receiving the archive stream because...? 
> -	status = packet_reader_read(&reader);
> -	if (status != PACKET_READ_FLUSH)
> -		die(_("git archive: expected a flush"));
> -
> 	/* Now, start reading from fd[0] and spit it out to stdout */
> 	rv = recv_sideband("archive", fd[0], 1);
> 	rv |= transport_disconnect(transport);
> diff --git a/builtin/upload-archive.c b/builtin/upload-archive.c
> index 25d911635..534e8fd56 100644
> --- a/builtin/upload-archive.c
> +++ b/builtin/upload-archive.c
> @@ -5,6 +5,7 @@
> #include "builtin.h"
> #include "archive.h"
> #include "pkt-line.h"
> +#include "protocol.h"
> #include "sideband.h"
> #include "run-command.h"
> #include "argv-array.h"
> @@ -73,13 +74,53 @@ static ssize_t process_input(int child_fd, int band)
> 	return sz;
> }
>
> +static int handle_v2_command_and_cap(void)
> +{
> +	struct packet_reader reader;
> +	enum packet_read_status status;
> +
> +	packet_reader_init(&reader, 0, NULL, 0, PACKET_READ_CHOMP_NEWLINE);
> +
> +	packet_write_fmt(1, "version 2\n");

This lets the discover_version() on the other side notice that we are speaking version 2.

> +	/*
> +	 * We don't currently send any capabilities, but maybe we could list
> +	 * supported archival formats?
> +	 */
> +	packet_flush(1);
[PATCH v2 1/1] commit-reach: properly peel tags
From: Derrick Stolee

The can_all_from_reach_with_flag() algorithm was refactored in 4fbcca4e
"commit-reach: make can_all_from_reach... linear" but incorrectly
assumed that all objects provided were commits. During a fetch
negotiation, ok_to_give_up() in upload-pack.c may provide unpeeled tags
to the 'from' array. The current code creates a segfault.

Add a direct call to can_all_from_reach_with_flag() in 'test-tool reach'
and add a test in t6600-test-reach.sh that demonstrates this segfault.

Correct the issue by peeling tags when investigating the initial list
of objects in the 'from' array.

Signed-off-by: Jeff King
Signed-off-by: Derrick Stolee
---
 commit-reach.c        | 33 ++---
 t/helper/test-reach.c | 22 +-
 t/t6600-test-reach.sh | 30 --
 3 files changed, 71 insertions(+), 14 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 86715c103c..4048a2132a 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -544,20 +544,39 @@ int can_all_from_reach_with_flag(struct object_array *from,
 {
 	struct commit **list = NULL;
 	int i;
+	int nr_commits;
 	int result = 1;

 	ALLOC_ARRAY(list, from->nr);
+	nr_commits = 0;
 	for (i = 0; i < from->nr; i++) {
-		list[i] = (struct commit *)from->objects[i].item;
+		struct object *from_one = from->objects[i].item;

-		if (parse_commit(list[i]) ||
-		    list[i]->generation < min_generation)
-			return 0;
+		if (!from_one || from_one->flags & assign_flag)
+			continue;
+
+		from_one = deref_tag(the_repository, from_one,
+				     "a from object", 0);
+		if (!from_one || from_one->type != OBJ_COMMIT) {
+			/* no way to tell if this is reachable by
+			 * looking at the ancestry chain alone, so
+			 * leave a note to ourselves not to worry about
+			 * this object anymore.
+			 */
+			from->objects[i].item->flags |= assign_flag;
+			continue;
+		}
+
+		list[nr_commits] = (struct commit *)from_one;
+		if (parse_commit(list[nr_commits]) ||
+		    list[nr_commits]->generation < min_generation)
+			return 0; /* is this a leak? */
+		nr_commits++;
 	}

-	QSORT(list, from->nr, compare_commits_by_gen);
+	QSORT(list, nr_commits, compare_commits_by_gen);

-	for (i = 0; i < from->nr; i++) {
+	for (i = 0; i < nr_commits; i++) {
 		/* DFS from list[i] */
 		struct commit_list *stack = NULL;

@@ -600,7 +619,7 @@ int can_all_from_reach_with_flag(struct object_array *from,
 	}

 cleanup:
-	for (i = 0; i < from->nr; i++) {
+	for (i = 0; i < nr_commits; i++) {
 		clear_commit_marks(list[i], RESULT);
 		clear_commit_marks(list[i], assign_flag);
 	}

diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
index eb21103998..08d2ea68e8 100644
--- a/t/helper/test-reach.c
+++ b/t/helper/test-reach.c
@@ -31,6 +31,7 @@ int cmd__reach(int ac, const char **av)
 	struct object_id oid_A, oid_B;
 	struct commit *A, *B;
 	struct commit_list *X, *Y;
+	struct object_array X_obj = OBJECT_ARRAY_INIT;
 	struct commit **X_array;
 	int X_nr, X_alloc;
 	struct strbuf buf = STRBUF_INIT;
@@ -49,7 +50,8 @@ int cmd__reach(int ac, const char **av)
 	while (strbuf_getline(&buf, stdin) != EOF) {
 		struct object_id oid;
-		struct object *o;
+		struct object *orig;
+		struct object *peeled;
 		struct commit *c;
 		if (buf.len < 3)
 			continue;
@@ -57,14 +59,14 @@
 		if (get_oid_committish(buf.buf + 2, &oid))
 			die("failed to resolve %s", buf.buf + 2);

-		o = parse_object(r, &oid);
-		o = deref_tag_noverify(o);
+		orig = parse_object(r, &oid);
+		peeled = deref_tag_noverify(orig);

-		if (!o)
+		if (!peeled)
 			die("failed to load commit for input %s resulting in oid %s\n",
 			    buf.buf, oid_to_hex(&oid));

-		c = object_as_type(r, o, OBJ_COMMIT, 0);
+		c = object_as_type(r, peeled, OBJ_COMMIT, 0);
 		if (!c)
 			die("failed to load commit for input %s resulting in oid %s\n",
@@ -85,6 +87,7 @@ int cmd__reach(int ac, const char **av)
 			commit_list_insert(c, &X);
 			ALLOC_GROW(X_array, X_nr + 1, X_alloc);
 			X_array[X_nr++] = c;
+
[PATCH v2 0/1] Properly peel tags in can_all_from_reach_with_flags()
As Peff reported [1], the refactored can_all_from_reach_with_flags() method does not properly peel tags. Since the helper method can_all_from_reach() and the code in t/helper/test-reach.c all peel tags before getting to this method, it is not super-simple to create a test that demonstrates this.

I modified t/helper/test-reach.c to allow calling can_all_from_reach_with_flags() directly, and added a test in t6600-test-reach.sh that demonstrates the segfault without the fix.

For V2, I compared the loop that inspects the 'from' commits in commit ba3ca1edce "commit-reach: move can_all_from_reach_with_flags" to the version here and got the following diff:

3c3
< 	if (from_one->flags & assign_flag)
---
> 	if (!from_one || from_one->flags & assign_flag)
5c5,7
< 	from_one = deref_tag(the_repository, from_one, "a from object", 0);
---
>
> 	from_one = deref_tag(the_repository, from_one,
> 			     "a from object", 0);
14a17,22
>
> 	list[nr_commits] = (struct commit *)from_one;
> 	if (parse_commit(list[nr_commits]) ||
> 	    list[nr_commits]->generation < min_generation)
> 		return 0; /* is this a leak? */
> 	nr_commits++;

This diff includes the early termination we had before 'deref_tag' and the comment for why we can ignore non-commit objects.

[1] https://public-inbox.org/git/0bf9103c-9377-506b-7ad7-e5273d8e9...@gmail.com/T/#u

Derrick Stolee (1):
  commit-reach: properly peel tags

 commit-reach.c        | 33 ++---
 t/helper/test-reach.c | 22 +-
 t/t6600-test-reach.sh | 30 --
 3 files changed, 71 insertions(+), 14 deletions(-)

base-commit: 6621c838743812aaba96e55cfec8524ea1144c2d
Published-As: https://github.com/gitgitgadget/git/releases/tags/pr-39%2Fderrickstolee%2Ftag-fix-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-39/derrickstolee/tag-fix-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/39

Range-diff vs v1:

 1: 948e28 !
1: 4bf21204dd commit-reach: properly peel tags @@ -36,9 +36,17 @@ - if (parse_commit(list[i]) || - list[i]->generation < min_generation) - return 0; ++ if (!from_one || from_one->flags & assign_flag) ++ continue; ++ + from_one = deref_tag(the_repository, from_one, + "a from object", 0); + if (!from_one || from_one->type != OBJ_COMMIT) { ++ /* no way to tell if this is reachable by ++ * looking at the ancestry chain alone, so ++ * leave a note to ourselves not to worry about ++ * this object anymore. ++ */ + from->objects[i].item->flags |= assign_flag; + continue; + } @@ -187,7 +195,7 @@ + echo "can_all_from_reach_with_flag(X,_,_,0,0):1" >expect && + test_three_modes can_all_from_reach_with_flag +' -+ ++ test_expect_success 'commit_contains:hit' ' cat >input <<-\EOF && A:commit-7-7 -- gitgitgadget
RE: [Question] Signature calculation ignoring parts of binary files
On September 13, 2018 11:03 AM, Junio C Hamano wrote:
> "Randall S. Becker" writes:
>
>> The scenario is slightly different.
>> 1. Person A gives me a new binary file-1 with fingerprint A1. This
>> goes into git unchanged.
>> 2. Person B gives me binary file-2 with fingerprint B2. This does not
>> go into git yet.
>> 3. We attempt a git diff between the committed file-1 and uncommitted
>> file-2 using a textconv implementation that strips what we don't need to compare.
>> 4. If file-1 and file-2 have no difference when textconv is used,
>> file-2 is not added and not committed. It is discarded with impunity,
>> never to be seen again, although we might whine a lot at the user for
>> attempting to put file-2 in - but that's not git's issue.
>
> You are forgetting that Git is a distributed version control system, aren't you?
> Person A and B can introduce their "moral equivalent but bytewise different"
> copies to their repository under the same object name, and you can pull from
> them--what happens?
>
> It is fundamental that one object name given to Git identifies one specific
> byte sequence contained in an object uniquely. Once you broke that, you no
> longer have Git.

At that point I have a morally questionable situation, agreed. However, both are permitted to exist in the underlying tree without conflict in git - which I do consider a legitimately possible situation that will not break the application at all - although there is a semantic conflict in the application (not in git) that requires a human decision to resolve. The fact that both objects can exist in git with different fingerprints is a good thing, because it provides immutable evidence and ownership of someone bypassing the intent of the application. So, rather than using textconv, I shall implement this rule in the application rather than trying to configure git to do it.
If two conflicting objects enter the commit history, the application will have the responsibility to resolve the semantic/legal conflict. Thanks, Randall
Re: [PATCH 1/3] archive: use packet_reader for communications
Junio C Hamano writes:

>> -	if (packet_read_line(fd[0], NULL))
>> +	status = packet_reader_read(&reader);
>> +	if (status != PACKET_READ_FLUSH)
>> 		die(_("git archive: expected a flush"));
>
> This makes me wonder what happens if we got an EOF instead. We fail
> to notice the protocol error here, but does the code after this part
> correctly handle the situation?

Sorry, this part of my comment is completely backwards. We require that they send a flush, not a 0-length data packet of length 4, and otherwise we die, even though we used to treat a 0-length data packet of length 4 just like a flush. So this is making the code more strict than the original. As long as all the existing implementations correctly use flush here, there would be no unintended regression, but it bothers me that we have to even worry about whether these behaviour changes affect already deployed software negatively.

>
>> /* Now, start reading from fd[0] and spit it out to stdout */
Re: [Question] Signature calculation ignoring parts of binary files
"Randall S. Becker" writes: > The scenario is slightly different. > 1. Person A gives me a new binary file-1 with fingerprint A1. This goes into > git unchanged. > 2. Person B gives me binary file-2 with fingerprint B2. This does not go > into git yet. > 3. We attempt a git diff between the committed file-1 and uncommitted file-2 > using a textconv implementation that strips what we don't need to compare. > 4. If file-1 and file-2 have no difference when textconv is used, file-2 is > not added and not committed. It is discarded with impunity, never to be seen > again, although we might whine a lot at the user for attempting to put > file-2 in - but that's not git's issue. You are forgetting that Git is a distributed version control system, aren't you? Person A and B can introduce their "moral equivalent but bytewise different" copies to their repository under the same object name, and you can pull from them--what happens? It is fundamental that one object name given to Git identifies one specific byte sequence contained in an object uniquely. Once you broke that, you no longer have Git.
Re: [PATCH 1/1] contrib: add coverage-diff script
Derrick Stolee writes: > On 9/12/2018 6:54 PM, Junio C Hamano wrote: >> Junio C Hamano writes: >> >>> "Derrick Stolee via GitGitGadget" writes: >>> contrib/coverage-diff.sh | 70 1 file changed, 70 insertions(+) create mode 100755 contrib/coverage-diff.sh >>> I fully appreciate the motivation. But it is a bit sad that this >>> begins with "#!/bin/bash" but it seems that the script is full of >>> bash-isms. I haven't gone through the script to see if these are >>> inevitable or gratuitous yet, but I'd assume it made it easier for >>> you to write it to step outside the pure POSIX shell? > > I completely forgot to avoid bash, as I wrote this first as an experiment. > >>> ... + elif [[ $REPLY =~ \+\+\+\ (b/)?([^[:blank:]$esc]+).* ]]; then + path=${BASH_REMATCH[2]} >>> OK, it probably is easier to write in bash than using expr if you >>> want to do regexp. >> Just to clarify. I am saying that it is OK to give up writing in >> pure POSIX and relying on bash-isms after seeing these lines. > > I'll try rewriting it using POSIX shell and see how hard it is. Thanks. Don't waste too much time on it and try to bend backwards too far, though.
Re: [PATCH 1/3] archive: use packet_reader for communications
Josh Steadmon writes:

> Using packet_reader will simplify version detection and capability
> handling, which will make implementation of protocol v2 support in
> git-archive easier.

Is this meant as a change in implementation without any change in behaviour?

> Signed-off-by: Josh Steadmon
> ---
> builtin/archive.c | 23 ++-
> 1 file changed, 14 insertions(+), 9 deletions(-)
>
> diff --git a/builtin/archive.c b/builtin/archive.c
> index e74f67539..e54fc39ad 100644
> --- a/builtin/archive.c
> +++ b/builtin/archive.c
> @@ -27,10 +27,11 @@ static int run_remote_archiver(int argc, const char **argv,
> 			       const char *remote, const char *exec,
> 			       const char *name_hint)
> {
> -	char *buf;
> 	int fd[2], i, rv;
> 	struct transport *transport;
> 	struct remote *_remote;
> +	struct packet_reader reader;
> +	enum packet_read_status status;
>
> 	_remote = remote_get(remote);
> 	if (!_remote->url[0])
> @@ -38,6 +39,8 @@ static int run_remote_archiver(int argc, const char **argv,
> 	transport = transport_get(_remote, _remote->url[0]);
> 	transport_connect(transport, "git-upload-archive", exec, fd);
>
> +	packet_reader_init(&reader, fd[0], NULL, 0, PACKET_READ_CHOMP_NEWLINE);
> +
> 	/*
> 	 * Inject a fake --format field at the beginning of the
> 	 * arguments, with the format inferred from our output
> @@ -53,18 +56,20 @@ static int run_remote_archiver(int argc, const char **argv,
> 	packet_write_fmt(fd[1], "argument %s\n", argv[i]);
> 	packet_flush(fd[1]);
>
> -	buf = packet_read_line(fd[0], NULL);
> -	if (!buf)
> +	status = packet_reader_read(&reader);
> +
> +	if (status == PACKET_READ_FLUSH)
> 		die(_("git archive: expected ACK/NAK, got a flush packet"));

It is true that packet_read_line() returns NULL on flush, but the function also returns NULL on other conditions for which the underlying packet_read() returns 0 (or negative) length. EOF, normal data with zero length (i.e. packet length itself is 4), and DELIM packets would all have led to this die() in the original code.
We fail to notice that we got something unexpected when we were expecting to get a normal packet with ACK or NACK on it.

> -	if (strcmp(buf, "ACK")) {
> -		if (starts_with(buf, "NACK "))
> -			die(_("git archive: NACK %s"), buf + 5);
> -		if (starts_with(buf, "ERR "))
> -			die(_("remote error: %s"), buf + 4);
> +	if (strcmp(reader.buffer, "ACK")) {

The way I read packet_reader_read()'s implementation and existing callers (like the ones in fetch-pack.c) tells me that consumers should not be looking at "reader.buffer"; they should instead be reading from "reader.line".

> +		if (starts_with(reader.buffer, "NACK "))
> +			die(_("git archive: NACK %s"), reader.buffer + 5);
> +		if (starts_with(reader.buffer, "ERR "))
> +			die(_("remote error: %s"), reader.buffer + 4);
> 		die(_("git archive: protocol error"));
> 	}
>
> -	if (packet_read_line(fd[0], NULL))
> +	status = packet_reader_read(&reader);
> +	if (status != PACKET_READ_FLUSH)
> 		die(_("git archive: expected a flush"));

This makes me wonder what happens if we got an EOF instead. We fail to notice the protocol error here, but does the code after this part correctly handle the situation?

> /* Now, start reading from fd[0] and spit it out to stdout */
[PATCH v2 1/1] contrib: add coverage-diff script
From: Derrick Stolee

We have coverage targets in our Makefile for using gcov to display line coverage based on our test suite. The way I like to do it is to run:

    make coverage-test
    make coverage-report

This leaves the repo in a state where every X.c file that was covered has an X.c.gcov file containing the coverage counts for every line, and "#####" at every uncovered line.

There have been a few bugs in recent patches that would have been caught if the test suite covered those blocks (including a few of mine). I want to work towards a "sensible" amount of coverage on new topics. In my opinion, this means that any logic should be covered, but the 'die()' blocks in error cases do not need to be covered.

It is important to not measure the coverage of the codebase by what old code is not covered. To help, I created the 'contrib/coverage-diff.sh' script. After creating the coverage statistics at a version (say, 'topic') you can then run

    contrib/coverage-diff.sh base topic

to see the lines added between 'base' and 'topic' that are not covered by the test suite. The output uses 'git blame -c' format so you can find the commits responsible and view the line numbers for quick access to the context.

Signed-off-by: Derrick Stolee
---
 contrib/coverage-diff.sh | 63
 1 file changed, 63 insertions(+)
 create mode 100755 contrib/coverage-diff.sh

diff --git a/contrib/coverage-diff.sh b/contrib/coverage-diff.sh
new file mode 100755
index 00..0f226f038c
--- /dev/null
+++ b/contrib/coverage-diff.sh
@@ -0,0 +1,63 @@
+#!/bin/sh
+
+# Usage: 'contrib/coverage-diff.sh <version1> <version2>'
+# Outputs a list of new lines in version2 compared to version1 that are
+# not covered by the test suite. Assumes you ran
+# 'make coverage-test coverage-report' from root first, so .gcov files exist.
+
+V1=$1
+V2=$2
+
+diff_lines () {
+	while read line
+	do
+		if echo $line | grep -q -e "^@@ -([0-9]+)(,[0-9]+)? \\+([0-9]+)(,[0-9]+)? @@.*"
+		then
+			line_num=$(echo $line \
+				| awk 'match($0, "@@ -([0-9]+)(,[0-9]+)? \\+([0-9]+)(,[0-9]+)? @@.*", m) { print m[3] }')
+		else
+			echo "$line_num:$line"
+			if ! echo $line | grep -q -e "^-"
+			then
+				line_num=$(($line_num + 1))
+			fi
+		fi
+	done
+}
+
+files=$(git diff --raw $V1 $V2 \
+	| grep \.c$ \
+	| awk 'NF>1{print $NF}')
+
+for file in $files
+do
+	git diff $V1 $V2 -- $file \
+		| diff_lines \
+		| grep ":+" \
+		| sed 's/:/ /g' \
+		| awk '{print $1}' \
+		| sort \
+		>new_lines.txt
+
+	hash_file=$(echo $file | sed "s/\//\#/")
+	cat "$hash_file.gcov" \
+		| grep \#\#\#\#\# \
+		| sed 's/#: //g' \
+		| sed 's/\:/ /g' \
+		| awk '{print $1}' \
+		| sort \
+		>uncovered_lines.txt
+
+	comm -12 uncovered_lines.txt new_lines.txt \
+		| sed -e 's/$/\)/' \
+		| sed -e 's/^/\t/' \
+		>uncovered_new_lines.txt
+
+	grep -q '[^[:space:]]' < uncovered_new_lines.txt && \
+		echo $file && \
+		git blame -c $file \
+			| grep -f uncovered_new_lines.txt
+
+	rm -f new_lines.txt uncovered_lines.txt uncovered_new_lines.txt
+done

-- gitgitgadget
[PATCH v2 0/1] contrib: Add script to show uncovered "new" lines
We have coverage targets in our Makefile for using gcov to display line coverage based on our test suite. The way I like to do it is to run:

    make coverage-test
    make coverage-report

This leaves the repo in a state where every X.c file that was covered has an X.c.gcov file containing the coverage counts for every line, and "#####" at every uncovered line.

There have been a few bugs in recent patches that would have been caught if the test suite covered those blocks (including a few of mine). I want to work towards a "sensible" amount of coverage on new topics. In my opinion, this means that any logic should be covered, but the 'die()' blocks in error cases do not need to be covered. It is important to not measure the coverage of the codebase by what old code is not covered.

To help, I created the 'contrib/coverage-diff.sh' script. After creating the coverage statistics at a version (say, 'topic') you can then run

    contrib/coverage-diff.sh base topic

to see the lines added between 'base' and 'topic' that are not covered by the test suite.
For example, I ran this against the 'jch' branch (d3c0046) versus 'next' (dd90340) and got the following output: builtin/commit.c 859fdc0c3cf (Derrick Stolee 2018-08-29 05:49:04 -0700 1657) write_commit_graph_reachable(get_object_directory(), 0); builtin/rev-list.c 250edfa8c87 (Harald Nordgren2018-04-18 23:05:35 +0200 431) bisect_flags |= BISECT_FIND_ALL; builtin/worktree.c e5353bef550 (Eric Sunshine 2018-08-28 17:20:19 -0400 60) error_errno(_("failed to delete '%s'"), sb.buf); e19831c94f9 (Eric Sunshine 2018-08-28 17:20:23 -0400 251) die(_("unable to re-add worktree '%s'"), path); 68a6b3a1bd4 (Eric Sunshine 2018-08-28 17:20:24 -0400 793) die(_("cannot move a locked working tree, lock reason: %s\nuse 'move -f -f' to override or unlock first"), f4143101cbb (Eric Sunshine 2018-08-28 17:20:25 -0400 906) die(_("cannot remove a locked working tree, lock reason: %s\nuse 'remove -f -f' to override or unlock first"), read-cache.c 67922a3 (Nguyễn Thái Ngọc Duy 2018-09-02 15:19:33 +0200 1754) const unsigned char *cp = (const unsigned char *)name; 67922a3 (Nguyễn Thái Ngọc Duy 2018-09-02 15:19:33 +0200 1757) previous_len = previous_ce ? 
previous_ce->ce_namelen : 0; 67922a3 (Nguyễn Thái Ngọc Duy 2018-09-02 15:19:33 +0200 1758) strip_len = decode_varint(); 67922a3 (Nguyễn Thái Ngọc Duy 2018-09-02 15:19:33 +0200 1759) if (previous_len < strip_len) { 67922a3 (Nguyễn Thái Ngọc Duy 2018-09-02 15:19:33 +0200 1760) if (previous_ce) 67922a3 (Nguyễn Thái Ngọc Duy 2018-09-02 15:19:33 +0200 1761) die(_("malformed name field in the index, near path '%s'"), 67922a3 (Nguyễn Thái Ngọc Duy 2018-09-02 15:19:33 +0200 1762) previous_ce->name); 67922a3 (Nguyễn Thái Ngọc Duy 2018-09-02 15:19:33 +0200 1764) die(_("malformed name field in the index in the first path")); 67922a3 (Nguyễn Thái Ngọc Duy 2018-09-02 15:19:33 +0200 1766) copy_len = previous_len - strip_len; 67922a3 (Nguyễn Thái Ngọc Duy 2018-09-02 15:19:33 +0200 1767) name = (const char *)cp; 67922a3 (Nguyễn Thái Ngọc Duy 2018-09-02 15:19:33 +0200 1773) len += copy_len; 67922a3 (Nguyễn Thái Ngọc Duy 2018-09-02 15:19:33 +0200 1794) if (copy_len) 67922a3 (Nguyễn Thái Ngọc Duy 2018-09-02 15:19:33 +0200 1795) memcpy(ce->name, previous_ce->name, copy_len); 67922a3 (Nguyễn Thái Ngọc Duy 2018-09-02 15:19:33 +0200 1796) memcpy(ce->name + copy_len, name, len + 1 - copy_len); 67922a3 (Nguyễn Thái Ngọc Duy 2018-09-02 15:19:33 +0200 1797) *ent_size = (name - ((char *)ondisk)) + len + 1 - copy_len; remote-curl.c c3b9bc94b9b (Elijah Newren 2018-09-05 10:03:07 -0700 181) options.filter = xstrdup(value); Using this 'git blame' output, we can quickly inspect whether the uncovered lines are appropriate. For instance: 1. The line in builtin/commit.c is due to writing the commit-graph file when GIT_TEST_COMMIT_GRAPH is enabled, which is not on by default in the test suite. Being uncovered is expected here. 2. The lines in builtin/worktree.c are all related to error conditions. This is acceptable. 3. The line in builtin/rev-list.c is a flag replacement in a block that is otherwise unchanged. It must not be covered by the test suite normally. 
This could be worth adding a test to ensure the new logic maintains old behavior. 4. The lines in
Re: [PATCH v1] read-cache: add GIT_TEST_INDEX_VERSION support
On 9/12/2018 6:31 PM, Thomas Gummerer wrote:
> On 09/12, Ben Peart wrote:
>> Teach get_index_format_default() to support running the test suite
>> with specific index versions. In particular, this enables the test
>> suite to be run using index version 4 which is not the default so
>> gets less testing.
>
> I found this commit message slightly misleading. Running the test
> suite with specific index versions is already supported, by defining
> TEST_GIT_INDEX_VERSION in 'config.mak'. What we're doing here is
> introducing an additional environment variable that can also be used
> to set the index format in tests. Even setting TEST_GIT_INDEX_VERSION=4
> in the environment does run the test suite with index-v4.
>
> Admittedly the name is a bit strange compared to our usual GIT_TEST_*
> environment variable names, and it should probably be documented
> better (it's only documented in the Makefile currently), but I'm not
> sure we should introduce another environment variable for this
> purpose?

TEST_GIT_INDEX_VERSION enables the testing I was looking for but you're right, it isn't well documented and the atypical naming and implementation don't help either. I checked the documentation and code but didn't see any way to test the V4 index code path so wrote this patch. I wonder if we can improve the discoverability of TEST_GIT_INDEX_VERSION through better naming and documentation. How about this as an alternative?

diff --git a/Makefile b/Makefile
index 5a969f5830..9e84ef02f7 100644
--- a/Makefile
+++ b/Makefile
@@ -400,7 +400,7 @@ all::
 # (defaults to "man") if you want to have a different default when
 # "git help" is called without a parameter specifying the format.
 #
-# Define TEST_GIT_INDEX_VERSION to 2, 3 or 4 to run the test suite
+# Define GIT_TEST_INDEX_VERSION to 2, 3 or 4 to run the test suite
 # with a different indexfile format version. If it isn't set the index
 # file format used is index-v[23].
 #
@@ -2599,8 +2599,8 @@ endif
 ifdef GIT_INTEROP_MAKE_OPTS
 	@echo GIT_INTEROP_MAKE_OPTS=\''$(subst ','\'',$(subst ','\'',$(GIT_INTEROP_MAKE_OPTS)))'\' >>$@+
 endif
-ifdef TEST_GIT_INDEX_VERSION
-	@echo TEST_GIT_INDEX_VERSION=\''$(subst ','\'',$(subst ','\'',$(TEST_GIT_INDEX_VERSION)))'\' >>$@+
+ifdef GIT_TEST_INDEX_VERSION
+	@echo GIT_TEST_INDEX_VERSION=\''$(subst ','\'',$(subst ','\'',$(GIT_TEST_INDEX_VERSION)))'\' >>$@+
 endif
 @if cmp $@+ $@ >/dev/null 2>&1; then $(RM) $@+; else mv $@+ $@; fi

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 44288cbb59..31698c01c4 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -134,9 +134,9 @@ export EDITOR
 GIT_TRACE_BARE=1
 export GIT_TRACE_BARE

-if test -n "${TEST_GIT_INDEX_VERSION:+isset}"
+if test -n "${GIT_TEST_INDEX_VERSION:+isset}"
 then
-	GIT_INDEX_VERSION="$TEST_GIT_INDEX_VERSION"
+	GIT_INDEX_VERSION="$GIT_TEST_INDEX_VERSION"
 	export GIT_INDEX_VERSION
 fi

diff --git a/t/README b/t/README
index 9028b47d92..f872638a78 100644
--- a/t/README
+++ b/t/README
@@ -315,10 +315,14 @@ packs on demand. This normally only happens when the object size is
 over 2GB. This variable forces the code path on any object larger
 than <n> bytes.

-GIT_TEST_OE_DELTA_SIZE=<n> exercises the uncomon pack-objects code
+GIT_TEST_OE_DELTA_SIZE=<n> exercises the uncommon pack-objects code
 path where deltas larger than this limit require extra memory
 allocation for bookkeeping.

+GIT_TEST_INDEX_VERSION=<n> exercises the index read/write code path
+for the index version specified. Can be set to any valid version
+but the non-default version 4 is probably the most beneficial.
+
 Naming Tests
[PATCH v2] builtin/remote: quote remote name on error to display empty name
When adding a new remote with an empty name, git prints the following error message:

    fatal: '' is not a valid remote name\n

But when removing a remote giving an empty string as its name, git shows the empty name without quotes:

    fatal: No such remote: \n

To make these error messages consistent, quote the name of the remote that we tried and failed to find.

Signed-off-by: Shulhan
Reviewed-by: Junio C Hamano
---
 builtin/remote.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/remote.c b/builtin/remote.c
index 40c6f8a1b..f7edf7f2c 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -626,7 +626,7 @@ static int mv(int argc, const char **argv)
 	oldremote = remote_get(rename.old_name);
 	if (!remote_is_configured(oldremote, 1))
-		die(_("No such remote: %s"), rename.old_name);
+		die(_("No such remote: '%s'"), rename.old_name);
 	if (!strcmp(rename.old_name, rename.new_name) && oldremote->origin != REMOTE_CONFIG)
 		return migrate_file(oldremote);
@@ -762,7 +762,7 @@ static int rm(int argc, const char **argv)
 	remote = remote_get(argv[1]);
 	if (!remote_is_configured(remote, 1))
-		die(_("No such remote: %s"), argv[1]);
+		die(_("No such remote: '%s'"), argv[1]);
 	known_remotes.to_delete = remote;
 	for_each_remote(add_known_remote, &known_remotes);
@@ -861,7 +861,7 @@ static int get_remote_ref_states(const char *name,
 	states->remote = remote_get(name);
 	if (!states->remote)
-		return error(_("No such remote: %s"), name);
+		return error(_("No such remote: '%s'"), name);
 	read_branches();
--
2.19.0.400.ge8095fc63
Re: [PATCH 1/1] contrib: add coverage-diff script
On 9/12/2018 6:54 PM, Junio C Hamano wrote: Junio C Hamano writes: "Derrick Stolee via GitGitGadget" writes: contrib/coverage-diff.sh | 70 1 file changed, 70 insertions(+) create mode 100755 contrib/coverage-diff.sh I fully appreciate the motivation. But it is a bit sad that this begins with "#!/bin/bash" but it seems that the script is full of bash-isms. I haven't gone through the script to see if these are inevitable or gratuitous yet, but I'd assume it made it easier for you to write it to step outside the pure POSIX shell? I completely forgot to avoid bash, as I wrote this first as an experiment. ... + elif [[ $REPLY =~ \+\+\+\ (b/)?([^[:blank:]$esc]+).* ]]; then + path=${BASH_REMATCH[2]} OK, it probably is easier to write in bash than using expr if you want to do regexp. Just to clarify. I am saying that it is OK to give up writing in pure POSIX and relying on bash-isms after seeing these lines. I'll try rewriting it using POSIX shell and see how hard it is. Thanks, -Stolee
RE: [Question] Signature calculation ignoring parts of binary files
On September 12, 2018 7:00 PM, Junio C Hamano wrote:
> "Randall S. Becker" writes:
>
>>> author is important to our process. My objective is to keep the
>>> original file 100% exact as supplied and then ignore any changes to
>>> the metadata that I don't care about (like Creator) if the remainder
>>> of the file is the same.
>
> That will *not* work. If person A gave you a version of original, which
> hashes to X after you strip the cruft you do not care about, you would
> register that original with person A's fingerprint on it under the name
> of X. What happens when person B gives you another version, which is
> not byte-for-byte identical to the one you got earlier from person A,
> but does hash to the same X after you strip the cruft? If you are going
> to store it in Git, and if by SHA-1 you are calling what we perceive as
> "object name" in Git land, you must store that one with person B's
> fingerprint on it, also under the name of X. Now which version will you
> get from Git when you ask it to give you the object that hashes to X?

The scenario is slightly different.

1. Person A gives me a new binary file-1 with fingerprint A1. This goes into git unchanged.
2. Person B gives me binary file-2 with fingerprint B2. This does not go into git yet.
3. We attempt a git diff between the committed file-1 and uncommitted file-2 using a textconv implementation that strips what we don't need to compare.
4. If file-1 and file-2 have no difference when textconv is used, file-2 is not added and not committed. It is discarded with impunity, never to be seen again, although we might whine a lot at the user for attempting to put file-2 in - but that's not git's issue.
5. If file-1 and file-2 have differences when textconv is used, file-2 is committed with fingerprint B2.
6. Even if an error is made by the user and they commit file-2 with B2 regardless of textconv, there will be a human who complains about it, but git has two unambiguous fingerprints that happen to have no diffs after textconv is applied.

My original hope was that textconv could be used to influence the fingerprint, but I do not think that is the case, so I went with an alternative. In the application, I am not allowed to strip any cruft off file-1 when it is stored - it must be byte-for-byte the original file. This application is marginally related to a DRM-like situation where we only care about the original image provided by a user, but any copies that are provided by another user with modified metadata will be disallowed from the repository. Does that make more sense?

Cheers,
Randall
Re: with git 1.8.3.1 get only merged tags
On Tue, Sep 11 2018, Michal Novotny wrote:

> I need to emulate git tag --merged with very old git 1.8.3.1. Is that
> somehow possible?
> I am looking for a bash function that would take what git 1.8.3.1
> offers and return only the tags accessible from the current branch

Jeff answered the question you had, but I just have one of my own: is
RedHat stuck on 1.8-era git in some release it's still maintaining, and
does this mean that e.g. you're still backporting security fixes to
this 2012-era release?
Re: [PATCH] add -p: coalesce hunks before testing applicability
Hi Jochen

On 03/09/2018 20:01, Jochen Sprickerhof wrote:
> Hi Phillip,
>
> * Phillip Wood [2018-08-30 14:47]:
>> When $newhunk is created it is marked as dirty to prevent
>> coalesce_overlapping_hunks() from coalescing it. This patch does not
>> change that. What is happening is that by calling
>> coalesce_overlapping_hunks() the hunks that are not currently
>> selected are filtered out and any hunks that can be coalesced are (I
>> think that in the test that starts passing with this patch the only
>> change is the filtering as there's only a single hunk selected).
>
> Agreed here. It would be enough to include the first hunk in the test
> to make it fail again. Still I would see the patch as going in the
> right direction as we need something like
> coalesce_overlapping_hunks() to make the hunks applicable after the
> edit.

Yes, in the long term we want to be able to coalesce edited hunks, but
I think it is confusing to call coalesce_overlapping_hunks() at the
moment as it will not coalesce the edited hunks.

>> This is a subtle change to the test for the applicability of an
>> edited hunk. Previously when all the hunks were used to create the
>> test patch we could be certain that if the test patch applied then
>> if the user later selected any unselected hunk or deselected any
>> selected hunk then that operation would succeed. I'm not sure that
>> is true now (but I haven't thought about it for very long).
>
> I'm not sure here. If we use the same test from t3701, do s(plit),
> y(es), e(dit), it would fail later on. Can you come up with an
> example?

I think that if you split a hunk and edit the first subhunk,
transforming a trailing context line to a deletion, then trying to
stage the second subhunk will fail. With your patch the edit will
succeed as the second subhunk is skipped when testing the edited patch.
Then when you try to stage the second subhunk it will fail as its
leading context will contradict the trailing lines of the edited
subhunk. With the old method the edit failed but didn't store up
trouble for the future.

>> We could restore the old test condition and coalesce the hunks by
>> copying all the hunks and setting $hunk->{USE}=1 when creating the
>> test patch if that turns out to be useful (it would be interesting
>> to see if the test still passes with that change).
>
> We set USE=1 for $newhunk already, or where would you set it?

To match the old test it needs to be set on the hunks we've skipped or
haven't got to yet so they're all in the patch that's tested after
editing a hunk.

Best Wishes

Phillip

> Cheers Jochen
Re: [PATCH] linear-assignment: fix potential out of bounds memory access (was: Re: Git 2.19 Segmentation fault 11 on macOS)
On Wed, Sep 12, 2018 at 3:01 PM Thomas Gummerer wrote:
> Subject: [PATCH] linear-assignment: fix potential out of bounds memory access
>
> Currently the 'compute_assignment()' function can may read memory out

"can may"?

> of bounds, even if used correctly. Namely this happens when we only
> have one column. In that case we try to calculate the initial
> minimum cost using '!j1' as column in the reduction transfer code.
> That in turn causes us to try and get the cost from column 1 in the
> cost matrix, which does not exist, and thus results in an out of
> bounds memory read.
git-credential-libsecret not prompting to unlock keyring
> Apologies, forgot the crucial details post that log:

This turned out to be an error unrelated to git or
git-credential-libsecret. Apologies for the noise (and the badly
threaded reply earlier).

Paul
Re: with git 1.8.3.1 get only merged tags
On Tue, Sep 11, 2018 at 9:05 PM Jeff King wrote:
>
> On Tue, Sep 11, 2018 at 12:43:15PM +0200, Michal Novotny wrote:
>
> > I need to emulate git tag --merged with very old git 1.8.3.1. Is
> > that somehow possible?
> > I am looking for a bash function that would take what git 1.8.3.1
> > offers and return only the tags accessible from the current branch
> > tip. Could you, please, give me at least a hint how this could be
> > done?
>
> This is not particularly fast, but it should work:
>
>   git for-each-ref refs/tags |
>   cut -f2 |
>   while read tag; do
>     test "$(git merge-base $tag HEAD)" = \
>          "$(git rev-parse $tag^{commit})" && echo $tag
>   done

That works for me. Thank you a lot for your help!

clime

>
> -Peff
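For what it's worth, the same loop can let merge-base do the ancestry
test directly: `--is-ancestor` has been around since 1.8.0, so it
should also be available in 1.8.3.1, though I have not verified it on
that exact version. A self-contained sketch (the repo, tag names, and
branch layout are made up for the demonstration):

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m one
git tag reachable                 # points at a commit on our branch
git checkout -q -b side
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m two
git tag side-only                 # points at a commit off our branch
git checkout -q -

# list only the tags whose commit is an ancestor of HEAD
merged=$(git for-each-ref --format='%(refname:short)' refs/tags |
	while read -r tag; do
		git merge-base --is-ancestor "$tag" HEAD && echo "$tag" || :
	done)
echo "$merged"
```

Here only "reachable" is printed, since "side-only" is not an ancestor
of HEAD. This avoids one rev-parse per tag, but it is still one
merge-base invocation per tag, so it is not any faster in spirit.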