Re: [PATCH v4 06/13] commit-graph: implement git commit-graph read
Derrick Stoleewrites: > +'read':: > + > +Read a graph file given by the graph-head file and output basic > +details about the graph file. > ++ > +With `--file=` option, consider the graph stored in the file at > +the path /info/. > + A sample reader confusion after reading the above twice: What is "the graph-head file" and how does the user specify it? Is it given by the value for the "--file=" command line option? Another sample reader reaction after reading the above: What are the kind of "basic details" we can learn from this command is unclear, but perhaps there is an example to help me decide if this command is worth studying. > @@ -44,6 +53,12 @@ EXAMPLES > $ git commit-graph write > > > +* Read basic information from a graph file. > ++ > + > +$ git commit-graph read --file= > + > + And the sample reader is utterly disappointed at this point. > +static int graph_read(int argc, const char **argv) > +{ > + struct commit_graph *graph = 0; > + struct strbuf full_path = STRBUF_INIT; > + > + static struct option builtin_commit_graph_read_options[] = { > + { OPTION_STRING, 'o', "object-dir", _dir, > + N_("dir"), > + N_("The object directory to store the graph") }, > + { OPTION_STRING, 'H', "file", _file, > + N_("file"), > + N_("The filename for a specific commit graph file in > the object directory."), > + PARSE_OPT_OPTARG, NULL, (intptr_t) "" }, > + OPT_END(), > + }; The same comment as all the previous ones apply, wrt short options and non-use of OPT_STRING(). Also, I suspect that these two would want to use OPT_FILENAME instead, if we anticipate that the command might want to be sometimes run from a subdirectory. Otherwise wouldn't cd t && git commit-graph read --file=../.git/object/info/$whatever end up referring to a wrong place because the code that uses the value obtained from OPTION_STRING does not do the equivalent of parse-options.c::fix_filename()? The same applies to object-dir handling. > + argc = parse_options(argc, argv, NULL, > + builtin_commit_graph_read_options, > + builtin_commit_graph_read_usage, 0); > + > + if (!opts.obj_dir) > + opts.obj_dir = get_object_directory(); > + > + if (!opts.graph_file) > + die("no graph hash specified"); > + > + strbuf_addf(_path, "%s/info/%s", opts.obj_dir, opts.graph_file); Ahh, I was fooled by a misnamed option. --file does *not* name the file. It is a filename in a fixed place that is determined by other things. So it would be a mistake to use OPT_FILENAME() in the parser for that misnamed "--file" option. The parser for --object-dir still would want to be OPT_FILENAME(), but quite honestly, I do not see the point of having --object-dir option in the first place. The graph file is not relative to it but is forced to have /info/ in between that directory and the filename, so it is not like the user gets useful flexibility out of being able to specify two different places using --object-dir= option and $GIT_OBJECT_DIRECTORY environment (iow, a caller that wants to work on a specific object directory can use the environment, which is how it would tell any other Git subcommand which object store it wants to work with). But stepping back a bit, I think the way --file argument is defined is halfway off from two possible more useful ways to define it. If it were just "path to the file" (iow, what OPT_FILENAME() is suited for parsing it), then a user could say "I have this graph file that I created for testing, it is not installed in its usual place in $GIT_OBJECT_DIRECTORY/info/ at all, but I want you to read it because I am debugging". That is one possible useful extreme. The other possibility would be to allow *only* the hash part to be specified, iow, not just forcing /info/ relative to object directory, you would force the "graph-" prefix and ".graph" suffix. That would be the other extreme that is useful (less typing and less error prone). For a low-level command line this, my gut feeling is that it would be better to allow paths to the object directory and the graph file to be totally independently specified. > + if (graph_signature != GRAPH_SIGNATURE) { > + munmap(graph_map, graph_size); > + close(fd); > + die("graph signature %X does not match signature %X", > + graph_signature, GRAPH_SIGNATURE); > + } > + > + graph_version = *(unsigned char*)(data + 4); > + if (graph_version != GRAPH_VERSION) { > + munmap(graph_map, graph_size); > + close(fd); > + die("graph version %X does not match
Re: [PATCH] submodule: indicate that 'submodule.recurse' doesn't apply to clone
Brandon Williamswrites: > Update the documentation for the 'submodule.recurse' config to identify > that the clone command does not respect it. > > Signed-off-by: Brandon Williams > --- > Documentation/config.txt | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) Thanks. > diff --git a/Documentation/config.txt b/Documentation/config.txt > index f57e9cf10..59ff29d7a 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -3210,7 +3210,8 @@ submodule.active:: > > submodule.recurse:: > Specifies if commands recurse into submodules by default. This > - applies to all commands that have a `--recurse-submodules` option. > + applies to all commands that have a `--recurse-submodules` option, > + except `clone`. > Defaults to false. > > submodule.fetchJobs::
Re: [PATCH v4 01/13] commit-graph: add format document
>> >> [ so in small repos, where there are fewer than 256 objects, >> F[i] == F[i+1], for all i'th where there is no object starting with i >> byte] > > > Correct. I'm not sure this additional information is valuable for the > document, though. It is not, I was just making sure I'd understand correctly.
bug in HTTP protocol spec
Hello list, I had been banging my head all morning trying to figure out why I couldn’t get a little HTTP implementation to clone/push via the smart protocol (just wrapping git-receive-pack/git-upload-pack). I kept getting the following (likely familiar to some) error: ``` fatal: Could not read from remote repository. Please make sure you have the correct access rights and the repository exists. ``` I didn’t get an insight until I ran with GIT_TRACE_PACKET=true on a known-good remote (i.e. GitHub), that the null packet-line `` has to follow the service line. This is not reflected in the example here: https://github.com/git/git/blob/6464679d9620d91b639e2681b9cc6473f3856d09/Documentation/technical/http-protocol.txt#L216 It is also not reflected in the BNF: https://github.com/git/git/blob/6464679d9620d91b639e2681b9cc6473f3856d09/Documentation/technical/http-protocol.txt#L279 (Note these links are from the most recent commit of this file as of this writing.) Just thought somebody would like to know. Regards, -- Dorian Taylor Make things. Make sense. https://doriantaylor.com
Re: [PATCH v4 02/13] graph: add commit graph design document
> +[3] > https://public-inbox.org/git/20170907094718.b6kuzp2uhvkmw...@sigill.intra.peff.net/t/#m7a2ea7b355aeda962e6b86404bcbadc648abfbba > +More discussion about generation numbers and not storing them inside > +commit objects. A valuable quote: Unlike the other public inbox links this links to a discussion with all messages on one page, https://public-inbox.org/git/20170908034739.4op3w4f2ma5s6...@sigill.intra.peff.net/ would have this be more inline with the other links. (this is a super small nit, which I am not sure if we care about at all; the rest of the doc is awesome!)
Re: [PATCH v4 05/13] commit-graph: implement 'git-commit-graph write'
Derrick Stoleewrites: > +static int graph_write(int argc, const char **argv) > +{ > + ... > + graph_name = write_commit_graph(opts.obj_dir); > + > + if (graph_name) { > + printf("%s\n", graph_name); > + FREE_AND_NULL(graph_name); > + } > + > + return 0; > +} After successfully writing a graph file out, write_commit_graph() signals that fact by returning a non-NULL pointer, so that this caller can report the filename to the end user. This caller protects itself from a NULL return, presumably because the callee uses it to signal an error when writing the graph file out? Is it OK to lose that 1-bit of information, or should we have more like if (graph_name) { printf; return 0; } else { return -1; } > int cmd_commit_graph(int argc, const char **argv, const char *prefix) > { > static struct option builtin_commit_graph_options[] = { > - { OPTION_STRING, 'p', "object-dir", _dir, > + { OPTION_STRING, 'o', "object-dir", _dir, > N_("dir"), > N_("The object directory to store the graph") }, > OPT_END(), The same comment for a no-op patch from an earlier step applies here, and we have another one that we saw above in graph_write(). > @@ -31,6 +67,11 @@ int cmd_commit_graph(int argc, const char **argv, const > char *prefix) >builtin_commit_graph_usage, >PARSE_OPT_STOP_AT_NON_OPTION); > > + if (argc > 0) { > + if (!strcmp(argv[0], "write")) > + return graph_write(argc, argv); And if we fix "graph_write" to report an error with negative return, this needs to become something like return !!graph_write(argc, argv); as we do not want to return a negative value to be passed via run_builtin() to exit(3) in handle_builtin(). > diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh > new file mode 100755 > index 000..6a5e93c > --- /dev/null > +++ b/t/t5318-commit-graph.sh > @@ -0,0 +1,119 @@ > +#!/bin/sh > + > +test_description='commit graph' > +. ./test-lib.sh > + > +test_expect_success 'setup full repo' ' > + rm -rf .git && I am perfectly OK with creating a separate subdirectory called 'full' in the trash directory given by the test framework, but unless absolutely necessary I'd rather not to see "rm -rf", especially on ".git", in our test scripts. People can screw up doing various things (like copying and pasting). > + mkdir full && > + cd full && > + git init && > + objdir=".git/objects" > +' And I absolutely do not want to see "cd full" that leaves and stays in the subdirectory after this step is done. Imagine what happens if any earlier step fails before doing "cd full", causing this "setup full" step to report failure, and then the test goes on to the next step? We will not be in "full" and worse yet because we do not have "$TRASH_DIRECTORY/.git" (you removed it), the "git commit-graph write --object-dir" command we end up doing next will see the git source repository as the repository it is working on. Never risk trashing our source repository with your test. That is why we give you $TRASH_DIRECTORY to play in. Make use of it when you can. I'd make this step just a single git init full and then the next one git -C full commit-graph write --object-dir . In later tests that have multi-step things, I'd instead make them ( cd full && ... whatever you do && ... in that separate && ... 'full' repository ) if I were writing this test *and* if I really wanted to do things inside $TRASH_DIRECTORY/full/.git repository. I am not convinced yet about the latter. I know that it will make certain things simpler to use a separate /full hierarchy (e.g. cleaning up, having another unrelated test repository, etc.) while making other things more cumbersome (e.g. you need to be careful when you "cd" and the easiest way to do so is to ( do things in a subshell )). I just do not know what the trade-off would look like in this particular case. A simple rule of thumb I try to follow is not to change $(pwd) for the process that runs these test_expect_success shell functions. > + > +test_expect_success 'write graph with no packs' ' > + git commit-graph write --object-dir . > +' > + > +test_expect_success 'create commits and repack' ' > + for i in $(test_seq 3) > + do > + test_commit $i && > + git branch commits/$i > + done && > + git repack > +'
Re: Question about get_cached_commit_buffer()
On Wed, Feb 21, 2018 at 01:48:11PM -0500, Jeff King wrote: > > What confuses me about this behavior is that the OID is still shown on the > > repeat (and in the case of `git log --oneline` will not actually have a line > > break between two short-OIDs). I don't believe this behavior is something to > > preserve. > > I think that repeating the oid is intentional; the point is to dump how > the traversal code is hitting the endpoints, even if we do so multiple > times. > > The --oneline behavior just looks like a bug. I think --format is broken > with --show-all, too (it does not show anything!). I poked at one of the examples a little more closely. I actually think these are not repeats, but simply UNINTERESTING parents that we never needed to look at in our traversal (because we hit a point where everything was UNINTERESTING). So we are relying not on finish_commit() to have freed the buffer, but on the traversal code to have never parsed those commits in the first place. Which is doubly subtle. I think the rest of my email stands, though: we should just show the full headers for those commits. -Peff
Re: Git should preserve modification times at least on request
On Mon, Feb 19, 2018 at 10:22:36PM +0100, Peter Backes wrote: > > It is pretty annoying that git cannot, even if I know what I am doing, > and explicitly want it to, preserve the modification time. The use case I've come across where it would be of value is for code archeology, either importing a bunch of tar files, or importing a repo from some other VCS. There preserving the mod times can be useful when one is subsequently figuring out what changed, and the scope of the 'commits' is too big (i.e. the granularity of the tar files themselves). e.g. initial commits are done on tar boundaries, but one may try to figure out individual changes from a ChangeLog file. I've done this a couple of times, but to date it has required keeping the untarred trees around (or a timestamp list file from each tree), in addition to the git repro in to which one is then synthesizing smaller commits. DF
Re: [PATCH] t/known-leaky: add list of known-leaky test scripts
On Wed, Feb 21, 2018 at 08:53:16AM -0800, Junio C Hamano wrote: > Even though keeping track of list of known-leaky tests may not be so > interesting, we can still salvage useful pieces from the discussion > and make them available to developers, e.g. something like > > prove --dry --state=failed | > perl -lne '/^(t[0-9]{4})-.*\.sh$/ and print $1' | sort >$@+ > if cmp >/dev/null $@ $@+; then rm $@+; else mv $@+ $@; fi > > could be made into a target to stash away the list of failing tests > after a test run? Unfortunately there are some caveats in that snippet: 1. You are using prove. 2. You are using --state=save in the initial run. I think we might be better off having the test scripts write to test-results/*.counts even when run under a TAP harness, and then we can have a consistent way to get the list of failed tests (we already have a "make failed" that works this way). -Peff
Re: Question about get_cached_commit_buffer()
On Wed, Feb 21, 2018 at 09:13:22AM -0500, Derrick Stolee wrote: > > So there it is. It does show commits multiple times, but suppresses the > > verbose header after the first showing. If we do something like this: > > > >git rev-list --show-all --pretty --boundary c93150cfb0^- > > > > you'll see some boundary commits that _don't_ have their pretty headers > > shown. And with your proposed patch, we'd show them again. To keep the > > same behavior we need to store that "we've already seen this" boolean > > somewhere else (e.g., in an object flag; possibly SEEN, but that might > > run afoul of other logic). > > What confuses me about this behavior is that the OID is still shown on the > repeat (and in the case of `git log --oneline` will not actually have a line > break between two short-OIDs). I don't believe this behavior is something to > preserve. I think that repeating the oid is intentional; the point is to dump how the traversal code is hitting the endpoints, even if we do so multiple times. The --oneline behavior just looks like a bug. I think --format is broken with --show-all, too (it does not show anything!). > Unless I am misunderstanding, the current behavior on a repeated commit is > already incorrect: some amount of output occurs before checking the buffer, > so the output includes repeated records but with formatting that violates > the expectation. By doing the simple change of swapping > get_cached_commit_buffer() with get_commit_buffer(), we correct that format > violation but have duplicate copies. Yeah, I'd agree with that assessment. > The most-correct thing to do (in my opinion) is to put the requirement of > "no repeats" into the revision walk logic and stop having the formatting > methods expect them. Then, however we change this boolean setting of "we > have seen this before" it will not require the formatting methods to change. But then you wouldn't show repeats at all. If I'm understanding you correctly. TBH, I do not think it is worth spending a lot of effort on this --show-all feature. It seems mostly like forgotten debugging cruft to me. That's why I'd be OK with showing the whole header as the simplest fix (i.e., just removing those calls entirely, not even converting them to get_commit_buffer). > I can start working on a patch to move the duplicate-removal logic into > revision.c instead of these three callers: > > builtin/rev-list.c: if (revs->verbose_header && > get_cached_commit_buffer(commit, NULL)) { > log-tree.c: if (!get_cached_commit_buffer(commit, NULL)) > object.c: if (!get_cached_commit_buffer(commit, NULL)) Those first two are duplicate detection. The third one in object.c should stay, though. We've been fed a commit buffer to parse, and we want to know whether we should attach it as the cached buffer for that commit. But if we already have a cached buffer, there's no point in doing so. And that's what we're checking there. Though I think it would be equally correct to have set_commit_buffer() just throw away the existing cache entry and replace it with this one. I don't think there's a real reason to prefer the old to the new. And that might be worth doing if it would let us drop get_cached_commit_buffer() as a public function. But... > But this caller seems pretty important in pretty.c: > > /* > * Otherwise, we still want to munge the encoding header in the > * result, which will be done by modifying the buffer. If we > * are using a fresh copy, we can reuse it. But if we are using > * the cached copy from get_commit_buffer, we need to duplicate it > * to avoid munging the cached copy. > */ > if (msg == get_cached_commit_buffer(commit, NULL)) > out = xstrdup(msg); > else > out = (char *)msg Like the one in object.c, this really does want to know about the cached entry. And it should be unaffected by your patch, since we will have called get_commit_buffer() at the top of that function. If we wanted to write this one without get_cached_commit_buffer(), we'd really need a function to ask "did this pointer come from the cache, or was it freshly allocated?". That's the same thing we do for unuse_commit_buffer(). So in theory we could have a boolean function that would check that, and that would let us make get_cached_commit_buffer() private. In my opinion it's not really worth trying to make it private. The confusion you're fixing in the first two calls is not due to a bad API, but due to some subtly confusing logic in that code's use of the API. ;) So I'd probably do this: diff --git a/builtin/rev-list.c b/builtin/rev-list.c index d94062bc84..3af56921c8 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -150,7 +150,7 @@ static void show_commit(struct commit *commit, void *data) else putchar('\n'); - if (revs->verbose_header && get_cached_commit_buffer(commit,
[PATCH v2 3/3] config: change default of `pager.config` to "on"
This is similar to ff1e72483 (tag: change default of `pager.tag` to "on", 2017-08-02) and is safe now that we do not consider `pager.config` at all when we are not listing or getting configuration. This change will help with listing large configurations, but will not hurt users of `git config --edit` as it would have before the previous commit. Signed-off-by: Martin Ågren--- Documentation/git-config.txt | 1 + t/t7006-pager.sh | 12 ++-- builtin/config.c | 2 +- 3 files changed, 8 insertions(+), 7 deletions(-) diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt index 249090ac84..e09ed5d7d5 100644 --- a/Documentation/git-config.txt +++ b/Documentation/git-config.txt @@ -237,6 +237,7 @@ CONFIGURATION - `pager.config` is only respected when listing configuration, i.e., when using `--list` or any of the `--get-*` which may return multiple results. +The default is to use a pager. [[FILES]] FILES diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh index 95bd26f0b2..7541ba5edb 100755 --- a/t/t7006-pager.sh +++ b/t/t7006-pager.sh @@ -267,23 +267,23 @@ test_expect_success TTY 'git config --get ignores pager.config' ' ! test -e paginated.out ' -test_expect_success TTY 'git config --get-urlmatch defaults to not paging' ' +test_expect_success TTY 'git config --get-urlmatch defaults to paging' ' rm -f paginated.out && test_terminal git -c http."https://foo.com/".bar=foo \ config --get-urlmatch http https://foo.com && - ! test -e paginated.out + test -e paginated.out ' test_expect_success TTY 'git config --get-all respects pager.config' ' rm -f paginated.out && - test_terminal git -c pager.config config --get-all foo.bar && - test -e paginated.out + test_terminal git -c pager.config=false config --get-all foo.bar && + ! test -e paginated.out ' -test_expect_success TTY 'git config --list defaults to not paging' ' +test_expect_success TTY 'git config --list defaults to paging' ' rm -f paginated.out && test_terminal git config --list && - ! test -e paginated.out + test -e paginated.out ' diff --git a/builtin/config.c b/builtin/config.c index a732d9b56c..01169dd628 100644 --- a/builtin/config.c +++ b/builtin/config.c @@ -602,7 +602,7 @@ int cmd_config(int argc, const char **argv, const char *prefix) } if (actions & PAGING_ACTIONS) - setup_auto_pager("config", 0); + setup_auto_pager("config", 1); if (actions == ACTION_LIST) { check_argc(argc, 0, 0); -- 2.16.2.246.ga4ee8f
Re: Git should preserve modification times at least on request
On 20/02/18 22:48, Peter Backes wrote: On Tue, Feb 20, 2018 at 11:32:23PM +0100, Johannes Schindelin wrote: Hi Peter, On Tue, 20 Feb 2018, Peter Backes wrote: On Tue, Feb 20, 2018 at 11:46:38AM +0100, Johannes Schindelin wrote: I would probably invent a file format (``) I'm stuck there because of being munged. From which command do you want to get it? If you are looking at `git diff`, you may want to use the `-z --name-only` options to avoid munging the paths. I plan to use "git diff-tree --name-only $w_tree HEAD" and subtract all lines from "git diff-index --name-only HEAD" to get the files for which the timestamp should be stored.. If I use "-z" I get the non-munged path, but I cannot safely store such paths in the proposed file format; they might contain newlines (sigh). So at one point I have to munge. Then the same question arises when I have to get the actual path from the munged path when restoring the timestamps. If there's no ready-made functionality to munge and unmunge paths, I have to write some awk for this. At first I thought this might add one more dependency to git, but it seems that awk is already used in git-mergetool.sh, so I suppose it's okay to use in git-stash.sh etc, too. In recent versions of git there's unquote_path() in Git.pm, you could possibly use that with perl -e from your script Best Wishes Phillip Best wishes Peter
Re: Git should preserve modification times at least on request
On Wed, Feb 21 2018, Derek Fawcus jotted: > On Mon, Feb 19, 2018 at 10:22:36PM +0100, Peter Backes wrote: >> >> It is pretty annoying that git cannot, even if I know what I am doing, >> and explicitly want it to, preserve the modification time. > > The use case I've come across where it would be of value is for code > archeology, either importing a bunch of tar files, or importing a > repo from some other VCS. > > There preserving the mod times can be useful when one is subsequently > figuring out what changed, and the scope of the 'commits' is too big > (i.e. the granularity of the tar files themselves). > > e.g. initial commits are done on tar boundaries, but one may try to > figure out individual changes from a ChangeLog file. I've done this > a couple of times, but to date it has required keeping the untarred > trees around (or a timestamp list file from each tree), in addition > to the git repro in to which one is then synthesizing smaller commits. This sounds like a sensible job for a git import tool, i.e. import a target directory into git, and instead of 'git add'-ing the whole thing it would look at the mtimes, sort files by mtime, then add them in order and only commit those files that had the same mtime in the same commit (or within some boundary). The advantage of doing this via such a tool is that you could tweak it to commit by any criteria you wanted, e.g. not mtime but ctime or even atime. You'd get the same thing as you'd get if git's tree format would change to include mtimes (which isn't going to happen), but with a lot more flexibility.
Re: [PATCH v4 08/13] commit-graph: implement --delete-expired
On Mon, Feb 19, 2018 at 10:53 AM, Derrick Stoleewrote: > graph_name = write_commit_graph(opts.obj_dir); > > if (graph_name) { > if (opts.set_latest) > set_latest_file(opts.obj_dir, graph_name); > > + if (opts.delete_expired) > + do_delete_expired(opts.obj_dir, > + old_graph_name, > + graph_name); > + So this only allows to delete expired things and setting the latest when writing a new graph. Would we ever envision a user to produce a new graph (e.g. via obtaining a graph that they got from a server) and then manually rerouting the latest to that new graph file without writing that graph file in the same process? The same for expired. I guess these operations are just available via editing the latest or deleting files manually, which slightly contradicts e.g. "git update-ref", which in olden times was just a fancy way of rewriting the refs file manually. (though it claims to be less prone to errors as it takes lock files) > > extern char *get_graph_latest_filename(const char *obj_dir); > +extern char *get_graph_latest_contents(const char *obj_dir); Did https://public-inbox.org/git/20180208213806.ga6...@sigill.intra.peff.net/ ever make it into tree? (It is sort of new, but I feel we'd want to strive for consistency in the code base, eventually.)
Congratulations i willed to you
My Dear friend, I am Mr. Micheal Pascal Anderson I got your contact on my personal search of the person.. I willed the only funds left in my account to you. If you want to know why I have willed the only funds left in my account to you please contact the bank manager whose name and address I will give to you as soon as you reply to this mail. Right now I am in the hospital. Please, Mr. Micheal Pascal Anderson
Re: [PATCH v3 04/35] upload-pack: convert to a builtin
On Tue, 6 Feb 2018 17:12:41 -0800 Brandon Williamswrote: > In order to allow for code sharing with the server-side of fetch in > protocol-v2 convert upload-pack to be a builtin. > > Signed-off-by: Brandon Williams As Stefan mentioned in [1], also mention in the commit message that this means that the "git-upload-pack" invocation gains additional capabilities (for example, invoking a pager for --help). Having said that, the main purpose of this patch seems to be to libify upload-pack, and the move to builtin is just a way of putting the program somewhere - we could have easily renamed upload-pack.c and created a new upload-pack.c containing the main(), preserving the non-builtin-ness of upload-pack, while still gaining the benefits of libifying upload-pack. If the community does want to make upload-pack a builtin, I would write the commit message this way: upload-pack: libify Libify upload-pack. The main() function is moved to builtin/upload-pack.c, thus making upload-pack a builtin. Note that this means that "git-upload-pack" gains functionality such as the ability to invoke a pager when passed "--help". And if not: upload-pack: libify Libify upload-pack by moving most of the functionality in upload-pack.c into a file upload-pack-lib.c (or some other name), to be used in subsequent patches. [1] https://public-inbox.org/git/CAGZ79kb2=uu0_k8wr27gndnx-t+p+7gvdgc5ebdyc3zbobs...@mail.gmail.com/ > -static void upload_pack(void) > -{ > - struct string_list symref = STRING_LIST_INIT_DUP; > - > - head_ref_namespaced(find_symref, ); > - > - if (advertise_refs || !stateless_rpc) { > - reset_timeout(); > - head_ref_namespaced(send_ref, ); > - for_each_namespaced_ref(send_ref, ); > - advertise_shallow_grafts(1); > - packet_flush(1); > - } else { > - head_ref_namespaced(check_ref, NULL); > - for_each_namespaced_ref(check_ref, NULL); > - } > - string_list_clear(, 1); > - if (advertise_refs) > - return; > - > - receive_needs(); > - if (want_obj.nr) { > - get_common_commits(); > - create_pack_file(); > - } > -} I see that this function had to be moved to the bottom because it now also needs to make use of functions like upload_pack_config() - that's fine. > +struct upload_pack_options { > + int stateless_rpc; > + int advertise_refs; > + unsigned int timeout; > + int daemon_mode; > +}; I would have expected "unsigned stateless_rpc : 1" etc., but I see that this makes it easier to use with OPT_BOOL (which needs us to pass it a pointer-to-int). As for what existing code does, files like fetch-pack and diff use "unsigned : 1", but they also process arguments without OPT_, so I don't think they are relevant. I think that we should decide if we're going to prefer "unsigned : 1" or "int" for flags in new code. Personally, I prefer "unsigned : 1" (despite the slight inconvenience in that argument parsers will need to declare their own temporary "int" and then assign its contents to the options struct) because of the stronger type, but I'm OK either way. Whatever the decision, I don't think it needs to block the review of this patch set.
Re: [PATCH 04/26] upload-pack: convert to a builtin
Brandon Williams wrote: > On 01/03, Stefan Beller wrote: > > On Tue, Jan 2, 2018 at 4:18 PM, Brandon Williamswrote: >>> In order to allow for code sharing with the server-side of fetch in >>> protocol-v2 convert upload-pack to be a builtin. >> >> What is the security aspect of this patch? >> >> By making upload-pack builtin, it gains additional abilities, >> such as answers to '-h' or '--help' (which would start a pager). >> Is there an easy way to sooth my concerns? (best put into the >> commit message) > > receive-pack is already a builtin, so theres that. *nod* Since v2.4.12~1^2 (shell: disallow repo names beginning with dash, 2017-04-29), git-shell refuses to pass --help to upload-pack, limiting the security impact in configurations that use git-shell (e.g. gitolite installations). If you're not using git-shell, then hopefully you have some other form of filtering preventing arbitrary options being passed to git-upload-pack. If you don't, then you're in trouble, for the reasons described in that commit. Since some installations may be allowing access to git-upload-pack (for read-only access) without allowing access to git-receive-pack, this does increase the chance of attack. On the other hand, I suspect the maintainability benefit is worth it. For defense in depth, it would be comforting if the git wrapper had some understanding of "don't support --help in handle_builtin when invoked as a dashed command". That is, I don't expect that anyone has been relying on git-add --help acting like git help add instead of printing the usage message from git add -h It's a little fussy because today we rewrite "git add --help" to "git-add --help" before rewriting it to "git help add"; we'd have to skip that middle hop for this to work. I don't think that has to block this patch or series, though --- it's just a separate thought about hardening. Cc-ing Sitaram Chamarty since he tends to be wiser about this kind of thing than I am. What do you think? Thanks, Jonathan
Re: [PATCH 05/27] object-store: move packed_git and packed_git_mru to object store
Stefan Bellerwrites: > + > +/* > + * The mru list_head is supposed to be initialized using > + * the LIST_HEAD macro, assigning prev/next to itself. > + * However this doesn't work in this case as some compilers dislike > + * that macro on member variables. Use NULL instead as that is defined > + * and accepted, deferring the real init to prepare_packed_git_mru(). */ > +#define __MRU_INIT { NULL, NULL } > +#define RAW_OBJECT_STORE_INIT { NULL, NULL, __MRU_INIT, NULL, NULL } I do not think it has to be this way to abuse two NULLs, if you faithfully mimicked how LIST_HEAD() macro is constructed. The reason why it does not try to introduce struct list_head x = LIST_HEAD_INIT; and instead, uses LIST_HEAD(x); is because of the need for self referral. If we learn from it, we can do the same, i.e. instead of doing struct raw_object_store x = RAW_OBJECT_STORE_INIT; we can do RAW_OBJECT_STORE(x); that expands to struct raw_object_store x = { ... { _git_mru, _git_mru }, ... }; no? Then we do not need such a lengthy comment that reads only like an excuse ;-)
Re: [PATCH] commit: drop uses of get_cached_commit_buffer()
On 2/21/2018 2:17 PM, Derrick Stolee wrote: The get_cached_commit_buffer() method provides access to the buffer loaded for a struct commit, if it was ever loadead and was not freed. Two places use this to inform how to output information about commits. log-tree.c uses this method to short-circuit the output of commit information when the buffer is not cached. However, this leads to incorrect output in 'git log --oneline' where the short-OID is written but then the rest of the commit information is dropped and the next commit is written on the same line. rev-list uses this method for two reasons: - First, if the revision walk visits a commit twice, the buffer was freed by rev-list in the first write. The output then does not match the format expectations, since the OID is written without the rest of the content. - Second, if the revision walk visits a commit that was marked UNINTERESTING, the walk may never load a buffer and hence rev-list will not output the verbose information. These behaviors are undocumented, untested, and unlikely to be expected by users or other software attempting to parse this output. Helped-by: Jeff KingSigned-off-by: Derrick Stolee This would be a good time to allow multiple authors, or to just change the author, since this is exactly the diff you (Peff) provided in an earlier email. The commit message hopefully summarizes our discussion, but I welcome edits. --- builtin/rev-list.c | 2 +- log-tree.c | 3 --- 2 files changed, 1 insertion(+), 4 deletions(-) diff --git a/builtin/rev-list.c b/builtin/rev-list.c index 48300d9..d320b6f 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -134,7 +134,7 @@ static void show_commit(struct commit *commit, void *data) else putchar('\n'); - if (revs->verbose_header && get_cached_commit_buffer(commit, NULL)) { + if (revs->verbose_header) { struct strbuf buf = STRBUF_INIT; struct pretty_print_context ctx = {0}; ctx.abbrev = revs->abbrev; diff --git a/log-tree.c b/log-tree.c index fc0cc0d..22b2fb6 100644 --- a/log-tree.c +++ b/log-tree.c @@ -659,9 +659,6 @@ void show_log(struct rev_info *opt) show_mergetag(opt, commit); } - if (!get_cached_commit_buffer(commit, NULL)) - return; - if (opt->show_notes) { int raw; struct strbuf notebuf = STRBUF_INIT;