Re: [BUG] submodule config does not apply to upper case submodules?
On 02/15/2017 03:11 PM, Junio C Hamano wrote: Junio C Hamanowrites: Perhaps something like this? This looks good. I was hoping to unify the processing logic between this CLI parsing and the usual stream parsing, but this approach is probably simpler. config.c | 16 +++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/config.c b/config.c index c6b874a7bf..98bf8fee32 100644 --- a/config.c +++ b/config.c @@ -201,6 +201,20 @@ void git_config_push_parameter(const char *text) strbuf_release(); } +static void canonicalize_config_variable_name(struct strbuf *var) +{ + char *first_dot = strchr(var->buf, '.'); + char *last_dot = strrchr(var->buf, '.'); + char *cp; + + if (first_dot) + for (cp = var->buf; *cp && cp < first_dot; cp++) "*cp &&" is unnecessary, as far as I can tell. + *cp = tolower(*cp); + if (last_dot) + for (cp = last_dot; *cp; cp++) + *cp = tolower(*cp); +} + int git_config_parse_parameter(const char *text, config_fn_t fn, void *data) { @@ -223,7 +237,7 @@ int git_config_parse_parameter(const char *text, strbuf_list_free(pair); return error("bogus config parameter: %s", text); } - strbuf_tolower(pair[0]); + canonicalize_config_variable_name(pair[0]); if (fn(pair[0]->buf, value, data) < 0) { strbuf_list_free(pair); return -1;
Re: [BUG] submodule config does not apply to upper case submodules?
On 02/15/2017 10:53 AM, Junio C Hamano wrote: Lars Schneiderwrites: It looks like as if submodule configs ("submodule.*") for submodules with upper case names are ignored. This observation is surprising, as the second level in three-level names like ".." is designed to be case sensitive. A code that uses the config API needs to do extra things to cause the behaviour you showed, i.e. to get submodule.U.update ignored while submodule.l.update to be honoured. Perhaps somebody downcases things too aggressively before comparing? This is worth making it work as expected, needless to say ;-) I had some time to look into this, and yes, command-line parameters are too aggressively downcased ("git_config_parse_parameter" calls "strbuf_tolower" on the entire key part in config.c). Updating the original patch to use "test_global_config" makes the test pass, and commenting out the "strbuf_tolower" line in config.c also makes the test pass. I'll see if I can fix this.
Re: [PATCH 6/7] grep: avoid resolving revision names in --no-index case
On 02/14/2017 10:04 AM, Jeff King wrote: On Tue, Feb 14, 2017 at 08:53:04AM -0800, Jonathan Tan wrote: On 02/13/2017 10:07 PM, Jeff King wrote: diff --git a/builtin/grep.c b/builtin/grep.c index e83b33bda..c4c632594 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -1176,6 +1176,12 @@ int cmd_grep(int argc, const char **argv, const char *prefix) break; } + if (!use_index) { + if (seen_dashdash) + die(_("--no-index cannot be used with revs")); There is a subsequent check that prints "--no-index or --untracked cannot be used with revs." - maybe we should just expand this part to incorporate that case. (That is, write `if (!use_index || untracked)` instead of `if (!use_index)`.) This also allows us to preserve the error message, which might be useful for someone using a translated version of Git. I wasn't sure if we wanted to treat "untracked" in the same way. Certainly we can catch the error here for the seen_dashdash case, but is it also the case that: echo content >master git grep --untracked pattern master should treat "master" as a path? -Peff It is already the case that it cannot be treated as a rev: $ git grep --untracked pattern master fatal: --no-index or --untracked cannot be used with revs. So I think it would be better if it was treated as a path, for consistency with --no-index. I'm OK either way, though.
Re: [PATCH 0/7] grep rev/path parsing fixes
On 02/13/2017 10:00 PM, Jeff King wrote: I've fixed that, along with a few other bugs and cleanups. The complete series is below. Patch 2 is your (untouched) patch. My suggestions for your test are in patch 3, which can either remain on its own or be squashed in. [1/7]: grep: move thread initialization a little lower [2/7]: grep: do not unnecessarily query repo for "--" [3/7]: t7810: make "--no-index --" test more robust [4/7]: grep: re-order rev-parsing loop [5/7]: grep: fix "--" rev/pathspec disambiguation [6/7]: grep: avoid resolving revision names in --no-index case [7/7]: grep: do not diagnose misspelt revs with --no-index Thanks - these look good to me. I replied to 6/7 with a comment, but I also think that these are good as-is. Also, 3/7 can probably be squashed in.
Re: [PATCH 6/7] grep: avoid resolving revision names in --no-index case
On 02/13/2017 10:07 PM, Jeff King wrote: diff --git a/builtin/grep.c b/builtin/grep.c index e83b33bda..c4c632594 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -1176,6 +1176,12 @@ int cmd_grep(int argc, const char **argv, const char *prefix) break; } + if (!use_index) { + if (seen_dashdash) + die(_("--no-index cannot be used with revs")); There is a subsequent check that prints "--no-index or --untracked cannot be used with revs." - maybe we should just expand this part to incorporate that case. (That is, write `if (!use_index || untracked)` instead of `if (!use_index)`.) This also allows us to preserve the error message, which might be useful for someone using a translated version of Git. + break; + } + if (get_sha1_with_context(arg, 0, sha1, )) { if (seen_dashdash) die(_("unable to resolve revision: %s"), arg);
[PATCH for NEXT] grep: do not unnecessarily query repo for "--"
When running a command of the form git grep --no-index pattern -- path in the absence of a Git repository, an error message will be printed: fatal: BUG: setup_git_env called without repository This is because "git grep" tries to interpret "--" as a rev. "git grep" has always tried to first interpret "--" as a rev for at least a few years, but this issue was upgraded from a pessimization to a bug in commit 59332d1 ("Resurrect "git grep --no-index"", 2010-02-06), which calls get_sha1 regardless of whether --no-index was specified. This bug appeared to be benign until commit b1ef400 ("setup_git_env: avoid blind fall-back to ".git"", 2016-10-20) when Git was taught to die in this situation. (This "git grep" bug appears to be one of the bugs that commit b1ef400 is meant to flush out.) Therefore, always interpret "--" as signaling the end of options, instead of trying to interpret it as a rev first. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- There is probably a similar bug for commands of the form: git grep --no-index pattern foo If there is a repo and "foo" is a rev, the "--no-index or --untracked cannot be used with revs." error would occur. If there is a repo and "foo" is not a rev, this command would proceed as usual. If there is no repo, the "setup_git_env called without repository" error would occur. (This is my understanding from reading the code - I haven't tested it out.) This patch does not fix this similar bug, but I decided to send it out anyway because it still fixes a bug and unlocks the ability to specify paths with "git grep --no-index". builtin/grep.c | 9 + t/t7810-grep.sh | 12 2 files changed, 17 insertions(+), 4 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index 2c727ef49..1b68d1638 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -1154,6 +1154,11 @@ int cmd_grep(int argc, const char **argv, const char *prefix) const char *arg = argv[i]; unsigned char sha1[20]; struct object_context oc; + if (!strcmp(arg, "--")) { + i++; + seen_dashdash = 1; + break; + } /* Is it a rev? */ if (!get_sha1_with_context(arg, 0, sha1, )) { struct object *object = parse_object_or_die(sha1, arg); @@ -1162,10 +1167,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix) add_object_array_with_path(object, arg, , oc.mode, oc.path); continue; } - if (!strcmp(arg, "--")) { - i++; - seen_dashdash = 1; - } break; } diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh index 19f0108f8..29202f0e7 100755 --- a/t/t7810-grep.sh +++ b/t/t7810-grep.sh @@ -982,6 +982,18 @@ test_expect_success 'grep -e -- -- path' ' test_cmp expected actual ' +test_expect_success 'grep --no-index pattern -- path' ' + rm -fr non && + mkdir -p non/git && + ( + GIT_CEILING_DIRECTORIES="$(pwd)/non" && + export GIT_CEILING_DIRECTORIES && + cd non/git && + echo hello >hello && + git grep --no-index o -- . + ) +' + cat >expected <
Re: [RFC 00/14] Allow fetch-pack to send ref names (globs allowed)
Looking back at the comments I have received in reply, I think that there were two major concerns: (i) the case where a server ACKs a client "have" line and the client forever thinks that the server has it, but it may not be the case for future servers (or future invocations of the same server), and (ii) what the client does with 2 "versions" of remote refs. For (i), the issue already exists and as far as I can tell, this patch set does not directly impact it, positively or negatively. The "have"/"ACK" part of negotiation is kept the same - the only difference in this patch set is that wants can be specified by name instead of by SHA-1 hash. This patch set does not help the "have"/"ACK" part of negotiation, but it helps the "want" part. For (ii), I have prepared a patch to be squashed, and extended the commit message with an explanation of what is happening. (The commit message and the patch are appended to this e-mail). (There was also some discussion of the client being required to send exact matches in its "want-ref" lines.) Please let me know if you have any other opinions or thoughts. It does seem to me like such a protocol update (or something similar) would help for large repositories with many ever-changing refs (like refs/changes in Gerrit or refs/pull in GitHub) - and the creation of a ref would look like a deletion depending on the order in which we access the servers in a load-balancing arrangement and the order in which those servers synchronize themselves. And deletion of refs does not work with the current protocol, but would work with a protocol that supports wildcards (like this one). -- 8< -- fetch: send want-ref and receive fetched refs Teach fetch to send refspecs to the underlying transport, and teach all components used by the HTTP transport (remote-curl, transport-helper, fetch-pack) to understand and propagate the names and SHA-1s of the refs fetched. The do_fetch method in builtin/fetch.c originally had only one remote-local ref map, generated from the already-fetched list of remote refs. This patch introduces a new ref map generated from the list of fetched refs. Usages of the list of remote refs or the remote-local ref map are updated as follows: - check_not_current_branch (which checks that the current branch is not affected by the fetch) is performed both on the original ref map (to die before the fetch if we can, as an optimization) and on the new ref map (since the new ref map is the one actually applied). - Pruning is done based on the new ref map. - backfill_tags (for tag following) is performed using the original list of remote refs because the new list of fetched refs is not guaranteed to contain tag information. (Since backfill_tags performs another fetch, it does not need to be fully consistent with the just-returned packfile.) The list of remote refs and the remote-local ref map are not otherwise used by do_fetch or any function in its invocation chain (fetch_one and cmd_fetch). --- builtin/fetch.c | 26 ++ 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 87de00e49..b8432394c 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -1177,6 +1177,20 @@ static int do_fetch(struct transport *transport, if (tags == TAGS_DEFAULT && autotags) transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, "1"); + if (fetch_refs(transport, e_rs, e_rs_nr, ref_map, _remote_refs)) { + free_refs(ref_map); + retcode = 1; + goto cleanup; + } + if (new_remote_refs) { + free_refs(ref_map); + ref_map = get_ref_map(transport->remote, new_remote_refs, + refs, ref_count, tags, autotags); + if (!update_head_ok) + check_not_current_branch(ref_map); + free_refs(new_remote_refs); + } + if (prune) { /* * We only prune based on refspecs specified @@ -1192,18 +1206,6 @@ static int do_fetch(struct transport *transport, transport->url); } } - if (fetch_refs(transport, e_rs, e_rs_nr, ref_map, _remote_refs)) { - free_refs(ref_map); - retcode = 1; - goto cleanup; - } - if (new_remote_refs) { - free_refs(ref_map); - ref_map = get_ref_map(transport->remote, new_remote_refs, - refs, ref_count, tags, autotags); - free_refs(new_remote_refs); - } - if (consume_refs(transport, ref_map)) { free_refs(ref_map); retcode = 1; -- 2.11.0.483.g087da7b7c-goog
Re: Deadlock between git-remote-http and git fetch-pack
On 01/27/2017 02:31 PM, tsuna wrote: Hi there, While investigating a hung job in our CI system today, I think I found a deadlock in git-remote-http Git version: 2.9.3 Linux (amd64) kernel 4.9.0 Excerpt from the process list: jenkins 27316 0.0 0.0 18508 6024 ?S19:30 0:00 | \_ git -C ../../../arista fetch --unshallow jenkins 27317 0.0 0.0 169608 10916 ?S19:30 0:00 | \_ git-remote-http origin http://gerrit/arista jenkins 27319 0.0 0.0 24160 8260 ?S19:30 0:00 | \_ git fetch-pack --stateless-rpc --stdin --lock-pack --include-tag --thin --no-progress --depth=2147483647 http://gerrit/arista/ Here PID 27319 (git fetch-pack) is stuck reading on stdin, while its parent, PID 27317 (git-remote-http) is stuck reading on its child’s stdout. Nothing has moved for like 2h, it’s deadlocked. strace -fp 27319 strace: Process 27319 attached read(0, Here FD 0 is a pipe: ~ @8a33a534e2f7> lsof -np 27319 | grep 0r git 27319 jenkins0r FIFO 0,10 0t0 354519158 pipe The writing end of which is owned by the parent process: ~ @8a33a534e2f7> lsof -n 2>/dev/null | fgrep 354519158 git-remot 27317jenkins4w FIFO 0,10 0t0 354519158 pipe git 27319jenkins0r FIFO 0,10 0t0 354519158 pipe And the parent process (git-remote-http) is stuck reading from another FD: strace -fp 27317 strace: Process 27317 attached read(5, And here FD 5 is another pipe: ~ @8a33a534e2f7> lsof -np 27317 | grep 5r git-remot 27317 jenkins5r FIFO 0,10 0t0 354519159 pipe Which is the child’s stdout: lsof -n 2>/dev/null | fgrep 354519159 git-remot 27317jenkins5r FIFO 0,10 0t0 354519159 pipe git 27319jenkins1w FIFO 0,10 0t0 354519159 pipe Hence the deadlock. Stack trace in git-remote-http: (gdb) bt #0 0x7f04f1e1363d in read () from target:/lib64/libpthread.so.0 #1 0x562417472d73 in xread () #2 0x562417472f2b in read_in_full () #3 0x562417438a6e in get_packet_data () #4 0x562417439129 in packet_read () #5 0x5624174245e0 in rpc_service () #6 0x562417424f10 in fetch_git () #7 0x5624174233fd in main () Stack trace in git fetch-pack: (gdb) bt #0 0x7fb3ab478620 in __read_nocancel () from target:/lib64/libpthread.so.0 #1 0x55f688827283 in xread () #2 0x55f68882743b in read_in_full () #3 0x55f6887ce35e in get_packet_data () #4 0x55f6887cea19 in packet_read () #5 0x55f6887ceb90 in packet_read_line () #6 0x55f68879dd05 in get_ack () #7 0x55f68879f6b4 in fetch_pack () #8 0x55f688710619 in cmd_fetch_pack () #9 0x55f6886dff7b in handle_builtin () #10 0x55f6886df026 in main () I looked at the diff between v2.9.3 and HEAD on fetch-pack.c and remote-curl.c and didn’t see anything noteworthy in that area of the code, so I presume the bug is still there in master. I haven't looked into this in detail, but this might be related to something I discovered while writing my patch set. I noticed that upload-pack (the process on the "other side" of fetch-pack) can die without first writing any notification, causing fetch-pack to block forever on a read. A fix would probably look like that patch [1]. [1]
Re: [RFC 03/14] upload-pack: test negotiation with changing repo
On 01/26/2017 02:33 PM, Junio C Hamano wrote: Jonathan Tan <jonathanta...@google.com> writes: diff --git a/t/lib-httpd/one-time-sed.sh b/t/lib-httpd/one-time-sed.sh new file mode 100644 index 0..060ec0300 --- /dev/null +++ b/t/lib-httpd/one-time-sed.sh @@ -0,0 +1,8 @@ +#!/bin/sh + +if [ -e one-time-sed ]; then + "$GIT_EXEC_PATH/git-http-backend" | sed "$(cat one-time-sed)" + rm one-time-sed +else + "$GIT_EXEC_PATH/git-http-backend" +fi CodingGuidelines? Thanks - done locally and will send out in the next reroll. +inconsistency() { + # Simulate that the server initially reports $2 as the ref + # corresponding to $1, and after that, $1 as the ref corresponding to + # $1. This corresponds to the real-life situation where the server's + # repository appears to change during negotiation, for example, when + # different servers in a load-balancing arrangement serve (stateless) + # RPCs during a single negotiation. + printf "s/%s/%s/" \ + $(git -C "$REPO" rev-parse $1 | tr -d "\n") \ + $(git -C "$REPO" rev-parse $2 | tr -d "\n") \ + >"$HTTPD_ROOT_PATH/one-time-sed" I'd prefer for the printf'd result to have a final LF (i.e. not leaving the resulting one-time-sed with a final incomplete line). Also, do you really need the pipe to tr-d? Doesn't the result of $(command substitution) omit the final LF anyway? $ printf '1 %s 2 %s 3\n' "$(echo foo)" "$(echo bar)"; echo OK 1 foo 2 bar 3 OK Done. diff --git a/upload-pack.c b/upload-pack.c index b88ed8e26..0678c53d6 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -862,9 +862,13 @@ static void receive_needs(struct string_list *wanted_ns_refs) } else if (skip_prefix(line, "want ", ) && !get_sha1_hex(arg, sha1_buf)) { o = parse_object(sha1_buf); - if (!o) + if (!o) { + packet_write_fmt(1, +"ERR upload-pack: not our ref %s", +sha1_to_hex(sha1_buf)); die("git upload-pack: not our ref %s", sha1_to_hex(sha1_buf)); + } This somehow looks like a good thing to do even in production. Am I mistaken? Yes, that's true. If this patch set stalls (for whatever reason), I'll spin this off into an independent patch.
Re: [RFC 02/14] upload-pack: allow ref name and glob requests
On 01/26/2017 02:23 PM, Junio C Hamano wrote: Jonathan Tan <jonathanta...@google.com> writes: Currently, while performing packfile negotiation [1], upload-pack allows clients to specify their desired objects only as SHA-1s. This causes: (a) vulnerability to failure when an object turns non-existent during negotiation, which may happen if, for example, upload-pack is provided by multiple Git servers in a load-balancing arrangement, and (b) dependence on the server first publishing a list of refs with associated objects. To eliminate (a) and take a step towards eliminating (b), teach upload-pack to support requests in the form of ref names and globs (in addition to the existing support for SHA-1s) through a new line of the form "want-ref " where ref is the full name of a ref, a glob pattern, or a SHA-1. At the conclusion of negotiation, the server will write "wanted-ref " for all requests that have been specified this way. I am not sure if this "at the conclusion of" is sensible. It is OK to assume that what the client side has is fixed, and it is probably OK to desire that what the server side has can change, but at the same time, it feels quite fragile to move the goalpost in between. Do you have any specific concerns as to this fragility? Peff mentioned some concerns with the client making some decisions based on the initial SHA-1 vs the SHA-1 reported by "wanted-ref", to which I replied [1]. Stepping back a bit, in an environment that involves multiple server instances that have inconsistent set of refs, can the negotiation even be sensibly and safely implemented? The first server the client contacts may, in response to a "have", say "I do have that commit so you do not have to send its ancestors to me. We found one cut-off point. Please do explore other lines of histories." The next server that concludes the negotiation exchange may not have that commit and will be unable to produce a pack that excludes the objects reachable from that commit---wouldn't that become a problem? It's true that this patch set wouldn't solve this problem. This problem only occurs when there is a commit that the client knows but only a few of the servers know (maybe because the client just pushed it to one of them). If, for example, the client does not know a commit and only a few of the servers know it (for example, because another user just pushed it), this patch set does help. The latter scenario seems like it would occur relatively commonly. One way to prevent such a problem from hurting clients may be for these multiple server instances to coordinate and make sure they have a shared perception of the common history among them. Some pushes may have come to one instance but may not have propagated to other instances, and such a commit cannot be accepted as usable "have" if the servers anticipate that the final client request would go to any of the servers. Otherwise the multiple server arrangement would not work safely, methinks. And if the servers are ensuring the safety using such a mechanism, they can use the same mechanism to restrain "faster" instances from sending too fresh state of refs that other instances haven't caught up to, which would mean they can present a consistent set of refs to the client in the first place, no? So I am not sure if the mechanism to request history by refname instead of the tip commit would help the multi-server environment as advertised. It may help solving other problems, though (e.g. like "somebody pushed to update after the initial advertisement was sent out" which can happen even in a single server environment). This patch set would solve the problem you describe (whether in a single server environment or the coordination between multiple servers that provides "strong consistency"). (Although it may not be an important problem to solve, since it is probably OK if the client got a "slow" version of the state of the refs.) To be flexible with respect to client needs, the server does not indicate an error if a "want-ref" line corresponds to no refs, but instead relies on the client to ensure that what the user needs has been fetched. For example, a client could reasonably expand an abbreviated name "foo" to "want-ref foo", "want-ref refs/heads/foo", "want-ref refs/tags/foo", among others, and ensure that at least one such ref has been fetched. Cute. This may be one way to implement the DWIM thing within the constraint of eventually wanting to go to "client speaks first, the server does not advertise things the client is not interested in" world. But at the same time it may end up bloating the set of refs the client asks instead. Instead of receiving the advertisement and then sending one request after picking the matching one from it, t
Re: [RFC 00/14] Allow fetch-pack to send ref names (globs allowed)
Thanks for your comments. On 01/26/2017 03:00 PM, Jeff King wrote: On Wed, Jan 25, 2017 at 02:02:53PM -0800, Jonathan Tan wrote: Negotiation currently happens by upload-pack initially sending a list of refs with names and SHA-1 hashes, and then several request/response pairs in which the request from fetch-pack consists of SHA-1 hashes (selected from the initial list). Allowing the request to consist of names instead of SHA-1 hashes increases tolerance to refs changing (due to time, and due to having load-balanced servers without strong consistency), Interesting. My big question is: what happens when a ref _does_ change? How does the client handle this? The existing uploadpack.allowReachableSHA1InWant is there to work around the problem that an http client may get a ref advertisement in one step, and then come back later to do the want/have negotiation, at which point the server has moved on (or maybe it's even a different server). There the client says "I want sha1 X", and the server needs to say "well, X isn't my tip now, but it's still acceptable for you to fetch". But this seems to go in the opposite direction. After the advertisement, the client decides "OK, I want to fetch refs/heads/master which is at SHA1 X". It connects to the server and says "I want refs/heads/master". Let's say the server has moved its version of the ref to SHA1 Y. What happens? I think the server will say "wanted-ref master Y". Does the client just decide to use "Y" then? How does that interact with any decisions the client might have made about X? I guess things like fast-forwards have to be decided after we fetch the objects anyway (since we cannot compute them until we get the pack), so maybe there aren't any such decisions. I haven't checked. Yes, the server will say "wanted-ref master Y". The relevant code regarding the decisions the client makes regarding X or Y is in do_fetch in builtin/fetch.c. There, I can see these decisions done using X: - check_not_current_branch (forbidding fetching that modifies the current branch) (I just noticed that this has to be done for Y too, and will do so) - prune refs [*] - automatic tag following [*] [*] X and Y may differ in that one relevant ref appears in one set but not in the other (because a ref was added or removed in the meantime), causing a different result if these decisions were to be done using Y, but I think that it is OK either way. Fetch optimizations (for example, everything_local in fetch-pack.c) that check if the client really needs to fetch are also done using X, of course (and if the optimization succeeds, there is no Y). Fast-forwards (and everything else in store_updated_refs) are decided using Y. and is a step towards eliminating the need for the server to send the list of refs first (possibly improving performance). I'm not sure it is all that useful towards that end. You still have to break compatibility so that the client tells the server to suppress the ref advertisement. After that, it is just a question of asking for the refs. And you have two options: 1. Ask the server to tell you the values of some subset of the refs, pick what you want, and then do the want/have as normal. 2. Go straight to the want/have, but tell the server the refs you want instead of their sha1s. I think your approach here would lead to (2). But (1), besides being closer to how the protocol works now, seems like it's more flexible. I can ask about the ref state without necessarily having to retrieve the objects. How would you write git-ls-remote with such a system? Assuming a new protocol with the appropriate backwards compatibility (which would have to be done for both options), (2) does save a request/response pair (because we can send the ref names directly as "want-ref" in the initial request instead of sending ref names in the initial request and then confirming them using "want " in the subsequent request). Also, (2) is more tolerant towards refs changing over time. (1) might be closer to the current protocol, but I think that the difference is not so significant (only in "want-ref" vs "want" request and the "wanted-ref" response). As for git-ls-remote, I currently think that it would have to use the existing protocol. [1] There has been some discussion about whether the server should accept partial ref names, e.g. [2]. In this patch set, I have made the server only accept full names, and it is the responsibility of the client to send the multiple patterns which it wants to match. Quoting from the commit message of the second patch: For example, a client could reasonably expand an abbreviated name "foo" to "want-ref foo", "want-ref refs/heads/foo", "want-ref refs/tags/foo", among others, and ensure that at least one such ref has been fetc
Re: [RFC 12/14] fetch-pack: do not printf after closing stdout
On 01/25/2017 04:50 PM, Stefan Beller wrote: On Wed, Jan 25, 2017 at 2:03 PM, Jonathan Tan <jonathanta...@google.com> wrote: In fetch-pack, during a stateless RPC, printf is invoked after stdout is closed. Update the code to not do this, preserving the existing behavior. This seems to me as if it could go as an independent bugfix(?) or refactoring as this seems to be unclear from the code? The subsequent patches in this patch set are dependent on this patch, but it's true that this could be sent out on its own first. I'm not sure if bugfix is the right word, since the existing behavior is correct (except perhaps that we rely on the fact that printf after closing stdout does effectively nothing).
[RFC 06/14] fetch: refactor to make function args narrower
Refactor find_non_local_tags and get_ref_map to only take the information they need instead of the entire transport struct. Besides improving code clarity, this also improves their flexibility, which will be needed in a future patch. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/fetch.c | 23 --- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 43e35c494..ae7c6daa1 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -220,7 +220,7 @@ static int will_fetch(struct ref **head, const unsigned char *sha1) return 0; } -static void find_non_local_tags(struct transport *transport, +static void find_non_local_tags(const struct ref *refs, struct ref **head, struct ref ***tail) { @@ -230,7 +230,7 @@ static void find_non_local_tags(struct transport *transport, struct string_list_item *item = NULL; for_each_ref(add_existing, _refs); - for (ref = transport_get_remote_refs(transport); ref; ref = ref->next) { + for (ref = refs; ref; ref = ref->next) { if (!starts_with(ref->name, "refs/tags/")) continue; @@ -302,7 +302,8 @@ static void find_non_local_tags(struct transport *transport, string_list_clear(_refs, 0); } -static struct ref *get_ref_map(struct transport *transport, +static struct ref *get_ref_map(const struct remote *remote, + const struct ref *remote_refs, struct refspec *refspecs, int refspec_count, int tags, int *autotags) { @@ -314,8 +315,6 @@ static struct ref *get_ref_map(struct transport *transport, /* opportunistically-updated references: */ struct ref *orefs = NULL, **oref_tail = - const struct ref *remote_refs = transport_get_remote_refs(transport); - struct string_list existing_refs = STRING_LIST_INIT_DUP; if (refspec_count) { @@ -355,8 +354,8 @@ static struct ref *get_ref_map(struct transport *transport, fetch_refspec = parse_fetch_refspec(refmap_nr, refmap_array); fetch_refspec_nr = refmap_nr; } else { - fetch_refspec = transport->remote->fetch; - fetch_refspec_nr = transport->remote->fetch_refspec_nr; + fetch_refspec = remote->fetch; + fetch_refspec_nr = remote->fetch_refspec_nr; } for (i = 0; i < fetch_refspec_nr; i++) @@ -365,7 +364,6 @@ static struct ref *get_ref_map(struct transport *transport, die("--refmap option is only meaningful with command-line refspec(s)."); } else { /* Use the defaults */ - struct remote *remote = transport->remote; struct branch *branch = branch_get(NULL); int has_merge = branch_has_merge_config(branch); if (remote && @@ -404,7 +402,7 @@ static struct ref *get_ref_map(struct transport *transport, /* also fetch all tags */ get_fetch_map(remote_refs, tag_refspec, , 0); else if (tags == TAGS_DEFAULT && *autotags) - find_non_local_tags(transport, _map, ); + find_non_local_tags(remote_refs, _map, ); /* Now append any refs to be updated opportunistically: */ *tail = orefs; @@ -1083,6 +1081,7 @@ static int do_fetch(struct transport *transport, struct ref *ref_map; int autotags = (transport->remote->fetch_tags == 1); int retcode = 0; + const struct ref *remote_refs; if (tags == TAGS_DEFAULT) { if (transport->remote->fetch_tags == 2) @@ -1101,7 +1100,9 @@ static int do_fetch(struct transport *transport, goto cleanup; } - ref_map = get_ref_map(transport, refs, ref_count, tags, ); + remote_refs = transport_get_remote_refs(transport); + ref_map = get_ref_map(transport->remote, remote_refs, refs, ref_count, + tags, ); if (!update_head_ok) check_not_current_branch(ref_map); @@ -1134,7 +1135,7 @@ static int do_fetch(struct transport *transport, if (tags == TAGS_DEFAULT && autotags) { struct ref **tail = _map; ref_map = NULL; - find_non_local_tags(transport, _map, ); + find_non_local_tags(remote_refs, _map, ); if (ref_map) backfill_tags(transport, ref_map); free_refs(ref_map); -- 2.11.0.483.g087da7b7c-goog
[RFC 10/14] fetch-pack: support partial names and globs
Teach fetch-pack to support partial ref names and ref patterns as input. This does not use "want-ref" yet - support for that will be added in a future patch. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/fetch-pack.c | 40 - remote.c | 55 +++ remote.h | 16 +++ t/t5500-fetch-pack.sh | 38 +++ 4 files changed, 122 insertions(+), 27 deletions(-) diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index a18fd0c44..5f14242ae 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -11,32 +11,12 @@ static const char fetch_pack_usage[] = "[--include-tag] [--upload-pack=] [--depth=] " "[--no-progress] [--diag-url] [-v] [:] [...]"; -static void add_sought_entry(const struct ref ***sought, int *nr, int *alloc, +static void add_sought_entry(struct refspec **sought, int *nr, int *alloc, const char *name) { - struct ref *ref; - struct object_id oid; - - if (!get_oid_hex(name, )) { - if (name[GIT_SHA1_HEXSZ] == ' ') { - /* , find refname */ - name += GIT_SHA1_HEXSZ + 1; - } else if (name[GIT_SHA1_HEXSZ] == '\0') { - ; /* , leave sha1 as name */ - } else { - /* , clear cruft from oid */ - oidclr(); - } - } else { - /* , clear cruft from get_oid_hex */ - oidclr(); - } - - ref = alloc_ref(name); - oidcpy(>old_oid, ); (*nr)++; ALLOC_GROW(*sought, *nr, *alloc); - (*sought)[*nr - 1] = ref; + parse_ref_or_pattern(&(*sought)[*nr - 1], name); } int cmd_fetch_pack(int argc, const char **argv, const char *prefix) @@ -44,8 +24,10 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) int i, ret; struct ref *ref = NULL; const char *dest = NULL; - const struct ref **sought = NULL; + struct refspec *sought = NULL; int nr_sought = 0, alloc_sought = 0; + const struct ref **sought_refs; + int nr_sought_refs; int fd[2]; char *pack_lockfile = NULL; char **pack_lockfile_ptr = NULL; @@ -195,8 +177,9 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) return args.diag_url ? 0 : 1; } get_remote_heads(fd[0], NULL, 0, , 0, NULL, ); + get_ref_array(_refs, _sought_refs, ref, sought, nr_sought); - ref = fetch_pack(, fd, conn, ref, dest, sought, nr_sought, + ref = fetch_pack(, fd, conn, ref, dest, sought_refs, nr_sought_refs, , pack_lockfile_ptr); if (pack_lockfile) { printf("lock %s\n", pack_lockfile); @@ -222,12 +205,15 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) */ for (i = 0; i < nr_sought; i++) { struct ref *r; - for (r = ref; r; r = r->next) - if (!sought[i] || refname_match(sought[i]->name, r->name)) + if (sought[i].pattern) + continue; /* patterns do not need to match anything */ + for (r = ref; r; r = r->next) { + if (refname_match(sought[i].src, r->name)) break; + } if (r) continue; - error("no such remote ref %s", sought[i]->name); + error("no such remote ref %s", sought[i].src); ret = 1; } diff --git a/remote.c b/remote.c index 725a2d39a..08f3c910e 100644 --- a/remote.c +++ b/remote.c @@ -612,6 +612,39 @@ static struct refspec *parse_refspec_internal(int nr_refspec, const char **refsp die("Invalid refspec '%s'", refspec[i]); } +void parse_ref_or_pattern(struct refspec *refspec, const char *str) +{ + struct object_id oid; + memset(refspec, 0, sizeof(*refspec)); + + if (!get_oid_hex(str, )) { + if (str[GIT_SHA1_HEXSZ] == ' ') { + struct object_id oid2; + /* , find refname */ + refspec->src = xstrdup(str + GIT_SHA1_HEXSZ + 1); + if (!get_oid_hex(refspec->src, ) + && !oidcmp(, )) + /* The name is actually a SHA-1 */ + refspec->exact_sha1 = 1; + } else if (str[GIT_SHA1_HEXSZ] == '\0') { + ; /* , leave sha1 as name */ + refspec->src = xstrdup(str); + refspec->exact_sha1 = 1; +
[RFC 08/14] fetch-pack: check returned refs for matches
Instead of setting "matched" on matched refs, detect matches by checking the sought refs against the returned refs. Also, since the "matched" field in struct ref is now no longer used, remove it. This is the 2nd of 3 patches to eliminate using input refs to communicate information obtained by the fetch mechanism. (There are possibly ways more efficient than a nested for loop to accomplish this. However, since a subsequent patch will compare the user-provided refspecs against the fetched refs directly, and thus necessitating the nested for loop anyway, I decided to use the nested for loop in this patch.) Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/fetch-pack.c | 7 ++- fetch-pack.c | 9 ++--- fetch-pack.h | 2 -- remote.h | 3 +-- 4 files changed, 13 insertions(+), 8 deletions(-) diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index cfe9e447c..f31f874a0 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -4,6 +4,7 @@ #include "remote.h" #include "connect.h" #include "sha1-array.h" +#include "refs.h" static const char fetch_pack_usage[] = "git fetch-pack [--all] [--stdin] [--quiet | -q] [--keep | -k] [--thin] " @@ -220,7 +221,11 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) * an error. */ for (i = 0; i < nr_sought; i++) { - if (!sought[i] || sought[i]->matched) + struct ref *r; + for (r = ref; r; r = r->next) + if (!sought[i] || refname_match(sought[i]->name, r->name)) + break; + if (r) continue; error("no such remote ref %s", sought[i]->name); ret = 1; diff --git a/fetch-pack.c b/fetch-pack.c index 9a87ddd3d..d4642b05c 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -562,6 +562,7 @@ static void filter_refs(struct fetch_pack_args *args, struct ref **newtail = struct ref *ref, *next; int i; + int *matched = xcalloc(nr_sought, sizeof(*matched)); i = 0; for (ref = *refs; ref; ref = next) { @@ -578,7 +579,7 @@ static void filter_refs(struct fetch_pack_args *args, break; /* definitely do not have it */ else if (cmp == 0) { keep = 1; /* definitely have it */ - sought[i]->matched = 1; + matched[i] = 1; } i++; } @@ -604,19 +605,21 @@ static void filter_refs(struct fetch_pack_args *args, unsigned char sha1[20]; ref = sought[i]; - if (ref->matched) + if (matched[i]) continue; if (get_sha1_hex(ref->name, sha1) || ref->name[40] != '\0' || hashcmp(sha1, ref->old_oid.hash)) continue; - ref->matched = 1; + matched[i] = 1; *newtail = copy_ref(ref); newtail = &(*newtail)->next; } } *refs = newlist; + + free(matched); } static void mark_alternate_complete(const struct ref *ref, void *unused) diff --git a/fetch-pack.h b/fetch-pack.h index c912e3d32..76f7c719c 100644 --- a/fetch-pack.h +++ b/fetch-pack.h @@ -33,8 +33,6 @@ struct fetch_pack_args { /* * sought represents remote references that should be updated from. - * On return, the names that were found on the remote will have been - * marked as such. */ struct ref *fetch_pack(struct fetch_pack_args *args, int fd[], struct child_process *conn, diff --git a/remote.h b/remote.h index 57d059431..2f7f23d47 100644 --- a/remote.h +++ b/remote.h @@ -89,8 +89,7 @@ struct ref { force:1, forced_update:1, expect_old_sha1:1, - deletion:1, - matched:1; + deletion:1; /* * Order is important here, as we write to FETCH_HEAD -- 2.11.0.483.g087da7b7c-goog
[RFC 09/14] transport: put ref oid in out param
Return new OID information obtained through fetching in new structs instead of reusing the existing ones. With this change, the input structs are no longer used for output, and are thus marked const. This is the 3rd of 3 patches to eliminate using input refs to communicate information obtained by the fetch mechanism. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/clone.c | 14 -- builtin/fetch-pack.c | 4 ++-- fetch-pack.c | 26 +++--- fetch-pack.h | 2 +- transport-helper.c | 34 +++--- transport.c | 6 +++--- transport.h | 13 + 7 files changed, 61 insertions(+), 38 deletions(-) diff --git a/builtin/clone.c b/builtin/clone.c index 0135c5f1c..3191da391 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -858,6 +858,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix) struct refspec *refspec; const char *fetch_pattern; + struct ref *new_remote_refs = NULL; + packet_trace_identity("clone"); argc = parse_options(argc, argv, prefix, builtin_clone_options, builtin_clone_usage, 0); @@ -1075,8 +1077,15 @@ int cmd_clone(int argc, const char **argv, const char *prefix) break; } - if (!is_local && !complete_refs_before_fetch) - transport_fetch_refs(transport, mapped_refs, NULL); + if (!is_local && !complete_refs_before_fetch) { + transport_fetch_refs(transport, mapped_refs, +_remote_refs); + if (new_remote_refs) { + refs = new_remote_refs; + free_refs(mapped_refs); + mapped_refs = wanted_peer_refs(refs, refspec); + } + } remote_head = find_ref_by_name(refs, "HEAD"); remote_head_points_at = @@ -1148,5 +1157,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix) junk_mode = JUNK_LEAVE_ALL; free(refspec); + free_refs(new_remote_refs); return err; } diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index f31f874a0..a18fd0c44 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -11,7 +11,7 @@ static const char fetch_pack_usage[] = "[--include-tag] [--upload-pack=] [--depth=] " "[--no-progress] [--diag-url] [-v] [:] [...]"; -static void add_sought_entry(struct ref ***sought, int *nr, int *alloc, +static void add_sought_entry(const struct ref ***sought, int *nr, int *alloc, const char *name) { struct ref *ref; @@ -44,7 +44,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) int i, ret; struct ref *ref = NULL; const char *dest = NULL; - struct ref **sought = NULL; + const struct ref **sought = NULL; int nr_sought = 0, alloc_sought = 0; int fd[2]; char *pack_lockfile = NULL; diff --git a/fetch-pack.c b/fetch-pack.c index d4642b05c..8cc85c19f 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -52,6 +52,10 @@ static int non_common_revs, multi_ack, use_sideband; #define ALLOW_REACHABLE_SHA1 02 static unsigned int allow_unadvertised_object_request; +/* An arbitrary non-NULL non-const pointer to assign to the util field of + * string list items when we need one. */ +#define ARBITRARY (_unpack_limit) + __attribute__((format (printf, 2, 3))) static inline void print_verbose(const struct fetch_pack_args *args, const char *fmt, ...) @@ -556,7 +560,7 @@ static void mark_recent_complete_commits(struct fetch_pack_args *args, static void filter_refs(struct fetch_pack_args *args, struct ref **refs, - struct ref **sought, int nr_sought) + const struct ref **sought, int nr_sought) { struct ref *newlist = NULL; struct ref **newtail = @@ -604,16 +608,16 @@ static void filter_refs(struct fetch_pack_args *args, for (i = 0; i < nr_sought; i++) { unsigned char sha1[20]; - ref = sought[i]; + const struct ref *sref = sought[i]; if (matched[i]) continue; - if (get_sha1_hex(ref->name, sha1) || - ref->name[40] != '\0' || - hashcmp(sha1, ref->old_oid.hash)) + if (get_sha1_hex(sref->name, sha1) || + sref->name[40] != '\0' || + hashcmp(sha1, sref->old_oid.hash)) c
[RFC 14/14] DONT USE advertise_ref_in_want=1
--- t/t5500-fetch-pack.sh | 2 ++ upload-pack.c | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh index 18fe23c97..f39dbcab8 100755 --- a/t/t5500-fetch-pack.sh +++ b/t/t5500-fetch-pack.sh @@ -551,6 +551,7 @@ test_expect_success 'fetch-pack can fetch refs using a partial name' ' git init server && ( cd server && + git config uploadpack.advertiseRefInWant false test_commit 1 && test_commit 2 && git checkout -b one @@ -587,6 +588,7 @@ test_expect_success 'fetch-pack can fetch refs using a glob' ' git init server && ( cd server && + git config uploadpack.advertiseRefInWant false test_commit 1 && test_commit 2 && git checkout -b ona && diff --git a/upload-pack.c b/upload-pack.c index 0678c53d6..4998a8c7e 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -62,7 +62,7 @@ static int use_sideband; static int advertise_refs; static int stateless_rpc; static const char *pack_objects_hook; -static int advertise_ref_in_want; +static int advertise_ref_in_want = 1; static void reset_timeout(void) { -- 2.11.0.483.g087da7b7c-goog
[RFC 07/14] fetch-pack: put shallow info in out param
Expand the transport fetch method signature, by adding an out parameter, to allow transports to return information about the refs they have fetched. Then communicate shallow status information through this mechanism instead of by modifying the input list of refs. This does require clients to sometimes generate the ref map twice: one generation from the list of refs provided by the remote (as is currently done) and potentially one regeneration from the new list of refs that the fetch mechanism provides (added in this patch). The double generation may result in duplicate error messages when a remote-tracking branch seems to track more than one remote branch. This is the 1st of 3 patches to eliminate using input refs to communicate information obtained by the fetch mechanism. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/clone.c| 4 ++-- builtin/fetch.c| 23 +++ fetch-pack.c | 18 +++--- remote.c | 12 +++- remote.h | 1 + t/t5536-fetch-conflicts.sh | 2 ++ transport-helper.c | 5 +++-- transport.c| 32 ++-- transport.h| 12 ++-- 9 files changed, 85 insertions(+), 24 deletions(-) diff --git a/builtin/clone.c b/builtin/clone.c index 5ef81927a..0135c5f1c 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -1076,7 +1076,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) } if (!is_local && !complete_refs_before_fetch) - transport_fetch_refs(transport, mapped_refs); + transport_fetch_refs(transport, mapped_refs, NULL); remote_head = find_ref_by_name(refs, "HEAD"); remote_head_points_at = @@ -1115,7 +1115,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) if (is_local) clone_local(path, git_dir); else if (refs && complete_refs_before_fetch) - transport_fetch_refs(transport, mapped_refs); + transport_fetch_refs(transport, mapped_refs, NULL); update_remote_refs(refs, mapped_refs, remote_head_points_at, branch_top.buf, reflog_msg.buf, transport, !is_local); diff --git a/builtin/fetch.c b/builtin/fetch.c index ae7c6daa1..19e3f40a0 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -911,11 +911,13 @@ static int quickfetch(struct ref *ref_map) return check_connected(iterate_ref_map, , ); } -static int fetch_refs(struct transport *transport, struct ref *ref_map) +static int fetch_refs(struct transport *transport, struct ref *ref_map, + struct ref **updated_remote_refs) { int ret = quickfetch(ref_map); if (ret) - ret = transport_fetch_refs(transport, ref_map); + ret = transport_fetch_refs(transport, ref_map, + updated_remote_refs); if (ret) transport_unlock_pack(transport); return ret; @@ -1066,7 +1068,7 @@ static void backfill_tags(struct transport *transport, struct ref *ref_map) transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, NULL); transport_set_option(transport, TRANS_OPT_DEPTH, "0"); transport_set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, NULL); - if (!fetch_refs(transport, ref_map)) + if (!fetch_refs(transport, ref_map, NULL)) consume_refs(transport, ref_map); if (gsecondary) { @@ -1082,6 +1084,7 @@ static int do_fetch(struct transport *transport, int autotags = (transport->remote->fetch_tags == 1); int retcode = 0; const struct ref *remote_refs; + struct ref *new_remote_refs = NULL; if (tags == TAGS_DEFAULT) { if (transport->remote->fetch_tags == 2) @@ -1123,7 +1126,19 @@ static int do_fetch(struct transport *transport, transport->url); } } - if (fetch_refs(transport, ref_map) || consume_refs(transport, ref_map)) { + if (fetch_refs(transport, ref_map, _remote_refs)) { + free_refs(ref_map); + retcode = 1; + goto cleanup; + } + if (new_remote_refs) { + free_refs(ref_map); + ref_map = get_ref_map(transport->remote, new_remote_refs, + refs, ref_count, tags, ); + free_refs(new_remote_refs); + } + + if (consume_refs(transport, ref_map)) { free_refs(ref_map); retcode = 1; goto cleanup; diff --git a/fetch-pack.c b/fetch-pack.c index 601f0779a..9a87ddd3d 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -974,12 +974,13 @@ static int remove_duplicates_in_refs(struct
[RFC 11/14] fetch-pack: support want-ref
Teach fetch-pack to use the want-ref mechanism whenever the server advertises it. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/fetch-pack.c | 5 +- fetch-pack.c | 173 -- fetch-pack.h | 2 + t/t5500-fetch-pack.sh | 42 transport.c | 2 +- 5 files changed, 175 insertions(+), 49 deletions(-) diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index 5f14242ae..ae073ab24 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -179,8 +179,9 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) get_remote_heads(fd[0], NULL, 0, , 0, NULL, ); get_ref_array(_refs, _sought_refs, ref, sought, nr_sought); - ref = fetch_pack(, fd, conn, ref, dest, sought_refs, nr_sought_refs, -, pack_lockfile_ptr); + ref = fetch_pack(, fd, conn, ref, dest, sought, nr_sought, +sought_refs, nr_sought_refs, , +pack_lockfile_ptr); if (pack_lockfile) { printf("lock %s\n", pack_lockfile); fflush(stdout); diff --git a/fetch-pack.c b/fetch-pack.c index 8cc85c19f..02149c930 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -219,11 +219,19 @@ static void consume_shallow_list(struct fetch_pack_args *args, int fd) } } -static enum ack_type get_ack(int fd, unsigned char *result_sha1) +/* + * Reads an ACK or NAK from fd. If wanted_ref_tail is not NULL, also accepts + * any "wanted-ref" lines before that ACK or NAK, writing them to + * wanted_ref_tail. + */ +static enum ack_type get_ack(int fd, unsigned char *result_sha1, +struct ref ***wanted_ref_tail) { int len; - char *line = packet_read_line(fd, ); + char *line; const char *arg; +start: + line = packet_read_line(fd, ); if (!len) die(_("git fetch-pack: expected ACK/NAK, got EOF")); @@ -244,7 +252,19 @@ static enum ack_type get_ack(int fd, unsigned char *result_sha1) return ACK; } } - die(_("git fetch-pack: expected ACK/NAK, got '%s'"), line); + if (wanted_ref_tail) { + struct object_id oid; + if (skip_prefix(line, "wanted-ref ", ) && + !get_sha1_hex(arg, oid.hash) && arg[40] == ' ' && arg[41]) { + struct ref *ref = alloc_ref(arg + 41); + oidcpy(>old_oid, ); + **wanted_ref_tail = ref; + *wanted_ref_tail = >next; + goto start; + } + die(_("git fetch_pack: expected ACK/NAK or wanted-ref, got '%s'"), line); + } + die(_("git fetch_pack: expected ACK/NAK, got '%s'"), line); } static void send_request(struct fetch_pack_args *args, @@ -282,29 +302,55 @@ static int next_flush(struct fetch_pack_args *args, int count) return count; } -static int find_common(struct fetch_pack_args *args, - int fd[2], unsigned char *result_sha1, - struct ref *refs) +static void write_capabilities(struct strbuf *sb, + const struct fetch_pack_args *args) { - int fetching; - int count = 0, flushes = 0, flush_at = INITIAL_FLUSH, retval; - const unsigned char *sha1; - unsigned in_vain = 0; - int got_continue = 0; - int got_ready = 0; - struct strbuf req_buf = STRBUF_INIT; - size_t state_len = 0; + if (multi_ack == 2) strbuf_addstr(sb, " multi_ack_detailed"); + if (multi_ack == 1) strbuf_addstr(sb, " multi_ack"); + if (no_done)strbuf_addstr(sb, " no-done"); + if (use_sideband == 2) strbuf_addstr(sb, " side-band-64k"); + if (use_sideband == 1) strbuf_addstr(sb, " side-band"); + if (args->deepen_relative) strbuf_addstr(sb, " deepen-relative"); + if (args->use_thin_pack) strbuf_addstr(sb, " thin-pack"); + if (args->no_progress) strbuf_addstr(sb, " no-progress"); + if (args->include_tag) strbuf_addstr(sb, " include-tag"); + if (prefer_ofs_delta) strbuf_addstr(sb, " ofs-delta"); + if (deepen_since_ok)strbuf_addstr(sb, " deepen-since"); + if (deepen_not_ok) strbuf_addstr(sb, " deepen-not"); + if (agent_supported)strbuf_addf(sb, " agent=%s", + git_user_agent_sanitized()); +} - if (args->stateless_rpc && multi_ack == 1) - die(_("--stateless-rpc requires multi_ack_detailed")); - if (marked) -
[RFC 03/14] upload-pack: test negotiation with changing repo
Make upload-pack report "not our ref" errors to the client. (If not, the client would be left waiting for a response when the server is already dead.) Add tests to check the behavior of upload-pack and fetch-pack when upload-pack is served from a changing repository (for example, when different servers in a load-balancing agreement participate in the same stateless RPC negotiation). This forms a baseline of comparison to the ref-in-want functionality (which will be introduced in subsequent commits), and ensures that subsequent commits do not change existing behavior. As part of this effort, a mechanism to substitute strings in an HTTP response only on the first invocation is added. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- t/lib-httpd.sh | 1 + t/lib-httpd/apache.conf| 8 t/lib-httpd/one-time-sed.sh| 8 t/t5552-upload-pack-ref-in-want.sh | 91 ++ upload-pack.c | 6 ++- 5 files changed, 113 insertions(+), 1 deletion(-) create mode 100644 t/lib-httpd/one-time-sed.sh diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh index 435a37465..84f8efdd4 100644 --- a/t/lib-httpd.sh +++ b/t/lib-httpd.sh @@ -132,6 +132,7 @@ prepare_httpd() { cp "$TEST_PATH"/passwd "$HTTPD_ROOT_PATH" install_script broken-smart-http.sh install_script error.sh + install_script one-time-sed.sh ln -s "$LIB_HTTPD_MODULE_PATH" "$HTTPD_ROOT_PATH/modules" diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf index 69174c6e3..ef218ff15 100644 --- a/t/lib-httpd/apache.conf +++ b/t/lib-httpd/apache.conf @@ -106,9 +106,14 @@ Alias /auth/dumb/ www/auth/dumb/ SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH} SetEnv GIT_HTTP_EXPORT_ALL + + SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH} + SetEnv GIT_HTTP_EXPORT_ALL + ScriptAliasMatch /smart_*[^/]*/(.*) ${GIT_EXEC_PATH}/git-http-backend/$1 ScriptAlias /broken_smart/ broken-smart-http.sh/ ScriptAlias /error/ error.sh/ +ScriptAliasMatch /one_time_sed/(.*) one-time-sed.sh/$1 Options FollowSymlinks @@ -118,6 +123,9 @@ ScriptAlias /error/ error.sh/ Options ExecCGI + + Options ExecCGI + Options ExecCGI diff --git a/t/lib-httpd/one-time-sed.sh b/t/lib-httpd/one-time-sed.sh new file mode 100644 index 0..060ec0300 --- /dev/null +++ b/t/lib-httpd/one-time-sed.sh @@ -0,0 +1,8 @@ +#!/bin/sh + +if [ -e one-time-sed ]; then + "$GIT_EXEC_PATH/git-http-backend" | sed "$(cat one-time-sed)" + rm one-time-sed +else + "$GIT_EXEC_PATH/git-http-backend" +fi diff --git a/t/t5552-upload-pack-ref-in-want.sh b/t/t5552-upload-pack-ref-in-want.sh index 3496af641..80cf2263a 100755 --- a/t/t5552-upload-pack-ref-in-want.sh +++ b/t/t5552-upload-pack-ref-in-want.sh @@ -292,4 +292,95 @@ test_expect_success 'hideRefs with namespaces' ' check_output ' +. "$TEST_DIRECTORY"/lib-httpd.sh +start_httpd + +REPO="$HTTPD_DOCUMENT_ROOT_PATH/repo" +LOCAL_PRISTINE="$(pwd)/local_pristine" + +test_expect_success 'setup repos for change-while-negotiating test' ' + ( + git init "$REPO" && + cd "$REPO" && + >.git/git-daemon-export-ok && + test_commit m1 && + git tag -d m1 && + + # Local repo with many commits (so that negotiation will take + # more than 1 request/response pair) + git clone "http://127.0.0.1:$LIB_HTTPD_PORT/smart/repo; "$LOCAL_PRISTINE" && + cd "$LOCAL_PRISTINE" && + git checkout -b side && + for i in $(seq 1 33); do test_commit s$i; done && + + # Add novel commits to upstream + git checkout master && + cd "$REPO" && + test_commit m2 && + test_commit m3 && + git tag -d m2 m3 + ) && + git -C "$LOCAL_PRISTINE" remote set-url origin "http://127.0.0.1:$LIB_HTTPD_PORT/one_time_sed/repo; +' + +inconsistency() { + # Simulate that the server initially reports $2 as the ref + # corresponding to $1, and after that, $1 as the ref corresponding to + # $1. This corresponds to the real-life situation where the server's + # repository appears to change during negotiation, for example, when + # different servers in a load-balancing arrangement serve (stateless) + # RPCs during a single negotiation. + printf "s/%s/%s/" \ + $(git -C "$REPO" rev-parse $1 | tr -d "\n") \ + $(git -C "$REPO" rev-parse $2 | tr -d "\n") \ +
[RFC 05/14] fetch: refactor fetch_refs into two functions
Refactor the fetch_refs function into a function that does the fetching of refs and another function that stores them. This prepares for a future patch where some processing may be done between those tasks. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/fetch.c | 19 +-- 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index c71d5eb9b..43e35c494 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -918,10 +918,16 @@ static int fetch_refs(struct transport *transport, struct ref *ref_map) int ret = quickfetch(ref_map); if (ret) ret = transport_fetch_refs(transport, ref_map); - if (!ret) - ret |= store_updated_refs(transport->url, - transport->remote->name, - ref_map); + if (ret) + transport_unlock_pack(transport); + return ret; +} + +static int consume_refs(struct transport *transport, struct ref *ref_map) +{ + int ret = store_updated_refs(transport->url, +transport->remote->name, +ref_map); transport_unlock_pack(transport); return ret; } @@ -1062,7 +1068,8 @@ static void backfill_tags(struct transport *transport, struct ref *ref_map) transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, NULL); transport_set_option(transport, TRANS_OPT_DEPTH, "0"); transport_set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, NULL); - fetch_refs(transport, ref_map); + if (!fetch_refs(transport, ref_map)) + consume_refs(transport, ref_map); if (gsecondary) { transport_disconnect(gsecondary); @@ -1115,7 +1122,7 @@ static int do_fetch(struct transport *transport, transport->url); } } - if (fetch_refs(transport, ref_map)) { + if (fetch_refs(transport, ref_map) || consume_refs(transport, ref_map)) { free_refs(ref_map); retcode = 1; goto cleanup; -- 2.11.0.483.g087da7b7c-goog
[RFC 12/14] fetch-pack: do not printf after closing stdout
In fetch-pack, during a stateless RPC, printf is invoked after stdout is closed. Update the code to not do this, preserving the existing behavior. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/fetch-pack.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index ae073ab24..24af3b7c5 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -191,10 +191,10 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) printf("connectivity-ok\n"); fflush(stdout); } - close(fd[0]); - close(fd[1]); - if (finish_connect(conn)) - return 1; + if (finish_connect(conn)) { + ret = 1; + goto cleanup; + } ret = !ref; @@ -218,11 +218,17 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) ret = 1; } + if (args.stateless_rpc) + goto cleanup; + while (ref) { printf("%s %s\n", oid_to_hex(>old_oid), ref->name); ref = ref->next; } +cleanup: + close(fd[0]); + close(fd[1]); return ret; } -- 2.11.0.483.g087da7b7c-goog
[RFC 13/14] fetch: send want-ref and receive fetched refs
Teach fetch to send refspecs to the underlying transport, and teach all components used by the HTTP transport (remote-curl, transport-helper, fetch-pack) to understand and propagate the names and SHA-1s of the refs fetched. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/clone.c| 4 +- builtin/fetch-pack.c | 8 ++- builtin/fetch.c| 100 ++--- remote-curl.c | 91 - t/t5552-upload-pack-ref-in-want.sh | 4 +- transport-helper.c | 74 +-- transport.c| 10 ++-- transport.h| 20 +--- 8 files changed, 229 insertions(+), 82 deletions(-) diff --git a/builtin/clone.c b/builtin/clone.c index 3191da391..765a3a3b6 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -1078,7 +1078,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) } if (!is_local && !complete_refs_before_fetch) { - transport_fetch_refs(transport, mapped_refs, + transport_fetch_refs(transport, NULL, 0, mapped_refs, _remote_refs); if (new_remote_refs) { refs = new_remote_refs; @@ -1124,7 +1124,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) if (is_local) clone_local(path, git_dir); else if (refs && complete_refs_before_fetch) - transport_fetch_refs(transport, mapped_refs, NULL); + transport_fetch_refs(transport, NULL, 0, mapped_refs, NULL); update_remote_refs(refs, mapped_refs, remote_head_points_at, branch_top.buf, reflog_msg.buf, transport, !is_local); diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index 24af3b7c5..ed1608c12 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -35,6 +35,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) struct fetch_pack_args args; struct sha1_array shallow = SHA1_ARRAY_INIT; struct string_list deepen_not = STRING_LIST_INIT_DUP; + int always_print_refs = 0; packet_trace_identity("fetch-pack"); @@ -126,6 +127,10 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) args.update_shallow = 1; continue; } + if (!strcmp("--always-print-refs", arg)) { + always_print_refs = 1; + continue; + } usage(fetch_pack_usage); } if (deepen_not.nr) @@ -218,7 +223,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) ret = 1; } - if (args.stateless_rpc) + if (args.stateless_rpc && !always_print_refs) goto cleanup; while (ref) { @@ -226,6 +231,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) oid_to_hex(>old_oid), ref->name); ref = ref->next; } + fflush(stdout); cleanup: close(fd[0]); diff --git a/builtin/fetch.c b/builtin/fetch.c index 19e3f40a0..87de00e49 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -302,10 +302,75 @@ static void find_non_local_tags(const struct ref *refs, string_list_clear(_refs, 0); } +static void get_effective_refspecs(struct refspec **e_rs, int *e_rs_nr, + const struct remote *remote, + const struct refspec *cli_rs, int cli_rs_nr, + int tags, int *autotags) +{ + static struct refspec head_refspec; + + const struct refspec *base_rs; + int base_rs_nr; + struct branch *merge_branch = NULL; + int i; + + struct refspec *rs; + int nr, alloc; + + if (cli_rs_nr) { + base_rs = cli_rs; + base_rs_nr = cli_rs_nr; + } else if (refmap_array) { + die("--refmap option is only meaningful with command-line refspec(s)."); + } else { + /* Use the defaults */ + struct branch *branch = branch_get(NULL); + int has_merge = branch_has_merge_config(branch); + /* Note: has_merge implies non-NULL branch->remote_name */ + if (has_merge && !strcmp(branch->remote_name, remote->name)) + /* +* if the remote we're fetching from is the same +* as given in branch..remote, we add the +* ref given in branch..merge, too. +*/ +
[RFC 00/14] Allow fetch-pack to send ref names (globs allowed)
Hello everyone - this is a proposal for a protocol change to allow the fetch-pack/upload-pack to converse in terms of ref names (globs allowed), and also an implementation of the server (upload-pack) and fetch-from-HTTP client (fetch-pack invoked through fetch). Negotiation currently happens by upload-pack initially sending a list of refs with names and SHA-1 hashes, and then several request/response pairs in which the request from fetch-pack consists of SHA-1 hashes (selected from the initial list). Allowing the request to consist of names instead of SHA-1 hashes increases tolerance to refs changing (due to time, and due to having load-balanced servers without strong consistency), and is a step towards eliminating the need for the server to send the list of refs first (possibly improving performance). The protocol is extended by allowing fetch-pack to send "want-ref ", where is a full name (refs/...) [1], possibly including glob characters. Upload-pack will inform the client of the refs actually matched by sending "wanted-ref " before sending the last ACK or NAK. This patch set is laid out in this way: 1-2: Upload-pack, protocol documentation, tests that test upload-pack independently. A configuration option is added to control if the "ref-in-want" capability is advertised. (It is always supported even if not advertised - this is so that this feature in multiple load-balanced servers can be switched on or off without needing any atomic switching.) 3: Mechanism to test a repo that changes over the negotiation (currently, only with the existing mechanism). 4-9: The current transport mechanism takes in an array of refs which is used as both input and output. Since we are planning to extend the transport mechanism to also allow the specification of ref names (which may include globs, and thus do not have a 1-1 correspondence to refs), refactor to make this parameter to be solely an input parameter. 10-11: Changes to fetch-pack (which is used by remote-curl) to support "want-ref". 12-13: Changes to the rest (fetch -> transport -> transport-helper -> remote-curl) to support "want-ref" when fetching from HTTP. The transport fetch function signature has been widened to allow passing in ref names - transports may use those ref names instead of or in addition to refs if they support it. (I chose to preserve refs in the function signature because many parts of Git, including remote helpers, only understand the old functionality anyway, and because precalculating the refs allows some optimizations.) 14: This is not meant for submission - this is just to show that the tests pass if ref-in-want was advertised by default (except for some newly added tests that explicitly check for the old behavior). [1] There has been some discussion about whether the server should accept partial ref names, e.g. [2]. In this patch set, I have made the server only accept full names, and it is the responsibility of the client to send the multiple patterns which it wants to match. Quoting from the commit message of the second patch: For example, a client could reasonably expand an abbreviated name "foo" to "want-ref foo", "want-ref refs/heads/foo", "want-ref refs/tags/foo", among others, and ensure that at least one such ref has been fetched. [2] <20161024132932.i42rqn2vlpocq...@sigill.intra.peff.net> Jonathan Tan (14): upload-pack: move parsing of "want" line upload-pack: allow ref name and glob requests upload-pack: test negotiation with changing repo fetch: refactor the population of hashes fetch: refactor fetch_refs into two functions fetch: refactor to make function args narrower fetch-pack: put shallow info in out param fetch-pack: check returned refs for matches transport: put ref oid in out param fetch-pack: support partial names and globs fetch-pack: support want-ref fetch-pack: do not printf after closing stdout fetch: send want-ref and receive fetched refs DONT USE advertise_ref_in_want=1 Documentation/technical/http-protocol.txt | 20 +- Documentation/technical/pack-protocol.txt | 24 +- Documentation/technical/protocol-capabilities.txt | 6 + builtin/clone.c | 16 +- builtin/fetch-pack.c | 64 ++-- builtin/fetch.c | 178 +++--- fetch-pack.c | 226 + fetch-pack.h | 6 +- remote-curl.c | 91 +++-- remote.c | 67 +++- remote.h | 20 +- t/lib-httpd.sh| 1 + t/lib-httpd/apache.conf | 8 + t/lib-httpd/one-time-sed.sh
[RFC 04/14] fetch: refactor the population of hashes
Populate SHA-1 ref hashes in get_ref_map instead of do_fetch. Besides tightening scopes of variables in the code, this also prepares for a future patch where get_ref_map is called multiple times within do_fetch. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/fetch.c | 37 +++-- 1 file changed, 19 insertions(+), 18 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index f1570e346..c71d5eb9b 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -316,6 +316,8 @@ static struct ref *get_ref_map(struct transport *transport, const struct ref *remote_refs = transport_get_remote_refs(transport); + struct string_list existing_refs = STRING_LIST_INIT_DUP; + if (refspec_count) { struct refspec *fetch_refspec; int fetch_refspec_nr; @@ -411,7 +413,23 @@ static struct ref *get_ref_map(struct transport *transport, tail = >next; } - return ref_remove_duplicates(ref_map); + ref_map = ref_remove_duplicates(ref_map); + + for_each_ref(add_existing, _refs); + for (rm = ref_map; rm; rm = rm->next) { + if (rm->peer_ref) { + struct string_list_item *peer_item = + string_list_lookup(_refs, + rm->peer_ref->name); + if (peer_item) { + struct object_id *old_oid = peer_item->util; + oidcpy(>peer_ref->old_oid, old_oid); + } + } + } + string_list_clear(_refs, 1); + + return ref_map; } #define STORE_REF_ERROR_OTHER 1 @@ -1055,14 +1073,10 @@ static void backfill_tags(struct transport *transport, struct ref *ref_map) static int do_fetch(struct transport *transport, struct refspec *refs, int ref_count) { - struct string_list existing_refs = STRING_LIST_INIT_DUP; struct ref *ref_map; - struct ref *rm; int autotags = (transport->remote->fetch_tags == 1); int retcode = 0; - for_each_ref(add_existing, _refs); - if (tags == TAGS_DEFAULT) { if (transport->remote->fetch_tags == 2) tags = TAGS_SET; @@ -1084,18 +1098,6 @@ static int do_fetch(struct transport *transport, if (!update_head_ok) check_not_current_branch(ref_map); - for (rm = ref_map; rm; rm = rm->next) { - if (rm->peer_ref) { - struct string_list_item *peer_item = - string_list_lookup(_refs, - rm->peer_ref->name); - if (peer_item) { - struct object_id *old_oid = peer_item->util; - oidcpy(>peer_ref->old_oid, old_oid); - } - } - } - if (tags == TAGS_DEFAULT && autotags) transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, "1"); if (prune) { @@ -1132,7 +1134,6 @@ static int do_fetch(struct transport *transport, } cleanup: - string_list_clear(_refs, 1); return retcode; } -- 2.11.0.483.g087da7b7c-goog
[RFC 02/14] upload-pack: allow ref name and glob requests
Currently, while performing packfile negotiation [1], upload-pack allows clients to specify their desired objects only as SHA-1s. This causes: (a) vulnerability to failure when an object turns non-existent during negotiation, which may happen if, for example, upload-pack is provided by multiple Git servers in a load-balancing arrangement, and (b) dependence on the server first publishing a list of refs with associated objects. To eliminate (a) and take a step towards eliminating (b), teach upload-pack to support requests in the form of ref names and globs (in addition to the existing support for SHA-1s) through a new line of the form "want-ref " where ref is the full name of a ref, a glob pattern, or a SHA-1. At the conclusion of negotiation, the server will write "wanted-ref " for all requests that have been specified this way. The server indicates that it supports this feature by advertising the capability "ref-in-want". Advertisement of this capability is by default disabled, but can be enabled through a configuration option. To be flexible with respect to client needs, the server does not indicate an error if a "want-ref" line corresponds to no refs, but instead relies on the client to ensure that what the user needs has been fetched. For example, a client could reasonably expand an abbreviated name "foo" to "want-ref foo", "want-ref refs/heads/foo", "want-ref refs/tags/foo", among others, and ensure that at least one such ref has been fetched. [1] pack-protocol.txt Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Documentation/technical/http-protocol.txt | 20 +- Documentation/technical/pack-protocol.txt | 24 +- Documentation/technical/protocol-capabilities.txt | 6 + t/t5552-upload-pack-ref-in-want.sh| 295 ++ upload-pack.c | 89 ++- 5 files changed, 411 insertions(+), 23 deletions(-) create mode 100755 t/t5552-upload-pack-ref-in-want.sh diff --git a/Documentation/technical/http-protocol.txt b/Documentation/technical/http-protocol.txt index 1c561bdd9..162d6bc07 100644 --- a/Documentation/technical/http-protocol.txt +++ b/Documentation/technical/http-protocol.txt @@ -316,7 +316,8 @@ to prevent caching of the response. Servers SHOULD support all capabilities defined here. -Clients MUST send at least one "want" command in the request body. +Clients MUST send at least one "want" or "want-ref" command in the +request body. Clients MUST NOT reference an id in a "want" command which did not appear in the response obtained through ref discovery unless the server advertises capability `allow-tip-sha1-in-want` or @@ -330,7 +331,7 @@ server advertises capability `allow-tip-sha1-in-want` or want_list = PKT-LINE(want NUL cap_list LF) *(want_pkt) want_pkt = PKT-LINE(want LF) - want = "want" SP id + want = "want" SP id / "want-ref" SP name cap_list = *(SP capability) SP have_list = *PKT-LINE("have" SP id LF) @@ -352,7 +353,8 @@ C: Build an empty set, `common`, to hold the objects that are later determined to be on both ends. C: Build a set, `want`, of the objects from `advertised` the client - wants to fetch, based on what it saw during ref discovery. + wants to fetch, based on what it saw during ref discovery. This is to + be used if the server does not support the ref-in-want capability. C: Start a queue, `c_pending`, ordered by commit time (popping newest first). Add all client refs. When a commit is popped from @@ -363,8 +365,8 @@ C: Start a queue, `c_pending`, ordered by commit time (popping newest C: Send one `$GIT_URL/git-upload-pack` request: - C: 0032want ... - C: 0032want ... + C: want-ref ... + C: want-ref ... C: 0032have . C: 0032have . @@ -384,7 +386,7 @@ the pkt-line value. Commands MUST appear in the following order, if they appear at all in the request stream: -* "want" +* "want" or "want-ref" * "have" The stream is terminated by a pkt-line flush (``). @@ -393,6 +395,9 @@ A single "want" or "have" command MUST have one hex formatted SHA-1 as its value. Multiple SHA-1s MUST be sent by sending multiple commands. +A "want-ref" command MUST be followed by a ref name (which may include +shell glob characters) or a hex formatted SHA-1. + The `have` list is created by popping the first 32 commits from `c_pending`. Less can be supplied if `c_pending` empties. @@ -410,6 +415,9 @@ Verify a
[RFC 01/14] upload-pack: move parsing of "want" line
Refactor to parse "want" lines when the prefix is found. This makes it easier to add support for another prefix, which will be done in a subsequent commit. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- upload-pack.c | 28 ++-- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/upload-pack.c b/upload-pack.c index 7597ba340..15c60a204 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -793,8 +793,20 @@ static void receive_needs(void) deepen_rev_list = 1; continue; } - if (!skip_prefix(line, "want ", ) || - get_sha1_hex(arg, sha1_buf)) + if (skip_prefix(line, "want ", ) && + !get_sha1_hex(arg, sha1_buf)) { + o = parse_object(sha1_buf); + if (!o) + die("git upload-pack: not our ref %s", + sha1_to_hex(sha1_buf)); + if (!(o->flags & WANTED)) { + o->flags |= WANTED; + if (!((allow_unadvertised_object_request & ALLOW_ANY_SHA1) == ALLOW_ANY_SHA1 + || is_our_ref(o))) + has_non_tip = 1; + add_object_array(o, NULL, _obj); + } + } else die("git upload-pack: protocol error, " "expected to get sha, not '%s'", line); @@ -820,18 +832,6 @@ static void receive_needs(void) no_progress = 1; if (parse_feature_request(features, "include-tag")) use_include_tag = 1; - - o = parse_object(sha1_buf); - if (!o) - die("git upload-pack: not our ref %s", - sha1_to_hex(sha1_buf)); - if (!(o->flags & WANTED)) { - o->flags |= WANTED; - if (!((allow_unadvertised_object_request & ALLOW_ANY_SHA1) == ALLOW_ANY_SHA1 - || is_our_ref(o))) - has_non_tip = 1; - add_object_array(o, NULL, _obj); - } } /* -- 2.11.0.483.g087da7b7c-goog
Re: [PATCH 2/2] mailinfo: Understand forwarded patches
On 01/12/2017 01:20 AM, Matthew Wilcox wrote: From: Matthew WilcoxExtend the --scissors mechanism to strip off the preamble created by forwarding a patch. There are a couple of extra headers ("Sent" and "To") added by forwarding, but other than that, the --scissors option will now remove patches forwarded from Microsoft Outlook to a Linux email account. Signed-off-by: Matthew Wilcox Also add a test showing the kind of message that the current code doesn't handle, and that this commit addresses. --- mailinfo.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/mailinfo.c b/mailinfo.c index 2059704a8..fc1275532 100644 --- a/mailinfo.c +++ b/mailinfo.c @@ -332,7 +332,7 @@ static void cleanup_subject(struct mailinfo *mi, struct strbuf *subject) #define MAX_HDR_PARSED 10 static const char *header[MAX_HDR_PARSED] = { - "From","Subject","Date", + "From","Subject","Date","Sent","To", Are these extra headers used in both the "real" e-mail headers and the in-body headers, or only one of them? (If the latter, they should probably be handled only in the relevant function - my previous patches to this file were in that direction too, if I remember correctly.) Also, I suspect that these will have to be handled differently to the other 3, but that will be clearer when you add the test with an example message. }; static inline int cmp_header(const struct strbuf *line, const char *hdr) @@ -685,6 +685,13 @@ static int is_scissors_line(const char *line) c++; continue; } + if (!memcmp(c, "Original Message", 16)) { 1) You can use starts_with or skip_prefix. 2) This seems vulnerable to false positives. If "Original Message" always follows a certain kind of line, it might be better to check for that. (Again, it will be clearer when we have an example message.) + in_perforation = 1; + perforation += 16; + scissors += 16; + c += 15; Why 15? Also, can skip_prefix avoid these magic numbers? + continue; + } in_perforation = 0; }
Re: [PATCH 1/2] mailinfo: Add support for keep_cr
On 01/12/2017 01:20 AM, Matthew Wilcox wrote: From: Matthew WilcoxIf you have a base-64 encoded patch with CRLF endings (as produced by forwarding a patch from Outlook to a Linux machine, for example), the keep_cr setting is not honoured because keep_cr is only passed to mailsplit, which does not look through the encoding. The keep_cr logic needs to be applied after the base64 decode. I copied that logic to handle_filter(), and rather than add a new keep_cr parameter to handle_filter, I opted to add keep_cr to struct mailinfo; it seemed appropriate given use_scissors was already there. Then I needed to initialise keep_cr in the struct mailinfo passed from git-am, and rather than thread a 'keep_cr' parameter all the way through to parse_mail(), I decided to add keep_cr to struct am_state, which let it be removed as a parameter from five other functions. Signed-off-by: Matthew Wilcox A test exercising the new functionality would be nice. Also, maybe a more descriptive title like "mailinfo: also respect keep_cr after base64 decode" (50 characters) is better. @@ -143,6 +144,7 @@ static void am_state_init(struct am_state *state, const char *dir) memset(state, 0, sizeof(*state)); + state->keep_cr = -1; Maybe query the git config here (instead of later) so that we never have to worry about state->keep_cr being neither 0 nor 1? This function already queries the git config anyway. diff --git a/mailinfo.h b/mailinfo.h index 04a25351d..9fddcf684 100644 --- a/mailinfo.h +++ b/mailinfo.h @@ -12,6 +12,7 @@ struct mailinfo { struct strbuf email; int keep_subject; int keep_non_patch_brackets_in_subject; + int keep_cr; int add_message_id; int use_scissors; int use_inbody_headers; I personally would write "unsigned keep_cr : 1" to further emphasize that this can only be 0 or 1, but I don't know if it's better to keep with the style existing in the file (that is, using int).
Re: [PATCH] mailinfo.c: move side-effects outside of assert
On 12/19/2016 12:38 PM, Kyle J. McKay wrote: On Dec 19, 2016, at 12:03, Jeff King wrote: On Sat, Dec 17, 2016 at 11:54:18AM -0800, Kyle J. McKay wrote: Since 6b4b013f18 (mailinfo: handle in-body header continuations, 2016-09-20, v2.11.0) mailinfo.c has contained new code with an assert of the form: assert(call_a_function(...)) Thanks for spotting this - I'm not sure how I missed that. This is obviously an improvement, but it makes me wonder if we should be doing: if (!check_header(mi, >inbody_header_accum, mi->s_hdr_data)) die("BUG: some explanation of why this can never happen"); which perhaps documents the intended assumptions more clearly. A comment regarding the side effects might also be helpful. I wondered exactly the same thing myself. I was hoping Jonathan would pipe in here with some analysis about whether this is: a) a super paranoid, just-in-case, can't really ever fail because by the time we get to this code we've already effectively validated everything that could cause check_header to return false in this case -or- b) Yeah, it could fail in the real world and it should "die" (and probably have a test added that triggers such death) -or- c) Actually, if check_header does return false we can keep going without problem -or- d) Actually, if check_header does return false we can keep going by making a minor change that should be in the patch I assume that since Jonathan added the code he will just know the answer as to which one it is and I won't have to rely on the results of my imaginary analysis. ;) The answer is "a". The only time that mi->inbody_header_accum is appended to is in check_inbody_header, and appending onto a blank mi->inbody_header_accum always happens when is_inbody_header is true (which guarantees a prefix that causes check_header to always return true). Peff's suggestion sounds reasonable to me, maybe with an error message like "BUG: inbody_header_accum, if not empty, must always contain a valid in-body header".
[PATCH] transport-helper: drop broken "unchanged" feature
Commit f8ec916 ("Allow helpers to report in "list" command that the ref is unchanged", 2009-11-17) allowed a remote helper to report that a ref is unchanged even if it did not know the contents of a ref. However, that commit also made Git wrongly assume that such a remote ref always has the contents of the local ref of the same name. (Git also cannot assume that the remote ref has the value of the current destination local ref, or any other ref, since the previous import/fetch could have been made using a different refspec.) Drop that assumption. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Here is a patch that would remove the assumption (if it is indeed wrong). Documentation/gitremote-helpers.txt | 3 +++ transport-helper.c | 43 - 2 files changed, 3 insertions(+), 43 deletions(-) diff --git a/Documentation/gitremote-helpers.txt b/Documentation/gitremote-helpers.txt index 9e8681f..c862339 100644 --- a/Documentation/gitremote-helpers.txt +++ b/Documentation/gitremote-helpers.txt @@ -393,6 +393,9 @@ attributes are defined. 'unchanged':: This ref is unchanged since the last import or fetch, although the helper cannot necessarily determine what value that produced. + Git may import or fetch this ref anyway, because it does not + keep a record of the last values corresponding to the refs of a + specific remote. OPTIONS --- diff --git a/transport-helper.c b/transport-helper.c index 91aed35..6ab8e2f 100644 --- a/transport-helper.c +++ b/transport-helper.c @@ -392,9 +392,6 @@ static int fetch_with_fetch(struct transport *transport, for (i = 0; i < nr_heads; i++) { const struct ref *posn = to_fetch[i]; - if (posn->status & REF_STATUS_UPTODATE) - continue; - strbuf_addf(, "fetch %s %s\n", oid_to_hex(>old_oid), posn->symref ? posn->symref : posn->name); @@ -492,9 +489,6 @@ static int fetch_with_import(struct transport *transport, for (i = 0; i < nr_heads; i++) { posn = to_fetch[i]; - if (posn->status & REF_STATUS_UPTODATE) - continue; - strbuf_addf(, "import %s\n", posn->symref ? posn->symref : posn->name); sendline(data, ); @@ -531,8 +525,6 @@ static int fetch_with_import(struct transport *transport, for (i = 0; i < nr_heads; i++) { char *private, *name; posn = to_fetch[i]; - if (posn->status & REF_STATUS_UPTODATE) - continue; name = posn->symref ? posn->symref : posn->name; if (data->refspecs) private = apply_refspecs(data->refspecs, data->refspec_nr, name); @@ -649,21 +641,12 @@ static int fetch(struct transport *transport, int nr_heads, struct ref **to_fetch) { struct helper_data *data = transport->data; - int i, count; if (process_connect(transport, 0)) { do_take_over(transport); return transport->fetch(transport, nr_heads, to_fetch); } - count = 0; - for (i = 0; i < nr_heads; i++) - if (!(to_fetch[i]->status & REF_STATUS_UPTODATE)) - count++; - - if (!count) - return 0; - if (data->check_connectivity && data->transport_options.check_self_contained_and_connected) set_helper_option(transport, "check-connectivity", "true"); @@ -1009,23 +992,6 @@ static int push_refs(struct transport *transport, return -1; } - -static int has_attribute(const char *attrs, const char *attr) { - int len; - if (!attrs) - return 0; - - len = strlen(attr); - for (;;) { - const char *space = strchrnul(attrs, ' '); - if (len == space - attrs && !strncmp(attrs, attr, len)) - return 1; - if (!*space) - return 0; - attrs = space + 1; - } -} - static struct ref *get_refs_list(struct transport *transport, int for_push) { struct helper_data *data = transport->data; @@ -1067,15 +1033,6 @@ static struct ref *get_refs_list(struct transport *transport, int for_push) (*tail)->symref = xstrdup(buf.buf + 1); else if (buf.buf[0] != '?') get_oid_hex(buf.buf, &(*tail)->old_oid); - if (eon) { - if (has_attribute(eon + 1, "unchanged")) { - (*tail)->status |= REF_STATUS_UPTODATE; -
[BUG?] Remote helper's wrong assumption of unchanged ref
Commit f8ec916 ("Allow helpers to report in "list" command that the ref is unchanged", 2009-11-17) allowed a remote helper to report that a ref is unchanged even if it did not know the contents of a ref. However, that commit also made Git assume that such a remote ref has the contents of the local ref of the same name. If I'm not missing anything, this assumption seems wrong; the attached test illustrates one case of local edits being made after cloning with default parameters. The original e-mail thread [1] seems to indicate that this feature is meant for a remote helper with no Git-specific code (which is possible if it supports "import" but not "fetch" - in this case, it would not deal with SHA-1s at all) to nevertheless indicate "unchanged", most likely to support optimizations on the client side. But it seems to me that Git cannot perform this optimization. In other words, it should just ignore "unchanged". If this makes sense, I'll prepare a patch to do this. [1] "[PATCH 00/13] Native and foreign helpers" <alpine.lnx.2.00.0908050052390.2...@iabervon.org> --- git-remote-testgit.sh | 8 +++- t/t5801-remote-helpers.sh | 40 2 files changed, 47 insertions(+), 1 deletion(-) diff --git a/git-remote-testgit.sh b/git-remote-testgit.sh index 752c763..6357868 100755 --- a/git-remote-testgit.sh +++ b/git-remote-testgit.sh @@ -52,7 +52,13 @@ do echo ;; list) - git for-each-ref --format='? %(refname)' 'refs/heads/' + if test -n "$GIT_REMOTE_TESTGIT_UNCHANGED_BRANCH_REGEX" + then + git for-each-ref --format='? %(refname)' 'refs/heads/' | + sed "/${GIT_REMOTE_TESTGIT_UNCHANGED_BRANCH_REGEX}/s/$/ unchanged/" + else + git for-each-ref --format='? %(refname)' 'refs/heads/' + fi head=$(git symbolic-ref HEAD) echo "@$head HEAD" echo diff --git a/t/t5801-remote-helpers.sh b/t/t5801-remote-helpers.sh index 362b158..4a48f2b 100755 --- a/t/t5801-remote-helpers.sh +++ b/t/t5801-remote-helpers.sh @@ -301,4 +301,44 @@ test_expect_success 'fetch url' ' compare_refs server HEAD local FETCH_HEAD ' +test_expect_success 'setup remote repository and divergent clone' ' + git init s2 && + ( + cd s2 && + test_commit M1 && + git checkout -b mybranch && + test_commit B1 + ) && + git clone "testgit::${PWD}/s2" divergent && + + ( + cd divergent && + git checkout master && + test_commit M2 && + git checkout mybranch && + test_commit B2 + ) +' + +test_expect_success 'fetch with unchanged claims' ' + rm -rf local && + cp -r divergent local && + + # No unchanged branches + + GIT_REMOTE_TESTGIT_UNCHANGED_BRANCH_REGEX=ABCDE git -C local fetch && + compare_refs s2 M1 local refs/remotes/origin/master && + compare_refs s2 B1 local refs/remotes/origin/mybranch && + + # One unchanged branch + + GIT_REMOTE_TESTGIT_UNCHANGED_BRANCH_REGEX=mybranch git -C local fetch && + compare_refs s2 M1 local refs/remotes/origin/master && + + # I (Jonathan Tan) would expect refs/remotes/origin/mybranch to be B1, + # but it is B2. + test_must_fail compare_refs s2 B1 local refs/remotes/origin/mybranch && + compare_refs local B2 local refs/remotes/origin/mybranch +' + test_done -- 2.8.0.rc3.226.g39d4020
Re: What's cooking in git.git (Nov 2016, #05; Wed, 23)
On Mon, Nov 28, 2016 at 4:06 PM, Junio C Hamano <gits...@pobox.com> wrote: > Jonathan Tan <jonathanta...@google.com> writes: > >> On 11/23/2016 03:21 PM, Junio C Hamano wrote: >>> * jt/use-trailer-api-in-commands (2016-11-02) 6 commits >>> - sequencer: use trailer's trailer layout >>> - trailer: have function to describe trailer layout >>> - trailer: avoid unnecessary splitting on lines >>> - commit: make ignore_non_trailer take buf/len >>> - SQUASH??? >>> - trailer: be stricter in parsing separators >>> >>> Commands that operate on a log message and add lines to the trailer >>> blocks, such as "format-patch -s", "cherry-pick (-x|-s)", and >>> "commit -s", have been taught to use the logic of and share the >>> code with "git interpret-trailer". >>> >>> What's the doneness of this topic? >> >> Stefan Beller mentioned [1] that this seemed OK to him from a cursory >> read. Do I need to look for another reviewer (or a more thorough >> review)? > > I gave it a cursory review when it was queued, too, so another > cursory read does not help very much ;) If I recall correctly, I > got an impression that it was reasonably well done. > > I haven't had a chance to look at the series again to see if the > SQUASH is just the simple matter of squashing it into the one > previous, which is the main reason why I haven't decided if it is > ready to be in 'next'. > > Thanks. Sorry, I'm not sure what you mean by this. The SQUASH is an update in documentation (for a C function) (reproduced below [1]) which can be squashed cleanly (just to confirm, I tried it using "git rebase -i"). It can also be rebased cleanly onto master or next (at least, as of now). If you are asking about whether I think the contents of your SQUASH commit is reasonable, I think that it is. [1] $ git show 7e7587d commit 7e7587d563ddb540875075e5a84359a8a96a5188 Author: Junio C Hamano <gits...@pobox.com> Date: Wed Nov 2 19:28:53 2016 -0700 SQUASH??? diff --git a/trailer.c b/trailer.c index dc525e3..eefe91d 100644 --- a/trailer.c +++ b/trailer.c @@ -565,7 +565,9 @@ static int token_matches_item(const char *tok, struct arg_item *item, int tok_le /* * If the given line is of the form * "..." or "...", return the - * location of the separator. Otherwise, return -1. + * location of the separator. Otherwise, return -1. The optional whitespace + * is allowed there primarily to allow things like "Bug #43" where is + * "Bug" and is "#". * * The separator-starts-line case (in which this function returns 0) is * distinguished from the non-well-formed-line case (in which this function
Re: What's cooking in git.git (Nov 2016, #05; Wed, 23)
On 11/23/2016 03:21 PM, Junio C Hamano wrote: * jt/use-trailer-api-in-commands (2016-11-02) 6 commits - sequencer: use trailer's trailer layout - trailer: have function to describe trailer layout - trailer: avoid unnecessary splitting on lines - commit: make ignore_non_trailer take buf/len - SQUASH??? - trailer: be stricter in parsing separators Commands that operate on a log message and add lines to the trailer blocks, such as "format-patch -s", "cherry-pick (-x|-s)", and "commit -s", have been taught to use the logic of and share the code with "git interpret-trailer". What's the doneness of this topic? Stefan Beller mentioned [1] that this seemed OK to him from a cursory read. Do I need to look for another reviewer (or a more thorough review)? [1]
[PATCH] doc: mention user-configured trailers
In commit 1462450 ("trailer: allow non-trailers in trailer block", 2016-10-21), functionality was added (and tested [1]) to allow non-trailer lines in trailer blocks, as long as those blocks contain at least one Git-generated or user-configured trailer, and consists of at least 25% trailers. The documentation was updated to mention this new functionality, but did not mention "user-configured trailer". Further update the documentation to also mention "user-configured trailer". [1] "with non-trailer lines mixed with a configured trailer" in t/t7513-interpret-trailers.sh Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Yes, mentioning a trailer in a Git config will cause interpret-trailers to treat it similarly to a Git-generated trailer (in that its presence causes a block partially consisting of trailers to be considered a trailer block). See the commit message above for a test case that verifies that. I took a look at the documentation, and it wasn't completely documented, so here is a patch to correct that. Documentation/git-interpret-trailers.txt | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index e99bda6..09074c7 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -49,7 +49,8 @@ will be added before the new trailer. Existing trailers are extracted from the input message by looking for a group of one or more lines that (i) are all trailers, or (ii) contains at -least one Git-generated trailer and consists of at least 25% trailers. +least one Git-generated or user-configured trailer and consists of at +least 25% trailers. The group must be preceded by one or more empty (or whitespace-only) lines. The group must either be at the end of the message or be the last non-whitespace lines before a line that starts with '---'. Such three -- 2.8.0.rc3.226.g39d4020
Re: [PATCH v3 5/6] grep: enable recurse-submodules to work on objects
On 11/14/2016 10:56 AM, Junio C Hamano wrote: Jonathan Tan <jonathanta...@google.com> writes: to: HEAD:file HEAD:sub/file Signed-off-by: Brandon Williams <bmw...@google.com> --- Unrelated tangent, but this makes readers wonder what the updated trailer code would do to the last paragraph ;-). Does it behave sensibly (with some sane definition of sensibleness)? I am guessing that it would, because neither To: or HEAD: is what we normally recognize as a known trailer block element. Yes, it behaves sensibly :-) because "Signed-off-by:" is preceded by a blank line, so the trailer block consists only of that line. Oh, that was not what I was wondering. Imagine Brandon writing his message that ends in these three questionable lines and then running "commit -s --amend" to add his sign-off---that was the case I was wondering. Ah, I see. In that case, it would consider the last block as a trailer block and attach it directly: to: HEAD:file HEAD:sub/file Signed-off-by: ... It is true that neither to: nor HEAD: are known trailers, but my patch set accepts trailer blocks that are 100% well-formed regardless of whether the trailers are known (to provide backwards compatibility with git-interpret-trailers, and to satisfy the certain use cases that I brought up). The "known trailer" check is used when the trailer block is not 100% well-formed. This issue can be avoided if those lines were indented with at least one space or at least one tab.
Re: [PATCH v3 5/6] grep: enable recurse-submodules to work on objects
On 11/14/2016 10:10 AM, Junio C Hamano wrote: Brandon Williamswrites: Teach grep to recursively search in submodules when provided with a object. This allows grep to search a submodule based on the state of the submodule that is present in a commit of the super project. When grep is provided with a object, the name of the object is prefixed to all output. In order to provide uniformity of output between the parent and child processes the option `--parent-basename` has been added so that the child can preface all of it's output with the name of the parent's object instead of the name of the commit SHA1 of the submodule. This changes output from the command `git grep -e. -l --recurse-submodules HEAD` from: HEAD:file :sub/file to: HEAD:file HEAD:sub/file Signed-off-by: Brandon Williams --- Unrelated tangent, but this makes readers wonder what the updated trailer code would do to the last paragraph ;-). Does it behave sensibly (with some sane definition of sensibleness)? I am guessing that it would, because neither To: or HEAD: is what we normally recognize as a known trailer block element. Yes, it behaves sensibly :-) because "Signed-off-by:" is preceded by a blank line, so the trailer block consists only of that line. Having said that, it is probably better to indent those examples in the commit message (by at least one space or one tab) - then they will never be confused with trailers (once my patch set is in).
Re: [PATCH v2 5/6] grep: enable recurse-submodules to work on objects
On 10/31/2016 03:38 PM, Brandon Williams wrote: diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt index 17aa1ba..386a868 100644 --- a/Documentation/git-grep.txt +++ b/Documentation/git-grep.txt @@ -26,7 +26,7 @@ SYNOPSIS [--threads ] [-f ] [-e] [--and|--or|--not|(|)|-e ...] - [--recurse-submodules] + [--recurse-submodules] [--parent-basename] Maybe add something after --parent-basename, since it takes an argument (like --threads above). @@ -91,7 +91,16 @@ OPTIONS --recurse-submodules:: Recursively search in each submodule that has been initialized and - checked out in the repository. + checked out in the repository. When used in combination with the +option the prefix of all submodule output will be the name of + the parent project's object. + +--parent-basename:: Same comment as above. diff --git a/builtin/grep.c b/builtin/grep.c index cf4f51e..2f10930 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -19,6 +19,7 @@ #include "dir.h" #include "pathspec.h" #include "submodule.h" +#include "submodule-config.h" static char const * const grep_usage[] = { N_("git grep [] [-e] [...] [[--] ...]"), @@ -28,6 +29,7 @@ static char const * const grep_usage[] = { static const char *super_prefix; static int recurse_submodules; static struct argv_array submodule_options = ARGV_ARRAY_INIT; +static const char *parent_basename; Can this be passed as an argument to the functions (grep_objects and grep_object, it seems) instead of having a file-visible variable? @@ -671,12 +707,29 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec, enum interesting match = entry_not_interesting; struct name_entry entry; int old_baselen = base->len; + struct strbuf name = STRBUF_INIT; + int name_base_len = 0; + if (super_prefix) { + name_base_len = strlen(super_prefix); + strbuf_addstr(, super_prefix); Better to invoke strbuf_addstr, and then set name_base_len from name.len. This makes it clear where strbuf_setlen (subsequently) resets the strbuf to, and is also a slight performance improvement.
[PATCH] fetch: do not redundantly calculate tag refmap
builtin/fetch.c redundantly calculates refmaps for tags twice. Remove the first calculation. This is only a code simplification and slight performance improvement - the result is unchanged, as the redundant refmaps are subsequently removed by the invocation to "ref_remove_duplicates" anyway. This was introduced in commit c5a84e9 ("fetch --tags: fetch tags *in addition to* other stuff", 2013-10-29) when modifying the effect of the --tags parameter to "git fetch". The refmap-for-tag calculation was copied instead of moved. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- (I noticed this when working on something in this file.) builtin/fetch.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index b6a5597..1d77e58 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -359,9 +359,6 @@ static struct ref *get_ref_map(struct transport *transport, for (i = 0; i < fetch_refspec_nr; i++) get_fetch_map(ref_map, _refspec[i], _tail, 1); - - if (tags == TAGS_SET) - get_fetch_map(remote_refs, tag_refspec, , 0); } else if (refmap_array) { die("--refmap option is only meaningful with command-line refspec(s)."); } else { -- 2.8.0.rc3.226.g39d4020
Re: [PATCH 4/5] grep: optionally recurse into submodules
On Thu, Oct 27, 2016 at 3:38 PM, Brandon Williamswrote: > diff --git a/builtin/grep.c b/builtin/grep.c > index 8887b6add..f34f16df9 100644 > --- a/builtin/grep.c > +++ b/builtin/grep.c > @@ -18,12 +18,20 @@ > #include "quote.h" > #include "dir.h" > #include "pathspec.h" > +#include "submodule.h" > > static char const * const grep_usage[] = { > N_("git grep [] [-e] [...] [[--] ...]"), > NULL > }; > > +static const char *super_prefix; I think that the super_prefix changes could be in its own patch. > +static int recurse_submodules; > +static struct argv_array submodule_options = ARGV_ARRAY_INIT; I guess this has to be static because it is shared by multiple threads. > + > +static int grep_submodule_launch(struct grep_opt *opt, > +const struct grep_source *gs); > + > #define GREP_NUM_THREADS_DEFAULT 8 > static int num_threads; > > @@ -174,7 +182,10 @@ static void *run(void *arg) > break; > > opt->output_priv = w; > - hit |= grep_source(opt, >source); > + if (w->source.type == GREP_SOURCE_SUBMODULE) > + hit |= grep_submodule_launch(opt, >source); > + else > + hit |= grep_source(opt, >source); It seems to me that GREP_SOURCE_SUBMODULE is of a different nature than the other GREP_SOURCE_.* - in struct work_item, could we instead have another variable that distinguishes between submodules and "native" sources? This might also assuage Junio's concerns in about the nature of the sources. That variable could also be the discriminant for a tagged union, such that we have "struct grep_source" for the "native" sources and a new struct (holding only submodule-relevant information) for the submodule. > +/* > + * Prep grep structures for a submodule grep > + * sha1: the sha1 of the submodule or NULL if using the working tree > + * filename: name of the submodule including tree name of parent > + * path: location of the submodule > + */ > +static int grep_submodule(struct grep_opt *opt, const unsigned char *sha1, > + const char *filename, const char *path) > +{ > + if (!(is_submodule_initialized(path) && > + is_submodule_checked_out(path))) { > + warning("skiping submodule '%s%s' since it is not initialized > and checked out", > + super_prefix ? super_prefix: "", > + path); > + return 0; > + } > + > +#ifndef NO_PTHREADS > + if (num_threads) { > + add_work(opt, GREP_SOURCE_SUBMODULE, filename, path, sha1); > + return 0; > + } else > +#endif > + { > + struct work_item w; > + int hit; > + > + grep_source_init(, GREP_SOURCE_SUBMODULE, > +filename, path, sha1); > + strbuf_init(, 0); > + opt->output_priv = > + hit = grep_submodule_launch(opt, ); > + > + write_or_die(1, w.out.buf, w.out.len); > + > + grep_source_clear(); > + strbuf_release(); > + return hit; > + } This is at least the third invocation of this "if pthreads, add work, otherwise do it now" pattern - could this be extracted into its own function (in another patch)? Ideally, there would also be exactly one function in which the grep_source.* functions are invoked, and both "run" and the non-pthread code path can use it. > +} > + > +static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec, > + int cached) This line isn't modified other than the line break, as far as I can tell, so I wouldn't break it. > diff --git a/t/t7814-grep-recurse-submodules.sh > b/t/t7814-grep-recurse-submodules.sh > new file mode 100755 > index 0..b670c70cb > --- /dev/null > +++ b/t/t7814-grep-recurse-submodules.sh > @@ -0,0 +1,99 @@ > +#!/bin/sh > + > +test_description='Test grep recurse-submodules feature > + > +This test verifies the recurse-submodules feature correctly greps across > +submodules. > +' > + > +. ./test-lib.sh > + Would it be possible to also test it while num_threads is zero? (Or, if num_threads is already zero, to test it while it is not zero?)
Re: [PATCH v2 1/6] submodules: add helper functions to determine presence of submodules
On 10/31/2016 03:38 PM, Brandon Williams wrote: + struct strbuf buf = STRBUF_INIT; + char *submodule_url = NULL; + + strbuf_addf(, "submodule.%s.url", module->name); + ret = !git_config_get_string(buf.buf, _url); + + free(submodule_url); + strbuf_release(); + } + + return ret; +} + +/* + * Determine if a submodule has been checked out at a given 'path' + */ +int is_submodule_checked_out(const char *path) +{ + int ret = 0; + struct strbuf buf = STRBUF_INIT; + + strbuf_addf(, "%s/.git", path); + ret = file_exists(buf.buf); + + strbuf_release(); In this and the previous function, you can use xstrfmt.
[PATCH v3 2/5] commit: make ignore_non_trailer take buf/len
Make ignore_non_trailer take a buf/len pair instead of struct strbuf. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/commit.c | 2 +- commit.c | 22 +++--- commit.h | 2 +- trailer.c| 2 +- 4 files changed, 14 insertions(+), 14 deletions(-) diff --git a/builtin/commit.c b/builtin/commit.c index 8976c3d..887ccc7 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -790,7 +790,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix, strbuf_stripspace(, 0); if (signoff) - append_signoff(, ignore_non_trailer(), 0); + append_signoff(, ignore_non_trailer(sb.buf, sb.len), 0); if (fwrite(sb.buf, 1, sb.len, s->fp) < sb.len) die_errno(_("could not write commit template")); diff --git a/commit.c b/commit.c index 856fd4a..2cf8515 100644 --- a/commit.c +++ b/commit.c @@ -1649,7 +1649,7 @@ const char *find_commit_header(const char *msg, const char *key, size_t *out_len } /* - * Inspect sb and determine the true "end" of the log message, in + * Inspect the given string and determine the true "end" of the log message, in * order to find where to put a new Signed-off-by: line. Ignored are * trailing comment lines and blank lines, and also the traditional * "Conflicts:" block that is not commented out, so that we can use @@ -1659,37 +1659,37 @@ const char *find_commit_header(const char *msg, const char *key, size_t *out_len * Returns the number of bytes from the tail to ignore, to be fed as * the second parameter to append_signoff(). */ -int ignore_non_trailer(struct strbuf *sb) +int ignore_non_trailer(const char *buf, size_t len) { int boc = 0; int bol = 0; int in_old_conflicts_block = 0; - while (bol < sb->len) { - char *next_line; + while (bol < len) { + const char *next_line = memchr(buf + bol, '\n', len - bol); - if (!(next_line = memchr(sb->buf + bol, '\n', sb->len - bol))) - next_line = sb->buf + sb->len; + if (!next_line) + next_line = buf + len; else next_line++; - if (sb->buf[bol] == comment_line_char || sb->buf[bol] == '\n') { + if (buf[bol] == comment_line_char || buf[bol] == '\n') { /* is this the first of the run of comments? */ if (!boc) boc = bol; /* otherwise, it is just continuing */ - } else if (starts_with(sb->buf + bol, "Conflicts:\n")) { + } else if (starts_with(buf + bol, "Conflicts:\n")) { in_old_conflicts_block = 1; if (!boc) boc = bol; - } else if (in_old_conflicts_block && sb->buf[bol] == '\t') { + } else if (in_old_conflicts_block && buf[bol] == '\t') { ; /* a pathname in the conflicts block */ } else if (boc) { /* the previous was not trailing comment */ boc = 0; in_old_conflicts_block = 0; } - bol = next_line - sb->buf; + bol = next_line - buf; } - return boc ? sb->len - boc : 0; + return boc ? len - boc : 0; } diff --git a/commit.h b/commit.h index afd14f3..9c12abb 100644 --- a/commit.h +++ b/commit.h @@ -355,7 +355,7 @@ extern const char *find_commit_header(const char *msg, const char *key, size_t *out_len); /* Find the end of the log message, the right place for a new trailer. */ -extern int ignore_non_trailer(struct strbuf *sb); +extern int ignore_non_trailer(const char *buf, size_t len); typedef void (*each_mergetag_fn)(struct commit *commit, struct commit_extra_header *extra, void *cb_data); diff --git a/trailer.c b/trailer.c index dc525e3..9d7765e 100644 --- a/trailer.c +++ b/trailer.c @@ -840,7 +840,7 @@ static int find_trailer_end(struct strbuf **lines, int patch_start) for (i = 0; i < patch_start; i++) strbuf_addbuf(, lines[i]); - ignore_bytes = ignore_non_trailer(); + ignore_bytes = ignore_non_trailer(sb.buf, sb.len); strbuf_release(); for (i = patch_start - 1; i >= 0 && ignore_bytes > 0; i--) ignore_bytes -= lines[i]->len; -- 2.8.0.rc3.226.g39d4020
[PATCH v3 3/5] trailer: avoid unnecessary splitting on lines
trailer.c currently splits lines while processing a buffer (and also rejoins lines when needing to invoke ignore_non_trailer). Avoid such line splitting, except when generating the strings corresponding to trailers (for ease of use by clients - a subsequent patch will allow other components to obtain the layout of a trailer block in a buffer, including the trailers themselves). The main purpose of this is to make it easy to return pointers into the original buffer (for a subsequent patch), but this also significantly reduces the number of memory allocations required. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 194 -- 1 file changed, 100 insertions(+), 94 deletions(-) diff --git a/trailer.c b/trailer.c index 9d7765e..afbff4b 100644 --- a/trailer.c +++ b/trailer.c @@ -102,12 +102,12 @@ static int same_trailer(struct trailer_item *a, struct arg_item *b) return same_token(a, b) && same_value(a, b); } -static inline int contains_only_spaces(const char *str) +static inline int is_blank_line(const char *str) { const char *s = str; - while (*s && isspace(*s)) + while (*s && *s != '\n' && isspace(*s)) s++; - return !*s; + return !*s || *s == '\n'; } static inline void strbuf_replace(struct strbuf *sb, const char *a, const char *b) @@ -700,51 +700,71 @@ static void process_command_line_args(struct list_head *arg_head, free(cl_separators); } -static struct strbuf **read_input_file(const char *file) +static void read_input_file(struct strbuf *sb, const char *file) { - struct strbuf **lines; - struct strbuf sb = STRBUF_INIT; - if (file) { - if (strbuf_read_file(, file, 0) < 0) + if (strbuf_read_file(sb, file, 0) < 0) die_errno(_("could not read input file '%s'"), file); } else { - if (strbuf_read(, fileno(stdin), 0) < 0) + if (strbuf_read(sb, fileno(stdin), 0) < 0) die_errno(_("could not read from stdin")); } +} - lines = strbuf_split(, '\n'); +static const char *next_line(const char *str) +{ + const char *nl = strchrnul(str, '\n'); + return nl + !!*nl; +} - strbuf_release(); +/* + * Return the position of the start of the last line. If len is 0, return -1. + */ +static int last_line(const char *buf, size_t len) +{ + int i; + if (len == 0) + return -1; + if (len == 1) + return 0; + /* +* Skip the last character (in addition to the null terminator), +* because if the last character is a newline, it is considered as part +* of the last line anyway. +*/ + i = len - 2; - return lines; + for (; i >= 0; i--) { + if (buf[i] == '\n') + return i + 1; + } + return 0; } /* - * Return the (0 based) index of the start of the patch or the line - * count if there is no patch in the message. + * Return the position of the start of the patch or the length of str if there + * is no patch in the message. */ -static int find_patch_start(struct strbuf **lines, int count) +static int find_patch_start(const char *str) { - int i; + const char *s; - /* Get the start of the patch part if any */ - for (i = 0; i < count; i++) { - if (starts_with(lines[i]->buf, "---")) - return i; + for (s = str; *s; s = next_line(s)) { + if (starts_with(s, "---")) + return s - str; } - return count; + return s - str; } /* - * Return the (0 based) index of the first trailer line or count if - * there are no trailers. Trailers are searched only in the lines from - * index (count - 1) down to index 0. + * Return the position of the first trailer line or len if there are no + * trailers. */ -static int find_trailer_start(struct strbuf **lines, int count) +static int find_trailer_start(const char *buf, size_t len) { - int start, end_of_title, only_spaces = 1; + const char *s; + int end_of_title, l, only_spaces = 1; int recognized_prefix = 0, trailer_lines = 0, non_trailer_lines = 0; /* * Number of possible continuation lines encountered. This will be @@ -756,13 +776,13 @@ static int find_trailer_start(struct strbuf **lines, int count) int possible_continuation_lines = 0; /* The first paragraph is the title and cannot be trailers */ - for (start = 0; start < count; start++) { - if (lines[start]->buf[0] == comment_line_char) + for (s = buf; s < buf + len; s = next_line(s)) { + if (s[0] == comment_line_char)
[PATCH v3 5/5] sequencer: use trailer's trailer layout
Make sequencer use trailer.c's trailer layout definition, as opposed to parsing the footer by itself. This makes "commit -s", "cherry-pick -x", and "format-patch --signoff" consistent with trailer, allowing non-trailer lines and multiple-line trailers in trailer blocks under certain conditions, and therefore suppressing the extra newline in those cases. Consistency with trailer extends to respecting trailer configs. Tests have been included to show that. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- sequencer.c | 75 +--- t/t3511-cherry-pick-x.sh | 16 +-- t/t4014-format-patch.sh | 37 t/t7501-commit.sh| 36 +++ 4 files changed, 95 insertions(+), 69 deletions(-) diff --git a/sequencer.c b/sequencer.c index 5fd75f3..d64c973 100644 --- a/sequencer.c +++ b/sequencer.c @@ -16,6 +16,7 @@ #include "refs.h" #include "argv-array.h" #include "quote.h" +#include "trailer.h" #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION" @@ -56,30 +57,6 @@ static const char *get_todo_path(const struct replay_opts *opts) return git_path_todo_file(); } -static int is_rfc2822_line(const char *buf, int len) -{ - int i; - - for (i = 0; i < len; i++) { - int ch = buf[i]; - if (ch == ':') - return 1; - if (!isalnum(ch) && ch != '-') - break; - } - - return 0; -} - -static int is_cherry_picked_from_line(const char *buf, int len) -{ - /* -* We only care that it looks roughly like (cherry picked from ...) -*/ - return len > strlen(cherry_picked_prefix) + 1 && - starts_with(buf, cherry_picked_prefix) && buf[len - 1] == ')'; -} - /* * Returns 0 for non-conforming footer * Returns 1 for conforming footer @@ -89,49 +66,25 @@ static int is_cherry_picked_from_line(const char *buf, int len) static int has_conforming_footer(struct strbuf *sb, struct strbuf *sob, int ignore_footer) { - char prev; - int i, k; - int len = sb->len - ignore_footer; - const char *buf = sb->buf; - int found_sob = 0; - - /* footer must end with newline */ - if (!len || buf[len - 1] != '\n') - return 0; + struct trailer_info info; + int i; + int found_sob = 0, found_sob_last = 0; - prev = '\0'; - for (i = len - 1; i > 0; i--) { - char ch = buf[i]; - if (prev == '\n' && ch == '\n') /* paragraph break */ - break; - prev = ch; - } + trailer_info_get(, sb->buf); - /* require at least one blank line */ - if (prev != '\n' || buf[i] != '\n') + if (info.trailer_start == info.trailer_end) return 0; - /* advance to start of last paragraph */ - while (i < len - 1 && buf[i] == '\n') - i++; - - for (; i < len; i = k) { - int found_rfc2822; - - for (k = i; k < len && buf[k] != '\n'; k++) - ; /* do nothing */ - k++; + for (i = 0; i < info.trailer_nr; i++) + if (sob && !strncmp(info.trailers[i], sob->buf, sob->len)) { + found_sob = 1; + if (i == info.trailer_nr - 1) + found_sob_last = 1; + } - found_rfc2822 = is_rfc2822_line(buf + i, k - i - 1); - if (found_rfc2822 && sob && - !strncmp(buf + i, sob->buf, sob->len)) - found_sob = k; + trailer_info_release(); - if (!(found_rfc2822 || - is_cherry_picked_from_line(buf + i, k - i - 1))) - return 0; - } - if (found_sob == i) + if (found_sob_last) return 3; if (found_sob) return 2; diff --git a/t/t3511-cherry-pick-x.sh b/t/t3511-cherry-pick-x.sh index 9cce5ae..bf0a5c9 100755 --- a/t/t3511-cherry-pick-x.sh +++ b/t/t3511-cherry-pick-x.sh @@ -25,9 +25,8 @@ Signed-off-by: B.U. Thor <but...@example.com>" mesg_broken_footer="$mesg_no_footer -The signed-off-by string should begin with the words Signed-off-by followed -by a colon and space, and then the signers name and email address. e.g. -Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>" +This is not recognized as a footer because Myfooter is not a recognized token. +Myfooter: A.U. Thor <aut...@example.com>" mesg_with_footer_sob="$mesg_with_footer Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>" @@ -112,6 +111,17 @@ test_e
[PATCH v3 4/5] trailer: have function to describe trailer layout
Create a function that, taking a string, describes the position of its trailer block (if available) and the contents thereof, and make trailer use it. This makes it easier for other Git components, in the future, to interpret trailer blocks in the same way as trailer. In a subsequent patch, another component will be made to use this. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 118 +++--- trailer.h | 25 + 2 files changed, 107 insertions(+), 36 deletions(-) diff --git a/trailer.c b/trailer.c index afbff4b..bc6893b 100644 --- a/trailer.c +++ b/trailer.c @@ -46,6 +46,8 @@ static LIST_HEAD(conf_head); static char *separators = ":"; +static int configured; + #define TRAILER_ARG_STRING "$ARG" static const char *git_generated_prefixes[] = { @@ -546,6 +548,17 @@ static int git_trailer_config(const char *conf_key, const char *value, void *cb) return 0; } +static void ensure_configured(void) +{ + if (configured) + return; + + /* Default config must be setup first */ + git_config(git_trailer_default_config, NULL); + git_config(git_trailer_config, NULL); + configured = 1; +} + static const char *token_from_item(struct arg_item *item, char *tok) { if (item->conf.key) @@ -873,59 +886,43 @@ static int process_input_file(FILE *outfile, const char *str, struct list_head *head) { - int patch_start, trailer_start, trailer_end; + struct trailer_info info; struct strbuf tok = STRBUF_INIT; struct strbuf val = STRBUF_INIT; - struct trailer_item *last = NULL; - struct strbuf *trailer, **trailer_lines, **ptr; + int i; - patch_start = find_patch_start(str); - trailer_end = find_trailer_end(str, patch_start); - trailer_start = find_trailer_start(str, trailer_end); + trailer_info_get(, str); /* Print lines before the trailers as is */ - fwrite(str, 1, trailer_start, outfile); + fwrite(str, 1, info.trailer_start - str, outfile); - if (!ends_with_blank_line(str, trailer_start)) + if (!info.blank_line_before_trailer) fprintf(outfile, "\n"); - /* Parse trailer lines */ - trailer_lines = strbuf_split_buf(str + trailer_start, -trailer_end - trailer_start, -'\n', -0); - for (ptr = trailer_lines; *ptr; ptr++) { + for (i = 0; i < info.trailer_nr; i++) { int separator_pos; - trailer = *ptr; - if (trailer->buf[0] == comment_line_char) - continue; - if (last && isspace(trailer->buf[0])) { - struct strbuf sb = STRBUF_INIT; - strbuf_addf(, "%s\n%s", last->value, trailer->buf); - strbuf_strip_suffix(, "\n"); - free(last->value); - last->value = strbuf_detach(, NULL); + char *trailer = info.trailers[i]; + if (trailer[0] == comment_line_char) continue; - } - separator_pos = find_separator(trailer->buf, separators); + separator_pos = find_separator(trailer, separators); if (separator_pos >= 1) { - parse_trailer(, , NULL, trailer->buf, + parse_trailer(, , NULL, trailer, separator_pos); - last = add_trailer_item(head, - strbuf_detach(, NULL), - strbuf_detach(, NULL)); + add_trailer_item(head, +strbuf_detach(, NULL), +strbuf_detach(, NULL)); } else { - strbuf_addbuf(, trailer); + strbuf_addstr(, trailer); strbuf_strip_suffix(, "\n"); add_trailer_item(head, NULL, strbuf_detach(, NULL)); - last = NULL; } } - strbuf_list_free(trailer_lines); - return trailer_end; + trailer_info_release(); + + return info.trailer_end - str; } static void free_all(struct list_head *head) @@ -976,9 +973,7 @@ void process_trailers(const char *file, int in_place, int trim_empty, struct str int trailer_end; FILE *outfile = stdout; - /* Default config must be setup first */ - git_config
[PATCH v3 0/5] Make other git commands use trailer layout
This is the same as v2 except that in 1/5, the comment about find_separators has been moved from the commit message to the code itself. Also, a trailing whitespace and unused variable fix. Jonathan Tan (5): trailer: be stricter in parsing separators commit: make ignore_non_trailer take buf/len trailer: avoid unnecessary splitting on lines trailer: have function to describe trailer layout sequencer: use trailer's trailer layout builtin/commit.c | 2 +- commit.c | 22 ++-- commit.h | 2 +- sequencer.c | 75 +++- t/t3511-cherry-pick-x.sh | 16 ++- t/t4014-format-patch.sh | 37 +- t/t7501-commit.sh| 36 ++ trailer.c| 299 +-- trailer.h| 25 9 files changed, 316 insertions(+), 198 deletions(-) -- 2.8.0.rc3.226.g39d4020
[PATCH v3 1/5] trailer: be stricter in parsing separators
Currently, a line is interpreted to be a trailer line if it contains a separator. Make parsing stricter by requiring the text on the left of the separator, if not the empty string, to be of the "" form. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 27 +-- 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/trailer.c b/trailer.c index f0ecde2..dc525e3 100644 --- a/trailer.c +++ b/trailer.c @@ -563,15 +563,30 @@ static int token_matches_item(const char *tok, struct arg_item *item, int tok_le } /* - * Return the location of the first separator in line, or -1 if there is no - * separator. + * If the given line is of the form + * "..." or "...", return the + * location of the separator. Otherwise, return -1. + * + * The separator-starts-line case (in which this function returns 0) is + * distinguished from the non-well-formed-line case (in which this function + * returns -1) because some callers of this function need such a distinction. */ static int find_separator(const char *line, const char *separators) { - int loc = strcspn(line, separators); - if (!line[loc]) - return -1; - return loc; + int whitespace_found = 0; + const char *c; + for (c = line; *c; c++) { + if (strchr(separators, *c)) + return c - line; + if (!whitespace_found && (isalnum(*c) || *c == '-')) + continue; + if (c != line && (*c == ' ' || *c == '\t')) { + whitespace_found = 1; + continue; + } + break; + } + return -1; } /* -- 2.8.0.rc3.226.g39d4020
Re: [PATCH v2 1/5] trailer: be stricter in parsing separators
On 11/01/2016 01:37 PM, Junio C Hamano wrote: Junio C Hamano <gits...@pobox.com> writes: Jonathan Tan <jonathanta...@google.com> writes: Currently, a line is interpreted to be a trailer line if it contains a separator. Make parsing stricter by requiring the text on the left of the separator, if not the empty string, to be of the "" form. Hmph. The optional whitespace is to allow for what kind of line? It is not for "Signed off by:" that is a misspelt "Signed-off-by:"; it may not hurt but I do not think of a case that would be useful offhand. This is to allow trailers of the form "Fix #42" (mentioned in the git-interpret-trailers documentation and also in the test). Other than this "Hmph" (which is not an objection---just something that the reviewer did not understand), the rest looked good to me. Will re-queue. Thanks. Thanks!
[PATCH v2 0/5] Make other git commands use trailer layout
Thanks for all your comments. This patch set is now built off master (since jt/trailer-with-cruft is merged). I couldn't think of an easy way to clearly decide if a token with spaces should be considered a token, so I've tightened the restrictions. One benefit is that we no longer need to create temporary strings that include '\n' to be passed into the find_separator method. In 2/4 (now 3/5), I've also changed some variable names as requested (e.g. sb -> input, and un-did some others). Jonathan Tan (5): trailer: be stricter in parsing separators commit: make ignore_non_trailer take buf/len trailer: avoid unnecessary splitting on lines trailer: have function to describe trailer layout sequencer: use trailer's trailer layout builtin/commit.c | 2 +- commit.c | 22 ++-- commit.h | 2 +- sequencer.c | 75 +++- t/t3511-cherry-pick-x.sh | 16 ++- t/t4014-format-patch.sh | 37 +- t/t7501-commit.sh| 36 ++ trailer.c| 296 --- trailer.h| 25 9 files changed, 313 insertions(+), 198 deletions(-) -- 2.8.0.rc3.226.g39d4020
[PATCH v2 1/5] trailer: be stricter in parsing separators
Currently, a line is interpreted to be a trailer line if it contains a separator. Make parsing stricter by requiring the text on the left of the separator, if not the empty string, to be of the "" form. (The find_separator function distinguishes the no-separator case from the separator-starts-line case because some callers of this function need such a distinction.) Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 23 +-- 1 file changed, 17 insertions(+), 6 deletions(-) diff --git a/trailer.c b/trailer.c index f0ecde2..0ee634f 100644 --- a/trailer.c +++ b/trailer.c @@ -563,15 +563,26 @@ static int token_matches_item(const char *tok, struct arg_item *item, int tok_le } /* - * Return the location of the first separator in line, or -1 if there is no - * separator. + * If the given line is of the form + * "..." or "...", return the + * location of the separator. Otherwise, return -1. */ static int find_separator(const char *line, const char *separators) { - int loc = strcspn(line, separators); - if (!line[loc]) - return -1; - return loc; + int whitespace_found = 0; + const char *c; + for (c = line; *c; c++) { + if (strchr(separators, *c)) + return c - line; + if (!whitespace_found && (isalnum(*c) || *c == '-')) + continue; + if (c != line && (*c == ' ' || *c == '\t')) { + whitespace_found = 1; + continue; + } + break; + } + return -1; } /* -- 2.8.0.rc3.226.g39d4020
[PATCH v2 3/5] trailer: avoid unnecessary splitting on lines
trailer.c currently splits lines while processing a buffer (and also rejoins lines when needing to invoke ignore_non_trailer). Avoid such line splitting, except when generating the strings corresponding to trailers (for ease of use by clients - a subsequent patch will allow other components to obtain the layout of a trailer block in a buffer, including the trailers themselves). The main purpose of this is to make it easy to return pointers into the original buffer (for a subsequent patch), but this also significantly reduces the number of memory allocations required. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 195 -- 1 file changed, 101 insertions(+), 94 deletions(-) diff --git a/trailer.c b/trailer.c index 04edab2..f5427ec 100644 --- a/trailer.c +++ b/trailer.c @@ -102,12 +102,12 @@ static int same_trailer(struct trailer_item *a, struct arg_item *b) return same_token(a, b) && same_value(a, b); } -static inline int contains_only_spaces(const char *str) +static inline int is_blank_line(const char *str) { const char *s = str; - while (*s && isspace(*s)) + while (*s && *s != '\n' && isspace(*s)) s++; - return !*s; + return !*s || *s == '\n'; } static inline void strbuf_replace(struct strbuf *sb, const char *a, const char *b) @@ -696,51 +696,71 @@ static void process_command_line_args(struct list_head *arg_head, free(cl_separators); } -static struct strbuf **read_input_file(const char *file) +static void read_input_file(struct strbuf *sb, const char *file) { - struct strbuf **lines; - struct strbuf sb = STRBUF_INIT; - if (file) { - if (strbuf_read_file(, file, 0) < 0) + if (strbuf_read_file(sb, file, 0) < 0) die_errno(_("could not read input file '%s'"), file); } else { - if (strbuf_read(, fileno(stdin), 0) < 0) + if (strbuf_read(sb, fileno(stdin), 0) < 0) die_errno(_("could not read from stdin")); } +} - lines = strbuf_split(, '\n'); +static const char *next_line(const char *str) +{ + const char *nl = strchrnul(str, '\n'); + return nl + !!*nl; +} - strbuf_release(); +/* + * Return the position of the start of the last line. If len is 0, return -1. + */ +static int last_line(const char *buf, size_t len) +{ + int i; + if (len == 0) + return -1; + if (len == 1) + return 0; + /* +* Skip the last character (in addition to the null terminator), +* because if the last character is a newline, it is considered as part +* of the last line anyway. +*/ + i = len - 2; - return lines; + for (; i >= 0; i--) { + if (buf[i] == '\n') + return i + 1; + } + return 0; } /* - * Return the (0 based) index of the start of the patch or the line - * count if there is no patch in the message. + * Return the position of the start of the patch or the length of str if there + * is no patch in the message. */ -static int find_patch_start(struct strbuf **lines, int count) +static int find_patch_start(const char *str) { - int i; + const char *s; - /* Get the start of the patch part if any */ - for (i = 0; i < count; i++) { - if (starts_with(lines[i]->buf, "---")) - return i; + for (s = str; *s; s = next_line(s)) { + if (starts_with(s, "---")) + return s - str; } - return count; + return s - str; } /* - * Return the (0 based) index of the first trailer line or count if - * there are no trailers. Trailers are searched only in the lines from - * index (count - 1) down to index 0. + * Return the position of the first trailer line or len if there are no + * trailers. */ -static int find_trailer_start(struct strbuf **lines, int count) +static int find_trailer_start(const char *buf, size_t len) { - int start, end_of_title, only_spaces = 1; + const char *s; + int end_of_title, l, only_spaces = 1; int recognized_prefix = 0, trailer_lines = 0, non_trailer_lines = 0; /* * Number of possible continuation lines encountered. This will be @@ -750,15 +770,16 @@ static int find_trailer_start(struct strbuf **lines, int count) * are to be considered non-trailers). */ int possible_continuation_lines = 0; + int ret; /* The first paragraph is the title and cannot be trailers */ - for (start = 0; start < count; start++) { - if (lines[start]->buf[0] == comment_line_char) + for (s = buf; s < bu
[PATCH v2 4/5] trailer: have function to describe trailer layout
Create a function that, taking a string, describes the position of its trailer block (if available) and the contents thereof, and make trailer use it. This makes it easier for other Git components, in the future, to interpret trailer blocks in the same way as trailer. In a subsequent patch, another component will be made to use this. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 118 +++--- trailer.h | 25 + 2 files changed, 107 insertions(+), 36 deletions(-) diff --git a/trailer.c b/trailer.c index f5427ec..7265a50 100644 --- a/trailer.c +++ b/trailer.c @@ -46,6 +46,8 @@ static LIST_HEAD(conf_head); static char *separators = ":"; +static int configured; + #define TRAILER_ARG_STRING "$ARG" static const char *git_generated_prefixes[] = { @@ -546,6 +548,17 @@ static int git_trailer_config(const char *conf_key, const char *value, void *cb) return 0; } +static void ensure_configured(void) +{ + if (configured) + return; + + /* Default config must be setup first */ + git_config(git_trailer_default_config, NULL); + git_config(git_trailer_config, NULL); + configured = 1; +} + static const char *token_from_item(struct arg_item *item, char *tok) { if (item->conf.key) @@ -870,59 +883,43 @@ static int process_input_file(FILE *outfile, const char *str, struct list_head *head) { - int patch_start, trailer_start, trailer_end; + struct trailer_info info; struct strbuf tok = STRBUF_INIT; struct strbuf val = STRBUF_INIT; - struct trailer_item *last = NULL; - struct strbuf *trailer, **trailer_lines, **ptr; + int i; - patch_start = find_patch_start(str); - trailer_end = find_trailer_end(str, patch_start); - trailer_start = find_trailer_start(str, trailer_end); + trailer_info_get(, str); /* Print lines before the trailers as is */ - fwrite(str, 1, trailer_start, outfile); + fwrite(str, 1, info.trailer_start - str, outfile); - if (!ends_with_blank_line(str, trailer_start)) + if (!info.blank_line_before_trailer) fprintf(outfile, "\n"); - /* Parse trailer lines */ - trailer_lines = strbuf_split_buf(str + trailer_start, -trailer_end - trailer_start, -'\n', -0); - for (ptr = trailer_lines; *ptr; ptr++) { + for (i = 0; i < info.trailer_nr; i++) { int separator_pos; - trailer = *ptr; - if (trailer->buf[0] == comment_line_char) - continue; - if (last && isspace(trailer->buf[0])) { - struct strbuf sb = STRBUF_INIT; - strbuf_addf(, "%s\n%s", last->value, trailer->buf); - strbuf_strip_suffix(, "\n"); - free(last->value); - last->value = strbuf_detach(, NULL); + char *trailer = info.trailers[i]; + if (trailer[0] == comment_line_char) continue; - } - separator_pos = find_separator(trailer->buf, separators); + separator_pos = find_separator(trailer, separators); if (separator_pos >= 1) { - parse_trailer(, , NULL, trailer->buf, + parse_trailer(, , NULL, trailer, separator_pos); - last = add_trailer_item(head, - strbuf_detach(, NULL), - strbuf_detach(, NULL)); + add_trailer_item(head, +strbuf_detach(, NULL), +strbuf_detach(, NULL)); } else { - strbuf_addbuf(, trailer); + strbuf_addstr(, trailer); strbuf_strip_suffix(, "\n"); add_trailer_item(head, NULL, strbuf_detach(, NULL)); - last = NULL; } } - strbuf_list_free(trailer_lines); - return trailer_end; + trailer_info_release(); + + return info.trailer_end - str; } static void free_all(struct list_head *head) @@ -973,9 +970,7 @@ void process_trailers(const char *file, int in_place, int trim_empty, struct str int trailer_end; FILE *outfile = stdout; - /* Default config must be setup first */ - git_config
[PATCH v2 2/5] commit: make ignore_non_trailer take buf/len
Make ignore_non_trailer take a buf/len pair instead of struct strbuf. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/commit.c | 2 +- commit.c | 22 +++--- commit.h | 2 +- trailer.c| 2 +- 4 files changed, 14 insertions(+), 14 deletions(-) diff --git a/builtin/commit.c b/builtin/commit.c index 8976c3d..887ccc7 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -790,7 +790,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix, strbuf_stripspace(, 0); if (signoff) - append_signoff(, ignore_non_trailer(), 0); + append_signoff(, ignore_non_trailer(sb.buf, sb.len), 0); if (fwrite(sb.buf, 1, sb.len, s->fp) < sb.len) die_errno(_("could not write commit template")); diff --git a/commit.c b/commit.c index 856fd4a..2cf8515 100644 --- a/commit.c +++ b/commit.c @@ -1649,7 +1649,7 @@ const char *find_commit_header(const char *msg, const char *key, size_t *out_len } /* - * Inspect sb and determine the true "end" of the log message, in + * Inspect the given string and determine the true "end" of the log message, in * order to find where to put a new Signed-off-by: line. Ignored are * trailing comment lines and blank lines, and also the traditional * "Conflicts:" block that is not commented out, so that we can use @@ -1659,37 +1659,37 @@ const char *find_commit_header(const char *msg, const char *key, size_t *out_len * Returns the number of bytes from the tail to ignore, to be fed as * the second parameter to append_signoff(). */ -int ignore_non_trailer(struct strbuf *sb) +int ignore_non_trailer(const char *buf, size_t len) { int boc = 0; int bol = 0; int in_old_conflicts_block = 0; - while (bol < sb->len) { - char *next_line; + while (bol < len) { + const char *next_line = memchr(buf + bol, '\n', len - bol); - if (!(next_line = memchr(sb->buf + bol, '\n', sb->len - bol))) - next_line = sb->buf + sb->len; + if (!next_line) + next_line = buf + len; else next_line++; - if (sb->buf[bol] == comment_line_char || sb->buf[bol] == '\n') { + if (buf[bol] == comment_line_char || buf[bol] == '\n') { /* is this the first of the run of comments? */ if (!boc) boc = bol; /* otherwise, it is just continuing */ - } else if (starts_with(sb->buf + bol, "Conflicts:\n")) { + } else if (starts_with(buf + bol, "Conflicts:\n")) { in_old_conflicts_block = 1; if (!boc) boc = bol; - } else if (in_old_conflicts_block && sb->buf[bol] == '\t') { + } else if (in_old_conflicts_block && buf[bol] == '\t') { ; /* a pathname in the conflicts block */ } else if (boc) { /* the previous was not trailing comment */ boc = 0; in_old_conflicts_block = 0; } - bol = next_line - sb->buf; + bol = next_line - buf; } - return boc ? sb->len - boc : 0; + return boc ? len - boc : 0; } diff --git a/commit.h b/commit.h index afd14f3..9c12abb 100644 --- a/commit.h +++ b/commit.h @@ -355,7 +355,7 @@ extern const char *find_commit_header(const char *msg, const char *key, size_t *out_len); /* Find the end of the log message, the right place for a new trailer. */ -extern int ignore_non_trailer(struct strbuf *sb); +extern int ignore_non_trailer(const char *buf, size_t len); typedef void (*each_mergetag_fn)(struct commit *commit, struct commit_extra_header *extra, void *cb_data); diff --git a/trailer.c b/trailer.c index 0ee634f..04edab2 100644 --- a/trailer.c +++ b/trailer.c @@ -836,7 +836,7 @@ static int find_trailer_end(struct strbuf **lines, int patch_start) for (i = 0; i < patch_start; i++) strbuf_addbuf(, lines[i]); - ignore_bytes = ignore_non_trailer(); + ignore_bytes = ignore_non_trailer(sb.buf, sb.len); strbuf_release(); for (i = patch_start - 1; i >= 0 && ignore_bytes > 0; i--) ignore_bytes -= lines[i]->len; -- 2.8.0.rc3.226.g39d4020
[PATCH v2 5/5] sequencer: use trailer's trailer layout
Make sequencer use trailer.c's trailer layout definition, as opposed to parsing the footer by itself. This makes "commit -s", "cherry-pick -x", and "format-patch --signoff" consistent with trailer, allowing non-trailer lines and multiple-line trailers in trailer blocks under certain conditions, and therefore suppressing the extra newline in those cases. Consistency with trailer extends to respecting trailer configs. Tests have been included to show that. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- sequencer.c | 75 +--- t/t3511-cherry-pick-x.sh | 16 +-- t/t4014-format-patch.sh | 37 t/t7501-commit.sh| 36 +++ 4 files changed, 95 insertions(+), 69 deletions(-) diff --git a/sequencer.c b/sequencer.c index 5fd75f3..d64c973 100644 --- a/sequencer.c +++ b/sequencer.c @@ -16,6 +16,7 @@ #include "refs.h" #include "argv-array.h" #include "quote.h" +#include "trailer.h" #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION" @@ -56,30 +57,6 @@ static const char *get_todo_path(const struct replay_opts *opts) return git_path_todo_file(); } -static int is_rfc2822_line(const char *buf, int len) -{ - int i; - - for (i = 0; i < len; i++) { - int ch = buf[i]; - if (ch == ':') - return 1; - if (!isalnum(ch) && ch != '-') - break; - } - - return 0; -} - -static int is_cherry_picked_from_line(const char *buf, int len) -{ - /* -* We only care that it looks roughly like (cherry picked from ...) -*/ - return len > strlen(cherry_picked_prefix) + 1 && - starts_with(buf, cherry_picked_prefix) && buf[len - 1] == ')'; -} - /* * Returns 0 for non-conforming footer * Returns 1 for conforming footer @@ -89,49 +66,25 @@ static int is_cherry_picked_from_line(const char *buf, int len) static int has_conforming_footer(struct strbuf *sb, struct strbuf *sob, int ignore_footer) { - char prev; - int i, k; - int len = sb->len - ignore_footer; - const char *buf = sb->buf; - int found_sob = 0; - - /* footer must end with newline */ - if (!len || buf[len - 1] != '\n') - return 0; + struct trailer_info info; + int i; + int found_sob = 0, found_sob_last = 0; - prev = '\0'; - for (i = len - 1; i > 0; i--) { - char ch = buf[i]; - if (prev == '\n' && ch == '\n') /* paragraph break */ - break; - prev = ch; - } + trailer_info_get(, sb->buf); - /* require at least one blank line */ - if (prev != '\n' || buf[i] != '\n') + if (info.trailer_start == info.trailer_end) return 0; - /* advance to start of last paragraph */ - while (i < len - 1 && buf[i] == '\n') - i++; - - for (; i < len; i = k) { - int found_rfc2822; - - for (k = i; k < len && buf[k] != '\n'; k++) - ; /* do nothing */ - k++; + for (i = 0; i < info.trailer_nr; i++) + if (sob && !strncmp(info.trailers[i], sob->buf, sob->len)) { + found_sob = 1; + if (i == info.trailer_nr - 1) + found_sob_last = 1; + } - found_rfc2822 = is_rfc2822_line(buf + i, k - i - 1); - if (found_rfc2822 && sob && - !strncmp(buf + i, sob->buf, sob->len)) - found_sob = k; + trailer_info_release(); - if (!(found_rfc2822 || - is_cherry_picked_from_line(buf + i, k - i - 1))) - return 0; - } - if (found_sob == i) + if (found_sob_last) return 3; if (found_sob) return 2; diff --git a/t/t3511-cherry-pick-x.sh b/t/t3511-cherry-pick-x.sh index 9cce5ae..bf0a5c9 100755 --- a/t/t3511-cherry-pick-x.sh +++ b/t/t3511-cherry-pick-x.sh @@ -25,9 +25,8 @@ Signed-off-by: B.U. Thor <but...@example.com>" mesg_broken_footer="$mesg_no_footer -The signed-off-by string should begin with the words Signed-off-by followed -by a colon and space, and then the signers name and email address. e.g. -Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>" +This is not recognized as a footer because Myfooter is not a recognized token. +Myfooter: A.U. Thor <aut...@example.com>" mesg_with_footer_sob="$mesg_with_footer Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>" @@ -112,6 +111,17 @@ test_e
Re: [PATCH 4/4] sequencer: use trailer's trailer layout
On 10/31/2016 06:11 PM, Junio C Hamano wrote: Jonathan Tan <jonathanta...@google.com> writes: diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh index ba4902d..635b394 100755 --- a/t/t4014-format-patch.sh +++ b/t/t4014-format-patch.sh @@ -1277,8 +1277,7 @@ EOF 4:Subject: [PATCH] subject 8: 9:I want to mention about Signed-off-by: here. -10: -11:Signed-off-by: C O Mitter <commit...@example.com> +10:Signed-off-by: C O Mitter <commit...@example.com> EOF test_cmp expected actual ' The original log message is a single-liner subject line, blank, "I want to mention..." and when asked to append S-o-b:, we would want to see a blank before the added S-o-b, no? This seems a bit weird. This is because the "I want to mention" block has 100% trailer lines (since its only line contains a colon). We could forbid spaces in trailer field names, but as you said [1], it might be better to allow them since users might include them. The original sequencer.c interpreted this block as not a trailer block, because it only accepted alphanumeric characters or '-' before the colon (and no spaces) - hence the difference in behavior. [1] <xmqqbmyhr4vt@gitster.mtv.corp.google.com>
[PATCH 3/4] trailer: have function to describe trailer layout
Create a function that, taking a string, describes the position of its trailer block (if available) and the contents thereof, and make trailer use it. This makes it easier for other Git components, in the future, to interpret trailer blocks in the same way as trailer. In a subsequent patch, another component will be made to use this. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 120 +++--- trailer.h | 25 + 2 files changed, 108 insertions(+), 37 deletions(-) diff --git a/trailer.c b/trailer.c index d4d9e10..2f5c815 100644 --- a/trailer.c +++ b/trailer.c @@ -46,6 +46,8 @@ static LIST_HEAD(conf_head); static char *separators = ":"; +static int configured = 0; + #define TRAILER_ARG_STRING "$ARG" static const char *git_generated_prefixes[] = { @@ -549,6 +551,17 @@ static int git_trailer_config(const char *conf_key, const char *value, void *cb) return 0; } +static void ensure_configured(void) +{ + if (configured) + return; + + /* Default config must be setup first */ + git_config(git_trailer_default_config, NULL); + git_config(git_trailer_config, NULL); + configured = 1; +} + static const char *token_from_item(struct arg_item *item, char *tok) { if (item->conf.key) @@ -872,59 +885,43 @@ static int process_input_file(FILE *outfile, const char *str, struct list_head *head) { - int patch_start, trailer_start, trailer_end; + struct trailer_info info; struct strbuf tok = STRBUF_INIT; struct strbuf val = STRBUF_INIT; - struct trailer_item *last = NULL; - struct strbuf *trailer, **trailer_lines, **ptr; + int i; - patch_start = find_patch_start(str); - trailer_end = find_trailer_end(str, patch_start); - trailer_start = find_trailer_start(str, trailer_end); + trailer_info_get(, str); /* Print lines before the trailers as is */ - fwrite(str, 1, trailer_start, outfile); + fwrite(str, 1, info.trailer_start - str, outfile); - if (!ends_with_blank_line(str, trailer_start)) + if (!info.blank_line_before_trailer) fprintf(outfile, "\n"); - /* Parse trailer lines */ - trailer_lines = strbuf_split_buf(str + trailer_start, -trailer_end - trailer_start, -'\n', -0); - for (ptr = trailer_lines; *ptr; ptr++) { + for (i = 0; i < info.trailer_nr; i++) { int separator_pos; - trailer = *ptr; - if (trailer->buf[0] == comment_line_char) + char *trailer = info.trailers[i]; + if (trailer[0] == comment_line_char) continue; - if (last && isspace(trailer->buf[0])) { - struct strbuf sb = STRBUF_INIT; - strbuf_addf(, "%s\n%s", last->value, trailer->buf); - strbuf_strip_suffix(, "\n"); - free(last->value); - last->value = strbuf_detach(, NULL); - continue; - } - separator_pos = find_separator(trailer->buf, separators); + separator_pos = find_separator(trailer, separators); if (separator_pos >= 1) { - parse_trailer(, , NULL, trailer->buf, - separator_pos); - last = add_trailer_item(head, - strbuf_detach(, NULL), - strbuf_detach(, NULL)); + parse_trailer(, , NULL, trailer, + separator_pos); + add_trailer_item(head, +strbuf_detach(, NULL), +strbuf_detach(, NULL)); } else { - strbuf_addbuf(, trailer); + strbuf_addstr(, trailer); strbuf_strip_suffix(, "\n"); add_trailer_item(head, NULL, strbuf_detach(, NULL)); - last = NULL; } } - strbuf_list_free(trailer_lines); - return trailer_end; + trailer_info_release(); + + return info.trailer_end - str; } static void free_all(struct list_head *head) @@ -975,9 +972,7 @@ void process_trailers(const char *file, int in_place, int trim_empty, struct str int trailer_end; FILE *outfile = stdout; - /* Def
[PATCH 4/4] sequencer: use trailer's trailer layout
Make sequencer use trailer.c's trailer layout definition, as opposed to parsing the footer by itself. This makes "commit -s", "cherry-pick -x", and "format-patch --signoff" consistent with trailer, allowing non-trailer lines and multiple-line trailers in trailer blocks under certain conditions, and therefore suppressing the extra newline in those cases. Consistency with trailer extends to respecting trailer configs. Tests have been included to show that. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- sequencer.c | 75 +--- t/t3511-cherry-pick-x.sh | 16 +-- t/t4014-format-patch.sh | 40 +- t/t7501-commit.sh| 36 +++ 4 files changed, 96 insertions(+), 71 deletions(-) diff --git a/sequencer.c b/sequencer.c index eec8a60..ec0157d 100644 --- a/sequencer.c +++ b/sequencer.c @@ -15,6 +15,7 @@ #include "merge-recursive.h" #include "refs.h" #include "argv-array.h" +#include "trailer.h" #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION" @@ -26,30 +27,6 @@ static GIT_PATH_FUNC(git_path_opts_file, SEQ_OPTS_FILE) static GIT_PATH_FUNC(git_path_seq_dir, SEQ_DIR) static GIT_PATH_FUNC(git_path_head_file, SEQ_HEAD_FILE) -static int is_rfc2822_line(const char *buf, int len) -{ - int i; - - for (i = 0; i < len; i++) { - int ch = buf[i]; - if (ch == ':') - return 1; - if (!isalnum(ch) && ch != '-') - break; - } - - return 0; -} - -static int is_cherry_picked_from_line(const char *buf, int len) -{ - /* -* We only care that it looks roughly like (cherry picked from ...) -*/ - return len > strlen(cherry_picked_prefix) + 1 && - starts_with(buf, cherry_picked_prefix) && buf[len - 1] == ')'; -} - /* * Returns 0 for non-conforming footer * Returns 1 for conforming footer @@ -59,49 +36,25 @@ static int is_cherry_picked_from_line(const char *buf, int len) static int has_conforming_footer(struct strbuf *sb, struct strbuf *sob, int ignore_footer) { - char prev; - int i, k; - int len = sb->len - ignore_footer; - const char *buf = sb->buf; - int found_sob = 0; - - /* footer must end with newline */ - if (!len || buf[len - 1] != '\n') - return 0; + struct trailer_info info; + int i; + int found_sob = 0, found_sob_last = 0; - prev = '\0'; - for (i = len - 1; i > 0; i--) { - char ch = buf[i]; - if (prev == '\n' && ch == '\n') /* paragraph break */ - break; - prev = ch; - } + trailer_info_get(, sb->buf); - /* require at least one blank line */ - if (prev != '\n' || buf[i] != '\n') + if (info.trailer_start == info.trailer_end) return 0; - /* advance to start of last paragraph */ - while (i < len - 1 && buf[i] == '\n') - i++; - - for (; i < len; i = k) { - int found_rfc2822; - - for (k = i; k < len && buf[k] != '\n'; k++) - ; /* do nothing */ - k++; + for (i = 0; i < info.trailer_nr; i++) + if (sob && !strncmp(info.trailers[i], sob->buf, sob->len)) { + found_sob = 1; + if (i == info.trailer_nr - 1) + found_sob_last = 1; + } - found_rfc2822 = is_rfc2822_line(buf + i, k - i - 1); - if (found_rfc2822 && sob && - !strncmp(buf + i, sob->buf, sob->len)) - found_sob = k; + trailer_info_release(); - if (!(found_rfc2822 || - is_cherry_picked_from_line(buf + i, k - i - 1))) - return 0; - } - if (found_sob == i) + if (found_sob_last) return 3; if (found_sob) return 2; diff --git a/t/t3511-cherry-pick-x.sh b/t/t3511-cherry-pick-x.sh index 9cce5ae..bf0a5c9 100755 --- a/t/t3511-cherry-pick-x.sh +++ b/t/t3511-cherry-pick-x.sh @@ -25,9 +25,8 @@ Signed-off-by: B.U. Thor <but...@example.com>" mesg_broken_footer="$mesg_no_footer -The signed-off-by string should begin with the words Signed-off-by followed -by a colon and space, and then the signers name and email address. e.g. -Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>" +This is not recognized as a footer because Myfooter is not a recognized token. +Myfooter: A.U. Thor <aut...@example.com>" mesg_with_footer_sob="$mesg_with_footer Signed-off-by: $G
[PATCH 1/4] commit: make ignore_non_trailer take buf/len
Make ignore_non_trailer take a buf/len pair instead of struct strbuf. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- builtin/commit.c | 2 +- commit.c | 22 +++--- commit.h | 2 +- trailer.c| 2 +- 4 files changed, 14 insertions(+), 14 deletions(-) diff --git a/builtin/commit.c b/builtin/commit.c index 1cba3b7..c26b939 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -790,7 +790,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix, strbuf_stripspace(, 0); if (signoff) - append_signoff(, ignore_non_trailer(), 0); + append_signoff(, ignore_non_trailer(sb.buf, sb.len), 0); if (fwrite(sb.buf, 1, sb.len, s->fp) < sb.len) die_errno(_("could not write commit template")); diff --git a/commit.c b/commit.c index 856fd4a..f70835a 100644 --- a/commit.c +++ b/commit.c @@ -1649,7 +1649,7 @@ const char *find_commit_header(const char *msg, const char *key, size_t *out_len } /* - * Inspect sb and determine the true "end" of the log message, in + * Inspect the given string and determine the true "end" of the log message, in * order to find where to put a new Signed-off-by: line. Ignored are * trailing comment lines and blank lines, and also the traditional * "Conflicts:" block that is not commented out, so that we can use @@ -1659,37 +1659,37 @@ const char *find_commit_header(const char *msg, const char *key, size_t *out_len * Returns the number of bytes from the tail to ignore, to be fed as * the second parameter to append_signoff(). */ -int ignore_non_trailer(struct strbuf *sb) +int ignore_non_trailer(const char *buf, size_t len) { int boc = 0; int bol = 0; int in_old_conflicts_block = 0; - while (bol < sb->len) { - char *next_line; + while (bol < len) { + const char *next_line; - if (!(next_line = memchr(sb->buf + bol, '\n', sb->len - bol))) - next_line = sb->buf + sb->len; + if (!(next_line = memchr(buf + bol, '\n', len - bol))) + next_line = buf + len; else next_line++; - if (sb->buf[bol] == comment_line_char || sb->buf[bol] == '\n') { + if (buf[bol] == comment_line_char || buf[bol] == '\n') { /* is this the first of the run of comments? */ if (!boc) boc = bol; /* otherwise, it is just continuing */ - } else if (starts_with(sb->buf + bol, "Conflicts:\n")) { + } else if (starts_with(buf + bol, "Conflicts:\n")) { in_old_conflicts_block = 1; if (!boc) boc = bol; - } else if (in_old_conflicts_block && sb->buf[bol] == '\t') { + } else if (in_old_conflicts_block && buf[bol] == '\t') { ; /* a pathname in the conflicts block */ } else if (boc) { /* the previous was not trailing comment */ boc = 0; in_old_conflicts_block = 0; } - bol = next_line - sb->buf; + bol = next_line - buf; } - return boc ? sb->len - boc : 0; + return boc ? len - boc : 0; } diff --git a/commit.h b/commit.h index afd14f3..9c12abb 100644 --- a/commit.h +++ b/commit.h @@ -355,7 +355,7 @@ extern const char *find_commit_header(const char *msg, const char *key, size_t *out_len); /* Find the end of the log message, the right place for a new trailer. */ -extern int ignore_non_trailer(struct strbuf *sb); +extern int ignore_non_trailer(const char *buf, size_t len); typedef void (*each_mergetag_fn)(struct commit *commit, struct commit_extra_header *extra, void *cb_data); diff --git a/trailer.c b/trailer.c index d19a92c..6d8375b 100644 --- a/trailer.c +++ b/trailer.c @@ -828,7 +828,7 @@ static int find_trailer_end(struct strbuf **lines, int patch_start) for (i = 0; i < patch_start; i++) strbuf_addbuf(, lines[i]); - ignore_bytes = ignore_non_trailer(); + ignore_bytes = ignore_non_trailer(sb.buf, sb.len); strbuf_release(); for (i = patch_start - 1; i >= 0 && ignore_bytes > 0; i--) ignore_bytes -= lines[i]->len; -- 2.8.0.rc3.226.g39d4020
[PATCH 2/4] trailer: avoid unnecessary splitting on lines
trailer.c currently splits lines while processing a buffer (and also rejoins lines when needing to invoke ignore_non_trailer). Avoid such line splitting, except when generating the strings corresponding to trailers (for ease of use by clients - a subsequent patch will allow other components to obtain the layout of a trailer block in a buffer, including the trailers themselves). The main purpose of this is to make it easy to return pointers into the original buffer (for a subsequent patch), but this also significantly reduces the number of memory allocations required. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 215 +- 1 file changed, 116 insertions(+), 99 deletions(-) diff --git a/trailer.c b/trailer.c index 6d8375b..d4d9e10 100644 --- a/trailer.c +++ b/trailer.c @@ -102,12 +102,12 @@ static int same_trailer(struct trailer_item *a, struct arg_item *b) return same_token(a, b) && same_value(a, b); } -static inline int contains_only_spaces(const char *str) +static inline int is_blank_line(const char *str) { const char *s = str; - while (*s && isspace(*s)) + while (*s && *s != '\n' && isspace(*s)) s++; - return !*s; + return !*s || *s == '\n'; } static inline void strbuf_replace(struct strbuf *sb, const char *a, const char *b) @@ -566,13 +566,18 @@ static int token_matches_item(const char *tok, struct arg_item *item, int tok_le } /* - * Return the location of the first separator in line, or -1 if there is no - * separator. + * Return the location of the first separator in the given line, or -1 if there + * is no separator. + * + * The interests parameter must contain the acceptable separators and may + * contain '\n'. If interests contains '\n', the input line is considered to + * end at the first newline encountered. Otherwise, the input line is + * considered to end at its null terminator. */ -static int find_separator(const char *line, const char *separators) +static int find_separator(const char *line, const char *interests) { - int loc = strcspn(line, separators); - if (!line[loc]) + int loc = strcspn(line, interests); + if (!line[loc] || line[loc] == '\n') return -1; return loc; } @@ -688,51 +693,71 @@ static void process_command_line_args(struct list_head *arg_head, free(cl_separators); } -static struct strbuf **read_input_file(const char *file) +static void read_input_file(struct strbuf *sb, const char *file) { - struct strbuf **lines; - struct strbuf sb = STRBUF_INIT; - if (file) { - if (strbuf_read_file(, file, 0) < 0) + if (strbuf_read_file(sb, file, 0) < 0) die_errno(_("could not read input file '%s'"), file); } else { - if (strbuf_read(, fileno(stdin), 0) < 0) + if (strbuf_read(sb, fileno(stdin), 0) < 0) die_errno(_("could not read from stdin")); } +} - lines = strbuf_split(, '\n'); +static const char *next_line(const char *str) +{ + const char *nl = strchrnul(str, '\n'); + return nl + !!*nl; +} - strbuf_release(); +/* + * Return the position of the start of the last line. If len is 0, return -1. + */ +static int last_line(const char *buf, size_t len) +{ + int i; + if (len == 0) + return -1; + if (len == 1) + return 0; + /* +* Skip the last character (in addition to the null terminator), +* because if the last character is a newline, it is considered as part +* of the last line anyway. +*/ + i = len - 2; - return lines; + for (; i >= 0; i--) { + if (buf[i] == '\n') + return i + 1; + } + return 0; } /* - * Return the (0 based) index of the start of the patch or the line - * count if there is no patch in the message. + * Return the position of the start of the patch or the length of str if there + * is no patch in the message. */ -static int find_patch_start(struct strbuf **lines, int count) +static int find_patch_start(const char *str) { - int i; + const char *s; - /* Get the start of the patch part if any */ - for (i = 0; i < count; i++) { - if (starts_with(lines[i]->buf, "---")) - return i; + for (s = str; *s; s = next_line(s)) { + if (starts_with(s, "---")) + return s - str; } - return count; + return s - str; } /* - * Return the (0 based) index of the first trailer line or count if - * there are no trailers. Trailers are searched only in the lines from - * index (count - 1) down to index 0. + * Return the position of the
[PATCH 0/4] Make other git commands use trailer layout
This is built off jt/trailer-with-cruft (commit 60ef86a). This patch set makes "commit -s", "cherry-pick -x", and "format-patch --signoff" use the new trailer definition implemented in jt/trailer-with-cruft, with some refactoring along the way. With this patch set, the aforementioned commands would now handle trailers like those described in [1]. [1] <84f28caa-2e4b-1231-1a76-3b7e765c0...@google.com> Jonathan Tan (4): commit: make ignore_non_trailer take buf/len trailer: avoid unnecessary splitting on lines trailer: have function to describe trailer layout sequencer: use trailer's trailer layout builtin/commit.c | 2 +- commit.c | 22 ++-- commit.h | 2 +- sequencer.c | 75 +++- t/t3511-cherry-pick-x.sh | 16 ++- t/t4014-format-patch.sh | 40 +-- t/t7501-commit.sh| 36 ++ trailer.c| 295 --- trailer.h| 25 9 files changed, 313 insertions(+), 200 deletions(-) -- 2.8.0.rc3.226.g39d4020
Re: What's cooking in git.git (Oct 2016, #07; Wed, 26)
On 10/26/2016 03:29 PM, Junio C Hamano wrote: * jt/trailer-with-cruft (2016-10-21) 8 commits - trailer: support values folded to multiple lines - trailer: forbid leading whitespace in trailers - trailer: allow non-trailers in trailer block - trailer: clarify failure modes in parse_trailer - trailer: make args have their own struct - trailer: streamline trailer item create and add - trailer: use list.h for doubly-linked list - trailer: improve const correctness Update "interpret-trailers" machinery and teaches it that people in real world write all sorts of crufts in the "trailer" that was originally designed to have the neat-o "Mail-Header: like thing" and nothing else. Waiting for review. For this patch set, should I look for another reviewer? In the e-mail thread [1], you and Stefan seemed positive about this patch set. (I've also checked [2] that this is not merged yet.) [1][2] git log --grep='jt/trailer'
[PATCH v5 3/8] trailer: streamline trailer item create and add
Currently, creation and addition (to a list) of trailer items are spread across multiple functions. Streamline this by only having 2 functions: one to parse the user-supplied string, and one to add the parsed information to a list. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 130 +- 1 file changed, 60 insertions(+), 70 deletions(-) diff --git a/trailer.c b/trailer.c index 4e85aae..ae3972a 100644 --- a/trailer.c +++ b/trailer.c @@ -500,10 +500,31 @@ static int git_trailer_config(const char *conf_key, const char *value, void *cb) return 0; } -static int parse_trailer(struct strbuf *tok, struct strbuf *val, const char *trailer) +static const char *token_from_item(struct trailer_item *item, char *tok) +{ + if (item->conf.key) + return item->conf.key; + if (tok) + return tok; + return item->conf.name; +} + +static int token_matches_item(const char *tok, struct trailer_item *item, int tok_len) +{ + if (!strncasecmp(tok, item->conf.name, tok_len)) + return 1; + return item->conf.key ? !strncasecmp(tok, item->conf.key, tok_len) : 0; +} + +static int parse_trailer(struct strbuf *tok, struct strbuf *val, +const struct conf_info **conf, const char *trailer) { size_t len; struct strbuf seps = STRBUF_INIT; + struct trailer_item *item; + int tok_len; + struct list_head *pos; + strbuf_addstr(, separators); strbuf_addch(, '='); len = strcspn(trailer, seps.buf); @@ -523,74 +544,31 @@ static int parse_trailer(struct strbuf *tok, struct strbuf *val, const char *tra strbuf_addstr(tok, trailer); strbuf_trim(tok); } - return 0; -} - -static const char *token_from_item(struct trailer_item *item, char *tok) -{ - if (item->conf.key) - return item->conf.key; - if (tok) - return tok; - return item->conf.name; -} - -static struct trailer_item *new_trailer_item(struct trailer_item *conf_item, -char *tok, char *val) -{ - struct trailer_item *new = xcalloc(sizeof(*new), 1); - new->value = val ? val : xstrdup(""); - - if (conf_item) { - duplicate_conf(>conf, _item->conf); - new->token = xstrdup(token_from_item(conf_item, tok)); - free(tok); - } else { - duplicate_conf(>conf, _conf_info); - new->token = tok; - } - - return new; -} - -static int token_matches_item(const char *tok, struct trailer_item *item, int tok_len) -{ - if (!strncasecmp(tok, item->conf.name, tok_len)) - return 1; - return item->conf.key ? !strncasecmp(tok, item->conf.key, tok_len) : 0; -} - -static struct trailer_item *create_trailer_item(const char *string) -{ - struct strbuf tok = STRBUF_INIT; - struct strbuf val = STRBUF_INIT; - struct trailer_item *item; - int tok_len; - struct list_head *pos; - - if (parse_trailer(, , string)) - return NULL; - - tok_len = token_len_without_separator(tok.buf, tok.len); /* Lookup if the token matches something in the config */ + tok_len = token_len_without_separator(tok->buf, tok->len); + *conf = _conf_info; list_for_each(pos, _head) { item = list_entry(pos, struct trailer_item, list); - if (token_matches_item(tok.buf, item, tok_len)) - return new_trailer_item(item, - strbuf_detach(, NULL), - strbuf_detach(, NULL)); + if (token_matches_item(tok->buf, item, tok_len)) { + char *tok_buf = strbuf_detach(tok, NULL); + *conf = >conf; + strbuf_addstr(tok, token_from_item(item, tok_buf)); + free(tok_buf); + break; + } } - return new_trailer_item(NULL, - strbuf_detach(, NULL), - strbuf_detach(, NULL)); + return 0; } -static void add_trailer_item(struct list_head *head, struct trailer_item *new) +static void add_trailer_item(struct list_head *head, char *tok, char *val, +const struct conf_info *conf) { - if (!new) - return; + struct trailer_item *new = xcalloc(sizeof(*new), 1); + new->token = tok; + new->value = val; + duplicate_conf(>conf, conf); list_add_tail(>list, head); } @@ -599,21 +577,28 @@ static void process_command_line_args(struct list_head *arg_head, { struct string_list_item *
[PATCH v5 8/8] trailer: support values folded to multiple lines
Currently, interpret-trailers requires that a trailer be only on 1 line. For example: a: first line second line would be interpreted as one trailer line followed by one non-trailer line. Make interpret-trailers support RFC 822-style folding, treating those lines as one logical trailer. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Documentation/git-interpret-trailers.txt | 7 +- t/t7513-interpret-trailers.sh| 169 +++ trailer.c| 45 ++-- 3 files changed, 211 insertions(+), 10 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index 4966b5b..e99bda6 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -57,11 +57,12 @@ minus signs start the patch part of the message. When reading trailers, there can be whitespaces after the token, the separator and the value. There can also be whitespaces -inside the token and the value. +inside the token and the value. The value may be split over multiple lines with +each subsequent line starting with whitespace, like the "folding" in RFC 822. Note that 'trailers' do not follow and are not intended to follow many -rules for RFC 822 headers. For example they do not follow the line -folding rules, the encoding rules and probably many other rules. +rules for RFC 822 headers. For example they do not follow +the encoding rules and probably many other rules. OPTIONS --- diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index 3d94b3a..4dd1d7c 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -256,6 +256,175 @@ test_expect_success 'line with leading whitespace is not trailer' ' test_cmp expected actual ' +test_expect_success 'multiline field treated as one trailer for 25% check' ' + q_to_tab >patch <<-\EOF && + + Signed-off-by: a <a...@example.com> + name: value on + Qmultiple lines + this is not a trailer + this is not a trailer + this is not a trailer + this is not a trailer + this is not a trailer + this is not a trailer + EOF + q_to_tab >expected <<-\EOF && + + Signed-off-by: a <a...@example.com> + name: value on + Qmultiple lines + this is not a trailer + this is not a trailer + this is not a trailer + this is not a trailer + this is not a trailer + this is not a trailer + name: value + EOF + git interpret-trailers --trailer "name: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'multiline field treated as atomic for placement' ' + q_to_tab >patch <<-\EOF && + + another: trailer + name: value on + Qmultiple lines + another: trailer + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + name: value on + Qmultiple lines + name: value + another: trailer + EOF + test_config trailer.name.where after && + git interpret-trailers --trailer "name: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'multiline field treated as atomic for replacement' ' + q_to_tab >patch <<-\EOF && + + another: trailer + name: value on + Qmultiple lines + another: trailer + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + another: trailer + name: value + EOF + test_config trailer.name.ifexists replace && + git interpret-trailers --trailer "name: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'multiline field treated as atomic for difference check' ' + q_to_tab >patch <<-\EOF && + + another: trailer + name: first line + Qsecond line + another: trailer + EOF + test_config trailer.name.ifexists addIfDifferent && + + q_to_tab >trailer <<-\EOF && + name: first line + Qsecond line + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + name: first line + Qsecond line + another: trailer + EOF + git interpret-trailers --trailer "$(cat trailer)&quo
[PATCH v5 2/8] trailer: use list.h for doubly-linked list
Replace the existing handwritten implementation of a doubly-linked list in trailer.c with the functions and macros from list.h. This significantly simplifies the code. Signed-off-by: Jonathan Tan <jonathanta...@google.com> Signed-off-by: Ramsay Jones <ram...@ramsayjones.plus.com> --- trailer.c | 258 ++ 1 file changed, 91 insertions(+), 167 deletions(-) diff --git a/trailer.c b/trailer.c index 1f191b2..4e85aae 100644 --- a/trailer.c +++ b/trailer.c @@ -4,6 +4,7 @@ #include "commit.h" #include "tempfile.h" #include "trailer.h" +#include "list.h" /* * Copyright (c) 2013, 2014 Christian Couder <chrisc...@tuxfamily.org> */ @@ -25,19 +26,24 @@ struct conf_info { static struct conf_info default_conf_info; struct trailer_item { - struct trailer_item *previous; - struct trailer_item *next; + struct list_head list; char *token; char *value; struct conf_info conf; }; -static struct trailer_item *first_conf_item; +static LIST_HEAD(conf_head); static char *separators = ":"; #define TRAILER_ARG_STRING "$ARG" +/* Iterate over the elements of the list. */ +#define list_for_each_dir(pos, head, is_reverse) \ + for (pos = is_reverse ? (head)->prev : (head)->next; \ + pos != (head); \ + pos = is_reverse ? pos->prev : pos->next) + static int after_or_end(enum action_where where) { return (where == WHERE_AFTER) || (where == WHERE_END); @@ -120,101 +126,49 @@ static void print_tok_val(FILE *outfile, const char *tok, const char *val) fprintf(outfile, "%s%c %s\n", tok, separators[0], val); } -static void print_all(FILE *outfile, struct trailer_item *first, int trim_empty) +static void print_all(FILE *outfile, struct list_head *head, int trim_empty) { + struct list_head *pos; struct trailer_item *item; - for (item = first; item; item = item->next) { + list_for_each(pos, head) { + item = list_entry(pos, struct trailer_item, list); if (!trim_empty || strlen(item->value) > 0) print_tok_val(outfile, item->token, item->value); } } -static void update_last(struct trailer_item **last) -{ - if (*last) - while ((*last)->next != NULL) - *last = (*last)->next; -} - -static void update_first(struct trailer_item **first) -{ - if (*first) - while ((*first)->previous != NULL) - *first = (*first)->previous; -} - static void add_arg_to_input_list(struct trailer_item *on_tok, - struct trailer_item *arg_tok, - struct trailer_item **first, - struct trailer_item **last) -{ - if (after_or_end(arg_tok->conf.where)) { - arg_tok->next = on_tok->next; - on_tok->next = arg_tok; - arg_tok->previous = on_tok; - if (arg_tok->next) - arg_tok->next->previous = arg_tok; - update_last(last); - } else { - arg_tok->previous = on_tok->previous; - on_tok->previous = arg_tok; - arg_tok->next = on_tok; - if (arg_tok->previous) - arg_tok->previous->next = arg_tok; - update_first(first); - } + struct trailer_item *arg_tok) +{ + if (after_or_end(arg_tok->conf.where)) + list_add(_tok->list, _tok->list); + else + list_add_tail(_tok->list, _tok->list); } static int check_if_different(struct trailer_item *in_tok, struct trailer_item *arg_tok, - int check_all) + int check_all, + struct list_head *head) { enum action_where where = arg_tok->conf.where; + struct list_head *next_head; do { - if (!in_tok) - return 1; if (same_trailer(in_tok, arg_tok)) return 0; /* * if we want to add a trailer after another one, * we have to check those before this one */ - in_tok = after_or_end(where) ? in_tok->previous : in_tok->next; + next_head = after_or_end(where) ? in_tok->list.prev + : in_tok->list.next; + if (next_head == head) + break; + in_tok = list_entry(next_head, struct trailer_item, list); } while (check_all); return 1; } -static void
[PATCH v5 7/8] trailer: forbid leading whitespace in trailers
Currently, interpret-trailers allows leading whitespace in trailer lines. This leads to false positives, especially for quoted lines or bullet lists. Forbid leading whitespace in trailers. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Documentation/git-interpret-trailers.txt | 2 +- t/t7513-interpret-trailers.sh| 15 +++ trailer.c| 2 +- 3 files changed, 17 insertions(+), 2 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index cf4c5ea..4966b5b 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -55,7 +55,7 @@ The group must either be at the end of the message or be the last non-whitespace lines before a line that starts with '---'. Such three minus signs start the patch part of the message. -When reading trailers, there can be whitespaces before and after the +When reading trailers, there can be whitespaces after the token, the separator and the value. There can also be whitespaces inside the token and the value. diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index 003e90f..3d94b3a 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -241,6 +241,21 @@ test_expect_success 'with non-trailer lines only' ' test_cmp expected actual ' +test_expect_success 'line with leading whitespace is not trailer' ' + q_to_tab >patch <<-\EOF && + + Qtoken: value + EOF + q_to_tab >expected <<-\EOF && + + Qtoken: value + + token: value + EOF + git interpret-trailers --trailer "token: value" patch >actual && + test_cmp expected actual +' + test_expect_success 'with config setup' ' git config trailer.ack.key "Acked-by: " && cat >expected <<-\EOF && diff --git a/trailer.c b/trailer.c index 199f86a..d978437 100644 --- a/trailer.c +++ b/trailer.c @@ -775,7 +775,7 @@ static int find_trailer_start(struct strbuf **lines, int count) } separator_pos = find_separator(lines[start]->buf); - if (separator_pos >= 1) { + if (separator_pos >= 1 && !isspace(lines[start]->buf[0])) { struct list_head *pos; trailer_lines++; -- 2.8.0.rc3.226.g39d4020
[PATCH v5 6/8] trailer: allow non-trailers in trailer block
Currently, interpret-trailers requires all lines of a trailer block to be trailers (or comments) - if not it would not identify that block as a trailer block, and thus create its own trailer block, inserting a blank line. For example: echo -e "\nSigned-off-by: x\nnot trailer" | git interpret-trailers --trailer "c: d" would result in: Signed-off-by: x not trailer c: d Relax the definition of a trailer block to require that the trailers (i) are all trailers, or (ii) contain at least one Git-generated trailer and consists of at least 25% trailers. Signed-off-by: x not trailer c: d (i) is the existing functionality. (ii) allows arbitrary lines to be included in trailer blocks, like those in [1], and still allow interpret-trailers to be used. [1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/stable/linux-stable/+/e7d316a02f683864a12389f8808570e37fb90aa3 Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Documentation/git-interpret-trailers.txt | 5 +- t/t7513-interpret-trailers.sh| 115 +++ trailer.c| 89 3 files changed, 194 insertions(+), 15 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index 93d1db6..cf4c5ea 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -48,8 +48,9 @@ with only spaces at the end of the commit message part, one blank line will be added before the new trailer. Existing trailers are extracted from the input message by looking for -a group of one or more lines that contain a colon (by default), where -the group is preceded by one or more empty (or whitespace-only) lines. +a group of one or more lines that (i) are all trailers, or (ii) contains at +least one Git-generated trailer and consists of at least 25% trailers. +The group must be preceded by one or more empty (or whitespace-only) lines. The group must either be at the end of the message or be the last non-whitespace lines before a line that starts with '---'. Such three minus signs start the patch part of the message. diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index aee785c..003e90f 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -126,6 +126,121 @@ test_expect_success 'with multiline title in the message' ' test_cmp expected actual ' +test_expect_success 'with non-trailer lines mixed with Signed-off-by' ' + cat >patch <<-\EOF && + + this is not a trailer + this is not a trailer + Signed-off-by: a <a...@example.com> + this is not a trailer + EOF + cat >expected <<-\EOF && + + this is not a trailer + this is not a trailer + Signed-off-by: a <a...@example.com> + this is not a trailer + token: value + EOF + git interpret-trailers --trailer "token: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'with non-trailer lines mixed with cherry picked from' ' + cat >patch <<-\EOF && + + this is not a trailer + this is not a trailer + (cherry picked from commit x) + this is not a trailer + EOF + cat >expected <<-\EOF && + + this is not a trailer + this is not a trailer + (cherry picked from commit x) + this is not a trailer + token: value + EOF + git interpret-trailers --trailer "token: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'with non-trailer lines mixed with a configured trailer' ' + cat >patch <<-\EOF && + + this is not a trailer + this is not a trailer + My-trailer: x + this is not a trailer + EOF + cat >expected <<-\EOF && + + this is not a trailer + this is not a trailer + My-trailer: x + this is not a trailer + token: value + EOF + test_config trailer.my.key "My-trailer: " && + git interpret-trailers --trailer "token: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'with non-trailer lines mixed with a non-configured trailer' ' + cat >patch <<-\EOF && + + this is not a trailer + this is not a trailer + I-am-not-configured: x + this is not a trailer + EOF + cat >expected <<-\EOF && + +
[PATCH v5 0/8] allow non-trailers and multiple-line trailers
I've updated patch 5/8 to use strcspn and to pass in the list of separators, meaning that we no longer accept '=' in file input (and also updated its commit message accordingly). We also discussed inlining find_separator, but after looking at the code, I think that it is more convenient if find_separator returns -1 when there is no separator, because of the 3 times it is used (as of 8/8), it is checked twice with '>= 1' (since both "no separator" and "string begins with separator" are handled in the same way - treating them as a non-trailer line). So I have left it as its own function. No other updates. Jonathan Tan (8): trailer: improve const correctness trailer: use list.h for doubly-linked list trailer: streamline trailer item create and add trailer: make args have their own struct trailer: clarify failure modes in parse_trailer trailer: allow non-trailers in trailer block trailer: forbid leading whitespace in trailers trailer: support values folded to multiple lines Documentation/git-interpret-trailers.txt | 14 +- t/t7513-interpret-trailers.sh| 299 +++ trailer.c| 620 +-- 3 files changed, 654 insertions(+), 279 deletions(-) -- 2.8.0.rc3.226.g39d4020
[PATCH v5 1/8] trailer: improve const correctness
Change "const char *" to "char *" in struct trailer_item and in the return value of apply_command (since those strings are owned strings). Change "struct conf_info *" to "const struct conf_info *" (since that struct is not modified). Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/trailer.c b/trailer.c index c6ea9ac..1f191b2 100644 --- a/trailer.c +++ b/trailer.c @@ -27,8 +27,8 @@ static struct conf_info default_conf_info; struct trailer_item { struct trailer_item *previous; struct trailer_item *next; - const char *token; - const char *value; + char *token; + char *value; struct conf_info conf; }; @@ -95,8 +95,8 @@ static void free_trailer_item(struct trailer_item *item) free(item->conf.name); free(item->conf.key); free(item->conf.command); - free((char *)item->token); - free((char *)item->value); + free(item->token); + free(item->value); free(item); } @@ -215,13 +215,13 @@ static struct trailer_item *remove_first(struct trailer_item **first) return item; } -static const char *apply_command(const char *command, const char *arg) +static char *apply_command(const char *command, const char *arg) { struct strbuf cmd = STRBUF_INIT; struct strbuf buf = STRBUF_INIT; struct child_process cp = CHILD_PROCESS_INIT; const char *argv[] = {NULL, NULL}; - const char *result; + char *result; strbuf_addstr(, command); if (arg) @@ -425,7 +425,7 @@ static int set_if_missing(struct conf_info *item, const char *value) return 0; } -static void duplicate_conf(struct conf_info *dst, struct conf_info *src) +static void duplicate_conf(struct conf_info *dst, const struct conf_info *src) { *dst = *src; if (src->name) -- 2.8.0.rc3.226.g39d4020
[PATCH v5 5/8] trailer: clarify failure modes in parse_trailer
The parse_trailer function has a few modes of operation, all depending on whether the separator is present in its input, and if yes, the separator's position. Some of these modes are failure modes, and these failure modes are handled differently depending on whether the trailer line was sourced from a file or from a command-line argument. Extract a function to find the separator, allowing the invokers of parse_trailer to determine how to handle the failure modes instead of making parse_trailer do it. In this function, also take in the list of separators, so that we can distinguish between command line arguments (which allow '=' as separator) and file input (which does not allow '=' as separator). Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 75 --- 1 file changed, 53 insertions(+), 22 deletions(-) diff --git a/trailer.c b/trailer.c index 99018f8..aff858b 100644 --- a/trailer.c +++ b/trailer.c @@ -543,29 +543,37 @@ static int token_matches_item(const char *tok, struct arg_item *item, int tok_le return item->conf.key ? !strncasecmp(tok, item->conf.key, tok_len) : 0; } -static int parse_trailer(struct strbuf *tok, struct strbuf *val, -const struct conf_info **conf, const char *trailer) +/* + * Return the location of the first separator in line, or -1 if there is no + * separator. + */ +static int find_separator(const char *line, const char *separators) +{ + int loc = strcspn(line, separators); + if (!line[loc]) + return -1; + return loc; +} + +/* + * Obtain the token, value, and conf from the given trailer. + * + * separator_pos must not be 0, since the token cannot be an empty string. + * + * If separator_pos is -1, interpret the whole trailer as a token. + */ +static void parse_trailer(struct strbuf *tok, struct strbuf *val, +const struct conf_info **conf, const char *trailer, +int separator_pos) { - size_t len; - struct strbuf seps = STRBUF_INIT; struct arg_item *item; int tok_len; struct list_head *pos; - strbuf_addstr(, separators); - strbuf_addch(, '='); - len = strcspn(trailer, seps.buf); - strbuf_release(); - if (len == 0) { - int l = strlen(trailer); - while (l > 0 && isspace(trailer[l - 1])) - l--; - return error(_("empty trailer token in trailer '%.*s'"), l, trailer); - } - if (len < strlen(trailer)) { - strbuf_add(tok, trailer, len); + if (separator_pos != -1) { + strbuf_add(tok, trailer, separator_pos); strbuf_trim(tok); - strbuf_addstr(val, trailer + len + 1); + strbuf_addstr(val, trailer + separator_pos + 1); strbuf_trim(val); } else { strbuf_addstr(tok, trailer); @@ -587,8 +595,6 @@ static int parse_trailer(struct strbuf *tok, struct strbuf *val, break; } } - - return 0; } static void add_trailer_item(struct list_head *head, char *tok, char *val) @@ -619,6 +625,12 @@ static void process_command_line_args(struct list_head *arg_head, const struct conf_info *conf; struct list_head *pos; + /* +* In command-line arguments, '=' is accepted (in addition to the +* separators that are defined). +*/ + char *cl_separators = xstrfmt("=%s", separators); + /* Add an arg item for each configured trailer with a command */ list_for_each(pos, _head) { item = list_entry(pos, struct arg_item, list); @@ -631,12 +643,25 @@ static void process_command_line_args(struct list_head *arg_head, /* Add an arg item for each trailer on the command line */ for_each_string_list_item(tr, trailers) { - if (!parse_trailer(, , , tr->string)) + int separator_pos = find_separator(tr->string, cl_separators); + if (separator_pos == 0) { + struct strbuf sb = STRBUF_INIT; + strbuf_addstr(, tr->string); + strbuf_trim(); + error(_("empty trailer token in trailer '%.*s'"), + (int) sb.len, sb.buf); + strbuf_release(); + } else { + parse_trailer(, , , tr->string, + separator_pos); add_arg_item(arg_head, strbuf_detach(, NULL), strbuf_detach(, NULL), conf); + } } + + free(cl_separators); } static struct strbuf **read_input_file(const
[PATCH v5 4/8] trailer: make args have their own struct
Improve type safety by making arguments (whether from configuration or from the command line) have their own "struct arg_item" type, separate from the "struct trailer_item" type used to represent the trailers in the buffer being manipulated. This change also prepares "struct trailer_item" to be further differentiated from "struct arg_item" in future patches. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 135 +++--- 1 file changed, 85 insertions(+), 50 deletions(-) diff --git a/trailer.c b/trailer.c index ae3972a..99018f8 100644 --- a/trailer.c +++ b/trailer.c @@ -29,6 +29,12 @@ struct trailer_item { struct list_head list; char *token; char *value; +}; + +struct arg_item { + struct list_head list; + char *token; + char *value; struct conf_info conf; }; @@ -62,7 +68,7 @@ static size_t token_len_without_separator(const char *token, size_t len) return len; } -static int same_token(struct trailer_item *a, struct trailer_item *b) +static int same_token(struct trailer_item *a, struct arg_item *b) { size_t a_len = token_len_without_separator(a->token, strlen(a->token)); size_t b_len = token_len_without_separator(b->token, strlen(b->token)); @@ -71,12 +77,12 @@ static int same_token(struct trailer_item *a, struct trailer_item *b) return !strncasecmp(a->token, b->token, min_len); } -static int same_value(struct trailer_item *a, struct trailer_item *b) +static int same_value(struct trailer_item *a, struct arg_item *b) { return !strcasecmp(a->value, b->value); } -static int same_trailer(struct trailer_item *a, struct trailer_item *b) +static int same_trailer(struct trailer_item *a, struct arg_item *b) { return same_token(a, b) && same_value(a, b); } @@ -98,6 +104,13 @@ static inline void strbuf_replace(struct strbuf *sb, const char *a, const char * static void free_trailer_item(struct trailer_item *item) { + free(item->token); + free(item->value); + free(item); +} + +static void free_arg_item(struct arg_item *item) +{ free(item->conf.name); free(item->conf.key); free(item->conf.command); @@ -137,17 +150,29 @@ static void print_all(FILE *outfile, struct list_head *head, int trim_empty) } } +static struct trailer_item *trailer_from_arg(struct arg_item *arg_tok) +{ + struct trailer_item *new = xcalloc(sizeof(*new), 1); + new->token = arg_tok->token; + new->value = arg_tok->value; + arg_tok->token = arg_tok->value = NULL; + free_arg_item(arg_tok); + return new; +} + static void add_arg_to_input_list(struct trailer_item *on_tok, - struct trailer_item *arg_tok) + struct arg_item *arg_tok) { - if (after_or_end(arg_tok->conf.where)) - list_add(_tok->list, _tok->list); + int aoe = after_or_end(arg_tok->conf.where); + struct trailer_item *to_add = trailer_from_arg(arg_tok); + if (aoe) + list_add(_add->list, _tok->list); else - list_add_tail(_tok->list, _tok->list); + list_add_tail(_add->list, _tok->list); } static int check_if_different(struct trailer_item *in_tok, - struct trailer_item *arg_tok, + struct arg_item *arg_tok, int check_all, struct list_head *head) { @@ -200,7 +225,7 @@ static char *apply_command(const char *command, const char *arg) return result; } -static void apply_item_command(struct trailer_item *in_tok, struct trailer_item *arg_tok) +static void apply_item_command(struct trailer_item *in_tok, struct arg_item *arg_tok) { if (arg_tok->conf.command) { const char *arg; @@ -218,13 +243,13 @@ static void apply_item_command(struct trailer_item *in_tok, struct trailer_item } static void apply_arg_if_exists(struct trailer_item *in_tok, - struct trailer_item *arg_tok, + struct arg_item *arg_tok, struct trailer_item *on_tok, struct list_head *head) { switch (arg_tok->conf.if_exists) { case EXISTS_DO_NOTHING: - free_trailer_item(arg_tok); + free_arg_item(arg_tok); break; case EXISTS_REPLACE: apply_item_command(in_tok, arg_tok); @@ -241,39 +266,41 @@ static void apply_arg_if_exists(struct trailer_item *in_tok, if (check_if_different(in_tok, arg_tok, 1, head)) add_arg_to_input_list(on_tok, arg_tok);
Re: [PATCH v4 5/8] trailer: clarify failure modes in parse_trailer
On 10/20/2016 03:45 PM, Junio C Hamano wrote: Jonathan Tan <jonathanta...@google.com> writes: If we do that, there is also the necessity of creating a string that combines the separators and '=' (I guess '\n' is not necessary now, since all the lines are null terminated). I'm OK either way. (We could cache that string, although I would think that if we did that, we might as well write the loop manually, like in this patch.) I wonder if there is a legit reason to look for '=' in the first place. "Signed-off-by= Jonathan Tan <j...@my.home>" does not look like a valid trailer line to me. Isn't that a remnant of lazy coding in the original that tried to share a single parser for contents and command line options or something? That is true - I think we can take the allowed separators as an argument (meaning that we can have different behavior for file parsing and command line parsing), and since we already have that string, we can use strcspn. I'll try this out in the next reroll.
Re: [PATCH v4 5/8] trailer: clarify failure modes in parse_trailer
On 10/20/2016 03:40 PM, Jonathan Tan wrote: On 10/20/2016 03:14 PM, Junio C Hamano wrote: Stefan Beller <sbel...@google.com> writes: +static int find_separator(const char *line) +{ + const char *c; + for (c = line; ; c++) { + if (!*c || *c == '\n') + return -1; + if (*c == '=' || strchr(separators, *c)) + return c - line; + } I was about to suggest this function can be simplified and maybe even inlined by the use of strspn or strcspn, but I think manual processing of the string is fine, too, as it would not really be shorter. Hmm, I fear that iterating over a line one-byte-at-a-time and running strchr(separators, *c) on it for each byte has a performance implication over running a single call to strcspn(line, separators). If we do that, there is also the necessity of creating a string that combines the separators and '=' (I guess '\n' is not necessary now, since all the lines are null terminated). I'm OK either way. (We could cache that string, although I would think that if we did that, we might as well write the loop manually, like in this patch.) Actually I guess we could generate the separators_and_equal string whenever we obtain new separators from the config. I'll do this in the next reroll.
Re: [PATCH v4 5/8] trailer: clarify failure modes in parse_trailer
On 10/20/2016 03:14 PM, Junio C Hamano wrote: Stefan Bellerwrites: +static int find_separator(const char *line) +{ + const char *c; + for (c = line; ; c++) { + if (!*c || *c == '\n') + return -1; + if (*c == '=' || strchr(separators, *c)) + return c - line; + } I was about to suggest this function can be simplified and maybe even inlined by the use of strspn or strcspn, but I think manual processing of the string is fine, too, as it would not really be shorter. Hmm, I fear that iterating over a line one-byte-at-a-time and running strchr(separators, *c) on it for each byte has a performance implication over running a single call to strcspn(line, separators). If we do that, there is also the necessity of creating a string that combines the separators and '=' (I guess '\n' is not necessary now, since all the lines are null terminated). I'm OK either way. (We could cache that string, although I would think that if we did that, we might as well write the loop manually, like in this patch.)
[PATCH v4 4/8] trailer: make args have their own struct
Improve type safety by making arguments (whether from configuration or from the command line) have their own "struct arg_item" type, separate from the "struct trailer_item" type used to represent the trailers in the buffer being manipulated. This change also prepares "struct trailer_item" to be further differentiated from "struct arg_item" in future patches. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 135 +++--- 1 file changed, 85 insertions(+), 50 deletions(-) diff --git a/trailer.c b/trailer.c index ae3972a..99018f8 100644 --- a/trailer.c +++ b/trailer.c @@ -29,6 +29,12 @@ struct trailer_item { struct list_head list; char *token; char *value; +}; + +struct arg_item { + struct list_head list; + char *token; + char *value; struct conf_info conf; }; @@ -62,7 +68,7 @@ static size_t token_len_without_separator(const char *token, size_t len) return len; } -static int same_token(struct trailer_item *a, struct trailer_item *b) +static int same_token(struct trailer_item *a, struct arg_item *b) { size_t a_len = token_len_without_separator(a->token, strlen(a->token)); size_t b_len = token_len_without_separator(b->token, strlen(b->token)); @@ -71,12 +77,12 @@ static int same_token(struct trailer_item *a, struct trailer_item *b) return !strncasecmp(a->token, b->token, min_len); } -static int same_value(struct trailer_item *a, struct trailer_item *b) +static int same_value(struct trailer_item *a, struct arg_item *b) { return !strcasecmp(a->value, b->value); } -static int same_trailer(struct trailer_item *a, struct trailer_item *b) +static int same_trailer(struct trailer_item *a, struct arg_item *b) { return same_token(a, b) && same_value(a, b); } @@ -98,6 +104,13 @@ static inline void strbuf_replace(struct strbuf *sb, const char *a, const char * static void free_trailer_item(struct trailer_item *item) { + free(item->token); + free(item->value); + free(item); +} + +static void free_arg_item(struct arg_item *item) +{ free(item->conf.name); free(item->conf.key); free(item->conf.command); @@ -137,17 +150,29 @@ static void print_all(FILE *outfile, struct list_head *head, int trim_empty) } } +static struct trailer_item *trailer_from_arg(struct arg_item *arg_tok) +{ + struct trailer_item *new = xcalloc(sizeof(*new), 1); + new->token = arg_tok->token; + new->value = arg_tok->value; + arg_tok->token = arg_tok->value = NULL; + free_arg_item(arg_tok); + return new; +} + static void add_arg_to_input_list(struct trailer_item *on_tok, - struct trailer_item *arg_tok) + struct arg_item *arg_tok) { - if (after_or_end(arg_tok->conf.where)) - list_add(_tok->list, _tok->list); + int aoe = after_or_end(arg_tok->conf.where); + struct trailer_item *to_add = trailer_from_arg(arg_tok); + if (aoe) + list_add(_add->list, _tok->list); else - list_add_tail(_tok->list, _tok->list); + list_add_tail(_add->list, _tok->list); } static int check_if_different(struct trailer_item *in_tok, - struct trailer_item *arg_tok, + struct arg_item *arg_tok, int check_all, struct list_head *head) { @@ -200,7 +225,7 @@ static char *apply_command(const char *command, const char *arg) return result; } -static void apply_item_command(struct trailer_item *in_tok, struct trailer_item *arg_tok) +static void apply_item_command(struct trailer_item *in_tok, struct arg_item *arg_tok) { if (arg_tok->conf.command) { const char *arg; @@ -218,13 +243,13 @@ static void apply_item_command(struct trailer_item *in_tok, struct trailer_item } static void apply_arg_if_exists(struct trailer_item *in_tok, - struct trailer_item *arg_tok, + struct arg_item *arg_tok, struct trailer_item *on_tok, struct list_head *head) { switch (arg_tok->conf.if_exists) { case EXISTS_DO_NOTHING: - free_trailer_item(arg_tok); + free_arg_item(arg_tok); break; case EXISTS_REPLACE: apply_item_command(in_tok, arg_tok); @@ -241,39 +266,41 @@ static void apply_arg_if_exists(struct trailer_item *in_tok, if (check_if_different(in_tok, arg_tok, 1, head)) add_arg_to_input_list(on_tok, arg_tok);
[PATCH v4 3/8] trailer: streamline trailer item create and add
Currently, creation and addition (to a list) of trailer items are spread across multiple functions. Streamline this by only having 2 functions: one to parse the user-supplied string, and one to add the parsed information to a list. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 130 +- 1 file changed, 60 insertions(+), 70 deletions(-) diff --git a/trailer.c b/trailer.c index 4e85aae..ae3972a 100644 --- a/trailer.c +++ b/trailer.c @@ -500,10 +500,31 @@ static int git_trailer_config(const char *conf_key, const char *value, void *cb) return 0; } -static int parse_trailer(struct strbuf *tok, struct strbuf *val, const char *trailer) +static const char *token_from_item(struct trailer_item *item, char *tok) +{ + if (item->conf.key) + return item->conf.key; + if (tok) + return tok; + return item->conf.name; +} + +static int token_matches_item(const char *tok, struct trailer_item *item, int tok_len) +{ + if (!strncasecmp(tok, item->conf.name, tok_len)) + return 1; + return item->conf.key ? !strncasecmp(tok, item->conf.key, tok_len) : 0; +} + +static int parse_trailer(struct strbuf *tok, struct strbuf *val, +const struct conf_info **conf, const char *trailer) { size_t len; struct strbuf seps = STRBUF_INIT; + struct trailer_item *item; + int tok_len; + struct list_head *pos; + strbuf_addstr(, separators); strbuf_addch(, '='); len = strcspn(trailer, seps.buf); @@ -523,74 +544,31 @@ static int parse_trailer(struct strbuf *tok, struct strbuf *val, const char *tra strbuf_addstr(tok, trailer); strbuf_trim(tok); } - return 0; -} - -static const char *token_from_item(struct trailer_item *item, char *tok) -{ - if (item->conf.key) - return item->conf.key; - if (tok) - return tok; - return item->conf.name; -} - -static struct trailer_item *new_trailer_item(struct trailer_item *conf_item, -char *tok, char *val) -{ - struct trailer_item *new = xcalloc(sizeof(*new), 1); - new->value = val ? val : xstrdup(""); - - if (conf_item) { - duplicate_conf(>conf, _item->conf); - new->token = xstrdup(token_from_item(conf_item, tok)); - free(tok); - } else { - duplicate_conf(>conf, _conf_info); - new->token = tok; - } - - return new; -} - -static int token_matches_item(const char *tok, struct trailer_item *item, int tok_len) -{ - if (!strncasecmp(tok, item->conf.name, tok_len)) - return 1; - return item->conf.key ? !strncasecmp(tok, item->conf.key, tok_len) : 0; -} - -static struct trailer_item *create_trailer_item(const char *string) -{ - struct strbuf tok = STRBUF_INIT; - struct strbuf val = STRBUF_INIT; - struct trailer_item *item; - int tok_len; - struct list_head *pos; - - if (parse_trailer(, , string)) - return NULL; - - tok_len = token_len_without_separator(tok.buf, tok.len); /* Lookup if the token matches something in the config */ + tok_len = token_len_without_separator(tok->buf, tok->len); + *conf = _conf_info; list_for_each(pos, _head) { item = list_entry(pos, struct trailer_item, list); - if (token_matches_item(tok.buf, item, tok_len)) - return new_trailer_item(item, - strbuf_detach(, NULL), - strbuf_detach(, NULL)); + if (token_matches_item(tok->buf, item, tok_len)) { + char *tok_buf = strbuf_detach(tok, NULL); + *conf = >conf; + strbuf_addstr(tok, token_from_item(item, tok_buf)); + free(tok_buf); + break; + } } - return new_trailer_item(NULL, - strbuf_detach(, NULL), - strbuf_detach(, NULL)); + return 0; } -static void add_trailer_item(struct list_head *head, struct trailer_item *new) +static void add_trailer_item(struct list_head *head, char *tok, char *val, +const struct conf_info *conf) { - if (!new) - return; + struct trailer_item *new = xcalloc(sizeof(*new), 1); + new->token = tok; + new->value = val; + duplicate_conf(>conf, conf); list_add_tail(>list, head); } @@ -599,21 +577,28 @@ static void process_command_line_args(struct list_head *arg_head, { struct string_list_item *
[PATCH v4 6/8] trailer: allow non-trailers in trailer block
Currently, interpret-trailers requires all lines of a trailer block to be trailers (or comments) - if not it would not identify that block as a trailer block, and thus create its own trailer block, inserting a blank line. For example: echo -e "\nSigned-off-by: x\nnot trailer" | git interpret-trailers --trailer "c: d" would result in: Signed-off-by: x not trailer c: d Relax the definition of a trailer block to require that the trailers (i) are all trailers, or (ii) contain at least one Git-generated trailer and consists of at least 25% trailers. Signed-off-by: x not trailer c: d (i) is the existing functionality. (ii) allows arbitrary lines to be included in trailer blocks, like those in [1], and still allow interpret-trailers to be used. [1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/stable/linux-stable/+/e7d316a02f683864a12389f8808570e37fb90aa3 Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Documentation/git-interpret-trailers.txt | 5 +- t/t7513-interpret-trailers.sh| 115 +++ trailer.c| 89 3 files changed, 194 insertions(+), 15 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index 93d1db6..cf4c5ea 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -48,8 +48,9 @@ with only spaces at the end of the commit message part, one blank line will be added before the new trailer. Existing trailers are extracted from the input message by looking for -a group of one or more lines that contain a colon (by default), where -the group is preceded by one or more empty (or whitespace-only) lines. +a group of one or more lines that (i) are all trailers, or (ii) contains at +least one Git-generated trailer and consists of at least 25% trailers. +The group must be preceded by one or more empty (or whitespace-only) lines. The group must either be at the end of the message or be the last non-whitespace lines before a line that starts with '---'. Such three minus signs start the patch part of the message. diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index aee785c..003e90f 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -126,6 +126,121 @@ test_expect_success 'with multiline title in the message' ' test_cmp expected actual ' +test_expect_success 'with non-trailer lines mixed with Signed-off-by' ' + cat >patch <<-\EOF && + + this is not a trailer + this is not a trailer + Signed-off-by: a <a...@example.com> + this is not a trailer + EOF + cat >expected <<-\EOF && + + this is not a trailer + this is not a trailer + Signed-off-by: a <a...@example.com> + this is not a trailer + token: value + EOF + git interpret-trailers --trailer "token: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'with non-trailer lines mixed with cherry picked from' ' + cat >patch <<-\EOF && + + this is not a trailer + this is not a trailer + (cherry picked from commit x) + this is not a trailer + EOF + cat >expected <<-\EOF && + + this is not a trailer + this is not a trailer + (cherry picked from commit x) + this is not a trailer + token: value + EOF + git interpret-trailers --trailer "token: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'with non-trailer lines mixed with a configured trailer' ' + cat >patch <<-\EOF && + + this is not a trailer + this is not a trailer + My-trailer: x + this is not a trailer + EOF + cat >expected <<-\EOF && + + this is not a trailer + this is not a trailer + My-trailer: x + this is not a trailer + token: value + EOF + test_config trailer.my.key "My-trailer: " && + git interpret-trailers --trailer "token: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'with non-trailer lines mixed with a non-configured trailer' ' + cat >patch <<-\EOF && + + this is not a trailer + this is not a trailer + I-am-not-configured: x + this is not a trailer + EOF + cat >expected <<-\EOF && + +
[PATCH v4 2/8] trailer: use list.h for doubly-linked list
Replace the existing handwritten implementation of a doubly-linked list in trailer.c with the functions and macros from list.h. This significantly simplifies the code. Signed-off-by: Jonathan Tan <jonathanta...@google.com> Signed-off-by: Ramsay Jones <ram...@ramsayjones.plus.com> --- trailer.c | 258 ++ 1 file changed, 91 insertions(+), 167 deletions(-) diff --git a/trailer.c b/trailer.c index 1f191b2..4e85aae 100644 --- a/trailer.c +++ b/trailer.c @@ -4,6 +4,7 @@ #include "commit.h" #include "tempfile.h" #include "trailer.h" +#include "list.h" /* * Copyright (c) 2013, 2014 Christian Couder <chrisc...@tuxfamily.org> */ @@ -25,19 +26,24 @@ struct conf_info { static struct conf_info default_conf_info; struct trailer_item { - struct trailer_item *previous; - struct trailer_item *next; + struct list_head list; char *token; char *value; struct conf_info conf; }; -static struct trailer_item *first_conf_item; +static LIST_HEAD(conf_head); static char *separators = ":"; #define TRAILER_ARG_STRING "$ARG" +/* Iterate over the elements of the list. */ +#define list_for_each_dir(pos, head, is_reverse) \ + for (pos = is_reverse ? (head)->prev : (head)->next; \ + pos != (head); \ + pos = is_reverse ? pos->prev : pos->next) + static int after_or_end(enum action_where where) { return (where == WHERE_AFTER) || (where == WHERE_END); @@ -120,101 +126,49 @@ static void print_tok_val(FILE *outfile, const char *tok, const char *val) fprintf(outfile, "%s%c %s\n", tok, separators[0], val); } -static void print_all(FILE *outfile, struct trailer_item *first, int trim_empty) +static void print_all(FILE *outfile, struct list_head *head, int trim_empty) { + struct list_head *pos; struct trailer_item *item; - for (item = first; item; item = item->next) { + list_for_each(pos, head) { + item = list_entry(pos, struct trailer_item, list); if (!trim_empty || strlen(item->value) > 0) print_tok_val(outfile, item->token, item->value); } } -static void update_last(struct trailer_item **last) -{ - if (*last) - while ((*last)->next != NULL) - *last = (*last)->next; -} - -static void update_first(struct trailer_item **first) -{ - if (*first) - while ((*first)->previous != NULL) - *first = (*first)->previous; -} - static void add_arg_to_input_list(struct trailer_item *on_tok, - struct trailer_item *arg_tok, - struct trailer_item **first, - struct trailer_item **last) -{ - if (after_or_end(arg_tok->conf.where)) { - arg_tok->next = on_tok->next; - on_tok->next = arg_tok; - arg_tok->previous = on_tok; - if (arg_tok->next) - arg_tok->next->previous = arg_tok; - update_last(last); - } else { - arg_tok->previous = on_tok->previous; - on_tok->previous = arg_tok; - arg_tok->next = on_tok; - if (arg_tok->previous) - arg_tok->previous->next = arg_tok; - update_first(first); - } + struct trailer_item *arg_tok) +{ + if (after_or_end(arg_tok->conf.where)) + list_add(_tok->list, _tok->list); + else + list_add_tail(_tok->list, _tok->list); } static int check_if_different(struct trailer_item *in_tok, struct trailer_item *arg_tok, - int check_all) + int check_all, + struct list_head *head) { enum action_where where = arg_tok->conf.where; + struct list_head *next_head; do { - if (!in_tok) - return 1; if (same_trailer(in_tok, arg_tok)) return 0; /* * if we want to add a trailer after another one, * we have to check those before this one */ - in_tok = after_or_end(where) ? in_tok->previous : in_tok->next; + next_head = after_or_end(where) ? in_tok->list.prev + : in_tok->list.next; + if (next_head == head) + break; + in_tok = list_entry(next_head, struct trailer_item, list); } while (check_all); return 1; } -static void
[PATCH v4 7/8] trailer: forbid leading whitespace in trailers
Currently, interpret-trailers allows leading whitespace in trailer lines. This leads to false positives, especially for quoted lines or bullet lists. Forbid leading whitespace in trailers. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Documentation/git-interpret-trailers.txt | 2 +- t/t7513-interpret-trailers.sh| 15 +++ trailer.c| 2 +- 3 files changed, 17 insertions(+), 2 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index cf4c5ea..4966b5b 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -55,7 +55,7 @@ The group must either be at the end of the message or be the last non-whitespace lines before a line that starts with '---'. Such three minus signs start the patch part of the message. -When reading trailers, there can be whitespaces before and after the +When reading trailers, there can be whitespaces after the token, the separator and the value. There can also be whitespaces inside the token and the value. diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index 003e90f..3d94b3a 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -241,6 +241,21 @@ test_expect_success 'with non-trailer lines only' ' test_cmp expected actual ' +test_expect_success 'line with leading whitespace is not trailer' ' + q_to_tab >patch <<-\EOF && + + Qtoken: value + EOF + q_to_tab >expected <<-\EOF && + + Qtoken: value + + token: value + EOF + git interpret-trailers --trailer "token: value" patch >actual && + test_cmp expected actual +' + test_expect_success 'with config setup' ' git config trailer.ack.key "Acked-by: " && cat >expected <<-\EOF && diff --git a/trailer.c b/trailer.c index da15b79..3ef5576 100644 --- a/trailer.c +++ b/trailer.c @@ -770,7 +770,7 @@ static int find_trailer_start(struct strbuf **lines, int count) } separator_pos = find_separator(lines[start]->buf); - if (separator_pos >= 1) { + if (separator_pos >= 1 && !isspace(lines[start]->buf[0])) { struct list_head *pos; trailer_lines++; -- 2.8.0.rc3.226.g39d4020
[PATCH v4 8/8] trailer: support values folded to multiple lines
Currently, interpret-trailers requires that a trailer be only on 1 line. For example: a: first line second line would be interpreted as one trailer line followed by one non-trailer line. Make interpret-trailers support RFC 822-style folding, treating those lines as one logical trailer. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Documentation/git-interpret-trailers.txt | 7 +- t/t7513-interpret-trailers.sh| 169 +++ trailer.c| 43 ++-- 3 files changed, 210 insertions(+), 9 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index 4966b5b..e99bda6 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -57,11 +57,12 @@ minus signs start the patch part of the message. When reading trailers, there can be whitespaces after the token, the separator and the value. There can also be whitespaces -inside the token and the value. +inside the token and the value. The value may be split over multiple lines with +each subsequent line starting with whitespace, like the "folding" in RFC 822. Note that 'trailers' do not follow and are not intended to follow many -rules for RFC 822 headers. For example they do not follow the line -folding rules, the encoding rules and probably many other rules. +rules for RFC 822 headers. For example they do not follow +the encoding rules and probably many other rules. OPTIONS --- diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index 3d94b3a..4dd1d7c 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -256,6 +256,175 @@ test_expect_success 'line with leading whitespace is not trailer' ' test_cmp expected actual ' +test_expect_success 'multiline field treated as one trailer for 25% check' ' + q_to_tab >patch <<-\EOF && + + Signed-off-by: a <a...@example.com> + name: value on + Qmultiple lines + this is not a trailer + this is not a trailer + this is not a trailer + this is not a trailer + this is not a trailer + this is not a trailer + EOF + q_to_tab >expected <<-\EOF && + + Signed-off-by: a <a...@example.com> + name: value on + Qmultiple lines + this is not a trailer + this is not a trailer + this is not a trailer + this is not a trailer + this is not a trailer + this is not a trailer + name: value + EOF + git interpret-trailers --trailer "name: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'multiline field treated as atomic for placement' ' + q_to_tab >patch <<-\EOF && + + another: trailer + name: value on + Qmultiple lines + another: trailer + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + name: value on + Qmultiple lines + name: value + another: trailer + EOF + test_config trailer.name.where after && + git interpret-trailers --trailer "name: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'multiline field treated as atomic for replacement' ' + q_to_tab >patch <<-\EOF && + + another: trailer + name: value on + Qmultiple lines + another: trailer + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + another: trailer + name: value + EOF + test_config trailer.name.ifexists replace && + git interpret-trailers --trailer "name: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'multiline field treated as atomic for difference check' ' + q_to_tab >patch <<-\EOF && + + another: trailer + name: first line + Qsecond line + another: trailer + EOF + test_config trailer.name.ifexists addIfDifferent && + + q_to_tab >trailer <<-\EOF && + name: first line + Qsecond line + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + name: first line + Qsecond line + another: trailer + EOF + git interpret-trailers --trailer "$(cat trailer)&quo
[PATCH v4 5/8] trailer: clarify failure modes in parse_trailer
The parse_trailer function has a few modes of operation, all depending on whether the separator is present in its input, and if yes, the separator's position. Some of these modes are failure modes, and these failure modes are handled differently depending on whether the trailer line was sourced from a file or from a command-line argument. Extract a function to find the separator, allowing the invokers of parse_trailer to determine how to handle the failure modes instead of making parse_trailer do it. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 70 +++ 1 file changed, 48 insertions(+), 22 deletions(-) diff --git a/trailer.c b/trailer.c index 99018f8..137a3fb 100644 --- a/trailer.c +++ b/trailer.c @@ -543,29 +543,40 @@ static int token_matches_item(const char *tok, struct arg_item *item, int tok_le return item->conf.key ? !strncasecmp(tok, item->conf.key, tok_len) : 0; } -static int parse_trailer(struct strbuf *tok, struct strbuf *val, -const struct conf_info **conf, const char *trailer) +/* + * Return the location of the first separator or '=' in line, or -1 if either a + * newline or the null terminator is reached first. + */ +static int find_separator(const char *line) +{ + const char *c; + for (c = line; ; c++) { + if (!*c || *c == '\n') + return -1; + if (*c == '=' || strchr(separators, *c)) + return c - line; + } +} + +/* + * Obtain the token, value, and conf from the given trailer. + * + * separator_pos must not be 0, since the token cannot be an empty string. + * + * If separator_pos is -1, interpret the whole trailer as a token. + */ +static void parse_trailer(struct strbuf *tok, struct strbuf *val, +const struct conf_info **conf, const char *trailer, +int separator_pos) { - size_t len; - struct strbuf seps = STRBUF_INIT; struct arg_item *item; int tok_len; struct list_head *pos; - strbuf_addstr(, separators); - strbuf_addch(, '='); - len = strcspn(trailer, seps.buf); - strbuf_release(); - if (len == 0) { - int l = strlen(trailer); - while (l > 0 && isspace(trailer[l - 1])) - l--; - return error(_("empty trailer token in trailer '%.*s'"), l, trailer); - } - if (len < strlen(trailer)) { - strbuf_add(tok, trailer, len); + if (separator_pos != -1) { + strbuf_add(tok, trailer, separator_pos); strbuf_trim(tok); - strbuf_addstr(val, trailer + len + 1); + strbuf_addstr(val, trailer + separator_pos + 1); strbuf_trim(val); } else { strbuf_addstr(tok, trailer); @@ -587,8 +598,6 @@ static int parse_trailer(struct strbuf *tok, struct strbuf *val, break; } } - - return 0; } static void add_trailer_item(struct list_head *head, char *tok, char *val) @@ -631,11 +640,22 @@ static void process_command_line_args(struct list_head *arg_head, /* Add an arg item for each trailer on the command line */ for_each_string_list_item(tr, trailers) { - if (!parse_trailer(, , , tr->string)) + int separator_pos = find_separator(tr->string); + if (separator_pos == 0) { + struct strbuf sb = STRBUF_INIT; + strbuf_addstr(, tr->string); + strbuf_trim(); + error(_("empty trailer token in trailer '%.*s'"), + (int) sb.len, sb.buf); + strbuf_release(); + } else { + parse_trailer(, , , tr->string, + separator_pos); add_arg_item(arg_head, strbuf_detach(, NULL), strbuf_detach(, NULL), conf); + } } } @@ -775,11 +795,17 @@ static int process_input_file(FILE *outfile, /* Parse trailer lines */ for (i = trailer_start; i < trailer_end; i++) { - if (lines[i]->buf[0] != comment_line_char && - !parse_trailer(, , NULL, lines[i]->buf)) + int separator_pos; + if (lines[i]->buf[0] == comment_line_char) + continue; + separator_pos = find_separator(lines[i]->buf); + if (separator_pos >= 1) { + parse_trailer(, , NULL, lines[i]->buf, + separator_pos);
[PATCH v4 0/8] allow non-trailers and multiple-line trailers
Main changes are: - implemented the previously discussed trailer block recognizing rule (recognized trailer + 25% trailers or 100% trailers) - forbidding leading whitespace in trailers to avoid false positives Once the recognized trailer + 25% trailers rule is implemented, implementing the 100% trailer rule gives us backwards compatibility and is only a few lines of code, so I included it. Summary of changes from v3: 2/6->2/8: - squashed Ramsay Jones's "static" patch new->5/8: - new patch 5/6->6/8: - new trailer block recognizing rule - reverted to the existing behavior of ignoring comments, since the number of trailers and non-trailers in the trailer block now matters more new->7/8: - new patch 6/6->8/8: - updated trailer block recognizing code, since the continuation lines must not be counted if they follow a trailer line Jonathan Tan (8): trailer: improve const correctness trailer: use list.h for doubly-linked list trailer: streamline trailer item create and add trailer: make args have their own struct trailer: clarify failure modes in parse_trailer trailer: allow non-trailers in trailer block trailer: forbid leading whitespace in trailers trailer: support values folded to multiple lines Documentation/git-interpret-trailers.txt | 14 +- t/t7513-interpret-trailers.sh| 299 +++ trailer.c| 619 +-- 3 files changed, 651 insertions(+), 281 deletions(-) -- 2.8.0.rc3.226.g39d4020
[PATCH v4 1/8] trailer: improve const correctness
Change "const char *" to "char *" in struct trailer_item and in the return value of apply_command (since those strings are owned strings). Change "struct conf_info *" to "const struct conf_info *" (since that struct is not modified). Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/trailer.c b/trailer.c index c6ea9ac..1f191b2 100644 --- a/trailer.c +++ b/trailer.c @@ -27,8 +27,8 @@ static struct conf_info default_conf_info; struct trailer_item { struct trailer_item *previous; struct trailer_item *next; - const char *token; - const char *value; + char *token; + char *value; struct conf_info conf; }; @@ -95,8 +95,8 @@ static void free_trailer_item(struct trailer_item *item) free(item->conf.name); free(item->conf.key); free(item->conf.command); - free((char *)item->token); - free((char *)item->value); + free(item->token); + free(item->value); free(item); } @@ -215,13 +215,13 @@ static struct trailer_item *remove_first(struct trailer_item **first) return item; } -static const char *apply_command(const char *command, const char *arg) +static char *apply_command(const char *command, const char *arg) { struct strbuf cmd = STRBUF_INIT; struct strbuf buf = STRBUF_INIT; struct child_process cp = CHILD_PROCESS_INIT; const char *argv[] = {NULL, NULL}; - const char *result; + char *result; strbuf_addstr(, command); if (arg) @@ -425,7 +425,7 @@ static int set_if_missing(struct conf_info *item, const char *value) return 0; } -static void duplicate_conf(struct conf_info *dst, struct conf_info *src) +static void duplicate_conf(struct conf_info *dst, const struct conf_info *src) { *dst = *src; if (src->name) -- 2.8.0.rc3.226.g39d4020
Re: [PATCH v3 5/6] trailer: allow non-trailers in trailer block
On 10/18/2016 09:36 AM, Junio C Hamano wrote: Jonathan Tan <jonathanta...@google.com> writes: * rs/c-auto-resets-attributes: pretty: avoid adding reset for %C(auto) if output is empty And neither of the two colon containing line remotely resembles how a typical RFC-822 header is formatted. So that may serve as a hint to how we can tighten it without introducing false negative. The only "offending" character is the space (according to RFC 822), but that sounds like a good rule to have. I suspect that we should be willing to deviate from the letter of RFC to reject misidentification. I saw things like Thanks to: Jonathan Tan <j...@host.xz> Signed-off-by: A U Thor <a...@th.or> in the wild (notice the SP between Thanks and to), for example. Rejecting leading whitespace as a line that does *not* start the header (hence its colon does not count) may be a good compromise. Good point. I think that "Signed-off-by:" is not guaranteed to be present. But do we really care about that case where there is no S-o-b:? I personally do not think so. I think we should - the use case I had in mind when I started this is the Android Git repository, which does not use Signed-off-by. For example, I quoted this commit in an earlier e-mail [1]: https://android.googlesource.com/platform/frameworks/base/+/4c5281862f750cbc9d7355a07ef1a5545b9b3523 which has the footer: Bug: http://b/30562229 Test: readelf --dyn-sym app_process32 and check that bsd_signal is exported readelf --dyn-sym app_process64 and check that bsd_signal is not exported Change-Id: Iec584070b42bc7fa43b114c0f884aff2db5a6858 Defining a trailer line as "a line starting with a token, then optional whitespace, then separator", maybe the following rule: - at least one trailer line generated by Git ("(cherry picked by" or "Signed-off-by") or configured in the "trailer" section in gitconfig OR - at least 3/4 logical trailer lines (I'm wondering if this should be 100% trailer lines) I'd strongly suggest turning that OR to AND. We will not safely be able to write a commit log message that describes how S-o-b lines are handled in its last paragraph otherwise. I do not care too deeply about 3/4, but I meant to allow 75% cruft but no more than that, and the fact that the threashold is set at way more than 50% is important. IOW, if you have Ordinary log message here... S-o-b: A U Thor <a...@th.or> [a short description that is typically a single liner in the real world use pattern we saw in the world, but could overflow to become multi line cruft] S-o-b: R E Layer <r...@lay.er> "last paragraph" is 5 lines long, among which 60% are cruft that is below the 75% threshold, and "am -s" can still add the S-o-b of the committer at the end of that existing last paragraph. Making it too strict would raise the false negative ratio. Ah, sorry, I misread your original suggestion. Would this work then: - at least one trailer line generated by Git ("(cherry picked by" or "Signed-off-by: ") or configured in the "trailer" section in git config AND at least 25% logical trailer lines OR - 100% logical trailer lines The first part is your original suggestion except that I think that it is more consistent to allow other trailer lines as well (but I do not feel too strongly about this). The second part would satisfy the Android Git use case above, and also retain existing behavior when "Signed-off-by" (for example) is added to an existing footer that does not contain "Signed-off-by" yet. What do you think? [1] Message ID <29cb0f55-f729-80af-cdca-64e927fa9...@google.com>
Re: [PATCH v3 5/6] trailer: allow non-trailers in trailer block
On 10/17/2016 06:42 PM, Junio C Hamano wrote: Stefan Beller <sbel...@google.com> writes: On Fri, Oct 14, 2016 at 10:38 AM, Jonathan Tan <jonathanta...@google.com> wrote: Existing trailers are extracted from the input message by looking for -a group of one or more lines that contain a colon (by default), where +a group of one or more lines in which at least one line contains a +colon (by default), where Please see commit 578e6021c0819d7be1179e05e7ce0e6fdb2a01b7 for an example where I think this is overly broad. Hmph. That's a merge. Merge branch 'rs/c-auto-resets-attributes' When "%C(auto)" appears at the very beginning of the pretty format string, it did not need to issue the reset sequence, but it did. * rs/c-auto-resets-attributes: pretty: avoid adding reset for %C(auto) if output is empty And neither of the two colon containing line remotely resembles how a typical RFC-822 header is formatted. So that may serve as a hint to how we can tighten it without introducing false negative. The only "offending" character is the space (according to RFC 822), but that sounds like a good rule to have. Another made up example, that I'd want to feed in commit -s eventually: --8<-- demonstrate colons in Java First paragraph is not interesting. Also if using another Language such as Java, where I point out Class::function() to be problematic --8<-- This would lack the white space between the last paragraph and the Sign off ? So for this patch I am mostly concerned about false positives hidden in actual text. Yes. These are exactly why I mentioned "if certian number or percentage" in my earlier suggestion. I think in practice, "A paragraph with at least one Signed-off-by: line, and has no more than 3/4 of the (logical) lines that do not resemble how a typical RFC-822 header is formatted" or something along that line would give us a reasonable safety. I think that "Signed-off-by:" is not guaranteed to be present. Defining a trailer line as "a line starting with a token, then optional whitespace, then separator", maybe the following rule: - at least one trailer line generated by Git ("(cherry picked by" or "Signed-off-by") or configured in the "trailer" section in gitconfig OR - at least 3/4 logical trailer lines (I'm wondering if this should be 100% trailer lines) ? Your Java example will fail the criteria in two ways, so we'd be safe ;-)
[PATCH v3 6/6] trailer: support values folded to multiple lines
Currently, interpret-trailers requires that a trailer be only on 1 line. For example: a: first line second line would be interpreted as one trailer line followed by one non-trailer line. Make interpret-trailers support RFC 822-style folding, treating those lines as one logical trailer. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Documentation/git-interpret-trailers.txt | 7 +- t/t7513-interpret-trailers.sh| 139 +++ trailer.c| 22 +++-- 3 files changed, 160 insertions(+), 8 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index c480da6..cfec636 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -57,11 +57,12 @@ minus signs start the patch part of the message. When reading trailers, there can be whitespaces before and after the token, the separator and the value. There can also be whitespaces -inside the token and the value. +inside the token and the value. The value may be split over multiple lines with +each subsequent line starting with whitespace, like the "folding" in RFC 822. Note that 'trailers' do not follow and are not intended to follow many -rules for RFC 822 headers. For example they do not follow the line -folding rules, the encoding rules and probably many other rules. +rules for RFC 822 headers. For example they do not follow +the encoding rules and probably many other rules. OPTIONS --- diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index 7f5cd2a..31db699 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -157,6 +157,145 @@ test_expect_success 'with non-trailer lines only' ' test_cmp expected actual ' +test_expect_success 'multiline field treated as atomic for placement' ' + q_to_tab >patch <<-\EOF && + + another: trailer + name: value on + Qmultiple lines + another: trailer + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + name: value on + Qmultiple lines + name: value + another: trailer + EOF + test_config trailer.name.where after && + git interpret-trailers --trailer "name: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'multiline field treated as atomic for replacement' ' + q_to_tab >patch <<-\EOF && + + another: trailer + name: value on + Qmultiple lines + another: trailer + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + another: trailer + name: value + EOF + test_config trailer.name.ifexists replace && + git interpret-trailers --trailer "name: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'multiline field treated as atomic for difference check' ' + q_to_tab >patch <<-\EOF && + + another: trailer + name: first line + Qsecond line + another: trailer + EOF + test_config trailer.name.ifexists addIfDifferent && + + q_to_tab >trailer <<-\EOF && + name: first line + Qsecond line + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + name: first line + Qsecond line + another: trailer + EOF + git interpret-trailers --trailer "$(cat trailer)" patch >actual && + test_cmp expected actual && + + q_to_tab >trailer <<-\EOF && + name: first line + Qsecond line + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + name: first line + Qsecond line + another: trailer + name: first line + Qsecond line + EOF + git interpret-trailers --trailer "$(cat trailer)" patch >actual && + test_cmp expected actual && + + q_to_tab >trailer <<-\EOF && + name: first line *DIFFERENT* + Qsecond line + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + name: first line + Qsecond line + another: trailer + name: first line *DIFFERENT* + Qsecond line + EOF + git interpret-trailers --trailer "$(cat trailer
[PATCH v3 5/6] trailer: allow non-trailers in trailer block
Currently, interpret-trailers requires all lines of a trailer block to be trailers (or comments) - if not it would not identify that block as a trailer block, and thus create its own trailer block, inserting a blank line. For example: echo -e "\na: b\nnot trailer" | git interpret-trailers --trailer "c: d" would result in: a: b not trailer c: d Relax the definition of a trailer block to only require 1 trailer, so that trailers can be directly added to such blocks, resulting in: a: b not trailer c: d This allows arbitrary lines to be included in trailer blocks, like those in [1], and still allow interpret-trailers to be used. This change also makes comments in the trailer block be treated as any other non-trailer line, preserving them in the output of interpret-trailers. [1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/stable/linux-stable/+/e7d316a02f683864a12389f8808570e37fb90aa3 Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- Documentation/git-interpret-trailers.txt | 3 +- t/t7513-interpret-trailers.sh| 35 +++ trailer.c| 77 ++-- 3 files changed, 90 insertions(+), 25 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index 93d1db6..c480da6 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -48,7 +48,8 @@ with only spaces at the end of the commit message part, one blank line will be added before the new trailer. Existing trailers are extracted from the input message by looking for -a group of one or more lines that contain a colon (by default), where +a group of one or more lines in which at least one line contains a +colon (by default), where the group is preceded by one or more empty (or whitespace-only) lines. The group must either be at the end of the message or be the last non-whitespace lines before a line that starts with '---'. Such three diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index aee785c..7f5cd2a 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -126,6 +126,37 @@ test_expect_success 'with multiline title in the message' ' test_cmp expected actual ' +test_expect_success 'with non-trailer lines mixed with trailer lines' ' + cat >patch <<-\EOF && + + this: is a trailer + this is not a trailer + EOF + cat >expected <<-\EOF && + + this: is a trailer + this is not a trailer + token: value + EOF + git interpret-trailers --trailer "token: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'with non-trailer lines only' ' + cat >patch <<-\EOF && + + this is not a trailer + EOF + cat >expected <<-\EOF && + + this is not a trailer + + token: value + EOF + git interpret-trailers --trailer "token: value" patch >actual && + test_cmp expected actual +' + test_expect_success 'with config setup' ' git config trailer.ack.key "Acked-by: " && cat >expected <<-\EOF && @@ -257,6 +288,8 @@ test_expect_success 'with message that has comments' ' cat >>expected <<-\EOF && # comment + # other comment + # yet another comment Reviewed-by: Johan Cc: Peff # last comment @@ -286,6 +319,8 @@ test_expect_success 'with message that has an old style conflict block' ' cat >>expected <<-\EOF && # comment + # other comment + # yet another comment Reviewed-by: Johan Cc: Peff # last comment diff --git a/trailer.c b/trailer.c index a9ed3f8..d6dfc7a 100644 --- a/trailer.c +++ b/trailer.c @@ -27,6 +27,10 @@ static struct conf_info default_conf_info; struct trailer_item { struct list_head list; + /* +* If this is not a trailer line, the line is stored in value +* (excluding the terminating newline) and token is NULL. +*/ char *token; char *value; }; @@ -70,9 +74,14 @@ static size_t token_len_without_separator(const char *token, size_t len) static int same_token(struct trailer_item *a, struct arg_item *b) { - size_t a_len = token_len_without_separator(a->token, strlen(a->token)); - size_t b_len = token_len_without_separator(b->token, strlen(b->token)); - size_t min_len = (a_len > b_len) ? b_len : a_len; + size_t a_len, b_len, min_len; + + if (!a->token) +
[PATCH v3 4/6] trailer: make args have their own struct
Improve type safety by making arguments (whether from configuration or from the command line) have their own "struct arg_item" type, separate from the "struct trailer_item" type used to represent the trailers in the buffer being manipulated. This change also prepares "struct trailer_item" to be further differentiated from "struct arg_item" in future patches. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 135 +++--- 1 file changed, 85 insertions(+), 50 deletions(-) diff --git a/trailer.c b/trailer.c index 54cc930..a9ed3f8 100644 --- a/trailer.c +++ b/trailer.c @@ -29,6 +29,12 @@ struct trailer_item { struct list_head list; char *token; char *value; +}; + +struct arg_item { + struct list_head list; + char *token; + char *value; struct conf_info conf; }; @@ -62,7 +68,7 @@ static size_t token_len_without_separator(const char *token, size_t len) return len; } -static int same_token(struct trailer_item *a, struct trailer_item *b) +static int same_token(struct trailer_item *a, struct arg_item *b) { size_t a_len = token_len_without_separator(a->token, strlen(a->token)); size_t b_len = token_len_without_separator(b->token, strlen(b->token)); @@ -71,12 +77,12 @@ static int same_token(struct trailer_item *a, struct trailer_item *b) return !strncasecmp(a->token, b->token, min_len); } -static int same_value(struct trailer_item *a, struct trailer_item *b) +static int same_value(struct trailer_item *a, struct arg_item *b) { return !strcasecmp(a->value, b->value); } -static int same_trailer(struct trailer_item *a, struct trailer_item *b) +static int same_trailer(struct trailer_item *a, struct arg_item *b) { return same_token(a, b) && same_value(a, b); } @@ -98,6 +104,13 @@ static inline void strbuf_replace(struct strbuf *sb, const char *a, const char * static void free_trailer_item(struct trailer_item *item) { + free(item->token); + free(item->value); + free(item); +} + +static void free_arg_item(struct arg_item *item) +{ free(item->conf.name); free(item->conf.key); free(item->conf.command); @@ -137,17 +150,29 @@ static void print_all(FILE *outfile, struct list_head *head, int trim_empty) } } +static struct trailer_item *trailer_from_arg(struct arg_item *arg_tok) +{ + struct trailer_item *new = xcalloc(sizeof(*new), 1); + new->token = arg_tok->token; + new->value = arg_tok->value; + arg_tok->token = arg_tok->value = NULL; + free_arg_item(arg_tok); + return new; +} + static void add_arg_to_input_list(struct trailer_item *on_tok, - struct trailer_item *arg_tok) + struct arg_item *arg_tok) { - if (after_or_end(arg_tok->conf.where)) - list_add(_tok->list, _tok->list); + int aoe = after_or_end(arg_tok->conf.where); + struct trailer_item *to_add = trailer_from_arg(arg_tok); + if (aoe) + list_add(_add->list, _tok->list); else - list_add_tail(_tok->list, _tok->list); + list_add_tail(_add->list, _tok->list); } static int check_if_different(struct trailer_item *in_tok, - struct trailer_item *arg_tok, + struct arg_item *arg_tok, int check_all, struct list_head *head) { @@ -200,7 +225,7 @@ static char *apply_command(const char *command, const char *arg) return result; } -static void apply_item_command(struct trailer_item *in_tok, struct trailer_item *arg_tok) +static void apply_item_command(struct trailer_item *in_tok, struct arg_item *arg_tok) { if (arg_tok->conf.command) { const char *arg; @@ -218,13 +243,13 @@ static void apply_item_command(struct trailer_item *in_tok, struct trailer_item } static void apply_arg_if_exists(struct trailer_item *in_tok, - struct trailer_item *arg_tok, + struct arg_item *arg_tok, struct trailer_item *on_tok, struct list_head *head) { switch (arg_tok->conf.if_exists) { case EXISTS_DO_NOTHING: - free_trailer_item(arg_tok); + free_arg_item(arg_tok); break; case EXISTS_REPLACE: apply_item_command(in_tok, arg_tok); @@ -241,39 +266,41 @@ static void apply_arg_if_exists(struct trailer_item *in_tok, if (check_if_different(in_tok, arg_tok, 1, head)) add_arg_to_input_list(on_tok, arg_tok);
[PATCH v3 2/6] trailer: use list.h for doubly-linked list
Replace the existing handwritten implementation of a doubly-linked list in trailer.c with the functions and macros from list.h. This significantly simplifies the code. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 258 ++ 1 file changed, 91 insertions(+), 167 deletions(-) diff --git a/trailer.c b/trailer.c index 1f191b2..0afa240 100644 --- a/trailer.c +++ b/trailer.c @@ -4,6 +4,7 @@ #include "commit.h" #include "tempfile.h" #include "trailer.h" +#include "list.h" /* * Copyright (c) 2013, 2014 Christian Couder <chrisc...@tuxfamily.org> */ @@ -25,19 +26,24 @@ struct conf_info { static struct conf_info default_conf_info; struct trailer_item { - struct trailer_item *previous; - struct trailer_item *next; + struct list_head list; char *token; char *value; struct conf_info conf; }; -static struct trailer_item *first_conf_item; +LIST_HEAD(conf_head); static char *separators = ":"; #define TRAILER_ARG_STRING "$ARG" +/* Iterate over the elements of the list. */ +#define list_for_each_dir(pos, head, is_reverse) \ + for (pos = is_reverse ? (head)->prev : (head)->next; \ + pos != (head); \ + pos = is_reverse ? pos->prev : pos->next) + static int after_or_end(enum action_where where) { return (where == WHERE_AFTER) || (where == WHERE_END); @@ -120,101 +126,49 @@ static void print_tok_val(FILE *outfile, const char *tok, const char *val) fprintf(outfile, "%s%c %s\n", tok, separators[0], val); } -static void print_all(FILE *outfile, struct trailer_item *first, int trim_empty) +static void print_all(FILE *outfile, struct list_head *head, int trim_empty) { + struct list_head *pos; struct trailer_item *item; - for (item = first; item; item = item->next) { + list_for_each(pos, head) { + item = list_entry(pos, struct trailer_item, list); if (!trim_empty || strlen(item->value) > 0) print_tok_val(outfile, item->token, item->value); } } -static void update_last(struct trailer_item **last) -{ - if (*last) - while ((*last)->next != NULL) - *last = (*last)->next; -} - -static void update_first(struct trailer_item **first) -{ - if (*first) - while ((*first)->previous != NULL) - *first = (*first)->previous; -} - static void add_arg_to_input_list(struct trailer_item *on_tok, - struct trailer_item *arg_tok, - struct trailer_item **first, - struct trailer_item **last) -{ - if (after_or_end(arg_tok->conf.where)) { - arg_tok->next = on_tok->next; - on_tok->next = arg_tok; - arg_tok->previous = on_tok; - if (arg_tok->next) - arg_tok->next->previous = arg_tok; - update_last(last); - } else { - arg_tok->previous = on_tok->previous; - on_tok->previous = arg_tok; - arg_tok->next = on_tok; - if (arg_tok->previous) - arg_tok->previous->next = arg_tok; - update_first(first); - } + struct trailer_item *arg_tok) +{ + if (after_or_end(arg_tok->conf.where)) + list_add(_tok->list, _tok->list); + else + list_add_tail(_tok->list, _tok->list); } static int check_if_different(struct trailer_item *in_tok, struct trailer_item *arg_tok, - int check_all) + int check_all, + struct list_head *head) { enum action_where where = arg_tok->conf.where; + struct list_head *next_head; do { - if (!in_tok) - return 1; if (same_trailer(in_tok, arg_tok)) return 0; /* * if we want to add a trailer after another one, * we have to check those before this one */ - in_tok = after_or_end(where) ? in_tok->previous : in_tok->next; + next_head = after_or_end(where) ? in_tok->list.prev + : in_tok->list.next; + if (next_head == head) + break; + in_tok = list_entry(next_head, struct trailer_item, list); } while (check_all); return 1; } -static void remove_from_list(struct trailer_item *item, -
[PATCH v3 3/6] trailer: streamline trailer item create and add
Currently, creation and addition (to a list) of trailer items are spread across multiple functions. Streamline this by only having 2 functions: one to parse the user-supplied string, and one to add the parsed information to a list. Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 130 +- 1 file changed, 60 insertions(+), 70 deletions(-) diff --git a/trailer.c b/trailer.c index 0afa240..54cc930 100644 --- a/trailer.c +++ b/trailer.c @@ -500,10 +500,31 @@ static int git_trailer_config(const char *conf_key, const char *value, void *cb) return 0; } -static int parse_trailer(struct strbuf *tok, struct strbuf *val, const char *trailer) +static const char *token_from_item(struct trailer_item *item, char *tok) +{ + if (item->conf.key) + return item->conf.key; + if (tok) + return tok; + return item->conf.name; +} + +static int token_matches_item(const char *tok, struct trailer_item *item, int tok_len) +{ + if (!strncasecmp(tok, item->conf.name, tok_len)) + return 1; + return item->conf.key ? !strncasecmp(tok, item->conf.key, tok_len) : 0; +} + +static int parse_trailer(struct strbuf *tok, struct strbuf *val, +const struct conf_info **conf, const char *trailer) { size_t len; struct strbuf seps = STRBUF_INIT; + struct trailer_item *item; + int tok_len; + struct list_head *pos; + strbuf_addstr(, separators); strbuf_addch(, '='); len = strcspn(trailer, seps.buf); @@ -523,74 +544,31 @@ static int parse_trailer(struct strbuf *tok, struct strbuf *val, const char *tra strbuf_addstr(tok, trailer); strbuf_trim(tok); } - return 0; -} - -static const char *token_from_item(struct trailer_item *item, char *tok) -{ - if (item->conf.key) - return item->conf.key; - if (tok) - return tok; - return item->conf.name; -} - -static struct trailer_item *new_trailer_item(struct trailer_item *conf_item, -char *tok, char *val) -{ - struct trailer_item *new = xcalloc(sizeof(*new), 1); - new->value = val ? val : xstrdup(""); - - if (conf_item) { - duplicate_conf(>conf, _item->conf); - new->token = xstrdup(token_from_item(conf_item, tok)); - free(tok); - } else { - duplicate_conf(>conf, _conf_info); - new->token = tok; - } - - return new; -} - -static int token_matches_item(const char *tok, struct trailer_item *item, int tok_len) -{ - if (!strncasecmp(tok, item->conf.name, tok_len)) - return 1; - return item->conf.key ? !strncasecmp(tok, item->conf.key, tok_len) : 0; -} - -static struct trailer_item *create_trailer_item(const char *string) -{ - struct strbuf tok = STRBUF_INIT; - struct strbuf val = STRBUF_INIT; - struct trailer_item *item; - int tok_len; - struct list_head *pos; - - if (parse_trailer(, , string)) - return NULL; - - tok_len = token_len_without_separator(tok.buf, tok.len); /* Lookup if the token matches something in the config */ + tok_len = token_len_without_separator(tok->buf, tok->len); + *conf = _conf_info; list_for_each(pos, _head) { item = list_entry(pos, struct trailer_item, list); - if (token_matches_item(tok.buf, item, tok_len)) - return new_trailer_item(item, - strbuf_detach(, NULL), - strbuf_detach(, NULL)); + if (token_matches_item(tok->buf, item, tok_len)) { + char *tok_buf = strbuf_detach(tok, NULL); + *conf = >conf; + strbuf_addstr(tok, token_from_item(item, tok_buf)); + free(tok_buf); + break; + } } - return new_trailer_item(NULL, - strbuf_detach(, NULL), - strbuf_detach(, NULL)); + return 0; } -static void add_trailer_item(struct list_head *head, struct trailer_item *new) +static void add_trailer_item(struct list_head *head, char *tok, char *val, +const struct conf_info *conf) { - if (!new) - return; + struct trailer_item *new = xcalloc(sizeof(*new), 1); + new->token = tok; + new->value = val; + duplicate_conf(>conf, conf); list_add_tail(>list, head); } @@ -599,21 +577,28 @@ static void process_command_line_args(struct list_head *arg_head, { struct string_list_item *
[PATCH v3 1/6] trailer: improve const correctness
Change "const char *" to "char *" in struct trailer_item and in the return value of apply_command (since those strings are owned strings). Change "struct conf_info *" to "const struct conf_info *" (since that struct is not modified). Signed-off-by: Jonathan Tan <jonathanta...@google.com> --- trailer.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/trailer.c b/trailer.c index c6ea9ac..1f191b2 100644 --- a/trailer.c +++ b/trailer.c @@ -27,8 +27,8 @@ static struct conf_info default_conf_info; struct trailer_item { struct trailer_item *previous; struct trailer_item *next; - const char *token; - const char *value; + char *token; + char *value; struct conf_info conf; }; @@ -95,8 +95,8 @@ static void free_trailer_item(struct trailer_item *item) free(item->conf.name); free(item->conf.key); free(item->conf.command); - free((char *)item->token); - free((char *)item->value); + free(item->token); + free(item->value); free(item); } @@ -215,13 +215,13 @@ static struct trailer_item *remove_first(struct trailer_item **first) return item; } -static const char *apply_command(const char *command, const char *arg) +static char *apply_command(const char *command, const char *arg) { struct strbuf cmd = STRBUF_INIT; struct strbuf buf = STRBUF_INIT; struct child_process cp = CHILD_PROCESS_INIT; const char *argv[] = {NULL, NULL}; - const char *result; + char *result; strbuf_addstr(, command); if (arg) @@ -425,7 +425,7 @@ static int set_if_missing(struct conf_info *item, const char *value) return 0; } -static void duplicate_conf(struct conf_info *dst, struct conf_info *src) +static void duplicate_conf(struct conf_info *dst, const struct conf_info *src) { *dst = *src; if (src->name) -- 2.8.0.rc3.226.g39d4020
[PATCH v3 0/6] allow non-trailers and multiple-line trailers
Ah, I knew I forgot something. These are exactly the same as v2, except that these are signed off. Jonathan Tan (6): trailer: improve const correctness trailer: use list.h for doubly-linked list trailer: streamline trailer item create and add trailer: make args have their own struct trailer: allow non-trailers in trailer block trailer: support values folded to multiple lines Documentation/git-interpret-trailers.txt | 10 +- t/t7513-interpret-trailers.sh| 174 ++ trailer.c| 538 +++ 3 files changed, 444 insertions(+), 278 deletions(-) -- 2.8.0.rc3.226.g39d4020
[PATCH v2 2/6] trailer: use list.h for doubly-linked list
Replace the existing handwritten implementation of a doubly-linked list in trailer.c with the functions and macros from list.h. This significantly simplifies the code. --- trailer.c | 258 ++ 1 file changed, 91 insertions(+), 167 deletions(-) diff --git a/trailer.c b/trailer.c index 1f191b2..0afa240 100644 --- a/trailer.c +++ b/trailer.c @@ -4,6 +4,7 @@ #include "commit.h" #include "tempfile.h" #include "trailer.h" +#include "list.h" /* * Copyright (c) 2013, 2014 Christian Couder*/ @@ -25,19 +26,24 @@ struct conf_info { static struct conf_info default_conf_info; struct trailer_item { - struct trailer_item *previous; - struct trailer_item *next; + struct list_head list; char *token; char *value; struct conf_info conf; }; -static struct trailer_item *first_conf_item; +LIST_HEAD(conf_head); static char *separators = ":"; #define TRAILER_ARG_STRING "$ARG" +/* Iterate over the elements of the list. */ +#define list_for_each_dir(pos, head, is_reverse) \ + for (pos = is_reverse ? (head)->prev : (head)->next; \ + pos != (head); \ + pos = is_reverse ? pos->prev : pos->next) + static int after_or_end(enum action_where where) { return (where == WHERE_AFTER) || (where == WHERE_END); @@ -120,101 +126,49 @@ static void print_tok_val(FILE *outfile, const char *tok, const char *val) fprintf(outfile, "%s%c %s\n", tok, separators[0], val); } -static void print_all(FILE *outfile, struct trailer_item *first, int trim_empty) +static void print_all(FILE *outfile, struct list_head *head, int trim_empty) { + struct list_head *pos; struct trailer_item *item; - for (item = first; item; item = item->next) { + list_for_each(pos, head) { + item = list_entry(pos, struct trailer_item, list); if (!trim_empty || strlen(item->value) > 0) print_tok_val(outfile, item->token, item->value); } } -static void update_last(struct trailer_item **last) -{ - if (*last) - while ((*last)->next != NULL) - *last = (*last)->next; -} - -static void update_first(struct trailer_item **first) -{ - if (*first) - while ((*first)->previous != NULL) - *first = (*first)->previous; -} - static void add_arg_to_input_list(struct trailer_item *on_tok, - struct trailer_item *arg_tok, - struct trailer_item **first, - struct trailer_item **last) -{ - if (after_or_end(arg_tok->conf.where)) { - arg_tok->next = on_tok->next; - on_tok->next = arg_tok; - arg_tok->previous = on_tok; - if (arg_tok->next) - arg_tok->next->previous = arg_tok; - update_last(last); - } else { - arg_tok->previous = on_tok->previous; - on_tok->previous = arg_tok; - arg_tok->next = on_tok; - if (arg_tok->previous) - arg_tok->previous->next = arg_tok; - update_first(first); - } + struct trailer_item *arg_tok) +{ + if (after_or_end(arg_tok->conf.where)) + list_add(_tok->list, _tok->list); + else + list_add_tail(_tok->list, _tok->list); } static int check_if_different(struct trailer_item *in_tok, struct trailer_item *arg_tok, - int check_all) + int check_all, + struct list_head *head) { enum action_where where = arg_tok->conf.where; + struct list_head *next_head; do { - if (!in_tok) - return 1; if (same_trailer(in_tok, arg_tok)) return 0; /* * if we want to add a trailer after another one, * we have to check those before this one */ - in_tok = after_or_end(where) ? in_tok->previous : in_tok->next; + next_head = after_or_end(where) ? in_tok->list.prev + : in_tok->list.next; + if (next_head == head) + break; + in_tok = list_entry(next_head, struct trailer_item, list); } while (check_all); return 1; } -static void remove_from_list(struct trailer_item *item, -struct trailer_item **first, -struct trailer_item **last) -{ - struct trailer_item *next = item->next; - struct trailer_item *previous = item->previous; - - if (next) { - item->next->previous =
[PATCH v2 5/6] trailer: allow non-trailers in trailer block
Currently, interpret-trailers requires all lines of a trailer block to be trailers (or comments) - if not it would not identify that block as a trailer block, and thus create its own trailer block, inserting a blank line. For example: echo -e "\na: b\nnot trailer" | git interpret-trailers --trailer "c: d" would result in: a: b not trailer c: d Relax the definition of a trailer block to only require 1 trailer, so that trailers can be directly added to such blocks, resulting in: a: b not trailer c: d This allows arbitrary lines to be included in trailer blocks, like those in [1], and still allow interpret-trailers to be used. This change also makes comments in the trailer block be treated as any other non-trailer line, preserving them in the output of interpret-trailers. [1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/stable/linux-stable/+/e7d316a02f683864a12389f8808570e37fb90aa3 --- Documentation/git-interpret-trailers.txt | 3 +- t/t7513-interpret-trailers.sh| 35 +++ trailer.c| 77 ++-- 3 files changed, 90 insertions(+), 25 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index 93d1db6..c480da6 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -48,7 +48,8 @@ with only spaces at the end of the commit message part, one blank line will be added before the new trailer. Existing trailers are extracted from the input message by looking for -a group of one or more lines that contain a colon (by default), where +a group of one or more lines in which at least one line contains a +colon (by default), where the group is preceded by one or more empty (or whitespace-only) lines. The group must either be at the end of the message or be the last non-whitespace lines before a line that starts with '---'. Such three diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index aee785c..7f5cd2a 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -126,6 +126,37 @@ test_expect_success 'with multiline title in the message' ' test_cmp expected actual ' +test_expect_success 'with non-trailer lines mixed with trailer lines' ' + cat >patch <<-\EOF && + + this: is a trailer + this is not a trailer + EOF + cat >expected <<-\EOF && + + this: is a trailer + this is not a trailer + token: value + EOF + git interpret-trailers --trailer "token: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'with non-trailer lines only' ' + cat >patch <<-\EOF && + + this is not a trailer + EOF + cat >expected <<-\EOF && + + this is not a trailer + + token: value + EOF + git interpret-trailers --trailer "token: value" patch >actual && + test_cmp expected actual +' + test_expect_success 'with config setup' ' git config trailer.ack.key "Acked-by: " && cat >expected <<-\EOF && @@ -257,6 +288,8 @@ test_expect_success 'with message that has comments' ' cat >>expected <<-\EOF && # comment + # other comment + # yet another comment Reviewed-by: Johan Cc: Peff # last comment @@ -286,6 +319,8 @@ test_expect_success 'with message that has an old style conflict block' ' cat >>expected <<-\EOF && # comment + # other comment + # yet another comment Reviewed-by: Johan Cc: Peff # last comment diff --git a/trailer.c b/trailer.c index a9ed3f8..d6dfc7a 100644 --- a/trailer.c +++ b/trailer.c @@ -27,6 +27,10 @@ static struct conf_info default_conf_info; struct trailer_item { struct list_head list; + /* +* If this is not a trailer line, the line is stored in value +* (excluding the terminating newline) and token is NULL. +*/ char *token; char *value; }; @@ -70,9 +74,14 @@ static size_t token_len_without_separator(const char *token, size_t len) static int same_token(struct trailer_item *a, struct arg_item *b) { - size_t a_len = token_len_without_separator(a->token, strlen(a->token)); - size_t b_len = token_len_without_separator(b->token, strlen(b->token)); - size_t min_len = (a_len > b_len) ? b_len : a_len; + size_t a_len, b_len, min_len; + + if (!a->token) + return 0; + + a_len = token_len_without_separator(a->token, strlen(a->token)); + b_len = token_len_without_separator(b->token, strlen(b->token)); + min_len = (a_len > b_len) ? b_len : a_len; return !strncasecmp(a->token, b->token, min_len); } @@ -130,7
[PATCH v2 4/6] trailer: make args have their own struct
Improve type safety by making arguments (whether from configuration or from the command line) have their own "struct arg_item" type, separate from the "struct trailer_item" type used to represent the trailers in the buffer being manipulated. This change also prepares "struct trailer_item" to be further differentiated from "struct arg_item" in future patches. --- trailer.c | 135 +++--- 1 file changed, 85 insertions(+), 50 deletions(-) diff --git a/trailer.c b/trailer.c index 54cc930..a9ed3f8 100644 --- a/trailer.c +++ b/trailer.c @@ -29,6 +29,12 @@ struct trailer_item { struct list_head list; char *token; char *value; +}; + +struct arg_item { + struct list_head list; + char *token; + char *value; struct conf_info conf; }; @@ -62,7 +68,7 @@ static size_t token_len_without_separator(const char *token, size_t len) return len; } -static int same_token(struct trailer_item *a, struct trailer_item *b) +static int same_token(struct trailer_item *a, struct arg_item *b) { size_t a_len = token_len_without_separator(a->token, strlen(a->token)); size_t b_len = token_len_without_separator(b->token, strlen(b->token)); @@ -71,12 +77,12 @@ static int same_token(struct trailer_item *a, struct trailer_item *b) return !strncasecmp(a->token, b->token, min_len); } -static int same_value(struct trailer_item *a, struct trailer_item *b) +static int same_value(struct trailer_item *a, struct arg_item *b) { return !strcasecmp(a->value, b->value); } -static int same_trailer(struct trailer_item *a, struct trailer_item *b) +static int same_trailer(struct trailer_item *a, struct arg_item *b) { return same_token(a, b) && same_value(a, b); } @@ -98,6 +104,13 @@ static inline void strbuf_replace(struct strbuf *sb, const char *a, const char * static void free_trailer_item(struct trailer_item *item) { + free(item->token); + free(item->value); + free(item); +} + +static void free_arg_item(struct arg_item *item) +{ free(item->conf.name); free(item->conf.key); free(item->conf.command); @@ -137,17 +150,29 @@ static void print_all(FILE *outfile, struct list_head *head, int trim_empty) } } +static struct trailer_item *trailer_from_arg(struct arg_item *arg_tok) +{ + struct trailer_item *new = xcalloc(sizeof(*new), 1); + new->token = arg_tok->token; + new->value = arg_tok->value; + arg_tok->token = arg_tok->value = NULL; + free_arg_item(arg_tok); + return new; +} + static void add_arg_to_input_list(struct trailer_item *on_tok, - struct trailer_item *arg_tok) + struct arg_item *arg_tok) { - if (after_or_end(arg_tok->conf.where)) - list_add(_tok->list, _tok->list); + int aoe = after_or_end(arg_tok->conf.where); + struct trailer_item *to_add = trailer_from_arg(arg_tok); + if (aoe) + list_add(_add->list, _tok->list); else - list_add_tail(_tok->list, _tok->list); + list_add_tail(_add->list, _tok->list); } static int check_if_different(struct trailer_item *in_tok, - struct trailer_item *arg_tok, + struct arg_item *arg_tok, int check_all, struct list_head *head) { @@ -200,7 +225,7 @@ static char *apply_command(const char *command, const char *arg) return result; } -static void apply_item_command(struct trailer_item *in_tok, struct trailer_item *arg_tok) +static void apply_item_command(struct trailer_item *in_tok, struct arg_item *arg_tok) { if (arg_tok->conf.command) { const char *arg; @@ -218,13 +243,13 @@ static void apply_item_command(struct trailer_item *in_tok, struct trailer_item } static void apply_arg_if_exists(struct trailer_item *in_tok, - struct trailer_item *arg_tok, + struct arg_item *arg_tok, struct trailer_item *on_tok, struct list_head *head) { switch (arg_tok->conf.if_exists) { case EXISTS_DO_NOTHING: - free_trailer_item(arg_tok); + free_arg_item(arg_tok); break; case EXISTS_REPLACE: apply_item_command(in_tok, arg_tok); @@ -241,39 +266,41 @@ static void apply_arg_if_exists(struct trailer_item *in_tok, if (check_if_different(in_tok, arg_tok, 1, head)) add_arg_to_input_list(on_tok, arg_tok); else - free_trailer_item(arg_tok); + free_arg_item(arg_tok); break; case EXISTS_ADD_IF_DIFFERENT_NEIGHBOR:
[PATCH v2 3/6] trailer: streamline trailer item create and add
Currently, creation and addition (to a list) of trailer items are spread across multiple functions. Streamline this by only having 2 functions: one to parse the user-supplied string, and one to add the parsed information to a list. --- trailer.c | 130 +- 1 file changed, 60 insertions(+), 70 deletions(-) diff --git a/trailer.c b/trailer.c index 0afa240..54cc930 100644 --- a/trailer.c +++ b/trailer.c @@ -500,10 +500,31 @@ static int git_trailer_config(const char *conf_key, const char *value, void *cb) return 0; } -static int parse_trailer(struct strbuf *tok, struct strbuf *val, const char *trailer) +static const char *token_from_item(struct trailer_item *item, char *tok) +{ + if (item->conf.key) + return item->conf.key; + if (tok) + return tok; + return item->conf.name; +} + +static int token_matches_item(const char *tok, struct trailer_item *item, int tok_len) +{ + if (!strncasecmp(tok, item->conf.name, tok_len)) + return 1; + return item->conf.key ? !strncasecmp(tok, item->conf.key, tok_len) : 0; +} + +static int parse_trailer(struct strbuf *tok, struct strbuf *val, +const struct conf_info **conf, const char *trailer) { size_t len; struct strbuf seps = STRBUF_INIT; + struct trailer_item *item; + int tok_len; + struct list_head *pos; + strbuf_addstr(, separators); strbuf_addch(, '='); len = strcspn(trailer, seps.buf); @@ -523,74 +544,31 @@ static int parse_trailer(struct strbuf *tok, struct strbuf *val, const char *tra strbuf_addstr(tok, trailer); strbuf_trim(tok); } - return 0; -} - -static const char *token_from_item(struct trailer_item *item, char *tok) -{ - if (item->conf.key) - return item->conf.key; - if (tok) - return tok; - return item->conf.name; -} - -static struct trailer_item *new_trailer_item(struct trailer_item *conf_item, -char *tok, char *val) -{ - struct trailer_item *new = xcalloc(sizeof(*new), 1); - new->value = val ? val : xstrdup(""); - - if (conf_item) { - duplicate_conf(>conf, _item->conf); - new->token = xstrdup(token_from_item(conf_item, tok)); - free(tok); - } else { - duplicate_conf(>conf, _conf_info); - new->token = tok; - } - - return new; -} - -static int token_matches_item(const char *tok, struct trailer_item *item, int tok_len) -{ - if (!strncasecmp(tok, item->conf.name, tok_len)) - return 1; - return item->conf.key ? !strncasecmp(tok, item->conf.key, tok_len) : 0; -} - -static struct trailer_item *create_trailer_item(const char *string) -{ - struct strbuf tok = STRBUF_INIT; - struct strbuf val = STRBUF_INIT; - struct trailer_item *item; - int tok_len; - struct list_head *pos; - - if (parse_trailer(, , string)) - return NULL; - - tok_len = token_len_without_separator(tok.buf, tok.len); /* Lookup if the token matches something in the config */ + tok_len = token_len_without_separator(tok->buf, tok->len); + *conf = _conf_info; list_for_each(pos, _head) { item = list_entry(pos, struct trailer_item, list); - if (token_matches_item(tok.buf, item, tok_len)) - return new_trailer_item(item, - strbuf_detach(, NULL), - strbuf_detach(, NULL)); + if (token_matches_item(tok->buf, item, tok_len)) { + char *tok_buf = strbuf_detach(tok, NULL); + *conf = >conf; + strbuf_addstr(tok, token_from_item(item, tok_buf)); + free(tok_buf); + break; + } } - return new_trailer_item(NULL, - strbuf_detach(, NULL), - strbuf_detach(, NULL)); + return 0; } -static void add_trailer_item(struct list_head *head, struct trailer_item *new) +static void add_trailer_item(struct list_head *head, char *tok, char *val, +const struct conf_info *conf) { - if (!new) - return; + struct trailer_item *new = xcalloc(sizeof(*new), 1); + new->token = tok; + new->value = val; + duplicate_conf(>conf, conf); list_add_tail(>list, head); } @@ -599,21 +577,28 @@ static void process_command_line_args(struct list_head *arg_head, { struct string_list_item *tr; struct trailer_item *item; + struct strbuf tok = STRBUF_INIT; + struct strbuf val = STRBUF_INIT; + const struct conf_info
[PATCH v2 6/6] trailer: support values folded to multiple lines
Currently, interpret-trailers requires that a trailer be only on 1 line. For example: a: first line second line would be interpreted as one trailer line followed by one non-trailer line. Make interpret-trailers support RFC 822-style folding, treating those lines as one logical trailer. --- Documentation/git-interpret-trailers.txt | 7 +- t/t7513-interpret-trailers.sh| 139 +++ trailer.c| 22 +++-- 3 files changed, 160 insertions(+), 8 deletions(-) diff --git a/Documentation/git-interpret-trailers.txt b/Documentation/git-interpret-trailers.txt index c480da6..cfec636 100644 --- a/Documentation/git-interpret-trailers.txt +++ b/Documentation/git-interpret-trailers.txt @@ -57,11 +57,12 @@ minus signs start the patch part of the message. When reading trailers, there can be whitespaces before and after the token, the separator and the value. There can also be whitespaces -inside the token and the value. +inside the token and the value. The value may be split over multiple lines with +each subsequent line starting with whitespace, like the "folding" in RFC 822. Note that 'trailers' do not follow and are not intended to follow many -rules for RFC 822 headers. For example they do not follow the line -folding rules, the encoding rules and probably many other rules. +rules for RFC 822 headers. For example they do not follow +the encoding rules and probably many other rules. OPTIONS --- diff --git a/t/t7513-interpret-trailers.sh b/t/t7513-interpret-trailers.sh index 7f5cd2a..31db699 100755 --- a/t/t7513-interpret-trailers.sh +++ b/t/t7513-interpret-trailers.sh @@ -157,6 +157,145 @@ test_expect_success 'with non-trailer lines only' ' test_cmp expected actual ' +test_expect_success 'multiline field treated as atomic for placement' ' + q_to_tab >patch <<-\EOF && + + another: trailer + name: value on + Qmultiple lines + another: trailer + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + name: value on + Qmultiple lines + name: value + another: trailer + EOF + test_config trailer.name.where after && + git interpret-trailers --trailer "name: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'multiline field treated as atomic for replacement' ' + q_to_tab >patch <<-\EOF && + + another: trailer + name: value on + Qmultiple lines + another: trailer + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + another: trailer + name: value + EOF + test_config trailer.name.ifexists replace && + git interpret-trailers --trailer "name: value" patch >actual && + test_cmp expected actual +' + +test_expect_success 'multiline field treated as atomic for difference check' ' + q_to_tab >patch <<-\EOF && + + another: trailer + name: first line + Qsecond line + another: trailer + EOF + test_config trailer.name.ifexists addIfDifferent && + + q_to_tab >trailer <<-\EOF && + name: first line + Qsecond line + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + name: first line + Qsecond line + another: trailer + EOF + git interpret-trailers --trailer "$(cat trailer)" patch >actual && + test_cmp expected actual && + + q_to_tab >trailer <<-\EOF && + name: first line + Qsecond line + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + name: first line + Qsecond line + another: trailer + name: first line + Qsecond line + EOF + git interpret-trailers --trailer "$(cat trailer)" patch >actual && + test_cmp expected actual && + + q_to_tab >trailer <<-\EOF && + name: first line *DIFFERENT* + Qsecond line + EOF + q_to_tab >expected <<-\EOF && + + another: trailer + name: first line + Qsecond line + another: trailer + name: first line *DIFFERENT* + Qsecond line + EOF + git interpret-trailers --trailer "$(cat trailer)" patch >actual && + test_cmp expected actual +' + +test_expect_success 'multiline field treated as atomic for neighbor check' ' + q_to_tab >patch <<-\EOF && + + another: trailer + name: first line + Qsecond line + another: trailer + EOF + test_config trailer.name.where after && + test_config
[PATCH v2 1/6] trailer: improve const correctness
Change "const char *" to "char *" in struct trailer_item and in the return value of apply_command (since those strings are owned strings). Change "struct conf_info *" to "const struct conf_info *" (since that struct is not modified). --- trailer.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/trailer.c b/trailer.c index c6ea9ac..1f191b2 100644 --- a/trailer.c +++ b/trailer.c @@ -27,8 +27,8 @@ static struct conf_info default_conf_info; struct trailer_item { struct trailer_item *previous; struct trailer_item *next; - const char *token; - const char *value; + char *token; + char *value; struct conf_info conf; }; @@ -95,8 +95,8 @@ static void free_trailer_item(struct trailer_item *item) free(item->conf.name); free(item->conf.key); free(item->conf.command); - free((char *)item->token); - free((char *)item->value); + free(item->token); + free(item->value); free(item); } @@ -215,13 +215,13 @@ static struct trailer_item *remove_first(struct trailer_item **first) return item; } -static const char *apply_command(const char *command, const char *arg) +static char *apply_command(const char *command, const char *arg) { struct strbuf cmd = STRBUF_INIT; struct strbuf buf = STRBUF_INIT; struct child_process cp = CHILD_PROCESS_INIT; const char *argv[] = {NULL, NULL}; - const char *result; + char *result; strbuf_addstr(, command); if (arg) @@ -425,7 +425,7 @@ static int set_if_missing(struct conf_info *item, const char *value) return 0; } -static void duplicate_conf(struct conf_info *dst, struct conf_info *src) +static void duplicate_conf(struct conf_info *dst, const struct conf_info *src) { *dst = *src; if (src->name) -- 2.8.0.rc3.226.g39d4020
[PATCH v2 0/6] allow non-trailers and multiple-line trailers
Thanks, Peff, for the pointer to list.h. Using list.h does simplify the code by a similar amount to switching it to a singly-linked list, so I have done that (replacing my earlier "trailer: use singly-linked list, not doubly" patch). Another advantage is that I no longer need to change the algorithm, making for a smaller patch. (There are some quirks resulting from list.h implementing a circular list, like needing to pass "head" as a sentinel when iterating from the middle of the list, but those are minor, and my original singly-linked list implementation had quirks too anyway like needing to pass a pointer to the next pointer.) Updates: (-> 1/6) - Added separate patch for const correctness changes (1/5 -> 2/6) - Dropped singly-linked list patch, instead replacing existing doubly-linked list implementation with list.h (5/5 -> 6/6) - Used "char *" instead of "struct strbuf" - Modified test slightly to test whitespace at beginning of line Jonathan Tan (6): trailer: improve const correctness trailer: use list.h for doubly-linked list trailer: streamline trailer item create and add trailer: make args have their own struct trailer: allow non-trailers in trailer block trailer: support values folded to multiple lines Documentation/git-interpret-trailers.txt | 10 +- t/t7513-interpret-trailers.sh| 174 ++ trailer.c| 538 +++ 3 files changed, 444 insertions(+), 278 deletions(-) -- 2.8.0.rc3.226.g39d4020
[PATCH 3/5] trailer: make args have their own struct
Improve type safety by making arguments (whether from configuration or from the command line) have their own "struct arg_item" type, separate from the "struct trailer_item" type used to represent the trailers in the buffer being manipulated. Also take the opportunity to refine the "struct trailer_item" definition by removing the conf (which is used only by arguments) and by removing const on the string fields, since those fields are owned by the struct. This change also prepares "struct trailer_item" to be further differentiated from "struct arg_item" in future patches. --- trailer.c | 161 ++ 1 file changed, 99 insertions(+), 62 deletions(-) diff --git a/trailer.c b/trailer.c index e8b1bfb..167b2fd 100644 --- a/trailer.c +++ b/trailer.c @@ -26,12 +26,18 @@ static struct conf_info default_conf_info; struct trailer_item { struct trailer_item *next; - const char *token; - const char *value; + char *token; + char *value; +}; + +struct arg_item { + struct arg_item *next; + char *token; + char *value; struct conf_info conf; }; -static struct trailer_item *first_conf_item; +static struct arg_item *first_conf_item; static char *separators = ":"; @@ -55,8 +61,7 @@ static size_t token_len_without_separator(const char *token, size_t len) return len; } -static int same_token(const struct trailer_item *a, - const struct trailer_item *b) +static int same_token(const struct trailer_item *a, const struct arg_item *b) { size_t a_len = token_len_without_separator(a->token, strlen(a->token)); size_t b_len = token_len_without_separator(b->token, strlen(b->token)); @@ -65,14 +70,12 @@ static int same_token(const struct trailer_item *a, return !strncasecmp(a->token, b->token, min_len); } -static int same_value(const struct trailer_item *a, - const struct trailer_item *b) +static int same_value(const struct trailer_item *a, const struct arg_item *b) { return !strcasecmp(a->value, b->value); } -static int same_trailer(const struct trailer_item *a, - const struct trailer_item *b) +static int same_trailer(const struct trailer_item *a, const struct arg_item *b) { return same_token(a, b) && same_value(a, b); } @@ -94,11 +97,18 @@ static inline void strbuf_replace(struct strbuf *sb, const char *a, const char * static void free_trailer_item(struct trailer_item *item) { + free(item->token); + free(item->value); + free(item); +} + +static void free_arg_item(struct arg_item *item) +{ free(item->conf.name); free(item->conf.key); free(item->conf.command); - free((char *)item->token); - free((char *)item->value); + free(item->token); + free(item->value); free(item); } @@ -131,13 +141,13 @@ static void print_all(FILE *outfile, struct trailer_item *first, int trim_empty) } } -static const char *apply_command(const char *command, const char *arg) +static char *apply_command(const char *command, const char *arg) { struct strbuf cmd = STRBUF_INIT; struct strbuf buf = STRBUF_INIT; struct child_process cp = CHILD_PROCESS_INIT; const char *argv[] = {NULL, NULL}; - const char *result; + char *result; strbuf_addstr(, command); if (arg) @@ -162,7 +172,7 @@ static const char *apply_command(const char *command, const char *arg) return result; } -static void apply_item_command(struct trailer_item *in_tok, struct trailer_item *arg_tok) +static void apply_item_command(struct trailer_item *in_tok, struct arg_item *arg_tok) { if (arg_tok->conf.command) { const char *arg; @@ -179,60 +189,77 @@ static void apply_item_command(struct trailer_item *in_tok, struct trailer_item } } +static struct trailer_item *trailer_from_arg(struct arg_item *arg_tok) +{ + struct trailer_item *new = xcalloc(sizeof(*new), 1); + new->token = arg_tok->token; + new->value = arg_tok->value; + arg_tok->token = arg_tok->value = NULL; + free_arg_item(arg_tok); + return new; +} + static void apply_existing_arg(struct trailer_item **found_next, - struct trailer_item *arg_tok, + struct arg_item *arg_tok, struct trailer_item **position_to_add, const struct trailer_item *in_tok_head, const struct trailer_item *neighbor) { - if (arg_tok->conf.if_exists == EXISTS_DO_NOTHING) { - free_trailer_item(arg_tok); + enum action_if_exists if_exists = arg_tok->conf.if_exists; + struct trailer_item *to_add; + + if (if_exists == EXISTS_DO_NOTHING) { + free_arg_item(arg_tok); return; }
[PATCH 2/5] trailer: streamline trailer item create and add
Currently, creation and addition (to a list) of trailer items are spread across multiple functions. Streamline this by only having 2 functions: one to parse the user-supplied string, and one to add the parsed information to a list. --- trailer.c | 135 +- 1 file changed, 62 insertions(+), 73 deletions(-) diff --git a/trailer.c b/trailer.c index bf3d7d0..e8b1bfb 100644 --- a/trailer.c +++ b/trailer.c @@ -335,7 +335,7 @@ static int set_if_missing(struct conf_info *item, const char *value) return 0; } -static void duplicate_conf(struct conf_info *dst, struct conf_info *src) +static void duplicate_conf(struct conf_info *dst, const struct conf_info *src) { *dst = *src; if (src->name) @@ -467,10 +467,30 @@ static int git_trailer_config(const char *conf_key, const char *value, void *cb) return 0; } -static int parse_trailer(struct strbuf *tok, struct strbuf *val, const char *trailer) +static const char *token_from_item(struct trailer_item *item, char *tok) +{ + if (item->conf.key) + return item->conf.key; + if (tok) + return tok; + return item->conf.name; +} + +static int token_matches_item(const char *tok, struct trailer_item *item, int tok_len) +{ + if (!strncasecmp(tok, item->conf.name, tok_len)) + return 1; + return item->conf.key ? !strncasecmp(tok, item->conf.key, tok_len) : 0; +} + +static int parse_trailer(struct strbuf *tok, struct strbuf *val, +const struct conf_info **conf, const char *trailer) { size_t len; struct strbuf seps = STRBUF_INIT; + struct trailer_item *item; + int tok_len; + strbuf_addstr(, separators); strbuf_addch(, '='); len = strcspn(trailer, seps.buf); @@ -490,73 +510,31 @@ static int parse_trailer(struct strbuf *tok, struct strbuf *val, const char *tra strbuf_addstr(tok, trailer); strbuf_trim(tok); } - return 0; -} - -static const char *token_from_item(struct trailer_item *item, char *tok) -{ - if (item->conf.key) - return item->conf.key; - if (tok) - return tok; - return item->conf.name; -} - -static struct trailer_item *new_trailer_item(struct trailer_item *conf_item, -char *tok, char *val) -{ - struct trailer_item *new = xcalloc(sizeof(*new), 1); - new->value = val ? val : xstrdup(""); - - if (conf_item) { - duplicate_conf(>conf, _item->conf); - new->token = xstrdup(token_from_item(conf_item, tok)); - free(tok); - } else { - duplicate_conf(>conf, _conf_info); - new->token = tok; - } - - return new; -} - -static int token_matches_item(const char *tok, struct trailer_item *item, int tok_len) -{ - if (!strncasecmp(tok, item->conf.name, tok_len)) - return 1; - return item->conf.key ? !strncasecmp(tok, item->conf.key, tok_len) : 0; -} - -static struct trailer_item *create_trailer_item(const char *string) -{ - struct strbuf tok = STRBUF_INIT; - struct strbuf val = STRBUF_INIT; - struct trailer_item *item; - int tok_len; - - if (parse_trailer(, , string)) - return NULL; - - tok_len = token_len_without_separator(tok.buf, tok.len); /* Lookup if the token matches something in the config */ + tok_len = token_len_without_separator(tok->buf, tok->len); + *conf = _conf_info; for (item = first_conf_item; item; item = item->next) { - if (token_matches_item(tok.buf, item, tok_len)) - return new_trailer_item(item, - strbuf_detach(, NULL), - strbuf_detach(, NULL)); + if (token_matches_item(tok->buf, item, tok_len)) { + char *tok_buf = strbuf_detach(tok, NULL); + *conf = >conf; + strbuf_addstr(tok, token_from_item(item, tok_buf)); + free(tok_buf); + break; + } } - return new_trailer_item(NULL, - strbuf_detach(, NULL), - strbuf_detach(, NULL)); + return 0; } -static void add_trailer_item(struct trailer_item ***tail, -struct trailer_item *new) +static void add_trailer_item(struct trailer_item ***tail, char *tok, char *val, +const struct conf_info *conf) { - if (!new) - return; + struct trailer_item *new = xcalloc(sizeof(*new), 1); + new->token = tok; + new->value = val; + duplicate_conf(>conf, conf); + **tail = new; *tail = >next; } @@ -567,19