Re: What's cooking in git.git (Aug 2016, #08; Wed, 24)
On Tue, Aug 30, 2016 at 10:10 PM, Stefan Beller wrote: >> >> * sb/submodule-clone-rr (2016-08-17) 8 commits >> - clone: recursive and reference option triggers submodule alternates >> - clone: implement optional references >> - clone: clarify option_reference as required >> - clone: factor out checking for an alternate path >> - submodule--helper update-clone: allow multiple references >> - submodule--helper module-clone: allow multiple references >> - t7408: merge short tests, factor out testing method >> - t7408: modernize style >> >> I spotted a last-minute bug in v5, which is not a very good sign >> (it shows that nobody is reviewing). Any more comments? >> >> > > Jacob Keller reviewed that series as announced in > > https://public-inbox.org/git/CA+P7+xpE=GoFWfdzmT+k=zku8+yjeh-aomsfutjjjwfha1h...@mail.gmail.com/#t > > and concluded it is fine as in > > https://public-inbox.org/git/ca+p7+xrokr0zgidqfuvpn+-j_wdjkauropcnpgvjzhafc12...@mail.gmail.com/ > > Thanks, > Stefan Yep. Thanks, Jake
Re: [PATCHv5 8/8] clone: recursive and reference option triggers submodule alternates
On Tue, Aug 30, 2016 at 10:04 PM, Stefan Beller wrote: > On Wed, Aug 24, 2016 at 4:37 PM, Jacob Keller wrote: > >> Yes that seems reasonable. >> >> Thanks, >> Jake > > I reviewed all your comments and you seem to be ok with including this > series as it is queued currently? > > Thanks, > Stefan Yea based on what's in Junio's tree, I think we've squashed in all the suggested changes, unless there have been more suggestions I missed. Regards, Jake
Re: Reducing CPU load on git server
On Mon, Aug 29, 2016 at 03:41:59PM -0700, W. David Jarvis wrote: > We have an open support thread with the GHE support folks, but the > only feedback we've gotten so far on this subject is that they believe > our CPU load is being driven by the quantity of fetch operations (an > average of 16 fetch requests per second during normal business hours, > so 4 requests per second per subscriber box). About 4,000 fetch > requests on our main repository per day. Hmm. That might well be, but I'd have to see the numbers to say more. At this point I think we're exhausting what is useful to talk about on the Git list. I'll point GHE Support at this thread, which might help your conversation with them (and they might pull me in behind the scenes). I'll try to answer any Git-specific questions here, though. > > None of those operations is triggered by client fetches. You'll see > > "rev-list" for a variety of operations, so that's hard to pinpoint. But > > I'm surprised that "prune" is a common one for you. It is run as part of > > the post-push, but I happen to know that the version that ships on GHE > > is optimized to use bitmaps, and to avoid doing any work when there are > > no loose objects that need pruning in the first place. > > Would regular force-pushing trigger prune operations? Our engineering > body loves to force-push. No, it shouldn't make a difference. For stock git, "prune" will only be run occasionally as part of "git gc". On GitHub Enterprise, every push kicks off a "sync" job that moves objects from a specific fork into storage shared by all of the related forks. So GHE will run prune more often than stock git would, but force-pushing wouldn't have any effect on that. There are also some custom patches to optimize prune on GHE, so it shouldn't generally be very expensive. Unless perhaps for some reason the reachability bitmaps on your repository aren't performing very well. 
You could try something like comparing:

  time git rev-list --objects --all >/dev/null

and

  time git rev-list --objects --all --use-bitmap-index >/dev/null

on your server. The second should be a lot faster. If it's not, that may be an indication that Git could be doing a better job of selecting bitmap commits (that code is not GitHub-specific at all).

> >> I might be misunderstanding this, but if the subscriber is already "up
> >> to date" modulo a single updated ref tip, then this problem shouldn't
> >> occur, right? Concretely: if ref A is built off of ref B, and the
> >> subscriber already has B when it requests A, that shouldn't cause this
> >> behavior, but it would cause this behavior if the subscriber didn't
> >> have B when it requested A.
> >
> > Correct. So this shouldn't be a thing you are running into now, but it's
> > something that might be made worse if you switch to fetching only single
> > refs.
>
> But in theory if we were always up-to-date (since we'd always fetch
> any updated ref) we wouldn't run into this problem? We could have a
> cron job to ensure that we run a full git fetch every once in a while
> but I'd hope that if this was written properly we'd almost always have
> the most recent commit for any dependency ref.

It's a little more complicated than that. What you're really going for is letting git reuse on-disk deltas when serving fetches. But depending on when the last repack was run, we might be cobbling together the fetch from multiple packs on disk, in which case there will still be some delta search. In my experience that's not _generally_ a big deal, though. Small fetches don't have that many deltas to find.

> Our primary repository is fairly busy. It has about 1/3 the commits of
> Linux and about 1/3 the refs, but seems otherwise to be on the same
> scale.
> And, of course, it both hasn't been around for as long as Linux
> has and has been experiencing exponential growth, which means its
> current activity is higher than it has ever been before -- might put
> it on a similar scale to Linux's current activity.

Most of the work for repacking scales with the number of total objects (not quite linearly, though). For torvalds/linux (and its forks), that's around 16 million objects. You might try "git count-objects -v" on your server for comparison (but do it in the "network.git" directory, as that's the shared object storage).

-Peff
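The rev-list comparison suggested above can also be tried on a throwaway repository to see the bitmap path exercised end to end. A minimal sketch (the scratch repo, commit contents, and identity are invented for illustration; on a real server you would time the two commands against the live repository instead):

```shell
# Build a tiny scratch repo, write a reachability bitmap, and compare a
# plain object walk against the bitmap-assisted one.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
for i in 1 2 3; do
	echo "content $i" >file
	git add file
	git -c user.name=demo -c user.email=demo@example.com commit -qm "commit $i"
done

git repack -a -d -q -b   # -b asks repack to write a reachability bitmap

plain=$(git rev-list --objects --all | wc -l | tr -d ' ')
bitmap=$(git rev-list --objects --all --use-bitmap-index | wc -l | tr -d ' ')
echo "plain=$plain bitmap=$bitmap"   # the two walks should enumerate the same objects
```

On a repository this small the timing difference is invisible; the point is only that both invocations walk the same object set, which is what makes the wall-clock comparison on a big server meaningful.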
Re: git blame [was: Reducing CPU load on git server]
On Tue, Aug 30, 2016 at 12:46:20PM +0200, Jakub Narębski wrote: > W dniu 29.08.2016 o 23:31, Jeff King pisze: > > > Blame-tree is a GitHub-specific command (it feeds the main repository > > view page), and is a known CPU hog. There's more clever caching for that > > coming down the pipe, but it's not shipped yet. > > I wonder if having support for 'git blame ' in Git core would > be something interesting to Git users. I once tried to implement it, > but it went nowhere. Would it be hard to implement? I think there's some interest; I have received a few off-list emails over the years about it. There was some preliminary discussion long ago: http://public-inbox.org/git/20110302164031.ga18...@sigill.intra.peff.net/ The code that runs on GitHub is available in my fork of git. I haven't submitted it upstream because there are some lingering issues. I mentioned them on-list in the first few items of: http://public-inbox.org/git/20130318121243.gc14...@sigill.intra.peff.net/ That code is in the jk/blame-tree branch of https://github.com/peff/git if you are interested in addressing them (note that I haven't touched that code in a few years except for rebasing it forward, so it may have bitrotted a little). Here's a snippet from an off-list conversation I had with Dennis (cc'd) in 2014 (I think he uses that blame-tree code as part of a custom git web interface): > The things I think it needs are: > > 1. The max-depth patches need to be reconciled with Duy's pathspec > work upstream. The current implementation works only for tree > diffs, and is not really part of the pathspec at all. > > 2. Docs/output formats for blame-tree need to be friendlier, as you > noticed. > > 3. Blame-tree does not use revision pathspecs at all. This makes it > take way longer than it could, because it does not prune away side > branches deep in history that affect only paths whose blame we have > already found. But the current pathspec code is so slow that using > it outweighs the pruning benefit. 
> > I have a series, which I just pushed up to jk/faster-blame-tree, > which tries to improve this. But it's got a lot of new, untested > code itself (we are not even running it at GitHub yet). It's also > based on v1.9.4; I think there are going to be a lot of conflicts > with the combine-tree work done in v2.0. > > [...] > > I also think it would probably make sense for blame-tree to support the > same output formats as git-blame (e.g., to have an identical --porcelain > mode, to have a reasonable human-readable format by default, etc). That's all I could dig out of my archives. I'd be happy if somebody wanted to pick it up and run with it. Polishing for upstream has been on my list for several years now, but there's usually something more important (or interesting) to work on at any given moment. You might also look at how GitLab does it. I've never talked to them about it, and as far as I know they do not use blame-tree. -Peff
Re: [PATCH v1] pack-protocol: fix maximum pkt-line size
On Mon, Aug 29, 2016 at 10:55 AM, wrote: > From: Lars Schneider > > According to LARGE_PACKET_MAX in pkt-line.h the maximal length of a > pkt-line packet is 65520 bytes. The pkt-line header takes 4 bytes and > therefore the pkt-line data component must not exceed 65516 bytes. > > Signed-off-by: Lars Schneider > --- > > This patch was part of my "Git filter protocol" series [1]. Stefan > suggested to submit this patch individually as it is not strictly > required in the series [2]. > > Thanks, > Lars > > > [1] > http://public-inbox.org/git/20160825110752.31581-1-larsxschnei...@gmail.com/ > [2] > http://public-inbox.org/git/cagz79kajn-yf28+ls2jyqss6jt-oj2jw-saxqn-oe5xaxsy...@mail.gmail.com/ > > > Documentation/technical/protocol-common.txt | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/Documentation/technical/protocol-common.txt > b/Documentation/technical/protocol-common.txt > index bf30167..ecedb34 100644 > --- a/Documentation/technical/protocol-common.txt > +++ b/Documentation/technical/protocol-common.txt > @@ -67,9 +67,9 @@ with non-binary data the same whether or not they contain > the trailing > LF (stripping the LF if present, and not complaining when it is > missing). > > -The maximum length of a pkt-line's data component is 65520 bytes. > -Implementations MUST NOT send pkt-line whose length exceeds 65524 > -(65520 bytes of payload + 4 bytes of length data). > +The maximum length of a pkt-line's data component is 65516 bytes. > +Implementations MUST NOT send pkt-line whose length exceeds 65520 > +(65516 bytes of payload + 4 bytes of length data). > > Implementations SHOULD NOT send an empty pkt-line ("0004"). > This looks good, Thanks, Stefan > -- > 2.9.2 >
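To make the arithmetic in the patch concrete: the 4-hex-digit length prefix counts itself, so a 65520-byte maximum total pkt-line leaves 65516 bytes for data. A small illustrative sketch (the `pkt_line` helper name is made up for this example, not a git command):

```shell
# Illustrative sketch of pkt-line framing; "pkt_line" is a made-up
# helper, not part of git.
large_packet_max=65520                    # maximum total pkt-line size
max_data_len=$((large_packet_max - 4))    # 65516: the 4-byte header counts itself

pkt_line () {
	data=$1
	len=${#data}
	if [ "$len" -gt "$max_data_len" ]; then
		echo "payload too large: $len bytes" >&2
		return 1
	fi
	# length field = payload bytes + the 4 header bytes
	printf '%04x%s' $((len + 4)) "$data"
}

pkt_line "hello"    # prints "0009hello": 5 bytes of data + 4 bytes of length
```

(The flush-pkt "0000" mentioned in protocol-common.txt is a special case that carries no payload at all.)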
Re: What's cooking in git.git (Aug 2016, #08; Wed, 24)
> > * sb/submodule-clone-rr (2016-08-17) 8 commits > - clone: recursive and reference option triggers submodule alternates > - clone: implement optional references > - clone: clarify option_reference as required > - clone: factor out checking for an alternate path > - submodule--helper update-clone: allow multiple references > - submodule--helper module-clone: allow multiple references > - t7408: merge short tests, factor out testing method > - t7408: modernize style > > I spotted a last-minute bug in v5, which is not a very good sign > (it shows that nobody is reviewing). Any more comments? > > Jacob Keller reviewed that series as announced in https://public-inbox.org/git/CA+P7+xpE=GoFWfdzmT+k=zku8+yjeh-aomsfutjjjwfha1h...@mail.gmail.com/#t and concluded it is fine as in https://public-inbox.org/git/ca+p7+xrokr0zgidqfuvpn+-j_wdjkauropcnpgvjzhafc12...@mail.gmail.com/ Thanks, Stefan
[PATCH 3/3] diff-highlight: avoid highlighting combined diffs
The algorithm in diff-highlight only understands how to look at two sides of a diff; it cannot correctly handle combined diffs with multiple preimages. Often highlighting does not trigger at all for these diffs because the line counts do not match up. E.g., if we see:

  - ours
  -theirs
  ++resolved

we would not bother highlighting; it naively looks like a single line went away, and then a separate hunk added another single line. But of course there are exceptions. E.g., if the other side deleted the line, we might see:

  - ours
  ++resolved

which looks like we dropped " ours" and added "+resolved". This is only a small highlighting glitch (we highlight the space and the "+" along with the content), but it's also the tip of the iceberg. Even if we learned to find the true content here (by noticing we are in a 3-way combined diff and marking _two_ characters from the front of the line as uninteresting), there are other more complicated cases where we really do need to handle a 3-way hunk. Let's just punt for now; we can recognize combined diffs by the presence of extra "@" symbols in the hunk header, and treat them as non-diff content.
Signed-off-by: Jeff King --- contrib/diff-highlight/diff-highlight| 2 +- contrib/diff-highlight/t/t9400-diff-highlight.sh | 37 2 files changed, 38 insertions(+), 1 deletion(-) diff --git a/contrib/diff-highlight/diff-highlight b/contrib/diff-highlight/diff-highlight index 9280c88..81bd804 100755 --- a/contrib/diff-highlight/diff-highlight +++ b/contrib/diff-highlight/diff-highlight @@ -36,7 +36,7 @@ $SIG{PIPE} = 'DEFAULT'; while (<>) { if (!$in_hunk) { print; - $in_hunk = /^$GRAPH*$COLOR*\@/; + $in_hunk = /^$GRAPH*$COLOR*\@\@ /; } elsif (/^$GRAPH*$COLOR*-/) { push @removed, $_; diff --git a/contrib/diff-highlight/t/t9400-diff-highlight.sh b/contrib/diff-highlight/t/t9400-diff-highlight.sh index 7d034aa..64dd9f7 100755 --- a/contrib/diff-highlight/t/t9400-diff-highlight.sh +++ b/contrib/diff-highlight/t/t9400-diff-highlight.sh @@ -254,4 +254,41 @@ test_expect_success 'diff-highlight works with the --graph option' ' test_cmp graph.exp graph.act ' +# Most combined diffs won't meet diff-highlight's line-number filter. So we +# create one here where one side drops a line and the other modifies it. That +# should result in a diff like: +# +#- modified content +#++resolved content +# +# which naively looks like one side added "+resolved". +test_expect_success 'diff-highlight ignores combined diffs' ' + echo "content" >file && + git add file && + git commit -m base && + + >file && + git commit -am master && + + git checkout -b other HEAD^ && + echo "modified content" >file && + git commit -am other && + + test_must_fail git merge master && + echo "resolved content" >file && + git commit -am resolved && + + cat >expect <<-\EOF && + --- a/file + +++ b/file + @@@ -1,1 -1,0 +1,1 @@@ + - modified content + ++resolved content + EOF + + git show -c | "$DIFF_HIGHLIGHT" >actual.raw && + sed -n "/^---/,\$p" actual && + test_cmp expect actual +' + test_done -- 2.10.0.rc2.125.gcfb3d08
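The one-line change above hinges on hunk-header syntax: a two-way hunk header starts with "@@ " while a combined diff adds one "@" per extra parent (e.g. "@@@ -1,1 -1,0 +1,1 @@@"). A rough shell rendering of that test (it ignores the graph/color prefixes the perl regex also allows, and `is_twoway_hunk` is a name made up for this sketch):

```shell
# Distinguish ordinary two-way hunk headers from combined (-c/--cc) ones,
# mirroring the tightened regex /^...\@\@ / in diff-highlight.
is_twoway_hunk () {
	case $1 in
	"@@ "*) return 0 ;;   # ordinary two-way hunk header
	*) return 1 ;;        # anything else, including combined "@@@ "
	esac
}

is_twoway_hunk "@@ -1 +1 @@" && echo "highlight this hunk"
is_twoway_hunk "@@@ -1,1 -1,0 +1,1 @@@" || echo "combined diff: pass through untouched"
```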
Re: [PATCHv5 8/8] clone: recursive and reference option triggers submodule alternates
On Wed, Aug 24, 2016 at 4:37 PM, Jacob Keller wrote: > Yes that seems reasonable. > > Thanks, > Jake I reviewed all your comments and you seem to be ok with including this series as it is queued currently? Thanks, Stefan
[PATCH 1/3] diff-highlight: ignore test cruft
These are the same as in the normal t/.gitignore, with the exception of ".prove", as our Makefile does not support it.

Signed-off-by: Jeff King
---
 contrib/diff-highlight/t/.gitignore | 2 ++
 1 file changed, 2 insertions(+)
 create mode 100644 contrib/diff-highlight/t/.gitignore

diff --git a/contrib/diff-highlight/t/.gitignore b/contrib/diff-highlight/t/.gitignore
new file mode 100644
index 000..7dcbb23
--- /dev/null
+++ b/contrib/diff-highlight/t/.gitignore
@@ -0,0 +1,2 @@
+/trash directory*
+/test-results
-- 
2.10.0.rc2.125.gcfb3d08
[PATCH 2/3] diff-highlight: add multi-byte tests
Now that we have a test suite for diff highlight, we can show off the improvements from 8d00662 (diff-highlight: do not split multibyte characters, 2015-04-03). While we're at it, we can also add another case that _doesn't_ work: combining code points are treated as their own unit, which means that we may stick colors between them and the character they are modifying (with the result that the color is not shown in an xterm, though it's possible that other terminals err the other way, and show the color but not the accent). There's no fix here, but let's document it as a failure. Signed-off-by: Jeff King --- contrib/diff-highlight/t/t9400-diff-highlight.sh | 36 +++- 1 file changed, 35 insertions(+), 1 deletion(-) diff --git a/contrib/diff-highlight/t/t9400-diff-highlight.sh b/contrib/diff-highlight/t/t9400-diff-highlight.sh index e42232d..7d034aa 100755 --- a/contrib/diff-highlight/t/t9400-diff-highlight.sh +++ b/contrib/diff-highlight/t/t9400-diff-highlight.sh @@ -207,7 +207,41 @@ test_expect_failure 'diff-highlight highlights mismatched hunk size' ' EOF ' -# TODO add multi-byte test +# These two code points share the same leading byte in UTF-8 representation; +# a naive byte-wise diff would highlight only the second byte. 
+# +# - U+00f3 ("o" with acute) +o_accent=$(printf '\303\263') +# - U+00f8 ("o" with stroke) +o_stroke=$(printf '\303\270') + +test_expect_success 'diff-highlight treats multibyte utf-8 as a unit' ' + echo "unic${o_accent}de" >a && + echo "unic${o_stroke}de" >b && + dh_test a b <<-EOF + @@ -1 +1 @@ + -unic${CW}${o_accent}${CR}de + +unic${CW}${o_stroke}${CR}de + EOF +' + +# Unlike the UTF-8 above, these are combining code points which are meant +# to modify the character preceding them: +# +# - U+0301 (combining acute accent) +combine_accent=$(printf '\314\201') +# - U+0302 (combining circumflex) +combine_circum=$(printf '\314\202') + +test_expect_failure 'diff-highlight treats combining code points as a unit' ' + echo "unico${combine_accent}de" >a && + echo "unico${combine_circum}de" >b && + dh_test a b <<-EOF + @@ -1 +1 @@ + -unic${CW}o${combine_accent}${CR}de + +unic${CW}o${combine_circum}${CR}de + EOF +' test_expect_success 'diff-highlight works with the --graph option' ' dh_test_setup_history && -- 2.10.0.rc2.125.gcfb3d08
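For readers unfamiliar with the encoding detail the test relies on: U+00F3 and U+00F8 really do share their first UTF-8 byte, which is why a naive byte-wise diff would highlight only the second byte. A quick shell check (od is used here purely for illustration):

```shell
# The two code points from the test, built byte-by-byte exactly as the
# test script does:
o_accent=$(printf '\303\263')   # U+00F3 "o" with acute:  bytes c3 b3
o_stroke=$(printf '\303\270')   # U+00F8 "o" with stroke: bytes c3 b8

# Each is a single character but two bytes long...
printf '%s' "$o_accent" | wc -c

# ...and the two differ only in the trailing byte:
printf '%s' "$o_accent" | od -An -tx1    # c3 b3
printf '%s' "$o_stroke" | od -An -tx1    # c3 b8
```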
Re: [PATCH v4 0/3] diff-highlight: add support for --graph option
On Tue, Aug 30, 2016 at 07:07:11AM -0700, Brian Henderson wrote: > On Mon, Aug 29, 2016 at 02:37:46PM -0700, Junio C Hamano wrote: > > Brian Henderson writes: > > > > > How does this look? > > > > > > Drawing the graph helped me a lot in figuring out what I was > > > actually testing. thanks! > > > > Yeah, I also am pleased to see the picture of what is being tested > > in the test script. > > > > With your sign-off, they would have been almost perfect ;-). > > doh. fixed. > > I left the subject as v4, probably mostly because I have this weird aversion > to > increasing version numbers :) but I justified it by thinking that the actual > patch set isn't changing, I just added the sign-off (and updated the commit > messages per Jeff.) Hope that's ok. Thanks. Here are a few patches to go on top. The first one could arguably be squashed into your first patch (and I don't mind if Junio wants to do so while applying, but I don't think it's worth you re-sending). The second one fleshes out the test scripts a bit, now that we have them (yay!). And the third fixes a bug that was reported to me off-list. I held back because it touches the same lines as your topic (and as a bonus, I was now able to write a test for it). It could be its own topic branch that graduates separately, but seeing as it's contrib, I don't mind one big diff-highlight potpourri topic if it makes things simpler. [1/3]: diff-highlight: ignore test cruft [2/3]: diff-highlight: add multi-byte tests [3/3]: diff-highlight: avoid highlighting combined diffs -Peff
Re: [PATCH v6 12/13] convert: add filter.<driver>.process option
On Tue, Aug 30, 2016 at 03:23:10PM -0700, Junio C Hamano wrote: > Lars Schneider writes: > > > On 30 Aug 2016, at 20:59, Junio C Hamano wrote: > >> > >>> "abort" could be ambiguous because it could be read as "abort only > >>> this file". "abort-all" would work, though. Would you prefer to see > >>> "error" replaced by "abort" and "error-all" by "abort-all"? > >> > >> No. > >> > >> I was primarily reacting to "-all" part, so anything that ends with > >> "-all" is equally ugly from my point of view and not an improvement. > >> > >> As I said, "error-all" as long as other reviewers are happy with is > >> OK by me, too. > > > > Now, I see your point. How about "error-and-stop" or "error-stop" > > instead of "error-all"? > > Not really. I was primarily reacting to having "error" and > "error-all", that share the same prefix. Changing "-all" suffix to > "-stop" does not make all that difference to me. But in any case, > as long as other reviewers are happy with it, it is OK by me, too. > > > Good argument. However, my intention here was to mimic the v1 filter > > mechanism. > > I am not making any argument and in no way against the chosen > behaviour. That "intention here" that was revealed after two > iterations is something we would want to see as the reasoning > behind the choice of the final behaviour recorded somewhere, > and now I drew it out of you, I achieved what I set out to do > initially ;-) > > > I am not sure I understand your last sentence. Just to be clear: > > Would you prefer it, if Git would just close the pipe to the filter process > > on Git exit and leave the filter running? > > As a potential filter writer, I would probably feel it the easiest > to handle if there is no signal involved. My job would be to read > from the command pipe, react to it, and when the command pipe gives > me EOF, I am expected to exit myself. That would be simple enough. 
> > >> I meant it as primarily an example people can learn from when they > >> want to write their own. > > > > I think `t/t0021/rot13-filter.pl` (part of this patch) serves this purpose > > already. > > I would expect them to peek at contrib/, but do you seriously expect > people to comb through t/ directory? How about a filter written in C, and placed in ./t/helper/ ? At least I feel that a filter in C-language could be a starting point for others which prefer, depending on the task the filter is going to do, to use shell scripts, perl, python or any other high-level language. A test case, where data can not be filtered, would be a minimum. As Jakub pointed out, you can use iconv with good and bad data.
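The iconv suggestion at the end is easy to demonstrate: feeding it an incomplete UTF-8 sequence makes it exit non-zero, which is exactly the "data cannot be filtered" failure a test filter would want to provoke. A sketch:

```shell
# Good data converts cleanly...
printf 'hello' | iconv -f UTF-8 -t UTF-16 >/dev/null && echo "good data: ok"

# ...but a truncated multibyte sequence (a lone 0xC3 lead byte) fails,
# simulating input a filter cannot process.
if ! printf '\303' | iconv -f UTF-8 -t UTF-16 >/dev/null 2>&1; then
	echo "bad data: iconv reported an error"
fi
```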
[PATCH 2/2] color_parse_mem: initialize "struct color" temporary
Compiling color.c with gcc 6.2.0 using -O3 produces some -Wmaybe-uninitialized false positives:

  color.c: In function ‘color_parse_mem’:
  color.c:189:10: warning: ‘bg.blue’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     out += xsnprintf(out, len, "%c8;2;%d;%d;%d", type,
            ^~~
            c->red, c->green, c->blue);
            ~~
  color.c:208:15: note: ‘bg.blue’ was declared here
    struct color bg = { COLOR_UNSPECIFIED };
                 ^~

  [ditto for bg.green, bg.red, fg.blue, etc]

This is doubly confusing, because the declaration shows it being initialized! Even though we do not explicitly initialize the color components, an incomplete initializer sets the unmentioned members to zero.

What the warning doesn't show is that we later do this:

  struct color c;
  if (!parse_color(&c, ...)) {
	  if (fg.type == COLOR_UNSPECIFIED)
		  fg = c;
	  ...
  }

gcc is clever enough to realize that a struct assignment from an uninitialized variable taints the destination. But unfortunately it's _not_ clever enough to realize that we only look at those members when type is set to COLOR_RGB, in which case they are always initialized. With -O2, gcc does not look into parse_color() and must assume that "c" emerges fully initialized. With -O3, it inlines parse_color(), and learns just enough to get confused.

We can silence the false positive by initializing the temporary "c". This also future-proofs us against violating the type assumptions (the result would probably still be buggy, but in a deterministic way).

Signed-off-by: Jeff King
---
Of course it's possible that I am wrong and gcc is right, but I just don't see it (and that was the real reason I dug; I don't care _that_ much about -O3 warnings, but I wanted to see if gcc had found a real bug).

 color.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/color.c b/color.c
index 81c2676..1b95e6b 100644
--- a/color.c
+++ b/color.c
@@ -215,7 +215,7 @@ int color_parse_mem(const char *value, int value_len, char *dst)
 	/* [fg [bg]] [attr]... */
 	while (len > 0) {
 		const char *word = ptr;
-		struct color c;
+		struct color c = { COLOR_UNSPECIFIED };
 		int val, wordlen = 0;
 
 		while (len > 0 && !isspace(word[wordlen])) {
-- 
2.10.0.rc2.125.gcfb3d08
[PATCH 1/2] error_errno: use constant return similar to error()
Commit e208f9c (make error()'s constant return value more visible, 2012-12-15) introduced some macro trickery to make the constant return from error() more visible to callers, which in turn can help gcc produce better warnings (and possibly even better code).

Later, fd1d672 (usage.c: add warning_errno() and error_errno(), 2016-05-08) introduced another variant, and subsequent commits converted some uses of error() to error_errno(), losing the magic from e208f9c for those sites. As a result, compiling vcs-svn/svndiff.c with "gcc -O3" produces -Wmaybe-uninitialized false positives (at least with gcc 6.2.0).

Let's give error_errno() the same treatment, which silences these warnings.

Signed-off-by: Jeff King
---
 git-compat-util.h | 1 +
 usage.c           | 1 +
 2 files changed, 2 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index db89ba7..2ad45b3 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -436,6 +436,7 @@ static inline int const_error(void)
 	return -1;
 }
 #define error(...) (error(__VA_ARGS__), const_error())
+#define error_errno(...) (error_errno(__VA_ARGS__), const_error())
 #endif
 
 extern void set_die_routine(NORETURN_PTR void (*routine)(const char *err, va_list params));
diff --git a/usage.c b/usage.c
index 1dad03f..0efa3fa 100644
--- a/usage.c
+++ b/usage.c
@@ -148,6 +148,7 @@ void NORETURN die_errno(const char *fmt, ...)
 	va_end(params);
 }
 
+#undef error_errno
 int error_errno(const char *fmt, ...)
 {
 	char buf[1024];
-- 
2.10.0.rc2.125.gcfb3d08
[PATCH 0/2] squelch some "gcc -O3 -Wmaybe-uninitialized" warnings
I happened to be compiling git with -O3 today, and noticed we generate some warnings about uninitialized variables (I actually compile with -Wall, but the only false positives I saw were these). Here are patches to squelch them. [1/2]: error_errno: use constant return similar to error() [2/2]: color_parse_mem: initialize "struct color" temporary -Peff
Re: git submodules implementation question
On Tue, Aug 30, 2016 at 10:53 AM, Junio C Hamano wrote: > Uma Srinivasan writes: > >> I think the following fix is still needed to is_submodule_modified(): >> >> strbuf_addf(&buf, "%s/.git", path); >> git_dir = read_gitfile(buf.buf); >> if (!git_dir) { >> git_dir = buf.buf; >> ==> if (!is_git_directory(git_dir)) { >> ==> die("Corrupted .git dir in submodule %s", path); >> ==> } >> } > > If it is so important that git_dir is a valid Git > repository after > > git_dir = read_gitfile(buf.buf); > if (!git_dir) > git_dir = buf.buf; > > is done to determine what "git_dir" to use, it seems to me that it > does not make much sense to check ONLY dir/.git that is a directory > and leave .git/modules/$name that dir/.git file points at unchecked. > Okay. > But there is much bigger problem with the above addition, I think. > There also can be a case where dir/ does not even have ".git" in it. > A submodule the user is not interested in will just have an empty > directory there, and immediately after the above three lines I > reproduced above, we have this > > if (!is_directory(git_dir)) { > strbuf_release(&buf); > return 0; > } > Good to know about this use case. > The added check will break the use case. If anything, that check, > if this code needs to verify that "git_dir" points at a valid Git > repository, should come _after_ that. Okay. > > Shouldn't "git-status --porcelain" run in the submodule notice that > it is not a valid repository and quickly die anyway? Why should we > even check before spawning that process in the first place? I don't see any reason to disagree with this. > > I might suggest to update prepare_submodule_repo_env() so that the > spawned process will *NOT* have to guess where the working tree and > repository by exporting GIT_DIR (set to "git_dir" we discover above) > and GIT_WORK_TREE (set to "." as cp.dir is set to the path to the > working tree of the submodule). 
That would stop the "git status" to > guess (and fooled by a corrupted dir/.git that is not a git > repository). > Here's where I am struggling with my lack of knowledge of git internals and the implementation particularly in the context of how environment variables are passed from the parent to the child process. Are you suggesting that we set up the child process environment array (the "out" argument) in prepare_submodule_repo_env() to include GIT_DIR_ENVIRONMENT and GIT_WORK_TREE_ENVIRONMENT in addition to CONFIG_DATA_ENVIRONMENT that is there now? Can I use the strcmp and argv_array_push() calls to do this? There are several callers for this routine prepare_submodule_repo_env(). Would the caller pass the git dir and work tree as arguments to this routine or would the caller set up the environment variables as needed? Is there any documentation or other example code where similar passing of env. vars to a child process is being done? Sorry for all these questions but I'm still a novice as far as the code is concerned. Thanks for your help. Uma
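What Junio describes has a simple shell-level equivalent: export the two variables for the child so "git status" does no discovery of its own. A sketch on a throwaway repo (in the C code the values would instead be appended to the child's environment array in prepare_submodule_repo_env(), alongside the CONFIG_DATA_ENVIRONMENT handling that is there now):

```shell
# A throwaway repo standing in for a submodule working tree.
sub=$(mktemp -d)
git init -q "$sub"
cd "$sub"

# With GIT_DIR and GIT_WORK_TREE set explicitly, the child git process
# uses exactly the repository and worktree the parent chose, instead of
# guessing (and possibly being fooled by a corrupted dir/.git).
GIT_DIR=.git GIT_WORK_TREE=. git status --porcelain
```

In a freshly initialized repo this prints nothing and exits 0; pointing GIT_DIR at something that is not a git repository makes the child die immediately, which is the behaviour the thread is after.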
Re: [PATCH 14/22] sequencer: prepare for rebase -i's commit functionality
Junio C Hamano writes:

> Two functions with the same name reading from the same format, even
> when they expect to produce the same result in different internal
> format, without even being aware of each other is a bad enough
> "regression" in maintainability of the code. One of them not even
> using sq_dequote() helper and reinventing is even worse.

So, here is what I did as a lunch-break-hack. The "same result in
different internal format" calls for a fairly light-weight mechanism to
convey the necessary information that can be shared by callers with
different needs, and I chose string-list for that.

Totally untested, but parse_key_value_squoted() would be a good
foundation to build on. The caller in "am" deliberately wants to be
strict (e.g. it wants to error out when the user mucked with the file,
so it insists on the three variables to appear in a known order and
rejects input that violates its assumption), but the function does not
mind if other callers want to be more lenient.

-- >8 --
From: Junio C Hamano
Date: Tue, 30 Aug 2016 12:36:42 -0700
Subject: [PATCH] am: refactor read_author_script()

By splitting the part that reads from a file and the part that parses
the variable definitions from the contents, make the latter more
reusable in the future.

Signed-off-by: Junio C Hamano
---
 builtin/am.c | 103 ++-
 1 file changed, 45 insertions(+), 58 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 739b34d..b36d1f0 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -28,6 +28,7 @@
 #include "rerere.h"
 #include "prompt.h"
 #include "mailinfo.h"
+#include "string-list.h"

 /**
  * Returns 1 if the file is empty or does not exist, 0 otherwise.
@@ -258,38 +259,29 @@ static int read_state_file(struct strbuf *sb, const struct am_state *state,
 }

 /**
- * Reads a KEY=VALUE shell variable assignment from `fp`, returning the VALUE
- * as a newly-allocated string. VALUE must be a quoted string, and the KEY must
- * match `key`. Returns NULL on failure.
- *
- * This is used by read_author_script() to read the GIT_AUTHOR_* variables from
- * the author-script.
+ * Take a series of KEY='VALUE' lines where VALUE part is
+ * sq-quoted, and append at the end of the string list
  */
-static char *read_shell_var(FILE *fp, const char *key)
+static int parse_key_value_squoted(char *buf, struct string_list *list)
 {
-	struct strbuf sb = STRBUF_INIT;
-	const char *str;
-
-	if (strbuf_getline_lf(&sb, fp))
-		goto fail;
-
-	if (!skip_prefix(sb.buf, key, &str))
-		goto fail;
-
-	if (!skip_prefix(str, "=", &str))
-		goto fail;
-
-	strbuf_remove(&sb, 0, str - sb.buf);
-
-	str = sq_dequote(sb.buf);
-	if (!str)
-		goto fail;
-
-	return strbuf_detach(&sb, NULL);
-
-fail:
-	strbuf_release(&sb);
-	return NULL;
+	while (*buf) {
+		struct string_list_item *item;
+		char *np;
+		char *cp = strchr(buf, '=');
+		if (!cp)
+			return -1;
+		np = strchrnul(cp, '\n');
+		*cp++ = '\0';
+		item = string_list_append(list, buf);
+
+		buf = np + (*np == '\n');
+		*np = '\0';
+		cp = sq_dequote(cp);
+		if (!cp)
+			return -1;
+		item->util = xstrdup(cp);
+	}
+	return 0;
 }

 /**
@@ -311,44 +303,39 @@ static char *read_shell_var(FILE *fp, const char *key)
 static int read_author_script(struct am_state *state)
 {
 	const char *filename = am_path(state, "author-script");
-	FILE *fp;
+	struct strbuf buf = STRBUF_INIT;
+	struct string_list kv = STRING_LIST_INIT_DUP;
+	int retval = -1; /* assume failure */
+	int fd;

 	assert(!state->author_name);
 	assert(!state->author_email);
 	assert(!state->author_date);

-	fp = fopen(filename, "r");
-	if (!fp) {
+	fd = open(filename, O_RDONLY);
+	if (fd < 0) {
 		if (errno == ENOENT)
 			return 0;
 		die_errno(_("could not open '%s' for reading"), filename);
 	}
+	strbuf_read(&buf, fd, 0);
+	close(fd);
+	if (parse_key_value_squoted(buf.buf, &kv))
+		goto finish;

-	state->author_name = read_shell_var(fp, "GIT_AUTHOR_NAME");
-	if (!state->author_name) {
-		fclose(fp);
-		return -1;
-	}
-
-	state->author_email = read_shell_var(fp, "GIT_AUTHOR_EMAIL");
-	if (!state->author_email) {
-		fclose(fp);
-		return -1;
-	}
-
-	state->author_date = read_shell_var(fp, "GIT_AUTHOR_DATE");
-	if (!state->author_date) {
-		fclose(fp);
-		return -1;
-	}
-
-	if (fgetc(fp) != EOF) {
-		fclose(fp);
-		ret
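For reference, the author-script that both the old read_shell_var() path and the new parse_key_value_squoted() consume is just a series of sq-quoted KEY='VALUE' lines. A minimal illustration (the values below are invented for the example); because every VALUE is single-quoted, the same file is also valid shell input, which is what the original scripted "git am" relied on:

```shell
# Hypothetical author-script in the KEY='VALUE' sq-quoted format that
# parse_key_value_squoted() parses; the values here are made up.
cat >author-script <<'EOF'
GIT_AUTHOR_NAME='A U Thor'
GIT_AUTHOR_EMAIL='author@example.com'
GIT_AUTHOR_DATE='@1234567890 +0000'
EOF

# Since each VALUE is sq-quoted, the file doubles as shell source,
# which is how the scripted "git am" used to read it:
. ./author-script
echo "$GIT_AUTHOR_NAME"   # A U Thor
```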
Re: [PATCH v6 12/13] convert: add filter..process option
Lars Schneider writes: > On 30 Aug 2016, at 20:59, Junio C Hamano wrote: >> >>> "abort" could be ambiguous because it could be read as "abort only >>> this file". "abort-all" would work, though. Would you prefer to see >>> "error" replaced by "abort" and "error-all" by "abort-all"? >> >> No. >> >> I was primarily reacting to "-all" part, so anything that ends with >> "-all" is equally ugly from my point of view and not an improvement. >> >> As I said, "error-all" as long as other reviewers are happy with is >> OK by me, too. > > Now, I see your point. How about "error-and-stop" or "error-stop" > instead of "error-all"? Not really. I was primarily reacting to having "error" and "error-all", that share the same prefix. Changing "-all" suffix to "-stop" does not make all that difference to me. But in any case, as long as other reviewers are happy with it, it is OK by me, too. > Good argument. However, my intention here was to mimic the v1 filter > mechanism. I am not making any argument and in no way against the chosen behaviour. That "intention here" that was revealed after two iterations is something we would want to see as the reasoning behind the choice of the final behaviour recorded somewhere, and now I drew it out of you, I achieved what I set out to do initially ;-) > I am not sure I understand your last sentence. Just to be clear: > Would you prefer it, if Git would just close the pipe to the filter process > on Git exit and leave the filter running? As a potential filter writer, I would probably feel it the easiest to handle if there is no signal involved. My job would be to read from the command pipe, react to it, and when the command pipe gives me EOF, I am expected to exit myself. That would be simple enough. >> I meant it as primarily an example people can learn from when they >> want to write their own. > > I think `t/t0021/rot13-filter.pl` (part of this patch) serves this purpose > already. 
I would expect them to peek at contrib/, but do you seriously expect people to comb through t/ directory?
Re: [PATCH 06/22] sequencer: release memory that was allocated when reading options
Johannes Sixt writes: > Am 30.08.2016 um 19:52 schrieb Johannes Schindelin: >> Right, but that is exactly what I wanted to avoid, because it is rather >> inelegant to strdup() strings just so that we do not have to record what >> to free() and what not to free(). > > Please, excuse, but when I have to choose what is more "elegant": > > 1. strdup() sometimes so that I can later free() always > 2. use sequencer_entrust() > > I would choose 1. at all times. Let's not bring elegance or aesthetics in. It is like asking if garbage collected systems are elegant. I do not think it is a valid question. GC is a good pragmatic compromise after (1) declaring that keeping track of allocation is too cumbersome and programmers' time is better spent elsewhere, and (2) making sure hiccups caused by GC can be minimized to keep the latency down to acceptable levels. There are good compromises and bad ones. Does the proposed entrust() thing qualify as a good pragmatic compromise like GC does? > Particularly in this case: parsing options does not sound like a > major drain of resources, neither CPU- nor memory-wise. If entrust() does not have any CPU- or memory-cost, then comparing it with having to strdup() sometimes might be a useful comparison. But the entrust() thing has the allocation cost in order to track of what needs to be done at runtime, and just like strdup() needs to be done on things that are being borrowed, entrust() needs to be avoided on things that are being borrowed (and the caller needs to be sure not to entrust() the same piece of memory twice), so it does not reduce the cost on programmers' and readers' mental burden--the programmer always has to know which piece of memory is borrowed and which are owned by the subsystem. So...
Re: [PATCH 06/22] sequencer: release memory that was allocated when reading options
W dniu 30.08.2016 o 19:52, Johannes Schindelin pisze: > Hi Kuba, > > On Tue, 30 Aug 2016, Jakub Narębski wrote: > >> W dniu 29.08.2016 o 10:04, Johannes Schindelin pisze: >> >>> The sequencer reads options from disk and stores them in its struct >>> for use during sequencer's operations. >>> >>> With this patch, the memory is released afterwards, plugging a >>> memory leak. >>> >>> Signed-off-by: Johannes Schindelin >>> --- >>> sequencer.c | 13 ++--- >>> 1 file changed, 10 insertions(+), 3 deletions(-) >>> >>> diff --git a/sequencer.c b/sequencer.c >>> index b5be0f9..8d79091 100644 >>> --- a/sequencer.c >>> +++ b/sequencer.c >>> @@ -131,6 +131,8 @@ static void remove_sequencer_state(const struct >>> replay_opts *opts) >>> free(opts->owned[i]); >>> free(opts->owned); >>> >>> + free(opts->xopts); >>> + >> >> This looks like independent change, not related to using the >> sequencer_entrust() to store options read from disk in replay_opts >> struct to be able to free memory afterwards. >> >> I guess you wanted to avoid one line changes... > > Actually, it is not an independent change, but it free()s memory that has > been allocated while reading the options, as the commit message says ;-) > >>> @@ -811,13 +813,18 @@ static int populate_opts_cb(const char *key, const >>> char *value, void *data) >> >> Sidenote: this patch would be easier to read if lines were reordered >> as below, but I don't think any slider heuristics could help achieve >> that automatically. Also, the patch might be invalid... 
>> >>> opts->allow_ff = git_config_bool_or_int(key, value, >>> &error_flag); >>> else if (!strcmp(key, "options.mainline")) >>> opts->mainline = git_config_int(key, value); >>> - else if (!strcmp(key, "options.strategy")) >>> + else if (!strcmp(key, "options.strategy")) { >>> git_config_string(&opts->strategy, key, value); >>> + sequencer_entrust(opts, (char *) opts->strategy); >> >> I wonder if the ability to free strings dup-ed by git_config_string() >> be something that is part of replay_opts, or rather remove_sequencer_state(), >> that is a list of >> >> free(opts->strategy); >> free(opts->gpg_sign); > > That is not necessarily possible because the way sequencer works, the > options may have not actually be read from the file, but may be populated > by the caller (in which case we do not necessarily want to require > strdup()ing the strings just so that the sequencer can clean stuff up > afterwards). I guess from cursory browsing through the Git code that _currently_ they are only read from the config file, where git_config_string() strdup's them, isn't it? And we want to prepare for the future, where the caller would prepare replay_opts, and the caller would be responsible for freeing data if necessary? Would there be any sane situation where some of data should be owned by caller (and freed by caller), and some of data should be owned by sequencer library API (and freed in remove_sequencer_state())? If not, perhaps *_entrust() mechanism is overthinking it, and we simply need 'is_strdup' boolean flag or something like that... >> The *_entrust() mechanism is more generic, but do we use this general-ness? >> Well, it could be xstrdup or git_config_string doing entrust'ing... > > Right, but that is exactly what I wanted to avoid, because it is rather > inelegant to strdup() strings just so that we do not have to record what > to free() and what not to free(). 
Maybe inelegant, but it might be easier than inventing and implementing *_entrust() mechanism, like Hannes wrote. > > BTW I have no objection at all to generalize this sequencer_entrust() > mechanism further (read: to other, similar use cases), should it withstand > the test of time. Yeah, that's my take on it too. -- Jakub Narębski
Re: [PATCH v2] t/Makefile: add a rule to re-run previously-failed tests
On Tue, Aug 30, 2016 at 10:51 PM, Jeff King wrote: > On Tue, Aug 30, 2016 at 10:48:19PM +0200, Ævar Arnfjörð Bjarmason wrote: > >> > -failed: $(patsubst trash,,$(patsubst directory.%,%.sh,$(wildcard trash\ >> > directory.t[0-9]*))) >> > +failed: >> > + @failed=$$(cd '$(TEST_RESULTS_DIRECTORY_SQ)' && \ >> > + grep -l '^failed [1-9]' $$(ls -t *.counts | \ >> > + sed >> > 'G;h;/^\(t[^.]*\)-[0-9]*\..*\n\1-[0-9]*\./d;P;d') | \ >> > + sed -n 's/-[0-9]*\.counts$$/.sh/p') && \ >> > + test -z "$$failed" || $(MAKE) $$failed >> > >> > prove: pre-clean $(TEST_LINT) >> > @echo "*** prove ***"; $(PROVE) --exec '$(SHELL_PATH_SQ)' >> > $(GIT_PROVE_OPTS) $(T) :: $(GIT_TEST_OPTS) >> >> I don't at all mind this solution to the problem, if it works for that's >> cool. >> >> But FWIW something you may have missed is that you can just use >> prove(1) for this, which is why I initially patched git.git to support >> TAP, so I didn't have to implement stuff like this. > > Heh. I think each iteration of this patch will be destined to have > somebody[1] point Johannes at prove. ;) > > (But I really do recommend prove if you can use it). > > -Peff > > [1] http://public-inbox.org/git/20160630063725.gc15...@sigill.intra.peff.net/ Sorry about that, I see it's been mentioned already. My only excuse is that I don't know how to operate my E-Mail client :)
Re: [PATCH v2] t/Makefile: add a rule to re-run previously-failed tests
On Tue, Aug 30, 2016 at 10:48:19PM +0200, Ævar Arnfjörð Bjarmason wrote: > > -failed: $(patsubst trash,,$(patsubst directory.%,%.sh,$(wildcard trash\ > > directory.t[0-9]*))) > > +failed: > > + @failed=$$(cd '$(TEST_RESULTS_DIRECTORY_SQ)' && \ > > + grep -l '^failed [1-9]' $$(ls -t *.counts | \ > > + sed > > 'G;h;/^\(t[^.]*\)-[0-9]*\..*\n\1-[0-9]*\./d;P;d') | \ > > + sed -n 's/-[0-9]*\.counts$$/.sh/p') && \ > > + test -z "$$failed" || $(MAKE) $$failed > > > > prove: pre-clean $(TEST_LINT) > > @echo "*** prove ***"; $(PROVE) --exec '$(SHELL_PATH_SQ)' > > $(GIT_PROVE_OPTS) $(T) :: $(GIT_TEST_OPTS) > > I don't at all mind this solution to the problem, if it works for that's cool. > > But FWIW something you may have missed is that you can just use > prove(1) for this, which is why I initially patched git.git to support > TAP, so I didn't have to implement stuff like this. Heh. I think each iteration of this patch will be destined to have somebody[1] point Johannes at prove. ;) (But I really do recommend prove if you can use it). -Peff [1] http://public-inbox.org/git/20160630063725.gc15...@sigill.intra.peff.net/
Re: [PATCH v2] t/Makefile: add a rule to re-run previously-failed tests
On Mon, Aug 29, 2016 at 3:46 PM, Johannes Schindelin wrote:
> While developing patch series, it is a good practice to run the test
> suite from time to time, just to make sure that obvious bugs are caught
> early. With complex patch series, it is common to run `make -j15 -k
> test`, i.e. run the tests in parallel and *not* stop at the first
> failing test but continue. This has the advantage of identifying
> possibly multiple problems in one big test run.
>
> It is particularly important to reduce the turn-around time thusly on
> Windows, where the test suite spends 45 minutes on the computer on which
> this patch was developed.
>
> It is the most convenient way to determine which tests failed after
> running the entire test suite, in parallel, to look for left-over "trash
> directory.t*" subdirectories in the t/ subdirectory. However, as was
> pointed out by Jeff King, those directories might live outside t/ when
> overridden using the --root= option, to which the Makefile
> has no access. The next best method is to grep explicitly for failed
> tests in the test-results/ directory, which the Makefile *can* access.
>
> This patch automates the process of determining which tests failed
> previously and re-running them.
>
> Note that we need to be careful to inspect only the *newest* entries in
> test-results/: this directory contains files of the form
> t--.counts and is only removed wholesale when running the
> *entire* test suite, not when running individual tests. We ensure that
> with a little sed magic on `ls -t`'s output that simply skips lines
> when the file name was seen earlier.
>
> Signed-off-by: Johannes Schindelin
> ---
>
> The patch is unfortunately no longer as trivial as before, but it
> now also works with --root=..., i.e. when the user overrode the
> location of the trash directories.
>
> Published-As: https://github.com/dscho/git/releases/tag/failing-tests-v2
> Fetch-It-Via: git fetch https://github.com/dscho/git failing-tests-v2
> Interdiff vs v1:
>
>  diff --git a/t/Makefile b/t/Makefile
>  index c402a9ec..8aa6a72 100644
>  --- a/t/Makefile
>  +++ b/t/Makefile
>  @@ -35,7 +35,12 @@ all: $(DEFAULT_TEST_TARGET)
>   test: pre-clean $(TEST_LINT)
>   	$(MAKE) aggregate-results-and-cleanup
>
>  -failed: $(patsubst trash,,$(patsubst directory.%,%.sh,$(wildcard trash\ directory.t[0-9]*)))
>  +failed:
>  +	@failed=$$(cd '$(TEST_RESULTS_DIRECTORY_SQ)' && \
>  +		grep -l '^failed [1-9]' $$(ls -t *.counts | \
>  +			sed 'G;h;/^\(t[^.]*\)-[0-9]*\..*\n\1-[0-9]*\./d;P;d') | \
>  +		sed -n 's/-[0-9]*\.counts$$/.sh/p') && \
>  +	test -z "$$failed" || $(MAKE) $$failed
>
>   prove: pre-clean $(TEST_LINT)
>   	@echo "*** prove ***"; $(PROVE) --exec '$(SHELL_PATH_SQ)' $(GIT_PROVE_OPTS) $(T) :: $(GIT_TEST_OPTS)

I don't at all mind this solution to the problem; if it works, that's cool.

But FWIW something you may have missed is that you can just use
prove(1) for this, which is why I initially patched git.git to support
TAP, so I didn't have to implement stuff like this. I.e.:

    $ prove --state=save t[0-9]*.sh
    $ prove --state=failed,save t[0-9]*.sh

Does exactly what you're trying to do here with existing tools we
support. I.e. your new target could just be implemented in terms of
calling prove.

Check out its man page, it may have other stuff you want to use, e.g.
you can run the slowest tests first etc.
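For anyone puzzling over the "little sed magic" in the failed target: it filters the newest-first output of `ls -t` down to the newest *.counts file per test, deleting any later (hence older) line whose test prefix was already seen in the hold space. A stand-alone check with invented file names (assumes GNU sed, where `.` matches embedded newlines in the pattern space):

```shell
# The hold space (G;h) accumulates every line seen so far; a line whose
# tNNNN prefix already appears there is deleted, so only the newest
# .counts file per test survives. File names below are invented.
printf '%s\n' \
	't0001-200.counts' \
	't0002-300.counts' \
	't0001-100.counts' |
sed 'G;h;/^\(t[^.]*\)-[0-9]*\..*\n\1-[0-9]*\./d;P;d'
# keeps t0001-200.counts and t0002-300.counts; drops the older t0001-100
```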
Re: [PATCH 06/22] sequencer: release memory that was allocated when reading options
Am 30.08.2016 um 19:52 schrieb Johannes Schindelin: Right, but that is exactly what I wanted to avoid, because it is rather inelegant to strdup() strings just so that we do not have to record what to free() and what not to free(). Please, excuse, but when I have to choose what is more "elegant": 1. strdup() sometimes so that I can later free() always 2. use sequencer_entrust() I would choose 1. at all times. Particularly in this case: parsing options does not sound like a major drain of resources, neither CPU- nor memory-wise. -- Hannes
Re: [PATCH v6 12/13] convert: add filter..process option
Hello, W dniu 30.08.2016 o 20:59, Junio C Hamano pisze: > Lars Schneider writes: [...] >> The filter can exit right after the "error-all". If the filter does >> not exit then Git will kill the filter. I'll add this to the docs. > > OK. Is it explicit kill, or implicit kill: close pipe and end process? >> "abort" could be ambiguous because it could be read as "abort only >> this file". "abort-all" would work, though. Would you prefer to see >> "error" replaced by "abort" and "error-all" by "abort-all"? > > No. > > I was primarily reacting to "-all" part, so anything that ends with > "-all" is equally ugly from my point of view and not an improvement. > > As I said, "error-all" as long as other reviewers are happy with is > OK by me, too. I'm rather partial to "abort" instead of "error-all", or "quit"/"exit" (but I prefer "abort" or "bail-out"), as it better reflects what this response is about - ending filter process. >> A filter that dies during communication or does not adhere to the protocol >> is a faulty filter. Feeding the faulty filter after restart with the same >> blob would likely cause the same error. > > Why does Git restart it and continue with the rest of the blobs > then? Is there a reason to believe it may produce correct results > for other blobs if you run it again? I guess the idea is that single filter can be run on different types of blobs, and it could fail on some types (some files) and not others. Like LFS-type filter failing on files with size larger than 2 GiB, or iconv-like filter to convert UTF-16 to UTF-8 failing on invalid byte sequences. >> B) If we communicate "shutdown" to the filter then we need to give the >>filter some time to perform the exit before the filter is killed on >>Git exit. I wasn't able to come up with a good answer how long Git >>should wait for the exit. > > Would that be an issue? A filter is buggy if told to shutdown, > ignores the request and hangs around forever. I do not think we > even need to kill it. 
It is not Git's problem. I think the idea was for Git to request shutdown, and filter to tell when it finished (or just exiting, and Git getting SIGCHLD, I guess). It is hard to tell how much to wait, for example for filter process to send "sync" command, or finish unmounting, or something... > I personally would be reluctant to become a filter process writer if > the system it will be talking to can kill it without giving it a > chance to do a graceful exit, but perhaps that is just me. I don't > know if closing the pipe going there where you are having Git to > kill the process on the other end is any more involved than what you > have without extra patches. Isn't it the same problem with v1 filters being killed on Git process exit? Unless v2 filter wants to do something _after_ writing output to Git, it should be safe... unless Git process is killed, but it is not different from filter being stray-killed. +Please note that you cannot use an existing `filter..clean` +or `filter..smudge` command with `filter..process` +because the former two use a different inter process communication +protocol than the latter one. >>> >>> Would it be a useful sample program we can ship in contrib/ if you >>> created a "filter adapter" that reads these two configuration >>> variables and act as a filter..process? >> >> You mean a v2 filter that would use v1 filters under the hood? >> If we would drop v1, then this would be useful. Otherwise I don't >> see any need for such a thing. > > I meant it as primarily an example people can learn from when they > want to write their own. Errr... if it were any v1 filter, it would be useless (as bad or worse performance than ordinary v1 clean or smudge filter). It might make sense if v1 filter and v2 wrapper were written in the same [dynamic] language, and wrapper could just load / require filter as a function / procedure, [compile it], and use it. For example smudge/clean filter in Perl (e.g. rot13), and v2 wrapper in Perl too. 
Best, -- Jakub Narębski
Re: [PATCH v6 12/13] convert: add filter..process option
On 30 Aug 2016, at 20:59, Junio C Hamano wrote: > >> "abort" could be ambiguous because it could be read as "abort only >> this file". "abort-all" would work, though. Would you prefer to see >> "error" replaced by "abort" and "error-all" by "abort-all"? > > No. > > I was primarily reacting to "-all" part, so anything that ends with > "-all" is equally ugly from my point of view and not an improvement. > > As I said, "error-all" as long as other reviewers are happy with is > OK by me, too. Now, I see your point. How about "error-and-stop" or "error-stop" instead of "error-all"? >> A filter that dies during communication or does not adhere to the protocol >> is a faulty filter. Feeding the faulty filter after restart with the same >> blob would likely cause the same error. > > Why does Git restart it and continue with the rest of the blobs > then? Is there a reason to believe it may produce correct results > for other blobs if you run it again? Good argument. However, my intention here was to mimic the v1 filter mechanism. If a filter fails then Git will run the same v1 filter on the next file that needs to be filtered. If `filter..required=false` then a failure is even a legitimate result. >> B) If we communicate "shutdown" to the filter then we need to give the >> filter some time to perform the exit before the filter is killed on >> Git exit. I wasn't able to come up with a good answer how long Git >> should wait for the exit. > > Would that be an issue? A filter is buggy if told to shutdown, > ignores the request and hangs around forever. I do not think we > even need to kill it. It is not Git's problem. > I personally would be reluctant to become a filter process writer if > the system it will be talking to can kill it without giving it a > chance to do a graceful exit, but perhaps that is just me. 
I don't > know if closing the pipe going there where you are having Git to > kill the process on the other end is any more involved than what you > have without extra patches. I am not sure I understand your last sentence. Just to be clear: Would you prefer it, if Git would just close the pipe to the filter process on Git exit and leave the filter running? I think I wouldn't even need to close the pipe explicitly. The pipe would be closed automatically when git exits or dies, right? The filter could detect EOF and exit on its own. That would be OK with me. +Please note that you cannot use an existing `filter..clean` +or `filter..smudge` command with `filter..process` +because the former two use a different inter process communication +protocol than the latter one. >>> >>> Would it be a useful sample program we can ship in contrib/ if you >>> created a "filter adapter" that reads these two configuration >>> variables and act as a filter..process? >> >> You mean a v2 filter that would use v1 filters under the hood? >> If we would drop v1, then this would be useful. Otherwise I don't >> see any need for such a thing. > > I meant it as primarily an example people can learn from when they > want to write their own. I think `t/t0021/rot13-filter.pl` (part of this patch) serves this purpose already. Thanks, Lars
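The shutdown-by-EOF model discussed above can be sketched as a loop that serves requests until the pipe closes, then exits on its own, with no kill signal needed. This is a sketch only: the command names are invented, and the real v2 filter protocol uses pkt-line framing rather than plain lines.

```shell
# Sketch of a long-running filter: handle commands until Git closes the
# pipe, then shut down cleanly. Command names are invented; the actual
# protocol frames messages with pkt-line.
filter_loop() {
	while IFS= read -r cmd; do
		echo "handled: $cmd"	# stand-in for real smudge/clean work
	done
	# read hit EOF: the other end closed the pipe, so exit gracefully
	echo "eof: exiting"
}

printf 'smudge a.txt\nclean b.txt\n' | filter_loop
```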
FW: GIT Support Partners
Hi- I'm wondering if anyone could suggest a GIT support partner(s)? The community is great, but I'm looking for a more personalized GIT support experience. Thanks! -Scott Sobstad Scott Sobstad Director-Application Support,TSG JDA Software 20700 Swenson Dr, Waukesha, WI 53186 / United States 1.262.317.2185 / office 1.480.308.3000 / worldwide scott.sobs...@jda.com visit us / jda.com
[PATCH] git-send-email: Add ability to cc: any "trailers" from commit message
Many commits have various forms of trailers similar to
"Acked-by: Name" and "Reported-by: Name".

Add the ability to cc these trailers when using git send-email.

This can be suppressed with --suppress-cc=trailers.

Signed-off-by: Joe Perches
---
 Documentation/git-send-email.txt | 10 ++
 git-send-email.perl              | 16 +++-
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/Documentation/git-send-email.txt b/Documentation/git-send-email.txt
index 642d0ef..999c842 100644
--- a/Documentation/git-send-email.txt
+++ b/Documentation/git-send-email.txt
@@ -278,9 +278,10 @@ Automating
 	value of `sendemail.identity`.

 --[no-]signed-off-by-cc::
-	If this is set, add emails found in Signed-off-by: or Cc: lines to the
-	cc list. Default is the value of `sendemail.signedoffbycc` configuration
-	value; if that is unspecified, default to --signed-off-by-cc.
+	If this is set, add emails found in Signed-off-by: or Cc: or any other
+	trailer -by: lines to the cc list. Default is the value of
+	`sendemail.signedoffbycc` configuration value; if that is unspecified,
+	default to --signed-off-by-cc.

 --[no-]cc-cover::
 	If this is set, emails found in Cc: headers in the first patch of
@@ -307,8 +308,9 @@ Automating
 	patch body (commit message) except for self (use 'self' for that).
 - 'sob' will avoid including anyone mentioned in Signed-off-by lines except
 	for self (use 'self' for that).
+- 'trailers' will avoid including anyone mentioned in any "-by:" lines.
 - 'cccmd' will avoid running the --cc-cmd.
-- 'body' is equivalent to 'sob' + 'bodycc'
+- 'body' is equivalent to 'sob' + 'bodycc' + 'trailers'
 - 'all' will suppress all auto cc values.
 --
+

diff --git a/git-send-email.perl b/git-send-email.perl
index da81be4..255465a 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -84,7 +84,7 @@ git send-email --dump-aliases
 --identity* Use the sendemail. options.
 --to-cmd * Email To: via ` \$patch_path`
 --cc-cmd * Email Cc: via ` \$patch_path`
---suppress-cc * author, self, sob, cc, cccmd, body, bodycc, all.
+--suppress-cc * author, self, sob, cc, cccmd, body, bodycc, trailers, all.
 --[no-]cc-cover* Email Cc: addresses in the cover letter.
 --[no-]to-cover* Email To: addresses in the cover letter.
 --[no-]signed-off-by-cc* Send to Signed-off-by: addresses. Default on.
@@ -431,13 +431,13 @@ my(%suppress_cc);
 if (@suppress_cc) {
 	foreach my $entry (@suppress_cc) {
 		die "Unknown --suppress-cc field: '$entry'\n"
-			unless $entry =~ /^(?:all|cccmd|cc|author|self|sob|body|bodycc)$/;
+			unless $entry =~ /^(?:all|cccmd|cc|author|self|sob|body|bodycc|trailers)$/;
 		$suppress_cc{$entry} = 1;
 	}
 }

 if ($suppress_cc{'all'}) {
-	foreach my $entry (qw (cccmd cc author self sob body bodycc)) {
+	foreach my $entry (qw (cccmd cc author self sob body bodycc trailers)) {
 		$suppress_cc{$entry} = 1;
 	}
 	delete $suppress_cc{'all'};
@@ -448,7 +448,7 @@ $suppress_cc{'self'} = $suppress_from if defined $suppress_from;
 $suppress_cc{'sob'} = !$signed_off_by_cc if defined $signed_off_by_cc;

 if ($suppress_cc{'body'}) {
-	foreach my $entry (qw (sob bodycc)) {
+	foreach my $entry (qw (sob bodycc trailers)) {
 		$suppress_cc{$entry} = 1;
 	}
 	delete $suppress_cc{'body'};
@@ -1545,7 +1545,7 @@ foreach my $t (@files) {
 	# Now parse the message body
 	while(<$fh>) {
 		$message .= $_;
-		if (/^(Signed-off-by|Cc): (.*)$/i) {
+		if (/^(Signed-off-by|Cc|[^\s]+[_-]by): (.*)$/i) {
 			chomp;
 			my ($what, $c) = ($1, $2);
 			chomp $c;
@@ -1555,6 +1555,12 @@ foreach my $t (@files) {
 			} else {
 				next if $suppress_cc{'sob'} and $what =~ /Signed-off-by/i;
 				next if $suppress_cc{'bodycc'} and $what =~ /Cc/i;
+				next if $suppress_cc{'trailers'} and $what !~ /Signed-off-by/i && $what =~ /by$/i;
+			}
+			if ($c !~ /.+@.+/) {
+				printf("(body) Ignoring %s from line '%s'\n",
+					$what, $_) unless $quiet;
+				next;
 			}
 			push @cc, $c;
 			printf("(body) Adding cc: %s from line '%s'\n",
-- 
2.10.0.rc2.1.gb2aa91d
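The widened trailer match above — `/^(Signed-off-by|Cc|[^\s]+[_-]by): (.*)$/i` — can be approximated with grep to see which lines it would pick up. The sample trailer lines below are invented for illustration:

```shell
# Roughly the Perl pattern from the patch, rewritten as a POSIX ERE;
# the sample trailer lines are made up.
printf '%s\n' \
	'Signed-off-by: A U Thor' \
	'Acked-by: B Odd' \
	'Reported-by: C Dee' \
	'Fixes: 1234567 ("some commit")' |
grep -Ei '^(signed-off-by|cc|[^[:space:]]+[_-]by): '
# matches the three *-by: trailers but not the Fixes: line
```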
Re: [PATCH v14 14/27] bisect--helper: `bisect_next` and `bisect_auto_next` shell function in C
Hey Junio, On Wed, Aug 31, 2016 at 1:03 AM, Junio C Hamano wrote: > Pranit Bauva writes: > >> This is a very tricky one. I have purposely not included this after a >> lot of testing. I have hand tested with the original git and with this >> branch. The reason why anyone wouldn't be able to catch this is >> because its not covered in the test suite. I am including a patch with >> this as an attachment (because I am behind a proxy right now but don't >> worry I will include this as a commit in the next series). The >> original behaviour of git does not clean the bisect state when this >> situation occurs. > > "We sometimes clean and we sometimes don't and this follows the > original" may be a valid justification but it is not a very useful > explanation. The real issue is if not cleaning is intended (and if > so why; otherwise, if it is clear that it is simply forgotten, we > can just fix it in the series as a follow-up step). I think we do want to recover from this "bad merge base" state and for that not cleaning up is essential. The original behaviour of git seems natural to me. > If not cleaning in some cases (but not others) is the right thing, > at least there needs an in-code comment to warn others against > "fixing" the lack of cleanups (e.g. "don't clean state here, because > the caller still wants to see what state we were for this and that > reason"). I will mention it in the comments. + if (bisect_next_check(terms, terms->term_good.buf)) + return -1; >>> >>> Mental note. The "autostart" in the original is gone. Perhaps it >>> is done by next_check in this code, but we haven't seen that yet. >> >> This will be added back again in a coming patch[1]. > > In other words, this series is broken at this step and the breakage > stay with the codebase until a later step? Broken though it passes the test suite. > I do not know if reordering the patches in the series is enough to > fix that, or if it is even worth to avoid such a temporary breakage. 
> It depends on how serious the circularity is, but I would understand > if it is too hard and not worth the effort (I think in a very early > review round some people advised against the bottom-up rewrite > because they anticipated this exact reason). At least the omission > (and later resurrection) needs to be mentioned in the log so that > people understand what is going on when they later need to bisect. bisect_autostart() call from bisect_next() was introduced in the earliest version of git-bisect in the commit 8cc6a0831 but it isn't really a very big deal and I think it would be OK to skip it for a very little while as any which ways the series in the end would fix it. It is important to mention this in the commit message and I will do it. Regards, Pranit Bauva
Re: [PATCH v14 14/27] bisect--helper: `bisect_next` and `bisect_auto_next` shell function in C
Pranit Bauva writes: > This is a very tricky one. I have purposely not included this after a > lot of testing. I have hand tested with the original git and with this > branch. The reason why anyone wouldn't be able to catch this is > because its not covered in the test suite. I am including a patch with > this as an attachment (because I am behind a proxy right now but don't > worry I will include this as a commit in the next series). The > original behaviour of git does not clean the bisect state when this > situation occurs. "We sometimes clean and we sometimes don't and this follows the original" may be a valid justification but it is not a very useful explanation. The real issue is if not cleaning is intended (and if so why; otherwise, if it is clear that it is simply forgotten, we can just fix it in the series as a follow-up step). If not cleaning in some cases (but not others) is the right thing, at least there needs an in-code comment to warn others against "fixing" the lack of cleanups (e.g. "don't clean state here, because the caller still wants to see what state we were for this and that reason"). >>> + if (bisect_next_check(terms, terms->term_good.buf)) >>> + return -1; >> >> Mental note. The "autostart" in the original is gone. Perhaps it >> is done by next_check in this code, but we haven't seen that yet. > > This will be added back again in a coming patch[1]. In other words, this series is broken at this step and the breakage stay with the codebase until a later step? I do not know if reordering the patches in the series is enough to fix that, or if it is even worth to avoid such a temporary breakage. It depends on how serious the circularity is, but I would understand if it is too hard and not worth the effort (I think in a very early review round some people advised against the bottom-up rewrite because they anticipated this exact reason). 
At least the omission (and later resurrection) needs to be mentioned in the log so that people understand what is going on when they later need to bisect. Thanks.
Re: [PATCH v2] t/Makefile: add a rule to re-run previously-failed tests
Jeff King writes: > Hmm, interesting. Your approach seems reasonable, but I have to wonder > if writing the pid in the first place is sane. > > I started to write up my reasoning in this email, but realized it was > rapidly becoming the content of a commit message. So here is that > commit. Sounds sensible; if this makes Dscho's "which ones failed in the previous run" simpler, that is even better ;-)
Re: git am and duplicate signatures
Joe Perches writes: > On Tue, 2016-08-30 at 11:17 -0700, Junio C Hamano wrote: >> Joe Perches writes: >> >> > >> > On Tue, 2016-08-30 at 10:41 -0700, Joe Perches wrote: >> > > >> > > Maybe something like traces or chains. >> > Or "taggers" or "tagged-bys" >> I am afraid that you are way too late; the ship has already sailed a >> few years ago, if not earlier, I think. > > What's the ship's name? Is it footers or trailers? I think we casually call them footers (as they are counter-part to "headers"), or trailers (probably more official as that is half of the name of the subsystem that is supposed to deal with them).
Re: [PATCH v6 12/13] convert: add filter.<driver>.process option
Lars Schneider writes: >> This part of the document is well-written to help filter-writers. > > Thanks! Don't thank me; thank yourself and reviewers of the previous rounds. > The filter can exit right after the "error-all". If the filter does > not exit then Git will kill the filter. I'll add this to the docs. OK. > "abort" could be ambiguous because it could be read as "abort only > this file". "abort-all" would work, though. Would you prefer to see > "error" replaced by "abort" and "error-all" by "abort-all"? No. I was primarily reacting to the "-all" part, so anything that ends with "-all" is equally ugly from my point of view and not an improvement. As I said, "error-all", as long as other reviewers are happy with it, is OK by me, too. > A filter that dies during communication or does not adhere to the protocol > is a faulty filter. Feeding the faulty filter after restart with the same > blob would likely cause the same error. Why does Git restart it and continue with the rest of the blobs then? Is there a reason to believe it may produce correct results for other blobs if you run it again? > B) If we communicate "shutdown" to the filter then we need to give the > filter some time to perform the exit before the filter is killed on > Git exit. I wasn't able to come up with a good answer for how long Git > should wait for the exit. Would that be an issue? A filter is buggy if, told to shut down, it ignores the request and hangs around forever. I do not think we even need to kill it. It is not Git's problem. I personally would be reluctant to become a filter process writer if the system it will be talking to can kill it without giving it a chance to do a graceful exit, but perhaps that is just me. I don't know if closing the pipe going there, where you are having Git kill the process on the other end, is any more involved than what you have without extra patches. 
>>> +Please note that you cannot use an existing `filter.<driver>.clean` >>> +or `filter.<driver>.smudge` command with `filter.<driver>.process` >>> +because the former two use a different inter process communication >>> +protocol than the latter one. >> >> Would it be a useful sample program we can ship in contrib/ if you >> created a "filter adapter" that reads these two configuration >> variables and acts as a `filter.<driver>.process`? > > You mean a v2 filter that would use v1 filters under the hood? > If we dropped v1, then this would be useful. Otherwise I don't > see any need for such a thing. I meant it as primarily an example people can learn from when they want to write their own.
Re: [PATCH v14 14/27] bisect--helper: `bisect_next` and `bisect_auto_next` shell function in C
Hey Junio, On Tue, Aug 30, 2016 at 11:55 PM, Pranit Bauva wrote: > >>> @@ -729,7 +735,7 @@ static struct commit **get_bad_and_good_commits(int >>> *rev_nr) >>> return rev; >>> } >>> >>> -static void handle_bad_merge_base(void) >>> +static int handle_bad_merge_base(void) >>> { >>> if (is_expected_rev(current_bad_oid)) { >>> char *bad_hex = oid_to_hex(current_bad_oid); >>> @@ -750,17 +756,18 @@ static void handle_bad_merge_base(void) >>> "between %s and [%s].\n"), >>> bad_hex, term_bad, term_good, bad_hex, >>> good_hex); >>> } >>> - exit(3); >>> + return 3; >>> } >>> >>> fprintf(stderr, _("Some %s revs are not ancestor of the %s rev.\n" >>> "git bisect cannot work properly in this case.\n" >>> "Maybe you mistook %s and %s revs?\n"), >>> term_good, term_bad, term_good, term_bad); >>> - exit(1); >>> + bisect_clean_state(); >>> + return 1; >> >> What is the logic behind this function sometimes clean the state, >> and some other times do not, when it makes an error-return? We see >> above that "return 3" codepath leaves the state behind. >> >> Either you forgot a necessary clean_state in "return 3" codepath, >> or you forgot to document why the distinction exists in the in-code >> comment for the function. I cannot tell which, but I am leaning >> towards guessing that it is the former. > > This is a very tricky one. I have purposely not included this after a > lot of testing. I have hand tested with the original git and with this > branch. The reason why anyone wouldn't be able to catch this is > because its not covered in the test suite. I am including a patch with > this as an attachment (because I am behind a proxy right now but don't > worry I will include this as a commit in the next series). The > original behaviour of git does not clean the bisect state when this > situation occurs. On another note which you might have missed that > bisect_clean_state() is purposely put before return 1 which is covered > by the test suite. 
You can try removing it and see that there is a > broken test. I was thinking of including the tests after the whole > conversion but now I think including this before will make the > conversion easier for review. The patch which I included as an attachment before will fail for different reasons if you apply it on the master branch. To test it on the master branch use the current attachment. Again, sorry I couldn't include this in the email itself. Regards, Pranit Bauva
Re: [PATCH 06/22] sequencer: release memory that was allocated when reading options
Johannes Schindelin writes: > @@ -811,13 +813,18 @@ static int populate_opts_cb(const char *key, const char > *value, void *data) > opts->allow_ff = git_config_bool_or_int(key, value, > &error_flag); > else if (!strcmp(key, "options.mainline")) > opts->mainline = git_config_int(key, value); > - else if (!strcmp(key, "options.strategy")) > + else if (!strcmp(key, "options.strategy")) { > git_config_string(&opts->strategy, key, value); > - else if (!strcmp(key, "options.gpg-sign")) > + sequencer_entrust(opts, (char *) opts->strategy); > + } > + else if (!strcmp(key, "options.gpg-sign")) { > git_config_string(&opts->gpg_sign, key, value); > + sequencer_entrust(opts, (char *) opts->gpg_sign); > + } > else if (!strcmp(key, "options.strategy-option")) { > ALLOC_GROW(opts->xopts, opts->xopts_nr + 1, opts->xopts_alloc); > - opts->xopts[opts->xopts_nr++] = xstrdup(value); > + opts->xopts[opts->xopts_nr++] = > + sequencer_entrust(opts, xstrdup(value)); > } else > return error(_("Invalid key: %s"), key); Hmm. I would have expected a call to sequencer_opts_clear(&opts) once the machinery is done with the options structure, and among these places where an old value in opts->field is overwritten by a new one would get free(opt->field); opt->field = ... new value ...; Perhaps there was a good reason to do it this way (one valid reason may be that there is _no_ good place to declare that opts is now done and it is safe to call sequencer_opts_clear() on it), but this looks backwards from the way things are usually done.
Re: [PATCH 05/22] sequencer: allow the sequencer to take custody of malloc()ed data
Jakub Narębski writes: > In my personal opinion 'set_me_free_after_use' is not the best name, > but I unfortunately do not have a better proposal. Maybe 'entrust_ptr', > or 'entrusted_data' / 'entrusted_ptr' / 'entrusted'? Is this to accumulate to-be-freed pointers? I think we often call a local variable that points at a piece of memory to be freed "to_free", and that is an appropriate name for what this function is trying to do. It is a bit surprising that the careless memory management in this codepath leaks only the dumb pieces of memory (as opposed to pointers to structures like string list that needs _clear() functions, in which case we cannot get away with list of to-be-freed). I guess we were somewhat lucky ;-)
Re: [PATCH v14 14/27] bisect--helper: `bisect_next` and `bisect_auto_next` shell function in C
Hey Junio, Sorry for a late reply. On Fri, Aug 26, 2016 at 2:00 AM, Junio C Hamano wrote: > Pranit Bauva writes: > >> A lot of parts of bisect.c uses exit() and these signals are then >> trapped in the `bisect_start` function. Since the shell script ceases >> its existence it would be necessary to convert those exit() calls to >> return statements so that errors can be reported efficiently in C code. > > Is efficiency really an issue? I think the real reason is that it > would make it impossible for the callers to handle errors, if you do > not convert and let the error codepaths exit(). I think I put the word "efficiently" wrongly over here. Will omit it. >> @@ -729,7 +735,7 @@ static struct commit **get_bad_and_good_commits(int >> *rev_nr) >> return rev; >> } >> >> -static void handle_bad_merge_base(void) >> +static int handle_bad_merge_base(void) >> { >> if (is_expected_rev(current_bad_oid)) { >> char *bad_hex = oid_to_hex(current_bad_oid); >> @@ -750,17 +756,18 @@ static void handle_bad_merge_base(void) >> "between %s and [%s].\n"), >> bad_hex, term_bad, term_good, bad_hex, >> good_hex); >> } >> - exit(3); >> + return 3; >> } >> >> fprintf(stderr, _("Some %s revs are not ancestor of the %s rev.\n" >> "git bisect cannot work properly in this case.\n" >> "Maybe you mistook %s and %s revs?\n"), >> term_good, term_bad, term_good, term_bad); >> - exit(1); >> + bisect_clean_state(); >> + return 1; > > What is the logic behind this function sometimes clean the state, > and some other times do not, when it makes an error-return? We see > above that "return 3" codepath leaves the state behind. > > Either you forgot a necessary clean_state in "return 3" codepath, > or you forgot to document why the distinction exists in the in-code > comment for the function. I cannot tell which, but I am leaning > towards guessing that it is the former. This is a very tricky one. I have purposely not included this after a lot of testing. 
I have hand tested with the original git and with this branch. The reason why anyone wouldn't be able to catch this is because it's not covered in the test suite. I am including a patch with this as an attachment (because I am behind a proxy right now but don't worry I will include this as a commit in the next series). The original behaviour of git does not clean the bisect state when this situation occurs. On another note, you might have missed that bisect_clean_state() is purposely put before "return 1", which is covered by the test suite. You can try removing it and see that there is a broken test. I was thinking of including the tests after the whole conversion but now I think including this before will make the conversion easier for review. >> -static void check_good_are_ancestors_of_bad(const char *prefix, int >> no_checkout) >> +static int check_good_are_ancestors_of_bad(const char *prefix, int >> no_checkout) >> { >> char *filename = git_pathdup("BISECT_ANCESTORS_OK"); >> struct stat st; >> - int fd; >> + int fd, res = 0; >> >> if (!current_bad_oid) >> die(_("a %s revision is needed"), term_bad); > > Can you let it die here? Not really. I should change it to return error(). >> @@ -873,8 +890,11 @@ static void check_good_are_ancestors_of_bad(const char >> *prefix, int no_checkout) >> filename); >> else >> close(fd); >> + >> + return 0; >> done: >> free(filename); >> + return 0; >> } > > Who owns "filename"? The first "return 0" leaves it unfreed, and > when "goto done" is done, it is freed. > > The above two may indicate that "perhaps 'retval + goto finish' > pattern?" is a really relevant suggestion for the earlier steps in > this series. Yes. 
>> if (!all) { >> fprintf(stderr, _("No testable commit found.\n" >> "Maybe you started with bad path parameters?\n")); >> - exit(4); >> + return 4; >> } >> >> bisect_rev = revs.commits->item->object.oid.hash; >> >> if (!hashcmp(bisect_rev, current_bad_oid->hash)) { >> - exit_if_skipped_commits(tried, current_bad_oid); >> + res = exit_if_skipped_commits(tried, current_bad_oid); >> + if (res) >> + return res; >> + >> printf("%s is the first %s commit\n", sha1_to_hex(bisect_rev), >> term_bad); >> show_diff_tree(prefix, revs.commits->item); >> /* This means the bisection process succeeded. */ >> - exit(10); >> + return 10; >> } >> >> nr = all - reaches - 1; >> @@ -1005,7 +1033,11 @@ int bisect_next_all(const char *prefix, int >> no_checkout) >> "Bisecting: %d revisions left to t
Re: git am and duplicate signatures
On Tue, 2016-08-30 at 11:17 -0700, Junio C Hamano wrote: > Joe Perches writes: > > > > > On Tue, 2016-08-30 at 10:41 -0700, Joe Perches wrote: > > > > > > Maybe something like traces or chains. > > Or "taggers" or "tagged-bys" > I am afraid that you are way too late; the ship has already sailed a > few years ago, if not earlier, I think. What's the ship's name? Is it footers or trailers?
Re: Notation for current branch?
ryenus writes: > For now the best use case I can think of is with git-reflog, e.g., > the meaning of `git reflog HEAD` and `git reflog feature-branch` > are quite different, even if I'm currently on the feature-branch, > especially when I want to track the rebase histories (if any). "git reflog" gives you the reflog of HEAD in numbered format (HEAD@{1}, HEAD@{2}, etc.), and "git reflog HEAD@{now}" is an interesting way to tell it "I want that same HEAD reflog, not in numbered but in timed format". "git reflog @{0}" and "git reflog @{now}" give you the reflog of the current branch in these formats.
Re: [PATCH 14/22] sequencer: prepare for rebase -i's commit functionality
Hi Junio, On Tue, 30 Aug 2016, Junio C Hamano wrote: > Johannes Schindelin writes: > > >> > +static char **read_author_script(void) > >> > +{ > >> > +struct strbuf script = STRBUF_INIT; > >> > +int i, count = 0; > >> > +char *p, *p2, **env; > >> > +size_t env_size; > >> > + > >> > +if (strbuf_read_file(&script, rebase_path_author_script(), 256) > >> > <= 0) > >> > +return NULL; > >> > + > >> > +for (p = script.buf; *p; p++) > >> > +if (skip_prefix(p, "'''", (const char **)&p2)) > >> > +strbuf_splice(&script, p - script.buf, p2 - p, > >> > "'", 1); > >> > +else if (*p == '\'') > >> > +strbuf_splice(&script, p-- - script.buf, 1, "", > >> > 0); > >> > +else if (*p == '\n') { > >> > +*p = '\0'; > >> > +count++; > >> > +} > >> > >> Hmph. What is this loop doing? Is it decoding a sq-quoted buffer > >> or something? Don't we have a helper function to do that? > > > > Well, it is not just decoding an sq-quoted buffer, but several lines with > > definitions we sq-quoted ourselves, individually. > > > > The quote.[ch] code currently has no code to dequote lines individually. > > There is a function with exactly the same name in builtin/am.c and I > assume that it is reading from a file with the same format, which > uses a helper to read one variable line at a time. Hopefully that > can be refactored so that more parsing is shared between the two > users. > > Two functions with the same name reading from the same format, even > when they expect to produce the same result in different internal > format, without even being aware of each other is a bad enough > "regression" in maintainability of the code. One of them not even > using sq_dequote() helper and reinventing is even worse. First of all, builtin/am's read_author_script() really, really, really only wants to read stuff into the am_state struct. That alone is already so incompatible that I do not think it can be repaired. 
Further, builtin/am's read_author_script() reads *only* the GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL and GIT_AUTHOR_DATE lines (opening the file three times for the task). And then it *requires* the corresponding values to be sq-quoted. I do not have a use case for this myself, but rebase -i explicitly eval's the author script, so it is conceivable that some enterprisey user makes use of this feature to set other environment variables. The thing is that rebase -i cannot necessarily expect all of those files in its state directory to be under tight control -- in stark contrast to git-am. I could imagine that this could be fixed by abstracting the functionality more, and use your favored sq_dequote() (which may actually fail in case of an enterprisey usage as outlined above), and adapting builtin/am's read_author_script() to make use of that refactored approach. This is a huge task, make no mistake, in particular because we need to ensure compatibility with non-standard usage. So I do not think I can tackle that any time soon. Certainly not as part of an iteration of this here patch series. Ciao, Dscho
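The sq-quoted author-script format under discussion can be sketched outside of C. The following Python is illustrative only — the function names are invented, and it is not git's quote.c or builtin/am.c code — but it shows the quoting that rebase -i writes (embedded single quotes become '\'') and what a strict sq_dequote()-style reader has to undo:

```python
def sq_quote(value):
    # Shell-compatible single-quoting in the style of quote.c's
    # sq_quote_buf(): wrap in '...' and emit each embedded ' as '\''.
    return "'" + value.replace("'", "'\\''") + "'"

def sq_dequote(quoted):
    # Undo the quoting above; reject anything that is not sq-quoted,
    # which is roughly what a strict sq_dequote()-style reader does.
    out, i, n = [], 0, len(quoted)
    while i < n:
        if quoted[i] != "'":
            raise ValueError("not sq-quoted")
        i += 1
        while i < n and quoted[i] != "'":
            out.append(quoted[i])
            i += 1
        if i == n:
            raise ValueError("unterminated quote")
        i += 1  # step past the closing quote
        if i < n:  # an embedded quote must follow as \'
            if quoted[i:i + 2] != "\\'":
                raise ValueError("garbage after closing quote")
            out.append("'")
            i += 2
    return "".join(out)

def parse_author_script(text):
    # One KEY='value' definition per line, e.g. GIT_AUTHOR_NAME='A U Thor'.
    env = {}
    for line in text.splitlines():
        if line:
            key, _, value = line.partition("=")
            env[key] = sq_dequote(value)
    return env
```

A strict reader like this would indeed reject the "enterprisey" files Dscho worries about, which is exactly the compatibility question raised above.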
Re: git am and duplicate signatures
Joe Perches writes: > On Tue, 2016-08-30 at 10:41 -0700, Joe Perches wrote: >> Maybe something like traces or chains. > > Or "taggers" or "tagged-bys" I am afraid that you are way too late; the ship has already sailed a few years ago, if not earlier, I think.
Re: Notation for current branch?
On 30 August 2016 at 03:49, Junio C Hamano wrote: > Jacob Keller writes: > >>> What's wrong with simply using 'HEAD' for scripting? >> >> When you want to display the current branch to the user, e.g. when >> scripting a shell prompt or similar use > > Wait. Even if a hypothetical version of Git understood @@ as "the > current branch", how would you use that notation, instead of HEAD, > to decide what to "display the current branch to the user"? > > You obviously will not be doing > > echo @@ > > You would be doing something around "git symbolic-ref" at the very > least, and wouldn't you be feeding HEAD to that symbolic-ref anyway > while doing so? > For now the best use case I can think of is with git-reflog, e.g., the meaning of `git reflog HEAD` and `git reflog feature-branch` are quite different, even if I'm currently on the feature-branch, especially when I want to track the rebase histories (if any). If there's a notation for the current branch then I don't have to find the name of the current branch, then copy/paste it. However, in this case, maybe git-reflog can have a `-b` option to show the reflog of the current branch (if on any)? - ryenus
Re: [PATCH 2/6] pull: make code more similar to the shell script again
Hi Junio, On Mon, 29 Aug 2016, Junio C Hamano wrote: > Johannes Schindelin writes: > > > +static int require_clean_work_tree(const char *action, const char *hint, > > + int gently) > > { > > struct lock_file *lock_file = xcalloc(1, sizeof(*lock_file)); > > - int do_die = 0; > > + int err = 0; > > > > hold_locked_index(lock_file, 0); > > refresh_cache(REFRESH_QUIET); > > @@ -376,20 +377,26 @@ static void die_on_unclean_work_tree(void) > > rollback_lock_file(lock_file); > > > > if (has_unstaged_changes()) { > > - error(_("Cannot pull with rebase: You have unstaged changes.")); > > - do_die = 1; > > + error(_("Cannot %s: You have unstaged changes."), action); > > ... > > if (!autostash) > > - die_on_unclean_work_tree(); > > + require_clean_work_tree("pull with rebase", > > + "Please commit or stash them.", 0); > > > > if (get_rebase_fork_point(rebase_fork_point, repo, *refspecs)) > > hashclr(rebase_fork_point); > > Splicing an English/C phrase 'pull with rebase' into a > _("localizable %s string") makes the life of the i18n team hard. Hrm. > Can we do this differently? Sure, but not at this stage. Because... > If you are eventually going to expose this function as public API, I > think the right approach would be to enumerate the possible error > conditions this function can diagnose and return them to the caller, > i.e. > > #define WT_STATUS_DIRTY_WORKTREE 01 > #define WT_STATUS_DIRTY_INDEX 02 > > static int require_clean_work_tree(void) > { > int status = 0; > ... > if (has_unstaged_changes()) > status |= WT_STATUS_DIRTY_WORKTREE; > if (has_uncommitted_changes()) > status |= WT_STATUS_DIRTY_INDEX; > return status; > } > > Then die_on_unclean_work_tree() can be made as a thin-wrapper that > calls it and shows the pull-specific error message. This sounds like a good plan, if involved. At this stage, I am really unwilling to introduce such extensive changes, for fear of introducing regressions. 
I will keep it in mind and make those changes, once Git for Windows v2.10.0 is out. Ciao, Dscho
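Junio's suggested split — a generic check that reports *what* is dirty, plus a thin caller-specific wrapper owning the complete, translatable messages — can be sketched language-neutrally. The flag names below come from his email; the rest (Python, stand-in boolean inputs instead of git's index inspection) is invented for illustration:

```python
# Flag names taken from Junio's sketch; the harness around them is an
# invented illustration, not git's actual code.
WT_STATUS_DIRTY_WORKTREE = 0o1
WT_STATUS_DIRTY_INDEX = 0o2

def require_clean_work_tree(has_unstaged, has_uncommitted):
    # Generic check: report *what* is dirty and let the caller decide
    # how (and whether) to complain about it.
    status = 0
    if has_unstaged:
        status |= WT_STATUS_DIRTY_WORKTREE
    if has_uncommitted:
        status |= WT_STATUS_DIRTY_INDEX
    return status

def die_on_unclean_work_tree(has_unstaged, has_uncommitted):
    # Thin pull-specific wrapper: it owns the complete messages, so no
    # English fragment like "pull with rebase" is spliced into a
    # translatable _() string.
    status = require_clean_work_tree(has_unstaged, has_uncommitted)
    errors = []
    if status & WT_STATUS_DIRTY_WORKTREE:
        errors.append("Cannot pull with rebase: You have unstaged changes.")
    if status & WT_STATUS_DIRTY_INDEX:
        errors.append("Cannot pull with rebase: Your index contains "
                      "uncommitted changes.")
    return errors
```

The design point is that each caller formats whole sentences, which is what keeps the strings translatable.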
Re: git submodules implementation question
Uma Srinivasan writes: > I think the following fix is still needed to is_submodule_modified(): > > strbuf_addf(&buf, "%s/.git", path); > git_dir = read_gitfile(buf.buf); > if (!git_dir) { > git_dir = buf.buf; > ==> if (!is_git_directory(git_dir)) { > ==> die("Corrupted .git dir in submodule %s", path); > ==> } > } If it is so important that git_dir is a valid Git repository after git_dir = read_gitfile(buf.buf); if (!git_dir) git_dir = buf.buf; is done to determine what "git_dir" to use, it seems to me that it does not make much sense to check ONLY a dir/.git that is a directory and leave the .git/modules/$name that a dir/.git file points at unchecked. But there is a much bigger problem with the above addition, I think. There also can be a case where dir/ does not even have ".git" in it. A submodule the user is not interested in will just have an empty directory there, and immediately after the three lines I reproduced above, we have this if (!is_directory(git_dir)) { strbuf_release(&buf); return 0; } The added check will break that use case. If anything, that check, if this code needs to verify that "git_dir" points at a valid Git repository, should come _after_ that. Shouldn't "git-status --porcelain" run in the submodule notice that it is not a valid repository and quickly die anyway? Why should we even check before spawning that process in the first place? I might suggest updating prepare_submodule_repo_env() so that the spawned process will *NOT* have to guess where the working tree and repository are, by exporting GIT_DIR (set to the "git_dir" we discover above) and GIT_WORK_TREE (set to "." as cp.dir is set to the path to the working tree of the submodule). That would stop "git status" from guessing (and from being fooled by a corrupted dir/.git that is not a git repository).
Re: [PATCH 06/22] sequencer: release memory that was allocated when reading options
Hi Kuba, On Tue, 30 Aug 2016, Jakub Narębski wrote: > W dniu 29.08.2016 o 10:04, Johannes Schindelin pisze: > > > The sequencer reads options from disk and stores them in its struct > > for use during sequencer's operations. > > > > With this patch, the memory is released afterwards, plugging a > > memory leak. > > > > Signed-off-by: Johannes Schindelin > > --- > > sequencer.c | 13 ++--- > > 1 file changed, 10 insertions(+), 3 deletions(-) > > > > diff --git a/sequencer.c b/sequencer.c > > index b5be0f9..8d79091 100644 > > --- a/sequencer.c > > +++ b/sequencer.c > > @@ -131,6 +131,8 @@ static void remove_sequencer_state(const struct > > replay_opts *opts) > > free(opts->owned[i]); > > free(opts->owned); > > > > + free(opts->xopts); > > + > > This looks like independent change, not related to using the > sequencer_entrust() to store options read from disk in replay_opts > struct to be able to free memory afterwards. > > I guess you wanted to avoid one line changes... Actually, it is not an independent change, but it free()s memory that has been allocated while reading the options, as the commit message says ;-) > > @@ -811,13 +813,18 @@ static int populate_opts_cb(const char *key, const > > char *value, void *data) > > Sidenote: this patch would be easier to read if lines were reordered > as below, but I don't think any slider heuristics could help achieve > that automatically. Also, the patch might be invalid... 
> > opts->allow_ff = git_config_bool_or_int(key, value, > > &error_flag); > > else if (!strcmp(key, "options.mainline")) > > opts->mainline = git_config_int(key, value); > > - else if (!strcmp(key, "options.strategy")) > > + else if (!strcmp(key, "options.strategy")) { > > git_config_string(&opts->strategy, key, value); > > + sequencer_entrust(opts, (char *) opts->strategy); > > I wonder if the ability to free strings dup-ed by git_config_string() > should be something that is part of replay_opts, or rather remove_sequencer_state(), > that is a list of > > free(opts->strategy); > free(opts->gpg_sign); That is not necessarily possible because of the way the sequencer works: the options may not actually have been read from the file, but may be populated by the caller (in which case we do not necessarily want to require strdup()ing the strings just so that the sequencer can clean stuff up afterwards). > Though... free(NULL) is a no-op as per standard, but can we rely on it? We can, and we do. > The *_entrust() mechanism is more generic, but do we use this general-ness? > Well, it could be xstrdup or git_config_string doing entrust'ing... Right, but that is exactly what I wanted to avoid, because it is rather inelegant to strdup() strings just so that we do not have to record what to free() and what not to free(). BTW I have no objection at all to generalizing this sequencer_entrust() mechanism further (read: to other, similar use cases), should it withstand the test of time. Ciao, Johannes
Re: git am and duplicate signatures
On Tue, 2016-08-30 at 10:41 -0700, Joe Perches wrote: > Maybe something like traces or chains. Or "taggers" or "tagged-bys"
Re: git submodules implementation question
Thanks for the patch. Unfortunately, it doesn't help in my case as it invokes the is_submodule_modified() routine which you didn't modify. Here's my call trace #0 is_submodule_modified (path=path@entry=0x17c2f08 "groc", ignore_untracked=0) at submodule.c:939 #1 0x004aa4dc in match_stat_with_submodule ( diffopt=diffopt@entry=0x7fffde18, ce=ce@entry=0x17c2eb0, st=st@entry=0x7fffd840, ce_option=ce_option@entry=0, dirty_submodule=dirty_submodule@entry=0x7fffd83c) at diff-lib.c:81 #2 0x004ab4f5 in run_diff_files (revs=revs@entry=0x7fffd920, option=option@entry=0) at diff-lib.c:217 #3 0x0054c0d4 in wt_status_collect_changes_worktree (s=s@entry=0x7de280 ) at wt-status.c:559 #4 0x0054ecf6 in wt_status_collect (s=s@entry=0x7de280 ) at wt-status.c:678 #5 0x00422171 in cmd_status (argc=, argv=, prefix=0x0) at builtin/commit.c:1390 #6 0x00405abe in run_builtin (argv=, argc=, p=) at git.c:352 #7 handle_builtin (argc=1, argv=0x7fffe570) at git.c:551 #8 0x00405dd8 in run_argv (argv=0x7fffe320, argcp=0x7fffe32c) at git.c:606 #9 cmd_main (argc=1, argc@entry=2, argv=0x7fffe570, argv@entry=0x7fffe568) at git.c:678 #10 0x00405060 in main (argc=2, argv=0x7fffe568) at common-main.c:40 I think the following fix is still needed to is_submodule_modified(): strbuf_addf(&buf, "%s/.git", path); git_dir = read_gitfile(buf.buf); if (!git_dir) { git_dir = buf.buf; ==> if (!is_git_directory(git_dir)) { ==> die("Corrupted .git dir in submodule %s", path); ==> } } Thanks, Uma On Mon, Aug 29, 2016 at 11:23 PM, Jacob Keller wrote: > On Mon, Aug 29, 2016 at 11:09 PM, Jacob Keller wrote: >> On Mon, Aug 29, 2016 at 5:12 PM, Uma Srinivasan >> wrote: >>> This is great! Thanks Jake. If you happen to have the patch ID it >>> would be helpful. >>> >>> Uma >>> >> >> http://public-inbox.org/git/1472236108.28343.5.ca...@intel.com/ > > > Actually correct patch is > http://public-inbox.org/git/20160825233243.30700-6-jacob.e.kel...@intel.com/ > > Thanks, > Jake
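The check ordering under discussion — treat an absent .git as an uninitialized submodule and return quietly, and only die() on a .git that exists but is bogus — can be sketched roughly. This Python is an illustration only: `looks_like_git_dir()` is a crude stand-in for is_git_directory(), not its real logic, and all function names here are invented.

```python
import os

def resolve_git_dir(path):
    # Mimic the discussed code path: <path>/.git may be a gitfile
    # ("gitdir: ..." pointing into .git/modules/<name>) or a directory.
    dotgit = os.path.join(path, ".git")
    if os.path.isfile(dotgit):
        with open(dotgit) as f:
            first = f.readline().strip()
        if first.startswith("gitdir: "):
            return os.path.normpath(os.path.join(path, first[len("gitdir: "):]))
        return None  # a .git file with unexpected contents
    return dotgit

def looks_like_git_dir(git_dir):
    # Crude stand-in for is_git_directory(): just look for the
    # entries a repository always has.
    return (os.path.isdir(git_dir)
            and os.path.exists(os.path.join(git_dir, "HEAD"))
            and os.path.isdir(os.path.join(git_dir, "objects")))

def classify_submodule(path):
    # The ordering Junio argues for: an absent .git means an
    # uninitialized submodule and must stay a quiet early return;
    # only a .git that exists but is bogus deserves a die().
    git_dir = resolve_git_dir(path)
    if git_dir is None:
        return "corrupted"
    if not os.path.exists(git_dir):
        return "unpopulated"
    if not looks_like_git_dir(git_dir):
        return "corrupted"
    return "ok"
```

The point of the ordering is visible here: moving the validity check before the existence check would misreport every uninitialized submodule as corrupt.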
Re: git am and duplicate signatures
On Tue, 2016-08-30 at 10:34 -0700, Junio C Hamano wrote: > Joe Perches writes: > > On Tue, 2016-08-30 at 09:54 -0700, Junio C Hamano wrote: > > > > > > Support for more generic footers was supposed to come when the > > > "interpret-trailers" topic started, but the author of the topic > > > seems to have lost interest before the mechanism has become ready to > > > be integrated in the workflow commands like "am", "commit", "rebase" > > > etc., which is unfortunate. > > I think adding at least an option to git send-email > > allowing auto-cc's for all > > "-by: [name] " > > lines in the commit log would be useful. > > > > Today, only "Signed-off-by" and "CC" lines are > > added to cc's. > > > > I've always called these lines "-by:" lines > > "signatures", but perhaps there's a better name. > I think we casually call them footers (as they are counter-part to > "headers"), or trailers. I think they are neither footers, which would relate more to the email headers, nor trailers. Maybe something like traces or chains. btw: I submitted this a while ago http://www.spinics.net/lists/git/msg162269.html
Re: [PATCH 07/22] sequencer: future-proof read_populate_todo()
Hi Kuba, On Tue, 30 Aug 2016, Jakub Narębski wrote: > W dniu 29.08.2016 o 10:04, Johannes Schindelin pisze: > > > Over the next commits, we will work on improving the sequencer to the > > point where it can process the edit script of an interactive rebase. To > > that end, we will need to teach the sequencer to read interactive > > rebase's todo file. In preparation, we consolidate all places where > > that todo file is needed to call a function that we will later extend. > > > > Signed-off-by: Johannes Schindelin > > --- > > sequencer.c | 18 +++--- > > 1 file changed, 11 insertions(+), 7 deletions(-) > > > > diff --git a/sequencer.c b/sequencer.c > > index 8d79091..982b6e9 100644 > > --- a/sequencer.c > > +++ b/sequencer.c > > @@ -32,6 +32,11 @@ static const char *get_dir(const struct replay_opts > > *opts) > > return git_path_seq_dir(); > > } > > > > +static const char *get_todo_path(const struct replay_opts *opts) > > +{ > > + return git_path_todo_file(); > > +} > > I guess that in the future commit the return value of get_todo_path() > would change depending on what sequencer is used for, cherry-pick or > interactive rebase, that is, contents of replay_opts... Right. > > static int is_rfc2822_line(const char *buf, int len) > > { > > int i; > > @@ -772,25 +777,24 @@ static int parse_insn_buffer(char *buf, struct > > commit_list **todo_list, > > static int read_populate_todo(struct commit_list **todo_list, > > struct replay_opts *opts) > > { > > + const char *todo_file = get_todo_path(opts); > > ...and that's why you have added this temporary variable here, to > not repeat get_todo_path(opts) calculations... ... and to repeat only 9 characters instead of 19... > > - fd = open(git_path_todo_file(), O_RDONLY); > > + fd = open(todo_file, O_RDONLY); > > if (fd < 0) > > - return error_errno(_("Could not open %s"), > > - git_path_todo_file()); > > + return error_errno(_("Could not open %s"), todo_file); > > ... 
So that's why it is s/git_path_todo_file()/todo_file/ replacement, > and not simply... > > > @@ -1064,7 +1068,7 @@ static int sequencer_continue(struct replay_opts > > *opts) > > { > > struct commit_list *todo_list = NULL; > > > > - if (!file_exists(git_path_todo_file())) > > + if (!file_exists(get_todo_path(opts))) > > ...the s/git_path_todo_file()/git_todo_path(opts)/, isn't it? Correct. > Looks good; though I have not checked if all calling sites were converted. Thanks for the review! Johannes
Re: git am and duplicate signatures
Joe Perches writes: > On Tue, 2016-08-30 at 09:54 -0700, Junio C Hamano wrote: >> Support for more generic footers was supposed to come when the >> "interpret-trailers" topic started, but the author of the topic >> seems to have lost interest before the mechanism has become ready to >> be integrated in the workflow commands like "am", "commit", "rebase" >> etc., which is unfortunate. > > I think adding at least an option to git send-email > allowing auto-cc's for all > "-by: [name] " > lines in the commit log would be useful. > > Today, only "Signed-off-by" and "CC" lines are > added to cc's. > > I've always called these lines "-by:" lines > "signatures", but perhaps there's a better name. I think we casually call them footers (as they are counter-part to "headers"), or trailers.
Re: git am and duplicate signatures
Joe Perches writes: > (adding lkml) > > On Tue, 2016-08-30 at 09:54 -0700, Junio C Hamano wrote: >> Joe Perches writes: >> > git-am -s will avoid duplicating the last signature >> > in a patch. >> > >> > But given a developer creates a patch, sends it around for >> > acks/other signoffs, collects signatures and then does >> > a git am -s on a different branch, this sort of sign-off >> > chain is possible: >> > >> >Signed-off-by: Original Developer >> >Acked-by: Random Developer >> >Signed-off-by: Original Developer >> Both are correct, and allowing the earlier one to be duplicated as long as >> there is somebody/something else in between is deliberate. > > linux-kernel has a script (scripts/checkpatch.pl) that > looks for duplicate signatures (-by: [name] ) > > Should the last Signed-off-by: in the commit log be > excluded from this check? That is left for the kernel folks to decide, but excluding only "the last" does not make much sense to me. If you look for only "two consecutive same signatures" and barf, that would be in line with what we have been shooting for to support the "original then random then back to original" example you gave us above.
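The rule Junio describes — barf only on two consecutive identical signatures, so the original/random/original chain above stays legal — could be checked with something like this Python sketch (not checkpatch's actual Perl implementation):

```python
import re

# Match any "Something-by:" trailer line, case-insensitively.
BY_LINE = re.compile(r'^[A-Za-z]+(?:-[A-Za-z]+)*-by:', re.IGNORECASE)

def consecutive_duplicate_signatures(commit_message):
    # Flag a signature only when it immediately repeats the previous
    # signature line, so alternating chains remain legal.
    dups, prev = [], None
    for raw in commit_message.splitlines():
        line = raw.strip()
        tag = line if BY_LINE.match(line) else None
        if tag is not None and tag == prev:
            dups.append(tag)
        prev = tag
    return dups
```

On the thread's example the function returns nothing, while two adjacent identical Signed-off-by lines are reported.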
Re: [PATCH 14/22] sequencer: prepare for rebase -i's commit functionality
Johannes Schindelin writes: >> > +static char **read_author_script(void) >> > +{ >> > + struct strbuf script = STRBUF_INIT; >> > + int i, count = 0; >> > + char *p, *p2, **env; >> > + size_t env_size; >> > + >> > + if (strbuf_read_file(&script, rebase_path_author_script(), 256) <= 0) >> > + return NULL; >> > + >> > + for (p = script.buf; *p; p++) >> > + if (skip_prefix(p, "'''", (const char **)&p2)) >> > + strbuf_splice(&script, p - script.buf, p2 - p, "'", 1); >> > + else if (*p == '\'') >> > + strbuf_splice(&script, p-- - script.buf, 1, "", 0); >> > + else if (*p == '\n') { >> > + *p = '\0'; >> > + count++; >> > + } >> >> Hmph. What is this loop doing? Is it decoding a sq-quoted buffer >> or something? Don't we have a helper function to do that? > > Well, it is not just decoding an sq-quoted buffer, but several lines with > definitions we sq-quoted ourselves, individually. > > The quote.[ch] code currently has no code to dequote lines individually. There is a function with exactly the same name in builtin/am.c and I assume that it is reading from a file with the same format, which uses a helper to read one variable line at a time. Hopefully that can be refactored so that more parsing is shared between the two users. Two functions with the same name reading from the same format, even when they expect to produce the same result in different internal format, without even being aware of each other is a bad enough "regression" in maintainability of the code. One of them not even using sq_dequote() helper and reinventing is even worse.
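The sq-quoted format both functions parse is small enough to sketch. Below is a hypothetical Python illustration of the dequoting step (git's real helper is sq_dequote() in quote.c, written in C; this is not that code):

```python
def sq_dequote(s):
    # Undo shell single-quoting as written to the author-script file:
    # the value is wrapped in single quotes, and an embedded single
    # quote is emitted as the four characters '\'' (close quote,
    # backslash-escaped quote, reopen quote).
    if len(s) < 2 or s[0] != "'" or s[-1] != "'":
        raise ValueError("not sq-quoted: %r" % s)
    return s[1:-1].replace("'\\''", "'")

# GIT_AUTHOR_NAME='Jane O'\''Malley' dequotes back to the raw value
print(sq_dequote("'Jane O'\\''Malley'"))  # Jane O'Malley
```

A shared parser along these lines, reading one sq-quoted variable per line, is what the review asks builtin/am.c and sequencer.c to agree on.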
Re: git am and duplicate signatures
On Tue, 2016-08-30 at 09:54 -0700, Junio C Hamano wrote: > Support for more generic footers was supposed to come when the > "interpret-trailers" topic started, but the author of the topic > seems to have lost interest before the mechanism has become ready to > be integrated in the workflow commands like "am", "commit", "rebase" > etc., which is unfortunate. I think adding at least an option to git send-email allowing auto-cc's for all "-by: [name] " lines in the commit log would be useful. Today, only "Signed-off-by" and "CC" lines are added to cc's. I've always called these lines "-by:" lines "signatures", but perhaps there's a better name. Any preference? From "git send-email --help":
--suppress-cc= Specify an additional category of recipients to suppress the auto-cc of:
· author will avoid including the patch author
· self will avoid including the sender
· cc will avoid including anyone mentioned in Cc lines in the patch header except for self (use self for that).
· bodycc will avoid including anyone mentioned in Cc lines in the patch body (commit message) except for self (use self for that).
· sob will avoid including anyone mentioned in Signed-off-by lines except for self (use self for that).
· cccmd will avoid running the --cc-cmd.
· body is equivalent to sob + bodycc
· all will suppress all auto cc values.
> > > > sequencer.c:append_signoff() has a flag for APPEND_SIGNOFF_DEDUP > Yes, I think this is one of the warts we talked about getting rid of > but haven't got around to it. It is there because "format-patch -s" > was incorrectly written to dedup Signed-off-by: from anywhere in its > early implementation and to keep the same behaviour. We should drop > that flag from append_signoff() function.
Re: [PATCH v2 08/14] sequencer: lib'ify read_and_refresh_cache()
Johannes Schindelin writes: >> With the current set of callers, a caller that notices an error from >> this function will immediately exit without doing any further >> damage. >> >> So in that sense, this is a "safe" conversion. >> >> But is it a sensible conversion? When the caller wants to do >> anything else (e.g. clean-up and try something else, perhaps read >> the index again), the caller can't, as the index is still locked, >> because even though the code knows that the lock will not be >> released until the process exit, it chose to return error without >> releasing the lock. > > It depends what the caller wants to do. The case about which I care most > is when some helpful advice should be printed (see e.g. 3be18b4 (t5520: > verify that `pull --rebase` shows the helpful advice when failing, > 2016-07-26)). Those callers do not need to care, as the atexit() handler > will clean up the lock file. > > However, I am sympathetic to your angle, even if I do not expect any such > caller to arise anytime soon. We just fixed a similar "if the_index hasn't been read, read unconditionally from $GIT_INDEX_FILE" behaviour that is reached by a codepath specifically designed to read from a temporary index file, while reviewing a separate topic, and that is where my reaction "this is not very helpful for other callers" comes from.
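The concern here — that returning an error while still holding the lock leaves later callers unable to retry — can be made concrete with a toy lockfile model in Python (illustrative names only, not git's actual lockfile API):

```python
import os
import tempfile

class LockFile:
    # Toy model of a lockfile: if an error path returns without rolling
    # the lock back, the lock stays held until process exit and no
    # caller can retry.
    def __init__(self, path):
        self.lock_path = path + ".lock"
        # O_CREAT|O_EXCL makes acquisition atomic: a second taker fails
        self.fd = os.open(self.lock_path,
                          os.O_CREAT | os.O_EXCL | os.O_WRONLY)

    def rollback(self):
        os.close(self.fd)
        os.unlink(self.lock_path)

path = os.path.join(tempfile.mkdtemp(), "index")
lock = LockFile(path)
try:
    raise OSError("simulated read_and_refresh_cache() failure")
except OSError:
    lock.rollback()  # release before reporting the error, so retry works

print(os.path.exists(path + ".lock"))  # False
```

With the rollback on the error path, a caller that catches the failure is free to take the lock again and try something else.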
Re: [PATCH v2 10/14] sequencer: lib'ify read_populate_opts()
Johannes Schindelin writes: > I still think that it is a serious mistake for library functions to die(). > But I have no time to take care of git_parse_source() right now. I think we already agreed on both points during the previous round of review ;-)
Re: git am and duplicate signatures
(adding lkml) On Tue, 2016-08-30 at 09:54 -0700, Junio C Hamano wrote: > Joe Perches writes: > > git-am -s will avoid duplicating the last signature > > in a patch. > > > > But given a developer creates a patch, send it around for > > acks/other signoffs, collects signatures and then does > > a git am -s on a different branch, this sort of sign-off > > chain is possible: > > > > Signed-off-by: Original Developer > > Acked-by: Random Developer > > Signed-off-by: Original Developer > Both correct and allowing the earlier one duplicated as long as > there is somebody/something else in between is deliberate. linux-kernel has a script (scripts/checkpatch.pl) that looks for duplicate signatures (-by: [name] ) Should the last Signed-off-by: in the commit log be excluded from this check? > > Should there be an option to avoid duplicate signatures > > in a sequence where an author can git-am the same patch? > I dunno. The way "Signed-off-by" is handled is designed > specifically to support the meaning of that footer, namely to record > where it originated and whose hands it passed, used in the kernel > and Git land. Other projects certainly may have need for footers > that denote different things that want different semantics (e.g. Who > authored it and who cheered on it), but that is outside the scope of > the "Signed-off-by" supported by "am -s" and "commit -s". > > Support for more generic footers was supposed to come when the > "interpret-trailers" topic started, but the author of the topic > seems to have lost interest before the mechanism has become ready to > be integrated in the workflow commands like "am", "commit", "rebase" > etc., which is unfortunate. > > > > > sequencer.c:append_signoff() has a flag for APPEND_SIGNOFF_DEDUP > Yes, I think this is one of the warts we talked about getting rid of > but haven't got around to it. 
It is there because "format-patch -s" > was incorrectly written to dedup Signed-off-by: from anywhere in its > early implementation and to keep the same behaviour. We should drop > that flag from append_signoff() function.
Re: [PATCH v6 10/13] convert: generate large test files only once
Lars Schneider writes: > True, but applying rot13 (via tr ...) on 20+ MB takes quite a bit of > time. That's why I came up with the 1M SP in between. Ah, OK, that makes sense (it may not necessarily make it a good decision, but it certainly explains it). > However, I realized that testing a large amount of data is not really > necessary for the final series. A single packet is 64k. A 500k pseudo random > test file should be sufficient. This will make the test way simpler. ;-) Thanks.
Re: git am and duplicate signatures
Joe Perches writes: > git-am -s will avoid duplicating the last signature > in a patch. > > But given a developer creates a patch, send it around for > acks/other signoffs, collects signatures and then does > a git am -s on a different branch, this sort of sign-off > chain is possible: > > Signed-off-by: Original Developer > Acked-by: Random Developer > Signed-off-by: Original Developer Both correct and allowing the earlier one duplicated as long as there is somebody/something else in between is deliberate. > Should there be an option to avoid duplicate signatures > in a sequence where an author can git-am the same patch? I dunno. The way "Signed-off-by" is handled is designed specifically to support the meaning of that footer, namely to record where it originated and whose hands it passed, used in the kernel and Git land. Other projects certainly may have need for footers that denote different things that want different semantics (e.g. Who authored it and who cheered on it), but that is outside the scope of the "Signed-off-by" supported by "am -s" and "commit -s". Support for more generic footers was supposed to come when the "interpret-trailers" topic started, but the author of the topic seems to have lost interest before the mechanism has become ready to be integrated in the workflow commands like "am", "commit", "rebase" etc., which is unfortunate. > sequencer.c:append_signoff() has a flag for APPEND_SIGNOFF_DEDUP Yes, I think this is one of the warts we talked about getting rid of but haven't got around to it. It is there because "format-patch -s" was incorrectly written to dedup Signed-off-by: from anywhere in its early implementation and to keep the same behaviour. We should drop that flag from append_signoff() function.
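The policy Junio describes — drop only a consecutive duplicate, deliberately keep an earlier one when somebody else's trailer sits in between — can be sketched like this (a Python model of the rule, not git's C implementation):

```python
def append_signoff(trailers, signoff):
    # Suppress the new sign-off only when it would repeat the *last*
    # trailer; an earlier duplicate with somebody else's trailer in
    # between is deliberately kept.
    if not trailers or trailers[-1] != signoff:
        trailers.append(signoff)

log = ["Signed-off-by: Original Developer",
       "Acked-by: Random Developer"]

append_signoff(log, "Signed-off-by: Original Developer")
print(len(log))  # 3: duplicating a non-final trailer is allowed

append_signoff(log, "Signed-off-by: Original Developer")
print(len(log))  # 3: a consecutive duplicate is dropped
```

This is exactly the "original then random then back to original" chain from the question: the second sign-off records that the patch passed through the original developer's hands again.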
Re: [PATCH v6 10/13] convert: generate large test files only once
On Tue, Aug 30, 2016 at 01:41:59PM +0200, Lars Schneider wrote: > >> + git checkout -- test test.t test.i && > >> + > >> + mkdir generated-test-data && > >> + for i in $(test_seq 1 $T0021_LARGE_FILE_SIZE) > >> + do > >> + RANDOM_STRING="$(test-genrandom end $i | tr -dc "A-Za-z0-9" )" > >> + ROT_RANDOM_STRING="$(echo $RANDOM_STRING | ./rot13.sh )" > > > > In earlier iteration of loop with lower $i, what guarantees that > > some bytes survive "tr -dc"? > > Nothing really, good catch! The seed "end" produces as first character always > a > "S" which would survive "tr -dc". However, that is clunky. I will always set > "1" > as first character in $RANDOM_STRING to mitigate the problem. It seems odd that you would generate a larger set of random bytes and then throw most of them away (about 1 in 5, on average). So you don't actually know how long your inputs are, and you're wasting time generating bytes which are discarded. The goal looks like it is just to clean up the string to only-ASCII characters. Perhaps converting to base64 or hex would be conceptually simpler? Like: test-genrandom ... | perl -pe 's/./sprintf("%02x", ord($&))/sge' > > I do not quite get the point of this complexity. You are using > > exactly the same seed "end" every time, so in the first round you > > have 1M of SP, letter '1', letter 'S' (from the genrandom), then > > in the second round you have 1M of SP, letter '1', letter 'S' and > > letter 'p' (the last two from the genrandom), and go on. Is it > > significant for the purpose of your test that the cruft inserted > > between the repetition of 1M of SP gets longer by one byte but they > > all share the same prefix (e.g. "1S", "1Sp", "1SpZ", "1SpZT", > > ... are what you insert between a large run of spaces)? > > The pktline packets have a constant size. If the cruft between 1M of SP > has a constant size as well then the generated packets for the test data > would repeat themselves. That's why I increased the length after every 1M > of SP.
> > However, I realized that this test complexity is not necessary. I'll > simplify it in the next round. I was also confused by this, and wondered if other patterns (perhaps using a single longer genrandom) might be applicable. Simplification (or explanation in comments about what properties the content _needs_ to have) would be welcome. :) -Peff
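Peff's point is easy to demonstrate: filtering random bytes with `tr -dc` leaves a survivor count you cannot predict, while hex-encoding keeps every input byte and has a fixed, known output length. A small Python sketch (the byte generator is only a deterministic stand-in for test-genrandom, not its real algorithm):

```python
import hashlib
import string

def pseudo_random_bytes(seed, n):
    # deterministic stand-in for test-genrandom (NOT its real
    # algorithm): expand a seed into n bytes by hashing a counter
    out = b""
    block = 0
    while len(out) < n:
        out += hashlib.sha256(seed + block.to_bytes(4, "big")).digest()
        block += 1
    return out[:n]

raw = pseudo_random_bytes(b"end", 32)

# `tr -dc "A-Za-z0-9"` keeps only bytes that happen to be alphanumeric,
# so the surviving length is unpredictable (and can even be zero)...
alnum = set(string.ascii_letters + string.digits)
survivors = bytes(b for b in raw if chr(b) in alnum)

# ...whereas hex-encoding keeps every byte: the output is always
# exactly 2*n ASCII characters.
encoded = raw.hex()

print(len(encoded))  # 64
```

Only about 62 of 256 byte values are alphanumeric, which is the "about 1 in 5, on average" survival rate Peff mentions.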
Re: [PATCH v6 12/13] convert: add filter..process option
> On 30 Aug 2016, at 00:21, Junio C Hamano wrote: > > larsxschnei...@gmail.com writes: > >> +In case the filter cannot or does not want to process the content, >> +it is expected to respond with an "error" status. Depending on the >> +`filter..required` flag Git will interpret that as error >> +but it will not stop or restart the filter process. >> + >> +packet: git< status=error\n >> +packet: git< >> + >> + >> +In case the filter cannot or does not want to process the content >> +as well as any future content for the lifetime of the Git process, >> +it is expected to respond with an "error-all" status. Depending on >> +the `filter..required` flag Git will interpret that as error >> +but it will not stop or restart the filter process. >> + >> +packet: git< status=error-all\n >> +packet: git< >> + > > This part of the document is well-written to help filter-writers. Thanks! > One thing that was unclear from the above to me, when read as a > potential filter-writer, is when I am supposed to exit(2). After I > tell Git with error-all (I would have called it "abort", but that's > OK) that I desire no further communication, am I free to go? Or do > I wait until Git somehow disconnects (perhaps by closing the packet > stream I have been reading)? The filter can exit right after the "error-all". If the filter does not exit then Git will kill the filter. I'll add this to the docs. "abort" could be ambiguous because it could be read as "abort only this file". "abort-all" would work, though. Would you prefer to see "error" replaced by "abort" and "error-all" by "abort-all"? >> +If the filter dies during the communication or does not adhere to >> +the protocol then Git will stop the filter process and restart it >> +with the next file that needs to be processed. > > Hmph, is there a reason not to retry a half-converted-and-failed > blob with the fresh process? Note that this is not "you must do it > that way", and it is not even "I think doing so may be a better > idea". 
I merely want to know the reason behind this decision. A filter that dies during communication or does not adhere to the protocol is a faulty filter. Feeding the faulty filter after restart with the same blob would likely cause the same error. There are legitimate reasons for retries. E.g. if the filter communicates with the network. In these cases I expect the filter to handle the retry logic. Git just writes to and reads from pipes. I don't expect frequent problems in that area. Plus the existing filter mechanism has no retry either. Later on we could easily add a "retry" capability if we deem it necessary, though. >> +After the filter has processed a blob it is expected to wait for >> +the next "key=value" list containing a command. When the Git process >> +terminates, it will send a kill signal to the filter in that stage. > > The "kill" may not be very nice. As Git side _knows_ that the > filter is waiting for the next command, having an explicit > "shutdown" command would give the filter a chance to implement a > clean exit--it may have some housekeeping tasks it wants to perform > once it is done. The "explicit shutdown" could just be "the pipe > gets closed", so from the implementation point of view there may not > be anything you need to further add to this patch (after all, when > we exit, the pipes to them would be closed), but the shutdown > protocol and the expectation on the behaviour of filter processes > would need to be documented. I implemented a shutdown command in v4 [1][2] but dropped it in v5 after a discussion with Peff [3]. [1] http://public-inbox.org/git/20160803164225.46355-8-larsxschnei...@gmail.com/ [2] http://public-inbox.org/git/20160803164225.46355-13-larsxschnei...@gmail.com/ [3] http://public-inbox.org/git/20160803225313.pk3tfe5ovz4y3...@sigill.intra.peff.net/ My main reasons to drop it: A) There is no central place in Git that could execute code *after* all filter operations are complete and *before* Git exits. 
Therefore, I had to add a "clean_on_exit_handler()" to "run-command" [1]. This change made this series even larger and therefore harder to review. B) If we communicate "shutdown" to the filter then we need to give the filter some time to perform the exit before the filter is killed on Git exit. I wasn't able to come up with a good answer how long Git should wait for the exit. Do you think I should resurrect the "shutdown" patch? >> +If a `filter..clean` or `filter..smudge` command >> +is configured then these commands always take precedence over >> +a configured `filter..process` command. > > It may make more sense to give precedence to the .process (which is > a late-comer) if defined, ignoring .clean and .smudge, than the > other way around. I agree. >> +Please note that you cannot use an existing `filter..cle
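For readers unfamiliar with the `packet: git< status=error` notation used in this thread: each such line is one pkt-line, a 4-hex-digit length header (counting the header itself) followed by the payload, with a zero-length flush packet terminating a sequence. A minimal Python sketch of the framing (git's real implementation is pkt-line.c):

```python
def pkt_line(payload):
    # Frame one pkt-line: 4 hex digits giving the total length
    # (including the 4-byte header itself), then the payload.
    data = payload.encode()
    assert len(data) + 4 <= 0xFFFF  # must fit in the 4-digit header
    return b"%04x" % (len(data) + 4) + data

FLUSH_PKT = b"0000"  # zero length terminates a packet sequence

# the "I cannot process this blob" answer described above
response = pkt_line("status=error\n") + FLUSH_PKT
print(response)  # b'0011status=error\n0000'
```

So "status=error\n" (13 bytes) plus the header gives length 17, written as `0011`, and the trailing `0000` is the flush that ends the status list.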
Re: [PATCH 4/6] require_clean_work_tree: ensure that the index was read
Johannes Schindelin writes: >> > Scratch that. That would not work in a freshly created repository >> > before doing any "git add". An empty index is a normal state,... > > Alas, that does not work, either. If no .git/index exists, read_index() > will not set the "initialized" flag. Exactly.
git am and duplicate signatures
git-am -s will avoid duplicating the last signature in a patch. But given a developer who creates a patch, sends it around for acks/other signoffs, collects signatures and then does a git am -s on a different branch, this sort of sign-off chain is possible: Signed-off-by: Original Developer Acked-by: Random Developer Signed-off-by: Original Developer Should there be an option to avoid duplicate signatures in a sequence where an author can git-am the same patch? sequencer.c:append_signoff() has a flag for APPEND_SIGNOFF_DEDUP sequencer.c:void append_signoff(struct strbuf *msgbuf, int ignore_footer, unsigned flag) but builtin/commit.c: append_signoff(&sb, ignore_non_trailer(&sb), 0); doesn't have an optional use mechanism available.
Re: [PATCH 07/22] sequencer: future-proof read_populate_todo()
On 29.08.2016 at 10:04, Johannes Schindelin wrote: > Over the next commits, we will work on improving the sequencer to the > point where it can process the edit script of an interactive rebase. To > that end, we will need to teach the sequencer to read interactive > rebase's todo file. In preparation, we consolidate all places where > that todo file is needed to call a function that we will later extend. > > Signed-off-by: Johannes Schindelin > --- > sequencer.c | 18 +++--- > 1 file changed, 11 insertions(+), 7 deletions(-) > > diff --git a/sequencer.c b/sequencer.c > index 8d79091..982b6e9 100644 > --- a/sequencer.c > +++ b/sequencer.c > @@ -32,6 +32,11 @@ static const char *get_dir(const struct replay_opts *opts) > return git_path_seq_dir(); > } > > +static const char *get_todo_path(const struct replay_opts *opts) > +{ > + return git_path_todo_file(); > +} I guess that in the future commit the return value of get_todo_path() would change depending on what sequencer is used for, cherry-pick or interactive rebase, that is, contents of replay_opts... > + > static int is_rfc2822_line(const char *buf, int len) > { > int i; > @@ -772,25 +777,24 @@ static int parse_insn_buffer(char *buf, struct > commit_list **todo_list, > static int read_populate_todo(struct commit_list **todo_list, > struct replay_opts *opts) > { > + const char *todo_file = get_todo_path(opts); ...and that's why you have added this temporary variable here, to not repeat get_todo_path(opts) calculations... > struct strbuf buf = STRBUF_INIT; > int fd, res; > > - fd = open(git_path_todo_file(), O_RDONLY); > + fd = open(todo_file, O_RDONLY); > if (fd < 0) > - return error_errno(_("Could not open %s"), > -git_path_todo_file()); > + return error_errno(_("Could not open %s"), todo_file); ... So that's why it is s/git_path_todo_file()/todo_file/ replacement, and not simply... 
> if (strbuf_read(&buf, fd, 0) < 0) { > close(fd); > strbuf_release(&buf); > - return error(_("Could not read %s."), git_path_todo_file()); > + return error(_("Could not read %s."), todo_file); > } > close(fd); > > res = parse_insn_buffer(buf.buf, todo_list, opts); > strbuf_release(&buf); > if (res) > - return error(_("Unusable instruction sheet: %s"), > - git_path_todo_file()); > + return error(_("Unusable instruction sheet: %s"), todo_file); > return 0; > } > > @@ -1064,7 +1068,7 @@ static int sequencer_continue(struct replay_opts *opts) > { > struct commit_list *todo_list = NULL; > > - if (!file_exists(git_path_todo_file())) > + if (!file_exists(get_todo_path(opts))) ...the s/git_path_todo_file()/git_todo_path(opts)/, isn't it? > return continue_single_pick(); > if (read_populate_opts(opts) || > read_populate_todo(&todo_list, opts)) > Looks good; though I have not checked if all calling sites were converted. Good work, -- Jakub Narębski
Re: [PATCH v6 13/13] read-cache: make sure file handles are not inherited by child processes
> > diff --git a/sha1_file.c > index d5e1121..759991e 100644 > --- a/sha1_file.c > +++ b/sha1_file.c > @@ -1485,7 +1485,7 @@ int check_sha1_signature(const unsigned char *sha1, > void *map, > > int git_open_noatime(const char *name) Hm, should the function then be renamed to git_open_noatime_cloexec()? > { > - static int sha1_file_open_flag = O_NOATIME; > + static int sha1_file_open_flag = O_NOATIME | O_CLOEXEC; > > for (;;) { > int fd; > > >
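The effect of the O_CLOEXEC flag in the quoted hunk can be observed from Python as well: a descriptor opened with it is marked close-on-exec, so processes spawned via exec*() (e.g. filter drivers) never inherit it.

```python
import os
import tempfile

# A descriptor opened with O_CLOEXEC is closed automatically across
# exec*(), so spawned children never see it.
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.close()

fd = os.open(tmp.name, os.O_RDONLY | os.O_CLOEXEC)
print(os.get_inheritable(fd))  # False: marked close-on-exec

os.close(fd)
os.unlink(tmp.name)
```

Setting the flag atomically at open() time, rather than with a later fcntl(FD_CLOEXEC), also avoids a race in which another thread forks between the two calls.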
Re: [PATCH 06/22] sequencer: release memory that was allocated when reading options
On 29.08.2016 at 10:04, Johannes Schindelin wrote: > The sequencer reads options from disk and stores them in its struct > for use during sequencer's operations. > > With this patch, the memory is released afterwards, plugging a > memory leak. > > Signed-off-by: Johannes Schindelin > --- > sequencer.c | 13 ++--- > 1 file changed, 10 insertions(+), 3 deletions(-) > > diff --git a/sequencer.c b/sequencer.c > index b5be0f9..8d79091 100644 > --- a/sequencer.c > +++ b/sequencer.c > @@ -131,6 +131,8 @@ static void remove_sequencer_state(const struct > replay_opts *opts) > free(opts->owned[i]); > free(opts->owned); > > + free(opts->xopts); > + This looks like an independent change, not related to using the sequencer_entrust() to store options read from disk in replay_opts struct to be able to free memory afterwards. I guess you wanted to avoid one-line changes... > strbuf_addf(&dir, "%s", get_dir(opts)); > remove_dir_recursively(&dir, 0); > strbuf_release(&dir); > @@ -811,13 +813,18 @@ static int populate_opts_cb(const char *key, const char > *value, void *data) Sidenote: this patch would be easier to read if lines were reordered as below, but I don't think any slider heuristics could help achieve that automatically. Also, the patch might be invalid... > opts->allow_ff = git_config_bool_or_int(key, value, > &error_flag); > else if (!strcmp(key, "options.mainline")) > opts->mainline = git_config_int(key, value); > - else if (!strcmp(key, "options.strategy")) > + else if (!strcmp(key, "options.strategy")) { > git_config_string(&opts->strategy, key, value); > + sequencer_entrust(opts, (char *) opts->strategy); I wonder if the ability to free strings dup-ed by git_config_string() be something that is part of replay_opts, or rather remove_sequencer_state(), that is a list of free(opts->strategy); free(opts->gpg_sign); And of course for (i = 0; i < opts->xopts_nr; i++) free(opts->xopts[i]); free(opts->xopts); Though... free(NULL) is nop as per standard, but can we rely on it? 
If it is a problem, we can create xfree(ptr) being if(ptr)free(ptr); The *_entrust() mechanism is more generic, but do we use this general-ness? Well, it could be xstrdup or git_config_string doing entrust'ing... > + } > - else if (!strcmp(key, "options.gpg-sign")) > + else if (!strcmp(key, "options.gpg-sign")) { > git_config_string(&opts->gpg_sign, key, value); > + sequencer_entrust(opts, (char *) opts->gpg_sign); > + } > else if (!strcmp(key, "options.strategy-option")) { > ALLOC_GROW(opts->xopts, opts->xopts_nr + 1, opts->xopts_alloc); > - opts->xopts[opts->xopts_nr++] = xstrdup(value); > + opts->xopts[opts->xopts_nr++] = > + sequencer_entrust(opts, xstrdup(value)); Nice. > } else > return error(_("Invalid key: %s"), key); > >
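The entrust idea under discussion — the opts struct taking ownership of every buffer read from disk and releasing them all in one place — can be sketched in Python (illustrative names only; the real code is C in sequencer.c):

```python
class ReplayOpts:
    # Toy model of the sequencer_entrust() pattern: every buffer handed
    # to entrust() is owned by the opts object and released in one place.
    def __init__(self):
        self.strategy = None
        self.xopts = []
        self._owned = []

    def entrust(self, buf):
        self._owned.append(buf)
        return buf

    def release(self):
        # the C version walks the owned array calling free(); C99
        # guarantees free(NULL) is a no-op, so unset fields are safe
        self._owned.clear()
        self.strategy = None
        del self.xopts[:]

opts = ReplayOpts()
opts.strategy = opts.entrust("recursive")
opts.xopts.append(opts.entrust("patience"))
print(len(opts._owned))  # 2
opts.release()
print(len(opts._owned))  # 0
```

As to the standards question in the thread: free(NULL) has been defined as a no-op since C89 (and C99 7.20.3.2), so the explicit xfree() wrapper would be redundant on any conforming libc.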
Re: [PATCH 4/6] require_clean_work_tree: ensure that the index was read
Hi Junio, On Tue, 30 Aug 2016, Johannes Schindelin wrote: > On Mon, 29 Aug 2016, Junio C Hamano wrote: > > > Junio C Hamano writes: > > > > > I am not sure if it should be left as the responsibility of the > > > caller (i.e. check the_index.initialized to bark at a caller that > > > forgets to read from an index) ... > > > > Scratch that. That would not work in a freshly created repository > > before doing any "git add". An empty index is a normal state, so it > > would not just be annoying to warn "You called me without reading > > the index" but is simply wrong. > > Fine. I changed it to assert that the_index.initialized was set. Alas, that does not work, either. If no .git/index exists, read_index() will not set the "initialized" flag. So it turns out that I can either get distracted in a major way, or drop the patch. I opt for the latter. Ciao, Dscho
Re: how to showing a merge graph?
Duy Nguyen wrote on 30.08.2016 15:10: > I want to see a "git log --oneline --graph" with all non-merge commits > removed, but history is rewritten so that the merge commits represent > the entire topics and are shown to have all the parents of the base > commits. e.g. if the full graph is > > * 8118403 Merge commit 'bbb2437' > |\ > | * bbb2437 3p > | * dfde6b9 2p > * | 9c0aeb2 2 > |/ > * 8db1c06 1 > > I just want to see > > * 8118403 Merge commit 'bbb2437' > |\ > | | > |/ > * 8db1c06 1 > > I had a quick look of rev-list-options.txt but couldn't find anything > that does that (--simplify-merges looks different), and revision.c is > not that familiar to me to figure this out by myself. Help? I don't think anything (we have) would give you two lines connecting the same two commits directly, i.e. a double edge in the graph as you indicate above. Michael
[PATCH v4 3/3] diff-highlight: add support for --graph output
Signed-off-by: Brian Henderson --- contrib/diff-highlight/diff-highlight| 19 +-- contrib/diff-highlight/t/t9400-diff-highlight.sh | 2 +- 2 files changed, 14 insertions(+), 7 deletions(-) diff --git a/contrib/diff-highlight/diff-highlight b/contrib/diff-highlight/diff-highlight index ffefc31..9280c88 100755 --- a/contrib/diff-highlight/diff-highlight +++ b/contrib/diff-highlight/diff-highlight @@ -21,6 +21,10 @@ my $RESET = "\x1b[m"; my $COLOR = qr/\x1b\[[0-9;]*m/; my $BORING = qr/$COLOR|\s/; +# The patch portion of git log -p --graph should only ever have preceding | and +# not / or \ as merge history only shows up on the commit line. +my $GRAPH = qr/$COLOR?\|$COLOR?\s+/; + my @removed; my @added; my $in_hunk; @@ -32,12 +36,12 @@ $SIG{PIPE} = 'DEFAULT'; while (<>) { if (!$in_hunk) { print; - $in_hunk = /^$COLOR*\@/; + $in_hunk = /^$GRAPH*$COLOR*\@/; } - elsif (/^$COLOR*-/) { + elsif (/^$GRAPH*$COLOR*-/) { push @removed, $_; } - elsif (/^$COLOR*\+/) { + elsif (/^$GRAPH*$COLOR*\+/) { push @added, $_; } else { @@ -46,7 +50,7 @@ while (<>) { @added = (); print; - $in_hunk = /^$COLOR*[\@ ]/; + $in_hunk = /^$GRAPH*$COLOR*[\@ ]/; } # Most of the time there is enough output to keep things streaming, @@ -163,6 +167,9 @@ sub highlight_pair { } } +# we split either by $COLOR or by character. This has the side effect of +# leaving in graph cruft. It works because the graph cruft does not contain "-" +# or "+" sub split_line { local $_ = shift; return utf8::decode($_) ? 
@@ -211,8 +218,8 @@ sub is_pair_interesting { my $suffix_a = join('', @$a[($sa+1)..$#$a]); my $suffix_b = join('', @$b[($sb+1)..$#$b]); - return $prefix_a !~ /^$COLOR*-$BORING*$/ || - $prefix_b !~ /^$COLOR*\+$BORING*$/ || + return $prefix_a !~ /^$GRAPH*$COLOR*-$BORING*$/ || + $prefix_b !~ /^$GRAPH*$COLOR*\+$BORING*$/ || $suffix_a !~ /^$BORING*$/ || $suffix_b !~ /^$BORING*$/; } diff --git a/contrib/diff-highlight/t/t9400-diff-highlight.sh b/contrib/diff-highlight/t/t9400-diff-highlight.sh index 54e11fe..e42232d 100755 --- a/contrib/diff-highlight/t/t9400-diff-highlight.sh +++ b/contrib/diff-highlight/t/t9400-diff-highlight.sh @@ -209,7 +209,7 @@ test_expect_failure 'diff-highlight highlights mismatched hunk size' ' # TODO add multi-byte test -test_expect_failure 'diff-highlight works with the --graph option' ' +test_expect_success 'diff-highlight works with the --graph option' ' dh_test_setup_history && # topo-order so that the order of the commits is the same as with --graph -- 2.9.3
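For experimenting with the $GRAPH idea outside the Perl script, here is a rough Python translation of the patterns the patch adds (an approximation for illustration, not the script itself):

```python
import re

COLOR = r"(?:\x1b\[[0-9;]*m)"
# Graph prefix on patch lines of `git log -p --graph`: an optionally
# colored pipe plus trailing whitespace. As the patch comment notes,
# '/' and '\' only appear on commit lines, never inside hunks.
GRAPH = r"(?:%s?\|%s?\s+)" % (COLOR, COLOR)

hunk_header = re.compile(r"^%s*%s*@" % (GRAPH, COLOR))
removed = re.compile(r"^%s*%s*-" % (GRAPH, COLOR))

print(bool(hunk_header.match("| | @@ -1,3 +1,3 @@")))  # True
print(bool(removed.match("| -old line")))              # True
print(bool(removed.match("-old line")))                # True (no graph)
```

Prefixing each line-class regex with `$GRAPH*` is the whole trick: the highlighting logic itself is unchanged, it just tolerates any number of graph columns before the `-`, `+`, or `@@` marker.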
[PATCH v4 0/3] diff-highlight: add support for --graph option
On Mon, Aug 29, 2016 at 02:37:46PM -0700, Junio C Hamano wrote: > Brian Henderson writes: > > > How does this look? > > > > Drawing the graph helped me a lot in figuring out what I was > > actually testing. thanks! > > Yeah, I also am pleased to see the picture of what is being tested > in the test script. > > With your sign-off, they would have been almost perfect ;-). doh. fixed. I left the subject as v4, probably mostly because I have this weird aversion to increasing version numbers :) but I justified it by thinking that the actual patch set isn't changing, I just added the sign-off (and updated the commit messages per Jeff.) Hope that's ok. thanks everyone for all your help! Brian Henderson (3): diff-highlight: add some tests diff-highlight: add failing test for handling --graph output diff-highlight: add support for --graph output contrib/diff-highlight/Makefile | 5 + contrib/diff-highlight/diff-highlight| 19 +- contrib/diff-highlight/t/Makefile| 22 +++ contrib/diff-highlight/t/t9400-diff-highlight.sh | 223 +++ 4 files changed, 263 insertions(+), 6 deletions(-) create mode 100644 contrib/diff-highlight/Makefile create mode 100644 contrib/diff-highlight/t/Makefile create mode 100755 contrib/diff-highlight/t/t9400-diff-highlight.sh -- 2.9.3
[PATCH v4 2/3] diff-highlight: add failing test for handling --graph output
Signed-off-by: Brian Henderson --- contrib/diff-highlight/t/t9400-diff-highlight.sh | 60 1 file changed, 60 insertions(+) diff --git a/contrib/diff-highlight/t/t9400-diff-highlight.sh b/contrib/diff-highlight/t/t9400-diff-highlight.sh index 7c303f7..54e11fe 100755 --- a/contrib/diff-highlight/t/t9400-diff-highlight.sh +++ b/contrib/diff-highlight/t/t9400-diff-highlight.sh @@ -49,6 +49,55 @@ test_strip_patch_header () { sed -n '/^@@/,$p' $* } +# dh_test_setup_history generates a contrived graph such that we have at least +# 1 nesting (E) and 2 nestings (F). +# +#A branch +# / +# D---E---F master +# +# git log --all --graph +# * commit +# |A +# | * commit +# | |F +# | * commit +# |/ +# |E +# * commit +# D +# +dh_test_setup_history () { + echo "file1" >file1 && + echo "file2" >file2 && + echo "file3" >file3 && + + cat file1 >file && + git add file && + git commit -m "D" && + + git checkout -b branch && + cat file2 >file && + git commit -am "A" && + + git checkout master && + cat file2 >file && + git commit -am "E" && + + cat file3 >file && + git commit -am "F" +} + +left_trim () { + "$PERL_PATH" -pe 's/^\s+//' +} + +trim_graph () { + # graphs start with * or | + # followed by a space or / or \ + "$PERL_PATH" -pe 's@^((\*|\|)( |/|\\))+@@' +} + test_expect_success 'diff-highlight highlights the beginning of a line' ' cat >a <<-\EOF && aaa @@ -160,4 +209,15 @@ test_expect_failure 'diff-highlight highlights mismatched hunk size' ' # TODO add multi-byte test +test_expect_failure 'diff-highlight works with the --graph option' ' + dh_test_setup_history && + + # topo-order so that the order of the commits is the same as with --graph + # trim graph elements so we can do a diff + # trim leading space because our trim_graph is not perfect + git log --branches -p --topo-order | "$DIFF_HIGHLIGHT" | left_trim >graph.exp && + git log --branches -p --graph | "$DIFF_HIGHLIGHT" | trim_graph | left_trim >graph.act && + test_cmp graph.exp graph.act +' + test_done -- 2.9.3
[PATCH v4 1/3] diff-highlight: add some tests
Signed-off-by: Brian Henderson --- contrib/diff-highlight/Makefile | 5 + contrib/diff-highlight/t/Makefile| 22 +++ contrib/diff-highlight/t/t9400-diff-highlight.sh | 163 +++ 3 files changed, 190 insertions(+) create mode 100644 contrib/diff-highlight/Makefile create mode 100644 contrib/diff-highlight/t/Makefile create mode 100755 contrib/diff-highlight/t/t9400-diff-highlight.sh diff --git a/contrib/diff-highlight/Makefile b/contrib/diff-highlight/Makefile new file mode 100644 index 000..9018724 --- /dev/null +++ b/contrib/diff-highlight/Makefile @@ -0,0 +1,5 @@ +# nothing to build +all: + +test: + $(MAKE) -C t diff --git a/contrib/diff-highlight/t/Makefile b/contrib/diff-highlight/t/Makefile new file mode 100644 index 000..5ff5275 --- /dev/null +++ b/contrib/diff-highlight/t/Makefile @@ -0,0 +1,22 @@ +-include ../../../config.mak.autogen +-include ../../../config.mak + +# copied from ../../t/Makefile +SHELL_PATH ?= $(SHELL) +SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH)) +T = $(wildcard t[0-9][0-9][0-9][0-9]-*.sh) + +all: test +test: $(T) + +.PHONY: help clean all test $(T) + +help: + @echo 'Run "$(MAKE) test" to launch test scripts' + @echo 'Run "$(MAKE) clean" to remove trash folders' + +$(T): + @echo "*** $@ ***"; '$(SHELL_PATH_SQ)' $@ $(GIT_TEST_OPTS) + +clean: + $(RM) -r 'trash directory'.* diff --git a/contrib/diff-highlight/t/t9400-diff-highlight.sh b/contrib/diff-highlight/t/t9400-diff-highlight.sh new file mode 100755 index 000..7c303f7 --- /dev/null +++ b/contrib/diff-highlight/t/t9400-diff-highlight.sh @@ -0,0 +1,163 @@ +#!/bin/sh + +test_description='Test diff-highlight' + +CURR_DIR=$(pwd) +TEST_OUTPUT_DIRECTORY=$(pwd) +TEST_DIRECTORY="$CURR_DIR"/../../../t +DIFF_HIGHLIGHT="$CURR_DIR"/../diff-highlight + +CW="$(printf "\033[7m")" # white +CR="$(printf "\033[27m")" # reset + +. "$TEST_DIRECTORY"/test-lib.sh + +if ! 
test_have_prereq PERL +then + skip_all='skipping diff-highlight tests; perl not available' + test_done +fi + +# dh_test is a test helper function which takes 3 file names as parameters. The +# first 2 files are used to generate diff and commit output, which is then +# piped through diff-highlight. The 3rd file should contain the expected output +# of diff-highlight (minus the diff/commit header, ie. everything after and +# including the first @@ line). +dh_test () { + a="$1" b="$2" && + + cat >patch.exp && + + { + cat "$a" >file && + git add file && + git commit -m "Add a file" && + + cat "$b" >file && + git diff file >diff.raw && + git commit -am "Update a file" && + git show >commit.raw + } >/dev/null && + + "$DIFF_HIGHLIGHT" diff.act && + "$DIFF_HIGHLIGHT" commit.act && + test_cmp patch.exp diff.act && + test_cmp patch.exp commit.act +} + +test_strip_patch_header () { + sed -n '/^@@/,$p' $* +} + +test_expect_success 'diff-highlight highlights the beginning of a line' ' + cat >a <<-\EOF && + aaa + bbb + ccc + EOF + + cat >b <<-\EOF && + aaa + 0bb + ccc + EOF + + dh_test a b <<-EOF + @@ -1,3 +1,3 @@ +aaa + -${CW}b${CR}bb + +${CW}0${CR}bb +ccc + EOF +' + +test_expect_success 'diff-highlight highlights the end of a line' ' + cat >a <<-\EOF && + aaa + bbb + ccc + EOF + + cat >b <<-\EOF && + aaa + bb0 + ccc + EOF + + dh_test a b <<-EOF + @@ -1,3 +1,3 @@ +aaa + -bb${CW}b${CR} + +bb${CW}0${CR} +ccc + EOF +' + +test_expect_success 'diff-highlight highlights the middle of a line' ' + cat >a <<-\EOF && + aaa + bbb + ccc + EOF + + cat >b <<-\EOF && + aaa + b0b + ccc + EOF + + dh_test a b <<-EOF + @@ -1,3 +1,3 @@ +aaa + -b${CW}b${CR}b + +b${CW}0${CR}b +ccc + EOF +' + +test_expect_success 'diff-highlight does not highlight whole line' ' + cat >a <<-\EOF && + aaa + bbb + ccc + EOF + + cat >b <<-\EOF && + aaa + 000 + ccc + EOF + + dh_test a b <<-EOF + @@ -1,3 +1,3 @@ +aaa + -bbb + +000 +ccc + EOF +' + +test_expect_failure 'diff-highlight highlights mismatched hunk size' ' + cat 
>a <<-\EOF && + aaa + bbb + EOF + +
how to show a merge graph?
I want to see a "git log --oneline --graph" with all non-merge commits removed, but with history rewritten so that the merge commits represent the entire topics and are shown to have all the parents of the base commits. E.g. if the full graph is * 8118403 Merge commit 'bbb2437' |\ | * bbb2437 3p | * dfde6b9 2p * | 9c0aeb2 2 |/ * 8db1c06 1 I just want to see * 8118403 Merge commit 'bbb2437' |\ | | |/ * 8db1c06 1 I had a quick look at rev-list-options.txt but couldn't find anything that does that (--simplify-merges looks different), and I am not familiar enough with revision.c to figure this out by myself. Help? -- Duy
Re: [PATCH v6 13/13] read-cache: make sure file handles are not inherited by child processes
> On 29 Aug 2016, at 21:45, Junio C Hamano wrote: > > Lars Schneider writes: > >> I see. Thanks for the explanation. > > I expect the updated log message to explain the issue correctly > then. Sure! >>> The parent is >>> very likely to have pack windows open into .pack files and they need >>> to be closed on the child side after fork(2) starts the child >>> process but before execve(2) runs the helper, if we want to avoid >>> file descriptor leaks. >> >> I think I understand what you are saying. However, during my tests >> .pack file fd's were never a problem. > > I do not expect during the lifetime of your long-running helper > anybody would try to unlink an existing packfile, so it is unlikely > that "cannot unlink an open file on Windows" issue to come up. And > the cross-platform problem I pointed out is a fd leak; leaks would > not show up until you run out of the resource, just like you > wouldn't notice small memory leak here and there UNLESS you actively > look for them. I would be surprised if your "tests" found anything. > >> How would I find the open .pack file fd's? Should I go through >> /proc/PID/fd? Why is this no problem for other longer running >> commands such as the git-credential-cache--daemon or git-daemon? > > Nobody said this is "no problem for others". While discussing the > patch that started this thread, we noticed that any open file > descriptor the main process has when the long-running clean/smudge > helper is spawned _is_ a problem. Other helpers may share the same > problem, and if they do, that is all the more reason that fixing the > file descriptor leak is a good thing. > > The primary thing I was wondering was if we should open the window > into packfiles with CLOEXEC, just like we recently started opening > the tempfiles and lockfiles with the flag. The reason why I asked > if the site that spawns (i.e. 
run_command API) has an easy access to > the list of file descriptors that we opened ONLY for ourselves is > because that would make it possible to consider an alternative > approach close them before execve(2) in run_commands API. That > would save us from having to sprinkle existing calls to open(2) with > CLOEXEC. But if that is not the case, opening them with CLOEXEC is > a much better solution than going outside the program to "find" them > in non-portable ways. The pack files are opened before the filter process is forked. Therefore, I think CLOEXEC makes sense for them. Should this change be part of this series? If yes, would it look like below? Should I adjust the function name? Thanks, Lars diff --git a/sha1_file.c b/sha1_file.c index d5e1121..759991e 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1485,7 +1485,7 @@ int check_sha1_signature(const unsigned char *sha1, void *map, int git_open_noatime(const char *name) { - static int sha1_file_open_flag = O_NOATIME; + static int sha1_file_open_flag = O_NOATIME | O_CLOEXEC; for (;;) { int fd;
Re: [PATCH v6 10/13] convert: generate large test files only once
> On 29 Aug 2016, at 19:52, Junio C Hamano wrote: > > Lars Schneider writes: > >>> On 25 Aug 2016, at 21:17, Stefan Beller wrote: >>> On Thu, Aug 25, 2016 at 4:07 AM, wrote: From: Lars Schneider Generate more interesting large test files >>> >>> How are the large test files more interesting? >>> (interesting in the notion of covering more potential bugs? >>> easier to debug? better to maintain, or just a pleasant read?) >> >> The old large test file was 1MB of zeros and 1 byte with a one, repeated >> 2048 times. >> >> Since the filter uses 64k packets we would test a large number of equally >> looking packets. >> >> That's why I thought the pseudo random content is more interesting. > > I guess my real question is why it is not just a single invocation > of test-genrandom that gives you the whole test file; if you are > using 20MB, the simplest would be to grab 20MB out of test-genrandom. > With that hopefully you won't see large number of equally looking > packets, no? True, but applying rot13 (via tr ...) on 20+ MB takes quite a bit of time. That's why I came up with the 1M SP in between. However, I realized that testing a large amount of data is not really necessary for the final series. A single packet is 64k. A 500k pseudo random test file should be sufficient. This will make the test way simpler. Thanks, Lars
Bug Report: "git submodule deinit" fails right after a clone
Hello, I found a curious bug in git version 2.9.0.windows.1 (run on Windows 7 via git bash). If I clone a repository containing submodules and run a "git submodule deinit" on any of the submodules of this repository without executing another git command, this command fails. For instance: (let's say the repo MyProject contains 2 submodules: Submodule1 and Submodule2) $ git clone ssh:///MyProject [ ... "git clone" output ... ] $ cd MyProject $ git submodule deinit Submodule1 fatal: Please stage your changes to .gitmodules or stash them to proceed Submodule work tree 'Submodule1' contains local modifications; use '-f' to discard them $ git submodule deinit Submodule2 fatal: Please stage your changes to .gitmodules or stash them to proceed Submodule work tree 'Submodule2' contains local modifications; use '-f' to discard them First the error message is strange. Then what is even stranger is that the error disappears if a "git status" is run before the submodule deinit... $ git clone ssh:///MyProject [ ... "git clone" output ... ] $ cd MyProject $ git submodule deinit Submodule1 fatal: Please stage your changes to .gitmodules or stash them to proceed Submodule work tree 'Submodule1' contains local modifications; use '-f' to discard them $ git status [ ... "git status" output ...] $ git submodule deinit Submodule1 Cleared directory 'Submodule1' I have been able to reproduce this error at least 10 times and with different repositories so I do not think it comes from the state of my repositories. Moreover, the use of "--recursive" argument to the "git clone" command did not change anything. Are you able to reproduce this problem? Thank you in advance. Best regards, Thomas
Re: [PATCH v6 10/13] convert: generate large test files only once
> On 29 Aug 2016, at 19:46, Junio C Hamano wrote: > > larsxschnei...@gmail.com writes: > >> diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh >> index 7b45136..34c8eb9 100755 >> --- a/t/t0021-conversion.sh >> +++ b/t/t0021-conversion.sh >> @@ -4,6 +4,15 @@ test_description='blob conversion via gitattributes' >> >> . ./test-lib.sh >> >> +if test_have_prereq EXPENSIVE >> +then >> +T0021_LARGE_FILE_SIZE=2048 >> +T0021_LARGISH_FILE_SIZE=100 >> +else >> +T0021_LARGE_FILE_SIZE=30 >> +T0021_LARGISH_FILE_SIZE=2 >> +fi > > Minor: do we need T0021_ prefix? What are you trying to avoid > collisions with? Not necessary. I'll remove the prefix. >> +git checkout -- test test.t test.i && >> + >> +mkdir generated-test-data && >> +for i in $(test_seq 1 $T0021_LARGE_FILE_SIZE) >> +do >> +RANDOM_STRING="$(test-genrandom end $i | tr -dc "A-Za-z0-9" )" >> +ROT_RANDOM_STRING="$(echo $RANDOM_STRING | ./rot13.sh )" > > In earlier iteration of loop with lower $i, what guarantees that > some bytes survive "tr -dc"? Nothing really, good catch! The seed "end" produces as first character always a "S" which would survive "tr -dc". However, that is clunky. I will always set "1" as first character in $RANDOM_STRING to mitigate the problem. > >> +# Generate 1MB of empty data and 100 bytes of random characters > > 100 bytes? It seems to me that you are giving 1MB and then $i-byte > or less (which sometimes can be zero) of random string. Outdated comment. Will fix! 
> >> +# printf "$(test-genrandom start $i)" >> +printf "%1048576d" 1 >>generated-test-data/large.file && >> +printf "$RANDOM_STRING" >>generated-test-data/large.file && >> +printf "%1048576d" 1 >>generated-test-data/large.file.rot13 && >> +printf "$ROT_RANDOM_STRING" >> >>generated-test-data/large.file.rot13 && >> + >> +if test $i = $T0021_LARGISH_FILE_SIZE >> +then >> +cat generated-test-data/large.file >> >generated-test-data/largish.file && >> +cat generated-test-data/large.file.rot13 >> >generated-test-data/largish.file.rot13 >> +fi >> +done > > This "now we are done with the loop, so copy them to the second > pair" needs to be in the loop? Shouldn't it come after 'done'? No, it does not need to be in the loop. I think I could do this after the loop instead: head -c $((1048576*$T0021_LARGISH_FILE_SIZE)) generated-test-data/large.file >generated-test-data/largish.file > I do not quite get the point of this complexity. You are using > exactly the same seed "end" every time, so in the first round you > have 1M of SP, letter '1', letter 'S' (from the genrandom), then > in the second round you have 1M of SP, letter '1', letter 'S' and > letter 'p' (the last two from the genrandom), and go on. Is it > significant for the purpose of your test that the cruft inserted > between the repetition of 1M of SP gets longer by one byte but they > all share the same prefix (e.g. "1S", "1Sp", "1SpZ", "1SpZT", > ... are what you insert between a large run of spaces)? The pktline packets have a constant size. If the cruft between 1M of SP has a constant size as well then the generated packets for the test data would repeat themselves. That's why I increased the length after every 1M of SP. However, I realized that this test complexity is not necessary. I'll simplify it in the next round. Thanks, Lars
Re: [PATCH 05/22] sequencer: allow the sequencer to take custody of malloc()ed data
Hello Johannes, W dniu 30.08.2016 o 09:29, Johannes Schindelin pisze: > On Mon, 29 Aug 2016, Jakub Narębski wrote: >> W dniu 29.08.2016 o 10:04, Johannes Schindelin pisze: >>> +void *sequencer_entrust(struct replay_opts *opts, void >>> *set_me_free_after_use) >>> +{ >>> + ALLOC_GROW(opts->owned, opts->owned_nr + 1, opts->owned_alloc); >>> + opts->owned[opts->owned_nr++] = set_me_free_after_use; >>> + >>> + return set_me_free_after_use; >> >> I was wondering what this 'set_me_free_after_use' parameter is about; >> wouldn't it be more readable if this parameter was called 'owned_data' >> or 'owned_ptr'? > > If I read "owned_ptr" as a function's parameter, I would assume that the > associated memory is owned by the caller. So I would be puzzled reading > that name. Right. Well, it is difficult to come up with a good name for this parameter that would make sense both in the declaration, as information for a caller, and in the function itself, as information about what it holds. In my personal opinion 'set_me_free_after_use' is not the best name, but I unfortunately do not have a better proposal. Maybe 'entrust_ptr', or 'entrusted_data' / 'entrusted_ptr' / 'entrusted'? There are two hard things in computer science: cache invalidation, *naming things*, and off-by-one errors ;-) P.S. It would be nice to have a generic mechanism for taking custody of data to help libification, either at this or at a lower level (on the level of xstrdup, etc.), but that can safely wait. It even should wait, so that we can see that this approach is a good one before trying to generalize it. That should not be a blocker for this series, IMVHO. Best, -- Jakub Narębski
Re: [PATCH 4/6] require_clean_work_tree: ensure that the index was read
Hi Junio, On Mon, 29 Aug 2016, Junio C Hamano wrote: > Junio C Hamano writes: > > > I am not sure if it should be left as the responsibility of the > > caller (i.e. check the_index.initialized to bark at a caller that > > forgets to read from an index) ... > > Scratch that. That would not work in a freshly created repository > before doing any "git add". An empty index is a normal state, so it > would not just be annoying to warn "You called me without reading > the index" but simply wrong. Fine. I changed it to assert that the_index.initialized was set. Ciao, Dscho
Re: [PATCH 4/6] require_clean_work_tree: ensure that the index was read
Hi Junio, On Mon, 29 Aug 2016, Junio C Hamano wrote: > Johannes Schindelin writes: > > > The function would otherwise pretend to work fine, but totally ignore > > the working directory. > > s/^T/Unless the caller has already read the index, t/; Changed. (I also removed the "otherwise" which would sound funny otherwise.) > I am not sure if it should be left as the responsibility of the > caller (i.e. check the_index.initialized to bark at a caller that > forgets to read from an index) or it is OK to unconditionally read > from $GIT_INDEX_FILE in a truly libified function. A caller that > wants to read from a custom (temporary) index file can choose to > make sure it does read_index_from() from its custom file before > calling this function, but if it forgets to do so, the penalty is > more severe than for ordinary callers who would have read from the > normal index file anyway. > > Even though the word-lego issue would make this particular API not > yet suitable, "who is responsible for reading the index" is an > orthogonal issue that we can discuss even on this version, so I > thought I'd bring it up. It is orthogonal alright. So I won't act on it, but just add my thoughts: We might want to consider introducing a more consistent internal API. This is even more orthogonal to the patch under discussion than just teaching require_clean_work_tree() about different index files, of course. It might very well pay off in the future were we to design a more unified "look & feel" for the API, e.g. by putting more restrictions on the order of parameters, requiring "int" return values for functions that may fail, and unified error handling that does not necessarily print to stderr. Having said that, I do not have time to take the lead on that, either. :-) Ciao, Dscho
git blame [was: Reducing CPU load on git server]
W dniu 29.08.2016 o 23:31, Jeff King pisze: > Blame-tree is a GitHub-specific command (it feeds the main repository > view page), and is a known CPU hog. There's more clever caching for that > coming down the pipe, but it's not shipped yet. I wonder if having support for 'git blame ' in Git core would be something interesting to Git users. I once tried to implement it, but it went nowhere. Would it be hard to implement? I guess that GitHub offering this view as part of the home page view for a repository is something of a historical artifact; namely that web interfaces for other version control systems used this. But I wonder how useful it is. Neither cgit nor gitweb offers such a view, and both GitLab and Bitbucket start with a summary page with no blame-tree. -- Jakub Narębski
Re: [PATCH v2 12/14] sequencer: lib'ify save_head()
Hi Junio, On Mon, 29 Aug 2016, Junio C Hamano wrote: > Johannes Schindelin writes: > > > strbuf_addf(&buf, "%s\n", head); > > if (write_in_full(fd, buf.buf, buf.len) < 0) > > - die_errno(_("Could not write to %s"), git_path_head_file()); > > + return error_errno(_("Could not write to %s"), > > + git_path_head_file()); > > Same comment around a left-over lockfile applies to this. An extra > rollback being minimally intrusive also applies here, I think. Sure. I added rollbacks in case of failure. Ciao, Dscho
Re: [PATCH v2 10/14] sequencer: lib'ify read_populate_opts()
Hi Junio, On Mon, 29 Aug 2016, Junio C Hamano wrote: > Johannes Schindelin writes: > > > Instead of dying there, let the caller high up in the callchain notice > > the error and handle it (by dying, still). > > > > The only caller of read_populate_opts(), sequencer_continue() can > > already return errors, so its caller must be already prepared to > > handle error returns, and with this step, we make it notice an error > > return from this function. > > > > So this is a safe conversion to make read_populate_opts() callable > > from new callers that want it not to die, without changing the > > external behaviour of anything existing. > > > > Signed-off-by: Johannes Schindelin > > --- > > sequencer.c | 14 -- > > 1 file changed, 8 insertions(+), 6 deletions(-) > > > > diff --git a/sequencer.c b/sequencer.c > > index e11b24f..be6020a 100644 > > --- a/sequencer.c > > +++ b/sequencer.c > > @@ -808,12 +808,14 @@ static int populate_opts_cb(const char *key, const > > char *value, void *data) > > return 0; > > } > > > > -static void read_populate_opts(struct replay_opts **opts_ptr) > > +static int read_populate_opts(struct replay_opts **opts) > > { > > if (!file_exists(git_path_opts_file())) > > - return; > > - if (git_config_from_file(populate_opts_cb, git_path_opts_file(), > > *opts_ptr) < 0) > > - die(_("Malformed options sheet: %s"), git_path_opts_file()); > > + return 0; > > + if (git_config_from_file(populate_opts_cb, git_path_opts_file(), *opts) > > < 0) > > + return error(_("Malformed options sheet: %s"), > > + git_path_opts_file()); > > + return 0; > > As discussed, perhaps have a comment immediately before calling > config-from-file that says that the call could die when it is fed a > syntactically broken file, but we ignore it for now because we will > be writing the file we have written, or something? Sure. I added a code comment. I still think that it is a serious mistake for library functions to die(). 
But I have no time to take care of git_parse_source() right now. Ciao, Dscho
Re: [PATCH v2 08/14] sequencer: lib'ify read_and_refresh_cache()
Hi Junio, On Mon, 29 Aug 2016, Junio C Hamano wrote: > Johannes Schindelin writes: > > > Instead of dying there, let the caller high up in the callchain > > notice the error and handle it (by dying, still). > > > > There are two call sites of read_and_refresh_cache(), one of which is > > pick_commits(), whose callers were already prepared to do the right > > thing given an "error" return from it by an earlier patch, so the > > conversion is safe. > > > > The other one, sequencer_pick_revisions() was also prepared to relay > > an error return back to its caller in all remaining cases in an > > earlier patch. > > > > Signed-off-by: Johannes Schindelin > > --- > > sequencer.c | 15 ++- > > 1 file changed, 10 insertions(+), 5 deletions(-) > > > > diff --git a/sequencer.c b/sequencer.c > > index c006cae..e30aa82 100644 > > --- a/sequencer.c > > +++ b/sequencer.c > > @@ -640,18 +640,21 @@ static int prepare_revs(struct replay_opts *opts) > > return 0; > > } > > > > -static void read_and_refresh_cache(struct replay_opts *opts) > > +static int read_and_refresh_cache(struct replay_opts *opts) > > { > > static struct lock_file index_lock; > > int index_fd = hold_locked_index(&index_lock, 0); > > if (read_index_preload(&the_index, NULL) < 0) > > - die(_("git %s: failed to read the index"), action_name(opts)); > > + return error(_("git %s: failed to read the index"), > > + action_name(opts)); > > refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED, NULL, NULL, > > NULL); > > if (the_index.cache_changed && index_fd >= 0) { > > if (write_locked_index(&the_index, &index_lock, COMMIT_LOCK)) > > - die(_("git %s: failed to refresh the index"), > > action_name(opts)); > > + return error(_("git %s: failed to refresh the index"), > > + action_name(opts)); > > } > > rollback_lock_file(&index_lock); > > + return 0; > > } > > With the current set of callers, a caller that notices an error from > this function will immediately exit without doing any further > damage. 
> > So in that sense, this is a "safe" conversion. > > But is it a sensible conversion? When the caller wants to do > anything else (e.g. clean up and try something else, perhaps read > the index again), the caller can't, as the index is still locked, > because even though the code knows that the lock will not be > released until the process exits, it chose to return an error without > releasing the lock. It depends on what the caller wants to do. The case about which I care most is when some helpful advice should be printed (see e.g. 3be18b4 (t5520: verify that `pull --rebase` shows the helpful advice when failing, 2016-07-26)). Those callers do not need to care, as the atexit() handler will clean up the lock file. However, I am sympathetic to your angle, even if I do not expect any such caller to arise anytime soon. > For a file-scope static helper, that probably is sufficient. But if > this can be reached from a public entry point in the API, the caller > of that entry point will find this not-so-useful, I would think. > > I suspect doing the "right thing" to future-proof it may not be too > much more work. > > static int read_and_refresh_cache(struct replay_opts *opts) > { > + int retval = 0; /* assume success */ > ... > if (read_index_preload(...) < 0) { > retval = error(...); > goto finish; > } > refresh_index(...); > if (...changed...) { > if (write_locked_index(...)) > retval = error(...); > } > + finish: > rollback_lock_file(&index_lock); > return retval; > } > > or something like that on top? I settled for rolling back in the if() clauses. Ciao, Dscho
Re: [PATCH v13 00/14] libify apply and use lib in am, part 3
On Mon, Aug 29, 2016 at 9:04 PM, Junio C Hamano wrote: > Christian Couder writes: > >> Highlevel view of the patches in the series >> ~~~ >> >> This is "part 3" of the full patch series. I am resending only the >> last 14 patches of the full series as "part 3", because I don't want >> to resend the first 27 patches of v10 nearly as is. > > Just to make sure, you are sending the first 11 of these 14 exactly > as-is, right? I didn't spot anything different other than 12 and 13 > that replaced the "alternate index" step from the previous round. Yeah, the first 11 of the 14 patches have no change compared to v12. I didn't want to create a "part 4" as that could be confusing, and sending the first 11 patches gives them another chance to be reviewed again.
[PATCH RFC] subtree: support subtree -P /path/to/file split
Hello, I'm missing `git subtree split` functionality for the case when the prefix is a single file. Looking at git-subtree, only a few changes are needed to handle this. The attached patch works for me in the common case (no rejoins). As such, I'm looking for comments :) Signed-off-by: Ivan Shapovalov --- contrib/subtree/git-subtree.sh | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh index dec085a..464dcf7 100755 --- a/contrib/subtree/git-subtree.sh +++ b/contrib/subtree/git-subtree.sh @@ -485,8 +485,12 @@ subtree_for_commit () { while read mode type tree name do assert test "$name" = "$dir" - assert test "$type" = "tree" -o "$type" = "commit" + assert test "$type" = "tree" -o "$type" = "commit" -o "$type" = "blob" test "$type" = "commit" && continue # ignore submodules + test "$type" = "blob" && { # handle if $dir is a file + printf "%s %s %s\t%s\0" "$mode" "$type" "$tree" "${name##*/}" | git mktree -z + break + } echo $tree break done -- 2.9.3
Re: [PATCH v2] t/Makefile: add a rule to re-run previously-failed tests
On Mon, Aug 29, 2016 at 03:46:05PM +0200, Johannes Schindelin wrote: > Note that we need to be careful to inspect only the *newest* entries in > test-results/: this directory contains files of the form > t<name>-<pid>.counts and is only removed wholesale when running the > *entire* test suite, not when running individual tests. We ensure that > with a little sed magic on `ls -t`'s output that simply skips lines > when the file name was seen earlier. Hmm, interesting. Your approach seems reasonable, but I have to wonder if writing the PID in the first place is sane. I started to write up my reasoning in this email, but realized it was rapidly becoming the content of a commit message. So here is that commit. -- >8 -- Subject: [PATCH] test-lib: drop PID from test-results/*.counts Each test run generates a "counts" file in t/test-results that stores the number of successful, failed, etc. tests. If you run "t1234-foo.sh", that file is named "t/test-results/t1234-foo-$$.counts". The addition of the PID there is serving no purpose, and makes analysis of the counts files harder. The presence of the PID dates back to 2d84e9f (Modify test-lib.sh to output stats to t/test-results/*, 2008-06-08), but no reasoning is given there. Looking at the current code, we can see that other files we write to test-results (like *.exit and *.out) do _not_ have the PID included. So the presence of the PID does not meaningfully allow one to store the results from multiple runs anyway. Moreover, anybody wishing to read the *.counts files to aggregate results has to deal with the presence of multiple files for a given test (and figure out which one is the most recent based on their timestamps!). The only consumer of these files is the aggregate.sh script, which arguably gets this wrong. If a test is run multiple times, its counts will appear multiple times in the total (I say arguably only because the desired semantics aren't documented anywhere, but I have trouble seeing how this behavior could be useful).
So let's just drop the PID, which fixes aggregate.sh, and will make new features based around the count files easier to write. Note that since the count-file may already exist (when re-running a test), we also switch the "cat" from appending to truncating. The use of append here was pointless in the first place, as we expected to always write to a unique file. Signed-off-by: Jeff King --- The presence of the append, combined with the way aggregate.sh is written makes me wonder if the intent was to store multiple run results for each test in a single file (and aggregate would just report the last one). Which _still_ makes the use of the PID wrong. But again, I don't see much use for it. t/test-lib.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/t/test-lib.sh b/t/test-lib.sh index d731d66..eada492 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -687,9 +687,9 @@ test_done () { test_results_dir="$TEST_OUTPUT_DIRECTORY/test-results" mkdir -p "$test_results_dir" base=${0##*/} - test_results_path="$test_results_dir/${base%.sh}-$$.counts" + test_results_path="$test_results_dir/${base%.sh}.counts" - cat >>"$test_results_path" <<-EOF + cat >"$test_results_path" <<-EOF total $test_count success $test_success fixed $test_fixed -- 2.10.0.rc2.123.ga991f9e
Re: [PATCH v4 0/3] diff-highlight: add support for --graph option
On Mon, Aug 29, 2016 at 10:33:44AM -0700, Brian Henderson wrote: > How does this look? > > Drawing the graph helped me a lot in figuring out what I was actually > testing. thanks! > > Brian Henderson (3): > diff-highlight: add some tests. > diff-highlight: add failing test for handling --graph output. > diff-highlight: add support for --graph output. These all look good to me. As Junio mentioned, you need to add a signoff (see the section "Sign your work" in Documentation/SubmittingPatches). Also, a minor nit, but we don't usually put a "." at the end of our commit message subjects (not worth a re-roll on its own, but if you are sending them again anyway...). -Peff
Re: [PATCH v2 03/14] sequencer: lib'ify write_message()
Hi Junio, On Mon, 29 Aug 2016, Junio C Hamano wrote: > Johannes Schindelin writes: > > > Instead of dying there, let the caller high up in the callchain > > notice the error and handle it (by dying, still). > > > > The only caller of write_message(), do_pick_commit() already checks > > the return value and passes it on to its callers, so its caller must > > be already prepared to handle error returns, and with this step, we > > make it notice an error return from this function. > > ... > > @@ -223,7 +226,7 @@ static int fast_forward_to(const unsigned char *to, > > const unsigned char *from, > > > > read_cache(); > > if (checkout_fast_forward(from, to, 1)) > > - exit(128); /* the callee should have complained already */ > > + return -1; /* the callee should have complained already */ > > This hunk does not seem to have anything to do with write_message() > conversion, other than that its only caller, do_pick_commit(), also > calls write_message(). Darn. Sorry that this slipped through. I split it out into its own commit, and fixed checkout_fast_forward_to() in yet another commit. Ciao, Dscho
Re: [PATCH 05/22] sequencer: allow the sequencer to take custody of malloc()ed data
Hi Hannes,

On Tue, 30 Aug 2016, Johannes Sixt wrote:

> Am 29.08.2016 um 23:59 schrieb Jakub Narębski:
> > W dniu 29.08.2016 o 10:04, Johannes Schindelin pisze:
> > > -#define REPLAY_OPTS_INIT { -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NULL, NULL, NULL, 0, 0, NULL }
> > > +#define REPLAY_OPTS_INIT { -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NULL, NULL, NULL, 0, 0, NULL, NULL, 0, 0 }
> >
> > Nb. it is a pity that we cannot use named initializers for structs,
> > so called designated inits. It would make this macro more readable.
>
> It is actually pointless to add the 0's and NULL's here. This should be
> sufficient:
>
>     #define REPLAY_OPTS_INIT { -1, -1 }
>
> because initialization with 0 (or NULL) is the default for any omitted
> members.

D'oh. You're right. The same applies to TODO_LIST_INIT, of course.

Fixed,
Johannes
Re: [PATCH 05/22] sequencer: allow the sequencer to take custody of malloc()ed data
Hi Kuba,

On Mon, 29 Aug 2016, Jakub Narębski wrote:

> W dniu 29.08.2016 o 10:04, Johannes Schindelin pisze:
>
> > The sequencer is our attempt to lib-ify cherry-pick. Yet it behaves
> > like a one-shot command when it reads its configuration: memory is
> > allocated and released only when the command exits.
> >
> > This is kind of okay for git-cherry-pick, which *is* a one-shot
> > command. All the work to make the sequencer its work horse was done to
> > allow using the functionality as a library function, though, including
> > proper clean-up after use.
> >
> > This patch introduces an API to pass the responsibility of releasing
> > certain memory to the sequencer.
>
> So how this API would be / is meant to be used?

I added an example to the commit message.

> Would sequencer as a library function be called multiple times,
> or only once?

The point of a library function is that it should not care.

> I'm trying to find out how this is solved in other places of Git
> code, and I have stumbled upon free_util in string_list...

I wanted this to be flexible enough to take care of any type of data,
not just strings. And while the string_list has a void *util field, it
would be rather silly to add strings to a string list for the sole
purpose of free()ing their util fields in the end.

(That was the conclusion I came to after a search of my own.)

> > +void *sequencer_entrust(struct replay_opts *opts, void *set_me_free_after_use)
> > +{
> > +	ALLOC_GROW(opts->owned, opts->owned_nr + 1, opts->owned_alloc);
> > +	opts->owned[opts->owned_nr++] = set_me_free_after_use;
> > +
> > +	return set_me_free_after_use;
>
> I was wondering what this 'set_me_free_after_use' parameter is about;
> wouldn't it be more readable if this parameter was called 'owned_data'
> or 'owned_ptr'?

If I read "owned_ptr" as a function's parameter, I would assume that
the associated memory is owned by the caller. So I would be puzzled
reading that name.
> >  static void remove_sequencer_state(const struct replay_opts *opts)
> >  {
> >  	struct strbuf dir = STRBUF_INIT;
> > +	int i;
> > +
> > +	for (i = 0; i < opts->owned_nr; i++)
> > +		free(opts->owned[i]);
>
> I guess you can remove owned data in any order, regardless if you
> store struct or its members first...

Indeed, this is not like a C++ destructor. It's free().

> > diff --git a/sequencer.h b/sequencer.h
> > index c955594..20b708a 100644
> > --- a/sequencer.h
> > +++ b/sequencer.h
> > @@ -43,8 +43,14 @@ struct replay_opts {
> >
> >  	/* Only used by REPLAY_NONE */
> >  	struct rev_info *revs;
> > +
> > +	/* malloc()ed data entrusted to the sequencer */
> > +	void **owned;
> > +	int owned_nr, owned_alloc;
>
> I'm not sure about naming conventions for those types of data, but
> wouldn't 'owned_data' be a better name? I could be wrong here...

The convention seemed to be "void *X; int X_nr, X_alloc;", so I stuck
with it.

Thanks for your review!
Johannes