Re: [PATCH v3 2/5] log: honor log.merges= option
On Mon, Apr 13, 2015 at 8:29 AM, Koosha Khajehmoogahi koo...@posteo.de wrote: From: Junio C Hamano gits...@pobox.com Let's not call this *my* change. The mechanism I did write, but I think the most important part of the change is to decide which commands in the log family should honor the configuration variable and which ones should not. As I said in the beginning, I am not convinced that letting git log and git log alone is the right choice. Make it Helped-by or something instead, to make it clear that I helped the implementation but not the design. The config. variable is honored only by log. Other log family commands including show, shortlog and rev-list are affected only from the equivalent command line option (i.e. --merges=). Since these commands are somehow different representations of log command, we like to have the same behavior from them as we do from log itself. However, to not break the default output of them, we do not consider the the config. That is a weak argument, isn't it? If we really wanted not to break the default output, then git log should not honor the configuration variable, either. And I do not agree with we like to have the same behaviour in the first place. My preference, as I already hinted in previous review rounds, is: - log and whatchanged are the same things and should behave the same way with respect to this configuration variable; - rev-list is plumbing and should never pay attention to a UI level configuration variable like this one; - show is given exact commit object(s) and is asked to show them; not showing what was explicitly asked to be shown only because the user has do not show merges configuration does not make much sense, so the configuration variable should be ignored; - The revision graph traversal done by format-patch and cherry-pick is primarily to determine the set of commits to process and of a different nature from the traversal done by log, even though they share the same underlying code. It is not appropriate to pay attention to this configuration variable. - I am neutral as to shortlog. As long as it pays attention to the new command line override, I think it is OK if it paid attention to the configuration variable, primarily because people most of the time use shortlog --no-merges with the current version of Git _anyway_. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: PATH modifications for git-hook processes
On Thu, Apr 16, 2015 at 02:17:32AM -0400, Jeff King wrote: So it uses a special git-specific version of exec, and doesn't touch the PATH. Later, we did 231af83 (Teach the git command to handle some commands internally, 2006-02-26), which says at the end: There's one other change: the search order for external programs is modified slightly, so that the first entry remains GIT_EXEC_DIR, but the second entry is the same directory as the git wrapper itself was executed out of - if we can figure it out from argv[0], of course. There was some question of whether this would be a problem[1], but we realized it is OK due to 77cb17e, mentioned above. I forgot to include my footnote, which was a link to the thread: http://thread.gmane.org/gmane.comp.version-control.git/16798 -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: PATH modifications for git-hook processes
On Wed, Apr 15, 2015 at 11:00:20AM -0400, Matthew Rothenberg wrote: So, I guess what I'm looking for is: - A way to prevent the **path to the running git itself** portion of setup_path from firing, (OR) Yeah, this behavior leaves me somewhat confused. It is obviously dangerous (since we have no clue what else is in the directory along with git, as opposed to the exec-path, which contains only git programs), and that is biting you here. But is it actually doing any good? Surely we don't need to find the git executable from that directory; it should already be in the exec-path itself. Ditto for any dashed commands (built-ins or external commands) that git ships. The only case I could come up with (that works now, but would not work if we stopped prepending the argv0 directory to the PATH) is: mkdir foo cp $(which git) foo cat foo/git-bar \EOF #!/bin/sh echo this external command is not in your PATH! EOF chmod +x foo/git-bar foo/git bar but I am not sure that is altogether sane. I tried to find some reasoning for this behavior in the history, and this is what I found. We originally started treating git programs as magical in 77cb17e (Exec git programs without using PATH., 2006-01-10), but it specifically warns against this: Modifying PATH is not desirable as it result in behavior differing from the user's intentions, as we may end up prepending /usr/bin to PATH. So it uses a special git-specific version of exec, and doesn't touch the PATH. Later, we did 231af83 (Teach the git command to handle some commands internally, 2006-02-26), which says at the end: There's one other change: the search order for external programs is modified slightly, so that the first entry remains GIT_EXEC_DIR, but the second entry is the same directory as the git wrapper itself was executed out of - if we can figure it out from argv[0], of course. There was some question of whether this would be a problem[1], but we realized it is OK due to 77cb17e, mentioned above. The use case described by Linus is more or less the not altogether sane scenario I described above: IOW, you can do things like alias git=/opt/my-git/git and all the git commands will automatically work fine, even if you didn't know at compile time where you would install them, and you didn't set GIT_EXEC_DIR at run-time. It will still first look in GIT_EXEC_DIR, but if that fails, it will take the git commands from /opt/my-git/ instead of from /usr/bin or whatever. I think this is pretty much a non-issue for stock git commands (we do not even put them into /opt/my-git these days, so it cannot be helping there). It's really about finding third-party commands you have added. Later we did 511707d (use only the $PATH for exec'ing git commands, 2007-10-28) which went back to munging the PATH, and dropping the protection we relied on when doing 231af83 (and causing the breakage you're seeing today). So I dunno. I find the current behavior quite questionable. I guess I can see it coming in handy, but as you note it can also cause problems. And put the git-* programs somewhere in your PATH or git's exec-dir if you want to use them does not seem like that bad a thing to require people to do. - A way to specify (via env var?) paths that must remain in high precedence even during a git-exec, e.g.: NEWPATH = [git --exec-path] : [GIT_EXEC_PATH] : [$PROPOSED_HIGH_PRECENDENCE_PATH] : ['git itself' path] : [$PATH] (OR) - A way to refine git-exec default behavior to avoid this edge case (my C is a little rusty but I'm happy to try to help out if we can think of a good logic), (OR) - Something else clever I am not aware of. If we can get away with just dropping this element from the PATH, I'd much rather do that than try to implement a complicated path-precedence scheme. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 4/5] t3904-stash-patch: factor PERL prereq at the top of the file
Signed-off-by: Matthieu Moy matthieu@imag.fr --- t/t3904-stash-patch.sh | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/t/t3904-stash-patch.sh b/t/t3904-stash-patch.sh index 9a59683..0f8f47f 100755 --- a/t/t3904-stash-patch.sh +++ b/t/t3904-stash-patch.sh @@ -3,7 +3,13 @@ test_description='stash -p' . ./lib-patch-mode.sh -test_expect_success PERL 'setup' ' +if ! test_have_prereq PERL +then + skip_all='skipping stash -p tests, perl not available' + test_done +fi + +test_expect_success 'setup' ' mkdir dir echo parent dir/foo echo dummy bar @@ -20,7 +26,7 @@ test_expect_success PERL 'setup' ' # note: order of files with unstaged changes: HEAD bar dir/foo -test_expect_success PERL 'saying n does nothing' ' +test_expect_success 'saying n does nothing' ' set_state HEAD HEADfile_work HEADfile_index set_state dir/foo work index (echo n; echo n; echo n) | test_must_fail git stash save -p @@ -29,7 +35,7 @@ test_expect_success PERL 'saying n does nothing' ' verify_state dir/foo work index ' -test_expect_success PERL 'git stash -p' ' +test_expect_success 'git stash -p' ' (echo y; echo n; echo y) | git stash save -p verify_state HEAD committed HEADfile_index verify_saved_state bar @@ -41,7 +47,7 @@ test_expect_success PERL 'git stash -p' ' verify_state dir/foo work head ' -test_expect_success PERL 'git stash -p --no-keep-index' ' +test_expect_success 'git stash -p --no-keep-index' ' set_state HEAD HEADfile_work HEADfile_index set_state bar bar_work bar_index set_state dir/foo work index @@ -56,7 +62,7 @@ test_expect_success PERL 'git stash -p --no-keep-index' ' verify_state dir/foo work index ' -test_expect_success PERL 'git stash --no-keep-index -p' ' +test_expect_success 'git stash --no-keep-index -p' ' set_state HEAD HEADfile_work HEADfile_index set_state bar bar_work bar_index set_state dir/foo work index @@ -71,7 +77,7 @@ test_expect_success PERL 'git stash --no-keep-index -p' ' verify_state dir/foo work index ' -test_expect_success PERL 'none of this moved HEAD' ' +test_expect_success 'none of this moved HEAD' ' verify_saved_head ' -- 2.4.0.rc1.42.g9642cc6 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 2/5] add -p: demonstrate failure when running 'edit' after a split
The test passes if one replaces the 'e' command with a 'y' command in the 'add -p' session. Reported-by: Tanky Woo wtq1...@gmail.com Signed-off-by: Matthieu Moy matthieu@imag.fr --- t/t3701-add-interactive.sh | 22 ++ 1 file changed, 22 insertions(+) diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh index b63b9d4..deae948 100755 --- a/t/t3701-add-interactive.sh +++ b/t/t3701-add-interactive.sh @@ -332,6 +332,28 @@ test_expect_success 'split hunk add -p (edit)' ' ! grep ^+15 actual ' +test_expect_failure 'split hunk add -p (no, yes, edit)' ' + cat test -\EOF + 5 + 10 + 20 + 21 + 30 + 31 + 40 + 50 + 60 + EOF + git reset + # test sequence is s(plit), n(o), y(es), e(dit) + # q n q q is there to make sure we exit at the end. + printf %s\n s n y e q n q q | + EDITOR=: git add -p 2error + test_must_be_empty error + git diff actual + ! grep ^+31 actual +' + test_expect_success 'patch mode ignores unmerged entries' ' git reset --hard test_commit conflict -- 2.4.0.rc1.42.g9642cc6 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 3/5] t3904-stash-patch: fix test description
The old description is rather clearly a wrong cut-and-paste from t2016-checkout-patch.sh. Signed-off-by: Matthieu Moy matthieu@imag.fr --- t/t3904-stash-patch.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/t3904-stash-patch.sh b/t/t3904-stash-patch.sh index 70655c1..9a59683 100755 --- a/t/t3904-stash-patch.sh +++ b/t/t3904-stash-patch.sh @@ -1,6 +1,6 @@ #!/bin/sh -test_description='git checkout --patch' +test_description='stash -p' . ./lib-patch-mode.sh test_expect_success PERL 'setup' ' -- 2.4.0.rc1.42.g9642cc6 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 1/5] t3701-add-interactive: simplify code
Signed-off-by: Matthieu Moy matthieu@imag.fr --- t/t3701-add-interactive.sh | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh index 24ddd8a..b63b9d4 100755 --- a/t/t3701-add-interactive.sh +++ b/t/t3701-add-interactive.sh @@ -326,10 +326,7 @@ test_expect_success 'split hunk add -p (edit)' ' # 2. Correct version applies the (not)edited version, and asks #about the next hunk, against which we say q and program #exits. - for a in s e q n q q - do - echo $a - done | + printf %s\n s e q n q q | EDITOR=: git add -p git diff actual ! grep ^+15 actual -- 2.4.0.rc1.42.g9642cc6 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 5/5] stash -p: demonstrate failure of split with mixed y/n
Signed-off-by: Matthieu Moy matthieu@imag.fr --- t/t3904-stash-patch.sh | 23 +++ 1 file changed, 23 insertions(+) diff --git a/t/t3904-stash-patch.sh b/t/t3904-stash-patch.sh index 0f8f47f..38e7300 100755 --- a/t/t3904-stash-patch.sh +++ b/t/t3904-stash-patch.sh @@ -81,4 +81,27 @@ test_expect_success 'none of this moved HEAD' ' verify_saved_head ' +test_expect_failure 'stash -p with split hunk' ' + git reset --hard + cat test -\EOF + aaa + bbb + ccc + EOF + git add test + git commit -m initial + cat test -\EOF + aaa + added line 1 + bbb + added line 2 + ccc + EOF + printf %s\n s n y q | + test_might_fail git stash -p 2error + ! test_must_be_empty error + grep added line 1 test + ! grep added line 2 test +' + test_done -- 2.4.0.rc1.42.g9642cc6 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[bug] first line truncated with `git log --oneline --decorate --graph`
I have a bug using the following command: git log --oneline --decorate --graph In short, the first line of the log is often truncated. If the terminal is too small to display all the text (~100 columns, and too much text to fit on the high of screen, in this example less than 4 lines), I actually see: If my history is: * 4656b73 (HEAD, long_branch_name_and_long_commit_name) Merge commit 'f7f6e4736ad040a1238644a33b681a66c79fac0e' into HEAD |\ | * f7f6e47 (master) Praesent et diam eget libero egestas mattis sit amet vitae augue. | * 8dccb9d Maecenas congue ligula ac quam viverra nec consectetur ante hendrerit. |/ * f7724a2 Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec a diam lectus. If my screen is not high enough, I see: 1238644a33b681a66c79fac0e' into HEAD |\ | * f7f6e47 (master) Praesent et diam eget libero egestas mattis sit amet vitae augue. | * 8dccb9d Maecenas congue ligula ac quam viverra nec consectetur ante hendrerit. # next message truncated, because the screen is too small instead of: * 4656b73 (HEAD, long_branch_name_and_long_commit_name) Merge commit 'f7f6e4736ad040a1238644a33b681a66c79fac0e' into HEAD |\ | * f7f6e47 (master) Praesent et diam eget libero egestas mattis sit amet vitae augue. | * 8dccb9d Maecenas congue ligula ac quam viverra nec consectetur ante hendrerit. # next message truncated, because the screen is too small As you can see, the first line is truncated. If I use: git log --pretty=oneline --all --decorate --graph instead of: git log --oneline --all --decorate --graph the problem is the same (first line truncated). I have see that bug using terminology or xterm, with bash (without .bashrc) or zsh (lots of things in my .zshrc), and my desktop environment is i3/archlinux/git version 2.3.5 It is not a recent regression (if it is, I've learn git one year ago). # How to reproduce Open a small terminal windows (4*100) mkdir tmp cd tmp git init git commit --allow-empty -m 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec a diam lectus.' git checkout -b long_branch_name_and_long_commit_name git commit --allow-empty -m 'Maecenas congue ligula ac quam viverra nec consectetur ante hendrerit.' git commit --allow-empty -m 'Praesent et diam eget libero egestas mattis sit amet vitae augue.' git checkout master git merge --no-ff long_branch_name_and_long_commit_name -m 'merge with a long commit message' git checkout long_branch_name_and_long_commit_name git merge master git log --oneline --decorate --graph I hope it is clear. The English is not my mother tongue. -- Robin Moussu // signature.asc Description: OpenPGP digital signature
Re: Git + SFC Status Update
- Written policy: https://git-scm.com/trademark My browser shows an Untrusted Connection page when click the link. Does git-scm have no TLS certificate? -- View this message in context: http://git.661346.n2.nabble.com/Git-SFC-Status-Update-tp7628669p7628850.html Sent from the git mailing list archive at Nabble.com. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/9] config: use getc_unlocked when reading from file
We read config files character-by-character from a stdio handle using fgetc(). This incurs significant locking overhead, even though we know that only one thread can possibly access the handle. We can speed this up by taking the lock ourselves, and then using getc_unlocked to read each character. On a silly pathological case: perl -le ' print [core]; print key$_ = value$_ for (1..100) ' input git config -f input core.key1 this dropped the time to run git-config from: real0m0.263s user0m0.260s sys 0m0.000s to: real0m0.159s user0m0.152s sys 0m0.004s for a savings of 39%. Most config files are not this big, but the savings should be proportional to the size of the file (i.e., we always save 39%, just of a much smaller number). Signed-off-by: Jeff King p...@peff.net --- config.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/config.c b/config.c index 66c0a51..8b297fc 100644 --- a/config.c +++ b/config.c @@ -49,7 +49,7 @@ static struct config_set the_config_set; static int config_file_fgetc(struct config_source *conf) { - return fgetc(conf-u.file); + return getc_unlocked(conf-u.file); } static int config_file_ungetc(int c, struct config_source *conf) @@ -1088,7 +1088,9 @@ int git_config_from_file(config_fn_t fn, const char *filename, void *data) f = fopen(filename, r); if (f) { + flockfile(f); ret = do_config_from_file(fn, filename, filename, f, data); + funlockfile(f); fclose(f); } return ret; -- 2.4.0.rc2.384.g7297a4a -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 6/9] strbuf_getwholeline: avoid calling strbuf_grow
As with the recent speedup to strbuf_addch, we can avoid calling strbuf_grow() in a tight loop of single-character adds by instead checking strbuf_avail. Note that we would instead call strbuf_addch directly here, but it does more work than necessary: it will NUL-terminate the result for each character read. Instead, in this loop we read the characters one by one and then add the terminator manually at the end. Running git rev-parse refs/heads/does-not-exist on a repo with an extremely large (1.6GB) packed-refs file went from (best-of-5): real0m10.948s user0m10.548s sys 0m0.412s to: real0m8.601s user0m8.084s sys 0m0.524s for a wall-clock speedup of 21%. Helped-by: Eric Sunshine sunsh...@sunshineco.com Signed-off-by: Jeff King p...@peff.net --- Our don't write a NUL for each character optimization is only possible because we're intimate with the strbuf details here. I thought about making a strbuf_addch_unsafe interface to let other callers do this, too. But the only other caller that would use it is the config reader, and I measured only a 3% speedup there. Which I don't think is worth the extra API complexity. Whereas here it does make a big difference. Switching to strbuf_addch knocks us back up into the 9.5s range. I think the difference is that our lines are much longer than the tokens we're parsing in the config file. So the percentage of wasted NUL writes is much higher here. strbuf.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/strbuf.c b/strbuf.c index af2bad4..921619e 100644 --- a/strbuf.c +++ b/strbuf.c @@ -445,7 +445,8 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term) strbuf_reset(sb); flockfile(fp); while ((ch = getc_unlocked(fp)) != EOF) { - strbuf_grow(sb, 1); + if (!strbuf_avail(sb)) + strbuf_grow(sb, 1); sb-buf[sb-len++] = ch; if (ch == term) break; -- 2.4.0.rc2.384.g7297a4a -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Git + SFC Status Update
On Thu, Apr 16, 2015 at 01:37:54AM -0700, Yi EungJun wrote: - Written policy: https://git-scm.com/trademark My browser shows an Untrusted Connection page when click the link. Does git-scm have no TLS certificate? Hmm, you're right. It has a *.herokuapp.com cert. That's fine for hitting https://git-scm.herokuapp.com, but obviously not when you use the custom domain. I think I manually accepted the cert long ago and forgot about it. Looks like somebody opened https://github.com/git/git-scm.com/issues/489 but nobody has dealt with it. I have no idea what the recommended setup is for dealing with TLS and custom domains on Heroku. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 0/9] address packed-refs speed regressions
On Sat, Apr 04, 2015 at 09:06:11PM -0400, Jeff King wrote: As I've mentioned before, I have some repositories with rather large numbers of refs. The worst one has ~13 million refs, for a 1.6GB packed-refs file. So I was saddened by this: $ time git.v2.0.0 rev-parse refs/heads/foo /dev/null 21 real0m6.840s user0m6.404s sys 0m0.440s $ time git.v2.4.0-rc1 rev-parse refs/heads/foo /dev/null 21 real0m19.432s user0m18.996s sys 0m0.456s Here's a re-roll incorporating feedback from the list. Thanks everybody for your comments. Last time the final number was ~8.5s, which was disappointingly slower than v2.0.0. In this iteration, my final numbers are: real0m5.703s user0m5.276s sys 0m0.432s which is quite pleasing. The big changes that resulted in this additional speedup are: 1. Use getdelim() when it is available. This is much faster than even a getc_unlocked() loop. 2. The slowdown from d0f810f was from adding in refname_is_safe calls. But what I didn't notice before is that we run them in _addition_ to check_refname_format, rather than instead of it. So in the common case of a sanely-formatted refname, we can skip the call, rather than writing a lot of code to micro-optimize it. It was also mentioned in a nearby thread that the config code could benefit from some of the same micro-optimizations. It can't make use of getdelim(), as it really does want to do character-by-character parsing. But it can still use getc_unlocked() and the strbuf_avail() trick, which speeds up config reading by 47%. Those patches are included here. [1/9]: strbuf_getwholeline: use getc macro [2/9]: git-compat-util: add fallbacks for unlocked stdio [3/9]: strbuf_getwholeline: use getc_unlocked [4/9]: config: use getc_unlocked when reading from file [5/9]: strbuf_addch: avoid calling strbuf_grow [6/9]: strbuf_getwholeline: avoid calling strbuf_grow [7/9]: strbuf_getwholeline: use getdelim if it is available [8/9]: read_packed_refs: avoid double-checking sane refs [9/9]: t1430: add another refs-escape test -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 7/9] strbuf_getwholeline: use getdelim if it is available
We spend a lot of time in strbuf_getwholeline in a tight loop reading characters from a stdio handle into a buffer. The libc getdelim() function can do this for us with less overhead. It's in POSIX.1-2008, and was a GNU extension before that. Therefore we can't rely on it, but can fall back to the existing getc loop when it is not available. The HAVE_GETDELIM knob is turned on automatically for Linux, where we have glibc. We don't need to set any new feature-test macros, because we already define _GNU_SOURCE. Other systems that implement getdelim may need to other macros (probably _POSIX_C_SOURCE = 200809L), but we can address that along with setting the Makefile knob after testing the feature on those systems. Running git rev-parse refs/heads/does-not-exist on a repo with an extremely large (1.6GB) packed-refs file went from (best-of-5): real0m8.601s user0m8.084s sys 0m0.524s to: real0m6.768s user0m6.340s sys 0m0.432s for a wall-clock speedup of 21%. Based on a patch from Rasmus Villemoes r...@rasmusvillemoes.dk. Signed-off-by: Jeff King p...@peff.net --- If somebody has a FreeBSD or OS X system to test on, I'd love to see what is needed to compile with HAVE_GETDELIM there. And to confirm that the performance is much better. Sharing my 1.6GB packed-refs file would be hard, but you should be able to generate something large and ridiculous. I'll leave that as an exercise to the reader. Makefile | 6 ++ config.mak.uname | 1 + strbuf.c | 42 ++ 3 files changed, 49 insertions(+) diff --git a/Makefile b/Makefile index 5f3987f..36655d5 100644 --- a/Makefile +++ b/Makefile @@ -359,6 +359,8 @@ all:: # compiler is detected to support it. # # Define HAVE_BSD_SYSCTL if your platform has a BSD-compatible sysctl function. +# +# Define HAVE_GETDELIM if your system has the getdelim() function. GIT-VERSION-FILE: FORCE @$(SHELL_PATH) ./GIT-VERSION-GEN @@ -1437,6 +1439,10 @@ ifdef HAVE_BSD_SYSCTL BASIC_CFLAGS += -DHAVE_BSD_SYSCTL endif +ifdef HAVE_GETDELIM + BASIC_CFLAGS += -DHAVE_GETDELIM +endif + ifeq ($(TCLTK_PATH),) NO_TCLTK = NoThanks endif diff --git a/config.mak.uname b/config.mak.uname index f4e77cb..d26665f 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -36,6 +36,7 @@ ifeq ($(uname_S),Linux) HAVE_DEV_TTY = YesPlease HAVE_CLOCK_GETTIME = YesPlease HAVE_CLOCK_MONOTONIC = YesPlease + HAVE_GETDELIM = YesPlease endif ifeq ($(uname_S),GNU/kFreeBSD) HAVE_ALLOCA_H = YesPlease diff --git a/strbuf.c b/strbuf.c index 921619e..0d4f4e5 100644 --- a/strbuf.c +++ b/strbuf.c @@ -435,6 +435,47 @@ int strbuf_getcwd(struct strbuf *sb) return -1; } +#ifdef HAVE_GETDELIM +int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term) +{ + ssize_t r; + + if (feof(fp)) + return EOF; + + strbuf_reset(sb); + + /* Translate slopbuf to NULL, as we cannot call realloc on it */ + if (!sb-alloc) + sb-buf = NULL; + r = getdelim(sb-buf, sb-alloc, term, fp); + + if (r 0) { + sb-len = r; + return 0; + } + assert(r == -1); + + /* +* Normally we would have called xrealloc, which will try to free +* memory and recover. But we have no way to tell getdelim() to do so. +* Worse, we cannot try to recover ENOMEM ourselves, because we have +* no idea how many bytes were read by getdelim. +* +* Dying here is reasonable. It mirrors what xrealloc would do on +* catastrophic memory failure. We skip the opportunity to free pack +* memory and retry, but that's unlikely to help for a malloc small +* enough to hold a single line of input, anyway. +*/ + if (errno == ENOMEM) + die(Out of memory, getdelim failed); + + /* Restore slopbuf that we moved out of the way before */ + if (!sb-buf) + strbuf_init(sb, 0); + return EOF; +} +#else int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term) { int ch; @@ -458,6 +499,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term) sb-buf[sb-len] = '\0'; return 0; } +#endif int strbuf_getline(struct strbuf *sb, FILE *fp, int term) { -- 2.4.0.rc2.384.g7297a4a -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 0/9] address packed-refs speed regressions
On Thu, Apr 16, 2015 at 04:47:34AM -0400, Jeff King wrote: Here's a re-roll incorporating feedback from the list. Thanks everybody for your comments. Last time the final number was ~8.5s, which was disappointingly slower than v2.0.0. In this iteration, my final numbers are: real0m5.703s user0m5.276s sys 0m0.432s which is quite pleasing. I forgot to mention what I _didn't_ include. We discussed using mmap instead of stdio. Applying René's mmap patch drops my best-of-five here to 5.114s. Which is nice, but it is a bit more invasive (and does not help other callers of strbuf_getline). If I further apply my really nasty patch to avoid the strbuf entirely (i.e., we parse straight out of the mmap), I can get it down to 4.835s. I don't know if the complexity is worth it or not. Ultimately, I think the best route to making packed-refs faster is to drop the whole ref_cache structure and just iterate directly over the mmap data. It would use less RAM, and it opens the possibility of binary-searching to look at only a subset of the entries. That's a _lot_ more invasive, though. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/9] git-compat-util: add fallbacks for unlocked stdio
POSIX.1-2001 specifies some functions for optimizing the locking out of tight getc() loops. Not all systems are POSIX, though, and even not all POSIX systems are required to implement these functions. We can check for the feature-test macro to see if they are available, and if not, provide a noop implementation. There's no Makefile knob here, because we should just detect this automatically. If there are very bizarre systems, we may need to add one, but it's not clear yet in which direction: 1. If a system defines _POSIX_THREAD_SAFE_FUNCTIONS but these functions are missing or broken, we would want a knob to manually turn them off. 2. If a system has these functions but does not define _POSIX_THREAD_SAFE_FUNCTIONS, we would want a knob to manually turn them on. We can add such a knob when we find a real-world system that matches this. Signed-off-by: Jeff King p...@peff.net --- git-compat-util.h | 6 ++ 1 file changed, 6 insertions(+) diff --git a/git-compat-util.h b/git-compat-util.h index bc8fc8c..685a0a4 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -883,4 +883,10 @@ struct tm *git_gmtime_r(const time_t *, struct tm *); # define SHELL_PATH /bin/sh #endif +#ifndef _POSIX_THREAD_SAFE_FUNCTIONS +#define flockfile(fh) +#define funlockfile(fh) +#define getc_unlocked(fh) getc(fh) +#endif + #endif -- 2.4.0.rc2.384.g7297a4a -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/9] strbuf_getwholeline: use getc_unlocked
strbuf_getwholeline calls getc in a tight loop. On modern libc implementations, the stdio code locks the handle for every operation, which means we are paying a significant overhead. We can get around this by locking the handle for the whole loop and using the unlocked variant. Running git rev-parse refs/heads/does-not-exist on a repo with an extremely large (1.6GB) packed-refs file went from: real0m18.900s user0m18.472s sys 0m0.448s to: real0m10.953s user0m10.384s sys 0m0.580s for a wall-clock speedup of 42%. All times are best-of-3, and done on a glibc 2.19 system. Note that we call into strbuf_grow while holding the lock. It's possible for that function to call other stdio functions (e.g., printing to stderr when dying due to malloc error); however, the POSIX.1-2001 definition of flockfile makes it clear that the locks are per-handle, so we are fine unless somebody else tries to read from our same handle. This doesn't ever happen in the current code, and is unlikely to be added in the future (we would have to do something exotic like add a die_routine that tried to read from stdin). Signed-off-by: Jeff King p...@peff.net --- strbuf.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/strbuf.c b/strbuf.c index 14f337d..af2bad4 100644 --- a/strbuf.c +++ b/strbuf.c @@ -443,12 +443,14 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term) return EOF; strbuf_reset(sb); - while ((ch = getc(fp)) != EOF) { + flockfile(fp); + while ((ch = getc_unlocked(fp)) != EOF) { strbuf_grow(sb, 1); sb-buf[sb-len++] = ch; if (ch == term) break; } + funlockfile(fp); if (ch == EOF sb-len == 0) return EOF; -- 2.4.0.rc2.384.g7297a4a -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/9] strbuf_getwholeline: use getc macro
strbuf_getwholeline calls fgetc in a tight loop. Using the getc form, which can be implemented as a macro, should be faster (and we do not care about it evaluating our argument twice, as we just have a plain variable). On my glibc system, running git rev-parse refs/heads/does-not-exist on a file with an extremely large (1.6GB) packed-refs file went from (best of 3 runs): real0m19.383s user0m18.876s sys 0m0.528s to: real0m18.900s user0m18.472s sys 0m0.448s for a wall-clock speedup of 2.5%. Signed-off-by: Jeff King p...@peff.net --- strbuf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/strbuf.c b/strbuf.c index 88cafd4..14f337d 100644 --- a/strbuf.c +++ b/strbuf.c @@ -443,7 +443,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term) return EOF; strbuf_reset(sb); - while ((ch = fgetc(fp)) != EOF) { + while ((ch = getc(fp)) != EOF) { strbuf_grow(sb, 1); sb-buf[sb-len++] = ch; if (ch == term) -- 2.4.0.rc2.384.g7297a4a -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
Hi all, over the years I've had the same phenomenon with various versions of msysgit (now at 1.9.5.msysgit.0, on Windows 7 64bit), so I'm now sufficiently confident of it being a long-standing, longer-term issue and thus I'm reporting it now. Since I'm doing development in a sufficiently rebase-heavy manner, I seem to aggregate a lot of objects. Thus, when fetching content I'm sufficiently frequently greeted with a git gc run. This, however, does not work fully reliably: Auto packing the repository for optimum performance. You may also run git gc manually. See git help gc for more information. Counting objects: 206527, done. Delta compression using up to 4 threads. Compressing objects: 100% (27430/27430), done. Writing objects: 100% (206527/206527), done. Total 206527 (delta 178632), reused 206527 (delta 178632) Unlink of file '.git/objects/pack/pack-ab1712db0a94b5c55538d3b4cb3660cedc264c3c.pack' failed. Should I try again? (y/n) n Unlink of file '.git/objects/pack/pack-ab1712db0a94b5c55538d3b4cb3660cedc264c3c.idx' failed. Should I try again? (y/n) n Checking connectivity: 206527, done. A workable workaround for this recurring issue (such a fetch will fail repeatedly, thereby hampering my ability to update properly) is to manually do a git gc --auto prior to the fetch (which will then succeed). -- ¿umop apisdn upside down? (by daniweb.com user Bench) -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 5/9] strbuf_addch: avoid calling strbuf_grow
We mark strbuf_addch as inline, because we expect it may be called from a tight loop. However, the first thing it does is call the non-inline strbuf_grow(), which can handle arbitrary-sized growth. Since we know that we only need a single character, we can use the inline strbuf_avail() to quickly check whether we need to grow at all. Our check is redundant when we do call strbuf_grow(), but that's OK. The common case is that we avoid calling it at all, and we have made that case faster. On a silly pathological case: perl -le ' print [core]; print key$_ = value$_ for (1..100) ' input git config -f input core.key1 this dropped the time to run git-config from: real0m0.159s user0m0.152s sys 0m0.004s to: real0m0.140s user0m0.136s sys 0m0.004s for a savings of 12%. Signed-off-by: Jeff King p...@peff.net --- I doubt anybody will really notice this in practice with config files, and for the most part we do not have tight loops of strbuf_addch elsewhere. But it is such an easy optimization, I'd rather do it now while we're thinking about it. strbuf.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/strbuf.h b/strbuf.h index 1883494..01c5c63 100644 --- a/strbuf.h +++ b/strbuf.h @@ -205,7 +205,8 @@ extern int strbuf_cmp(const struct strbuf *, const struct strbuf *); */ static inline void strbuf_addch(struct strbuf *sb, int c) { - strbuf_grow(sb, 1); + if (!strbuf_avail(sb)) + strbuf_grow(sb, 1); sb-buf[sb-len++] = c; sb-buf[sb-len] = '\0'; } -- 2.4.0.rc2.384.g7297a4a -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 9/9] t1430: add another refs-escape test
In t1430, we check whether deleting the branch ../../foo will delete .git/foo. However, this is not that interesting a test; the precious file .git/foo does not look like a ref, so even if we did not notice the escape from the refs/ hierarchy, we would fail for that reason (i.e., if you turned refname_is_safe into a noop, the test still passes). Let's add an additional test for the same thing, but with a file that actually looks like a ref. That will make sure we are exercising the refname_is_safe code. While we're at it, let's also make the code work a little harder by adding some extra paths and some empty path components. Signed-off-by: Jeff King p...@peff.net --- This was originally included to exercise refname_is_safe(), because in the v1 series I refactored it (here I just avoid calling it entirely). So it's not as important in v2. But AFAICT, we do not exercise refname_is_safe() at all in the test suite without this patch, so it's probably a good thing to do regardless. t/t1430-bad-ref-name.sh | 8 1 file changed, 8 insertions(+) diff --git a/t/t1430-bad-ref-name.sh b/t/t1430-bad-ref-name.sh index 468e856..16d0b8b 100755 --- a/t/t1430-bad-ref-name.sh +++ b/t/t1430-bad-ref-name.sh @@ -68,6 +68,14 @@ test_expect_success 'branch -D cannot delete non-ref in .git dir' ' test_cmp expect .git/my-private-file ' +test_expect_success 'branch -D cannot delete ref in .git dir' ' + git rev-parse HEAD .git/my-private-file + git rev-parse HEAD expect + git branch foo/legit + test_must_fail git branch -D foo./././../../../my-private-file + test_cmp expect .git/my-private-file +' + test_expect_success 'branch -D cannot delete absolute path' ' git branch -f extra test_must_fail git branch -D $(pwd)/.git/refs/heads/extra -- 2.4.0.rc2.384.g7297a4a -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 8/9] read_packed_refs: avoid double-checking sane refs
Prior to d0f810f (refs.c: allow listing and deleting badly named refs, 2014-09-03), read_packed_refs would barf on any malformed refnames by virtue of calling create_ref_entry with the check parameter set to 1. That commit loosened our reading so that we call check_refname_format ourselves and just set a REF_BAD_NAME flag. We then call create_ref_entry with the check parameter set to 0. That function learned to do an extra safety check even when the check parameter is 0, so that we don't load any dangerous refnames (like ../../../etc/passwd). This is implemented by calling refname_is_safe() in create_ref_entry(). However, we can observe that refname_is_safe() can only be true if check_refname_format() also failed. So in the common case of a sanely named ref, we perform _both_ checks, even though we know that the latter will never trigger. This has a noticeable performance impact when the packed-refs file is large. Let's drop the refname_is_safe check from create_ref_entry(), and make it the responsibility of the caller. Of the three callers that pass a check parameter of 0, two will have just called check_refname_format(), and can check the refname-safety only when it fails. The third case, pack_if_possible_fn, is copying from an existing ref entry, which must have previously passed our safety check. With this patch, running git rev-parse refs/heads/does-not-exist on a repo with a large (1.6GB) packed-refs file went from: real0m6.768s user0m6.340s sys 0m0.432s to: real0m5.703s user0m5.276s sys 0m0.432s for a wall-clock speedup of 15%. Signed-off-by: Jeff King p...@peff.net --- refs.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/refs.c b/refs.c index 47e4e53..f36ea75 100644 --- a/refs.c +++ b/refs.c @@ -344,8 +344,6 @@ static struct ref_entry *create_ref_entry(const char *refname, if (check_name check_refname_format(refname, REFNAME_ALLOW_ONELEVEL)) die(Reference has invalid format: '%s', refname); - if (!check_name !refname_is_safe(refname)) - die(Reference has invalid name: '%s', refname); len = strlen(refname) + 1; ref = xmalloc(sizeof(struct ref_entry) + len); hashcpy(ref-u.value.sha1, sha1); @@ -1178,6 +1176,8 @@ static void read_packed_refs(FILE *f, struct ref_dir *dir) int flag = REF_ISPACKED; if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL)) { + if (!refname_is_safe(refname)) + die(packed refname is dangerous: %s, refname); hashclr(sha1); flag |= REF_BAD_NAME | REF_ISBROKEN; } @@ -1323,6 +1323,8 @@ static void read_loose_refs(const char *dirname, struct ref_dir *dir) } if (check_refname_format(refname.buf, REFNAME_ALLOW_ONELEVEL)) { + if (!refname_is_safe(refname.buf)) + die(loose refname is dangerous: %s, refname.buf); hashclr(sha1); flag |= REF_BAD_NAME | REF_ISBROKEN; } -- 2.4.0.rc2.384.g7297a4a -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: support git+mosh for unreliable connections
Trevor Saunders venit, vidit, dixit 15.04.2015 20:59: On Wed, Apr 15, 2015 at 07:46:15PM +0200, Johannes Schindelin wrote: Hi Trevor, On 2015-04-15 17:33, Trevor Saunders wrote: On Wed, Apr 15, 2015 at 04:41:42PM +0200, Johannes Schindelin wrote: On 2015-04-15 16:18, Pirate Praveen wrote: On Wednesday 15 April 2015 07:22 PM, Michael J Gruber wrote: What would that require git to do, beyond taking whatever you tell it (using GIT_SSH or _GIT_SSH_COMMAND) to use as a drop in replacement for ssh? May be support git+mosh as a protocol, since it is not a drop in replacement. It is redesigned remote shell. The ideas it uses for session resumption needs to be reimplemented. This will need support from git, because it needs server side to be modified. Use SSP to return the the current progress for a particular session (it uses AES session ids). It will need support from Git alright, but not as much as from mosh, see my other reply: Mosh was not designed for non-interactive use. That support would have to be added before we can go any further. is that really relevent? mosh doesn't support things like X forwarding or port forwarding, but it certainly does support ssh host command and then doing IO. Ah, so mosh's README lied to me! I wouldn't say it lied, its just not really clear what is interactive I'd say git's use of ssh is kind of interactive compared to things like port forwarding. If `mosh user@host command` works, then a simple `GIT_SSH=mosh` should work out of the box, too. Have you tried it? it does work, I just tried mosh $host cat and then typing stuff and having it printed back at me. However it clears the terminal before hand and prints a message on exit. I tried GIT_SSH=mosh git clone and it failed, but I haven't really dug into why. SO I suspect this can be made to work with some work on the mosh side, but I'm not sure exactly how ssh and mosh are behaving differently here. Trev First thing you see on mosh.mit.edu: Mosh is a replacement for SSH. I guess that needs a footnote... Michael -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
Hi, On Thu, Apr 16, 2015 at 01:31:02PM +0200, Johannes Schindelin wrote: Hi, On 2015-04-16 13:10, Thomas Braun wrote: I've never had this issue. The error message from unlinking the file means that someone is still accessing the file and thus it can not be deleted (due to the implicit file locking on windows). Best guess is that an antivirus is still accessing it. There is a tool called `WhoUses.exe` in msysGit (I do not remember if I included it into Git for Windows 1.x for end users) which could be used to figure out which process accesses a given file still: https://github.com/msysgit/msysgit/blob/master/mingw/bin/WhoUses.exe (maybe that would help you identify the cause of the problem). Oh my. Botched mail conversation... I tried to f'up on this messy start ASAP, so I even managed to omit this final *pre-existing* part: Please note that this system is hampered by a crappy virus scanner dependency (F-Secure), which could be the culprit for this issue (e.g. by keeping files busy for longer than expected), however I really don't think that it takes part in this issue. The reason that I suspect that it's not virus scanner related is: - standalone git gc --auto works immediately (hmm but this might also point at the opposite - namely virus scanner still accessing files of a prior operation only in case there *was* a prior operation) - file descriptor scope handling issue in git source code is very easily imaginable - only a very rebase-heavy workflow of a sufficiently large repo is likely to have this issue turn up in a frequently enough manner, thus it's quite likely that it's not observed (or reported) all too often Thanks, Andreas Mohr -- GNU/Linux. It's not the software that's free, it's you. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] compat/mingw: stubs for getpgid() and tcgetpgrp()
Hi Junio, On 2015-04-15 20:48, Junio C Hamano wrote: Johannes Sixt j...@kdbg.org writes: Windows does not have process groups. It is, therefore, the simplest to pretend that each process is in its own process group. While here, move the getppid() stub from its old location (between two sync related functions) next to the two new functions. Signed-off-by: Johannes Sixt j...@kdbg.org --- Thanks for a quick update. The patch should do for now, but I suspect that it may give us a better abstraction to make the is_foreground_fd(int fd) or even is_foreground(void) the public API that would be implemented as int we_are_in_the_foreground(void) { return getpgid(0) == tcgetpgrp(fileno(stderr)); } in POSIX and Windows can implement entirely differently. I really like it. We already require a couple of workarounds to be able to use `mintty` (which is a replacement terminal emulator that is more flexible than the default Windows terminal window, but it comes at the price that most Win32 programs think they are not running interactively inside a `mintty` session). I would really like to avoid having to finagle some really ugly code into `getpgid()` to make that test work. In general, I find it rewarding not only from a portability point of view, but *especially* from a readability one if the code contains functions that are named semantically, i.e. they describe *why* they are called, not *how* they answer the question. In this case, I would prefer the `is_foreground_fd(int fd)` one, but I am sure I can make the other signature work for us, too. Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
Hi, On 2015-04-16 13:10, Thomas Braun wrote: Am 16.04.2015 um 12:03 schrieb Andreas Mohr: over the years I've had the same phenomenon with various versions of msysgit (now at 1.9.5.msysgit.0, on Windows 7 64bit), so I'm now sufficiently confident of it being a long-standing, longer-term issue and thus I'm reporting it now. (CC'ing msysgit) Good idea. Since I'm doing development in a sufficiently rebase-heavy manner, I seem to aggregate a lot of objects. Thus, when fetching content I'm sufficiently frequently greeted with a git gc run. This, however, does not work fully reliably: Auto packing the repository for optimum performance. You may also run git gc manually. See git help gc for more information. Counting objects: 206527, done. Delta compression using up to 4 threads. Compressing objects: 100% (27430/27430), done. Writing objects: 100% (206527/206527), done. Total 206527 (delta 178632), reused 206527 (delta 178632) Unlink of file '.git/objects/pack/pack-ab1712db0a94b5c55538d3b4cb3660cedc264c3c.pack' failed. Should I try again? (y/n) n Unlink of file '.git/objects/pack/pack-ab1712db0a94b5c55538d3b4cb3660cedc264c3c.idx' failed. Should I try again? (y/n) n Checking connectivity: 206527, done. A workable workaround for this recurring issue (such a fetch will fail repeatedly, thereby hampering my ability to update properly) is to manually do a git gc --auto prior to the fetch (which will then succeed). I've never had this issue. The error message from unlinking the file means that someone is still accessing the file and thus it can not be deleted (due to the implicit file locking on windows). Best guess is that an antivirus is still accessing it. There is a tool called `WhoUses.exe` in msysGit (I do not remember if I included it into Git for Windows 1.x for end users) which could be used to figure out which process accesses a given file still: https://github.com/msysgit/msysgit/blob/master/mingw/bin/WhoUses.exe (maybe that would help you identify the cause of the problem). Ciao, Johannes -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] sha1_file.c: make parse_sha1_header_extended() static
commit 9e1f5bc0 (sha1_file.c: support reading from a loose object of unknown type, 15-04-2015) added a new externally visible function which does not require more than file scope. This causes sparse to issue a warning message about this symbol. In order to suppress the warning, add the static qualifier to the function definition. [An alternative solution, if this symbol should have external scope, is to add an external declaration for the function to the cache.h header file (next to the one for parse_sha1_header()).] Signed-off-by: Ramsay Jones ram...@ramsay1.demon.co.uk --- Hi Karthik, If you need to re-roll your patches in the 'kn/cat-file-literally' branch, could you please squash this, or something like it, into the relevant patch. Thanks! ATB, Ramsay Jones sha1_file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sha1_file.c b/sha1_file.c index 267399d..2b81534 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1642,7 +1642,7 @@ static void *unpack_sha1_rest(git_zstream *stream, void *buffer, unsigned long s * too permissive for what we want to check. So do an anal * object header parse by hand. */ -int parse_sha1_header_extended(const char *hdr, struct object_info *oi, +static int parse_sha1_header_extended(const char *hdr, struct object_info *oi, unsigned int flags) { struct strbuf typename = STRBUF_INIT; -- 2.3.0 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
On Thu, Apr 16, 2015 at 01:48:46PM +0200, Andreas Mohr wrote: OK, at this point in time it's my turn to actually verify that indeed it's NOT the virus scanner: - generate rebase-heavy activity - update - hit issue - unload virus (~ scanner?? I'm unsure on exact terminology to be used ;-) - update - profit!? Despite trying hard (generating a lot of activity, with different repo projects even) I cannot reproduce it in a timely manner, thus I'll have to wait until repo state has degraded in a sufficient manner for such a larger repack with that issue to occur again (probably a matter of weeks). Once it happens, I will: - ensure keeping a copy of the entire (problematic-state) repo, and verify reproducibility of its (copied/preserved) breakage - unload virus and do other tests - report back Andreas Mohr -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [msysGit] [PATCH] compat/mingw: stubs for getpgid() and tcgetpgrp()
Hi kusma, On 2015-04-15 21:43, Erik Faye-Lund wrote: On Wed, Apr 15, 2015 at 8:29 PM, Johannes Sixt j...@kdbg.org wrote: Windows does not have process groups. It is, therefore, the simplest to pretend that each process is in its own process group. Windows does have some concept of process groups, but probably not quite what you want: https://msdn.microsoft.com/en-us/library/windows/desktop/ms682083%28v=vs.85%29.aspx Yes, and we actually need that in Git for Windows anyway because shooting down a process does not kill its child processes: https://github.com/git-for-windows/msys2-runtime/commit/15f209511985092588b171703e5046eba937b47b#diff-8753cda163376cee6c80aab11eb8701fR402 However, using this code for `getppid()` would be serious overkill (not to mention an unbearable performance hit because you have to enumerate *all* processes to get that information). Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
Hi, sorry, I had sent the prior mail prematurely (hit wrong key) and have been busy working on the resubmission... On Thu, Apr 16, 2015 at 01:10:36PM +0200, Thomas Braun wrote: Am 16.04.2015 um 12:03 schrieb Andreas Mohr: Hi all, over the years I've had the same phenomenon with various versions of msysgit (now at 1.9.5.msysgit.0, on Windows 7 64bit), so I'm now sufficiently confident of it being a long-standing, longer-term issue and thus I'm reporting it now. (I've had experience with this issue as early as 1.7.x, I believe). (CC'ing msysgit) Hi Andreas, Since I'm doing development in a sufficiently rebase-heavy manner, I seem to aggregate a lot of objects. Thus, when fetching content I'm sufficiently frequently greeted with a git gc run. This, however, does not work fully reliably: Auto packing the repository for optimum performance. You may also run git gc manually. See git help gc for more information. Counting objects: 206527, done. Delta compression using up to 4 threads. Compressing objects: 100% (27430/27430), done. Writing objects: 100% (206527/206527), done. Total 206527 (delta 178632), reused 206527 (delta 178632) Unlink of file '.git/objects/pack/pack-ab1712db0a94b5c55538d3b4cb3660cedc264c3c.pack' failed. Should I try again? (y/n) n Unlink of file '.git/objects/pack/pack-ab1712db0a94b5c55538d3b4cb3660cedc264c3c.idx' failed. Should I try again? (y/n) n Checking connectivity: 206527, done. A workable workaround for this recurring issue (such a fetch will fail repeatedly, thereby hampering my ability to update properly) is to manually do a git gc --auto prior to the fetch (which will then succeed). I've never had this issue. The error message from unlinking the file means that someone is still accessing the file and thus it can not be deleted (due to the implicit file locking on windows). Can you reproduce the error reliably? It seems to be reproducible pretty reliably, at least once git thinks it needs to repack (initiated by a fetch operation, I think), *and* then the unlink issue successfully turned up (which may happen perhaps every 20 fetches of a *very* rebase-heavy workflow). Interim mail content: I strongly suspect that git's repacking implementation (probably unrelated to msysgit-specific deviations, IOW, git *core* handling) simply is buggy in that it may keep certain file descriptors open at least a certain time (depending on scope of implementation/objects!?) beyond having finished its operation (rename?). As a related note, in an unrelated application of mine I also encountered issues on Windows where renaming of in-use files and further use of these files/names then failed (error code was EACCES I believe). IOW, this seems to be an issue specific to Windows' special (and sometimes quirky) filesystem handling which probably does not turn up on many other platforms, thus a historic existing implementation weakness in git's repack handling could not be nailed down in a sufficiently easy manner. I think I may have the order wrong, however: Handling seems to be: - repack needed - counting objects - compressing - writing - unlink (delete) of all prior non-repacked objects (which fails) I have to admit that at this point in time I'm actually unsure which higher-level operation it actually is that gets carried out where eventually a repack *implicitly* gets triggered (I've got a shell script here which implements clean branch updating, where I eventually hit the problem during its daily use). Since a standalone git gc --auto *immediately* appears to work (after many repeated attempts of failing full-update), this is a strong hint that (in the failure case) it's the *PRIOR* (non-repack) operation which has kept these objects open beyond its actual operation scope. Suspected implementation sample code: if (operation_needed) { operation_workingset set; set.DoStuff(); if (repack_needed) { repack_handler repack; repack.DoStuff(); } } [NOTE the very prominent scope issues in this example, which might be the exact reason for hitting such unlink failures - simply due to having kept file descriptors open within the working set] I have not had a look at git source though to actually determine whether there do exist such severe operation scope issues that I'm strongly contemplating. Andreas Mohr -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
On Thu, Apr 16, 2015 at 01:42:35PM +0200, Andreas Mohr wrote: Hi, On Thu, Apr 16, 2015 at 01:31:02PM +0200, Johannes Schindelin wrote: Hi, On 2015-04-16 13:10, Thomas Braun wrote: I've never had this issue. The error message from unlinking the file means that someone is still accessing the file and thus it can not be deleted (due to the implicit file locking on windows). Best guess is that an antivirus is still accessing it. There is a tool called `WhoUses.exe` in msysGit (I do not remember if I included it into Git for Windows 1.x for end users) which could be used to figure out which process accesses a given file still: https://github.com/msysgit/msysgit/blob/master/mingw/bin/WhoUses.exe (maybe that would help you identify the cause of the problem). Oh my. Botched mail conversation... I tried to f'up on this messy start ASAP, so I even managed to omit this final *pre-existing* part: Please note that this system is hampered by a crappy virus scanner dependency (F-Secure), which could be the culprit for this issue (e.g. by keeping files busy for longer than expected), however I really don't think that it takes part in this issue. The reason that I suspect that it's not virus scanner related is: - standalone git gc --auto works immediately (hmm but this might also point at the opposite - namely virus scanner still accessing files of a prior operation only in case there *was* a prior operation) - file descriptor scope handling issue in git source code is very easily imaginable - only a very rebase-heavy workflow of a sufficiently large repo is likely to have this issue turn up in a frequently enough manner, thus it's quite likely that it's not observed (or reported) all too often OK, at this point in time it's my turn to actually verify that indeed it's NOT the virus scanner: - generate rebase-heavy activity - update - hit issue - unload virus (~ scanner?? I'm unsure on exact terminology to be used ;-) - update - profit!? (and possibly have a try at WhoUses.exe there, too - thanks for the hint!) Andreas Mohr -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
Hi Andreas, On 2015-04-16 14:35, Andreas Mohr wrote: On Thu, Apr 16, 2015 at 01:48:46PM +0200, Andreas Mohr wrote: OK, at this point in time it's my turn to actually verify that indeed it's NOT the virus scanner: - generate rebase-heavy activity - update - hit issue - unload virus (~ scanner?? I'm unsure on exact terminology to be used ;-) - update - profit!? Despite trying hard (generating a lot of activity, with different repo projects even) I cannot reproduce it in a timely manner, thus I'll have to wait until repo state has degraded in a sufficient manner for such a larger repack with that issue to occur again (probably a matter of weeks). Once it happens, I will: - ensure keeping a copy of the entire (problematic-state) repo, and verify reproducibility of its (copied/preserved) breakage - unload virus and do other tests - report back I guess the best way to trigger it is by ensuring that a lot of loose objects are accumulated, e.g. by running ```sh i=$(date +%s) j=0 while test $j -lt do echo test $(($i+$j)) git hash-object -w --stdin j=$(($j+1)) done ``` Ciao, Johannes -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
Am 16.04.2015 um 12:03 schrieb Andreas Mohr: Hi all, over the years I've had the same phenomenon with various versions of msysgit (now at 1.9.5.msysgit.0, on Windows 7 64bit), so I'm now sufficiently confident of it being a long-standing, longer-term issue and thus I'm reporting it now. (CC'ing msysgit) Hi Andreas, Since I'm doing development in a sufficiently rebase-heavy manner, I seem to aggregate a lot of objects. Thus, when fetching content I'm sufficiently frequently greeted with a git gc run. This, however, does not work fully reliably: Auto packing the repository for optimum performance. You may also run git gc manually. See git help gc for more information. Counting objects: 206527, done. Delta compression using up to 4 threads. Compressing objects: 100% (27430/27430), done. Writing objects: 100% (206527/206527), done. Total 206527 (delta 178632), reused 206527 (delta 178632) Unlink of file '.git/objects/pack/pack-ab1712db0a94b5c55538d3b4cb3660cedc264c3c.pack' failed. Should I try again? (y/n) n Unlink of file '.git/objects/pack/pack-ab1712db0a94b5c55538d3b4cb3660cedc264c3c.idx' failed. Should I try again? (y/n) n Checking connectivity: 206527, done. A workable workaround for this recurring issue (such a fetch will fail repeatedly, thereby hampering my ability to update properly) is to manually do a git gc --auto prior to the fetch (which will then succeed). I've never had this issue. The error message from unlinking the file means that someone is still accessing the file and thus it can not be deleted (due to the implicit file locking on windows). Can you reproduce the error reliably? Thomas -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] t0027: Add repoMIX and LF_nul
Hi Torsten, On 2015-04-15 21:58, Torsten Bögershausen wrote: Yes, I try to be up-to-date with Git for Windows, but missed to follow this very patch. Do you plan to send it to git.git ? I did plan to send it to this mailing list (together with the other 23 branches we accumulated in all those years, at a mild pace, of course), but *after* releasing Git for Windows 2.x (which is now imminent). If you could cherry-pick my fix into your series, that would be splendid! However, if you feel that you came up with a better patch, please do not hesitate to drop my patch (I'd like to know about that, too, of course). Thanks! Johannes -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] test-lib-functions.sh: fix the second argument to some helper functions
The second argument to test_path_is_file and test_path_is_dir must be $2 and not $*, which instead would repeat the file name in the error message. Signed-off-by: Elia Pinto gitter.spi...@gmail.com --- t/test-lib-functions.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index 0698ce7..8f8858a 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -478,7 +478,7 @@ test_external_without_stderr () { test_path_is_file () { if ! test -f $1 then - echo File $1 doesn't exist. $* + echo File $1 doesn't exist. $2 false fi } @@ -486,7 +486,7 @@ test_path_is_file () { test_path_is_dir () { if ! test -d $1 then - echo Directory $1 doesn't exist. $* + echo Directory $1 doesn't exist. $2 false fi } -- 2.1.0 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] log -L: improve error message on malformed argument
The old message did not mention the :regex:file form. To avoid overly long lines, split the message into two lines (in case item-string is long, it will be the only part truncated in a narrow terminal). Signed-off-by: Matthieu Moy matthieu@imag.fr --- line-log.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/line-log.c b/line-log.c index a490efe..e725248 100644 --- a/line-log.c +++ b/line-log.c @@ -575,7 +575,8 @@ parse_lines(struct commit *commit, const char *prefix, struct string_list *args) name_part = skip_range_arg(item-string); if (!name_part || *name_part != ':' || !name_part[1]) - die(-L argument '%s' not of the form start,end:file, + die(invalid -L argument '%s'.\n + It should be of the form start,end:file or :regex:file., item-string); range_part = xstrndup(item-string, name_part - item-string); name_part++; -- 2.4.0.rc1.42.g9642cc6 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] compat/mingw: stubs for getpgid() and tcgetpgrp()
Johannes Schindelin johannes.schinde...@gmx.de writes: On 2015-04-15 20:48, Junio C Hamano wrote: The patch should do for now, but I suspect that it may give us a better abstraction to make the is_foreground_fd(int fd) or even is_foreground(void) the public API that would be implemented as int we_are_in_the_foreground(void) { return getpgid(0) == tcgetpgrp(fileno(stderr)); } in POSIX and Windows can implement entirely differently. ... In general, I find it rewarding not only from a portability point of view, but *especially* from a readability one if the code contains functions that are named semantically, i.e. they describe *why* they are called, not *how* they answer the question. Yeah, that was the rationale behind the suggestion (i.e. how may be different depending on the platform, and the main code flow cares more about why and not how). I'll queue Luke's patch with J6t's compat/ for now, but if you find more suitable implementation for the higher-level abstraction, please do send a follow-up patch to update it. Two Johannes'es may need to talk between themselves to agree what the right level of abstraction is, though. I trust two of you a lot more than my gut feeling when it comes to POSIX vs Windows API differences ;-) Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] dir: allow a BOM at the beginning of exclude files
On Thu, 2015-04-16 at 17:03 +0200, Johannes Schindelin wrote: Hi Carlos, On 2015-04-16 16:05, Carlos Martín Nieto wrote: Some text editors like Notepad or LibreOffice write an UTF-8 BOM in order to indicate that the file is Unicode text rather than whatever the current locale would indicate. If someone uses such an editor to edit a gitignore file, we are left with those three bytes at the beginning of the file. If we do not skip them, we will attempt to match a filename with the BOM as prefix, which won't match the files the user is expecting. --- If you're wondering how I came up with LibreOffice, I was doing a workshop recently and one of the participants was not content with the choice of vim or nano, so he opened LibreOffice to edit the gitignore file with confusing consequences. This codepath doesn't go as far as the config code in validating that we do not have a partial BOM which would mean there's some invalid content, but we don't really have invalid content any other way, as we're just dealing with a list of paths in the file. Yeah, users are entertaining! I agree that this is a good patch. *Maybe* we would need the same handling in more places, in which case it might make sense to refactor the test into its own function. Yes, this was my train of thought. If we (discover that) need it in a third place, then we can unify the test and skip. For two places I reckoned it was fine if we duplicated a bit. In any case, though, the Git project requires a [Developer's Certificate of Origin](https://github.com/git/git/blob/v2.3.5/Documentation/SubmittingPatches#L234-L277); Would you mind adding that? Yeah, sorry. I keep forgetting to do that. I'll reply to my original e-mail with it. Cheers, cmn -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] dir: allow a BOM at the beginning of exclude files
On Thu, 2015-04-16 at 16:05 +0200, Carlos Martín Nieto wrote: Some text editors like Notepad or LibreOffice write an UTF-8 BOM in order to indicate that the file is Unicode text rather than whatever the current locale would indicate. If someone uses such an editor to edit a gitignore file, we are left with those three bytes at the beginning of the file. If we do not skip them, we will attempt to match a filename with the BOM as prefix, which won't match the files the user is expecting. Signed-off-by: Carlos Martín Nieto c...@elego.de which I keep forgetting. --- If you're wondering how I came up with LibreOffice, I was doing a workshop recently and one of the participants was not content with the choice of vim or nano, so he opened LibreOffice to edit the gitignore file with confusing consequences. This codepath doesn't go as far as the config code in validating that we do not have a partial BOM which would mean there's some invalid content, but we don't really have invalid content any other way, as we're just dealing with a list of paths in the file. dir.c | 8 +++- t/t7061-wtstatus-ignore.sh | 2 ++ 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/dir.c b/dir.c index 0943a81..6368247 100644 --- a/dir.c +++ b/dir.c @@ -581,6 +581,7 @@ int add_excludes_from_file_to_list(const char *fname, struct stat st; int fd, i, lineno = 1; size_t size = 0; + static const unsigned char *utf8_bom = (unsigned char *) \xef\xbb\xbf; char *buf, *entry; fd = open(fname, O_RDONLY); @@ -617,7 +618,12 @@ int add_excludes_from_file_to_list(const char *fname, } el-filebuf = buf; - entry = buf; + + if (size = 3 !memcmp(buf, utf8_bom, 3)) + entry = buf + 3; + else + entry = buf; + for (i = 0; i size; i++) { if (buf[i] == '\n') { if (entry != buf + i entry[0] != '#') { diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh index 460789b..0a06fbf 100755 --- a/t/t7061-wtstatus-ignore.sh +++ b/t/t7061-wtstatus-ignore.sh @@ -13,6 +13,8 @@ EOF test_expect_success 'status untracked directory with --ignored' ' echo ignored .gitignore + sed -e s/^/\xef\xbb\xbf/ .gitignore .gitignore.new + mv .gitignore.new .gitignore mkdir untracked : untracked/ignored : untracked/uncommitted -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] test-lib-functions.sh: fix the second argument to some helper functions
Elia Pinto gitter.spi...@gmail.com writes: --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -478,7 +478,7 @@ test_external_without_stderr () { test_path_is_file () { if ! test -f $1 then - echo File $1 doesn't exist. $* + echo File $1 doesn't exist. $2 false fi } @@ -486,7 +486,7 @@ test_path_is_file () { test_path_is_dir () { if ! test -d $1 then - echo Directory $1 doesn't exist. $* + echo Directory $1 doesn't exist. $2 false fi } Sounds straightforwardly correct to me. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] dir: allow a BOM at the beginning of exclude files
Hi Carlos, On 2015-04-16 16:05, Carlos Martín Nieto wrote: Some text editors like Notepad or LibreOffice write an UTF-8 BOM in order to indicate that the file is Unicode text rather than whatever the current locale would indicate. If someone uses such an editor to edit a gitignore file, we are left with those three bytes at the beginning of the file. If we do not skip them, we will attempt to match a filename with the BOM as prefix, which won't match the files the user is expecting. --- If you're wondering how I came up with LibreOffice, I was doing a workshop recently and one of the participants was not content with the choice of vim or nano, so he opened LibreOffice to edit the gitignore file with confusing consequences. This codepath doesn't go as far as the config code in validating that we do not have a partial BOM which would mean there's some invalid content, but we don't really have invalid content any other way, as we're just dealing with a list of paths in the file. Yeah, users are entertaining! I agree that this is a good patch. *Maybe* we would need the same handling in more places, in which case it might make sense to refactor the test into its own function. In any case, though, the Git project requires a [Developer's Certificate of Origin](https://github.com/git/git/blob/v2.3.5/Documentation/SubmittingPatches#L234-L277); Would you mind adding that? Thanks, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
On Thu, Apr 16, 2015 at 01:35:05PM +0200, Andreas Mohr wrote: I strongly suspect that git's repacking implementation (probably unrelated to msysgit-specific deviations, IOW, git *core* handling) simply is buggy in that it may keep certain file descriptors open at least a certain time (depending on scope of implementation/objects!?) beyond having finished its operation (rename?). Hrm. I do not see anything in builtin/fetch.c that closes the packfile descriptors before running gc --auto. So basically the sequence: 1. Fetch performs actual fetch. It needs to open packfiles to do commit negotiation with other side (the hard work is done by an index-pack subprocess, but it is likely we have to access _some_ objects). 2. The packfiles remain open and mmap'd (at least on Linux) in the sha1_file.c:packed_git list. 3. We spawn gc --auto and wait for it to finish. While we are waiting, the descriptors are still open, but gc --auto will not be able to delete any packs. But this seems too simple to be the problem, as it would mean that just about any gc --auto that triggers a full repack would be a problem (so anytime you have about 50 packs). But maybe the gc autodetach behavior means it works racily. I was able to set up the situation deterministically by running the script below: -- 8 -- #!/bin/sh # XXX tweak this setting as appropriate PATH_TO_GIT_BUILD=$HOME/compile/git PATH=$PATH_TO_GIT_BUILD/bin-wrappers:$PATH rm -rf parent child # make a parent/child where the child will have to access # a packfile to fulfill another fetch git init parent git -C parent commit --allow-empty -m base git clone parent child git -C parent commit --allow-empty -m extra # we want to make our base pack really big, because otherwise # git will open/mmap/close it. So we must exceed core.packedgitlimit cd child $PATH_TO_GIT_BUILD/test-genrandom foo 500 file git add file git commit -m large file git repack -ad git config core.packedGitLimit 1M # now make some spare packs to bust the gc.autopacklimit for i in 1 2 3 4 5; do git commit --allow-empty -m $i git repack -d done git config gc.autoPackLimit 3 git config gc.autoDetach false GIT_TRACE=1 git fetch ``` I also instrumented my (v1.9.5) git build like this: diff --git a/builtin/fetch.c b/builtin/fetch.c index 025bc3e..fc99e5e 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -1174,6 +1174,12 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) list.strdup_strings = 1; string_list_clear(list, 0); + { + struct packed_git *p; + for (p = packed_git; p; p = p-next) + trace_printf(pack %s has descriptor %d\n, +p-pack_name, p-pack_fd); + } run_command_v_opt(argv_gc_auto, RUN_GIT_CMD); return result; diff --git a/builtin/repack.c b/builtin/repack.c index bb2314c..e8b29cf 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -105,6 +105,7 @@ static void remove_redundant_pack(const char *dir_name, const char *base_name) for (i = 0; i ARRAY_SIZE(exts); i++) { strbuf_setlen(buf, plen); strbuf_addstr(buf, exts[i]); + trace_printf(unlinking %s\n, buf.buf); unlink(buf.buf); } strbuf_release(buf); to confirm what was happening (because of course on Linux it is perfectly fine to delete the open file). If this does trigger the bug for you, though, it should be obvious even without the trace calls. :) -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
Hi Peff, On 2015-04-16 17:28, Jeff King wrote: On Thu, Apr 16, 2015 at 01:35:05PM +0200, Andreas Mohr wrote: I strongly suspect that git's repacking implementation (probably unrelated to msysgit-specific deviations, IOW, git *core* handling) simply is buggy in that it may keep certain file descriptors open at least a certain time (depending on scope of implementation/objects!?) beyond having finished its operation (rename?). Hrm. [... detailed analysis, including a Minimal, Complete Verifiable Example ...] Thank you so much! I will definitely test this (at the moment, I have to recreate my build environment in a different VM than I used so far, that takes quite some time...) Thanks! Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
Please remove git-owner from the CC: list in future replies, thank you. :-) -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [bug] first line truncated with `git log --oneline --decorate --graph`
Robin Moussu robin.mou...@gmail.com writes: I have a bug using the following command: git log --oneline --decorate --graph In short, the first line of the log is often truncated. ... # How to reproduce Open a small terminal windows (4*100) mkdir tmp cd tmp git init git commit --allow-empty -m 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec a diam lectus.' git checkout -b long_branch_name_and_long_commit_name git commit --allow-empty -m 'Maecenas congue ligula ac quam viverra nec consectetur ante hendrerit.' git commit --allow-empty -m 'Praesent et diam eget libero egestas mattis sit amet vitae augue.' git checkout master git merge --no-ff long_branch_name_and_long_commit_name -m 'merge with a long commit message' git checkout long_branch_name_and_long_commit_name git merge master git log --oneline --decorate --graph I hope it is clear. The English is not my mother tongue. It is clear and it does not reproduce for me. I see * 5eff3a3 (HEAD - long_branch_name_and_long_commit_name, master) merge with a long commit message |\ | * 61e21f3 Praesent et diam eget libero egestas mattis sit amet vitae augue. : which looks perfectly sensible (my terminal is a screen running on a Ubuntu machine, displaying to a SecureShell terminal on a Chromebook). Can you try running that problematic git log with its standard output redirected to a file (i.e. git log ... output) and then run your pager in that wide-but-short terminal (i.e. less output), to see if the same problem is observed? And then run cat output in a taller terminal with the same width to see if it is the output from git log that is causing you the problem? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] dir: allow a BOM at the beginning of exclude files
Carlos Martín Nieto c...@elego.de writes: diff --git a/dir.c b/dir.c index 0943a81..6368247 100644 --- a/dir.c +++ b/dir.c @@ -581,6 +581,7 @@ int add_excludes_from_file_to_list(const char *fname, struct stat st; int fd, i, lineno = 1; size_t size = 0; + static const unsigned char *utf8_bom = (unsigned char *) \xef\xbb\xbf; char *buf, *entry; fd = open(fname, O_RDONLY); @@ -617,7 +618,12 @@ int add_excludes_from_file_to_list(const char *fname, } el-filebuf = buf; - entry = buf; + + if (size = 3 !memcmp(buf, utf8_bom, 3)) OK. + entry = buf + 3; + else + entry = buf; + for (i = 0; i size; i++) { if (buf[i] == '\n') { if (entry != buf + i entry[0] != '#') { diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh index 460789b..0a06fbf 100755 --- a/t/t7061-wtstatus-ignore.sh +++ b/t/t7061-wtstatus-ignore.sh @@ -13,6 +13,8 @@ EOF test_expect_success 'status untracked directory with --ignored' ' echo ignored .gitignore + sed -e s/^/\xef\xbb\xbf/ .gitignore .gitignore.new + mv .gitignore.new .gitignore Is this write literal in \xHEX on the replacement side of sed substitution potable? In any case, replacing the above three with something like: printf bomignored\n .gitignore may be more sensible, no? Also do we need a similar change to the attribute side, or are we already covered and we forgot to do the same for the ignore files? mkdir untracked : untracked/ignored : untracked/uncommitted -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [bug] first line truncated with `git log --oneline --decorate --graph`
Hi, On 2015-04-16 17:28, Junio C Hamano wrote: Robin Moussu robin.mou...@gmail.com writes: I have a bug using the following command: git log --oneline --decorate --graph In short, the first line of the log is often truncated. I imagine that the pager (`less`) cuts off the lines because we start it with the `-S` option. Robin, if the pager does not quit right away and you see a truncated line, could you try to press the keys `-`, `Shift+S`, `Enter` and see whether they are untruncated? Ciao, Johannes -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] dir: allow a BOM at the beginning of exclude files
On Thu, Apr 16, 2015 at 08:39:55AM -0700, Junio C Hamano wrote: test_expect_success 'status untracked directory with --ignored' ' echo ignored .gitignore + sed -e s/^/\xef\xbb\xbf/ .gitignore .gitignore.new + mv .gitignore.new .gitignore Is this write literal in \xHEX on the replacement side of sed substitution potable? In any case, replacing the above three with something like: printf bomignored\n .gitignore may be more sensible, no? I'm not sure about sed, but I agree it is suspect. And note that printf with hex codes is not portable, either You have to use octal: printf '\357\273\277ignored\n' .gitignore Also, as a nit, I'd much rather see this in its own test rather than crammed into another test_expect_success. It's much easier to diagnose failures if the test description mentions the goal, and it is not tied up with testing other parts that might fail. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [bug] first line truncated with `git log --oneline --decorate --graph`
Hey folks, please remove git-owner from the CC: list, that goes to me and not the list :-) -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
git-owner, was Re: [bug] first line truncated with `git log --oneline --decorate --graph`
Hi David, On 2015-04-16 17:56, David Miller wrote: Hey folks, please remove git-owner from the CC: list, that goes to me and not the list :-) I feared as much, but I cannot recall putting you on any Cc: myself. It appeared to me as if the list added git-owner@vger as sender, at least under *some* circumstances... Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] dir: allow a BOM at the beginning of exclude files
Hi, On 2015-04-16 17:39, Junio C Hamano wrote: Also do we need a similar change to the attribute side, or are we already covered and we forgot to do the same for the ignore files? I fear so: https://github.com/git/git/blob/v2.3.5/attr.c#L359-L376 As for the config, we are safe: https://github.com/git/git/blob/v2.3.5/config.c#L419-L439 Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] dir: allow a BOM at the beginning of exclude files
On 2015-04-16 16.05, Carlos Martín Nieto wrote: [] May be it is easier to move this into an own function, like remove_utf8_bom() ? dir.c | 8 +++- t/t7061-wtstatus-ignore.sh | 2 ++ 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/dir.c b/dir.c index 0943a81..6368247 100644 --- a/dir.c +++ b/dir.c @@ -581,6 +581,7 @@ int add_excludes_from_file_to_list(const char *fname, struct stat st; int fd, i, lineno = 1; size_t size = 0; + static const unsigned char *utf8_bom = (unsigned char *) \xef\xbb\xbf; Do we really need to cast here (and if, is the cast dropping the const ?) Another suggestion, see below: either: static const size_t bom_len = 3; or static const size_t bom_len = strlen(utf8_bom); char *buf, *entry; fd = open(fname, O_RDONLY); @@ -617,7 +618,12 @@ int add_excludes_from_file_to_list(const char *fname, } el-filebuf = buf; - entry = buf; + And now we can avoid magic numbers: if (size = bom_len !memcmp(buf, utf8_bom, bom_len)) entry = buf + bom_len; else entry = buf; [] -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git-owner, was Re: [bug] first line truncated with `git log --oneline --decorate --graph`
[retaining davem in cc in case he is curious, and hopefully does not see this as more spam :) ] On Thu, Apr 16, 2015 at 06:11:47PM +0200, Johannes Schindelin wrote: On 2015-04-16 17:56, David Miller wrote: Hey folks, please remove git-owner from the CC: list, that goes to me and not the list :-) I feared as much, but I cannot recall putting you on any Cc: myself. It appeared to me as if the list added git-owner@vger as sender, at least under *some* circumstances... Weird. In a nearby thread with the same problem, the first email that mentions git-owner in a cc header is yours[1]. It's in reply to a message that does not mention git-owner at all[2], except in the Sender field. Your agent header looks like: User-Agent: Roundcube Webmail/1.1.0 Maybe their reply to all function is a little over-zealous? Messages from you from a few days ago do not show this problem, but several starting on the 15th do. I don't see anything changing in the messages from the list. Maybe your webmail provider changed something? Or maybe it is a difference between replying to the version that came through the list rather than one that came straight to you. -Peff [1,2] Rather than link to an archive, I'll attach full messages as I received them to make sure the headers are intact. one.mbox Description: application/mbox two.mbox Description: application/mbox
Re: git-owner, was Re: [bug] first line truncated with `git log --oneline --decorate --graph`
From: Jeff King p...@peff.net Date: Thu, 16 Apr 2015 12:26:21 -0400 Weird. In a nearby thread with the same problem, the first email that mentions git-owner in a cc header is yours[1]. It's in reply to a message that does not mention git-owner at all[2], except in the Sender field. Your agent header looks like: User-Agent: Roundcube Webmail/1.1.0 Maybe their reply to all function is a little over-zealous? This is always caused by broken reply list handling in email clients. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git-owner, was Re: [bug] first line truncated with `git log --oneline --decorate --graph`
On 2015-04-16 18:31, David Miller wrote: From: Jeff King p...@peff.net Date: Thu, 16 Apr 2015 12:26:21 -0400 Weird. In a nearby thread with the same problem, the first email that mentions git-owner in a cc header is yours[1]. It's in reply to a message that does not mention git-owner at all[2], except in the Sender field. Your agent header looks like: User-Agent: Roundcube Webmail/1.1.0 Maybe their reply to all function is a little over-zealous? This is always caused by broken reply list handling in email clients. That must be it. Dave, my apologies! Will investigate *right now*. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] dir: allow a BOM at the beginning of exclude files
Jeff King p...@peff.net writes: On Thu, Apr 16, 2015 at 08:39:55AM -0700, Junio C Hamano wrote: test_expect_success 'status untracked directory with --ignored' ' echo ignored .gitignore + sed -e s/^/\xef\xbb\xbf/ .gitignore .gitignore.new + mv .gitignore.new .gitignore Is this write literal in \xHEX on the replacement side of sed substitution potable? In any case, replacing the above three with something like: printf bomignored\n .gitignore may be more sensible, no? I'm not sure about sed, but I agree it is suspect. And note that printf with hex codes is not portable, either You have to use octal: printf '\357\273\277ignored\n' .gitignore Also, as a nit, I'd much rather see this in its own test rather than crammed into another test_expect_success. It's much easier to diagnose failures if the test description mentions the goal, and it is not tied up with testing other parts that might fail. Yeah, I totally agree. Carlos, something like this squashed in, perhaps? t/t7061-wtstatus-ignore.sh | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh index 0a06fbf..cdc0747 100755 --- a/t/t7061-wtstatus-ignore.sh +++ b/t/t7061-wtstatus-ignore.sh @@ -13,8 +13,6 @@ EOF test_expect_success 'status untracked directory with --ignored' ' echo ignored .gitignore - sed -e s/^/\xef\xbb\xbf/ .gitignore .gitignore.new - mv .gitignore.new .gitignore mkdir untracked : untracked/ignored : untracked/uncommitted @@ -22,6 +20,15 @@ test_expect_success 'status untracked directory with --ignored' ' test_cmp expected actual ' +test_expect_success 'same with gitignore starting with BOM' ' + printf \357\273\277ignored\n .gitignore + mkdir -p untracked + : untracked/ignored + : untracked/uncommitted + git status --porcelain --ignored actual + test_cmp expected actual +' + cat expected \EOF ?? .gitignore ?? actual -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git-owner, was Re: [bug] first line truncated with `git log --oneline --decorate --graph`
Hi Junio and Johanes, It was effectively a less problem, and not a git problem. My $LESS is '-r'. If I make export LESS='-R' , it is fixed. Thanks a lot. Cheers. Robin. signature.asc Description: OpenPGP digital signature
Re: [PATCH v8 2/4] cat-file: teach cat-file a '--literally' option
karthik nayak karthik@gmail.com writes: On 04/16/2015 02:22 AM, Junio C Hamano wrote: This is a tangent, but while we are in the vicinity, we may want to rethink the help message we attach to the '-e' option. Technically the current message is _not_ wrong per-se, but it misses the point. The primary thing the option does is to check the (e)xistence of the named object, and the fact that it does so silently is merely a detail of the operation. The current help text omits the more important part of what the option is. Would you rather check '-e' and go on to check '-p' or do you merely just want a different message. I meant just a different message. The point of -e is to see if the thing exists. It is good to mention _how_ the result is reported back to the user (i.e. via the exit code, not via an output to the standard output exists vs missing, for example), but that is secondary. Telling how it reports is meaningless without telling what it reports in the first place. ... when a user is giving the '-e' option he just expects a silent output if the object exists, hence we rather have the option '-e' behave as a mutually exclusive option... Yes, and that is in line with the switch to OPT_CMDMODE. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] dir: allow a BOM at the beginning of exclude files
Some text editors like Notepad or LibreOffice write an UTF-8 BOM in order to indicate that the file is Unicode text rather than whatever the current locale would indicate. If someone uses such an editor to edit a gitignore file, we are left with those three bytes at the beginning of the file. If we do not skip them, we will attempt to match a filename with the BOM as prefix, which won't match the files the user is expecting. --- If you're wondering how I came up with LibreOffice, I was doing a workshop recently and one of the participants was not content with the choice of vim or nano, so he opened LibreOffice to edit the gitignore file with confusing consequences. This codepath doesn't go as far as the config code in validating that we do not have a partial BOM which would mean there's some invalid content, but we don't really have invalid content any other way, as we're just dealing with a list of paths in the file. dir.c | 8 +++- t/t7061-wtstatus-ignore.sh | 2 ++ 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/dir.c b/dir.c index 0943a81..6368247 100644 --- a/dir.c +++ b/dir.c @@ -581,6 +581,7 @@ int add_excludes_from_file_to_list(const char *fname, struct stat st; int fd, i, lineno = 1; size_t size = 0; + static const unsigned char *utf8_bom = (unsigned char *) \xef\xbb\xbf; char *buf, *entry; fd = open(fname, O_RDONLY); @@ -617,7 +618,12 @@ int add_excludes_from_file_to_list(const char *fname, } el-filebuf = buf; - entry = buf; + + if (size = 3 !memcmp(buf, utf8_bom, 3)) + entry = buf + 3; + else + entry = buf; + for (i = 0; i size; i++) { if (buf[i] == '\n') { if (entry != buf + i entry[0] != '#') { diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh index 460789b..0a06fbf 100755 --- a/t/t7061-wtstatus-ignore.sh +++ b/t/t7061-wtstatus-ignore.sh @@ -13,6 +13,8 @@ EOF test_expect_success 'status untracked directory with --ignored' ' echo ignored .gitignore + sed -e s/^/\xef\xbb\xbf/ .gitignore .gitignore.new + mv .gitignore.new .gitignore mkdir untracked : untracked/ignored : untracked/uncommitted -- 2.0.0.5.gbce14aa -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/3] UTF8 BOM follow-up
This is on top of the .gitignore can start with UTF8 BOM patch from Carlos. Junio C Hamano (3): utf8-bom: introduce skip_utf8_bom() helper config: use utf8_bom[] from utf.[ch] in git_parse_source() attr: skip UTF8 BOM at the beginning of the input file attr.c | 9 +++-- config.c | 6 +++--- dir.c| 8 +++- utf8.c | 11 +++ utf8.h | 3 +++ 5 files changed, 27 insertions(+), 10 deletions(-) -- 2.4.0-rc2-171-g98ddf7f -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/3] utf8-bom: introduce skip_utf8_bom() helper
With the recent change to ignore the UTF8 BOM at the beginning of .gitignore files, we now have two codepaths that do such a skipping (the other one is for reading the configuration files). Introduce utf8_bom[] constant string and skip_utf8_bom() helper and teach .gitignore code how to use it. Signed-off-by: Junio C Hamano gits...@pobox.com --- dir.c | 8 +++- utf8.c | 11 +++ utf8.h | 3 +++ 3 files changed, 17 insertions(+), 5 deletions(-) diff --git a/dir.c b/dir.c index 10c1f90..9e04a23 100644 --- a/dir.c +++ b/dir.c @@ -12,6 +12,7 @@ #include refs.h #include wildmatch.h #include pathspec.h +#include utf8.h struct path_simplify { int len; @@ -538,7 +539,6 @@ int add_excludes_from_file_to_list(const char *fname, struct stat st; int fd, i, lineno = 1; size_t size = 0; - static const unsigned char *utf8_bom = (unsigned char *) \xef\xbb\xbf; char *buf, *entry; fd = open(fname, O_RDONLY); @@ -576,10 +576,8 @@ int add_excludes_from_file_to_list(const char *fname, el-filebuf = buf; - if (size = 3 !memcmp(buf, utf8_bom, 3)) - entry = buf + 3; - else - entry = buf; + entry = buf; + skip_utf8_bom(entry, size); for (i = 0; i size; i++) { if (buf[i] == '\n') { diff --git a/utf8.c b/utf8.c index 520fbb4..28e6d76 100644 --- a/utf8.c +++ b/utf8.c @@ -633,3 +633,14 @@ int is_hfs_dotgit(const char *path) return 1; } + +const char utf8_bom[] = \357\273\277; + +int skip_utf8_bom(char **text, size_t len) +{ + if (len strlen(utf8_bom) || + memcmp(*text, utf8_bom, strlen(utf8_bom))) + return 0; + *text += strlen(utf8_bom); + return 1; +} diff --git a/utf8.h b/utf8.h index e4d9183..e7b2aa4 100644 --- a/utf8.h +++ b/utf8.h @@ -13,6 +13,9 @@ int same_encoding(const char *, const char *); __attribute__((format (printf, 2, 3))) int utf8_fprintf(FILE *, const char *, ...); +extern const char utf8_bom[]; +extern int skip_utf8_bom(char **, size_t); + void strbuf_add_wrapped_text(struct strbuf *buf, const char *text, int indent, int indent2, int width); void strbuf_add_wrapped_bytes(struct strbuf *buf, const char *data, int len, -- 2.4.0-rc2-171-g98ddf7f -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/3] config: use utf8_bom[] from utf.[ch] in git_parse_source()
Because the function reads one character at the time, unfortunately we cannot use the easier skip_utf8_bom() helper, but at least we do not have to duplicate the constant string this way. Signed-off-by: Junio C Hamano gits...@pobox.com --- config.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/config.c b/config.c index 752e2e2..9618aa4 100644 --- a/config.c +++ b/config.c @@ -12,6 +12,7 @@ #include quote.h #include hashmap.h #include string-list.h +#include utf8.h struct config_source { struct config_source *prev; @@ -412,8 +413,7 @@ static int git_parse_source(config_fn_t fn, void *data) struct strbuf *var = cf-var; /* U+FEFF Byte Order Mark in UTF8 */ - static const unsigned char *utf8_bom = (unsigned char *) \xef\xbb\xbf; - const unsigned char *bomptr = utf8_bom; + const char *bomptr = utf8_bom; for (;;) { int c = get_next_char(); @@ -421,7 +421,7 @@ static int git_parse_source(config_fn_t fn, void *data) /* We are at the file beginning; skip UTF8-encoded BOM * if present. Sane editors won't put this in on their * own, but e.g. Windows Notepad will do it happily. */ - if ((unsigned char) c == *bomptr) { + if (c == (*bomptr 0377)) { bomptr++; continue; } else { -- 2.4.0-rc2-171-g98ddf7f -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
git-archive ignores submodules
I've been using git-archive as my main way of deploying to production servers, but today I've come across a git repo with submodules and found out that git archive has no option to include submodules on the output archive. This simply makes git-archive unusable on this scenario. -- Pedro Rodrigues +244 917 774 823 +351 969 042 335 Mail: prodrigues1...@gmail.com -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/3] attr: skip UTF8 BOM at the beginning of the input file
Signed-off-by: Junio C Hamano gits...@pobox.com --- attr.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/attr.c b/attr.c index cd54697..7c530f4 100644 --- a/attr.c +++ b/attr.c @@ -12,6 +12,7 @@ #include exec_cmd.h #include attr.h #include dir.h +#include utf8.h const char git_attr__true[] = (builtin)true; const char git_attr__false[] = \0(builtin)false; @@ -369,8 +370,12 @@ static struct attr_stack *read_attr_from_file(const char *path, int macro_ok) return NULL; } res = xcalloc(1, sizeof(*res)); - while (fgets(buf, sizeof(buf), fp)) - handle_attr_line(res, buf, path, ++lineno, macro_ok); + while (fgets(buf, sizeof(buf), fp)) { + char *bufp = buf; + if (!lineno) + skip_utf8_bom(bufp, strlen(bufp)); + handle_attr_line(res, bufp, path, ++lineno, macro_ok); + } fclose(fp); return res; } -- 2.4.0-rc2-171-g98ddf7f -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git-archive ignores submodules
On Thu, Apr 16, 2015 at 06:35:38PM +0100, Pedro Rodrigues wrote: I've been using git-archive as my main way of deploying to production servers, but today I've come across a git repo with submodules and found out that git archive has no option to include submodules on the output archive. As far as I know this is an known limitation that's just waiting for someone to solve. Thanks for bringing attention to it. This simply makes git-archive unusable on this scenario. Not completely. There's a simple workaround. Combine git submodule foreach with git archive and make an archive for each submodule. Not as simple as if git archive --recurse-submodule would have been implementet, but hopefully it can make things work for you at the moment. -- Fredrik Gustafsson phone: +46 733-608274 e-mail: iv...@iveqy.com website: http://www.iveqy.com -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] log -L: improve error message on malformed argument
Matthieu Moy matthieu@imag.fr writes: The old message did not mention the :regex:file form. To avoid overly long lines, split the message into two lines (in case item-string is long, it will be the only part truncated in a narrow terminal). Signed-off-by: Matthieu Moy matthieu@imag.fr --- line-log.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/line-log.c b/line-log.c index a490efe..e725248 100644 --- a/line-log.c +++ b/line-log.c @@ -575,7 +575,8 @@ parse_lines(struct commit *commit, const char *prefix, struct string_list *args) name_part = skip_range_arg(item-string); if (!name_part || *name_part != ':' || !name_part[1]) - die(-L argument '%s' not of the form start,end:file, + die(invalid -L argument '%s'.\n + It should be of the form start,end:file or :regex:file., item-string); range_part = xstrndup(item-string, name_part - item-string); name_part++; Hmm. While I understand and even agree with the reasoning behind chomping the line after a potentially long user-supplied argument, it somewhat bothers me that the output of subsequent lines would begin without the prefix. Do we have any other multi-line die/error/warning message? I am tempted to suggest doing something along the lines of the attached, if we were to start using multi-line die/error/warning like this one. Obviously we would need something similar on the vwritef() side as well. usage.c | 23 --- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/usage.c b/usage.c index ed14645..09710fa 100644 --- a/usage.c +++ b/usage.c @@ -8,9 +8,26 @@ void vreportf(const char *prefix, const char *err, va_list params) { - char msg[4096]; - vsnprintf(msg, sizeof(msg), err, params); - fprintf(stderr, %s%s\n, prefix, msg); + char msg[4096], *bol; + int len; + + len = vsnprintf(msg, sizeof(msg), err, params); + if (sizeof(msg) len) + len = sizeof(msg); + bol = msg; + while (1) { + int linelen; + char *eol = memchr(bol, '\n', len); + if (!eol) + linelen = len; + else + linelen = eol - bol; + fprintf(stderr, %s%.*s\n, prefix, linelen, bol); + if (!eol) + break; + bol = eol + 1; + len -= linelen + 1; + } } void vwritef(int fd, const char *prefix, const char *err, va_list params) -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git-archive ignores submodules
Am 16.04.2015 um 20:09 schrieb Pedro Rodrigues: Good to know about git submodule foreach. Simpler yet, I'm using: zip -r ../project.zip . -x *.git* Which essentially does the same thing I would need from git-archive --recurse-submodule, zip everything excluding .git folders. Still would be better to use git itself. Yes. 2015-04-16 18:56 GMT+01:00 Fredrik Gustafsson iv...@iveqy.com: On Thu, Apr 16, 2015 at 06:35:38PM +0100, Pedro Rodrigues wrote: I've been using git-archive as my main way of deploying to production servers, but today I've come across a git repo with submodules and found out that git archive has no option to include submodules on the output archive. As far as I know this is an known limitation that's just waiting for someone to solve. Thanks for bringing attention to it. I just rebased a patch Lars Hjemli posted in 2009 (which I kept in my GitHub repo to resurrect it when I find the time) to current master: https://github.com/jlehmann/git-submod-enhancements/commits/archive--submodules See gmane/$107030 for its original posting. Apart from renaming the '--submodule' option to '--recurse-submodules' and solving a merge conflict I didn't change anything. Not sure why it wasn't accepted back then, I'll have to read that thread more closely ... If people are interested I could try to polish it and resubmit it. It would be great if Pedro could test that it does what he expects. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] log -L: improve error message on malformed argument
On Thu, Apr 16, 2015 at 5:45 PM, Junio C Hamano gits...@pobox.com wrote: Matthieu Moy matthieu@imag.fr writes: The old message did not mention the :regex:file form. To avoid overly long lines, split the message into two lines (in case item-string is long, it will be the only part truncated in a narrow terminal). Signed-off-by: Matthieu Moy matthieu@imag.fr --- line-log.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/line-log.c b/line-log.c index a490efe..e725248 100644 --- a/line-log.c +++ b/line-log.c @@ -575,7 +575,8 @@ parse_lines(struct commit *commit, const char *prefix, struct string_list *args) name_part = skip_range_arg(item-string); if (!name_part || *name_part != ':' || !name_part[1]) - die(-L argument '%s' not of the form start,end:file, + die(invalid -L argument '%s'.\n + It should be of the form start,end:file or :regex:file., item-string); range_part = xstrndup(item-string, name_part - item-string); name_part++; You forgot to update tests to match their expectations? 4211.20 wants to see 'argument.*not of the form', it seems. An alternate wording might be: -L argument not 'start,end:file' or ':regex:file': %s This can still wrap onscreen but the important information (the two forms accepted by -L) is at the head of the error message, and the variable (user-supplied) bit is at the end after the colon, in typical error message form. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 0/4] UTF8 BOM follow-up
On Thu, Apr 16, 2015 at 11:39:04AM -0700, Junio C Hamano wrote: This is on top of the .gitignore can start with UTF8 BOM patch from Carlos. Second try; the first patch is new to clarify the logic in the codeflow after Carlos's patch, and the second one has been adjusted accordingly. Junio C Hamano (4): add_excludes_from_file: clarify the bom skipping logic utf8-bom: introduce skip_utf8_bom() helper config: use utf8_bom[] from utf.[ch] in git_parse_source() attr: skip UTF8 BOM at the beginning of the input file This one looks OK to me. The manual adjustment of size in the second one is a little funny, but I guess avoiding a pointer for that size parameter makes the final one (that uses strlen) a lot easier. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] test-lib-functions.sh: fix the second argument to some helper functions
Matthieu Moy matthieu@grenoble-inp.fr writes: Elia Pinto gitter.spi...@gmail.com writes: --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -478,7 +478,7 @@ test_external_without_stderr () { test_path_is_file () { if ! test -f $1 then -echo File $1 doesn't exist. $* +echo File $1 doesn't exist. $2 false fi } @@ -486,7 +486,7 @@ test_path_is_file () { test_path_is_dir () { if ! test -d $1 then -echo Directory $1 doesn't exist. $* +echo Directory $1 doesn't exist. $2 false fi } Sounds straightforwardly correct to me. Thanks. This however makes me wonder why you were nominated for reviewing this patch, though... -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
[git-owner CC dutifully removed] On Thu, Apr 16, 2015 at 05:48:42PM +0200, Johannes Schindelin wrote: Hi Peff, On 2015-04-16 17:28, Jeff King wrote: On Thu, Apr 16, 2015 at 01:35:05PM +0200, Andreas Mohr wrote: I strongly suspect that git's repacking implementation (probably unrelated to msysgit-specific deviations, IOW, git *core* handling) simply is buggy in that it may keep certain file descriptors open at least a certain time (depending on scope of implementation/objects!?) beyond having finished its operation (rename?). Hrm. [... detailed analysis, including a Minimal, Complete Verifiable Example ...] Thank you so much! I will definitely test this (at the moment, I have to recreate my build environment in a different VM than I used so far, that takes quite some time...) Your hash-object script successfully and with ease managed to provoke the issue again, thanks a lot! (syntax issue though: missed a '|' pipe). And I then did some unload tests (force-unloaded, via End Process Tree) of the virus, and the unlink issue persisted (but to be truly certain, I would have to rename away the entire virus installation tree). Not to mention that it already looks anyway like we seem to be on the way of nailing a genuine git handling bug... Also, I have a very hard time remembering that the retry unlink? EVER finally ended up successful (despite virus file activity surely being a very temporary thing!). So much for some related observations that I can contribute currently - I had no time left to actually work on it today but I'll try to do some testing given the very detailed (and gratifyingly matching :) analysis of Jeff King (thanks a lot, too!). Andreas Mohr -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] log -L: improve error message on malformed argument
Matthieu Moy matthieu@imag.fr writes: The old message did not mention the :regex:file form. To avoid overly long lines, split the message into two lines (in case item-string is long, it will be the only part truncated in a narrow terminal). Signed-off-by: Matthieu Moy matthieu@imag.fr --- line-log.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/line-log.c b/line-log.c index a490efe..e725248 100644 --- a/line-log.c +++ b/line-log.c @@ -575,7 +575,8 @@ parse_lines(struct commit *commit, const char *prefix, struct string_list *args) name_part = skip_range_arg(item-string); if (!name_part || *name_part != ':' || !name_part[1]) - die(-L argument '%s' not of the form start,end:file, + die(invalid -L argument '%s'.\n + It should be of the form start,end:file or :regex:file., item-string); range_part = xstrndup(item-string, name_part - item-string); name_part++; You forgot to update tests to match their expectations? 4211.20 wants to see 'argument.*not of the form', it seems. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] git-p4: Use -m when running p4 changes
Lex Spoon l...@lexspoon.org writes: From 9cc607667a20317c837afd90d50c078da659b72f Mon Sep 17 00:00:00 2001 From: Lex Spoon l...@lexspoon.org Date: Sat, 11 Apr 2015 10:01:15 -0400 Subject: [PATCH] git-p4: Use -m when running p4 changes All of the above is duplicate and shouldn't be added to the message; the recipient can pick them up from the e-mail headers. Please explain what this change intends to do (e.g. Is it a fix? If so, what is broken without this change? Is it an enhancement? If so, what cannot be done without this change, and how and why is the new thing the change enables a good thing?), and why it is a good idea to use -m to realize that objective. Signed-off-by: Lex Spoon l...@lexspoon.org --- Updated to include a test case git-p4.py | 51 ++- t/t9818-git-p4-block.sh | 64 + 2 files changed, 104 insertions(+), 11 deletions(-) create mode 100755 t/t9818-git-p4-block.sh diff --git a/git-p4.py b/git-p4.py index 549022e..2fc8d9c 100755 --- a/git-p4.py +++ b/git-p4.py @@ -740,17 +740,43 @@ def createOrUpdateBranchesFromOrigin(localRefPrefix = refs/remotes/p4/, silent def originP4BranchesExist(): return gitBranchExists(origin) or gitBranchExists(origin/p4) or gitBranchExists(origin/p4/master) It appears that the patch is severely linewrapped. diff --git a/t/t9818-git-p4-block.sh b/t/t9818-git-p4-block.sh new file mode 100755 index 000..73e545d --- /dev/null +++ b/t/t9818-git-p4-block.sh @@ -0,0 +1,64 @@ +#!/bin/sh + +test_description='git p4 fetching changes in multiple blocks' + +. ./lib-git-p4.sh + +test_expect_success 'start p4d' ' + start_p4d +' We do not do one-SP indent. Indent with tab instead. + +test_expect_success 'Create a repo with 100 changes' ' + ( + cd $cli + touch file.txt Do not use touch when the only thing you are interested in is that the file exists and you do not care about its timestamp. I.e. say file.txt instead. + p4 add file.txt + p4 submit -d Add file.txt + for i in 0 1 2 3 4 5 6 7 8 9 + do + touch outer$i.txt + p4 add outer$i.txt + p4 submit -d Adding outer$i.txt + for j in 0 1 2 3 4 5 6 7 8 9 + do + p4 edit file.txt + echo $i$j file.txt + p4 submit -d Commit $i$j + done + done + ) What happens when any of these commands in the -chain fails? ( cd $cli file.txt p4 ... for i in $(test_seq ...) do outer$i.txt p4 ... for j in $(test_seq ...) do p4 ... p4 ... || exit done done ) or something like that, perhaps? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: assert failed in submodule edge case
Am 13.04.2015 um 18:55 schrieb Dennis Kaarsemaker: Reported by djanos_ in #git: git add segfaults when you manage to confuse it with a submodule in the index that is no longer a submodule. Here's his script to reproduce the segfault: mkdir segfault cd segfault mkdir subrepo cd subrepo git init . echo a a git add a git commit -m a cd .. git init . git add . If you add a git rm -f --cached subrepo here the problem goes away. cd subrepo rm -rf .git git add . This last git add will now barf: git: pathspec.c:317: prefix_pathspec: Assertion `item-nowildcard_len = item-len item-prefix = item-len' failed. These all work as I think they are intended to: $ git -C segfault add subrepo/a fatal: Pathspec 'subrepo/a' is in submodule 'subrepo' $ git -C segfault/subrepo add a fatal: Pathspec 'a' is in submodule 'subrepo' And this does nothing, it also doesn't crash: $ git -C segfault add subrepo GDB backtrace below, this is with next as of e6f3401. Starting program: /home/dennis/code/git/git -C segfault/subrepo add . [Thread debugging using libthread_db enabled] Using host libthread_db library /lib/x86_64-linux-gnu/libthread_db.so.1. git: pathspec.c:317: prefix_pathspec: Assertion `item-nowildcard_len = item-len item-prefix = item-len' failed. Program received signal SIGABRT, Aborted. 0x7702ae37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory. (gdb) bt #0 0x7702ae37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x7702c528 in __GI_abort () at abort.c:89 #2 0x77023ce6 in __assert_fail_base (fmt=0x77173c08 %s%s%s:%u: %s%sAssertion `%s' failed.\n%n, assertion=assertion@entry=0x560660 item-nowildcard_len = item-len item-prefix = item-len, file=file@entry=0x560826 pathspec.c, line=line@entry=317, function=function@entry=0x560850 __PRETTY_FUNCTION__.22237 prefix_pathspec) at assert.c:92 #3 0x77023d92 in __GI___assert_fail (assertion=assertion@entry=0x560660 item-nowildcard_len = item-len item-prefix = item-len, file=file@entry=0x560826 pathspec.c, line=line@entry=317, function=function@entry=0x560850 __PRETTY_FUNCTION__.22237 prefix_pathspec) at assert.c:101 #4 0x004d6a37 in prefix_pathspec (elt=0x7fffdaf1 ., prefixlen=8, prefix=0x7dd09f subrepo/, flags=50, raw=0x7fffd648, p_short_magic=synthetic pointer, item=optimized out) at pathspec.c:316 #5 parse_pathspec (pathspec=pathspec@entry=0x7fffd240, magic_mask=magic_mask@entry=0, flags=flags@entry=50, prefix=prefix@entry=0x7dd09f subrepo/, argv=argv@entry=0x7fffd648) at pathspec.c:417 #6 0x0040698c in cmd_add (argc=optimized out, argv=0x7fffd648, prefix=0x7dd09f subrepo/) at builtin/add.c:370 #7 0x00406001 in run_builtin (argv=0x7fffd640, argc=2, p=0x79d740 commands) at git.c:350 #8 handle_builtin (argc=2, argv=0x7fffd640) at git.c:532 #9 0x00405151 in run_argv (argv=0x7fffd458, argcp=0x7fffd43c) at git.c:578 #10 main (argc=2, av=optimized out) at git.c:686 I dig a bit into pathspec.c and it looks like the PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE code in prefix_pathspec is only stripping out paths inside the submodule, not the submodule itself. I also guess that it shouldn't as otherwise you can't ever 'git add' a submodule. So I have no idea how to proceed, or even what the correct behaviour of 'git add' should be in this case. I do think that failing an assert() to tell a user he's doing something unexpected/silly/wrong isn't the right thing to do though :) The problem seems to be that subrepo is still registered as a submodule in the index. But we should see a proper error message instead of an assert in that case ... CCed Duy who knows much more about pathspec.c than me. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git-archive ignores submodules
I sure can. Just send me an ID I can pull and test in here (not really into C, so this the least I can contribute). Although, my expectations are very simple, I just expect(ed) the exact same git-archive command to be run on submodule(s), and have an output on a single zip|tar|whatever file. Not completely off topic, but for consistency consider that: git-clone supports --recursive and --recurse-submodules, which do the same thing. git-pull and git-push only support --recurse-submodules. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/7] path.c: implement xdg_config_home()
On Tue, Apr 14, 2015 at 1:28 PM, Paul Tan pyoka...@gmail.com wrote: Below is the fixed patch. I also decided to return NULL if `filename` is NULL because such an input usually indicated an uncaught error. Unfortunately, this blurs the line between programmer error (passing NULL for filename) and a user/configuration error (XDG_CONFIG_HOME and HOME being undefined). If there is indeed no valid interpretation for filename==NULL, then it may be better to die() or assert() here to flag the programmer error as early as possible, rather than returning NULL. More below. 8 The XDG base dir spec[1] specifies that configuration files be stored in a subdirectory in $XDG_CONFIG_HOME. To construct such a configuration file path, home_config_paths() can be used. However, home_config_paths() combines distinct functionality: 1. Retrieve the home git config file path ~/.gitconfig 2. Construct the XDG config path of the file specified by `file`. This function was introduced in commit 21cf3227 (read (but not write) from $XDG_CONFIG_HOME/git/config file). While the intention of the function was to allow the home directory configuration file path and the xdg directory configuration file path to be retrieved with one function call, the hard-coding of the path ~/.gitconfig prevents it from being used for other configuration files. Furthermore, retrieving a file path relative to the user's home directory can be done with expand_user_path(). Hence, it can be seen that home_config_paths() introduces unnecessary complexity, especially if a user just wants to retrieve the xdg config file path. As such, implement a simpler function xdg_config_home() for constructing the XDG base dir spec configuration file path. This function, together with expand_user_path(), can replace all uses of home_config_paths(). [1] http://standards.freedesktop.org/basedir-spec/basedir-spec-0.7.html Signed-off-by: Paul Tan pyoka...@gmail.com --- diff --git a/cache.h b/cache.h index 3d3244b..2db10b8 100644 --- a/cache.h +++ b/cache.h @@ -836,6 +836,14 @@ char *strip_path_suffix(const char *path, const char *suffix); int daemon_avoid_alias(const char *path); extern int is_ntfs_dotgit(const char *name); +/** + * Returns the newly allocated string $XDG_CONFIG_HOME/git/{filename}. If + * $XDG_CONFIG_HOME is unset or empty, returns the newly allocated string + * $HOME/.config/git/{filename}. Returns NULL if filename is NULL or an error + * occurred. + */ This is better than the original which said $XDG_CONFIG_HOME/git/%s, but is still potentially confusing. When I read the earlier iteration, I was left with the impression that it was returning that literal string, with '$' and '%s' embedded, and that the caller would have to process it further to have '$' and '%s' expanded. Perhaps rephrasing it something like this will help? Return a newly allocated string with value xdg+/git/+filename where xdg is the interpolated value of $XDG_CONFIG_HOME if defined and non-empty, otherwise $HOME/.config. Return NULL upon error. Also, for consistency with other API documentation, say Return rather than Returns. More below. +extern char *xdg_config_home(const char *filename); + /* object replacement */ #define LOOKUP_REPLACE_OBJECT 1 extern void *read_sha1_file_extended(const unsigned char *sha1, enum object_type *type, unsigned long *size, unsigned flag); diff --git a/path.c b/path.c index e608993..8ee7191 100644 --- a/path.c +++ b/path.c @@ -856,3 +856,19 @@ int is_ntfs_dotgit(const char *name) len = -1; } } + +char *xdg_config_home(const char *filename) +{ + const char *config_home = getenv(XDG_CONFIG_HOME); + + if (!filename) + return NULL; See above regarding conflation of programmer error and user/configuration error. + if (!config_home || !config_home[0]) { On this project, *config_home is usually favored over config_home[0]. + const char *home = getenv(HOME); + + if (!home) + return NULL; + return mkpathdup(%s/.config/git/%s, home, filename); + } else + return mkpathdup(%s/git/%s, config_home, filename); This logic is more difficult to follow than it ought to be due to the use of 'config_home' so distant from the 'if' which checked it, and due to the nested 'if'. It could be expressed more straight-forwardly as: if (config_home *config_home) return mkpathdup(%s/git/%s, config_home, filename); home = getenv(HOME); if (home) return mkpathdup(%s/.config/git/%s, home, filename); return NULL; +} -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] compat/mingw: stubs for getpgid() and tcgetpgrp()
On Wed, Apr 15, 2015 at 08:29:30PM +0200, Johannes Sixt wrote: | Windows does not have process groups. It is, therefore, the simplest | to pretend that each process is in its own process group. | | [...] | | diff --git a/compat/mingw.h b/compat/mingw.h | index 7b523cf..a552026 100644 | @@ -118,6 +116,12 @@ static inline int sigaddset(sigset_t *set, int signum) | #define SIG_UNBLOCK 0 | static inline int sigprocmask(int how, const sigset_t *set, sigset_t *oldset) | { return 0; } | +static inline pid_t getppid(void) | +{ return 1; } | +static inline pid_t getpgid(pid_t pid) | +{ return pid == 0 ? getpid() : pid; } | +static inline pid_t tcgetpgrp(int fd) | +{ return getpid(); } This appears to be similar to the approach that tcsh uses too; return the current process ID for the process group ID. See https://github.com/tcsh-org/tcsh/blob/master/win32/ntport.h for tcsh's implementation of getpgrp() (a variation of getpgid()) and tcgetpgrp(). regards, Luke. pgpyN0nrBi1n3.pgp Description: PGP signature
Re: [PATCH] sha1_file.c: make parse_sha1_header_extended() static
On April 16, 2015 5:55:25 PM GMT+05:30, Ramsay Jones ram...@ramsay1.demon.co.uk wrote: commit 9e1f5bc0 (sha1_file.c: support reading from a loose object of unknown type, 15-04-2015) added a new externally visible function which does not require more than file scope. This causes sparse to issue a warning message about this symbol. In order to suppress the warning, add the static qualifier to the function definition. [An alternative solution, if this symbol should have external scope, is to add an external declaration for the function to the cache.h header file (next to the one for parse_sha1_header()).] Signed-off-by: Ramsay Jones ram...@ramsay1.demon.co.uk --- Hi Karthik, If you need to re-roll your patches in the 'kn/cat-file-literally' branch, could you please squash this, or something like it, into the relevant patch. Thanks for this Ramsay, will be re rolling, will squash it into my commit. Regards Karthik Thanks! ATB, Ramsay Jones sha1_file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sha1_file.c b/sha1_file.c index 267399d..2b81534 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1642,7 +1642,7 @@ static void *unpack_sha1_rest(git_zstream *stream, void *buffer, unsigned long s * too permissive for what we want to check. So do an anal * object header parse by hand. */ -int parse_sha1_header_extended(const char *hdr, struct object_info *oi, +static int parse_sha1_header_extended(const char *hdr, struct object_info *oi, unsigned int flags) { struct strbuf typename = STRBUF_INIT; -- 2.3.0 -- Sent from my Android device with K-9 Mail. Please excuse my brevity. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 2/4] cat-file: teach cat-file a '--literally' option
On Thu, Apr 16, 2015 at 7:10 PM, Karthik Nayak karthik@gmail.com wrote: On April 16, 2015 7:05:04 PM GMT+05:30, Junio C Hamano gits...@pobox.com wrote: I meant just a different message. The point of -e is to see if the thing exists. It is good to mention _how_ the result is reported back to the user (i.e. via the exit code, not via an output to the standard output exists vs missing, for example), but that is secondary. Telling how it reports is meaningless without telling what it reports in the first place. I see what you mean. But I think it's beyond the scope of this patch series. It is perfectly fine to leave it as-is; that is why I said This is a tangent. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] git-p4: Use -m when running p4 changes
On 15/04/15 04:47, Lex Spoon wrote: From 9cc607667a20317c837afd90d50c078da659b72f Mon Sep 17 00:00:00 2001 From: Lex Spoon l...@lexspoon.org Date: Sat, 11 Apr 2015 10:01:15 -0400 Subject: [PATCH] git-p4: Use -m when running p4 changes This patch didn't want to apply for me, I'm not quite sure why but possibly it's become scrambled? Either that or I'm doing it wrong! If you use git send-email it should Just Work. As an aside could you post reworked versions of patches with a subject line of [PATCH v2], [PATCH v3], etc, so reviewers can keep track of what's going on? Note to other reviewers: the existing git-p4 has a --max-changes option for 'sync', but this doesn't do the same thing at all. It doesn't limit the number of changes requested from the server, it just limits the number of changes pulled down, after the p4 server has supplied those changes. This confused me at first! Lex - I should have mentioned this before, but would you be able to add some documentation to Documentation/git-p4.txt to explain what your new option does? It would help to distinguish between your option and the existing --max-changes option. I've put a few remarks below in your shell script; there are a few minor issues that could do with being tidied up. Thanks! Luke snip diff --git a/t/t9818-git-p4-block.sh b/t/t9818-git-p4-block.sh new file mode 100755 index 000..73e545d --- /dev/null +++ b/t/t9818-git-p4-block.sh @@ -0,0 +1,64 @@ +#!/bin/sh + +test_description='git p4 fetching changes in multiple blocks' + +. ./lib-git-p4.sh + +test_expect_success 'start p4d' ' + start_p4d +' + +test_expect_success 'Create a repo with 100 changes' ' + ( + cd $cli This doesn't look like enough indentation. The tests normally get a hard tab indent at each level. + touch file.txt + p4 add file.txt + p4 submit -d Add file.txt + for i in 0 1 2 3 4 5 6 7 8 9 + do + touch outer$i.txt + p4 add outer$i.txt + p4 submit -d Adding outer$i.txt + for j in 0 1 2 3 4 5 6 7 8 9 + do + p4 edit file.txt + echo $i$j file.txt Please put the file argument immediately after the redirection, i.e. echo $i$j file.txt (Which you've done below in fact). + p4 submit -d Commit $i$j + done + done + ) +' + +test_expect_success 'Clone the repo' ' + git p4 clone --dest=$git --changes-block-size=10 --verbose //depot@all +' + +test_expect_success 'All files are present' ' + echo file.txt expected + test_write_lines outer0.txt outer1.txt outer2.txt outer3.txt outer4.txt expected + test_write_lines outer5.txt outer6.txt outer7.txt outer8.txt outer9.txt expected + ls $git current + test_cmp expected current +' + +test_expect_success 'file.txt is correct' ' + echo 99 expected + test_cmp expected $git/file.txt +' + +test_expect_success 'Correct number of commits' ' + (cd $git; git log --oneline) log Use rather than ; + test_line_count = 111 log +' + +test_expect_success 'Previous version of file.txt is correct' ' + (cd $git; git checkout HEAD^^) As above. + echo 97 expected + test_cmp expected $git/file.txt +' + +test_expect_success 'kill p4d' ' + kill_p4d +' + +test_done Looks good other than that (+Junio's comments). Thanks! Luke -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8 2/4] cat-file: teach cat-file a '--literally' option
On April 16, 2015 7:05:04 PM GMT+05:30, Junio C Hamano gits...@pobox.com wrote: karthik nayak karthik@gmail.com writes: On 04/16/2015 02:22 AM, Junio C Hamano wrote: This is a tangent, but while we are in the vicinity, we may want to rethink the help message we attach to the '-e' option. Technically the current message is _not_ wrong per-se, but it misses the point. The primary thing the option does is to check the (e)xistence of the named object, and the fact that it does so silently is merely a detail of the operation. The current help text omits the more important part of what the option is. Would you rather check '-e' and go on to check '-p' or do you merely just want a different message. I meant just a different message. The point of -e is to see if the thing exists. It is good to mention _how_ the result is reported back to the user (i.e. via the exit code, not via an output to the standard output exists vs missing, for example), but that is secondary. Telling how it reports is meaningless without telling what it reports in the first place. I see what you mean. But I think it's beyond the scope of this patch series. If no one does, I'll work on it ... when a user is giving the '-e' option he just expects a silent output if the object exists, hence we rather have the option '-e' behave as a mutually exclusive option... Yes, and that is in line with the switch to OPT_CMDMODE. -- Sent from my Android device with K-9 Mail. Please excuse my brevity. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/3] Move get_max_fd_limit(void) to git_compat_util.h
Signed-off-by: Stefan Beller sbel...@google.com --- git-compat-util.h | 1 + sha1_file.c | 41 - wrapper.c | 41 + 3 files changed, 42 insertions(+), 41 deletions(-) diff --git a/git-compat-util.h b/git-compat-util.h index bc8fc8c..2c55ca7 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -675,6 +675,7 @@ extern int xmkstemp_mode(char *template, int mode); extern int odb_mkstemp(char *template, size_t limit, const char *pattern); extern int odb_pack_keep(char *name, size_t namesz, const unsigned char *sha1); extern char *xgetcwd(void); +extern unsigned int get_max_fd_limit(void); #define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), (alloc) * sizeof(*(x))) diff --git a/sha1_file.c b/sha1_file.c index 88f06ba..1f2519c 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -870,47 +870,6 @@ void free_pack_by_name(const char *pack_name) } } -static unsigned int get_max_fd_limit(void) -{ -#ifdef RLIMIT_NOFILE - { - struct rlimit lim; - - if (!getrlimit(RLIMIT_NOFILE, lim)) - return lim.rlim_cur; - } -#endif - -#ifdef _SC_OPEN_MAX - { - long open_max = sysconf(_SC_OPEN_MAX); - if (0 open_max) - return open_max; - /* -* Otherwise, we got -1 for one of the two -* reasons: -* -* (1) sysconf() did not understand _SC_OPEN_MAX -* and signaled an error with -1; or -* (2) sysconf() said there is no limit. -* -* We _could_ clear errno before calling sysconf() to -* tell these two cases apart and return a huge number -* in the latter case to let the caller cap it to a -* value that is not so selfish, but letting the -* fallback OPEN_MAX codepath take care of these cases -* is a lot simpler. -*/ - } -#endif - -#ifdef OPEN_MAX - return OPEN_MAX; -#else - return 1; /* see the caller ;-) */ -#endif -} - /* * Do not call this directly as this leaks p-pack_fd on error return; * call open_packed_git() instead. diff --git a/wrapper.c b/wrapper.c index d5a6cef..493bf6f 100644 --- a/wrapper.c +++ b/wrapper.c @@ -303,6 +303,47 @@ ssize_t pread_in_full(int fd, void *buf, size_t count, off_t offset) return total; } +unsigned int get_max_fd_limit(void) +{ +#ifdef RLIMIT_NOFILE + { + struct rlimit lim; + + if (!getrlimit(RLIMIT_NOFILE, lim)) + return lim.rlim_cur; + } +#endif + +#ifdef _SC_OPEN_MAX + { + long open_max = sysconf(_SC_OPEN_MAX); + if (0 open_max) + return open_max; + /* +* Otherwise, we got -1 for one of the two +* reasons: +* +* (1) sysconf() did not understand _SC_OPEN_MAX +* and signaled an error with -1; or +* (2) sysconf() said there is no limit. +* +* We _could_ clear errno before calling sysconf() to +* tell these two cases apart and return a huge number +* in the latter case to let the caller cap it to a +* value that is not so selfish, but letting the +* fallback OPEN_MAX codepath take care of these cases +* is a lot simpler. +*/ + } +#endif + +#ifdef OPEN_MAX + return OPEN_MAX; +#else + return 1; /* see the caller ;-) */ +#endif +} + int xdup(int fd) { int ret = dup(fd); -- 2.3.0.81.gc37f363 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/3] Another approach to large transactions
* We keep the speed on small transactions (no close and reopen of fds in small transactions) * No refactoring for refs included, only minimally invasive to the refs.c code * applies on top of origin/sb/remove-fd-from-ref-lock replacing the last commit there (I reworded the commit message of the last patch of that tip, being the first patch in this series) * another approach would be to move the fd counting into the lock file api, I think that's not worth it for now. Stefan Beller (3): refs.c: remove lock_fd from struct ref_lock Move unsigned int get_max_fd_limit(void) to git_compat_util.h refs.c: enable large transactions git-compat-util.h | 1 + refs.c| 28 ++-- sha1_file.c | 41 - t/t1400-update-ref.sh | 4 ++-- wrapper.c | 41 + 5 files changed, 62 insertions(+), 53 deletions(-) -- 2.3.0.81.gc37f363 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/3] refs.c: enable large transactions
This is another attempt on enabling large transactions (large in terms of open file descriptors). We keep track of how many lock files are opened by the ref_transaction_commit function. When more than a reasonable amount of files is open, we close the file descriptors to make sure the transaction can continue. Another idea I had during implementing this was to move this file closing into the lock file API, such that only a certain amount of lock files can be open at any given point in time and we'd be 'garbage collecting' open fds when necessary in any relevant call to the lock file API. This would have brought the advantage of having such functionality available in other users of the lock file API as well. The downside however is the over complication, you really need to always check for (lock-fd != -1) all the time, which may slow down other parts of the code, which did not ask for such a feature. Signed-off-by: Stefan Beller sbel...@google.com --- refs.c| 13 + t/t1400-update-ref.sh | 4 ++-- 2 files changed, 15 insertions(+), 2 deletions(-) diff --git a/refs.c b/refs.c index 4f495bd..44e0b15 100644 --- a/refs.c +++ b/refs.c @@ -3041,6 +3041,8 @@ static int write_ref_sha1(struct ref_lock *lock, errno = EINVAL; return -1; } + if (lock-lk-fd == -1) + reopen_lock_file(lock-lk); if (write_in_full(lock-lk-fd, sha1_to_hex(sha1), 40) != 40 || write_in_full(lock-lk-fd, term, 1) != 1 || close_ref(lock) 0) { @@ -3719,6 +3721,12 @@ int ref_transaction_commit(struct ref_transaction *transaction, { int ret = 0, i; int n = transaction-nr; + /* +* We may want to open many files in a large transaction, so come up with +* a reasonable maximum, keep some spares for stdin/out and other open +* files. +*/ + int remaining_fds = get_max_fd_limit() - 8; struct ref_update **updates = transaction-updates; struct string_list refs_to_delete = STRING_LIST_INIT_NODUP; struct string_list_item *ref_to_delete; @@ -3762,6 +3770,11 @@ int ref_transaction_commit(struct ref_transaction *transaction, update-refname); goto cleanup; } + if (remaining_fds 0) { + remaining_fds--; + } else { + close_lock_file(update-lock-lk); + } } /* Perform updates first so live commits remain referenced */ diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh index 7a69f1a..636d3a1 100755 --- a/t/t1400-update-ref.sh +++ b/t/t1400-update-ref.sh @@ -1071,7 +1071,7 @@ run_with_limited_open_files () { test_lazy_prereq ULIMIT_FILE_DESCRIPTORS 'run_with_limited_open_files true' -test_expect_failure ULIMIT_FILE_DESCRIPTORS 'large transaction creating branches does not burst open file limit' ' +test_expect_success ULIMIT_FILE_DESCRIPTORS 'large transaction creating branches does not burst open file limit' ' ( for i in $(test_seq 33) do @@ -1082,7 +1082,7 @@ test_expect_failure ULIMIT_FILE_DESCRIPTORS 'large transaction creating branches ) ' -test_expect_failure ULIMIT_FILE_DESCRIPTORS 'large transaction deleting branches does not burst open file limit' ' +test_expect_success ULIMIT_FILE_DESCRIPTORS 'large transaction deleting branches does not burst open file limit' ' ( for i in $(test_seq 33) do -- 2.3.0.81.gc37f363 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/3] refs.c: remove lock_fd from struct ref_lock
The 'lock_fd' is the same as 'lk-fd'. No need to store it twice so remove it. No functional changes intended. Signed-off-by: Stefan Beller sbel...@google.com Reviewed-by: Michael Haggerty mhag...@alum.mit.edu Signed-off-by: Junio C Hamano gits...@pobox.com --- refs.c | 15 +-- 1 file changed, 5 insertions(+), 10 deletions(-) diff --git a/refs.c b/refs.c index 47e4e53..4f495bd 100644 --- a/refs.c +++ b/refs.c @@ -11,7 +11,6 @@ struct ref_lock { char *orig_ref_name; struct lock_file *lk; unsigned char old_sha1[20]; - int lock_fd; }; /* @@ -2284,7 +2283,6 @@ static struct ref_lock *lock_ref_sha1_basic(const char *refname, int attempts_remaining = 3; lock = xcalloc(1, sizeof(struct ref_lock)); - lock-lock_fd = -1; if (mustexist) resolve_flags |= RESOLVE_REF_READING; @@ -2356,8 +2354,7 @@ static struct ref_lock *lock_ref_sha1_basic(const char *refname, goto error_return; } - lock-lock_fd = hold_lock_file_for_update(lock-lk, ref_file, lflags); - if (lock-lock_fd 0) { + if (hold_lock_file_for_update(lock-lk, ref_file, lflags) 0) { last_errno = errno; if (errno == ENOENT --attempts_remaining 0) /* @@ -2868,7 +2865,6 @@ static int close_ref(struct ref_lock *lock) { if (close_lock_file(lock-lk)) return -1; - lock-lock_fd = -1; return 0; } @@ -2876,7 +2872,6 @@ static int commit_ref(struct ref_lock *lock) { if (commit_lock_file(lock-lk)) return -1; - lock-lock_fd = -1; return 0; } @@ -3046,8 +3041,8 @@ static int write_ref_sha1(struct ref_lock *lock, errno = EINVAL; return -1; } - if (write_in_full(lock-lock_fd, sha1_to_hex(sha1), 40) != 40 || - write_in_full(lock-lock_fd, term, 1) != 1 || + if (write_in_full(lock-lk-fd, sha1_to_hex(sha1), 40) != 40 || + write_in_full(lock-lk-fd, term, 1) != 1 || close_ref(lock) 0) { int save_errno = errno; error(Couldn't write %s, lock-lk-filename.buf); @@ -4084,9 +4079,9 @@ int reflog_expire(const char *refname, const unsigned char *sha1, status |= error(couldn't write %s: %s, log_file, strerror(errno)); } else if (update - (write_in_full(lock-lock_fd, + (write_in_full(lock-lk-fd, sha1_to_hex(cb.last_kept_sha1), 40) != 40 || -write_str_in_full(lock-lock_fd, \n) != 1 || +write_str_in_full(lock-lk-fd, \n) != 1 || close_ref(lock) 0)) { status |= error(couldn't write %s, lock-lk-filename.buf); -- 2.3.0.81.gc37f363 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: rebase --root conflicts with --committer-date-is-author-date
Junio C Hamano gits...@pobox.com wrote: Chris, do you remember if there was a reason why it was a bad idea to teach the normal rebase codepath to handle --root? I think we would have needed to allow am to apply a creation patch and start a new history on an unborn branch in order to do so, but I am not sure if there was a valid reason why such a change to am would have been a bad idea. Hi. It's a long time ago, but I don't remember any reason and it feels sensible that am should be able to create an unborn branch in the same way interactive rebase can. I suspect I had done the necessary work for rebase -i but not for am, and incorrectly assumed that interactive rebase was in any case a superset of non-interactive. Best wishes, Chris. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git-archive ignores submodules
Good to know about git submodule foreach. Simpler yet, I'm using: zip -r ../project.zip . -x *.git* Which essentially does the same thing I would need from git-archive --recurse-submodule, zip everything excluding .git folders. Still would be better to use git itself. 2015-04-16 18:56 GMT+01:00 Fredrik Gustafsson iv...@iveqy.com: On Thu, Apr 16, 2015 at 06:35:38PM +0100, Pedro Rodrigues wrote: I've been using git-archive as my main way of deploying to production servers, but today I've come across a git repo with submodules and found out that git archive has no option to include submodules on the output archive. As far as I know this is an known limitation that's just waiting for someone to solve. Thanks for bringing attention to it. This simply makes git-archive unusable on this scenario. Not completely. There's a simple workaround. Combine git submodule foreach with git archive and make an archive for each submodule. Not as simple as if git archive --recurse-submodule would have been implementet, but hopefully it can make things work for you at the moment. -- Fredrik Gustafsson phone: +46 733-608274 e-mail: iv...@iveqy.com website: http://www.iveqy.com -- Pedro Rodrigues +244 917 774 823 +351 969 042 335 Mail: prodrigues1...@gmail.com -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] utf8-bom: introduce skip_utf8_bom() helper
On Thu, Apr 16, 2015 at 10:52:52AM -0700, Junio C Hamano wrote: @@ -576,10 +576,8 @@ int add_excludes_from_file_to_list(const char *fname, el-filebuf = buf; - if (size = 3 !memcmp(buf, utf8_bom, 3)) - entry = buf + 3; - else - entry = buf; + entry = buf; + skip_utf8_bom(entry, size); for (i = 0; i size; i++) { if (buf[i] == '\n') { I'm surprised that in both yours and the original that we do not need to subtract 3 from size. It looks like we advance entry here, not buf, and then iterate over buf. But I think that makes the later logic weird: if (entry != buf + i entry[0] != '#') because if there is a BOM, we end up with entry buf + i, which I think this code isn't expecting. I'm not sure it does anything bad, but I think it might be simpler as just: /* save away the real copy for later, as we do now */ el-filebuf = buf; /* * now pretend as if the BOM was not there at all by advancing * the pointer and shrinking the size */ skip_utf8_bom(buf, size); /* * and now we do our usual magic with entry */ entry = buf; for (i = 0; i size; i++) ... -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] utf8-bom: introduce skip_utf8_bom() helper
Jeff King p...@peff.net writes: On Thu, Apr 16, 2015 at 10:52:52AM -0700, Junio C Hamano wrote: @@ -576,10 +576,8 @@ int add_excludes_from_file_to_list(const char *fname, el-filebuf = buf; -if (size = 3 !memcmp(buf, utf8_bom, 3)) -entry = buf + 3; -else -entry = buf; +entry = buf; +skip_utf8_bom(entry, size); for (i = 0; i size; i++) { if (buf[i] == '\n') { I'm surprised that in both yours and the original that we do not need to subtract 3 from size. Or we start scanning from the beginning of buf, i.e. for (i = 0; i size; i++) After you pointed it out, I wondered why we do not adjust the initial value of i (without futzing with size). But... It looks like we advance entry here, not buf, and then iterate over buf. But I think that makes the later logic weird: if (entry != buf + i entry[0] != '#') because if there is a BOM, we end up with entry buf + i, which I think this code isn't expecting. I'm not sure it does anything bad, but I think it might be simpler as just: /* save away the real copy for later, as we do now */ el-filebuf = buf; /* * now pretend as if the BOM was not there at all by advancing * the pointer and shrinking the size */ skip_utf8_bom(buf, size); /* * and now we do our usual magic with entry */ entry = buf; for (i = 0; i size; i++) ... ... this would work much better for this caller. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] dir: allow a BOM at the beginning of exclude files
On Thu, 2015-04-16 at 10:16 -0700, Junio C Hamano wrote: Jeff King p...@peff.net writes: On Thu, Apr 16, 2015 at 08:39:55AM -0700, Junio C Hamano wrote: test_expect_success 'status untracked directory with --ignored' ' echo ignored .gitignore +sed -e s/^/\xef\xbb\xbf/ .gitignore .gitignore.new +mv .gitignore.new .gitignore Is this write literal in \xHEX on the replacement side of sed substitution potable? In any case, replacing the above three with something like: printf bomignored\n .gitignore may be more sensible, no? I'm not sure about sed, but I agree it is suspect. And note that printf with hex codes is not portable, either You have to use octal: printf '\357\273\277ignored\n' .gitignore Also, as a nit, I'd much rather see this in its own test rather than crammed into another test_expect_success. It's much easier to diagnose failures if the test description mentions the goal, and it is not tied up with testing other parts that might fail. Yeah, I totally agree. Carlos, something like this squashed in, perhaps? t/t7061-wtstatus-ignore.sh | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh index 0a06fbf..cdc0747 100755 --- a/t/t7061-wtstatus-ignore.sh +++ b/t/t7061-wtstatus-ignore.sh @@ -13,8 +13,6 @@ EOF test_expect_success 'status untracked directory with --ignored' ' echo ignored .gitignore - sed -e s/^/\xef\xbb\xbf/ .gitignore .gitignore.new - mv .gitignore.new .gitignore mkdir untracked : untracked/ignored : untracked/uncommitted @@ -22,6 +20,15 @@ test_expect_success 'status untracked directory with --ignored' ' test_cmp expected actual ' +test_expect_success 'same with gitignore starting with BOM' ' + printf \357\273\277ignored\n .gitignore + mkdir -p untracked + : untracked/ignored + : untracked/uncommitted + git status --porcelain --ignored actual + test_cmp expected actual +' + cat expected \EOF ?? .gitignore ?? actual Yeah, that makes sense. I had something similar in my patch at one point before going with modifying the current one. cmn -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 1/4] add_excludes_from_file: clarify the bom skipping logic
Even though the previous step shifts where the entry begins, we still iterate over the original buf[], which may begin with the UTF-8 BOM we are supposed to be skipping. At the end of the first line, the code grabs the contents of it starting at entry, so there is nothing wrong per-se, but the logic looks really confused. Instead, move the buf pointer and shrink its size, to truly pretend that UTF-8 BOM did not exist in the input. Signed-off-by: Junio C Hamano gits...@pobox.com --- dir.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/dir.c b/dir.c index 10c1f90..b5bb389 100644 --- a/dir.c +++ b/dir.c @@ -576,10 +576,11 @@ int add_excludes_from_file_to_list(const char *fname, el-filebuf = buf; - if (size = 3 !memcmp(buf, utf8_bom, 3)) - entry = buf + 3; - else - entry = buf; + if (size = 3 !memcmp(buf, utf8_bom, 3)) { + buf += 3; + size -= 3; + } + entry = buf; for (i = 0; i size; i++) { if (buf[i] == '\n') { -- 2.4.0-rc2-171-g98ddf7f -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 3/4] config: use utf8_bom[] from utf.[ch] in git_parse_source()
Because the function reads one character at the time, unfortunately we cannot use the easier skip_utf8_bom() helper, but at least we do not have to duplicate the constant string this way. Signed-off-by: Junio C Hamano gits...@pobox.com --- config.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/config.c b/config.c index 752e2e2..9618aa4 100644 --- a/config.c +++ b/config.c @@ -12,6 +12,7 @@ #include quote.h #include hashmap.h #include string-list.h +#include utf8.h struct config_source { struct config_source *prev; @@ -412,8 +413,7 @@ static int git_parse_source(config_fn_t fn, void *data) struct strbuf *var = cf-var; /* U+FEFF Byte Order Mark in UTF8 */ - static const unsigned char *utf8_bom = (unsigned char *) \xef\xbb\xbf; - const unsigned char *bomptr = utf8_bom; + const char *bomptr = utf8_bom; for (;;) { int c = get_next_char(); @@ -421,7 +421,7 @@ static int git_parse_source(config_fn_t fn, void *data) /* We are at the file beginning; skip UTF8-encoded BOM * if present. Sane editors won't put this in on their * own, but e.g. Windows Notepad will do it happily. */ - if ((unsigned char) c == *bomptr) { + if (c == (*bomptr 0377)) { bomptr++; continue; } else { -- 2.4.0-rc2-171-g98ddf7f -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 0/4] UTF8 BOM follow-up
This is on top of the .gitignore can start with UTF8 BOM patch from Carlos. Second try; the first patch is new to clarify the logic in the codeflow after Carlos's patch, and the second one has been adjusted accordingly. Junio C Hamano (4): add_excludes_from_file: clarify the bom skipping logic utf8-bom: introduce skip_utf8_bom() helper config: use utf8_bom[] from utf.[ch] in git_parse_source() attr: skip UTF8 BOM at the beginning of the input file attr.c | 9 +++-- config.c | 6 +++--- dir.c| 10 +- utf8.c | 11 +++ utf8.h | 3 +++ 5 files changed, 29 insertions(+), 10 deletions(-) -- 2.4.0-rc2-171-g98ddf7f -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 4/4] attr: skip UTF8 BOM at the beginning of the input file
Signed-off-by: Junio C Hamano gits...@pobox.com --- attr.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/attr.c b/attr.c index cd54697..7c530f4 100644 --- a/attr.c +++ b/attr.c @@ -12,6 +12,7 @@ #include exec_cmd.h #include attr.h #include dir.h +#include utf8.h const char git_attr__true[] = (builtin)true; const char git_attr__false[] = \0(builtin)false; @@ -369,8 +370,12 @@ static struct attr_stack *read_attr_from_file(const char *path, int macro_ok) return NULL; } res = xcalloc(1, sizeof(*res)); - while (fgets(buf, sizeof(buf), fp)) - handle_attr_line(res, buf, path, ++lineno, macro_ok); + while (fgets(buf, sizeof(buf), fp)) { + char *bufp = buf; + if (!lineno) + skip_utf8_bom(bufp, strlen(bufp)); + handle_attr_line(res, bufp, path, ++lineno, macro_ok); + } fclose(fp); return res; } -- 2.4.0-rc2-171-g98ddf7f -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 2/4] utf8-bom: introduce skip_utf8_bom() helper
With the recent change to ignore the UTF8 BOM at the beginning of .gitignore files, we now have two codepaths that do such a skipping (the other one is for reading the configuration files). Introduce utf8_bom[] constant string and skip_utf8_bom() helper and teach .gitignore code how to use it. Signed-off-by: Junio C Hamano gits...@pobox.com --- dir.c | 9 - utf8.c | 11 +++ utf8.h | 3 +++ 3 files changed, 18 insertions(+), 5 deletions(-) diff --git a/dir.c b/dir.c index b5bb389..4c4bf91 100644 --- a/dir.c +++ b/dir.c @@ -12,6 +12,7 @@ #include refs.h #include wildmatch.h #include pathspec.h +#include utf8.h struct path_simplify { int len; @@ -538,7 +539,6 @@ int add_excludes_from_file_to_list(const char *fname, struct stat st; int fd, i, lineno = 1; size_t size = 0; - static const unsigned char *utf8_bom = (unsigned char *) \xef\xbb\xbf; char *buf, *entry; fd = open(fname, O_RDONLY); @@ -576,10 +576,9 @@ int add_excludes_from_file_to_list(const char *fname, el-filebuf = buf; - if (size = 3 !memcmp(buf, utf8_bom, 3)) { - buf += 3; - size -= 3; - } + if (skip_utf8_bom(buf, size)) + size -= buf - el-filebuf; + entry = buf; for (i = 0; i size; i++) { diff --git a/utf8.c b/utf8.c index 520fbb4..28e6d76 100644 --- a/utf8.c +++ b/utf8.c @@ -633,3 +633,14 @@ int is_hfs_dotgit(const char *path) return 1; } + +const char utf8_bom[] = \357\273\277; + +int skip_utf8_bom(char **text, size_t len) +{ + if (len strlen(utf8_bom) || + memcmp(*text, utf8_bom, strlen(utf8_bom))) + return 0; + *text += strlen(utf8_bom); + return 1; +} diff --git a/utf8.h b/utf8.h index e4d9183..e7b2aa4 100644 --- a/utf8.h +++ b/utf8.h @@ -13,6 +13,9 @@ int same_encoding(const char *, const char *); __attribute__((format (printf, 2, 3))) int utf8_fprintf(FILE *, const char *, ...); +extern const char utf8_bom[]; +extern int skip_utf8_bom(char **, size_t); + void strbuf_add_wrapped_text(struct strbuf *buf, const char *text, int indent, int indent2, int width); void strbuf_add_wrapped_bytes(struct strbuf *buf, const char *data, int len, -- 2.4.0-rc2-171-g98ddf7f -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html