[PATCH] show-ref: make --head always show the HEAD ref
The docs seem to say that doing git show-ref --head --tags would show both the HEAD ref and all the tag refs. However, doing both --head and either of --tags or --heads would filter out the HEAD ref. Signed-off-by: Doug Bell madcity...@gmail.com --- I think this patch could be done better if I refactor the show_ref() sub a bit... This commit passes the current test suite. Where would I put new tests for this? I can't find an existing show-ref test to append to. I would be happy to write show-ref tests if there aren't any already. builtin/show-ref.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/builtin/show-ref.c b/builtin/show-ref.c index 8d9b76a..5954e9b 100644 --- a/builtin/show-ref.c +++ b/builtin/show-ref.c @@ -31,6 +31,9 @@ static int show_ref(const char *refname, const unsigned char *sha1, int flag, vo const char *hex; unsigned char peeled[20]; + if (show_head !strncmp(refname, HEAD\0, 5)) + goto match; + if (tags_only || heads_only) { int match; -- 1.8.3.101.g727a46b.dirty -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/7] add tests for rebasing with patch-equivalence present
On Wed, May 29, 2013 at 10:41 PM, Felipe Contreras felipe.contre...@gmail.com wrote: On Thu, May 30, 2013 at 12:30 AM, Martin von Zweigbergk martinv...@gmail.com wrote: On Wed, May 29, 2013 at 12:09 AM, Johannes Sixt j.s...@viscovery.net wrote: Am 5/29/2013 8:39, schrieb Martin von Zweigbergk: +# f +# / +# a---b---c---g---h +# \ +# d---G---i ... +test_run_rebase () { + result=$1 + shift + test_expect_$result rebase $* --onto drops patches in onto + reset_rebase + git rebase $* --onto h f i + test_cmp_rev h HEAD~2 + test_linear_range 'd i' h.. Isn't this expectation wrong? The upstream of the rebased branch is f, and it does not contain G. Hence, G should be replayed. Since h is the reversal of g, the state at h is the same as at c, and applying G should succeed (it is the same change as g). Therefore, I think the correct expectation is: test_linear_range 'd G i' h.. Good question! It is really not obvious what the right answer is. Some arguments in favor of dropping 'G': I think the answer is obvious; G should not be dropped. Maybe it made sense to drop g in upstream, but d fixes an issue, and it makes sense to apply G on upstream. Well, maybe I was wrong in thinking that dropping 'G' in 'git rebase --onto f h i' is bad. It seems to complicate things a lot, so maybe we should just decide that it's fine to do that (to drop 'G' in that case). Since that's mostly how it has worked forever and no one seems to have reported a problem with it, I'm probably just being paranoid. Thoughts? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
project
Hello, I am Mr. Fang Yong from Wing Lung Bank.I have a secured business proposal worth US$16.5M .Get back to me for more details. Regards, Fang L. Yong -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html X-pstn-neptune: 0/0/0.00/0 X-pstn-levels: (S: 4.04063/99.9 CV:99.9000 FC:95.5390 LC:95.5390 R:95.9108 P:95.9108 M:97.0282 C:98.6951 ) X-pstn-dkim: 0 //M [no valid signature] X-pstn-settings: 3 (1.:1.) s cv gt4 gt3 gt2 gt1 r p m c X-pstn-addresses: from of...@investmentsibv.com [320/11]
Re: [PATCH v2 2/7] add tests for rebasing with patch-equivalence present
On Thu, May 30, 2013 at 1:14 AM, Martin von Zweigbergk martinv...@gmail.com wrote: On Wed, May 29, 2013 at 10:41 PM, Felipe Contreras felipe.contre...@gmail.com wrote: On Thu, May 30, 2013 at 12:30 AM, Martin von Zweigbergk martinv...@gmail.com wrote: On Wed, May 29, 2013 at 12:09 AM, Johannes Sixt j.s...@viscovery.net wrote: Am 5/29/2013 8:39, schrieb Martin von Zweigbergk: +# f +# / +# a---b---c---g---h +# \ +# d---G---i ... +test_run_rebase () { + result=$1 + shift + test_expect_$result rebase $* --onto drops patches in onto + reset_rebase + git rebase $* --onto h f i + test_cmp_rev h HEAD~2 + test_linear_range 'd i' h.. Isn't this expectation wrong? The upstream of the rebased branch is f, and it does not contain G. Hence, G should be replayed. Since h is the reversal of g, the state at h is the same as at c, and applying G should succeed (it is the same change as g). Therefore, I think the correct expectation is: test_linear_range 'd G i' h.. Good question! It is really not obvious what the right answer is. Some arguments in favor of dropping 'G': I think the answer is obvious; G should not be dropped. Maybe it made sense to drop g in upstream, but d fixes an issue, and it makes sense to apply G on upstream. Well, maybe I was wrong in thinking that dropping 'G' in 'git rebase --onto f h i' is bad. It seems to complicate things a lot, so maybe we should just decide that it's fine to do that (to drop 'G' in that case). Since that's mostly how it has worked forever and no one seems to have reported a problem with it, I'm probably just being paranoid. Thoughts? Huh? I said the opposite; G should *not* be dropped. -- Felipe Contreras -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/7] add tests for rebasing with patch-equivalence present
On Wed, May 29, 2013 at 11:40 PM, Felipe Contreras felipe.contre...@gmail.com wrote: On Thu, May 30, 2013 at 1:14 AM, Martin von Zweigbergk martinv...@gmail.com wrote: On Wed, May 29, 2013 at 10:41 PM, Felipe Contreras felipe.contre...@gmail.com wrote: On Thu, May 30, 2013 at 12:30 AM, Martin von Zweigbergk martinv...@gmail.com wrote: On Wed, May 29, 2013 at 12:09 AM, Johannes Sixt j.s...@viscovery.net wrote: Am 5/29/2013 8:39, schrieb Martin von Zweigbergk: +# f +# / +# a---b---c---g---h +# \ +# d---G---i ... +test_run_rebase () { + result=$1 + shift + test_expect_$result rebase $* --onto drops patches in onto + reset_rebase + git rebase $* --onto h f i + test_cmp_rev h HEAD~2 + test_linear_range 'd i' h.. Isn't this expectation wrong? The upstream of the rebased branch is f, and it does not contain G. Hence, G should be replayed. Since h is the reversal of g, the state at h is the same as at c, and applying G should succeed (it is the same change as g). Therefore, I think the correct expectation is: test_linear_range 'd G i' h.. Good question! It is really not obvious what the right answer is. Some arguments in favor of dropping 'G': I think the answer is obvious; G should not be dropped. Maybe it made sense to drop g in upstream, but d fixes an issue, and it makes sense to apply G on upstream. Well, maybe I was wrong in thinking that dropping 'G' in 'git rebase --onto f h i' is bad. It seems to complicate things a lot, so maybe we should just decide that it's fine to do that (to drop 'G' in that case). Since that's mostly how it has worked forever and no one seems to have reported a problem with it, I'm probably just being paranoid. Thoughts? Huh? I said the opposite; G should *not* be dropped. I suspect you missed that I said 'git rebase --onto f h i', not 'git rebase --onto h f i'. Sorry, I should have pointed that out. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v13 02/15] path.c: refactor relative_path(), not only strip prefix
2013/5/26 Jiang Xin worldhello@gmail.com: 2013/5/22 Michael Haggerty mhag...@alum.mit.edu: The old and new versions both seem to be (differently) inconsistent about when the output has a trailing slash. What is the rule? The reason for introducing patch 02/15 is that we don't want to reinvent the wheel. Patch 06/15 (git-clean: refactor git-clean into two phases) needs to save relative_path of each git-clean candidate file/directory in del_list, but the public method in path.c (i.e. relative_path) is not powerful, and static method in quote.c (i.e. path_relative) can note be used directly. One way is to enhanced relative_path in path.c, like this patch. Since we combine the two methods (relative_path in path.c and path_relative in quote.c), the new relative_path must be compatible with the original two methods. relative_path in path.c === relative_path is called only once in setup.c: if (getenv(GIT_WORK_TREE_ENVIRONMENT)) setenv(GIT_WORK_TREE_ENVIRONMENT, ., 1); set_git_dir(relative_path(git_dir, work_tree)); initialized = 1; and set_git_dir only set the environment GIT_DIR_ENVIRONMENT like this: int set_git_dir(const char *path) { if (setenv(GIT_DIR_ENVIRONMENT, path, 1)) return error(Could not set GIT_DIR to '%s', path); setup_git_env(); return 0; } So the only restriction for relative_path is that the return value can not be blank. If the abs and base arguments for relative_path are the same, the return value should be . (./ is also OK), then set the envionment GIT_DIR_ENVIRONMENT to . (or ./). path_relative in quote.c We can not simply move path_relative in quote.c to relative_path in path.c directly. It is because: * The arguments for relative_path are from user input. So must validate (remove duplicate slash) before use. But path_relative does not check duplicate slash in arguments. * path_relative will return blank string, if abs and base are the same. Also I noticed that quote_path_relative of quote.c (which calls path_relative) will transform the blank string from path_relative to ./ (not .) char *quote_path_relative(const char *in, int len, ... const char *rel = path_relative(in, len, sb, prefix, -1); ... if (!out-len) strbuf_addstr(out, ./); That's why the path_relative in path.c refactor the output of . into ./. Hi, Junio I have updated comment and commit log in this patch. You can substitute commit bd4d1 (path.c: refactor relative_path(), not only strip prefix) in jx/clean-interactive branch with this one. -- 8 -- Subject: [PATCH] path.c: refactor relative_path(), not only strip prefix Original design of relative_path() is simple, just strip the prefix (*base) from the absolute path (*abs). In most cases, we need a real relative path, such as: ../foo, ../../bar. That's why there is another reimplementation (path_relative()) in quote.c. Borrowed some codes from path_relative() in quote.c to refactor relative_path() in path.c, so that it could return real relative path, and user can reuse this function without reimplement his/her own. The function path_relative() in quote.c will be substituted, and I would use the new relative_path() function when implement the interactive git-clean later. Different results for relative_path() before and after this refactor: abs path base path relative (original) relative (refactor) = === === /a/b/c/ /a/b c/ c/ /a/b//c/ //a///b/ c/ c/ /a/b /a/b ../ /a/b/ /a/b ../ /a/a/b/ /a ../ / /a/b/ /../../ /a/c /a/b/ /a/c ../c /a/b (empty)/a/b /a/b /a/b (null) /a/b /a/b (empty) /a/b (empty) ./ (null)(empty)(null) ./ (null)/a/b (segfault) ./ You may notice that return value . has been changed to ./. It is because: * Function quote_path_relative() in quote.c will show the relative path as ./ if abs(in) and base(prefix) are the same. * Function relative_path() is called only once (in setup.c), and it will be OK for the return value as ./ instead of .. Signed-off-by: Jiang Xin worldhello@gmail.com Signed-off-by: Junio C Hamano gits...@pobox.com --- cache.h | 2 +- path.c| 112 +++--- setup.c | 5 ++- t/t0060-path-utils.sh | 27 ++-- test-path-utils.c | 4 +- 5 files changed, 107 insertions(+),
Re: [msysGit] Re: t0008-ignores failure
Am 30.05.2013 04:55, schrieb Jeff King: So while it would be nice to work on paths with colons everywhere, I doubt it is worth the effort to start checking it through the whole test suite. And on top of it, on Windows, you can't have a path component or file name that contains a colon... -- Hannes -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 0/6] git send-email suppress-cc=self fixes
This includes bugfixes related to handling of --suppress-cc=self flag. Tests are also included. Changes from v1: - tweak coding style in tests to address comments by Junio Michael S. Tsirkin (6): t/send-email.sh: add test for suppress-cc=self send-email: fix suppress-cc=self on cccmd t/send-email: test suppress-cc=self on cccmd send-email: make --suppress-cc=self sanitize input t/send-email: add test with quoted sender t/send-email: test suppress-cc=self with non-ascii git-send-email.perl | 20 +-- t/t9001-send-email.sh | 70 +++ 2 files changed, 82 insertions(+), 8 deletions(-) -- MST -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 1/6] t/send-email.sh: add test for suppress-cc=self
This adds a basic test for --suppress-cc=self option of git send-email. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- t/t9001-send-email.sh | 43 +++ 1 file changed, 43 insertions(+) diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh index ebd5c5d..e1a7f3e 100755 --- a/t/t9001-send-email.sh +++ b/t/t9001-send-email.sh @@ -171,6 +171,49 @@ Result: OK EOF +test_suppress_self () { + test_commit $3 + test_when_finished git reset --hard HEAD^ + + write_script cccmd-sed -EOF + sed -n -e s/^cccmd--//p \$1 + EOF + + git commit --amend --author=$1 $2 -F - + clean_fake_sendmail + git format-patch --stdout -1 suppress-self-$3.patch + + git send-email --from=$1 $2 \ + --to=nob...@example.com \ + --cc-cmd=./cccmd-sed \ + --suppress-cc=self \ + --smtp-server=$(pwd)/fake.sendmail \ + suppress-self-$3.patch + + mv msgtxt1 msgtxt1-$3 + sed -e '/^$/q' msgtxt1-$3 msghdr1-$3 + expected-no-cc-$3 + + (grep '^Cc:' msghdr1-$3 actual-no-cc-$3; +test_cmp expected-no-cc-$3 actual-no-cc-$3) +} + +test_suppress_self_unquoted () { + test_suppress_self $1 $2 unquoted-$3 -EOF + test suppress-cc.self unquoted-$3 with name $1 email $2 + + unquoted-$3 + + Cc: $1 $2 + Signed-off-by: $1 $2 + EOF +} + +test_expect_success $PREREQ 'self name is suppressed' + test_suppress_self_unquoted 'A U Thor' 'aut...@redhat.com' \ + 'self_name_suppressed' + + test_expect_success $PREREQ 'Show all headers' ' git send-email \ --dry-run \ -- MST -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 2/6] send-email: fix suppress-cc=self on cccmd
When cccmd is used, old-style suppress-from filter is applied by the newer suppress-cc=self isn't. Fix this up. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- git-send-email.perl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/git-send-email.perl b/git-send-email.perl index bd13cc8..a138615 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -1462,7 +1462,7 @@ sub recipients_cmd { $address =~ s/^\s*//g; $address =~ s/\s*$//g; $address = sanitize_address($address); - next if ($address eq $sanitized_sender and $suppress_from); + next if ($address eq $sender and $suppress_cc{'self'}); push @addresses, $address; printf(($prefix) Adding %s: %s from: '%s'\n, $what, $address, $cmd) unless $quiet; -- MST -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 3/6] t/send-email: test suppress-cc=self on cccmd
Check that suppress-cc=self works when applied to output of cccmd. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- t/t9001-send-email.sh | 2 ++ 1 file changed, 2 insertions(+) diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh index e1a7f3e..452b362 100755 --- a/t/t9001-send-email.sh +++ b/t/t9001-send-email.sh @@ -204,6 +204,8 @@ test_suppress_self_unquoted () { unquoted-$3 + cccmd--$1 $2 + Cc: $1 $2 Signed-off-by: $1 $2 EOF -- MST -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 4/6] send-email: make --suppress-cc=self sanitize input
--suppress-cc=self fails to filter sender address in many cases where it needs to be sanitized in some way, for example quoted: A U. Thor aut...@example.com To fix, make send-email sanitize both sender and the address it is compared against. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- git-send-email.perl | 18 +++--- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index a138615..92df393 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -760,6 +760,8 @@ if (!defined $sender) { $sender = $repoauthor || $repocommitter || ''; } +$sender = sanitize_address($sender); + my $prompting = 0; if (!@initial_to !defined $to_cmd) { my $to = ask(Who should the emails be sent to (if any)? , @@ -1113,10 +1115,9 @@ sub send_message { if ($cc ne '') { $ccline = \nCc: $cc; } - my $sanitized_sender = sanitize_address($sender); make_message_id() unless defined($message_id); - my $header = From: $sanitized_sender + my $header = From: $sender To: $to${ccline} Subject: $subject Date: $date @@ -1133,7 +1134,7 @@ X-Mailer: git-send-email $gitversion } my @sendmail_parameters = ('-i', @recipients); - my $raw_from = $sanitized_sender; + my $raw_from = $sender; if (defined $envelope_sender $envelope_sender ne auto) { $raw_from = $envelope_sender; } @@ -1308,8 +1309,9 @@ foreach my $t (@files) { } elsif (/^From:\s+(.*)$/i) { ($author, $author_encoding) = unquote_rfc2047($1); + my $sauthor = sanitize_address($author); next if $suppress_cc{'author'}; - next if $suppress_cc{'self'} and $author eq $sender; + next if $suppress_cc{'self'} and $sauthor eq $sender; printf((mbox) Adding cc: %s from line '%s'\n, $1, $_) unless $quiet; push @cc, $1; @@ -1323,7 +1325,9 @@ foreach my $t (@files) { } elsif (/^Cc:\s+(.*)$/i) { foreach my $addr (parse_address_line($1)) { - if (unquote_rfc2047($addr) eq $sender) { + my $qaddr = unquote_rfc2047($addr); + my $saddr = sanitize_address($qaddr); + if ($saddr eq $sender) { next if ($suppress_cc{'self'}); } else { next if ($suppress_cc{'cc'}); @@ -1370,7 +1374,8 @@ foreach my $t (@files) { chomp; my ($what, $c) = ($1, $2); chomp $c; - if ($c eq $sender) { + my $sc = sanitize_address($c); + if ($sc eq $sender) { next if ($suppress_cc{'self'}); } else { next if $suppress_cc{'sob'} and $what =~ /Signed-off-by/i; @@ -1454,7 +1459,6 @@ foreach my $t (@files) { sub recipients_cmd { my ($prefix, $what, $cmd, $file) = @_; - my $sanitized_sender = sanitize_address($sender); my @addresses = (); open my $fh, -|, $cmd \Q$file\E or die ($prefix) Could not execute '$cmd'; -- MST -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 5/6] t/send-email: add test with quoted sender
add test where sender address needs to be quoted. Make sure --suppress-cc=self works well in this case. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- t/t9001-send-email.sh | 20 1 file changed, 20 insertions(+) diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh index 452b362..42fb809 100755 --- a/t/t9001-send-email.sh +++ b/t/t9001-send-email.sh @@ -211,11 +211,31 @@ test_suppress_self_unquoted () { EOF } +test_suppress_self_quoted () { + test_suppress_self $1 $2 quoted-$3 -EOF + test suppress-cc.self quoted-$3 with name $1 email $2 + + quoted-$3 + + cccmd--$1 $2 + + Cc: $1 $2 + Cc: $1 $2 + Signed-off-by: $1 $2 + Signed-off-by: $1 $2 + EOF +} + test_expect_success $PREREQ 'self name is suppressed' test_suppress_self_unquoted 'A U Thor' 'aut...@redhat.com' \ 'self_name_suppressed' +test_expect_success $PREREQ 'quoted self name is suppressed' + test_suppress_self_quoted 'A U. Thor' 'aut...@redhat.com' \ + 'self_name_suppressed' + + test_expect_success $PREREQ 'Show all headers' ' git send-email \ --dry-run \ -- MST -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 6/6] t/send-email: test suppress-cc=self with non-ascii
test suppress-cc=self when sender is non-acsii Signed-off-by: Michael S. Tsirkin m...@redhat.com --- t/t9001-send-email.sh | 5 + 1 file changed, 5 insertions(+) diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh index 42fb809..430e8de 100755 --- a/t/t9001-send-email.sh +++ b/t/t9001-send-email.sh @@ -236,6 +236,11 @@ test_expect_success $PREREQ 'quoted self name is suppressed' 'self_name_suppressed' +test_expect_success $PREREQ 'non-ascii self name is suppressed' + test_suppress_self_quoted 'Füñný Nâmé' 'odd_?=m...@example.com' \ + 'non_ascii_self_suppressed' + + test_expect_success $PREREQ 'Show all headers' ' git send-email \ --dry-run \ -- MST -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [git-users] Highlevel (but simple to implement) commands provided by default for git
Jonathan Nieder wrote: That's detectable and could be made to error out, so it's not too bad. Sure it's possible, but I'm arguing about whether it's worth the effort. There can be loops like a - b - c - d - e - a. Given that nobody has even bothered to get git to print an error message when a builtin command is overridden, do you think anyone will be interested in this? A bigger problem (in my opinion) with allowing arbitrary changes to the meaning of existing commands is that scripts, whether placed in .sh files or given as commands to run over IRC, stop working altogether. It's nice to have commands like git log and git am mean the same thing no matter what machine I am on. Yeah, I agree with this to a large extent. It's nice to have a minimal set of unambiguous commands for the purposes of communicating, and I'm quite happy with the present state of things. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] git clone depth of 0 not possible.
Hi Junio, On Tue, May 28, 2013 at 10:04:46AM -0700, Junio C Hamano wrote: Matthijs Kooijman matth...@stdin.nl writes: Did you consider how to implement this? Looking at the code, it seems the deepen parameter in the wire protocol now means: - 0: Do not change anything about the shallowness (i.e., fetch everything from the shallow root to the tip). - 0: Create new shallow commits at depth commits below the tip (so depth == 1 means tip and one below). - INFINITE_DEPTH (0x7fff): Remove all shallowness and fetch complete history. Given this, I'm not sure how one can express fetch the tip and nothing below that, since depth == 0 already has a different meaning. Doing it correctly (in the shorter term) would involve: Given below suggestion, I take it you don't like what Jonathan proposed (changing the meaning of the deepen parameter in the protocol so that the server effectively decides how to interpret --depth)? - adding a capability on the sending side fixed-off-by-one-depth to the protocol, and teaching the sending side to advertise the capability; - teaching the sending side to see if the new behaviour to fix off-by-one is asked by the requestor, and stop at the correct number of commits, not oversending one more. Otherwise retain the old behaviour. We can implement these two in current git already, since they only add to the protocol, not break it in an incompatible manner, right? - teaching the requestor that got --depth=N from the end user to pay attention to the new capability in such a way that: - when talking to an old sender (i.e. without the off-by-one fix), send N-1 for N greater than 1. Punt on N==1; - when talking to a fixed sender, ask to enable the capability, and send N as is (including N==1). And these should wait for git2, since they change the meaning of the --depth parameter? Or is this change ok for current git as well? What do you mean by punt exactly? Show an error to the user, saying only depth = 2 is supported? In the longer term, I think we should introduce a better deepening mechanism. Cf. Even when there will be a better deepening mechanism, the above is still useful (passing --depth=1 serves to get just a single commit without history, which is a distinct usecase from deepening the history of an existing shallow repository). In other words, I think the improved deepening and fixed depth should be complementary features. Gr. Matthijs -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] wildmatch: properly fold case everywhere
Case folding is not done correctly when matching against the [:upper:] character class and uppercased character ranges (e.g. A-Z). Specifically, an uppercase letter fails to match against any of them when case folding is requested because plain characters in the pattern and the whole string and preemptively lowercased to handle the base case fast. That optimization is kept and ISLOWER() is used in the [:upper:] case when case folding is requested, while matching against a character range is retried with toupper() if the character was lowercase, as the bounds of the range itself cannot be modified (in a case-insensitive context, [A-_] is not equivalent to [a-_]). Signed-off-by: Anthony Ramine n.ox...@gmail.com --- t/t3070-wildmatch.sh | 55 ++-- wildmatch.c | 7 +++ 2 files changed, 56 insertions(+), 6 deletions(-) I added four tests for the [A-_] range case and a note about it in the commit message. diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh index 4c37057..38446a0 100755 --- a/t/t3070-wildmatch.sh +++ b/t/t3070-wildmatch.sh @@ -6,20 +6,20 @@ test_description='wildmatch tests' match() { if [ $1 = 1 ]; then - test_expect_success wildmatch:match '$3' '$4' + test_expect_success wildmatch: match '$3' '$4' test-wildmatch wildmatch '$3' '$4' else - test_expect_success wildmatch: no match '$3' '$4' + test_expect_success wildmatch: no match '$3' '$4' ! test-wildmatch wildmatch '$3' '$4' fi if [ $2 = 1 ]; then - test_expect_success fnmatch: match '$3' '$4' + test_expect_success fnmatch: match '$3' '$4' test-wildmatch fnmatch '$3' '$4' elif [ $2 = 0 ]; then - test_expect_success fnmatch: no match '$3' '$4' + test_expect_success fnmatch:no match '$3' '$4' ! test-wildmatch fnmatch '$3' '$4' #else @@ -29,13 +29,25 @@ match() { fi } +imatch() { +if [ $1 = 1 ]; then + test_expect_success iwildmatch:match '$2' '$3' + test-wildmatch iwildmatch '$2' '$3' + +else + test_expect_success iwildmatch: no match '$2' '$3' + ! test-wildmatch iwildmatch '$2' '$3' + +fi +} + pathmatch() { if [ $1 = 1 ]; then - test_expect_success pathmatch:match '$2' '$3' + test_expect_success pathmatch: match '$2' '$3' test-wildmatch pathmatch '$2' '$3' else - test_expect_success pathmatch: no match '$2' '$3' + test_expect_success pathmatch: no match '$2' '$3' ! test-wildmatch pathmatch '$2' '$3' fi @@ -235,4 +247,35 @@ pathmatch 1 abcXdefXghi '*X*i' pathmatch 1 ab/cXd/efXg/hi '*/*X*/*/*i' pathmatch 1 ab/cXd/efXg/hi '*Xg*i' +# Case-sensitivy features +match 0 x 'a' '[A-Z]' +match 1 x 'A' '[A-Z]' +match 0 x 'A' '[a-z]' +match 1 x 'a' '[a-z]' +match 0 x 'a' '[[:upper:]]' +match 1 x 'A' '[[:upper:]]' +match 0 x 'A' '[[:lower:]]' +match 1 x 'a' '[[:lower:]]' +match 0 x 'A' '[B-Za]' +match 1 x 'a' '[B-Za]' +match 0 x 'A' '[B-a]' +match 1 x 'a' '[B-a]' +match 0 x 'z' '[Z-y]' +match 1 x 'Z' '[Z-y]' + +imatch 1 'a' '[A-Z]' +imatch 1 'A' '[A-Z]' +imatch 1 'A' '[a-z]' +imatch 1 'a' '[a-z]' +imatch 1 'a' '[[:upper:]]' +imatch 1 'A' '[[:upper:]]' +imatch 1 'A' '[[:lower:]]' +imatch 1 'a' '[[:lower:]]' +imatch 1 'A' '[B-Za]' +imatch 1 'a' '[B-Za]' +imatch 1 'A' '[B-a]' +imatch 1 'a' '[B-a]' +imatch 1 'z' '[Z-y]' +imatch 1 'Z' '[Z-y]' + test_done diff --git a/wildmatch.c b/wildmatch.c index 7192bdc..f91ba99 100644 --- a/wildmatch.c +++ b/wildmatch.c @@ -196,6 +196,11 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags) } if (t_ch = p_ch t_ch = prev_ch) matched = 1; + else if ((flags WM_CASEFOLD) ISLOWER(t_ch)) { + uchar t_ch_upper = toupper(t_ch); + if (t_ch_upper = p_ch t_ch_upper = prev_ch) + matched = 1; + } p_ch = 0; /* This makes prev_ch get set to 0. */ } else if (p_ch == '[' p[1] == ':') { const uchar *s; @@ -245,6 +250,8 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags) } else if (CC_EQ(s,i, upper)) { if (ISUPPER(t_ch)) matched = 1; + else if ((flags WM_CASEFOLD) ISLOWER(t_ch)) +
Re: [PATCH] wildmatch: properly fold case everywhere
On Thu, May 30, 2013 at 3:45 PM, Anthony Ramine n.ox...@gmail.com wrote: Case folding is not done correctly when matching against the [:upper:] character class and uppercased character ranges (e.g. A-Z). Specifically, an uppercase letter fails to match against any of them when case folding is requested because plain characters in the pattern and the whole string and preemptively lowercased to handle the base case fast. That optimization is kept and ISLOWER() is used in the [:upper:] case when case folding is requested, while matching against a character range is retried with toupper() if the character was lowercase, as the bounds of the range itself cannot be modified (in a case-insensitive context, [A-_] is not equivalent to [a-_]). Signed-off-by: Anthony Ramine n.ox...@gmail.com Reviewed-by: Duy Nguyen pclo...@gmail.com If you have time, you may want to send a similar patch to rsync, which contains original wildmatch implementation. It does not look much different from this one, except that (flags WM_CASEFOLD) is replaced with force_lower_case. Thanks. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7] Add new git-related helper to contrib
Let's do one more review. Felipe Contreras wrote: diff --git a/contrib/related/git-related b/contrib/related/git-related new file mode 100755 index 000..1b9b1e7 --- /dev/null +++ b/contrib/related/git-related @@ -0,0 +1,120 @@ +#!/usr/bin/env ruby + +# This script finds people that might be interested in a patch +# usage: git related file + +$since = '5-years-ago' +$min_percent = 10 + +class Commit + + attr_reader :persons Unless you plan to introduce many more fields (I haven't looked at the later patches), you might as well implement an #each, like in Commits. + def initialize(id) +@id = id +@persons = [] + end + + def parse(data) +msg = nil msg = false, to indicate that it is a boolean. +data.each_line do |line| + if not msg +case line +when /^author ([^]+) (\S+) (.+)$/ + @persons '%s %s' % [$1, $2] Why capture the third group when $3 is unused? +when /^$/ + msg = true +end + else +if line =~ /^(Signed-off|Reviewed|Acked)-by: ([^]+) (\S+?)$/ + @persons '%s %s' % [$2, $3] Why capture the first group when $1 is unused? +end + end +end +@persons.uniq! + end + +end + +class Commits + + def initialize +@items = {} + end + + def size +@items.size + end + + def each(block) +@items.each(block) + end + + def import +return if @items.empty? +File.popen(%w[git cat-file --batch], 'r+') do |p| Don't you need rb+ to suppress the CRLF nonsense on Windows? + p.write(@items.keys.join(\n)) As you might have realized, the parentheses are optional everywhere (except when it is required for disambiguation). I'm merely pointing it out here, because this line looks especially ugly. + p.close_write + p.each do |line| +if line =~ /^(\h{40}) commit (\d+)/ + id, len = $1, $2 id, len = $1, Integer $2. And drop the .to_i on the next line. + data = p.read($2.to_i) + @items[id].parse(data) +end + end +end + end + + def get_blame(source, start, len, from) +return if len == 0 +len ||= 1 I asked you to use 'len =1 if not len' for clarity, but you didn't like it. +File.popen(['git', 'blame', '--incremental', '-C', '-C', + '-L', '%u,+%u' % [start, len], + '--since', $since, from + '^', + '--', source]) do |p| + p.each do |line| +if line =~ /^\h{40}/ + id = $ + @items[id] = Commit.new(id) +end + end +end + end + + def from_patch(file) +from = source = nil +File.open(file) do |f| + f.each do |line| File.readlines(file).each do |line|. +case line +when /^From (\h+) (.+)$/ + from = $1 Useless capture. +when /^---\s+(\S+)/ + source = $1 != '/dev/null' ? $1[2..-1] : nil +when /^@@ -(\d+)(?:,(\d+))?/ + get_blame(source, $1, $2, from) if source and from Useless capture. When is len ($2) going to be nil? +end + end +end + end + +end + +exit 1 if ARGV.size != 1 + +commits = Commits.new +commits.from_patch(ARGV[0]) +commits.import + +count_per_person = Hash.new(0) + +commits.each do |id, commit| commits.each do |_, commit|, since you're not using id. + commit.persons.each do |person| +count_per_person[person] += 1 + end +end + +count_per_person.each do |person, count| + percent = count.to_f * 100 / commits.size I prefer 'Float count' over count.to_f, but that's just a matter of taste. + next if percent $min_percent + puts person +end -- 1.8.3.rc3.312.g47657de -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (May 2013, #09; Wed, 29)
Junio C Hamano gits...@pobox.com writes: * rr/die-on-missing-upstream (2013-05-22) 2 commits - sha1_name: fix error message for @{N}, @{date} - sha1_name: fix error message for @{u} When a reflog notation is used for implicit current branch, we did not say which branch and worse said branch ''. Will merge to 'next'. This interacts with the tests from * fc/at-head (2013-05-08) 13 commits to fail valgrind in t1508 like so: ==22927== Invalid read of size 1 ==22927==at 0x4C2C71B: __GI_strnlen (mc_replace_strmem.c:377) ==22927==by 0x567FF6B: vfprintf (in /lib64/libc-2.17.so) ==22927==by 0x56AC194: vsnprintf (in /lib64/libc-2.17.so) ==22927==by 0x4EAC39: vreportf (usage.c:12) ==22927==by 0x4EACA4: die_builtin (usage.c:36) ==22927==by 0x4EAEE0: die (usage.c:103) ==22927==by 0x4D8C51: get_sha1_1 (sha1_name.c:542) ==22927==by 0x4D8E5D: get_sha1_with_context_1 (sha1_name.c:1299) ==22927==by 0x4D992A: get_sha1_with_context (sha1_name.c:1417) ==22927==by 0x4D99E1: get_sha1 (sha1_name.c:1124) ==22927==by 0x45E1AC: cmd_rev_parse (rev-parse.c:761) ==22927==by 0x4051B3: handle_internal_command (git.c:284) ==22927== Address 0x5bebccb is 11 bytes inside a block of size 22 free'd ==22927==at 0x4C2ACDA: free (vg_replace_malloc.c:468) ==22927==by 0x4D8C37: get_sha1_1 (sha1_name.c:541) ==22927==by 0x4D8E5D: get_sha1_with_context_1 (sha1_name.c:1299) ==22927==by 0x4D992A: get_sha1_with_context (sha1_name.c:1417) ==22927==by 0x4D99E1: get_sha1 (sha1_name.c:1124) ==22927==by 0x45E1AC: cmd_rev_parse (rev-parse.c:761) ==22927==by 0x4051B3: handle_internal_command (git.c:284) ==22927==by 0x4053E7: main (git.c:492) I think it's enough to squash this little change; leaking some memory immediately before die() is not too bad, especially if we're going to pass real_ref+11 into die()... diff --git i/sha1_name.c w/sha1_name.c index 5ea16ff..a07558d 100644 --- i/sha1_name.c +++ w/sha1_name.c @@ -538,7 +538,6 @@ static int get_sha1_basic(const char *str, int len, unsigned char *sha1) back to %s., len, str, show_date(co_time, co_tz, DATE_RFC2822)); else { - free(real_ref); die(Log for '%.*s' only has %d entries., len, str, co_cnt); } -- Thomas Rast trast@{inf,student}.ethz.ch -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] wildmatch: properly fold case everywhere
On Thu, May 30, 2013 at 5:29 AM, Anthony Ramine n.ox...@gmail.com wrote: Yes indeed. Will amend. Should I add your name in Reviewed-by as well? No. I merely spotted a minor typographical error. -- Anthony Ramine Le 30 mai 2013 à 11:07, Eric Sunshine a écrit : On Thu, May 30, 2013 at 4:45 AM, Anthony Ramine n.ox...@gmail.com wrote: Case folding is not done correctly when matching against the [:upper:] character class and uppercased character ranges (e.g. A-Z). Specifically, an uppercase letter fails to match against any of them when case folding is requested because plain characters in the pattern and the whole string and preemptively lowercased to handle the base case Did you mean s/and preemptively/are preemptively/ ? fast. That optimization is kept and ISLOWER() is used in the [:upper:] case when case folding is requested, while matching against a character range is retried with toupper() if the character was lowercase, as the bounds of the range itself cannot be modified (in a case-insensitive context, [A-_] is not equivalent to [a-_]). Signed-off-by: Anthony Ramine n.ox...@gmail.com -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v5] wildmatch: properly fold case everywhere
Case folding is not done correctly when matching against the [:upper:] character class and uppercased character ranges (e.g. A-Z). Specifically, an uppercase letter fails to match against any of them when case folding is requested because plain characters in the pattern and the whole string are preemptively lowercased to handle the base case fast. That optimization is kept and ISLOWER() is used in the [:upper:] case when case folding is requested, while matching against a character range is retried with toupper() if the character was lowercase, as the bounds of the range itself cannot be modified (in a case-insensitive context, [A-_] is not equivalent to [a-_]). Signed-off-by: Anthony Ramine n.ox...@gmail.com Reviewed-by: Duy Nguyen pclo...@gmail.com --- t/t3070-wildmatch.sh | 55 ++-- wildmatch.c | 7 +++ 2 files changed, 56 insertions(+), 6 deletions(-) I added Duy as reviewer and fixed a typo in the commit message reported by Eric Sunshine. diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh index 4c37057..38446a0 100755 --- a/t/t3070-wildmatch.sh +++ b/t/t3070-wildmatch.sh @@ -6,20 +6,20 @@ test_description='wildmatch tests' match() { if [ $1 = 1 ]; then - test_expect_success wildmatch:match '$3' '$4' + test_expect_success wildmatch: match '$3' '$4' test-wildmatch wildmatch '$3' '$4' else - test_expect_success wildmatch: no match '$3' '$4' + test_expect_success wildmatch: no match '$3' '$4' ! test-wildmatch wildmatch '$3' '$4' fi if [ $2 = 1 ]; then - test_expect_success fnmatch: match '$3' '$4' + test_expect_success fnmatch: match '$3' '$4' test-wildmatch fnmatch '$3' '$4' elif [ $2 = 0 ]; then - test_expect_success fnmatch: no match '$3' '$4' + test_expect_success fnmatch:no match '$3' '$4' ! test-wildmatch fnmatch '$3' '$4' #else @@ -29,13 +29,25 @@ match() { fi } +imatch() { +if [ $1 = 1 ]; then + test_expect_success iwildmatch:match '$2' '$3' + test-wildmatch iwildmatch '$2' '$3' + +else + test_expect_success iwildmatch: no match '$2' '$3' + ! test-wildmatch iwildmatch '$2' '$3' + +fi +} + pathmatch() { if [ $1 = 1 ]; then - test_expect_success pathmatch:match '$2' '$3' + test_expect_success pathmatch: match '$2' '$3' test-wildmatch pathmatch '$2' '$3' else - test_expect_success pathmatch: no match '$2' '$3' + test_expect_success pathmatch: no match '$2' '$3' ! test-wildmatch pathmatch '$2' '$3' fi @@ -235,4 +247,35 @@ pathmatch 1 abcXdefXghi '*X*i' pathmatch 1 ab/cXd/efXg/hi '*/*X*/*/*i' pathmatch 1 ab/cXd/efXg/hi '*Xg*i' +# Case-sensitivy features +match 0 x 'a' '[A-Z]' +match 1 x 'A' '[A-Z]' +match 0 x 'A' '[a-z]' +match 1 x 'a' '[a-z]' +match 0 x 'a' '[[:upper:]]' +match 1 x 'A' '[[:upper:]]' +match 0 x 'A' '[[:lower:]]' +match 1 x 'a' '[[:lower:]]' +match 0 x 'A' '[B-Za]' +match 1 x 'a' '[B-Za]' +match 0 x 'A' '[B-a]' +match 1 x 'a' '[B-a]' +match 0 x 'z' '[Z-y]' +match 1 x 'Z' '[Z-y]' + +imatch 1 'a' '[A-Z]' +imatch 1 'A' '[A-Z]' +imatch 1 'A' '[a-z]' +imatch 1 'a' '[a-z]' +imatch 1 'a' '[[:upper:]]' +imatch 1 'A' '[[:upper:]]' +imatch 1 'A' '[[:lower:]]' +imatch 1 'a' '[[:lower:]]' +imatch 1 'A' '[B-Za]' +imatch 1 'a' '[B-Za]' +imatch 1 'A' '[B-a]' +imatch 1 'a' '[B-a]' +imatch 1 'z' '[Z-y]' +imatch 1 'Z' '[Z-y]' + test_done diff --git a/wildmatch.c b/wildmatch.c index 7192bdc..f91ba99 100644 --- a/wildmatch.c +++ b/wildmatch.c @@ -196,6 +196,11 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags) } if (t_ch = p_ch t_ch = prev_ch) matched = 1; + else if ((flags WM_CASEFOLD) ISLOWER(t_ch)) { + uchar t_ch_upper = toupper(t_ch); + if (t_ch_upper = p_ch t_ch_upper = prev_ch) + matched = 1; + } p_ch = 0; /* This makes prev_ch get set to 0. */ } else if (p_ch == '[' p[1] == ':') { const uchar *s; @@ -245,6 +250,8 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags) } else if (CC_EQ(s,i, upper)) { if (ISUPPER(t_ch)) matched = 1; + else if ((flags WM_CASEFOLD)
Poor performance of git describe in big repos
Hi, I'm a fairly heavy user of the magit Emacs extension for interacting with my git repos. However I've noticed there are some cases where lag is very high. By analysing strace output of emacs calling git I found two commands that where particularly problematic when interrogating the repo: 11:00 ajb@sloy/x86_64 [work.git] time /usr/bin/git --no-pager describe --long --tags ajb-build-test-5224-10-gfa296e6 real0m5.016s user0m4.364s sys 0m0.444s 11:34 ajb@sloy/x86_64 [work.git] time /usr/bin/git --no-pager describe --contains HEAD fatal: cannot describe 'fa296e61f549a1252a65a13b2f734d7afbc7e88e' real0m4.805s user0m4.388s sys 0m0.400s Running with first command with the --debug flag on gives: 11:34 ajb@sloy/x86_64 [work.git] time /usr/bin/git --no-pager describe --long --tags --debug searching to describe HEAD lightweight 10 ajb-build-test-5224 lightweight 41 ajb-build-test-5222 annotated146 vnms-2-1-36-32 annotated155 vnms-2-1-36-31 annotated174 vnms-2-1-36-30 annotated183 vnms-2-1-36-29 lightweight 188 vnms-2-1-36-28 annotated193 vnms-2-1-36-27 annotated206 vnms-2-1-36-26 annotated215 vectastar-4-2-83-5 traversed 223 commits more than 10 tags found; listed 10 most recent gave up search at 2b69df72d47be8440e3ce4cee91b9b7ceaf8b77c ajb-build-test-5224-10-gfa296e6 real0m4.817s user0m4.320s sys 0m0.464s Which has only traversed 223 before coming to a decision. This seems like a very low number of commits given the time it's spent doing this. One factor might be the size of my repo (.git is around 2.4G). Could this just be due to computational cost of searching through large packs to walk the commit chain? Is there any way to make this easier for git to do? -- Alex, homepage: http://www.bennee.com/~alex/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7] Add new git-related helper to contrib
On Thu, May 30, 2013 at 4:01 AM, Ramkumar Ramachandra artag...@gmail.com wrote: Let's do one more review. Felipe Contreras wrote: diff --git a/contrib/related/git-related b/contrib/related/git-related new file mode 100755 index 000..1b9b1e7 --- /dev/null +++ b/contrib/related/git-related @@ -0,0 +1,120 @@ +#!/usr/bin/env ruby + +# This script finds people that might be interested in a patch +# usage: git related file + +$since = '5-years-ago' +$min_percent = 10 + +class Commit + + attr_reader :persons Unless you plan to introduce many more fields (I haven't looked at the later patches), you might as well implement an #each, like in Commits. commit.each doesn't make sense; each what? We could do 'commit.each_person', but why is that so better than commit.persons.each? It's not. +data.each_line do |line| + if not msg +case line +when /^author ([^]+) (\S+) (.+)$/ + @persons '%s %s' % [$1, $2] Why capture the third group when $3 is unused? Completeness. +when /^$/ + msg = true +end + else +if line =~ /^(Signed-off|Reviewed|Acked)-by: ([^]+) (\S+?)$/ + @persons '%s %s' % [$2, $3] Why capture the first group when $1 is unused? You want to complicate the regex even more with: /^(?:Signed-off|Reviewed|Acked)-by: ([^]+) (\S+?)$/ For what purpose? +end + end +end +@persons.uniq! + end + +end + +class Commits + + def initialize +@items = {} + end + + def size +@items.size + end + + def each(block) +@items.each(block) + end + + def import +return if @items.empty? +File.popen(%w[git cat-file --batch], 'r+') do |p| Don't you need rb+ to suppress the CRLF nonsense on Windows? Who knows. + p.write(@items.keys.join(\n)) As you might have realized, the parentheses are optional everywhere (except when it is required for disambiguation). I'm merely pointing it out here, because this line looks especially ugly. I suspect most Git developers would prefer the traditional function call style. + p.close_write + p.each do |line| +if line =~ /^(\h{40}) commit (\d+)/ + id, len = $1, $2 id, len = $1, Integer $2. And drop the .to_i on the next line. This is way is better. + data = p.read($2.to_i) + @items[id].parse(data) +end + end +end + end + + def get_blame(source, start, len, from) +return if len == 0 +len ||= 1 I asked you to use 'len =1 if not len' for clarity, but you didn't like it. git grep ||= disagrees. +File.popen(['git', 'blame', '--incremental', '-C', '-C', + '-L', '%u,+%u' % [start, len], + '--since', $since, from + '^', + '--', source]) do |p| + p.each do |line| +if line =~ /^\h{40}/ + id = $ + @items[id] = Commit.new(id) +end + end +end + end + + def from_patch(file) +from = source = nil +File.open(file) do |f| + f.each do |line| File.readlines(file).each do |line|. That's less efficient. +when /^---\s+(\S+)/ + source = $1 != '/dev/null' ? $1[2..-1] : nil +when /^@@ -(\d+)(?:,(\d+))?/ + get_blame(source, $1, $2, from) if source and from Useless capture. When is len ($2) going to be nil? Junio already went over it, see the diff format. What's your objective? Block this patch from ever going in? Not a single one of these comments makes a difference at all, all of them can wait until after the patch is merged, many of them are a matter of preferences, and some of them have already been addressed as precisely that: disagreements in style. -- Felipe Contreras -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor performance of git describe in big repos
Alex Bennée wrote: time /usr/bin/git --no-pager traversed 223 commits real0m4.817s user0m4.320s sys 0m0.464s I'm quite clueless about why it is taking this long: I think it's IO because there's nothing to compute? I really can't trace anything unless you can reproduce it on a public repository. On linux.git with my rotating hard disk: $ time git describe --debug --long --tags HEAD~1 searching to describe HEAD~1 annotated 5445 v2.6.33 annotated 5660 v2.6.33-rc8 annotated 5884 v2.6.33-rc7 annotated 6140 v2.6.33-rc6 annotated 6467 v2.6.33-rc5 annotated 6999 v2.6.33-rc4 annotated 7430 v2.6.33-rc3 annotated 7746 v2.6.33-rc2 annotated 8212 v2.6.33-rc1 annotated 13854 v2.6.32 traversed 18895 commits more than 10 tags found; listed 10 most recent gave up search at 648f4e3e50c4793d9dbf9a09afa193631f76fa26 v2.6.33-5445-ge7c84ee real0m0.509s user0m0.470s sys 0m0.037s 18k+ commits traversed in half a second here, so I really don't know what is going on. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/7] cache: mark cache_entry pointers const
Add const for pointers that are only dereferenced for reading by the inline functions copy_cache_entry and ce_mode_from_stat. This allows callers to pass in const pointers. Signed-off-by: René Scharfe rene.scha...@lsrfire.ath.cx --- cache.h | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/cache.h b/cache.h index 94ca1ac..43a27e7 100644 --- a/cache.h +++ b/cache.h @@ -190,7 +190,8 @@ struct cache_entry { * another. But we never change the name, or the hash state! */ #define CE_STATE_MASK (CE_HASHED | CE_UNHASHED) -static inline void copy_cache_entry(struct cache_entry *dst, struct cache_entry *src) +static inline void copy_cache_entry(struct cache_entry *dst, + const struct cache_entry *src) { unsigned int state = dst-ce_flags CE_STATE_MASK; @@ -222,7 +223,8 @@ static inline unsigned int create_ce_mode(unsigned int mode) return S_IFGITLINK; return S_IFREG | ce_permissions(mode); } -static inline unsigned int ce_mode_from_stat(struct cache_entry *ce, unsigned int mode) +static inline unsigned int ce_mode_from_stat(const struct cache_entry *ce, +unsigned int mode) { extern int trust_executable_bit, has_symlinks; if (!has_symlinks S_ISREG(mode) -- 1.8.3 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/7] read-cache: mark cache_entry pointers const
ie_match_stat and ie_modified only derefence their struct cache_entry pointers for reading. Add const to the parameter declaration here and do the same for the static helper function used by them, as it's the same there as well. This allows callers to pass in const pointers. Signed-off-by: René Scharfe rene.scha...@lsrfire.ath.cx --- cache.h | 4 ++-- read-cache.c | 18 ++ 2 files changed, 12 insertions(+), 10 deletions(-) diff --git a/cache.h b/cache.h index 43a27e7..01e8760 100644 --- a/cache.h +++ b/cache.h @@ -482,8 +482,8 @@ extern void *read_blob_data_from_index(struct index_state *, const char *, unsig #define CE_MATCH_RACY_IS_DIRTY 02 /* do stat comparison even if CE_SKIP_WORKTREE is true */ #define CE_MATCH_IGNORE_SKIP_WORKTREE 04 -extern int ie_match_stat(const struct index_state *, struct cache_entry *, struct stat *, unsigned int); -extern int ie_modified(const struct index_state *, struct cache_entry *, struct stat *, unsigned int); +extern int ie_match_stat(const struct index_state *, const struct cache_entry *, struct stat *, unsigned int); +extern int ie_modified(const struct index_state *, const struct cache_entry *, struct stat *, unsigned int); #define PATHSPEC_ONESTAR 1 /* the pathspec pattern sastisfies GFNM_ONESTAR */ diff --git a/read-cache.c b/read-cache.c index 04ed561..e6e0466 100644 --- a/read-cache.c +++ b/read-cache.c @@ -91,7 +91,7 @@ void fill_stat_cache_info(struct cache_entry *ce, struct stat *st) ce_mark_uptodate(ce); } -static int ce_compare_data(struct cache_entry *ce, struct stat *st) +static int ce_compare_data(const struct cache_entry *ce, struct stat *st) { int match = -1; int fd = open(ce-name, O_RDONLY); @@ -105,7 +105,7 @@ static int ce_compare_data(struct cache_entry *ce, struct stat *st) return match; } -static int ce_compare_link(struct cache_entry *ce, size_t expected_size) +static int ce_compare_link(const struct cache_entry *ce, size_t expected_size) { int match = -1; void *buffer; @@ -126,7 +126,7 @@ static int ce_compare_link(struct cache_entry *ce, size_t expected_size) return match; } -static int ce_compare_gitlink(struct cache_entry *ce) +static int ce_compare_gitlink(const struct cache_entry *ce) { unsigned char sha1[20]; @@ -143,7 +143,7 @@ static int ce_compare_gitlink(struct cache_entry *ce) return hashcmp(sha1, ce-sha1); } -static int ce_modified_check_fs(struct cache_entry *ce, struct stat *st) +static int ce_modified_check_fs(const struct cache_entry *ce, struct stat *st) { switch (st-st_mode S_IFMT) { case S_IFREG: @@ -163,7 +163,7 @@ static int ce_modified_check_fs(struct cache_entry *ce, struct stat *st) return 0; } -static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st) +static int ce_match_stat_basic(const struct cache_entry *ce, struct stat *st) { unsigned int changed = 0; @@ -239,7 +239,8 @@ static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st) return changed; } -static int is_racy_timestamp(const struct index_state *istate, struct cache_entry *ce) +static int is_racy_timestamp(const struct index_state *istate, +const struct cache_entry *ce) { return (!S_ISGITLINK(ce-ce_mode) istate-timestamp.sec @@ -255,7 +256,7 @@ static int is_racy_timestamp(const struct index_state *istate, struct cache_entr } int ie_match_stat(const struct index_state *istate, - struct cache_entry *ce, struct stat *st, + const struct cache_entry *ce, struct stat *st, unsigned int options) { unsigned int changed; @@ -311,7 +312,8 @@ int ie_match_stat(const struct index_state *istate, } int ie_modified(const struct index_state *istate, - struct cache_entry *ce, struct stat *st, unsigned int options) + const struct cache_entry *ce, + struct stat *st, unsigned int options) { int changed, changed_fs; -- 1.8.3 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 7/7] unpack-trees: free cache_entry array members for merges
The merge functions duplicate entries as needed and they don't free them. Release them in unpack_nondirectories, the same function where they were allocated, after we're done. Signed-off-by: René Scharfe rene.scha...@lsrfire.ath.cx --- unpack-trees.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/unpack-trees.c b/unpack-trees.c index 2dbc05d..fc0dd5a 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -600,9 +600,14 @@ static int unpack_nondirectories(int n, unsigned long mask, src[i + o-merge] = create_ce_entry(info, names + i, stage); } - if (o-merge) - return call_unpack_fn((const struct cache_entry * const *)src, - o); + if (o-merge) { + int rc = call_unpack_fn((const struct cache_entry * const *)src, + o); + for (i = 1; i = n; i++) + if (src[i] src[i] != o-df_conflict_entry) + free(src[i]); + return rc; + } for (i = 0; i n; i++) if (src[i] src[i] != o-df_conflict_entry) -- 1.8.3 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/7] unpack-trees: factor out dup_entry
While we're add it, mark the struct cache_entry pointer of add_entry const because we only read from it and this allows callers to pass in const pointers. Signed-off-by: René Scharfe rene.scha...@lsrfire.ath.cx --- unpack-trees.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/unpack-trees.c b/unpack-trees.c index ede4299..e8b4cc1 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -116,14 +116,20 @@ static void do_add_entry(struct unpack_trees_options *o, struct cache_entry *ce, ADD_CACHE_OK_TO_ADD | ADD_CACHE_OK_TO_REPLACE); } -static void add_entry(struct unpack_trees_options *o, struct cache_entry *ce, - unsigned int set, unsigned int clear) +static struct cache_entry *dup_entry(const struct cache_entry *ce) { unsigned int size = ce_size(ce); struct cache_entry *new = xmalloc(size); memcpy(new, ce, size); - do_add_entry(o, new, set, clear); + return new; +} + +static void add_entry(struct unpack_trees_options *o, + const struct cache_entry *ce, + unsigned int set, unsigned int clear) +{ + do_add_entry(o, dup_entry(ce), set, clear); } /* -- 1.8.3 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/7] unpack-trees: create working copy of merge entry in merged_entry
Duplicate the merge entry right away and work with that instead of modifying the entry we got and duplicating it only at the end of the function. Then mark that pointer const to document that we don't modify the referenced cache_entry. This change is safe because all existing merge functions call merged_entry just before returning (or not at all), i.e. they don't care about changes to the referenced cache_entry after the call. unpack_nondirectories and unpack_index_entry, which call the merge functions through call_unpack_fn, aren't interested in such changes neither. The change complicates merged_entry a bit because we have to free the copy if we error out, but allows callers to pass a const pointer. Signed-off-by: René Scharfe rene.scha...@lsrfire.ath.cx --- unpack-trees.c | 17 - 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/unpack-trees.c b/unpack-trees.c index e8b4cc1..2fecef8 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1466,10 +1466,12 @@ static int verify_absent_sparse(struct cache_entry *ce, return verify_absent_1(ce, orphaned_error, o); } -static int merged_entry(struct cache_entry *merge, struct cache_entry *old, - struct unpack_trees_options *o) +static int merged_entry(const struct cache_entry *ce, + struct cache_entry *old, + struct unpack_trees_options *o) { int update = CE_UPDATE; + struct cache_entry *merge = dup_entry(ce); if (!old) { /* @@ -1487,8 +1489,11 @@ static int merged_entry(struct cache_entry *merge, struct cache_entry *old, update |= CE_ADDED; merge-ce_flags |= CE_NEW_SKIP_WORKTREE; - if (verify_absent(merge, ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN, o)) + if (verify_absent(merge, + ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN, o)) { + free(merge); return -1; + } invalidate_ce_path(merge, o); } else if (!(old-ce_flags CE_CONFLICTED)) { /* @@ -1502,8 +1507,10 @@ static int merged_entry(struct cache_entry *merge, struct cache_entry *old, copy_cache_entry(merge, old); update = 0; } else { - if (verify_uptodate(old, o)) + if (verify_uptodate(old, o)) { + free(merge); return -1; + } /* Migrate old flags over */ update |= old-ce_flags (CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE); invalidate_ce_path(old, o); @@ -1516,7 +1523,7 @@ static int merged_entry(struct cache_entry *merge, struct cache_entry *old, invalidate_ce_path(old, o); } - add_entry(o, merge, update, CE_STAGEMASK); + do_add_entry(o, merge, update, CE_STAGEMASK); return 1; } -- 1.8.3 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 5/7] diff-lib, read-tree, unpack-trees: mark cache_entry pointers const
Add const to struct cache_entry pointers throughout the tree which are only used for reading. This allows callers to pass in const pointers. Signed-off-by: René Scharfe rene.scha...@lsrfire.ath.cx --- builtin/read-tree.c | 2 +- diff-lib.c | 23 +++--- unpack-trees.c | 91 + 3 files changed, 63 insertions(+), 53 deletions(-) diff --git a/builtin/read-tree.c b/builtin/read-tree.c index 042ac1b..b847486 100644 --- a/builtin/read-tree.c +++ b/builtin/read-tree.c @@ -66,7 +66,7 @@ static int exclude_per_directory_cb(const struct option *opt, const char *arg, return 0; } -static void debug_stage(const char *label, struct cache_entry *ce, +static void debug_stage(const char *label, const struct cache_entry *ce, struct unpack_trees_options *o) { printf(%s , label); diff --git a/diff-lib.c b/diff-lib.c index f35de0f..83d0cb8 100644 --- a/diff-lib.c +++ b/diff-lib.c @@ -64,8 +64,9 @@ static int check_removed(const struct cache_entry *ce, struct stat *st) * commits, untracked content and/or modified content). */ static int match_stat_with_submodule(struct diff_options *diffopt, - struct cache_entry *ce, struct stat *st, - unsigned ce_option, unsigned *dirty_submodule) +const struct cache_entry *ce, +struct stat *st, unsigned ce_option, +unsigned *dirty_submodule) { int changed = ce_match_stat(ce, st, ce_option); if (S_ISGITLINK(ce-ce_mode)) { @@ -237,7 +238,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option) /* A file entry went away or appeared */ static void diff_index_show_file(struct rev_info *revs, const char *prefix, -struct cache_entry *ce, +const struct cache_entry *ce, const unsigned char *sha1, int sha1_valid, unsigned int mode, unsigned dirty_submodule) @@ -246,7 +247,7 @@ static void diff_index_show_file(struct rev_info *revs, sha1, sha1_valid, ce-name, dirty_submodule); } -static int get_stat_data(struct cache_entry *ce, +static int get_stat_data(const struct cache_entry *ce, const unsigned char **sha1p, unsigned int *modep, int cached, int match_missing, @@ -283,7 +284,7 @@ static int get_stat_data(struct cache_entry *ce, } static void show_new_file(struct rev_info *revs, - struct cache_entry *new, + const struct cache_entry *new, int cached, int match_missing) { const unsigned char *sha1; @@ -302,8 +303,8 @@ static void show_new_file(struct rev_info *revs, } static int show_modified(struct rev_info *revs, -struct cache_entry *old, -struct cache_entry *new, +const struct cache_entry *old, +const struct cache_entry *new, int report_missing, int cached, int match_missing) { @@ -362,8 +363,8 @@ static int show_modified(struct rev_info *revs, * give you the position and number of entries in the index). */ static void do_oneway_diff(struct unpack_trees_options *o, - struct cache_entry *idx, - struct cache_entry *tree) + const struct cache_entry *idx, + const struct cache_entry *tree) { struct rev_info *revs = o-unpack_data; int match_missing, cached; @@ -425,8 +426,8 @@ static void do_oneway_diff(struct unpack_trees_options *o, */ static int oneway_diff(struct cache_entry **src, struct unpack_trees_options *o) { - struct cache_entry *idx = src[0]; - struct cache_entry *tree = src[1]; + const struct cache_entry *idx = src[0]; + const struct cache_entry *tree = src[1]; struct rev_info *revs = o-unpack_data; /* diff --git a/unpack-trees.c b/unpack-trees.c index 2fecef8..c5a40df 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -241,8 +241,11 @@ static int check_updates(struct unpack_trees_options *o) return errs != 0; } -static int verify_uptodate_sparse(struct cache_entry *ce, struct unpack_trees_options *o); -static int verify_absent_sparse(struct cache_entry *ce, enum unpack_trees_error_types, struct unpack_trees_options *o); +static int verify_uptodate_sparse(const struct cache_entry *ce, + struct unpack_trees_options *o); +static int verify_absent_sparse(const struct cache_entry *ce, + enum unpack_trees_error_types, +
Re: [PATCH] Makefile: promote wildmatch to be the default fnmatch implementation
On Thu, May 30, 2013 at 9:25 AM, Junio C Hamano gits...@pobox.com wrote: Nguyễn Thái Ngọc Duy pclo...@gmail.com writes: This makes git use wildmatch by default for all fnmatch() calls. Users who want to use system fnmatch (or compat fnmatch) need to set NO_WILDMATCH flag. wildmatch is a drop-in fnmatch replacement with more features. Using wildmatch gives us a consistent behavior across platforms. While I agree this is a good move in the longer term in that we get the often-asked-for foo/**/*.c match and also we have one less platform differences to worry about, I somehow have a recollection that we discussed that there are incompatibilities in dark corners we would want to warn users about and lay a transition plan across some major version bump. I've skimmed through all wildmatch related mails in my gmail archive. There are differences between fnmatch versions, e.g. [1], but I don't think anyone would run into those cases on purpose. There were performance concerns [2] and they should have been addressed with nd/retire-fnmatch series. Originally I was worried that this new code might not be mature enough, but I've been running wildmatch-only git for quite some time, can't really complain. Not really a transition plan, but maybe we could provide a runtime switch to return to system fnmatch when wildmatch becomes default, for a few cycles. This way if wildmatch turns out broken, people can switch back while we work on a fix. Hmph, could you (no need to hurry, though) check the previous discussion and point at what we decided if we did reach any conclusion to refresh our collective memory? We all seemed to agree that the replacement would be a way to go. But not hard decision was reached. [1] http://thread.gmane.org/gmane.comp.version-control.git/207385/focus=207540 [2] http://thread.gmane.org/gmane.comp.version-control.git/211823/focus=211836 -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor performance of git describe in big repos
On Thu, May 30, 2013 at 11:38:32AM +0100, Alex Bennée wrote: One factor might be the size of my repo (.git is around 2.4G). Could this just be due to computational cost of searching through large packs to walk the commit chain? Is there any way to make this easier for git to do? What does git count-objects -v say for your repository? You may find that performance improves if you repack with git gc --aggressive. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/4] read-cache: plug small memory leak
We free each entry, but not the array of entries themselves. Signed-off-by: Felipe Contreras felipe.contre...@gmail.com --- read-cache.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/read-cache.c b/read-cache.c index 04ed561..9d9b886 100644 --- a/read-cache.c +++ b/read-cache.c @@ -1510,6 +1510,8 @@ int discard_index(struct index_state *istate) for (i = 0; i istate-cache_nr; i++) free(istate-cache[i]); + free(istate-cache); + istate-cache = NULL; resolve_undo_clear_index(istate); istate-cache_nr = 0; istate-cache_changed = 0; -- 1.8.3.rc3.312.g47657de -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/4] unpack-trees: free created cache entries
We created them, and nobody else is going to destroy them. Signed-off-by: Felipe Contreras felipe.contre...@gmail.com --- unpack-trees.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/unpack-trees.c b/unpack-trees.c index eff2944..9f19d01 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -590,8 +590,16 @@ static int unpack_nondirectories(int n, unsigned long mask, src[i + o-merge] = create_ce_entry(info, names + i, stage); } - if (o-merge) - return call_unpack_fn(src, o); + if (o-merge) { + int ret = call_unpack_fn(src, o); + for (i = 0; i n; i++) { + struct cache_entry *ce = src[i + o-merge]; + if (!ce || ce == o-df_conflict_entry) + continue; + free(ce); + } + return ret; + } for (i = 0; i n; i++) if (src[i] src[i] != o-df_conflict_entry) -- 1.8.3.rc3.312.g47657de -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/4] unpack-trees: plug a memory leak
Before overwriting the destination index, first let's discard it's contents. Signed-off-by: Felipe Contreras felipe.contre...@gmail.com --- unpack-trees.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/unpack-trees.c b/unpack-trees.c index ede4299..eff2944 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1146,8 +1146,10 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options o-src_index = NULL; ret = check_updates(o) ? (-2) : 0; - if (o-dst_index) + if (o-dst_index) { + discard_index(o-dst_index); *o-dst_index = o-result; + } done: clear_exclude_list(el); -- 1.8.3.rc3.312.g47657de -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/7] unpack-trees: free cache_entry array members for merges
On Thu, May 30, 2013 at 6:34 AM, René Scharfe rene.scha...@lsrfire.ath.cx wrote: The merge functions duplicate entries as needed and they don't free them. Release them in unpack_nondirectories, the same function where they were allocated, after we're done. Ah, you beat me to this change, but.. @@ -600,9 +600,14 @@ static int unpack_nondirectories(int n, unsigned long mask, src[i + o-merge] = create_ce_entry(info, names + i, stage); } - if (o-merge) - return call_unpack_fn((const struct cache_entry * const *)src, - o); + if (o-merge) { + int rc = call_unpack_fn((const struct cache_entry * const *)src, + o); + for (i = 1; i = n; i++) + if (src[i] src[i] != o-df_conflict_entry) + free(src[i]); Doesn't it make more sense to follow the code above and do src[i + o-merge]? -- Felipe Contreras -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7] Add new git-related helper to contrib
Felipe Contreras wrote: What's your objective? Block this patch from ever going in? Not a single one of these comments makes a difference at all, all of them can wait until after the patch is merged, many of them are a matter of preferences, and some of them have already been addressed as precisely that: disagreements in style. You posted a patch, and I reviewed it. End of story. I never explicitly or implicitly indicated that I want to block the patch, so stop pulling stuff out of your arse. If you don't want a review, write DO NOT REVIEW (or better yet, don't hit my inbox). I'm not interested in wasting my time either. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7] Add new git-related helper to contrib
On Thu, May 30, 2013 at 7:08 AM, Ramkumar Ramachandra artag...@gmail.com wrote: Felipe Contreras wrote: What's your objective? Block this patch from ever going in? Not a single one of these comments makes a difference at all, all of them can wait until after the patch is merged, many of them are a matter of preferences, and some of them have already been addressed as precisely that: disagreements in style. You posted a patch, and I reviewed it. End of story. I never explicitly or implicitly indicated that I want to block the patch, so stop pulling stuff out of your arse. That's exactly what you are doing. Do you see any other reason for not merging this if not your comments? -- Felipe Contreras -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/4] commit: reload cache properly
Felipe Contreras felipe.contre...@gmail.com writes: We are supposedly adding files, to to which cache if 'the_index' is discarded? [...] if (!current_head) { discard_cache(); + if (read_cache() 0) + die(_(cannot read the index)); return; } It is not obvious to me that this is a correct change. discard_cache() without subsequent reloading could also legitimately be used to empty the index. So if you are fixing a bug, please justify the change and provide a testcase to guard against it in the future. -- Thomas Rast trast@{inf,student}.ethz.ch -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor performance of git describe in big repos
The repo is a fairly hairy one as it includes two historically un-related but content related repos which I'm the process of cherry-picking stuff across. 11:58 ajb@sloy/x86_64 [work.git] git count-objects -v count: 493 size: 4572 in-pack: 399307 packs: 1 size-pack: 1930755 prune-packable: 0 garbage: 0 size-garbage: 0 This was after a repack which did have slight negative effect on performance. The pack file is: 13:27 ajb@sloy/x86_64 [work.git] ls -lh ./.git/objects/pack/* -r--r--r-- 1 ajb cvs 11M May 30 11:56 ./.git/objects/pack/pack-a9ba133a6f25ffa74c3c407e09ab030f8745b201.idx -r--r--r-- 1 ajb cvs 1.9G May 30 11:56 ./.git/objects/pack/pack-a9ba133a6f25ffa74c3c407e09ab030f8745b201.pack I ran perf on it and the top items in the report where: 41.58% git libcrypto.so.1.0.0 [.] 0x6ae73 33.96% git libz.so.1.2.3.4 [.] 0xe0ec 10.39% git libz.so.1.2.3.4 [.] adler32 2.03% git [kernel.kallsyms] [k] clear_page_c So I'm guessing it's spending a lot of non-cache efficient time un-packing and processing the deltas? -- Alex. On 30 May 2013 12:48, John Keeping j...@keeping.me.uk wrote: On Thu, May 30, 2013 at 11:38:32AM +0100, Alex Bennée wrote: One factor might be the size of my repo (.git is around 2.4G). Could this just be due to computational cost of searching through large packs to walk the commit chain? Is there any way to make this easier for git to do? What does git count-objects -v say for your repository? You may find that performance improves if you repack with git gc --aggressive. -- Alex, homepage: http://www.bennee.com/~alex/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/4] commit: reload cache properly
On Thu, May 30, 2013 at 7:17 AM, Thomas Rast tr...@inf.ethz.ch wrote: Felipe Contreras felipe.contre...@gmail.com writes: We are supposedly adding files, to to which cache if 'the_index' is discarded? [...] if (!current_head) { discard_cache(); + if (read_cache() 0) + die(_(cannot read the index)); return; } It is not obvious to me that this is a correct change. discard_cache() without subsequent reloading could also legitimately be used to empty the index. So if you are fixing a bug, please justify the change and provide a testcase to guard against it in the future. So istate-initialized is false, yet somebody can still add entries to the cache? What happens when somebody else tries to initialize this cache? All the entries there will be lost, even though nobody discarded it afterwards. -- Felipe Contreras -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/7] add tests for rebasing with patch-equivalence present
Am 30.05.2013 07:30, schrieb Martin von Zweigbergk: On Wed, May 29, 2013 at 12:09 AM, Johannes Sixt j.s...@viscovery.net wrote: Am 5/29/2013 8:39, schrieb Martin von Zweigbergk: +# f +# / +# a---b---c---g---h +# \ +# d---G---i ... +test_run_rebase () { + result=$1 + shift + test_expect_$result rebase $* --onto drops patches in onto + reset_rebase + git rebase $* --onto h f i + test_cmp_rev h HEAD~2 + test_linear_range 'd i' h.. Isn't this expectation wrong? The upstream of the rebased branch is f, and it does not contain G. Hence, G should be replayed. Since h is the reversal of g, the state at h is the same as at c, and applying G should succeed (it is the same change as g). Therefore, I think the correct expectation is: test_linear_range 'd G i' h.. Good question! It is really not obvious what the right answer is. Some arguments in favor of dropping 'G': 1. Let's say origin/master points to 'b' when you start the 'd G i' branch. You then send the 'G' patch to Junio who applies it as 'g' (cherry-picking direction is reversed compared to figure, but same effect). You then git pull --rebase when master on origin points to 'h'. Because of the cleverness in 'git pull --rebase', it issues 'git rebase --onto h b i'. The reason for this git pull cleverness is to be prepared for rewritten history: b'--c'--g'--h' / a---b \ d---G---i to avoid that b is rebased. In this case it's clearly useful to have the patch dropped. 2. In the test a little before the above one, we instead do 'git rebase --onto f h i' and make sure that the 'G' is _not_ lost. In that case it doesn't matter what's in $branch..$upstream. Do we agree that $branch..$upstream should never matter (instead, $upstream is only used to find merge base with $branch)? No, we do not agree. $branch..$upstream should be the set of patches that should be omitted. $branch..$onto should not matter. $onto is only used to specify the destination of the rebased commits. Do we also agree that 'git rebase a b' should be identical to 'git rebase --onto a a b'? Absolutely! Because 'git rebase h i' should clearly drop 'G', then so should 'git rebase --onto h h i'. Yes! Then, if we agreed that $branch..$upstream doesn't matter, 'git rebase --onto h f i' should behave the same, no? Correct in the mathematically logical sense. ;) But we do not agree that $branch..$upstream doesn't matter. The set of commits to rebase that I was thinking of using was $upstream..$branch, unless equivalent with patch in $branch..$onto. But I'm not very confident about my conclusions above :-) At least the man page says that ..$upstream counts and $onto tells just the new base. The way how git pull calls rebase should be revisited, I think. -- Hannes -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/4] commit: reload cache properly
Felipe Contreras felipe.contre...@gmail.com writes: On Thu, May 30, 2013 at 7:17 AM, Thomas Rast tr...@inf.ethz.ch wrote: Felipe Contreras felipe.contre...@gmail.com writes: We are supposedly adding files, to to which cache if 'the_index' is discarded? [...] if (!current_head) { discard_cache(); + if (read_cache() 0) + die(_(cannot read the index)); return; } It is not obvious to me that this is a correct change. discard_cache() without subsequent reloading could also legitimately be used to empty the index. So if you are fixing a bug, please justify the change and provide a testcase to guard against it in the future. So istate-initialized is false, yet somebody can still add entries to the cache? What happens when somebody else tries to initialize this cache? All the entries there will be lost, even though nobody discarded it afterwards. And yet it works, and your patch breaks it. diff --git i/t/t7501-commit.sh w/t/t7501-commit.sh index 195e747..1608254 100755 --- i/t/t7501-commit.sh +++ w/t/t7501-commit.sh @@ -524,4 +524,16 @@ test_expect_success 'commit a file whose name is a dash' ' test_i18ngrep changed, 5 insertions output ' +test_expect_success '--only works on to-be-born branch' ' + git checkout --orphan orphan + echo foo newfile + git add newfile + git commit --only newfile -m--only on unborn branch + cat expected EOF +100644 blob 257cc5642cb1a054f08cc83f2d943e56fd3ebe99 newfile +EOF + git ls-tree -r HEAD actual + test_cmp expected actual +' + test_done -- Thomas Rast trast@{inf,student}.ethz.ch -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/4] commit: reload cache properly
On Thu, May 30, 2013 at 7:58 AM, Thomas Rast tr...@inf.ethz.ch wrote: Felipe Contreras felipe.contre...@gmail.com writes: On Thu, May 30, 2013 at 7:17 AM, Thomas Rast tr...@inf.ethz.ch wrote: Felipe Contreras felipe.contre...@gmail.com writes: We are supposedly adding files, to to which cache if 'the_index' is discarded? [...] if (!current_head) { discard_cache(); + if (read_cache() 0) + die(_(cannot read the index)); return; } It is not obvious to me that this is a correct change. discard_cache() without subsequent reloading could also legitimately be used to empty the index. So if you are fixing a bug, please justify the change and provide a testcase to guard against it in the future. So istate-initialized is false, yet somebody can still add entries to the cache? What happens when somebody else tries to initialize this cache? All the entries there will be lost, even though nobody discarded it afterwards. And yet it works, and your patch breaks it. It might work, but the API doesn't make any sense. -- Felipe Contreras -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/4] commit: reload cache properly
On Thu, May 30, 2013 at 7:17 PM, Thomas Rast tr...@inf.ethz.ch wrote: Felipe Contreras felipe.contre...@gmail.com writes: We are supposedly adding files, to to which cache if 'the_index' is discarded? [...] if (!current_head) { discard_cache(); + if (read_cache() 0) + die(_(cannot read the index)); return; } It is not obvious to me that this is a correct change. discard_cache() without subsequent reloading could also legitimately be used to empty the index. So if you are fixing a bug, please justify the change and provide a testcase to guard against it in the future. That discard_cache line I think came from fa9dcf8 (Fix performance regression for partial commits - 2008-01-13). The code flow back then was if (initial_commit) discard_cache(); add_remove_files(); /* do something more */ A quick look from current code seems to indicate this pattern is still valid for creating partial commits, where temporary index will be thrown away afterwards. But I may be wrong. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor performance of git describe in big repos
It looks like it's a file caching effect combined with my repo being more pathalogical in size and contents. Note run 1 (cold) vs run 2 on the linux file tree: 13:52 ajb@sloy/x86_64 [linux.git] time git describe --debug --long --tags HEAD~1 searching to describe HEAD~1 annotated 57 v2.6.34-rc2 annotated 1688 v2.6.34-rc1 annotated 7932 v2.6.33 annotated 8157 v2.6.33-rc8 annotated 8381 v2.6.33-rc7 annotated 8637 v2.6.33-rc6 annotated 8964 v2.6.33-rc5 annotated 9493 v2.6.33-rc4 annotated 9912 v2.6.33-rc3 annotated 10202 v2.6.33-rc2 traversed 10547 commits more than 10 tags found; listed 10 most recent gave up search at 55639353a0035052d9ea6cfe4dde0ac7fcbb2c9f v2.6.34-rc2-57-gef5da59 real0m7.332s user0m0.308s sys 0m0.244s 14:03 ajb@sloy/x86_64 [linux.git] time git describe --debug --long --tags HEAD~1 searching to describe HEAD~1 annotated 57 v2.6.34-rc2 annotated 1688 v2.6.34-rc1 annotated 7932 v2.6.33 annotated 8157 v2.6.33-rc8 annotated 8381 v2.6.33-rc7 annotated 8637 v2.6.33-rc6 annotated 8964 v2.6.33-rc5 annotated 9493 v2.6.33-rc4 annotated 9912 v2.6.33-rc3 annotated 10202 v2.6.33-rc2 traversed 10547 commits more than 10 tags found; listed 10 most recent gave up search at 55639353a0035052d9ea6cfe4dde0ac7fcbb2c9f v2.6.34-rc2-57-gef5da59 real0m0.298s user0m0.244s sys 0m0.036s Although the perf profile looks subtly different. First through the linux tree: 22.35% git libz.so.1.2.3.4[.] inflate 18.56% git libz.so.1.2.3.4[.] inflate_fast 17.48% git libz.so.1.2.3.4[.] inflate_table 7.84% git git[.] hashcmp 3.93% git git[.] get_sha1_hex 3.46% git libz.so.1.2.3.4[.] adler32 And through my special repo: 41.58% git libcrypto.so.1.0.0 [.] sha1_block_data_order_ssse3 33.62% git libz.so.1.2.3.4 [.] inflate_fast 10.39% git libz.so.1.2.3.4 [.] adler32 2.03% git [kernel.kallsyms] [k] clear_page_c I'm not sure why libcrypto features so highly in the results -- Alex. On 30 May 2013 12:33, Ramkumar Ramachandra artag...@gmail.com wrote: Alex Bennée wrote: time /usr/bin/git --no-pager traversed 223 commits real0m4.817s user0m4.320s sys 0m0.464s I'm quite clueless about why it is taking this long: I think it's IO because there's nothing to compute? I really can't trace anything unless you can reproduce it on a public repository. On linux.git with my rotating hard disk: $ time git describe --debug --long --tags HEAD~1 searching to describe HEAD~1 annotated 5445 v2.6.33 annotated 5660 v2.6.33-rc8 annotated 5884 v2.6.33-rc7 annotated 6140 v2.6.33-rc6 annotated 6467 v2.6.33-rc5 annotated 6999 v2.6.33-rc4 annotated 7430 v2.6.33-rc3 annotated 7746 v2.6.33-rc2 annotated 8212 v2.6.33-rc1 annotated 13854 v2.6.32 traversed 18895 commits more than 10 tags found; listed 10 most recent gave up search at 648f4e3e50c4793d9dbf9a09afa193631f76fa26 v2.6.33-5445-ge7c84ee real0m0.509s user0m0.470s sys 0m0.037s 18k+ commits traversed in half a second here, so I really don't know what is going on. -- Alex, homepage: http://www.bennee.com/~alex/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor performance of git describe in big repos
You may find that performance improves if you repack with git gc --aggressive. It seems that increases the time to get to where it wants to: 14:12 ajb@sloy/x86_64 [work.git] time /usr/bin/git --no-pager describe --long --tags --debug searching to describe HEAD lightweight 10 ajb-build-test-5224 lightweight 41 ajb-build-test-5222 annotated146 vnms-2-1-36-32 annotated155 vnms-2-1-36-31 annotated174 vnms-2-1-36-30 annotated183 vnms-2-1-36-29 lightweight 188 vnms-2-1-36-28 annotated193 vnms-2-1-36-27 annotated206 vnms-2-1-36-26 annotated215 vectastar-4-2-83-5 traversed 223 commits more than 10 tags found; listed 10 most recent gave up search at 2b69df72d47be8440e3ce4cee91b9b7ceaf8b77c ajb-build-test-5224-10-gfa296e6 real0m14.658s user0m12.845s sys 0m1.776s On 30 May 2013 12:48, John Keeping j...@keeping.me.uk wrote: On Thu, May 30, 2013 at 11:38:32AM +0100, Alex Bennée wrote: One factor might be the size of my repo (.git is around 2.4G). Could this just be due to computational cost of searching through large packs to walk the commit chain? Is there any way to make this easier for git to do? What does git count-objects -v say for your repository? You may find that performance improves if you repack with git gc --aggressive. -- Alex, homepage: http://www.bennee.com/~alex/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor performance of git describe in big repos
On Thu, May 30, 2013 at 7:29 PM, Alex Bennée kernel-hac...@bennee.com wrote: I ran perf on it and the top items in the report where: 41.58% git libcrypto.so.1.0.0 [.] 0x6ae73 33.96% git libz.so.1.2.3.4 [.] 0xe0ec 10.39% git libz.so.1.2.3.4 [.] adler32 2.03% git [kernel.kallsyms] [k] clear_page_c So I'm guessing it's spending a lot of non-cache efficient time un-packing and processing the deltas? If I'm not mistaken, commits are never deltified. They are usually small and packed close together for better I/O patterns. If you really just read hundreds of commits, it can't take that long. Maybe some code paths accidentally open a tree, a blob or something.. Can you try setting core.logpackaccess to a path on and rerun describe? Jugding from the code (I never actually tried it), it'll create a file at the given path with the accessed pack offsets. You can check what offset corresponds to what object with verify-pack -v. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 1/3] read-cache: plug a few leaks
We don't free 'istate-cache' properly. Apparently 'initialized' doesn't really mean initialized, but loaded, or rather 'not-empty', and the cache can be used even if it's not 'initialized', so we can't rely on this variable to keep track of the 'istate-cache'. So assume it always has data, and free it before overwriting it. Signed-off-by: Felipe Contreras felipe.contre...@gmail.com --- read-cache.c | 4 1 file changed, 4 insertions(+) diff --git a/read-cache.c b/read-cache.c index 04ed561..e5dc96f 100644 --- a/read-cache.c +++ b/read-cache.c @@ -1449,6 +1449,7 @@ int read_index_from(struct index_state *istate, const char *path) istate-version = ntohl(hdr-hdr_version); istate-cache_nr = ntohl(hdr-hdr_entries); istate-cache_alloc = alloc_nr(istate-cache_nr); + free(istate-cache); istate-cache = xcalloc(istate-cache_alloc, sizeof(struct cache_entry *)); istate-initialized = 1; @@ -1510,6 +1511,9 @@ int discard_index(struct index_state *istate) for (i = 0; i istate-cache_nr; i++) free(istate-cache[i]); + free(istate-cache); + istate-cache = NULL; + istate-cache_alloc = 0; resolve_undo_clear_index(istate); istate-cache_nr = 0; istate-cache_changed = 0; -- 1.8.3.rc3.312.g47657de -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 3/3] unpack-trees: free created cache entries
We created them, and nobody else is going to destroy them. Signed-off-by: Felipe Contreras felipe.contre...@gmail.com --- unpack-trees.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/unpack-trees.c b/unpack-trees.c index eff2944..9f19d01 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -590,8 +590,16 @@ static int unpack_nondirectories(int n, unsigned long mask, src[i + o-merge] = create_ce_entry(info, names + i, stage); } - if (o-merge) - return call_unpack_fn(src, o); + if (o-merge) { + int ret = call_unpack_fn(src, o); + for (i = 0; i n; i++) { + struct cache_entry *ce = src[i + o-merge]; + if (!ce || ce == o-df_conflict_entry) + continue; + free(ce); + } + return ret; + } for (i = 0; i n; i++) if (src[i] src[i] != o-df_conflict_entry) -- 1.8.3.rc3.312.g47657de -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 2/3] unpack-trees: plug a memory leak
Before overwriting the destination index, first let's discard it's contents. Signed-off-by: Felipe Contreras felipe.contre...@gmail.com --- unpack-trees.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/unpack-trees.c b/unpack-trees.c index ede4299..eff2944 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1146,8 +1146,10 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options o-src_index = NULL; ret = check_updates(o) ? (-2) : 0; - if (o-dst_index) + if (o-dst_index) { + discard_index(o-dst_index); *o-dst_index = o-result; + } done: clear_exclude_list(el); -- 1.8.3.rc3.312.g47657de -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 0/3] cherry-pick: fix memory leaks
Hi, A small change since v1; one patch is dropped and another is updated to make up for it. Felipe Contreras (3): read-cache: plug a few leaks unpack-trees: plug a memory leak unpack-trees: free created cache entries read-cache.c | 4 unpack-trees.c | 16 +--- 2 files changed, 17 insertions(+), 3 deletions(-) -- 1.8.3.rc3.312.g47657de -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/3] unpack-trees: plug a memory leak
On 05/30/2013 03:34 PM, Felipe Contreras wrote: Before overwriting the destination index, first let's discard it's s/it's/its/ contents. [SNIP] Regards, Stefano -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor performance of git describe in big repos
On Thu, May 30, 2013 at 8:34 PM, Alex Bennée kernel-hac...@bennee.com wrote: From the following run: 14:31 ajb@sloy/x86_64 [work.git] time /usr/bin/git --no-pager describe --long --tags ajb-build-test-5224-11-g9660048 real0m14.720s user0m12.985s sys 0m1.700s 14:31 ajb@sloy/x86_64 [work.git] wc -l /tmp/log-pack.txt 1610 /tmp/log-pack.txt The pack has been tuned with a gc --aggressive. Assuming the numbers are offsets into the pack it looks fairly random access until the last 100 or so. [snipped] Thanks. Can you share verify-pack -v output of pack-a9ba133a6f25ffa74c3c407e09ab030f8745b201.pack? I think you need to put it somewhere on Internet temporarily as it's likely to exceed git@vger limits. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] core: use env variable instead of config var to turn on logging pack access
5f44324 (core: log offset pack data accesses happened - 2011-07-06) provides a way to observe pack access patterns via a config switch. Setting an environment variable looks more obvious than a config var, especially when you just need to _observe_, and more inline with other tracing knobs we have. Document it as it may be useful for remote troubleshooting. Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com --- Documentation/git.txt | 7 +++ cache.h | 3 --- config.c | 3 --- sha1_file.c | 10 ++ 4 files changed, 17 insertions(+), 6 deletions(-) diff --git a/Documentation/git.txt b/Documentation/git.txt index 9e302b0..3e74440 100644 --- a/Documentation/git.txt +++ b/Documentation/git.txt @@ -832,6 +832,13 @@ for further details. as a file path and will try to write the trace messages into it. +'GIT_TRACE_PACK_ACCESS':: + If this variable is set to a path, a file will be created at + the given path logging all accesses to any packs. For each + access, the pack file name and an offset in the pack is + recorded. This may be helpful for troubleshooting some + pack-related performance problems. + GIT_LITERAL_PATHSPECS:: Setting this variable to `1` will cause Git to treat all pathspecs literally, rather than as glob patterns. For example, diff --git a/cache.h b/cache.h index 94ca1ac..9bfd76b 100644 --- a/cache.h +++ b/cache.h @@ -772,9 +772,6 @@ extern int parse_sha1_header(const char *hdr, unsigned long *sizep); /* global flag to enable extra checks when accessing packed objects */ extern int do_check_packed_object_crc; -/* for development: log offset of pack access */ -extern const char *log_pack_access; - extern int check_sha1_signature(const unsigned char *sha1, void *buf, unsigned long size, const char *type); extern int move_temp_to_file(const char *tmpfile, const char *filename); diff --git a/config.c b/config.c index aefd80b..ce074d7 100644 --- a/config.c +++ b/config.c @@ -675,9 +675,6 @@ static int git_default_core_config(const char *var, const char *value) return 0; } - if (!strcmp(var, core.logpackaccess)) - return git_config_string(log_pack_access, var, value); - if (!strcmp(var, core.autocrlf)) { if (value !strcasecmp(value, input)) { if (core_eol == EOL_CRLF) diff --git a/sha1_file.c b/sha1_file.c index 67e815b..7b47bdc 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -36,6 +36,8 @@ static inline uintmax_t sz_fmt(size_t s) { return s; } const unsigned char null_sha1[20]; +static const char *log_pack_access = ; + /* * This is meant to hold a *small* number of objects that you would * want read_sha1_file() to be able to return, but yet you do not want @@ -1956,6 +1958,14 @@ static void write_pack_access_log(struct packed_git *p, off_t obj_offset) { static FILE *log_file; + if (!*log_pack_access) { + log_pack_access = getenv(GIT_TRACE_PACK_ACCESS); + if (!*log_pack_access) + log_pack_access = NULL; + if (!log_pack_access) + return; + } + if (!log_file) { log_file = fopen(log_pack_access, w); if (!log_file) { -- 1.8.2.83.gc99314b -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] git.txt: document GIT_TRACE_PACKET
This can help with debugging object negotiation or other protocol issues. Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com --- Documentation/git.txt | 5 + 1 file changed, 5 insertions(+) diff --git a/Documentation/git.txt b/Documentation/git.txt index 3e74440..12ef7a2 100644 --- a/Documentation/git.txt +++ b/Documentation/git.txt @@ -839,6 +839,11 @@ for further details. recorded. This may be helpful for troubleshooting some pack-related performance problems. +'GIT_TRACE_PACKET':: + If this variable is set, it shows a trace of all packets + coming in or out of a given program. This can help with + debugging object negotiation or other protocol issues. + GIT_LITERAL_PATHSPECS:: Setting this variable to `1` will cause Git to treat all pathspecs literally, rather than as glob patterns. For example, -- 1.8.2.83.gc99314b -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] cherry-pick: don't barf when there's nothing to do
If the user set --ff, it's expected that if theres's nothing to do we fast-forward our current HEAD, which is a no-op. Signed-off-by: Felipe Contreras felipe.contre...@gmail.com --- sequencer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sequencer.c b/sequencer.c index d8f9d30..b9d4b48 100644 --- a/sequencer.c +++ b/sequencer.c @@ -749,7 +749,7 @@ static void prepare_revs(struct replay_opts *opts) if (prepare_revision_walk(opts-revs)) die(_(revision walk setup failed)); - if (!opts-revs-commits) + if (!opts-revs-commits !opts-allow_ff) die(_(empty commit set passed)); } -- 1.8.3.rc3.312.g47657de -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/4] Trivial patches
Hi, Here's a bunch of trivial patches, mostly syle, but the first one might be important. Felipe Contreras (4): read-cache: fix wrong 'the_index' usage read-cache: trivial style cleanups unpack-trees: trivial cleanup sha1_file: trivial style cleanup read-cache.c | 6 +++--- sha1_file.c| 2 +- unpack-trees.c | 3 +-- 3 files changed, 5 insertions(+), 6 deletions(-) -- 1.8.3.rc3.312.g47657de -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/4] read-cache: fix wrong 'the_index' usage
We are dealing with the 'istate' index, not 'the_index'. Signed-off-by: Felipe Contreras felipe.contre...@gmail.com --- read-cache.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/read-cache.c b/read-cache.c index 04ed561..5253ec5 100644 --- a/read-cache.c +++ b/read-cache.c @@ -626,7 +626,7 @@ int add_to_index(struct index_state *istate, const char *path, struct stat *st, if (*ptr == '/') { struct cache_entry *foundce; ++ptr; - foundce = index_name_exists(the_index, ce-name, ptr - ce-name, ignore_case); + foundce = index_name_exists(istate, ce-name, ptr - ce-name, ignore_case); if (foundce) { memcpy((void *)startPtr, foundce-name + (startPtr - ce-name), ptr - startPtr); startPtr = ptr; -- 1.8.3.rc3.312.g47657de -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/4] read-cache: trivial style cleanups
Signed-off-by: Felipe Contreras felipe.contre...@gmail.com --- read-cache.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/read-cache.c b/read-cache.c index 5253ec5..7040e79 100644 --- a/read-cache.c +++ b/read-cache.c @@ -979,7 +979,7 @@ int add_index_entry(struct index_state *istate, struct cache_entry *ce, int opti if (istate-cache_nr == istate-cache_alloc) { istate-cache_alloc = alloc_nr(istate-cache_alloc); istate-cache = xrealloc(istate-cache, - istate-cache_alloc * sizeof(struct cache_entry *)); + istate-cache_alloc * sizeof(*istate-cache)); } /* Add it in.. */ @@ -1449,7 +1449,7 @@ int read_index_from(struct index_state *istate, const char *path) istate-version = ntohl(hdr-hdr_version); istate-cache_nr = ntohl(hdr-hdr_entries); istate-cache_alloc = alloc_nr(istate-cache_nr); - istate-cache = xcalloc(istate-cache_alloc, sizeof(struct cache_entry *)); + istate-cache = xcalloc(istate-cache_alloc, sizeof(*istate-cache)); istate-initialized = 1; if (istate-version == 4) -- 1.8.3.rc3.312.g47657de -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/4] unpack-trees: trivial cleanup
dfc has not been initialized at this point. Signed-off-by: Felipe Contreras felipe.contre...@gmail.com --- unpack-trees.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/unpack-trees.c b/unpack-trees.c index ede4299..36f4ff7 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1040,8 +1040,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options if (!o-skip_sparse_checkout) mark_new_skip_worktree(o-el, o-src_index, 0, CE_NEW_SKIP_WORKTREE); - if (!dfc) - dfc = xcalloc(1, cache_entry_size(0)); + dfc = xcalloc(1, cache_entry_size(0)); o-df_conflict_entry = dfc; if (len) { -- 1.8.3.rc3.312.g47657de -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/4] sha1_file: trivial style cleanup
Signed-off-by: Felipe Contreras felipe.contre...@gmail.com --- sha1_file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sha1_file.c b/sha1_file.c index 67e815b..b114cc9 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2138,7 +2138,7 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset, if (!data) die(failed to apply delta); - free (delta_data); + free(delta_data); } *final_type = type; -- 1.8.3.rc3.312.g47657de -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor performance of git describe in big repos
On 30 May 2013 14:45, Duy Nguyen pclo...@gmail.com wrote: On Thu, May 30, 2013 at 8:34 PM, Alex Bennée kernel-hac...@bennee.com wrote: snip Thanks. Can you share verify-pack -v output of pack-a9ba133a6f25ffa74c3c407e09ab030f8745b201.pack? I think you need to put it somewhere on Internet temporarily as it's likely to exceed git@vger limits. -- Duy http://www.bennee.com/~alex/stuff/git-pack-access.tar.bz2 -- Alex, homepage: http://www.bennee.com/~alex/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor performance of git describe in big repos
Alex Bennée wrote: And through my special repo: 41.58% git libcrypto.so.1.0.0 [.] sha1_block_data_order_ssse3 33.62% git libz.so.1.2.3.4 [.] inflate_fast 10.39% git libz.so.1.2.3.4 [.] adler32 2.03% git [kernel.kallsyms] [k] clear_page_c I'm not sure why libcrypto features so highly in the results While Duy churns on the delta chain, let me try to make a (rather crude) observation: What does it mean for libcrypto to be so high in your perf report? sha1_block_data_order is ultimately by object.c:parse_object. While it indicates that deltas are taking a long time to apply (or are somehow not optimally organized for IO), I think it indicates either: 1. Your history is very deep and there are an unusually high number of deltas for each blob. What are the total number of commits? 2. You have have huge (binary) files checked into your repository. Do you? If so, why isn't the code in streaming.c kicking in? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] test: fix post rewrite hook report
Felipe Contreras felipe.contre...@gmail.com writes: First expected, then actual. Ack. That is the prevalent (almost universal, but not quite) style. Signed-off-by: Felipe Contreras felipe.contre...@gmail.com --- t/t5407-post-rewrite-hook.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/t/t5407-post-rewrite-hook.sh b/t/t5407-post-rewrite-hook.sh index baa670c..ea2e0d4 100755 --- a/t/t5407-post-rewrite-hook.sh +++ b/t/t5407-post-rewrite-hook.sh @@ -31,8 +31,8 @@ clear_hook_input () { } verify_hook_input () { - test_cmp $TRASH_DIRECTORY/post-rewrite.args expected.args - test_cmp $TRASH_DIRECTORY/post-rewrite.data expected.data + test_cmp expected.args $TRASH_DIRECTORY/post-rewrite.args + test_cmp expected.data $TRASH_DIRECTORY/post-rewrite.data } test_expect_success 'git commit --amend' ' -- Thomas Rast trast@{inf,student}.ethz.ch -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/7] unpack-trees: free cache_entry array members for merges
Am 30.05.2013 14:04, schrieb Felipe Contreras: On Thu, May 30, 2013 at 6:34 AM, René Scharfe rene.scha...@lsrfire.ath.cx wrote: The merge functions duplicate entries as needed and they don't free them. Release them in unpack_nondirectories, the same function where they were allocated, after we're done. Ah, you beat me to this change, but.. @@ -600,9 +600,14 @@ static int unpack_nondirectories(int n, unsigned long mask, src[i + o-merge] = create_ce_entry(info, names + i, stage); } - if (o-merge) - return call_unpack_fn((const struct cache_entry * const *)src, - o); + if (o-merge) { + int rc = call_unpack_fn((const struct cache_entry * const *)src, + o); + for (i = 1; i = n; i++) + if (src[i] src[i] != o-df_conflict_entry) + free(src[i]); Doesn't it make more sense to follow the code above and do src[i + o-merge]? Not sure I understand. Is the goal to avoid confusion for code readers by using the same indexing method for allocation and release? Or are you worried about o-merge having a different value than 1 in that loop? We'd have to add 1 (== o-merge) to each index variable usage with a zero-based loop. A one-based loop avoids that, and while it's not pretty it's also not too complicated, I think. René -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 3/3] unpack-trees: free created cache entries
Am 30.05.2013 15:34, schrieb Felipe Contreras: We created them, and nobody else is going to destroy them. Signed-off-by: Felipe Contreras felipe.contre...@gmail.com --- unpack-trees.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/unpack-trees.c b/unpack-trees.c index eff2944..9f19d01 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -590,8 +590,16 @@ static int unpack_nondirectories(int n, unsigned long mask, src[i + o-merge] = create_ce_entry(info, names + i, stage); } - if (o-merge) - return call_unpack_fn(src, o); + if (o-merge) { + int ret = call_unpack_fn(src, o); + for (i = 0; i n; i++) { + struct cache_entry *ce = src[i + o-merge]; + if (!ce || ce == o-df_conflict_entry) + continue; + free(ce); + } + return ret; + } Ah, now I understand what you meant in that other email. That works as well, of course. It's slightly nicer on the eye, admittedly. René -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [git-users] Highlevel (but simple to implement) commands provided by default for git
Felipe Contreras wrote: On Thu, May 30, 2013 at 12:23 AM, Jonathan Nieder jrnie...@gmail.com wrote: Felipe Contreras wrote: On Wed, May 29, 2013 at 6:43 PM, Jonathan Nieder jrnie...@gmail.com wrote: A bigger problem (in my opinion) with allowing arbitrary changes to the meaning of existing commands is that scripts, whether placed in .sh files or given as commands to run over IRC, stop working altogether. It's nice to have commands like git log and git am mean the same thing no matter what machine I am on. Except that's not true: It's not true that my opinion is that a bigger problem than the non-problem Ram mentioned with allowing arbitrary changes to the meaning of existing commands is that scripts stop working reliably? It's not true what you said: commands like git log and git am mean the same thing no matter what machine I am on. It's not true that it's nice when they do? This combative style of communication is toxic. It kills the chance of a calm, pleasant discussion, even with patient people who don't even fundamentally disagree. Please stop it. Stop assuming bad faith[1]. Perhaps you mean I'm trying, but I'm human --- please bear with me while I work on improving. Know that my intentions are good. Unfortunately, good intentions are not enough. Communicating in a way that clearly conveys what you mean instead of something else that derails the conversation is a real skill and, as I said, it's an important one that is basic to being able to have a productive conversation. If you are working on it and are not there yet, my best advice would be to lurk more and speak less, to learn from the example of others, and to start by working on how to receive criticism and to clear up misunderstandings, which can at least mitigate the damage when things go wrong. You have accused others of assuming bad faith in the past, which only escalates things. It's not a good way to move forward. It's possible that the best way to move forward involves some way (e.g., mail client configuration that holds messages in drafts for a little while before sending them out) of being able to cool down when discussions get hot. Sincerely, Jonathan -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/7] add tests for rebasing with patch-equivalence present
On Thu, May 30, 2013 at 5:54 AM, Johannes Sixt j...@kdbg.org wrote: Am 30.05.2013 07:30, schrieb Martin von Zweigbergk: On Wed, May 29, 2013 at 12:09 AM, Johannes Sixt j.s...@viscovery.net wrote: Am 5/29/2013 8:39, schrieb Martin von Zweigbergk: +# f +# / +# a---b---c---g---h +# \ +# d---G---i ... +test_run_rebase () { + result=$1 + shift + test_expect_$result rebase $* --onto drops patches in onto + reset_rebase + git rebase $* --onto h f i + test_cmp_rev h HEAD~2 + test_linear_range 'd i' h.. Isn't this expectation wrong? The upstream of the rebased branch is f, and it does not contain G. Hence, G should be replayed. Since h is the reversal of g, the state at h is the same as at c, and applying G should succeed (it is the same change as g). Therefore, I think the correct expectation is: test_linear_range 'd G i' h.. Good question! It is really not obvious what the right answer is. Some arguments in favor of dropping 'G': 1. Let's say origin/master points to 'b' when you start the 'd G i' branch. You then send the 'G' patch to Junio who applies it as 'g' (cherry-picking direction is reversed compared to figure, but same effect). You then git pull --rebase when master on origin points to 'h'. Because of the cleverness in 'git pull --rebase', it issues 'git rebase --onto h b i'. The reason for this git pull cleverness is to be prepared for rewritten history: b'--c'--g'--h' / a---b \ d---G---i to avoid that b is rebased. Right. It doesn't currently drop 'G' and maybe it shouldn't, so let's leave it as is for now, I would say. 2. In the test a little before the above one, we instead do 'git rebase --onto f h i' and make sure that the 'G' is _not_ lost. In that case it doesn't matter what's in $branch..$upstream. Do we agree that $branch..$upstream should never matter (instead, $upstream is only used to find merge base with $branch)? No, we do not agree. $branch..$upstream should be the set of patches that should be omitted. $branch..$onto should not matter. $onto is only used to specify the destination of the rebased commits. Ok. As I said to Felipe, I'm not sure why I was so convinced that it's bad to lose the patch in 'git rebase --onto f h i'. It can result in lost work, but it seems rare enough that no one has reported it, AFAIK. I'll change those tests in a re-roll, and perhaps I'll drop a few of them. Let me know if you (anyone) disagree. Martin PS. Thanks for a meticulous review! -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor performance of git describe in big repos
On 30 May 2013 15:32, Ramkumar Ramachandra artag...@gmail.com wrote: Alex Bennée wrote: And through my special repo: 41.58% git libcrypto.so.1.0.0 [.] sha1_block_data_order_ssse3 33.62% git libz.so.1.2.3.4 [.] inflate_fast 10.39% git libz.so.1.2.3.4 [.] adler32 2.03% git [kernel.kallsyms] [k] clear_page_c I'm not sure why libcrypto features so highly in the results While Duy churns on the delta chain, let me try to make a (rather crude) observation: What does it mean for libcrypto to be so high in your perf report? sha1_block_data_order is ultimately by object.c:parse_object. While it indicates that deltas are taking a long time to apply (or are somehow not optimally organized for IO), I think it indicates either: 1. Your history is very deep and there are an unusually high number of deltas for each blob. What are the total number of commits? Well the history does en-compose about 10 years of product development and has a high number of files in the repo (including about 3 copies of the kernel - sans upstream history). 15:50 ajb@sloy/x86_64 [work.git] time git log --pretty=oneline | wc -l 24648 real0m0.434s user0m0.388s sys 0m0.112s Although it doesn't take too long to walk the whole mainline history (obviously ignoring all the other branches). 15:52 ajb@sloy/x86_64 [work.git] git count-objects -v -H count: 581 size: 5.09 MiB in-pack: 399307 packs: 1 size-pack: 1.49 GiB prune-packable: 0 garbage: 0 size-garbage: 0 bytes It is a pick repo. The gc --aggressive nearly took out my machine keeping around 4gb resident for most of the half hour and using nearly 8gb of VM. Of course most of the history is not needed for day to day stuff. Maybe if I split the pack files up it wouldn't be quite such a strain to work through them? 2. You have have huge (binary) files checked into your repository. Do you? If so, why isn't the code in streaming.c kicking in? We do have some binary blobs in the repository (mainly DSP and FPGA images) although not a huge number: 15:58 ajb@sloy/x86_64 [work.git] time git log --pretty=oneline -- xxx xxx/xx/*.out ./xxx/xxx/*.out ./xxx/xxx/*.out | wc -l 234 real0m0.590s user0m0.552s sys 0m0.040s How can I tell if streaming is kicking in or now? -- Alex, homepage: http://www.bennee.com/~alex/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/3] read-cache: plug a few leaks
Am 30.05.2013 15:34, schrieb Felipe Contreras: We don't free 'istate-cache' properly. Apparently 'initialized' doesn't really mean initialized, but loaded, or rather 'not-empty', and the cache can be used even if it's not 'initialized', so we can't rely on this variable to keep track of the 'istate-cache'. So assume it always has data, and free it before overwriting it. Signed-off-by: Felipe Contreras felipe.contre...@gmail.com --- read-cache.c | 4 1 file changed, 4 insertions(+) diff --git a/read-cache.c b/read-cache.c index 04ed561..e5dc96f 100644 --- a/read-cache.c +++ b/read-cache.c @@ -1449,6 +1449,7 @@ int read_index_from(struct index_state *istate, const char *path) istate-version = ntohl(hdr-hdr_version); istate-cache_nr = ntohl(hdr-hdr_entries); istate-cache_alloc = alloc_nr(istate-cache_nr); + free(istate-cache); With that change, callers of read_index_from need to set -cache to NULL for uninitialized (on-stack) index_state variables. They only had to set -initialized to 0 before in that situation. It this chunk safe for all existing callers? Shouldn't the same free in discard_index (added below) be enough? istate-cache = xcalloc(istate-cache_alloc, sizeof(struct cache_entry *)); istate-initialized = 1; @@ -1510,6 +1511,9 @@ int discard_index(struct index_state *istate) for (i = 0; i istate-cache_nr; i++) free(istate-cache[i]); + free(istate-cache); + istate-cache = NULL; + istate-cache_alloc = 0; resolve_undo_clear_index(istate); istate-cache_nr = 0; istate-cache_changed = 0; I was preparing a similar change, looks good. There is a comment at the end of discard_index() that becomes wrong due to that patch, though -- better remove it as well. It was already outdated as it mentioned active_cache, while the function can be used with any index_state. diff --git a/read-cache.c b/read-cache.c index e5dc96f..0f868af 100644 --- a/read-cache.c +++ b/read-cache.c @@ -1522,8 +1522,6 @@ int discard_index(struct index_state *istate) free_name_hash(istate); cache_tree_free((istate-cache_tree)); istate-initialized = 0; - - /* no need to throw away allocated active_cache */ return 0; } -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: t0008-ignores failure (was: [msysGit] Git for Windows 1.8.3)
Hi Karsten, On Thu, 30 May 2013, Karsten Blees wrote: Am 25.05.2013 21:16, schrieb Pat Thoyts: On that note -- with this merge as it now stands I get the following test failures: t0008-ignores.sh 155, 158, 162, 164 These tests fail because they use absolute paths, e.g. C:/.../global-excludes, which is then translated to CNUL/.../global-excludes. Can be fixed like so: --- 8 --- --- a/t/t0008-ignores.sh +++ b/t/t0008-ignores.sh @@ -5,7 +5,7 @@ test_description=check-ignore . ./test-lib.sh init_vars () { - global_excludes=$(pwd)/global-excludes + global_excludes=global-excludes } enable_global_excludes () { --- Since I do not have time for the lengthy, undirected discussion upstream seems to want to start, let's make your change, but only conditional on MINGW? Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor performance of git describe in big repos
Alex Bennée wrote: 15:50 ajb@sloy/x86_64 [work.git] time git log --pretty=oneline | wc -l 24648 real0m0.434s user0m0.388s sys 0m0.112s Although it doesn't take too long to walk the whole mainline history (obviously ignoring all the other branches). Damn, non-starter. linux.git has 361k+ commits in mainline history. Nit: use git rev-list --count HEAD next time. 15:52 ajb@sloy/x86_64 [work.git] git count-objects -v -H count: 581 size: 5.09 MiB in-pack: 399307 packs: 1 size-pack: 1.49 GiB prune-packable: 0 garbage: 0 size-garbage: 0 bytes linux.git has 2.9m+ in-pack. The pack-size is much lower at about 800+ MiB, but I don't think 1.49 GiB is a problem in itself. Looking forward to your big-files report to see why it's so big. It is a pick repo. The gc --aggressive nearly took out my machine keeping around 4gb resident for most of the half hour and using nearly 8gb of VM. Of course most of the history is not needed for day to day stuff. Maybe if I split the pack files up it wouldn't be quite such a strain to work through them? Really out of my depth here, sorry. Let's see what Duy (or the others) have to say. 2. You have have huge (binary) files checked into your repository. Do you? If so, why isn't the code in streaming.c kicking in? We do have some binary blobs in the repository (mainly DSP and FPGA images) although not a huge number: 15:58 ajb@sloy/x86_64 [work.git] time git log --pretty=oneline -- xxx xxx/xx/*.out ./xxx/xxx/*.out ./xxx/xxx/*.out | wc -l 234 real0m0.590s user0m0.552s sys 0m0.040s log is streaming, and is not a good measure: it doesn't even walk the entire commit graph. How big are these files? How can I tell if streaming is kicking in or now? I use callgrind (and kcachegrind to visualize). Can you post callgrind output? It will be helpful in figuring out where exactly git is spending time. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/7] unpack-trees: free cache_entry array members for merges
On Thu, May 30, 2013 at 9:40 AM, René Scharfe rene.scha...@lsrfire.ath.cx wrote: Am 30.05.2013 14:04, schrieb Felipe Contreras: On Thu, May 30, 2013 at 6:34 AM, René Scharfe rene.scha...@lsrfire.ath.cx wrote: The merge functions duplicate entries as needed and they don't free them. Release them in unpack_nondirectories, the same function where they were allocated, after we're done. Ah, you beat me to this change, but.. @@ -600,9 +600,14 @@ static int unpack_nondirectories(int n, unsigned long mask, src[i + o-merge] = create_ce_entry(info, names + i, stage); } - if (o-merge) - return call_unpack_fn((const struct cache_entry * const *)src, - o); + if (o-merge) { + int rc = call_unpack_fn((const struct cache_entry * const *)src, + o); + for (i = 1; i = n; i++) + if (src[i] src[i] != o-df_conflict_entry) + free(src[i]); Doesn't it make more sense to follow the code above and do src[i + o-merge]? Not sure I understand. Is the goal to avoid confusion for code readers by using the same indexing method for allocation and release? Or are you worried about o-merge having a different value than 1 in that loop? Both. In particular I'm eyeballing the code you can even see in this patch: src[i + o-merge] = create_ce_entry(info, names + i, stage); If you think it's better to use src[i], then I think the code above should do the same. -- Felipe Contreras -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [git-users] Highlevel (but simple to implement) commands provided by default for git
On Thu, May 30, 2013 at 9:54 AM, Jonathan Nieder jrnie...@gmail.com wrote: Felipe Contreras wrote: On Thu, May 30, 2013 at 12:23 AM, Jonathan Nieder jrnie...@gmail.com wrote: Felipe Contreras wrote: On Wed, May 29, 2013 at 6:43 PM, Jonathan Nieder jrnie...@gmail.com wrote: A bigger problem (in my opinion) with allowing arbitrary changes to the meaning of existing commands is that scripts, whether placed in .sh files or given as commands to run over IRC, stop working altogether. It's nice to have commands like git log and git am mean the same thing no matter what machine I am on. Except that's not true: It's not true that my opinion is that a bigger problem than the non-problem Ram mentioned with allowing arbitrary changes to the meaning of existing commands is that scripts stop working reliably? It's not true what you said: commands like git log and git am mean the same thing no matter what machine I am on. It's not true that it's nice when they do? Yeah, it's nice that the sun is purple. Never-mind the fact that it's not true. The consistency you experience across machines has absolutely nothing to do with Git, since Git can be configured in a way you don't consider nice. So this argument is invalid. Any proposed change to make Git more configurable is not affected by this argument, because Git can *already* be configured in a way that would break your experience, yet it doesn't happen. In other words; it's the policy or your machine users you have to thank for, not Git's code, and changing Git's code is not going to change that policy. Either way this is a straw man, again, nobody is pushing to allow builtins to be overridable. The topic is default *aliases*. -- Felipe Contreras -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor performance of git describe in big repos
Alex Bennée kernel-hac...@bennee.com writes: 41.58% git libcrypto.so.1.0.0 [.] sha1_block_data_order_ssse3 33.62% git libz.so.1.2.3.4 [.] inflate_fast 10.39% git libz.so.1.2.3.4 [.] adler32 2.03% git [kernel.kallsyms] [k] clear_page_c Do you have any large blobs in the repo that are referenced directly by a tag? Because this just so happens to exactly reproduce your symptoms: # in a random git.git $ time git describe --debug [...] real0m0.390s user0m0.037s sys 0m0.011s $ git tag big1 $(dd if=/dev/urandom bs=1M count=512 | git hash-object -w --stdin) 512+0 records in 512+0 records out 536870912 bytes (537 MB) copied, 45.5088 s, 11.8 MB/s $ time git describe --debug [...] real0m1.875s user0m1.738s sys 0m0.129s $ git tag big2 $(dd if=/dev/urandom bs=1M count=512 | git hash-object -w --stdin) 512+0 records in 512+0 records out 536870912 bytes (537 MB) copied, 44.972 s, 11.9 MB/s $ time git describe --debugsuche zur Beschreibung von HEAD [...] real0m3.620s user0m3.357s sys 0m0.248s (I actually ran the git-describe invocations more than once to ensure that they are again cache-hot.) git-describe should probably be fixed to avoid loading blobs, though I'm not sure off hand if we have any infrastructure to infer the type of a loose object without inflating it. (This could probably be added by inflating only the first block.) We do have this for packed objects, so at least for packed repos there's a speedup to be had. -- Thomas Rast trast@{inf,student}.ethz.ch -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: t0008-ignores failure (was: [msysGit] Git for Windows 1.8.3)
On 30 May 2013 16:15, Johannes Schindelin johannes.schinde...@gmx.de wrote: Hi Karsten, On Thu, 30 May 2013, Karsten Blees wrote: Am 25.05.2013 21:16, schrieb Pat Thoyts: On that note -- with this merge as it now stands I get the following test failures: t0008-ignores.sh 155, 158, 162, 164 These tests fail because they use absolute paths, e.g. C:/.../global-excludes, which is then translated to CNUL/.../global-excludes. Can be fixed like so: --- 8 --- --- a/t/t0008-ignores.sh +++ b/t/t0008-ignores.sh @@ -5,7 +5,7 @@ test_description=check-ignore . ./test-lib.sh init_vars () { - global_excludes=$(pwd)/global-excludes + global_excludes=global-excludes } enable_global_excludes () { --- Since I do not have time for the lengthy, undirected discussion upstream seems to want to start, let's make your change, but only conditional on MINGW? Ciao, Dscho I was just testing this -- I've already wrapped the suggested fix within a test_have_prereq MINGW for our fork and committed it. This was an issue partly because was alias pwd to pwd -W and so always get Windows paths. It means the test here doesn't check absolute paths but I think we can live with that. I tried using $(builtin pwd) to avoid the -W but it didn't help and I still got C: style paths. I also grabbed Karsten's patch dir.c: fix ignore processing within not-ignored directories as this appears to deal with a .gitignore regression in 1.8.3. We can carry this until the next merge with upstream. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] git-gui: fix file name handling with non-empty prefix
In the hope that the Pat Thoyts who just posted in another thread from a GMail address is the same one that maintains git-gui, let's see if that address works... On Sat, May 11, 2013 at 10:03:25PM -0400, Andrew Wong wrote: Sorry for the late reply. I was able to reproduce the problem that you were describing a while ago. And your patch indeed fixes it. It's a much more elegant way of dealing with the absolute vs relative path problem that I was trying to fix. Thanks! As for Pat, I'm not sure wha'ts going on with his email address. It was working back in October, and his username still seems to be active over at SourceForge... let's see if this email reaches him. Here's a link for his reference just in case he missed your original email: http://thread.gmane.org/gmane.comp.version-control.git/222646 On 04/27/13 10:18, John Keeping wrote: I got a bounce with 550 no such user for Pat's email address when sending this. Does anyone have more up-to-date contact details? Or is it just SourceForge being broken? On Sat, Apr 27, 2013 at 02:24:16PM +0100, John Keeping wrote: Commit e3d06ca (git-gui: Detect full path when parsing arguments - 2012-10-02) fixed the handling of absolute paths passed to the browser and blame subcommands by checking whether the file exists without the prefix before prepending the prefix and checking again. Since we have chdir'd to the top level of the working tree before doing this, this does not work if a file with the same name exists in a subdirectory and at the top level (for example Makefile in git.git's t/ directory). Instead of doing this, revert that patch and fix absolute path issue by using file join to prepend the prefix to the supplied path. This will correctly handle absolute paths by skipping the prefix in that case. Signed-off-by: John Keeping j...@keeping.me.uk --- git-gui.sh | 14 +++--- 1 file changed, 3 insertions(+), 11 deletions(-) diff --git a/git-gui.sh b/git-gui.sh index e11..a94ad7f 100755 --- a/git-gui.sh +++ b/git-gui.sh @@ -3003,19 +3003,11 @@ blame { set jump_spec {} set is_path 0 foreach a $argv { - if {[file exists $a]} { - if {$path ne {}} usage - set path [normalize_relpath $a] - break - } elseif {[file exists $_prefix$a]} { - if {$path ne {}} usage - set path [normalize_relpath $_prefix$a] - break - } + set p [file join $_prefix $a] - if {$is_path} { + if {$is_path || [file exists $p]} { if {$path ne {}} usage - set path [normalize_relpath $_prefix$a] + set path [normalize_relpath $p] break } elseif {$a eq {--}} { if {$path ne {}} { -- 1.8.3.rc0.149.g98a72f2.dirty -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor performance of git describe in big repos
Alex Bennée kernel-hac...@bennee.com writes: On 30 May 2013 16:33, Thomas Rast tr...@inf.ethz.ch wrote: Alex Bennée kernel-hac...@bennee.com writes: 41.58% git libcrypto.so.1.0.0 [.] sha1_block_data_order_ssse3 33.62% git libz.so.1.2.3.4 [.] inflate_fast 10.39% git libz.so.1.2.3.4 [.] adler32 2.03% git [kernel.kallsyms] [k] clear_page_c Do you have any large blobs in the repo that are referenced directly by a tag? Most probably. I've certainly done a bunch of releases (which are tagged) were the last thing that was updated was an FPGA image. [...] git-describe should probably be fixed to avoid loading blobs, though I'm not sure off hand if we have any infrastructure to infer the type of a loose object without inflating it. (This could probably be added by inflating only the first block.) We do have this for packed objects, so at least for packed repos there's a speedup to be had. Will it be loading the blob for every commit it traverses or just ones that hit a tag? Why does it need to load the blob at all? Surely the commit tree state doesn't need to be walked down? No, my theory is that you tagged *the blobs*. Git supports this. git-describe needs to look at the commit (if any) obtained by peeling each tag (i.e. dereferencing tags until it reaches a non-tag). So to do that, it resolves the tag's referent and loads it. Usually this will be a commit, in which case it is marked as reached by the tag. As my example shows, it also resolves tags' referents if they refer to non-commits, in particular, it will decompress large blobs that are (directly) referenced by a tag. Note that while annotated tags provide the type information themselves, e.g. $ git cat-file tag junio-gpg-pub object fe113d3f96636710600c6b02d5fd421fa7e87dd6 type blob tag junio-gpg-pub [...] unannotated tags are simply refs, so it is not enough to just look at the tag objects' referent type. I had a brief look around sha1_file.c, in particular sha1_object_info, and it turns out we lack the deflate only early part logic as I suspected. So that'll have to be fixed first. After that I *think* it should automatically carry over into the tag readers. -- Thomas Rast trast@{inf,student}.ethz.ch -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Should git help respect the 'pager' setting?
I have my global git config pager set to 'cat', but when I do a git help command, it still uses a pager. This is especially irksome in emacs shell buffers, where I am most of the time. I know I can do a M-x man - git-whatever, but wondered if this was a bug or user error. (git --no-pager help command does the same.) 12:31 [mcampbell] /tmp % git --no-pager help log WARNING: terminal is not fully functional - (press RETURN) -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
preventing evil merges
Hey all, I've be burnt by what someone on IRC referred to as evil merges, that is loss of history after amending a merge commit: git merge anotherbranch git add something git commit --amend After the steps above the addition of something can't be found in the history anymore, but the file is there. Moreover (I think but didn't try to reproruce) a further rebase to a common ancestore of my working branch and anotherbranch resulted in further loss of the changes. The only way for me to get them back has been by checking out from reflog. I've no idea about what was going on but this experience reminded me of another one I had in the past in which we could not figure out when some changes were added into a repository (!). If amending a merge is so dangerous, would it make sense to require and hard-to-type switch in order for it to really do anything ? --strk; -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Should git help respect the 'pager' setting?
Michael Campbell michael.campb...@gmail.com writes: I have my global git config pager set to 'cat', but when I do a git help command, it still uses a pager. This is especially irksome in emacs shell buffers, where I am most of the time. I know I can do a M-x man - git-whatever, but wondered if this was a bug or user error. (git --no-pager help command does the same.) git help foo just calls man git-foo by default, so what happens is the same as if you called man git-foo by hand. Git does not have much control over what man will do, it could probably call man -P $pager when the Git pager is set, but I'd find it a bit weird. If you're an Emacs user, you can read about man.viewer and set it to woman, or set PAGER=cat when inside Emacs. I personally run M-x git-foo RET, and never run git help. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Should git help respect the 'pager' setting?
Ramkumar Ramachandra artag...@gmail.com writes: It just needs to set $PAGER or $MANPAGER before the exec(), no? Yes, that should do the same as man -P. I would argue that it should do this. $GIT_PAGER works everywhere else, but obviously man has no knowledge about it. I find it a bit weird that Git sets the configuration for external commands, but it may make sense. No strong opinion here. If you're an Emacs user, you can read about man.viewer and set it to woman, or set PAGER=cat when inside Emacs. I just learnt about man.viewer. There's a small problem with it though: why is there no option for Emacs man corresponding to Emacs woman? I guess because no one implemented it ;-). I personally run M-x git-foo RET, and never run git help. M-x man git-foo RET, you mean? Yes, sorry. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Should git help respect the 'pager' setting?
Matthieu Moy wrote: I find it a bit weird that Git sets the configuration for external commands, but it may make sense. No strong opinion here. I don't mean a setenv() kind of thing: how would we unset it after that? Perhaps something like execvpe(), passing in the environment as an argument? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Should git help respect the 'pager' setting?
On Thu, May 30, 2013 at 10:38:59PM +0530, Ramkumar Ramachandra wrote: Matthieu Moy wrote: I find it a bit weird that Git sets the configuration for external commands, but it may make sense. No strong opinion here. I don't mean a setenv() kind of thing: how would we unset it after that? Perhaps something like execvpe(), passing in the environment as an argument? Overriding PAGER might make sense, but I'd be quite annoyed if Git decided to override MANPAGER without providing some way to override it. If a user sets MANPAGER then it's because they want a specific pager when reading man pages - invoking man through git help shouldn't cause it to behave differently in this case. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: t0008-ignores failure (was: [msysGit] Git for Windows 1.8.3)
Hi Pat, On Thu, 30 May 2013, Pat Thoyts wrote: On 30 May 2013 16:15, Johannes Schindelin johannes.schinde...@gmx.de wrote: On Thu, 30 May 2013, Karsten Blees wrote: Am 25.05.2013 21:16, schrieb Pat Thoyts: On that note -- with this merge as it now stands I get the following test failures: t0008-ignores.sh 155, 158, 162, 164 These tests fail because they use absolute paths, e.g. C:/.../global-excludes, which is then translated to CNUL/.../global-excludes. Can be fixed like so: --- 8 --- --- a/t/t0008-ignores.sh +++ b/t/t0008-ignores.sh @@ -5,7 +5,7 @@ test_description=check-ignore . ./test-lib.sh init_vars () { - global_excludes=$(pwd)/global-excludes + global_excludes=global-excludes } enable_global_excludes () { --- Since I do not have time for the lengthy, undirected discussion upstream seems to want to start, let's make your change, but only conditional on MINGW? I was just testing this -- I've already wrapped the suggested fix within a test_have_prereq MINGW for our fork and committed it. This was an issue partly because was alias pwd to pwd -W and so always get Windows paths. It means the test here doesn't check absolute paths but I think we can live with that. I tried using $(builtin pwd) to avoid the -W but it didn't help and I still got C: style paths. I also grabbed Karsten's patch dir.c: fix ignore processing within not-ignored directories as this appears to deal with a .gitignore regression in 1.8.3. We can carry this until the next merge with upstream. Thanks! Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor performance of git describe in big repos
The culprit, according to some callgrind investigation, is lookup_commit_reference_gently() [for the unannotated case] or deref_tag() [annotated case] calling parse_object(). Using the scenario you described earlier, I think it ends-up spending most of its time in check_sha1_signature (both deref_tag and lookup_commit_reference_gently() go there) with 20% inflating, 80% in SHA1_Update(). Not much we can do about that, can we ? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (May 2013, #09; Wed, 29)
Am 30.05.2013 01:58, schrieb Junio C Hamano: * jl/submodule-mv (2013-04-23) 5 commits (merged to 'next' on 2013-04-23 at c04f574) + submodule.c: duplicate real_path's return value (merged to 'next' on 2013-04-19 at 45ae3c9) + rm: delete .gitmodules entry of submodules removed from the work tree + Teach mv to update the path entry in .gitmodules for moved submodules + Teach mv to move submodules using a gitfile + Teach mv to move submodules together with their work trees git mv A B when moving a submodule A does the right thing, inclusing relocating its working tree and adjusting the paths in the .gitmodules file. There are only two issues I'm aware of: *) When the .gitmodules file is already modified but unchanged running rm or mv on a submodule will stage those changes too. *) There is a harmless but unnecessary double invocation of strlen() in the function (fixed by the diff below). I plan to fix the first issue in another patch which would also get rid of the second issue, as exactly that code would have to be touched anyways. Does that make sense? --8- diff --git a/submodule.c b/submodule.c index edfc23c..4670af7 100644 --- a/submodule.c +++ b/submodule.c @@ -102,7 +102,7 @@ void stage_updated_gitmodules(void) struct cache_entry *ce; int namelen = strlen(.gitmodules); - pos = cache_name_pos(.gitmodules, strlen(.gitmodules)); + pos = cache_name_pos(.gitmodules, namelen); if (pos 0) { warning(_(could not find .gitmodules in index)); return; -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (May 2013, #09; Wed, 29)
Am 30.05.2013 01:58, schrieb Junio C Hamano: * jk/submodule-subdirectory-ok (2013-04-24) 3 commits (merged to 'next' on 2013-04-24 at 6306b29) + submodule: fix quoting in relative_path() (merged to 'next' on 2013-04-22 at f211e25) + submodule: drop the top-level requirement + rev-parse: add --prefix option Allow various subcommands of git submodule to be run not from the top of the working tree of the superproject. The summary and status commands are looking good in this version (they are now showing the submodule directory paths relative to the current directory). Apart from that my other remarks from gmane $221575 still seem to apply. And this series has only tests for status, summary and add (and that just with an absolute URL), I'd rather like to see a test for each submodule command (and a relative add to) to document the desired behavior. But I'm not sure if it's better to have another iteration of this series or to address the open issues a follow-up series. Having status, summary and add - at least with absolute URLs - lose the toplevel requirement is already a huge improvement IMO. Opinions? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 22/25] string_list_add_refs_by_glob(): add a comment about memory management
On 05/29/2013 10:21 AM, Thomas Rast wrote: Michael Haggerty mhag...@alum.mit.edu writes: Since string_list_add_one_ref() adds refname to the string list, but the lifetime of refname is limited, it is important that the string_list passed to string_list_add_one_ref() has strdup_strings set. Document this fact. All current callers do the right thing. [...] +/* + * The list argument must have strdup_strings set on it. + */ void string_list_add_refs_by_glob(struct string_list *list, const char *glob) { if (has_glob_specials(glob)) { Maybe add an assert() so that this is bulletproof? Good idea. Will be added in v3. Michael -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Poor performance of git describe in big repos
On Thu, May 30, 2013 at 06:21:55PM +0200, Thomas Rast wrote: Alex Bennée kernel-hac...@bennee.com writes: On 30 May 2013 16:33, Thomas Rast tr...@inf.ethz.ch wrote: Alex Bennée kernel-hac...@bennee.com writes: 41.58% git libcrypto.so.1.0.0 [.] sha1_block_data_order_ssse3 33.62% git libz.so.1.2.3.4 [.] inflate_fast 10.39% git libz.so.1.2.3.4 [.] adler32 2.03% git [kernel.kallsyms] [k] clear_page_c Do you have any large blobs in the repo that are referenced directly by a tag? Most probably. I've certainly done a bunch of releases (which are tagged) were the last thing that was updated was an FPGA image. [...] git-describe should probably be fixed to avoid loading blobs, though I'm not sure off hand if we have any infrastructure to infer the type of a loose object without inflating it. (This could probably be added by inflating only the first block.) We do have this for packed objects, so at least for packed repos there's a speedup to be had. Will it be loading the blob for every commit it traverses or just ones that hit a tag? Why does it need to load the blob at all? Surely the commit tree state doesn't need to be walked down? No, my theory is that you tagged *the blobs*. Git supports this. You can see if that is the case by doing something like this: eval $(git for-each-ref --shell --format ' test $(git cat-file -t %(objectname)^{}) = commit || echo %(refname);') That will print out the name of any ref that doesn't point at a commit. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes
On 05/29/2013 10:25 AM, Thomas Rast wrote: Michael Haggerty mhag...@alum.mit.edu writes: I read the entire series on Monday, and give it an Ack at maybe 90% confidence level -- sorry, I was short on caffeine and sleep ;-) Thanks very much. I'll buy you a coffee the next time I see you :-) Michael -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] lookup_commit_reference_gently: do not read non-{tag,commit}
lookup_commit_reference_gently unconditionally parses the object given to it. This slows down git-describe a lot if you have a repository with large tagged blobs in it: parse_object() will read the entire blob and verify that its sha1 matches, only to then throw it away. Speed it up by checking the type with sha1_object_info() prior to unpacking. The reason that deref_tag() does not need the same fix is a bit subtle: parse_tag_buffer() does not fill the 'tagged' member of the tag struct if the tagged object is a blob. Reported-by: Alex Bennée kernel-hac...@bennee.com Signed-off-by: Thomas Rast tr...@inf.ethz.ch --- commit.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/commit.c b/commit.c index 888e02a..00e8d4a 100644 --- a/commit.c +++ b/commit.c @@ -31,8 +31,12 @@ static struct commit *check_commit(struct object *obj, struct commit *lookup_commit_reference_gently(const unsigned char *sha1, int quiet) { - struct object *obj = deref_tag(parse_object(sha1), NULL, 0); - + struct object *obj; + int type = sha1_object_info(sha1, NULL); + /* If it's neither tag nor commit, parsing the object is wasted effort */ + if (type != OBJ_TAG type != OBJ_COMMIT) + return NULL; + obj = deref_tag(parse_object(sha1), NULL, 0); if (!obj) return NULL; return check_commit(obj, sha1, quiet); -- 1.8.3.506.g4fdeee5 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] sha1_file: silence sha1_loose_object_info
sha1_object_info() returns -1 (OBJ_BAD) if it cannot find the object for some reason, which suggests that it wants the _caller_ to report this error. However, part of its work happens in sha1_loose_object_info, which _does_ report errors itself. This is doubly strange because: * packed_object_info(), which is the other half of the duo, does _not_ report this. * In the event that an object is packed and pruned while sha1_object_info_extended() goes looking for it, we would erroneously show the error -- even though the code of the latter function purports to handle this case gracefully. * A caller might invoke sha1_object_info() to find the type of an object even if that object is not known to exist. Silence this error. The others remain untouched as a corrupt object is a much more grave error than it merely being absent. Signed-off-by: Thomas Rast tr...@inf.ethz.ch --- sha1_file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sha1_file.c b/sha1_file.c index 67e815b..c0f6a0e 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2348,7 +2348,7 @@ static int sha1_loose_object_info(const unsigned char *sha1, unsigned long *size map = map_sha1_file(sha1, mapsize); if (!map) - return error(unable to find %s, sha1_to_hex(sha1)); + return -1; if (unpack_sha1_header(stream, map, mapsize, hdr, sizeof(hdr)) 0) status = error(unable to unpack %s header, sha1_to_hex(sha1)); -- 1.8.3.506.g4fdeee5 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 11/25] object_array_remove_duplicates(): rewrite to reduce copying
On 05/29/2013 06:18 PM, Junio C Hamano wrote: Michael Haggerty mhag...@alum.mit.edu writes: The old version copied one entry to its destination position, then deleted any matching entries from the tail of the array. This required the tail of the array to be copied multiple times. It didn't affect the complexity of the algorithm because the whole tail has to be searched through anyway. But all the copying was unnecessary. Instead, check for the existence of an entry with the same name in the *head* of the list before copying an entry to its final position. This way each entry has to be copied at most one time. Extract a helper function contains_name() to do a bit of the work. Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- object.c | 32 +--- object.h | 6 +- 2 files changed, 26 insertions(+), 12 deletions(-) diff --git a/object.c b/object.c index fcd4a82..10b5349 100644 --- a/object.c +++ b/object.c @@ -294,22 +294,32 @@ void object_array_filter(struct object_array *array, array-nr = dst; } +/* + * Return true iff array already contains an entry with name. + */ +static int contains_name(struct object_array *array, const char *name) +{ +unsigned nr = array-nr, i; +struct object_array_entry *object = array-objects; + +for (i = 0; i nr; i++, object++) +if (!strcmp(object-name, name)) +return 1; +return 0; +} Because some codepaths (e.g. patch 14/25) stuff NULL in the name field, we may want to be more careful with this. This is not a new problem, and I think the longer term solution is to get rid of object_array_remove_duplicates(), so it is perfectly fine to leave this function broken with respect to NULL input as-is. You make a good point about NULL names, but I agree to leave it for now since it needs work anyway. The only caller of remove-duplicates is bundle.c, which gets many starting points and end points from the command line and tries to be nice by removing obvious duplicates, e.g. git bundle create t.bundle master master but I think its logic of deduping is wrong. It runs dwim_ref() on the incoming refs after the remove-duplicates call, so git bundle create t.bundle master heads/mater will end up with two copies of refs/heads/master. To fix it, the code must dedup the result of running dwim_ref(), and at that point, there is no reason to call object_array_remove_duplicates(). That sounds reasonable. I poked around this code a bit to understand what is going on, and it occurred to me that the object_array can include both positive and negative references, right? And yet object_array_remove_duplicates() only considers names, not flags. So it seems to me that if the deduping code would see the same reference twice, once positive and once negative, then it would throw an arbitrary one of them out, which would be wrong. But I couldn't provoke this situation, so perhaps setup_revisions() already specially treats combinations like master ^master? (If that's true then why? and wouldn't it get confused by master ^heads/master?) I suppose someday I will have to dig into these functions and maybe even document them. Michael -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] lookup_commit_reference_gently: do not read non-{tag,commit}
On Thu, May 30, 2013 at 10:00:23PM +0200, Thomas Rast wrote: lookup_commit_reference_gently unconditionally parses the object given to it. This slows down git-describe a lot if you have a repository with large tagged blobs in it: parse_object() will read the entire blob and verify that its sha1 matches, only to then throw it away. Speed it up by checking the type with sha1_object_info() prior to unpacking. This would speed up the case where we do not end up looking at the object at all, but it will slow down the (presumably common) case where we will in fact find a commit and end up parsing the object anyway. Have you measured the impact of this on normal operations? During a traversal, we spend a measurable amount of time looking up commits in packfiles, and this would presumably double it. This is not the first time I have seen this tradeoff in git. It would be nice if our object access was structured to do incremental examination of the objects (i.e., store the packfile index lookup or partial unpack of a loose object header, and then use that to complete the next step of actually getting the contents). -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 24/25] register_ref(): make a copy of the bad reference SHA-1
On 05/29/2013 06:53 PM, Junio C Hamano wrote: Michael Haggerty mhag...@alum.mit.edu writes: [...] +current_bad_sha1 = xmalloc(20); +hashcpy(current_bad_sha1, sha1); This, together with 18/25, may hint that we want hashdup()? I thought about it, but was surprised that I could only find one or two other places in the existing code that would use such a function. But sure, I would be happy to submit a patch. I think hashdup() needn't be inline, so the definition can't go with its cousins in cache.h. There is no cache.c. So where would be the best place to define hashdup()? object.c? sha1_name.c? While I'm at it, I think it would be nice to have constants like #define SHA1_LEN 20 #define SHA1_HEX_LEN 40 and start using those instead of magic numbers. Any objections (or suggestions for better names)? Michael -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 FIXUP 22/25] fixup! string_list_add_refs_by_glob(): add a comment about memory management
Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- Junio, would you mind squashing this patch onto mh/reflife 22/25? notes.c | 1 + 1 file changed, 1 insertion(+) diff --git a/notes.c b/notes.c index 602d956..b69c0b8 100644 --- a/notes.c +++ b/notes.c @@ -932,6 +932,7 @@ static int string_list_add_one_ref(const char *refname, const unsigned char *sha */ void string_list_add_refs_by_glob(struct string_list *list, const char *glob) { + assert(list-strdup_strings); if (has_glob_specials(glob)) { for_each_glob_ref(string_list_add_one_ref, glob, list); } else { -- 1.8.3 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 24/25] register_ref(): make a copy of the bad reference SHA-1
From: Michael Haggerty mhag...@alum.mit.edu Sent: Thursday, May 30, 2013 10:51 PM On 05/29/2013 06:53 PM, Junio C Hamano wrote: Michael Haggerty mhag...@alum.mit.edu writes: [...] + current_bad_sha1 = xmalloc(20); + hashcpy(current_bad_sha1, sha1); This, together with 18/25, may hint that we want hashdup()? I thought about it, but was surprised that I could only find one or two other places in the existing code that would use such a function. But sure, I would be happy to submit a patch. I think hashdup() needn't be inline, so the definition can't go with its cousins in cache.h. There is no cache.c. So where would be the best place to define hashdup()? object.c? sha1_name.c? While I'm at it, I think it would be nice to have constants like #define SHA1_LEN 20 #define SHA1_HEX_LEN 40 and start using those instead of magic numbers. Any objections (or suggestions for better names)? The first named constant should be fully qualified to the same extent as the second, perhaps: #define SHA1_BYTE_LEN 20 and perhaps with an alternate of (though HEX is just as good): #define SHA1_CHAR_LEN 40 Michael -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ -- Philip -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html