Re: [GSoC] Move ~/.git-credential-cache to ~/.cache/git

2017-03-10 Thread Junio C Hamano
Devin Lehmacher  writes:

> If I’m not mistaken magit won’t stop working with the changed
> location since it will just spawn a new instance of the
> daemon. The only downside would be that it wouldn’t get credentials
> that were cached in the default socket.

I am not quite sure how you can say "only" in that sentence.  Isn't
the whole point of the socket-based daemon interface to start the
daemon once so that later invocations can keep using it?

Somebody upthread mentioned checking the current location (and using
it if it exists) before falling back to the new location, which I
found a more reasonable approach.

Assuming that it is sensible to move it from ~/.git-credential-cache
to ~/.cache/git/ in the first place, that is.  If both are equally
acceptable places, then perhaps a configuration that allows those
who want to have things in ~/.config/git/ to specify where to have
the thing (and those without such a custom configuration will keep
using the current location) may be more appropriate.
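For what it's worth, a per-user override is already possible without any
new code, because the cache helper accepts an explicit socket path; a
minimal sketch of opting into the XDG-style location today:

    # tell the cache helper where to put its socket
    git config --global credential.helper \
        'cache --socket ~/.cache/git/credential/socket'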


Re: [GSoC] Move ~/.git-credential-cache to ~/.cache/git

2017-03-10 Thread Devin Lehmacher
If I’m not mistaken magit won’t stop working with the changed location since it 
will just spawn a new instance of the daemon. The only downside would be that it 
wouldn’t get credentials that were cached in the default socket.

I am going to move forward with git-credential-cache just using the new 
location at
`~/.cache/git/credential/socket` and submit a patch and then get more feedback 
on the patch.
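(Presumably the new default would also honor the usual XDG override; in
shell terms, something like the following -- this is my assumption, not
part of the patch itself:

    # assumed fallback logic, following the XDG base directory convention
    socket="${XDG_CACHE_HOME:-$HOME/.cache}/git/credential/socket"
)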

Devin Lehmacher


Re: [GSoC] Move ~/.git-credential-cache to ~/.cache/git

2017-03-10 Thread Noam Postavsky
On Fri, Mar 10, 2017 at 7:26 PM, Jonathan Nieder  wrote:
> I find that magit does rely on the socket path.
>
> Start credential daemon on magit-credential-hook
>
> If we let git start the daemon, Emacs will send a SIGHUP when git
> finishes and closes the pty, killing the daemon.  Hence the need to have
> our own daemon running first.

Magit doesn't really care about the particular path, but it does call
git-credential-cache--daemon directly which needs to be told where to
create the socket. I guess Magit could just default to using
~/.cache/git/socket if ~/.cache/git exists? (Magit users can override
the path regardless)


Re: git-clone --config order & fetching extra refs during initial clone

2017-03-10 Thread SZEDER Gábor
On Mon, Feb 27, 2017 at 10:12 PM, Jeff King  wrote:

> I didn't actually review it very carefully before, but I'll do so now
> (spoiler: a few nits, but it looks fine).
>
>>  static struct ref *wanted_peer_refs(const struct ref *refs,
>> - struct refspec *refspec)
>> + struct refspec *refspec, unsigned int refspec_count)
>
> Most of the changes here and elsewhere are just about passing along
> multiple refspecs instead of a single, which makes sense.

The new parameter should perhaps be renamed to 'refspec_nr', though,
as '_nr' suffix seems to be more common in the codebase than '_count'.

>> @@ -856,7 +861,9 @@ int cmd_clone(int argc, const char **argv, const char 
>> *prefix)
>>   int submodule_progress;
>>
>>   struct refspec *refspec;
>> - const char *fetch_pattern;
>> + unsigned int refspec_count = 1;
>> + const char **fetch_patterns;
>> + const struct string_list *config_fetch_patterns;
>
> This "1" seems funny to up here by itself, as it must be kept in sync
> with the later logic that feeds exactly one non-configured refspec into
> our list. The current code isn't wrong, but it would be nice to have it
> all together. I.e., replacing:
>
>> + if (config_fetch_patterns)
>> + refspec_count = 1 + config_fetch_patterns->nr;
>> + fetch_patterns = xcalloc(refspec_count, sizeof(*fetch_patterns));
>> + fetch_patterns[0] = value.buf;
>
> with:
>
>   refspec_count = 1;
>   if (config_fetch_patterns)
> refspec_count += config_fetch_patterns->nr;
>   ...

Agreed.

> Though if I'm bikeshedding, I'd probably have written the whole thing
> with an argv_array and avoided counting at all.

Yeah, I did notice that you love argv_array :)  I had to raise an
eyebrow recently while looking at send-pack and how it uses argv_array
to read refspecs from stdin into an array.  I think argv_array feels a
bit out of place in both cases.  Yes, it does exactly what's needed.
However, it's called *argv*_array, indicating that its contents are
destined to become the options of some command.  But that's not the
case here; in neither place are we dealing with arguments to a
command, these are just arrays of strings.

However, leveraging remote_get() makes it moot anyway.

>> + refspec = parse_fetch_refspec(refspec_count, fetch_patterns);
>>
>> + strbuf_reset();
>>   strbuf_reset();
>>
>>   remote = remote_get(option_origin);
>
> I do also notice that right _after_ this parsing, we use remote_get(),
> which is supposed to give us this config anyway. Which makes me wonder
> if we could just reorder this to put remote_get() first, and then read
> the resulting refspecs from remote->fetch.

Though remote_get() does give us this config, at this point the
default fetch refspec has not been configured yet, therefore it's not
included in the resulting remote->fetch array.  The default refspec is
set in write_refspec_config(), but that's called only about two
screenfuls later.  So there is a bit of extra work to do in order to
leverage remote_get()'s parsing.

I think the simplest way is to keep parsing the default fetch refspec
manually, and then append it to the remote->fetch array.  Definitely
shorter and simpler than that parsing in the current patch.
Alternatively, we could set the default fetch refspec in the
configuration temporarily, only for the duration of the get_remote()
call, but it feels a bit too hackish...

>> diff --git a/t/t5611-clone-config.sh b/t/t5611-clone-config.sh
>> index e4850b778c..3bed17783b 100755
>> --- a/t/t5611-clone-config.sh
>> +++ b/t/t5611-clone-config.sh
>> @@ -37,6 +37,30 @@ test_expect_success 'clone -c config is available during 
>> clone' '
>>   test_cmp expect child/file
>>  '
>>
>> +test_expect_success 'clone -c remote.origin.fetch= works' '
>> + rm -rf child &&
>> + git update-ref refs/grab/it refs/heads/master &&
>> + git update-ref refs/keep/out refs/heads/master &&
>> + git clone -c "remote.origin.fetch=+refs/grab/*:refs/grab/*" . child &&
>> + (
>> + cd child &&
>> + git for-each-ref --format="%(refname)" refs/grab/ >../actual
>> + ) &&
>> + echo refs/grab/it >expect &&
>> + test_cmp expect actual
>> +'
>> +
>> +test_expect_success 'git -c remote.origin.fetch= clone works' '
>> + rm -rf child &&
>> + git -c "remote.origin.fetch=+refs/grab/*:refs/grab/*" clone . child &&
>> + (
>> + cd child &&
>> + git for-each-ref --format="%(refname)" refs/grab/ >../actual
>> + ) &&
>> + echo refs/grab/it >expect &&
>> + test_cmp expect actual
>> +'
>
> These look reasonable. Using "git -C for-each-ref" would save a
> subshell, but that's minor.

Losing a subshell is good, but I subjectively like how the
indentation highlights that the command operates on a different
repository.
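For reference, the subshell-free form Peff mentions would look roughly
like this:

    git -C child for-each-ref --format="%(refname)" refs/grab/ >actual &&
    echo refs/grab/it >expect &&
    test_cmp expect actual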

However, the tests should also check that refs matching the default
fetch refspec are transferred, too, i.e. that 

Re: [PATCH v2 1/2] pathspec: allow querying for attributes

2017-03-10 Thread Brandon Williams
On 03/10, Jonathan Tan wrote:
> Thanks - I don't think I have any more comments on this patch set
> after these.
> 
> On 03/10/2017 10:59 AM, Brandon Williams wrote:
> >diff --git a/pathspec.c b/pathspec.c
> >index b961f00c8..7cd5f6e3d 100644
> >--- a/pathspec.c
> >+++ b/pathspec.c
> >@@ -87,6 +89,74 @@ static void prefix_magic(struct strbuf *sb, int 
> >prefixlen, unsigned magic)
> > strbuf_addf(sb, ",prefix:%d)", prefixlen);
> > }
> >
> >+static void parse_pathspec_attr_match(struct pathspec_item *item, const 
> >char *value)
> >+{
> >+struct string_list_item *si;
> >+struct string_list list = STRING_LIST_INIT_DUP;
> >+
> >+if (item->attr_check)
> >+die(_("Only one 'attr:' specification is allowed."));
> >+
> >+if (!value || !*value)
> >+die(_("attr spec must not be empty"));
> >+
> >+string_list_split(&list, value, ' ', -1);
> >+string_list_remove_empty_items(&list, 0);
> >+
> >+item->attr_check = attr_check_alloc();
> >+ALLOC_GROW(item->attr_match,
> >+   list.nr,
> >+   item->attr_match_alloc);
> 
> If item->attr_match always starts empty, then I think an xmalloc or
> xcalloc suffices (and we don't need item->attr_match_alloc anymore).
> 
> We should probably also check item->attr_match above - that is, `if
> (item->attr_check || item->attr_match)`.

Correct, I'll make these changes.

> 
> >+
> >+for_each_string_list_item(si, &list) {
> >+size_t attr_len;
> >+char *attr_name;
> >+const struct git_attr *a;
> >+
> >+int j = item->attr_match_nr++;
> >+const char *attr = si->string;
> >+struct attr_match *am = &item->attr_match[j];
> >+
> >+switch (*attr) {
> >+case '!':
> >+am->match_mode = MATCH_UNSPECIFIED;
> >+attr++;
> >+attr_len = strlen(attr);
> >+break;
> >+case '-':
> >+am->match_mode = MATCH_UNSET;
> >+attr++;
> >+attr_len = strlen(attr);
> >+break;
> >+default:
> >+attr_len = strcspn(attr, "=");
> >+if (attr[attr_len] != '=')
> >+am->match_mode = MATCH_SET;
> >+else {
> >+am->match_mode = MATCH_VALUE;
> >+am->value = xstrdup(&attr[attr_len + 1]);
> >+if (strchr(am->value, '\\'))
> >+die(_("attr spec values must not 
> >contain backslashes"));
> >+}
> >+break;
> >+}
> >+
> >+attr_name = xmemdupz(attr, attr_len);
> >+a = git_attr(attr_name);
> >+if (!a)
> >+die(_("invalid attribute name %s"), attr_name);
> >+
> >+attr_check_append(item->attr_check, a);
> >+
> >+free(attr_name);
> >+}
> >+
> >+if (item->attr_check->nr != item->attr_match_nr)
> >+die("BUG: should have same number of entries");
> 
> I think such postcondition checks are usually not worth it, but
> others might differ.

Yeah, probably not, but it's just an assert for the just-in-case, so I'll
leave it in.

> 
> >+
> >+string_list_clear(&list, 0);
> >+}
> >+
> > static inline int get_literal_global(void)
> > {
> > static int literal = -1;

-- 
Brandon Williams


Re: [GSoC] Move ~/.git-credential-cache to ~/.cache/git

2017-03-10 Thread Jonathan Nieder
(+cc: npostavs)
Hi Devin,

Devin Lehmacher wrote:

> I started working on this microproject and am not quite sure what is
> necessary for backwards compatibility. Since the socket is recreated
> whenever the credential daemon exits backwards compatibility
> shouldn’t really be a concern with regard to where the socket is
> located in the filesystem.
>
> However, contrib/persistent-https depends on the socket being at
> ~/.git-credential-cache/socket, so changing the default location
> would break this. However, if we need to keep the socket at that
> location for cases like this I don’t understand how this change
> would be helpful in any way.

That's a good question.  If I'm reading contrib/persistent-https/
correctly, it uses the same directory but doesn't rely on the socket
there, so it should not be a problem.

However, that reminded me to search for other tools that might rely on
the socket.  Using
https://codesearch.debian.net/search?q=%5C.git-credential-cache, I
find that magit does rely on the socket path.

 $ git clone https://github.com/magit/magit
 $ git log -S.git-credential-cache
 commit 0f30dfbb0075ac2e99b65a2c7fac360197a989c1
 Author: Noam Postavsky 
 Date:   Sat Oct 24 15:57:54 2015 -0400

Start credential daemon on magit-credential-hook

If we let git start the daemon, Emacs will send a SIGHUP when git
finishes and closes the pty, killing the daemon.  Hence the need to have
our own daemon running first.

Cc-ing Noam to figure out what a safe transition will look like.

Thanks for noticing,
Jonathan


Re: Stable GnuPG interface, git should use GPGME

2017-03-10 Thread brian m. carlson
On Fri, Mar 10, 2017 at 11:00:07AM +0100, Bernhard E. Reiter wrote:
> My use case today was signing: git by default found the `gpg` binary
> and the command failed. The reason is that I have `gpg2` installed
> and most applications use it right away. So git failed signing because
> the .gnupg configuration of the user was not ready for the old `gpg` which is
> still installed on Debian GNU/Linux for purposes of the operating system. If
> git had used libgpgme, gpgme would have chosen the most up-to-date
> version of `gpg` available (or configured) without me intervening via
> gpg.program. Now because of this problem you could add a check for `gpg2`
> and fall back to `gpg`, but even better would be to move to libgpgme. >:)

There are a couple potential problems I see with this approach.  First,
I'd want to know whether gpgme supports gpgsm, which I know some people
use to sign commits and tags.

Another issue is what happens to the git verify-* --raw output.  Some
people want the ability to script signature verification.  This can be
really important when you have automated systems verifying tags and
commits.

For example, running the following commands, we can determine that Junio
signs his tags with SHA-1 (algorithm 2), while I sign my commits with
SHA-512 (algorithm 10).

genre ok % git verify-tag --raw v2.12.0 2>&1 | grep VALIDSIG
[GNUPG:] VALIDSIG E1F036B1FEE7221FC778ECEFB0B5E88696AFE6CB 2017-02-24 
1487962205 0 4 0 1 2 00 96E07AF25771955980DAD10020D04E5A713660A7
genre ok % git verify-commit --raw object-id-part10 2>&1 | grep VALIDSIG
[GNUPG:] VALIDSIG 5FC3A781776B26DF87F70C37BF535D811F52F68B 2017-03-06 
1488760639 0 4 0 1 10 00 88ACE9B29196305BA9947552F1BA225C0223B187

There's literally no other way to get this information at the moment
(which is why I added the --raw option).  A gpgme implementation would
need to expose this same information, at which point, we might as well
have used gpg directly.
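As a concrete (if simplified) sketch of that kind of scripting, the
hash-algorithm field can be pulled straight out of the status output;
the field position follows GnuPG's documented VALIDSIG layout:

    # print the hash algorithm ID used for a tag signature
    # (2 = SHA-1, 10 = SHA-512)
    git verify-tag --raw v2.12.0 2>&1 |
    awk '$2 == "VALIDSIG" { print $10 }'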

This is not an idle consideration; we have automated systems at work
that update software automatically and submit it for human review,
including verifying signatures and hashes.  This saves hundreds of hours
of staff time and results in better security.

Because the amount of the gpg API we actually use is very small, a user
who wants to use a custom signature program (say, OpenBSD's signify)
can actually write a simple wrapper that mimics it and use that instead.

Finally, I work on a development system where work is done both as an
unprivileged user and as root.  Because I use the same socket for both,
GnuPG screams bloody murder that the permissions are wrong.  I know this
is secure in my scenario, but without a custom wrapper, I have to deal
with GnuPG polluting my terminal every time I sign a commit or a tag.  A
gpgme implementation would need to honor the same wrapper script or
otherwise not scream to the terminal.
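The wrapper hookup itself is just the existing gpg.program knob; for
illustration (the wrapper path is made up, and the filter script itself
is left as an exercise):

    # point git at a local wrapper instead of the stock gpg
    git config --global gpg.program "$HOME/bin/gpg-wrapper"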
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204




[GSoC] Move ~/.git-credential-cache to ~/.cache/git

2017-03-10 Thread Devin Lehmacher
I started working on this microproject and am not quite sure what is necessary 
for backwards compatibility. Since the socket is recreated whenever the 
credential daemon exits backwards compatibility shouldn’t really be a concern 
with regard to where the socket is located in the filesystem.

However, contrib/persistent-https depends on the socket being at 
~/.git-credential-cache/socket, so changing the default location would break 
this. However, if we need to keep the socket at that location for cases like 
this I don’t understand how this change would be helpful in any way.

Is it safe to change the default location for this socket and remove the 
original location?

Thanks in advance,
Devin Lehmacher


Re: [PATCH v2] repack: Add option to preserve and prune old pack files

2017-03-10 Thread Junio C Hamano
James Melvin  writes:

> The new --preserve-and-prune option renames old pack files
> instead of deleting them after repacking and prunes previously
> preserved pack files.
>
> This option is designed to prevent stale file handle exceptions
> during git operations which can happen on users of NFS repos when
> repacking is done on them. The strategy is to preserve old pack files
> around until the next repack with the hopes that they will become
> unreferenced by then and not cause any exceptions to running processes
> when they are finally deleted (pruned).

This certainly is simpler than the previous one, but if you run

git repack --preserve-and-prune 
sleep for N minutes
git repack --preserve-and-prune 

the second "repack" will drop the obsoleted packs that were
preserved by the first one no matter how short the value of N is,
no?

I suspect that users would be more comfortable with something based
on expiration timestamp that gives them a guaranteed grace period,
e.g. "--preserve-expire=8.hours", like how we expire reflog entries
and loose objects.

Perhaps builtin/prune.c can be a source of inspiration?
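For comparison, the existing expiry interfaces alluded to above take an
approxidate and give exactly that kind of grace period, e.g.:

    # existing commands with a guaranteed grace period
    git reflog expire --expire=90.days.ago --all
    git prune --expire=2.weeks.ago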


Re: [PATCH v1] Travis: also test on 32-bit Linux

2017-03-10 Thread Junio C Hamano
René Scharfe  writes:

>   @ depends on r @
>   expression E;
>   @@
>   - *&
> E

I guess the source of my confusion is that a tool that understands
the semantics of the C language still needs to be told about that.

I was hoping that something that understands C needs to be told
only a single rule:

type T
T src, dst

-memcpy(&dst, &src, sizeof(dst));
+dst = src;

and then can apply that rule to this code in four ways:

struct foo A, *Bp;

memcpy(Bp, &A, sizeof(*Bp));
memcpy(Bp, &A, sizeof(A));
memcpy(&A, Bp, sizeof(A));
memcpy(&A, Bp, sizeof(*Bp));

to obtain its rewrite:

struct foo A, *Bp;

*Bp = A;
*Bp = A;
A = *Bp;
A = *Bp;

by knowing that (*Bp) is of type "struct foo" (even though Bp is of
type "struct foo *") and sizeof(dst) and sizeof(src) are the same
thing in the rule because src and dst are both of type T.


What's cooking in git.git (Mar 2017, #04; Fri, 10)

2017-03-10 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ew/markdown-url-in-readme (2017-03-01) 1 commit
  (merged to 'next' on 2017-03-03 at 3d35e3a991)
 + README: create HTTP/HTTPS links from URLs in Markdown

 Doc update.


* jc/config-case-cmdline-take-2 (2017-02-23) 2 commits
  (merged to 'next' on 2017-03-01 at 2e9920eeeb)
 + config: use git_config_parse_key() in git_config_parse_parameter()
 + config: move a few helper functions up

 The code to parse "git -c VAR=VAL cmd" and set configuration
 variable for the duration of cmd had two small bugs, which have
 been fixed.
 This supersedes jc/config-case-cmdline topic that has been discarded.


* jh/send-email-one-cc (2017-02-27) 1 commit
  (merged to 'next' on 2017-03-02 at 32c0e6ad88)
 + send-email: only allow one address per body tag

 "Cc:" on the trailer part does not have to conform to RFC strictly,
 unlike in the e-mail header.  "git send-email" has been updated to
 ignore anything after '>' when picking addresses, to allow non-address
 cruft like " # stable 4.4" after the address.


* jk/http-auth (2017-02-27) 2 commits
  (merged to 'next' on 2017-03-02 at 87f81b4395)
 + http: add an "auto" mode for http.emptyauth
 + http: restrict auth methods to what the server advertises

 Reduce authentication round-trip over HTTP when the server supports
 just a single authentication method.


* jk/ident-empty (2017-02-23) 4 commits
  (merged to 'next' on 2017-03-01 at ff80031ce6)
 + ident: do not ignore empty config name/email
 + ident: reject all-crud ident name
 + ident: handle NULL email when complaining of empty name
 + ident: mark error messages for translation

 user.email that consists of only cruft chars should consistently
 error out, but didn't.


* jk/parse-config-key-cleanup (2017-02-24) 3 commits
  (merged to 'next' on 2017-03-01 at e531d8d3a9)
 + parse_hide_refs_config: tell parse_config_key we don't want a subsection
 + parse_config_key: allow matching single-level config
 + parse_config_key: use skip_prefix instead of starts_with
 (this branch uses sb/parse-hide-refs-config-cleanup.)

 The "parse_config_key()" API function has been cleaned up.


* jk/t6300-cleanup (2017-02-27) 1 commit
  (merged to 'next' on 2017-03-02 at 3087521bea)
 + t6300: avoid creating refs/heads/HEAD

 A test that creates a confusing branch whose name is HEAD has been
 corrected not to do so.


* jt/http-base-url-update-upon-redirect (2017-02-28) 1 commit
  (merged to 'next' on 2017-03-03 at 5225bd3ef8)
 + http: attempt updating base URL only if no error

 When a redirected http transport gets an error during the
 redirected request, we ignored the error we got from the server,
 and ended up giving a not-so-useful error message.


* jt/upload-pack-error-report (2017-02-23) 1 commit
  (merged to 'next' on 2017-03-01 at aea583dbe5)
 + upload-pack: report "not our ref" to client

 "git upload-pack", which is a counter-part of "git fetch", did not
 report a request for a ref that was not advertised as invalid.
 This is generally not a problem (because "git fetch" will stop
 before making such a request), but is the right thing to do.


* ps/docs-diffcore (2017-02-28) 2 commits
  (merged to 'next' on 2017-03-03 at 9ca5691de2)
 + docs/diffcore: unquote "Complete Rewrites" in headers
 + docs/diffcore: fix grammar in diffcore-rename header

 Doc update.


* rj/remove-unused-mktemp (2017-02-28) 2 commits
  (merged to 'next' on 2017-03-03 at 4512f0c5ab)
 + wrapper.c: remove unused gitmkstemps() function
 + wrapper.c: remove unused git_mkstemp() function

 Code cleanup.


* rs/commit-parsing-optim (2017-02-27) 2 commits
  (merged to 'next' on 2017-03-02 at 22239f35df)
 + commit: don't check for space twice when looking for header
 + commit: be more precise when searching for headers

 The code that parses header fields in the commit object has been
 updated for (micro)performance and code hygiene.


* rs/log-email-subject (2017-03-01) 2 commits
  (merged to 'next' on 2017-03-03 at a2ecc84866)
 + pretty: use fmt_output_email_subject()
 + log-tree: factor out fmt_output_email_subject()

 Code clean-up.


* rs/sha1-file-plug-fallback-base-leak (2017-02-27) 1 commit
  (merged to 'next' on 2017-03-02 at 03344b1119)
 + sha1_file: release fallback base's memory in unpack_entry()

 A leak in a codepath to read from a packed object in (rare) cases
 has been plugged.


* rs/strbuf-add-real-path (2017-02-27) 2 commits
  (merged to 'next' on 2017-03-02 at 69191becd6)
 + strbuf: add strbuf_add_real_path()
 + cocci: use ALLOC_ARRAY

 An helper 

Re: [RFC][PATCH] index-pack: add testcases found using AFL

2017-03-10 Thread Ævar Arnfjörð Bjarmason
On Fri, Mar 10, 2017 at 8:06 PM, Jeff King  wrote:
> [Note: your original email didn't make it to the list because it's over
> 100K; I'll quote liberally].
>
> On Fri, Mar 10, 2017 at 04:15:56PM +0100, Vegard Nossum wrote:
>
>> I've used AFL to generate a corpus of pack files that maximises the edge
>> coverage for 'git index-pack'.
>>
>> This is a supplement to (and not a replacement for) the regular test cases
>> where we know exactly what each test is checking for. These testcases are
>> more useful for avoiding regressions in edge cases or as a starting point
>> for future fuzzing efforts.
>>
>> To see the output of running 'git index-pack' on each file, you can do
>> something like this:
>>
>>   make -C t GIT_TEST_OPTS="--run=34 --verbose" t5300-pack-object.sh
>>
>> I observe the following coverage changes (for t5300 only):
>>
>>   path                  old%  new%    pp
>>   ----------------------------------------
>>   builtin/index-pack.c  74.3  76.6   2.3
>>   pack-write.c          79.8  80.4    .6
>>   patch-delta.c         67.4  81.4  14.0
>>   usage.c               26.6  35.5   8.9
>>   wrapper.c             42.0  46.1   4.1
>>   zlib.c                58.7  64.1   5.4
>
> I'm not sure how I feel about this. More coverage is good, I guess, but
> we don't have any idea what these packfiles are doing, or whether
> index-pack is behaving sanely in the new lines. The most we can say is
> that we tested more lines of code and that nothing segfaulted or
> triggered something like ASAN.

Isn't the main value of these sorts of tests that they make up the
difference between the current manually maintained coverage and some
randomized coverage?  So when you change the code in the future and the
randomized coverage changes, we don't know whether that's a good or a bad
thing, but at least we're more likely to know that it changed, and at
that point someone's likely to actually investigate the root cause,
which'll turn some AFL blob testcase into an isolated testcase?


Re: git-push branch confusion caused by user mistake

2017-03-10 Thread Junio C Hamano
Phil Hord  writes:

> I think git should be smarter about deducing the dest ref from the
> source ref if the source ref is in refs/remotes, but I'm not sure how
> far to take it.

My knee-jerk reaction is "Don't take it anywhere".  

Giving a refspec from the command line is an established way to
defeat the default behaviour you get when you name only the remote;
by making it do things behind the user's back, you would be robbing
people of that escape hatch.

It often is useful in a real-life workflow that "git push $dest
origin/master" works exactly the way it does now, and I actually
use it that way myself.  Imagine that you have two repositories, use one of them
primarily to interact with the outside world and do your work, but
you then occasionally push from that primary repository to the other
one, instead of logging into the host that has the other one and
running a fetch on that host from the outside world.  Your "trying
to be clever when given a colon-less refspec" will force people to
type "git push $dest origin/master:origin/master" in such a case.




Re: [PATCH v1] Travis: also test on 32-bit Linux

2017-03-10 Thread René Scharfe

Am 10.03.2017 um 21:13 schrieb Junio C Hamano:

René Scharfe  writes:


I think this misses the other two cases: (*dst, src) and (*dst, *src).


... and that's why I left them out.  You can't get dst vs. *dst wrong
with structs (at least not without the compiler complaining); only
safe transformations are included in this round.


I haven't followed this discussion to the end, but the omission of 2
out of obvious 4 did pique my curiosity when I saw it, too, and made
me wonder if the omission was deliberate.  If so, it would be nice
to state why in the log message (or in copy.cocci file itself as a
comment).

It also made me wonder if we would be helped with a further
combinatorial explosion from "T **dstp, **srcp" and somesuch (in
other words, I am wondering why a rule for 'T *src' that uses '*src'
need to be spelled out separately when there already is a good rule
for 'T src' that uses 'src'---is that an inherent restriction of the
tool?).


There are redundancies. This semantic patch here:

@@
type T;
T dst;
T *src;
@@
- memcpy(&dst, src, sizeof(dst));
+ dst = *src;

would match e.g. this (from convert.c):

memcpy(&new_stats, &stats, sizeof(new_stats));

and transform it to:

new_stats = *&stats;

We'd need just one more rule to remove the "*&" part and could then get 
rid of two of the "T src" variants, to arrive at something like this:


@@
type T;
T dst, src;
@@
- memcpy(&dst, &src, sizeof(src));
+ dst = src;

@ r @
type T;
T dst;
T *src;
@@
(
- memcpy(&dst, src, sizeof(dst));
+ dst = *src;
|
- memcpy(&dst, src, sizeof(*src));
+ dst = *src;
|
- memcpy(&dst, src, sizeof(T));
+ dst = *src;
)

@ depends on r @
expression E;
@@
- *&
  E

René


[PATCH v2] repack: Add option to preserve and prune old pack files

2017-03-10 Thread James Melvin
The new --preserve-and-prune option renames old pack files
instead of deleting them after repacking and prunes previously
preserved pack files.

This option is designed to prevent stale file handle exceptions
during git operations which can happen on users of NFS repos when
repacking is done on them. The strategy is to preserve old pack files
around until the next repack with the hopes that they will become
unreferenced by then and not cause any exceptions to running processes
when they are finally deleted (pruned).

Signed-off-by: James Melvin 
---
 Documentation/git-repack.txt |  4 +++
 builtin/repack.c | 66 +++-
 2 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index 26afe6ed5..effd98a43 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -143,6 +143,10 @@ other objects in that pack they already have locally.
being removed. In addition, any unreachable loose objects will
be packed (and their loose counterparts removed).
 
+--preserve-and-prune::
+Preserve old pack files by renaming them instead of deleting. Prune any
+previously preserved pack files before preserving new ones.
+
 Configuration
 -
 
diff --git a/builtin/repack.c b/builtin/repack.c
index 677bc7c81..78fad8f1a 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -10,6 +10,7 @@
 
 static int delta_base_offset = 1;
 static int pack_kept_objects = -1;
+static int preserve_and_prune;
 static int write_bitmaps;
 static char *packdir, *packtmp;
 
@@ -108,6 +109,60 @@ static void get_non_kept_pack_filenames(struct string_list 
*fname_list)
closedir(dir);
 }
 
+/*
+ * Adds all preserved packs hex strings to the fname list
+ */
+static void get_preserved_pack_filenames(struct string_list *fname_list)
+{
+   DIR *dir;
+   struct dirent *e;
+   char *fname;
+
+   if (!(dir = opendir(packdir)))
+   return;
+
+   while ((e = readdir(dir)) != NULL) {
+   size_t len;
+   if (!strip_suffix(e->d_name, ".old-pack", ))
+   continue;
+
+   fname = xmemdupz(e->d_name, len);
+   string_list_append_nodup(fname_list, fname);
+   }
+   closedir(dir);
+}
+
+static void remove_preserved_packs(void) {
+   const char *exts[] = {"pack", "idx", "keep", "bitmap"};
+   int i;
+   struct string_list names = STRING_LIST_INIT_DUP;
+   struct string_list_item *item;
+
+   get_preserved_pack_filenames(&names);
+
+   for_each_string_list_item(item, &names) {
+   for (i = 0; i < ARRAY_SIZE(exts); i++) {
+   char *fname;
+   fname = mkpathdup("%s/%s.old-%s",
+ packdir,
+ item->string,
+ exts[i]);
+   if (remove_path(fname))
+   warning(_("failed to remove '%s'"), fname);
+   free(fname);
+   }
+   }
+}
+
+static void preserve_pack(const char *file_path, const char *file_name,  const 
char *file_ext)
+{
+   char *fname_old;
+
+   fname_old = mkpathdup("%s/%s.old-%s", packdir, file_name, ++file_ext);
+   rename(file_path, fname_old);
+   free(fname_old);
+}
+
 static void remove_redundant_pack(const char *dir_name, const char *base_name)
 {
const char *exts[] = {".pack", ".idx", ".keep", ".bitmap"};
@@ -121,7 +176,10 @@ static void remove_redundant_pack(const char *dir_name, 
const char *base_name)
for (i = 0; i < ARRAY_SIZE(exts); i++) {
strbuf_setlen(&buf, plen);
strbuf_addstr(&buf, exts[i]);
-   unlink(buf.buf);
+   if (preserve_and_prune)
+   preserve_pack(buf.buf, base_name, exts[i]);
+   else
+   unlink(buf.buf);
}
strbuf_release(&buf);
 }
@@ -194,6 +252,9 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
N_("maximum size of each packfile")),
OPT_BOOL(0, "pack-kept-objects", _kept_objects,
N_("repack objects in packs marked with 
.keep")),
+   OPT_BOOL(0, "preserve-and-prune", _and_prune,
+   N_("preserve old pack files by renaming them 
instead of deleting, prune any "
+   "previously preserved pack 
files before preserving new ones")),
OPT_END()
};
 
@@ -404,6 +465,9 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
 
/* End of pack replacement. */
 
+   if (preserve_and_prune)
+   remove_preserved_packs();
+
if (delete_redundant) {
int opts = 0;

Re: [PATCH] repack: Add options to preserve and prune old pack files

2017-03-10 Thread jmelvin

On 2017-03-07 13:33, Junio C Hamano wrote:

James Melvin  writes:


...


I am not sure if I understand your design.  Your model looks to me
like there are two modes of operation.  #1 uses "--preserve-old" and
sends old ones to purgatory instead of removing them and #2 uses
"--prune-preserved" to remove all the ones in the purgatory
immediately.  A few things that come to my mind immediately:

 * When "--prune-preseved" is used, it removes both ancient ones and
   more recent ones indiscriminately.  Would it make more sense to
   "expire" only the older ones while keeping the more recent ones?

 * It appears that the main reason you would want --prune-preserved
   in this design is that after running with "--preserve-old" a number of
   times, you want to remove really old ones that have accumulated,
   and I would imagine that at that point of time, you are only
   interested in repack, but the code structure tells me that this
   will force the users to first run a repack before pruning.

I suspect that a design that is simpler to explain to the users may
be to add a command line option "--preserve-pruned=" and
a matching configuration variable repack.preservePruned, which
defaults to "immediate" (i.e. no preserving), and

 - When the value of preserve_pruned is not "immediate", use
   preserve_pack() instead of unlink();

 - At the end, find preserved packs that are older than the value in
   preserve_pruned and unlink() them.

It also may make sense to add another command line option
"--prune-preserved-packs-only" (without matching configuration
variable) that _ONLY_ does the "find older preserved packs and
unlink them" part, without doing any repack.


I like the idea of only having a single option to do both the preserving 
and the pruning. It makes the operation more cycle-based, which is more 
closely tied to the use case this is intended for, i.e. run repack while 
preserving old pack files so that any open file handles to those pack 
files will continue to work; by the next time repack is run it is 
highly likely those will no longer be needed, so they can be cleaned up. 
Obviously this depends on how frequently repack is run, but that is 
something an administrator can determine.
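In shell terms the intended cycle is roughly the following, run from a
periodic maintenance job (a sketch only, using the option proposed in
this patch; the schedule is of course site-specific):

    # each run prunes the packs preserved by the previous run, then
    # preserves the packs it is about to make redundant
    git repack -a -d --preserve-and-prune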


I'll push a patch that combines that functionality into a single option 
to review.



-James

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [RFC PATCH] help: add optional instructions for reporting bugs

2017-03-10 Thread Christian Couder
On Fri, Mar 10, 2017 at 7:13 PM, Jonathan Nieder  wrote:
> Hi,
>
> Stefan Beller wrote:
>
>> When reporting bugs, users will usually look at the output of
>> 'git --version' at one point to write a quality bug report.
>> So that is a good spot to provide additional information to the user,
>> e.g. about organizational quirks of how to report a bug.
>>
>> As the output of 'git --version' is parsed by scripts as well,
>> we only want to present this information to users, which is why
>> we only give the output to TTYs.
>
> Interesting thought.  This might also be a good place to point users
> to "git version --build-options" to get more detailed build
> information.

It looks like there is no documentation for `git version` which is a
bit unfortunate.
Also `git version -h` ignores the -h option so we cannot know that
--build-options exists.

> The existence of that option also suggests you're on the right track
> about 'git version' being the command for this.

I agree, but I don't think --build-options is the best for that.

>> Git is distributed in various ways by various organizations. The Git
>> community prefers to have bugs reported on the mailing list, whereas
>> other organizations may rather want to have a bug filed in a bug tracker
>> or such.  The point of contact differs by organization as well.
>
> It's tempting to put the custom information in --build-options --- e.g.
>
>  $ git version
>  git version 2.12.0.190.g6e60aba09d.dirty
>  hint: use "git version --build-options" for more detail
>  hint: and bug reporting instructions
>  $
>  $ git version --build-options
>  git version 2.12.0.190.g6e60aba09d.dirty
>  sizeof-long: 8
>  reporting-bugs: see REPORTING BUGS section in "git help git"

I don't think it's a good idea to add those "hint: ..." and
"reporting-bugs: ..." lines.

I think it's better to do things like the following:

- add `git version -h`
- add proper documentation for `git version` so that `git help version` works
- in `git help version` talk about the REPORTING BUGS section in `git help git`
- add `git version --full` or other such options to also print other
stuff like the current OS, processor architecture, libc, etc
- suggest in `git help version` and/or in the REPORTING BUGS section
in `git help git` that people just copy paste the output of `git
version --full` in their bug report

[...]

> I'm still on the fence about whether this is a good idea. At least
> having custom bug instructions in --build-options sounds like a very
> nice thing, but it's not obvious to me how people would learn about
> that option.

Yeah indeed.

Thanks,
Christian.


git-push branch confusion caused by user mistake

2017-03-10 Thread Phil Hord
This week a user accidentally did this:

$ git push origin origin/master
Total 0 (delta 0), reused 0 (delta 0)
To parent.git
 * [new branch]  origin/master -> origin/master

He saw his mistake when the "new branch" message appeared, but he was
confused about how to fix it and worried he broke something.

It seems reasonable that git expanded the original args into this one:

git push origin refs/remotes/origin/master

However, since the dest ref was not provided, it was assumed to be the
same as the source ref, so it worked as if he typed this:

git push origin refs/remotes/origin/master:refs/remotes/origin/master

Indeed, git ls-remote origin shows the result:

$ git ls-remote origin
d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e HEAD
d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/heads/master
d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/remotes/origin/master

Also, I verified that this (otherwise valid) command has similar
unexpected results:
$ git remote add other foo.git && git fetch other && git push
origin other/topic
$ git ls-remote origin
d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e HEAD
d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/heads/master
d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/remotes/origin/master
d1ff1c9224ae5e58a7656fb9ecc95865d42ed71e refs/remotes/other/topic

I think git should be smarter about deducing the dest ref from the
source ref if the source ref is in refs/remotes, but I'm not sure how
far to take it.  It feels like we should translate refspecs something
like this for push:

origin/master
=> refs/remotes/origin/master:refs/heads/master

refs/remotes/origin/master
 => refs/remotes/origin/master:refs/heads/master

origin/master:origin/master
 => refs/remotes/origin/master:refs/heads/origin/master

master:refs/remotes/origin/master
 => refs/heads/master:refs/remotes/origin/master

That is, we should not infer a remote refspec of "refs/remotes/*"; we
should only get there if "refs/remotes" was given explicitly by the
user.

Does this seem reasonable?  I can try to work up a patch if so.


Re: [RFC][PATCH] index-pack: add testcases found using AFL

2017-03-10 Thread Vegard Nossum

On 10/03/2017 20:42, Jeff King wrote:

That's something I guess, but I'm not enthused by the idea of just
dumping a bunch of binary test cases that nobody, not even the author,
understands.

[...]


My real concern is that this is the tip of the iceberg. So we increased
coverage in one program by a few percent. But wouldn't this procedure be
applicable to lots of _other_ parts of Git, too?


I think that index-pack is in a special position given its role as the
verifier for packs received over the network, which you also wrote here:
https://www.spinics.net/lists/git/msg265118.html

I also think increased coverage for other parts of git which are not
considered security-sensitive is less valuable without testing for an
actual expected result.


Vegard


Re: [GSoC] Discussion of "Submodule related work" project

2017-03-10 Thread Valery Tolstov
> This means I can take only one of them as a starting point for
> my proposal?

If that is true, then I'll try to take the sh->C transition for the submodule
command, and as an additional part of my whole project also this:
https://public-inbox.org/git/1488913150.881...@smtp.yandex.ru/T/

>> Have some questions about "Submodule related work" project
>>
>> First of all, I would like to add this task to the project, if I'll take it:
>> https://public-inbox.org/git/1488913150.881...@smtp.yandex.ru/T/
>> What do you think about this task?
>
> That is a nice project, though my gut feeling is that it is too small
> for a GSoC project on its own.

Does it sound good? If it does, then I'll begin to work on my proposal.

Thanks,
  Valery Tolstov


bug?: git reset --mixed ignores deinitialized submodules

2017-03-10 Thread David Turner
Git reset --mixed ignores submodules which are not initialized.  I've
attached a demo script.  

On one hand, this matches the documentation ("Resets the index but not
the working tree").  But on the other hand, it kind of doesn't: "(i.e.,
the changed files are preserved but not marked for commit)".

It's hard to figure out what a mixed reset should do.  It would be
weird for it to initialize the submodule.  Maybe it should just refuse
to run?  Maybe there should be an option for it to initialize the
submodule for you?  Maybe it should drop a special-purpose file that
git understands to be a submodule change?  For instance (and this is
insane, but, like, maybe worth considering) it could use extended
filesystem attributes, where available.

#!/bin/bash
mkdir demo
cd demo

git init main

(
git init sub1 &&
cd sub1 &&
dd if=/dev/urandom of=f bs=40 count=1 &&
git add f &&
git commit -m f
) &&

(
cd main &&
git submodule add ../sub1 sub1 &&
git commit -m 'add submodule' &&
git tag start
) &&

# add a commit on sub1
(
cd main/sub1 &&
echo morx > f &&
git add f &&
git commit -m 'a commit'
) &&

# commit that change on main, deinit the submodule and do a mixed reset
(
cd main &&
git add sub1 &&
git commit -m 'update sub1' &&
git submodule deinit sub1 &&
git reset --mixed HEAD^ &&
git status # change to sub1 is lost
)



Re: [PATCH v3 2/2] submodule--helper.c: remove duplicate code

2017-03-10 Thread Stefan Beller
On Fri, Mar 10, 2017 at 12:40 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> On Fri, Mar 10, 2017 at 11:42 AM, Junio C Hamano  wrote:
>>> Brandon Williams  writes:
>>>
 On 03/09, Valery Tolstov wrote:
> Remove code fragment from module_clone that duplicates functionality
> of connect_work_tree_and_git_dir in dir.c
>
> Signed-off-by: Valery Tolstov 

 Looks good.
>>>
>>> I'll queue with your Reviewed-by: added.
>>>
>>> If sb/checkout-recurse-submodules is going to be rerolled, I'd
>>> appreciate if it includes this patch inserted at an appropriate
>>> place in the series, instead of me having to remember re-applying
>>> this patch every time it happens.
>>
>> Instead of mixing these two series, can you just take Valery's series as is,
>> and sb/checkout-recurse-submodules builds on top of that when rerolled?
>
> That's fine by me, too, but that still means I need to keep an eye
> on two independent topics that interact each other.  Is a single
> patch 2/2 that important to be on a separate topic?  Expressed in
> another way, is it expected that sb/checkout-recurse-submodules
> would take forever to mature relative to these two patches?

Going by the time and number of rerolls this has been around, it
is not unreasonable to estimate sb/checkout-... will take longer
than this code deduplication patch.

Thanks,
Stefan


Re: [PATCH v3 2/2] submodule--helper.c: remove duplicate code

2017-03-10 Thread Junio C Hamano
Stefan Beller  writes:

> On Fri, Mar 10, 2017 at 11:42 AM, Junio C Hamano  wrote:
>> Brandon Williams  writes:
>>
>>> On 03/09, Valery Tolstov wrote:
 Remove code fragment from module_clone that duplicates functionality
 of connect_work_tree_and_git_dir in dir.c

 Signed-off-by: Valery Tolstov 
>>>
>>> Looks good.
>>
>> I'll queue with your Reviewed-by: added.
>>
>> If sb/checkout-recurse-submodules is going to be rerolled, I'd
>> appreciate if it includes this patch inserted at an appropriate
>> place in the series, instead of me having to remember re-applying
>> this patch every time it happens.
>
> Instead of mixing these two series, can you just take Valery's series as is,
> and sb/checkout-recurse-submodules builds on top of that when rerolled?

That's fine by me, too, but that still means I need to keep an eye
on two independent topics that interact each other.  Is a single
patch 2/2 that important to be on a separate topic?  Expressed in
another way, is it expected that sb/checkout-recurse-submodules
would take forever to mature relative to these two patches?


[PATCH v3] ref-filter: Add --no-contains option to tag/branch/for-each-ref

2017-03-10 Thread Ævar Arnfjörð Bjarmason
Change the tag, branch & for-each-ref commands to have a --no-contains
option in addition to their longstanding --contains options.

The use-case I have for this is to find the last-good rollout tag
given a known-bad . Right now, given a hypothetically bad
commit v2.10.1-3-gcf5c7253e0, you can find which git version to revert
to with this hacky two-liner:

(./git tag -l 'v[0-9]*'; ./git tag -l 'v[0-9]*' --contains 
v2.10.1-3-gcf5c7253e0) \
|sort|uniq -c|grep -E '^ *1 '|awk '{print $2}' | tail -n 10

But with the --no-contains option you can now get the exact same
output with:

./git tag -l 'v[0-9]*' --no-contains v2.10.1-3-gcf5c7253e0|sort|tail -n 10

The filtering machinery is generic between the tag, branch &
for-each-ref commands, so once I'd implemented it for tag it was
trivial to add support for this to the other two.

A practical use for this with "branch" is e.g. finding branches which
diverged between 2.8 & 2.10:

./git branch --contains v2.8.0 --no-contains v2.10.0

The "describe" command also has a --contains option, but its semantics
are unrelated to what tag/branch/for-each-ref use --contains for. I
don't see how a --no-contains option for "describe" would make any
sense, other than being exactly equivalent to not supplying --contains
at all, which would be confusing at best.

I'm adding a --without option to "tag" as an alias for --no-contains,
for consistency with --with and --contains. The --with option is
undocumented (and untested), and possibly the only user of it is
Junio[1], but it's trivial to support, so let's do that.

Where I'm changing existing documentation lines I'm mainly word
wrapping at 75 columns to be consistent with the existing style. The
changes to Documentation/ are much smaller with: git show --word-diff,
same for the minor change to git-completion.bash.

Most of the test changes I've made are just doing the inverse of the
existing --contains tests, with this change --no-contains for tag,
branch & for-each-ref is just as well tested as the existing
--contains option.

In addition to those tests I've added tests for --contains in
combination with --no-contains for all three commands, and a test for
"tag" which asserts that --no-contains won't find tree/blob tags,
which is slightly unintuitive, but consistent with how --contains
works.

When --contains and --no-contains are provided to "tag" we disable the
optimized code to find tags added in v1.7.2-rc1-1-gffc4b8012d. That
code changes flags on commit objects as an optimization, and thus
can't be called twice.

Jeff King has a WIP patch to fix that[2], but rather than try to
incorporate that into this already big patch I'm just disabling the
optimized codepath. Using both --contains and --no-contains will most
likely be rare, and we can get an initial correct version working &
optimize it later.

It's possible that the --merge & --no-merge codepaths for "tag" have a
similar bug, but as of writing I can't produce any evidence of that
via a brute-force test script[3].

1. 
2. <20170309125132.tubwxtneffok4...@sigill.intra.peff.net>
3. 

Signed-off-by: Ævar Arnfjörð Bjarmason 
---

With the changes noted in my

&&
.

 Documentation/git-branch.txt   |  24 ---
 Documentation/git-for-each-ref.txt |   6 +-
 Documentation/git-tag.txt  |   6 +-
 builtin/branch.c   |   4 +-
 builtin/for-each-ref.c |   3 +-
 builtin/tag.c  |  17 -
 contrib/completion/git-completion.bash |   9 +--
 parse-options.h|   4 +-
 ref-filter.c   |  16 +++--
 ref-filter.h   |   1 +
 t/t3201-branch-contains.sh |  51 +-
 t/t6302-for-each-ref-filter.sh |  16 +
 t/t7004-tag.sh | 123 -
 13 files changed, 251 insertions(+), 29 deletions(-)

diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
index 092f1bcf9f..28a36a8a0a 100644
--- a/Documentation/git-branch.txt
+++ b/Documentation/git-branch.txt
@@ -11,7 +11,7 @@ SYNOPSIS
 'git branch' [--color[=] | --no-color] [-r | -a]
[--list] [-v [--abbrev= | --no-abbrev]]
[--column[=] | --no-column]
-   [(--merged | --no-merged | --contains) []] [--sort=]
+   [(--merged | --no-merged | --contains | --no-contains) []] 
[--sort=]
[--points-at ] [--format=] [...]
 'git branch' [--set-upstream | --track | --no-track] [-l] [-f]  
[]
 'git branch' (--set-upstream-to= | -u ) []
@@ -35,11 +35,12 @@ as branch creation.
 
 With `--contains`, shows only the branches that contain the named commit
 (in other 

Re: Stable GnuPG interface, git should use GPGME

2017-03-10 Thread Theodore Ts'o
On Fri, Mar 10, 2017 at 10:54:19AM -0800, Linus Torvalds wrote:
>  - library versioning.
> 
>I don't know why, but I've never *ever* met a library developer who
> realized that libraries were all about stable API's, and the library
> users don't want to fight different versions.

Actually, you have.  (Raises hand :-)

libext2fs has a stable API *and* ABI.  We add new functions instead of
changing function parameters (so ext2fs_block_iterate2() is
implemented in terms of ext2fs_block_iterate3(), and so on).  And
structures have magic numbers that have served as versioning signal.
This is actually not rocket science.  If you've met anyone who's
programmed for Multics, they did something similar.  And of course,
that's why we have the wait3(2) and wait4(2) system calls.

I do have to agree with your general point, that most developers tend
to be *incredibly* sloppy with their interfaces.  That being said, not
all library developers are as bad as GNOME.  :-)

 - Ted


Re: [PATCH v1] Travis: also test on 32-bit Linux

2017-03-10 Thread Jeff King
On Fri, Mar 10, 2017 at 12:13:11PM -0800, Junio C Hamano wrote:

> René Scharfe  writes:
> 
> >> I think this misses the other two cases: (*dst, src) and (*dst, *src).
> >
> > ... and that's why I left them out.  You can't get dst vs. *dst wrong
> > with structs (at least not without the compiler complaining); only
> > safe transformations are included in this round.
> 
> I haven't followed this discussion to the end, but the omission of 2
> out of obvious 4 did pique my curiosity when I saw it, too, and made
> me wonder if the omission was deliberate.  If so, it would be nice
> to state why in the log message (or in copy.cocci file itself as a
> comment).

Yeah, it definitely would be worth mentioning. I'm still undecided on
whether we want to be endorsing struct assignment more fully.

> It also made me wonder if we would be helped with a further
> combinatorial explosion from "T **dstp, **srcp" and somesuch (in
> other words, I am wondering why a rule for 'T *src' that uses '*src'
> need to be spelled out separately when there already is a good rule
> for 'T src' that uses 'src'---is that an inherent restriction of the
> tool?).

I had that thought, too, but I think the 4-way rules are necessary,
because the transformations aren't the same in each case. E.g., for the
four cases, the resulting assignments are:

  (dst, src):  dst = src;
 (dst, *src):  dst = *src;
 (*dst, src):  *dst = src;
(*dst, *src):  *dst = *src;

For pointer-to-pointer, I assumed the tool would handle that
automatically by matching "T" as "T*". Though if that is the case, I
think "(dst, src)" and "(*dst, *src)" would be equivalent (though of
course our rule matches are different, as you do not memcpy the raw
structs).

-Peff


Re: [PATCH] Makefile: detect errors in running spatch

2017-03-10 Thread Junio C Hamano
Jeff King  writes:

> It also doesn't help that shells are awkward at passing status out of a
> for-loop. I think the most "make-ish" way of doing this would actually
> be to lose the for loop and have a per-cocci-per-source target.

As we assume we can freely use GNUmake facilities, another option,
(i.e. the most "gnumake-ish" way) may be to have it unroll the loop
with $(foreach,...) so that the shell just sees a series of
commands.

> I don't know if that would make the patches harder to apply. The results
> aren't full patches, so I assume you usually do some kind of munging on
> them? I resorted to:
>
>   make coccicheck SPATCH='spatch --in-place'
>
>  Makefile | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 9ec6065cc..d97633892 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2336,9 +2336,17 @@ check: common-cmds.h
>  C_SOURCES = $(patsubst %.o,%.c,$(C_OBJ))
>  %.cocci.patch: %.cocci $(C_SOURCES)
>   @echo '' SPATCH $<; \
> + ret=0; \
>   for f in $(C_SOURCES); do \
> - $(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS); \
> - done >$@ 2>$@.log; \
> + $(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS) || \
> + { ret=$$?; break; }; \
> + done >$@+ 2>$@.log; \
> + if test $$ret != 0; \
> + then \
> + cat $@.log; \
> + exit 1; \
> + fi; \
> + mv $@+ $@; \
>   if test -s $@; \
>   then \
>   echo '' SPATCH result: $@; \


Re: [PATCH v1] Travis: also test on 32-bit Linux

2017-03-10 Thread Junio C Hamano
René Scharfe  writes:

>> I think this misses the other two cases: (*dst, src) and (*dst, *src).
>
> ... and that's why I left them out.  You can't get dst vs. *dst wrong
> with structs (at least not without the compiler complaining); only
> safe transformations are included in this round.

I haven't followed this discussion to the end, but the omission of 2
out of obvious 4 did pique my curiosity when I saw it, too, and made
me wonder if the omission was deliberate.  If so, it would be nice
to state why in the log message (or in copy.cocci file itself as a
comment).

It also made me wonder if we would be helped with a further
combinatorial explosion from "T **dstp, **srcp" and somesuch (in
other words, I am wondering why a rule for 'T *src' that uses '*src'
need to be spelled out separately when there already is a good rule
for 'T src' that uses 'src'---is that an inherent restriction of the
tool?).






Re: git commit --interactive patch-mode no longer allows selecting files

2017-03-10 Thread Jeff King
On Fri, Mar 10, 2017 at 11:57:04AM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > It's a bug. The fix is in c852bd54bd87fdcdc825f5d45c26aa745be13ba6, but
> > has not yet been merged to any integration branches. I hope it will make
> > it into v2.12.1.
> 
> Wow, you got me worried.  
> 
> It has been in 'pu' (and my private edition 'jch', which is
> somewhere between 'next' and 'pu') and marked as "Will merge to
> 'next'".

Yeah, sorry, I just meant "not in master or next". I did check that you
had picked it up.

-Peff


Re: git commit --interactive patch-mode no longer allows selecting files

2017-03-10 Thread Junio C Hamano
Jeff King  writes:

> It's a bug. The fix is in c852bd54bd87fdcdc825f5d45c26aa745be13ba6, but
> has not yet been merged to any integration branches. I hope it will make
> it into v2.12.1.

Wow, you got me worried.  

It has been in 'pu' (and my private edition 'jch', which is
somewhere between 'next' and 'pu') and marked as "Will merge to
'next'".




Re: [PATCH v2 1/2] pathspec: allow querying for attributes

2017-03-10 Thread Jonathan Tan
Thanks - I don't think I have any more comments on this patch set after 
these.


On 03/10/2017 10:59 AM, Brandon Williams wrote:

diff --git a/pathspec.c b/pathspec.c
index b961f00c8..7cd5f6e3d 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -87,6 +89,74 @@ static void prefix_magic(struct strbuf *sb, int prefixlen, 
unsigned magic)
strbuf_addf(sb, ",prefix:%d)", prefixlen);
 }

+static void parse_pathspec_attr_match(struct pathspec_item *item, const char 
*value)
+{
+   struct string_list_item *si;
+   struct string_list list = STRING_LIST_INIT_DUP;
+
+   if (item->attr_check)
+   die(_("Only one 'attr:' specification is allowed."));
+
+   if (!value || !*value)
+   die(_("attr spec must not be empty"));
+
+   string_list_split(&list, value, ' ', -1);
+   string_list_remove_empty_items(&list, 0);
+
+   item->attr_check = attr_check_alloc();
+   ALLOC_GROW(item->attr_match,
+  list.nr,
+  item->attr_match_alloc);


If item->attr_match always starts empty, then I think an xmalloc or 
xcalloc suffices (and we don't need item->attr_match_alloc anymore).


We should probably also check item->attr_match above - that is, `if 
(item->attr_check || item->attr_match)`.



+
+   for_each_string_list_item(si, &list) {
+   size_t attr_len;
+   char *attr_name;
+   const struct git_attr *a;
+
+   int j = item->attr_match_nr++;
+   const char *attr = si->string;
+   struct attr_match *am = &item->attr_match[j];
+
+   switch (*attr) {
+   case '!':
+   am->match_mode = MATCH_UNSPECIFIED;
+   attr++;
+   attr_len = strlen(attr);
+   break;
+   case '-':
+   am->match_mode = MATCH_UNSET;
+   attr++;
+   attr_len = strlen(attr);
+   break;
+   default:
+   attr_len = strcspn(attr, "=");
+   if (attr[attr_len] != '=')
+   am->match_mode = MATCH_SET;
+   else {
+   am->match_mode = MATCH_VALUE;
+   am->value = xstrdup(&attr[attr_len + 1]);
+   if (strchr(am->value, '\\'))
+   die(_("attr spec values must not contain backslashes"));
+   }
+   break;
+   }
+
+   attr_name = xmemdupz(attr, attr_len);
+   a = git_attr(attr_name);
+   if (!a)
+   die(_("invalid attribute name %s"), attr_name);
+
+   attr_check_append(item->attr_check, a);
+
+   free(attr_name);
+   }
+
+   if (item->attr_check->nr != item->attr_match_nr)
+   die("BUG: should have same number of entries");


I think such postcondition checks are usually not worth it, but others 
might differ.



+
+   string_list_clear(&list, 0);
+}
+
 static inline int get_literal_global(void)
 {
static int literal = -1;


Re: RFC v3: Another proposed hash function transition plan

2017-03-10 Thread Jonathan Nieder
Jeff King wrote:
> On Thu, Mar 09, 2017 at 12:24:08PM -0800, Jonathan Nieder wrote:

>>> SHA-1 to SHA-3: lookup SHA-1 in .msha1, reverse .idx, find offset to
>>> read the SHA-3.
>>> SHA-3 to SHA-1: lookup SHA-3 in .idx, and reverse the .msha1 file to
>>> translate offset to SHA-1.
>>
>> Thanks for this suggestion.  I was initially vaguely nervous about
>> lookup times in an idx-style file, but as you say, object reads from a
>> packfile already have to deal with this kind of lookup and work fine.
>
> Not exactly. The "reverse .idx" step has to build the reverse mapping on
> the fly, and it's non-trivial.

Sure.  To be clear, I was handwaving over that since adding an on-disk
reverse .idx is a relatively small change.

[...]
> So I think it's solvable, but I suspect we would want an extension to
> the .idx format to store the mapping array, in order to keep it log-n.

i.e., this.

The loose object side is the more worrying bit, since we currently don't
have any practical bound on the number of loose objects.

One way to deal with that is to disallow loose objects completely.
Use packfiles for new objects, batching the objects produced by a
single process into a single packfile.  Teach "git gc --auto" a
behavior similar to Martin Fick's "git exproll" to combine packfiles
between full gcs to maintain reasonable performance.  For unreachable
objects, instead of using loose objects, use "unreachable garbage"
packs explicitly labeled as such, with similar semantics to what
JGit's DfsRepository backend uses (described in the discussion at
https://git.eclipse.org/r/89455).

That's a direction that I want in the long term anyway.  I was hoping
not to couple such changes with the hash transition but it might be
one of the simpler ways to go.

Jonathan


Re: What's cooking in git.git (Mar 2017, #03; Wed, 8)

2017-03-10 Thread Junio C Hamano
"brian m. carlson"  writes:

> On Wed, Mar 08, 2017 at 03:47:20PM -0800, Junio C Hamano wrote:
>> * bc/object-id (2017-02-22) 19 commits
>>  - wt-status: convert to struct object_id
>>  - builtin/merge-base: convert to struct object_id
>>  - Convert object iteration callbacks to struct object_id
>>  - sha1_file: introduce an nth_packed_object_oid function
>>  - refs: simplify parsing of reflog entries
>>  - refs: convert each_reflog_ent_fn to struct object_id
>>  - reflog-walk: convert struct reflog_info to struct object_id
>>  - builtin/replace: convert to struct object_id
>>  - Convert remaining callers of resolve_refdup to object_id
>>  - builtin/merge: convert to struct object_id
>>  - builtin/clone: convert to struct object_id
>>  - builtin/branch: convert to struct object_id
>>  - builtin/grep: convert to struct object_id
>>  - builtin/fmt-merge-message: convert to struct object_id
>>  - builtin/fast-export: convert to struct object_id
>>  - builtin/describe: convert to struct object_id
>>  - builtin/diff-tree: convert to struct object_id
>>  - builtin/commit: convert to struct object_id
>>  - hex: introduce parse_oid_hex
>> 
>>  "uchar [40]" to "struct object_id" conversion continues.
>> 
>>  Now at v5.
>>  cf. <20170221234737.894681-1-sand...@crustytoothpaste.net>
>
> Were you expecting more work on this series?  I believe I've addressed
> all the review comments that were outstanding, but if I've missed
> something, please let me know.

I myself am not aware of anything outstanding, but I wasn't the one
with the most comments on this series, so we need help from others to
reconfirm "Yeah, this one is now good to go".

Thanks.




Re: [PATCH v3 2/2] submodule--helper.c: remove duplicate code

2017-03-10 Thread Stefan Beller
On Fri, Mar 10, 2017 at 11:42 AM, Junio C Hamano  wrote:
> Brandon Williams  writes:
>
>> On 03/09, Valery Tolstov wrote:
>>> Remove code fragment from module_clone that duplicates functionality
>>> of connect_work_tree_and_git_dir in dir.c
>>>
>>> Signed-off-by: Valery Tolstov 
>>
>> Looks good.
>
> I'll queue with your Reviewed-by: added.
>
> If sb/checkout-recurse-submodules is going to be rerolled, I'd
> appreciate if it includes this patch inserted at an appropriate
> place in the series, instead of me having to remember re-applying
> this patch every time it happens.

Instead of mixing these two series, can you just take Valery's series as is,
and have sb/checkout-recurse-submodules build on top of it when rerolled?

Thanks,
Stefan

>
> Thanks.


Re: [PATCH 02/10] pack-objects: add --partial-by-size=n --partial-special

2017-03-10 Thread Jeff King
On Fri, Mar 10, 2017 at 11:38:10AM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > I think we ended up deciding that it would be better to just disallow
> > symlink .gitattributes (and .git*) from entering the index, the way we
> > disallow ".git".
> 
> Hmph, I thought we would need both, though.  Or do we specifically
> want to honor untracked .gitattributes that is left as a symlink
> pointing to elsewhere in the filesystem or something like that?

I wasn't going to worry about an untracked .gitattributes. The reasons
for disallowing symlinked .gitattributes are:

  - it doesn't behave the same when accessed internally via the tree
objects

  - malicious symlinks that try to leave the repository

Neither of those issues is at play if your symlink .gitattributes file
isn't tracked. So there's some inconsistency in the sense that it
"works" until you try to "git add" it. But either you aren't going to
add it (in which case it's a feature that it works), or you are going to
add it, and you'll get notified then that it's disallowed.

-Peff


Re: [RFC][PATCH] index-pack: add testcases found using AFL

2017-03-10 Thread Jeff King
On Fri, Mar 10, 2017 at 08:34:45PM +0100, Vegard Nossum wrote:

> > That's something I guess, but I'm not enthused by the idea of just
> > dumping a bunch of binary test cases that nobody, not even the author,
> > understands.
> 
> I understand your concern. This is how I see it:
> 
> Negatives:
> 
>  - 'make test' takes 1 second longer to run
> 
>  - 548K data added to git.git

My real concern is that this is the tip of the ice berg. So we increased
coverage in one program by a few percent. But wouldn't this procedure be
applicable to lots of _other_ parts of Git, too?

IOW, I'm worried about a day when we've added dozens or hundreds of
seconds to the test suite. Sure, we can quit adding at any time, but I
feel like it's easier to make a decision from the outset.

I'm tempted to say these should go into a different test-suite, or be
marked with a special flag or something. But then I guess nobody runs
them.

-Peff


Re: [PATCH v3 2/2] submodule--helper.c: remove duplicate code

2017-03-10 Thread Junio C Hamano
Brandon Williams  writes:

> On 03/09, Valery Tolstov wrote:
>> Remove code fragment from module_clone that duplicates functionality
>> of connect_work_tree_and_git_dir in dir.c
>> 
>> Signed-off-by: Valery Tolstov 
>
> Looks good.

I'll queue with your Reviewed-by: added.

If sb/checkout-recurse-submodules is going to be rerolled, I'd
appreciate if it includes this patch inserted at an appropriate
place in the series, instead of me having to remember re-applying
this patch every time it happens.

Thanks.


Re: RFC v3: Another proposed hash function transition plan

2017-03-10 Thread Jeff King
On Thu, Mar 09, 2017 at 12:24:08PM -0800, Jonathan Nieder wrote:

> > SHA-1 to SHA-3: lookup SHA-1 in .msha1, reverse .idx, find offset to
> > read the SHA-3.
> > SHA-3 to SHA-1: lookup SHA-3 in .idx, and reverse the .msha1 file to
> > translate offset to SHA-1.
> 
> Thanks for this suggestion.  I was initially vaguely nervous about
> lookup times in an idx-style file, but as you say, object reads from a
> packfile already have to deal with this kind of lookup and work fine.

Not exactly. The "reverse .idx" step has to build the reverse mapping on
the fly, and it's non-trivial. For instance, try:

  sha1=$(git rev-parse HEAD)
  time echo $sha1 | git cat-file --batch-check='%(objectsize)'
  time echo $sha1 | git cat-file --batch-check='%(objectsize:disk)'

on a large repo (where HEAD is in a big pack). The on-disk size is
conceptually simpler, as we only need to look at the offset of the
object versus the offset of the object after it. But in practice it
takes much longer, because it has to build the revindex on the fly (I
get 7ms versus 179ms on linux.git).

The effort is linear in the number of objects (we create the revindex
with a radix sort).

The reachability bitmaps suffer from this, too, as they need the
revindex to know which object is at which bit position. At GitHub we
added an extension to the .bitmap files that stores this "bit cache".
Here are timings before and after on linux.git:

  $ time git rev-list --use-bitmap-index --count master
  659371

  real  0m0.182s
  user  0m0.136s
  sys   0m0.044s

  $ time git.gh rev-list --use-bitmap-index --count master
  659371

  real  0m0.016s
  user  0m0.008s
  sys   0m0.004s

It's not a full revindex, but it's enough for bitmap use. You can also
use it to generate the revindex slightly more quickly, because you can
skip the sorting step (you just insert the entries in the correct order
by walking the bit cache and dereferencing the offsets from the .idx
portion). So it's still linear, but with a smaller constant factor.

I think for the purposes here, though, we don't actually care about the
offsets. For the cost of one uint32_t per object, you can keep a list
mapping positions in the sha1 index into the sha3 index. So then you do
the log-n binary search to find the sha1, a constant-time lookup in the
mapping array, and that gives you the position in the sha3 index, from
which you can then access the sha3 (or the actual pack offset, for that
matter).

So I think it's solvable, but I suspect we would want an extension to
the .idx format to store the mapping array, in order to keep it log-n.
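
A rough sketch of that lookup, with made-up structure and field names
(none of this is existing git code), might look like:

  #include <stdint.h>
  #include <string.h>

  #define SHA1_RAWSZ 20
  #define SHA3_RAWSZ 32

  struct idx_mapping {
          uint32_t nr;                     /* number of objects in the pack */
          const unsigned char *sha1_names; /* nr * SHA1_RAWSZ, sorted */
          const unsigned char *sha3_names; /* nr * SHA3_RAWSZ, sorted */
          const uint32_t *map_to_sha3;     /* sha1 position -> sha3 position */
  };

  /* Return the SHA-3 name for a SHA-1 name, or NULL if not found. */
  static const unsigned char *sha1_to_sha3(const struct idx_mapping *m,
                                           const unsigned char *sha1)
  {
          uint32_t lo = 0, hi = m->nr;

          while (lo < hi) {               /* O(log n) binary search */
                  uint32_t mi = lo + (hi - lo) / 2;
                  int cmp = memcmp(m->sha1_names + mi * SHA1_RAWSZ,
                                   sha1, SHA1_RAWSZ);
                  if (!cmp)               /* O(1) hop through the mapping */
                          return m->sha3_names +
                                 m->map_to_sha3[mi] * (size_t)SHA3_RAWSZ;
                  if (cmp < 0)
                          lo = mi + 1;
                  else
                          hi = mi;
          }
          return NULL;
  }

The mapping array is the only extra piece the .idx extension would need
to carry; the two sorted name tables are already there.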

-Peff


Re: [PATCH 02/10] pack-objects: add --partial-by-size=n --partial-special

2017-03-10 Thread Junio C Hamano
Jeff King  writes:

> I think we ended up deciding that it would be better to just disallow
> symlink .gitattributes (and .git*) from entering the index, the way we
> disallow ".git".

Hmph, I thought we would need both, though.  Or do we specifically
want to honor untracked .gitattributes that is left as a symlink
pointing to elsewhere in the filesystem or something like that?



Re: [RFC][PATCH] index-pack: add testcases found using AFL

2017-03-10 Thread Vegard Nossum

On 10/03/2017 20:06, Jeff King wrote:

On Fri, Mar 10, 2017 at 04:15:56PM +0100, Vegard Nossum wrote:


I've used AFL to generate a corpus of pack files that maximises the edge
coverage for 'git index-pack'.

This is a supplement to (and not a replacement for) the regular test cases
where we know exactly what each test is checking for. These testcases are
more useful for avoiding regressions in edge cases or as a starting point
for future fuzzing efforts.

To see the output of running 'git index-pack' on each file, you can do
something like this:

  make -C t GIT_TEST_OPTS="--run=34 --verbose" t5300-pack-object.sh

I observe the following coverage changes (for t5300 only):

  path                  old%  new%    pp
  --------------------  ----  ----  ----
  builtin/index-pack.c  74.3  76.6   2.3
  pack-write.c  79.8  80.4.6
  patch-delta.c 67.4  81.4  14.0
  usage.c   26.6  35.5   8.9
  wrapper.c 42.0  46.1   4.1
  zlib.c58.7  64.1   5.4


I'm not sure how I feel about this. More coverage is good, I guess, but
we don't have any idea what these packfiles are doing, or whether
index-pack is behaving sanely in the new lines. The most we can say is
that we tested more lines of code and that nothing segfaulted or
triggered something like ASAN.

That's something I guess, but I'm not enthused by the idea of just
dumping a bunch of binary test cases that nobody, not even the author,
understands.


I understand your concern. This is how I see it:

Negatives:

 - 'make test' takes 1 second longer to run

 - 548K data added to git.git

Positives:

 - regularly exercising more of the code, especially some of the corner
cases which are not caught by the rest of the test suite, possibly
catching bugs in a security-critical bit of git before it makes it into
a release

 - no impact to existing code, everything self-contained in 1 directory

 - giving more people access to the testcases I discovered without
having to repeat the effort of setting up AFL, fixing up SHA1 checksums,
minimising the corpus, running AFL for a week, etc. (each step by itself
is pretty small, but taken altogether I think it's worthwhile not
to have to repeat that).

Then I guess you have to weigh the negatives and positives. For me it's
a clear net win, but others may see it differently.

For sure, I (or somebody else) can go through each testcase and figure
out what it's doing, what it's doing differently from the existing
manual testcases in t5300, and what its expected output should be. It's
not that I couldn't understand what each testcase is doing if I tried,
but I don't think it's worth the effort. I've run everything under
valgrind and the only thing that turned up were some suspicious-looking
allocations (which AFAICT should be safe to ignore because of git's
built-in limits). Otherwise it's mostly hitting sanity checks:

error: inflate: data stream error (incorrect header check)
fatal: pack has 4 unresolved deltas
fatal: pack has bad object at offset 12: delta base offset is out of bound
fatal: invalid tag
...

These are errors you wouldn't see normally and which the existing
testcases don't check for (or not exhaustively or systematically, in any
case).

If somebody were to look at the code and say: "hey, that check looks a
bit off" (which is something I personally do all the time), then being
able to quickly find an input to execute exactly that line of code is
extremely valuable -- and you can do that simply by running the
testcases through gcov.

Anyway, the patch/data is there, use it or don't.


Vegard


Re: [PATCH v5 04/11] setup_git_directory_1(): avoid changing global state

2017-03-10 Thread Junio C Hamano
Johannes Schindelin  writes:

>  /*
>   * We cannot decide in this function whether we are in the work tree or
>   * not, since the config can only be read _after_ this function was called.
> + *
> + * Also, we avoid changing any global state (such as the current working
> + * directory) to allow early callers.
> + *
> + * The directory where the search should start needs to be passed in via the
> + * `dir` parameter; upon return, the `dir` buffer will contain the path of
> + * the directory where the search ended, and `gitdir` will contain the path 
> of
> + * the discovered .git/ directory, if any. This path may be relative against
> + * `dir` (i.e. *not* necessarily the cwd).

I had to read the last sentence three times because my earlier two
attempts misread/misinterpreted it as "this may be relative to cwd, but
not necessarily, because this may be relative to dir".  The correct
reading is "when this is relative, it is relative to dir.  It would
not be relative to cwd if you start with a dir that is not cwd".

It would be nicer if all readers can get to that without re-reading
three times.

> -static const char *setup_git_directory_gently_1(int *nongit_ok)
> +static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
> ...
> @@ -889,63 +888,104 @@ static const char *setup_git_directory_gently_1(int 
> *nongit_ok)
>*/
>   one_filesystem = !git_env_bool("GIT_DISCOVERY_ACROSS_FILESYSTEM", 0);
>   if (one_filesystem)
> - current_device = get_device_or_die(".", NULL, 0);
> + current_device = get_device_or_die(dir->buf, NULL, 0);
>   for (;;) {
> - gitfile = (char*)read_gitfile(DEFAULT_GIT_DIR_ENVIRONMENT);
> - if (gitfile)
> - gitdirenv = gitfile = xstrdup(gitfile);
> - else {
> - if (is_git_directory(DEFAULT_GIT_DIR_ENVIRONMENT))
> - gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT;
> + int offset = dir->len;
> +
> + if (offset > min_offset)
> + strbuf_addch(dir, '/');
> + strbuf_addstr(dir, DEFAULT_GIT_DIR_ENVIRONMENT);
> + gitdirenv = read_gitfile(dir->buf);
> + if (!gitdirenv && is_git_directory(dir->buf))
> + gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT;
> + strbuf_setlen(dir, offset);
> + if (gitdirenv) {
> + strbuf_addstr(gitdir, gitdirenv);
> + return GIT_DIR_DISCOVERED;
>   }

I commented on this part more extensively in a later step of the
series after the read_gitfile() call is replaced with the non-dying
version.  I see that issues in corner cases are inherited from the
code even before this step.  

We'd either want to at least document the corner cases that are not
handled with /* NEEDSWORK: */ in-code comments, or address them in
a follow-up series before we forget.  They are not new problems, so
I am OK if we choose to leave them broken, as people haven't triggered
them in the current code.

Thanks.


Re: [GSoC] Discussion of "Submodule related work" project

2017-03-10 Thread Valery Tolstov
So... I thought the items listed in "Submodule related work" were
considered too small to be complete projects on their own, and that
they are just "subprojects" of a bigger project (maybe I have this
impression because I can't estimate their complexity before truly
digging in).  In your response you talk about them as independent
projects...  Does this mean I can take only one of them as the starting
point for my proposal?  Or maybe I misunderstood you?


Thanks,
  Valery Tolstov


Re: [RFC][PATCH] index-pack: add testcases found using AFL

2017-03-10 Thread Jeff King
[Note: your original email didn't make it to the list because it's over
100K; I'll quote liberally].

On Fri, Mar 10, 2017 at 04:15:56PM +0100, Vegard Nossum wrote:

> I've used AFL to generate a corpus of pack files that maximises the edge
> coverage for 'git index-pack'.
> 
> This is a supplement to (and not a replacement for) the regular test cases
> where we know exactly what each test is checking for. These testcases are
> more useful for avoiding regressions in edge cases or as a starting point
> for future fuzzing efforts.
> 
> To see the output of running 'git index-pack' on each file, you can do
> something like this:
> 
>   make -C t GIT_TEST_OPTS="--run=34 --verbose" t5300-pack-object.sh
> 
> I observe the following coverage changes (for t5300 only):
> 
>   path                  old%  new%    pp
>   --------------------  ----  ----  ----
>   builtin/index-pack.c  74.3  76.6   2.3
>   pack-write.c  79.8  80.4.6
>   patch-delta.c 67.4  81.4  14.0
>   usage.c   26.6  35.5   8.9
>   wrapper.c 42.0  46.1   4.1
>   zlib.c58.7  64.1   5.4

I'm not sure how I feel about this. More coverage is good, I guess, but
we don't have any idea what these packfiles are doing, or whether
index-pack is behaving sanely in the new lines. The most we can say is
that we tested more lines of code and that nothing segfaulted or
triggered something like ASAN.

That's something I guess, but I'm not enthused by the idea of just
dumping a bunch of binary test cases that nobody, not even the author,
understands.

-Peff


Re: [PATCH v5 09/11] Test read_early_config()

2017-03-10 Thread Junio C Hamano
Johannes Schindelin  writes:

> Subject: Re: [PATCH v5 09/11] Test read_early_config()

Let's retitle it to

t1309: test read_early_config()

> So far, we have had no explicit tests of that function.
>
> Signed-off-by: Johannes Schindelin 
> ---
>  t/helper/test-config.c  | 15 +++
>  t/t1309-early-config.sh | 50 
> +
>  2 files changed, 65 insertions(+)
>  create mode 100755 t/t1309-early-config.sh
>
> diff --git a/t/helper/test-config.c b/t/helper/test-config.c
> index 83a4f2ab869..8e3ed6a76cb 100644
> --- a/t/helper/test-config.c
> +++ b/t/helper/test-config.c
> @@ -66,6 +66,16 @@ static int iterate_cb(const char *var, const char *value, 
> void *data)
>   return 0;
>  }
>  
> +static int early_config_cb(const char *var, const char *value, void *vdata)
> +{
> + const char *key = vdata;
> +
> + if (!strcmp(key, var))
> + printf("%s\n", value);
> +
> + return 0;
> +}
> +
>  int cmd_main(int argc, const char **argv)
>  {
>   int i, val;
> @@ -73,6 +83,11 @@ int cmd_main(int argc, const char **argv)
>   const struct string_list *strptr;
>   struct config_set cs;
>  
> + if (argc == 3 && !strcmp(argv[1], "read_early_config")) {
> + read_early_config(early_config_cb, (void *)argv[2]);
> + return 0;
> + }
> +
>   setup_git_directory();

Makes perfect sense to have this as the very beginning, before we
even do the usual setup ;-)



Re: [PATCH 2/2] pathspec: allow escaped query values

2017-03-10 Thread Brandon Williams
On 03/09, Jonathan Tan wrote:
> On 03/09/2017 01:07 PM, Brandon Williams wrote:
> >diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
> >index b5e5a0607..585d17bad 100755
> >--- a/t/t6135-pathspec-with-attrs.sh
> >+++ b/t/t6135-pathspec-with-attrs.sh
> >@@ -178,4 +178,13 @@ test_expect_success 'abort on asking for wrong magic' '
> > test_must_fail git ls-files . ":(attr:!label=foo)"
> > '
> >
> >+test_expect_success 'check attribute list' '
> >+cat <<-EOF >>.gitattributes &&
> >+* whitespace=indent,trail,space
> >+EOF
> >+git ls-files ":(attr:whitespace=indent\,trail\,space)" >actual &&
> >+git ls-files >expect &&
> >+test_cmp expect actual
> >+'
> >+
> > test_done
> 
> Is there a way to verify that `\,` is not escaped by the shell into `,`?

You can run with GIT_TRACE=1 to see the actual string passed to git.
'\,' is indeed passed to git with no problems. 

> 
> Maybe also add tests that show \ as the last character and \
> escaping another \.

Done

-- 
Brandon Williams


[PATCH v2 1/2] pathspec: allow querying for attributes

2017-03-10 Thread Brandon Williams
The pathspec mechanism is extended via the new
":(attr:eol=input)pattern/to/match" syntax to filter paths so that it
requires paths to not just match the given pattern but also have the
specified attrs attached for them to be chosen.

Based on a patch by Stefan Beller 
Signed-off-by: Brandon Williams 
---
 Documentation/glossary-content.txt |  21 +
 attr.c |  17 
 attr.h |   1 +
 dir.c  |  43 -
 pathspec.c | 117 +++-
 pathspec.h |  16 +++-
 t/t6135-pathspec-with-attrs.sh | 181 +
 7 files changed, 387 insertions(+), 9 deletions(-)
 create mode 100755 t/t6135-pathspec-with-attrs.sh

diff --git a/Documentation/glossary-content.txt 
b/Documentation/glossary-content.txt
index fc9320e59..6e991c246 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -384,6 +384,27 @@ full pathname may have special meaning:
 +
 Glob magic is incompatible with literal magic.
 
+attr;;
+After `attr:` comes a space separated list of "attribute
+requirements", all of which must be met in order for the
+path to be considered a match; this is in addition to the
+usual non-magic pathspec pattern matching.
+See linkgit:gitattributes[5].
++
+Each of the attribute requirements for the path takes one of
+these forms:
+
+- "`ATTR`" requires that the attribute `ATTR` be set.
+
+- "`-ATTR`" requires that the attribute `ATTR` be unset.
+
+- "`ATTR=VALUE`" requires that the attribute `ATTR` be
+  set to the string `VALUE`.
+
+- "`!ATTR`" requires that the attribute `ATTR` be
+  unspecified.
++
+
 exclude;;
After a path matches any non-exclude pathspec, it will be run
through all exclude pathspec (magic signature: `!` or its
diff --git a/attr.c b/attr.c
index 5493bff22..7e2134471 100644
--- a/attr.c
+++ b/attr.c
@@ -603,6 +603,23 @@ struct attr_check *attr_check_initl(const char *one, ...)
return check;
 }
 
+struct attr_check *attr_check_dup(const struct attr_check *check)
+{
+   struct attr_check *ret;
+
+   if (!check)
+   return NULL;
+
+   ret = attr_check_alloc();
+
+   ret->nr = check->nr;
+   ret->alloc = check->alloc;
+   ALLOC_ARRAY(ret->items, ret->nr);
+   COPY_ARRAY(ret->items, check->items, ret->nr);
+
+   return ret;
+}
+
 struct attr_check_item *attr_check_append(struct attr_check *check,
  const struct git_attr *attr)
 {
diff --git a/attr.h b/attr.h
index 48ab3e1c2..442d464db 100644
--- a/attr.h
+++ b/attr.h
@@ -44,6 +44,7 @@ struct attr_check {
 
 extern struct attr_check *attr_check_alloc(void);
 extern struct attr_check *attr_check_initl(const char *, ...);
+extern struct attr_check *attr_check_dup(const struct attr_check *check);
 
 extern struct attr_check_item *attr_check_append(struct attr_check *check,
 const struct git_attr *attr);
diff --git a/dir.c b/dir.c
index 4541f9e14..2fe7acbcf 100644
--- a/dir.c
+++ b/dir.c
@@ -9,6 +9,7 @@
  */
 #include "cache.h"
 #include "dir.h"
+#include "attr.h"
 #include "refs.h"
 #include "wildmatch.h"
 #include "pathspec.h"
@@ -134,7 +135,8 @@ static size_t common_prefix_len(const struct pathspec 
*pathspec)
   PATHSPEC_LITERAL |
   PATHSPEC_GLOB |
   PATHSPEC_ICASE |
-  PATHSPEC_EXCLUDE);
+  PATHSPEC_EXCLUDE |
+  PATHSPEC_ATTR);
 
for (n = 0; n < pathspec->nr; n++) {
size_t i = 0, len = 0, item_len;
@@ -209,6 +211,36 @@ int within_depth(const char *name, int namelen,
 #define DO_MATCH_DIRECTORY (1<<1)
 #define DO_MATCH_SUBMODULE (1<<2)
 
+static int match_attrs(const char *name, int namelen,
+  const struct pathspec_item *item)
+{
+   int i;
+
+   git_check_attr(name, item->attr_check);
+   for (i = 0; i < item->attr_match_nr; i++) {
+   const char *value;
+   int matched;
+   enum attr_match_mode match_mode;
+
+   value = item->attr_check->items[i].value;
+   match_mode = item->attr_match[i].match_mode;
+
+   if (ATTR_TRUE(value))
+   matched = (match_mode == MATCH_SET);
+   else if (ATTR_FALSE(value))
+   matched = (match_mode == MATCH_UNSET);
+   else if (ATTR_UNSET(value))
+   matched = (match_mode == MATCH_UNSPECIFIED);
+   else
+   matched = (match_mode == MATCH_VALUE &&
+  !strcmp(item->attr_match[i].value, value));
+   if (!matched)
+   return 0;
+   }
+
+   return 1;
+}
+
 /*
  * Does 'match' match the given name?
  

[PATCH v2 2/2] pathspec: allow escaped query values

2017-03-10 Thread Brandon Williams
In our own .gitattributes file we have attributes such as:

*.[ch] whitespace=indent,trail,space

When querying for attributes we want to be able to ask for the exact
value, i.e.

git ls-files :(attr:whitespace=indent,trail,space)

should work, but the commas are used in the attr magic to introduce
the next attr, such that this query currently fails with

fatal: Invalid pathspec magic 'trail' in ':(attr:whitespace=indent,trail,space)'

This change allows escaping characters by a backslash, such that the query

git ls-files :(attr:whitespace=indent\,trail\,space)

will match all paths that have the value "indent,trail,space" for the
whitespace attribute. To accomplish this, we need to modify two places.
First `parse_long_magic` needs to not stop early upon seeing a comma or
closing paren that is escaped. As a second step we need to remove any
escaping from the attr value.

Based on a patch by Stefan Beller 
Signed-off-by: Brandon Williams 
---
 pathspec.c | 52 ++
 t/t6135-pathspec-with-attrs.sh | 19 +++
 2 files changed, 67 insertions(+), 4 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index 7cd5f6e3d..d7956f6bf 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -89,6 +89,51 @@ static void prefix_magic(struct strbuf *sb, int prefixlen, 
unsigned magic)
strbuf_addf(sb, ",prefix:%d)", prefixlen);
 }
 
+static size_t strcspn_escaped(const char *s, const char *stop)
+{
+   const char *i;
+
+   for (i = s; *i; i++) {
+   /* skip the escaped character */
+   if (i[0] == '\\' && i[1]) {
+   i++;
+   continue;
+   }
+
+   if (strchr(stop, *i))
+   break;
+   }
+   return i - s;
+}
+
+static inline int invalid_value_char(const char ch)
+{
+   if (isalnum(ch) || strchr(",-_", ch))
+   return 0;
+   return -1;
+}
+
+static char *attr_value_unescape(const char *value)
+{
+   const char *src;
+   char *dst, *ret;
+
+   ret = xmallocz(strlen(value));
+   for (src = value, dst = ret; *src; src++, dst++) {
+   if (*src == '\\') {
+   if (!src[1])
+   die(_("Escape character '\\' not allowed as "
+ "last character in attr value"));
+   src++;
+   }
+   if (invalid_value_char(*src))
+   die("cannot use '%c' for value matching", *src);
+   *dst = *src;
+   }
+   *dst = '\0';
+   return ret;
+}
+
 static void parse_pathspec_attr_match(struct pathspec_item *item, const char 
*value)
 {
struct string_list_item *si;
@@ -133,10 +178,9 @@ static void parse_pathspec_attr_match(struct pathspec_item 
*item, const char *va
if (attr[attr_len] != '=')
am->match_mode = MATCH_SET;
else {
+   const char *v = &attr[attr_len + 1];
am->match_mode = MATCH_VALUE;
-   am->value = xstrdup(&attr[attr_len + 1]);
-   if (strchr(am->value, '\\'))
-   die(_("attr spec values must not contain backslashes"));
+   am->value = attr_value_unescape(v);
}
break;
}
@@ -241,7 +285,7 @@ static const char *parse_long_magic(unsigned *magic, int 
*prefix_len,
const char *nextat;
 
for (pos = elem + 2; *pos && *pos != ')'; pos = nextat) {
-   size_t len = strcspn(pos, ",)");
+   size_t len = strcspn_escaped(pos, ",)");
int i;
 
if (pos[len] == ',')
diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
index b5e5a0607..f60af29a9 100755
--- a/t/t6135-pathspec-with-attrs.sh
+++ b/t/t6135-pathspec-with-attrs.sh
@@ -178,4 +178,23 @@ test_expect_success 'abort on asking for wrong magic' '
test_must_fail git ls-files . ":(attr:!label=foo)"
 '
 
+test_expect_success 'check attribute list' '
+   cat <<-EOF >>.gitattributes &&
+   * whitespace=indent,trail,space
+   EOF
+   git ls-files ":(attr:whitespace=indent\,trail\,space)" >actual &&
+   git ls-files >expect &&
+   test_cmp expect actual
+'
+
+test_expect_success 'backslash cannot be the last character' '
+   test_must_fail git ls-files ":(attr:label=foo\\ labelA=bar)" 2>actual &&
+   test_i18ngrep "not allowed as last character in attr value" actual
+'
+
+test_expect_success 'backslash cannot be used as a value' '
+   test_must_fail git ls-files ":(attr:label=f\\\oo)" 2>actual &&
+   test_i18ngrep "for value matching" actual
+'
+
 test_done
-- 
2.12.0.246.ga2ecc84866-goog



[PATCH v2 0/2] bringing attributes to pathspecs

2017-03-10 Thread Brandon Williams
v2 addresses the comments made by Jonathan.

Brandon Williams (2):
  pathspec: allow querying for attributes
  pathspec: allow escaped query values

 Documentation/glossary-content.txt |  21 
 attr.c |  17 
 attr.h |   1 +
 dir.c  |  43 +++-
 pathspec.c | 163 --
 pathspec.h |  16 ++-
 t/t6135-pathspec-with-attrs.sh | 200 +
 7 files changed, 451 insertions(+), 10 deletions(-)
 create mode 100755 t/t6135-pathspec-with-attrs.sh

-- 
2.12.0.246.ga2ecc84866-goog



Re: [PATCH v5 10/11] setup_git_directory_gently_1(): avoid die()ing

2017-03-10 Thread Junio C Hamano
Johannes Schindelin  writes:

> @@ -890,14 +892,22 @@ static enum discovery_result 
> setup_git_directory_gently_1(struct strbuf *dir,
>   if (one_filesystem)
>   current_device = get_device_or_die(dir->buf, NULL, 0);
>   for (;;) {
> - int offset = dir->len;
> + int offset = dir->len, error_code = 0;
>  
>   if (offset > min_offset)
>   strbuf_addch(dir, '/');
>   strbuf_addstr(dir, DEFAULT_GIT_DIR_ENVIRONMENT);
> - gitdirenv = read_gitfile(dir->buf);
> - if (!gitdirenv && is_git_directory(dir->buf))
> - gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT;
> + gitdirenv = read_gitfile_gently(dir->buf, die_on_error ?
> + NULL : &error_code);
> + if (!gitdirenv) {

We tried to read ".git" as a regular file, but couldn't.  There are
several cases here, and I am having trouble convincing myself that all
of them are covered correctly, so let me walk through the code aloud.

> + if (die_on_error ||
> + error_code == READ_GITFILE_ERR_NOT_A_FILE) {
> + if (is_git_directory(dir->buf))
> + gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT;

If we are told to die on error, but it is a Git directory, then
we do not have to (and do not want to) die; instead we report that
directory as discovered.

If we are to report the failure status instead of dying, we also
want to recover when read_gitfile_gently() failed only because the
path was a real Git directory.  READ_GITFILE_ERR_NOT_A_FILE could be a
signal for that, and we recover after making sure it is a Git
directory.

Among the READ_GITFILE_ERR_* codes, ERR_NOT_A_FILE is the only one
we attempt this recovery.  All others just take the "else if" thing
below.

What happens when the is_git_directory() test here does not pass,
though?  Let's say stat() in read_gitfile_gently() found a FIFO
there; it gives us ERR_NOT_A_FILE, is_git_directory() would say
"Nah", and then ...?  Don't we want the other side of this if()
statement to return failure?

> + } else if (error_code &&
> +error_code != READ_GITFILE_ERR_STAT_FAILED)
> + return GIT_DIR_INVALID_GITFILE;
> + }

On the other hand, if read_gitfile_gently() didn't say "we found
something that is not a regular file" we come here.  If the error
came because there wasn't ".git" in the directory we are looking at,
i.e. stat(2) in read_gitfile_gently() gave ENOENT, it is perfectly
normal and we just want to go one level up.

But shouldn't read_gitfile_gently() be inspecting errno when it sees
a stat() failure?  If there _is_ ".git" but we failed to stat it for
whatever reason (EACCES, ELOOP,...), we do not want to go up but
give the INVALID_GITFILE from here, just like other cases, no?
For that I imagine that ERR_STAT_FAILED needs to be split into two,
one for true ERR_STAT_FAILED (from which we cannot recover) and the
other for ERR_ENOENT to signal us that there is nothing there (which
allows us to go up).
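
To illustrate the split, here is a simplified stand-in (the ERR_ENOENT
value and the helper are invented for this sketch; it is not the actual
setup.c code, only the overall shape matters):

  #include <errno.h>
  #include <sys/stat.h>

  enum {
          READ_GITFILE_ERR_STAT_FAILED = 1, /* unexpected stat() failure */
          READ_GITFILE_ERR_ENOENT,          /* nothing here: go one level up */
          READ_GITFILE_ERR_NOT_A_FILE
  };

  static int classify_gitfile_stat(const char *path, struct stat *st)
  {
          if (stat(path, st)) {
                  if (errno == ENOENT)
                          return READ_GITFILE_ERR_ENOENT; /* recoverable */
                  /* EACCES, ELOOP, ...: do not silently walk up */
                  return READ_GITFILE_ERR_STAT_FAILED;
          }
          if (!S_ISREG(st->st_mode))
                  return READ_GITFILE_ERR_NOT_A_FILE;
          return 0;
  }

With something like that, only ERR_ENOENT would let the caller keep
walking up, and every other stat() failure would surface as an error.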

By the way, is the "error_code &&" needed?  Unless the original path
given to read_gitfile_gently() is a NULL pointer, the only time its
return value is NULL is when it has non-zero error_code.


>   strbuf_setlen(dir, offset);
>   if (gitdirenv) {
>   strbuf_addstr(gitdir, gitdirenv);
> @@ -934,7 +944,7 @@ const char *discover_git_directory(struct strbuf *gitdir)
>   return NULL;
>  
>   cwd_len = dir.len;
> - if (setup_git_directory_gently_1(&dir, gitdir) <= 0) {
> + if (setup_git_directory_gently_1(&dir, gitdir, 0) <= 0) {
> strbuf_release(&dir);
>   return NULL;
>   }
> @@ -994,7 +1004,7 @@ const char *setup_git_directory_gently(int *nongit_ok)
>   die_errno(_("Unable to read current working directory"));
> strbuf_addbuf(&dir, &cwd);
>  
> - switch (setup_git_directory_gently_1(&dir, &gitdir)) {
> + switch (setup_git_directory_gently_1(&dir, &gitdir, 1)) {
>   case GIT_DIR_NONE:
>   prefix = NULL;
>   break;


Re: Stable GnuPG interface, git should use GPGME

2017-03-10 Thread Linus Torvalds
On Fri, Mar 10, 2017 at 2:00 AM, Bernhard E. Reiter wrote:
>
> git uses an pipe-and-exec approach to running a GnuPG binary
> as writen in the documentation [1]:
>
> gpg.program
>Use this custom program instead of "gpg" found on $PATH when making
>or verifying a PGP signature. The program must support the same
>command-line interface as GPG
>
> please consider using libgpgme interfacing to GnuPG, because the gpg
> command-line interface is not considered an official API to GnuPG by the
> GnuPG-devs and thus potentially unstable.

Quite frankly, I will NAK this just based on previous bad experiences
with using "helpful" libraries.

Maybe you can lay my worries to rest, but the problems with libraries
in this context tend to be

 - hard to personalize.

   At least right now, we just allow people to set their gpg binary.
I'm betting that the library would pick whatever random preferred
version, and in the process possibly screw us.

   Example: what if somebody is actually using another pgp
implementation entirely for some reason, and is just scripting around
it?

   Or maybe he's using the regular gnupg, but using different keys for
different projects (using "--homedir"). That's trivial with the
current model. How trivial is that with a library?

 - existing configuration

   This is the main problem I've seen in the past. Using the "ssh"
_program_ is easy. You add your keys, your config files, whatever, and
it "just works" (or rather, you fight it once and it definitely
doesn't "just" work, but then you copy your .ssh directory around for
the rest of your life and forget how it ever worked, but it does).

   Using "libssh2" is an exercise in futility, and you have to do a
crazy amount of stupid "look up keys" and simple configuration in your
.ssh/config (like per-host keys, hostname swizzling etc) just don't
pick up the configurations you already did for the program.

 - UI

   For things like gpg, the UI is traditionally horrible. But there
tend to be various things like password agents that help with caching
passwords and/or add a graphical UI to get the password when
necessary.

 - library versioning.

   I don't know why, but I've never *ever* met a library developer who
realized that libraries were all about stable API's, and the library
users don't want to fight different versions.

   And to make matters worse, the different versions (particularly if
you end up having to use a development version due to bugs or required
features etc) are always horribly hard to even detect at
build-time automatically with simple #ifdef etc, so now you have to do
autoconf crap etc.

Now, it may be that the gpgme library "just works" across
architectures and handles all of the above situations as gracefully as
the external program does. In that case - but _ONLY_ in that case -
would a switch-over to the library possibly be a good thing.

I'd be pleasantly surprised. But I *would* be surprised, because every
time I've seen that "library vs program" model, I've seen the above
issues.

In fact, we have those exact issues very much in git itself too. Yes,
I've used libgit2 (for subsurface). It's a pain in the arse to do
*exactly* the above kinds of things, and the thing is, that isn't
git-specific.

So I'm very down on using external libraries unless they are stable
and have no need for configuration etc. Things like zlib is fine -
there just isn't much to configure outside of the "just how hard do
you want me to try to compress". Nobody has a ".zlib/config" file that
you need to worry about accessing etc.

Of course, maybe gpgme is a world first, and actually does read your
.gnupg/config file trivially, and has all the gpg agent integration
that it picks up automatically, and allows various per-user
configurations, and all my worries are bogus.

But that would literally be the first time I've ever seen that.

   Linus


Re: [GSoC] Discussion of "Submodule related work" project

2017-03-10 Thread Stefan Beller
On Fri, Mar 10, 2017 at 3:27 AM, Valery Tolstov  wrote:
> Have some questions about "Submodule related work" project
>
> First of all, I would like to add this task to the project, if I'll take it:
> https://public-inbox.org/git/1488913150.881...@smtp.yandex.ru/T/
> What do you think about this task?

That is a nice project, though my gut feeling is that it is too small
for a GSoC project on its own.

>> Cleanup our test suite. Do not use a repo itself as a submodule for itself
>
> Not quite familiar with submodules yet, why this is considered to be
> ineligible (i.e. using repo as a submodule for itself)?

(a bit of background on submodules)

man gitglossary (then searching for submodule):
   submodule
   A repository that holds the history of a separate project inside
   another repository (the latter of which is called superproject).

   superproject
   A repository that references repositories of other projects in its
   working tree as submodules. The superproject knows about the names
   of (but does not hold copies of) commit objects of the contained
   submodules.

An example that I just found on Github[1]. It is a game
(so it includes graphics, game code etc). But it makes use of a library[2],
which could be used by different projects.

[1] https://github.com/stephank/orona
[2] https://github.com/stephank/villain

Now why would a repo be ineligible to use itself as a submodule?
There is nothing wrong with it *technically* (which is why we do such things
in the test suite.)

But what are the use cases for it? Why would a project bind itself
as a submodule (you can get the same effect of having the source code
by just checking out that other version)?  Well, now that I think about
it, it may be useful if you want to test old versions of yourself, e.g.
for networking compatibility. But for that you'd probably still not use
submodules.

So the use case of a repository using itself as a submodule is
*very rare*, if it exists at all out there. And the Git test suite
should not dwell on these weird corner cases, but rather pay attention
to the common case first.

I thought this project would have been solved partially already, but I was wrong.
($ git grep "submodule add \./\."). This also doesn't seem large enough for
a summer project, after thinking about it further.

>> (Advanced datastructure knowledge required?) Protect submodule from gc-ing
>> interesting HEADS.
>
> Can you provide a small example that shows the problem, please?

Let's use this example from above:

$ git clone --recursive https://github.com/stephank/orona
# now we have 2 repositories, the orona repo as well as its submodule
# at node_modules/villain
#
# "Let's inspect the Readmes/license files, if they are ok to use
# Oh! the submodule is MIT licensed but doesn't have the full
# license text, I can contribute and make a patch for it."
$ cd node_modules/villain
$ $EDIT LICENSE
$ git add LICENSE
$ git commit -a -m "add license full text"
$
$ cd ../.. # go back to the superproject
$ git add  node_modules/villain
$ git commit -a -m "update game to include latest lib"
$ git checkout -b "fix_license"
# note how I forget to actually push it / pull request it!
# All we need for the demonstration is a local commit
# in the submodule that is referenced by the superproject...
#
# ... "Let's test the pristine copy of the game!" ...
$ git checkout origin/master
$ git submodule update
# ... which gets lost here. The submodule commit
# is only referenced by a superproject commit.

.. time passes ..

# "My disk is so full, maybe I can clean up all these random
# accumulated projects, to have more disk space again."
# my cleanup script may do this:

$ cd node_modules/villain
$ git reflog expire --all --expire=all
$ git gc --prune=all
$ cd ../..

$ git branch
# "Oh what about this 'fix_license branch' ?
#  Did I actually send that upstream?"
$ git checkout fix_license
$ git submodule update
error: no such remote ref 96016818b2ed9eb0ca72552b18f8339fc20850b4
Fetched in submodule path 'villain', but it did not contain
96016818b2ed9eb0ca72552b18f8339fc20850b4. Direct fetching of that
commit failed.

> And why advanced datastructure knowledge is expected?

I am not quite sure how to approach this problem, so I put
a "warning; it may be complicated" sticker on it. ;)

The problem is that a submodule until now was considered
its own repository, in full control what to keep and delete,
how to name its branches and so on.

git-gc only pays attention to the commits (and their history) of all
branches and to commits mentioned in the reflog.
(which is why we had to delete the reflog, and as we
were making the license commit on a "detached HEAD",
there was no need to delete its branch).

However it should also consider all commits referenced
by the superproject valuable.

In this case the superproject has 

Re: [PATCH v1] Travis: also test on 32-bit Linux

2017-03-10 Thread René Scharfe

On 10.03.2017 at 18:57, Jeff King wrote:

On Fri, Mar 10, 2017 at 05:20:13PM +0100, René Scharfe wrote:


I think this misses the other two cases: (*dst, src) and (*dst, *src).


... and that's why I left them out.  You can't get dst vs. *dst wrong with
structs (at least not without the compiler complaining); only safe
transformations are included in this round.


I don't think the transformation could be wrong without the original
code being wrong.


Avoiding introducing bugs with automatic transformations is essential, 
indeed -- if we're not careful here we'd be better off keeping the 
original code.



I'm also not sure why mine would be unsafe and yours would be safe. It
seems like the other way around, because mine will do:

  *dst = ...

which would fail unless "dst" is a pointer. Whereas in yours, we end up
with:

  dst = ...

and somebody mistaking pointer assignment for a struct copy would not
notice.


If dst is a struct then having *dst on the left-hand side results in a 
compiler error -- you can't get that one wrong.  If it's a pointer then 
both dst and *dst can be valid (pointer assignment vs. content copy), so 
there is the possibility of making a mistake without the compiler noticing.
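
A tiny stand-alone illustration of that asymmetry (made-up types, not
git code):

  #include <string.h>

  struct obj { int x; };

  void with_structs(void)
  {
          struct obj a, b = { 1 };

          memcpy(&a, &b, sizeof(a));      /* the old style ...            */
          a = b;                          /* ... and the safe replacement */
          /* "*a = b;" would not even compile: a is not a pointer */
  }

  void with_pointers(struct obj *dst, const struct obj *src)
  {
          memcpy(dst, src, sizeof(*dst)); /* the old style ...       */
          *dst = *src;                    /* ... copies the contents */
          /*
           * "dst = src;" would also compile, but it only redirects the
           * local pointer; that is the mistake the compiler cannot catch.
           */
  }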



But either way, I don't think we can have a rule like "you can use
struct assignment only if you don't have a pointer for the destination".
That's too arcane to expect developers and reviewers to follow. We
should either allow struct assignment or not.


With an automatic transformation in place it's more like "you can't use 
memcpy() in this case any more as it gets turned into an assignment with 
the next cocci patch".  I think we shouldn't be that restrictive for 
pointers as destinations (yet?).


René


Re: [PATCH 1/2] pathspec: allow querying for attributes

2017-03-10 Thread Brandon Williams
On 03/09, Jonathan Tan wrote:
> On 03/09/2017 01:07 PM, Brandon Williams wrote:
> >diff --git a/Documentation/glossary-content.txt 
> >b/Documentation/glossary-content.txt
> >index fc9320e59..5c32d1905 100644
> >--- a/Documentation/glossary-content.txt
> >+++ b/Documentation/glossary-content.txt
> >@@ -384,6 +384,26 @@ full pathname may have special meaning:
> > +
> > Glob magic is incompatible with literal magic.
> >
> >+attr;;
> >+After `attr:` comes a space separated list of "attribute
> >+requirements", all of which must be met in order for the
> >+path to be considered a match; this is in addition to the
> >+usual non-magic pathspec pattern matching.
> >++
> >+Each of the attribute requirements for the path takes one of
> >+these forms:
> >+
> >+- "`ATTR`" requires that the attribute `ATTR` must be set.
> 
> As a relative newcomer to attributes, I was confused by the fact
> that "set" and "set to a value" is different (and likewise "unset"
> and "unspecified"). Maybe it's worthwhile including a link to
> "gitattributes" to explain the different (exclusive) states that an
> attribute can be in.

Good idea! I'll add in a link to gitattributes.
> 
> >+
> >+- "`-ATTR`" requires that the attribute `ATTR` must be unset.
> >+
> >+- "`ATTR=VALUE`" requires that the attribute `ATTR` must be
> >+  set to the string `VALUE`.
> >+
> >+- "`!ATTR`" requires that the attribute `ATTR` must be
> >+  unspecified.
> 
> It would read better to me if you omitted "must" in all 4 bullet
> points (and it is redundant anyway with "requires"), but I don't
> feel too strongly about this.

I agree, the first paragraph already says "must" so it reads better
without repeating must over and over again.

> 
> >diff --git a/pathspec.c b/pathspec.c
> >index b961f00c8..583ed5208 100644
> >--- a/pathspec.c
> >+++ b/pathspec.c
> >@@ -87,6 +89,72 @@ static void prefix_magic(struct strbuf *sb, int 
> >prefixlen, unsigned magic)
> > strbuf_addf(sb, ",prefix:%d)", prefixlen);
> > }
> >
> >+static void parse_pathspec_attr_match(struct pathspec_item *item, const 
> >char *value)
> >+{
> >+struct string_list_item *si;
> >+struct string_list list = STRING_LIST_INIT_DUP;
> >+
> >+if (item->attr_check)
> >+die(_("Only one 'attr:' specification is allowed."));
> >+
> >+if (!value || !strlen(value))
> 
> You can write `!*value` instead of `!strlen(value)`.
> 

Done.

> >+string_list_remove_empty_items(&list, 0);
> >+
> >+item->attr_check = attr_check_alloc();
> >+ALLOC_GROW(item->attr_match,
> >+   item->attr_match_nr + list.nr,
> >+   item->attr_match_alloc);
> 
> Is there a time when this function is called while
> item->attr_match_nr is not zero?

Nope, it pretty much has to be zero.  I'll change this to just use
list.nr.  item->attr_match_nr will be incremented up to list.nr over the
course of the for loop and I'll move the equality check to the end of
this function.

> >+string_list_clear(&list, 0);
> >+return;
> 
> Redundant return?

I'll remove it.

> 
> >@@ -544,6 +628,10 @@ void parse_pathspec(struct pathspec *pathspec,
> > if (item[i].nowildcard_len < item[i].len)
> > pathspec->has_wildcard = 1;
> > pathspec->magic |= item[i].magic;
> >+
> >+if (item[i].attr_check &&
> >+item[i].attr_check->nr != item[i].attr_match_nr)
> >+die("BUG: should have same number of entries");
> 
> I'm not sure if this check is giving us any benefit - I would expect
> this type of code before some other code that assumed that the
> numbers matched, and that will potentially segfault if not.

I'll push the check to right after the object creation (see comment
above).

-- 
Brandon Williams


Re: [RFC PATCH] help: add optional instructions for reporting bugs

2017-03-10 Thread Jonathan Nieder
Hi,

Stefan Beller wrote:

> When reporting bugs, users will usually look at the output of
> 'git --version' at one point to write a quality bug report.
> So that is a good spot to provide additional information to the user
> about e.g. the additional organizational quirks of how to report a bug.
>
> As the output of 'git --version' is parsed by scripts as well,
> we only want to present this information to users, which is why
> we only give the output to TTYs.

Interesting thought.  This might also be a good place to point users
to "git version --build-options" to get more detailed build
information.

The existence of that option also suggests you're on the right track
about 'git version' being the command for this.

> Git is distributed in various ways by various organizations. The Git
> community prefers to have bugs reported on the mailing list, whereas
> other organizations may rather want to have filed a bug in a bug tracker
> or such.  The point of contact is different by organization as well.

It's tempting to put the custom information in --build-options --- e.g.

 $ git version
 git version 2.12.0.190.g6e60aba09d.dirty
 hint: use "git version --build-options" for more detail
 hint: and bug reporting instructions
 $
 $ git version --build-options
 git version 2.12.0.190.g6e60aba09d.dirty
 sizeof-long: 8
 reporting-bugs: see REPORTING BUGS section in "git help git"

Using 'hint:' would put this in the advice category. Usually those are
possible to disable using advice.* configuration (e.g.
advice.versionBuildOptions) for less noisy output.

[...]
> --- a/help.c
> +++ b/help.c
> @@ -9,6 +9,12 @@
>  #include "version.h"
>  #include "refs.h"
>  
> +#ifdef GIT_BUG_REPORT_HELP

If doing this for real, there would be a knob in the Makefile for
setting the bug report string.

[...]
> @@ -435,6 +441,8 @@ int cmd_version(int argc, const char **argv, const char 
> *prefix)
>   /* NEEDSWORK: also save and output GIT-BUILD_OPTIONS? */
>   }
>   }
> + if (isatty(1))
> + puts(git_bug_reporting_string);

Should this go to stderr?  E.g. advise() writes to stderr.

Some scripts may run "git version" in output that is written to stdout
and meant to be copy/pasted.  Is there a way for such scripts to
suppress this output?  Should they use -c
advice.versionBuildOptions=0, does 'git version' need an option to
suppress the human-oriented output, should they use 2>/dev/null, or is
this just something that people have to live with?

I'm still on the fence about whether this is a good idea. At least
having custom bug instructions in --build-options sounds like a very
nice thing, but it's not obvious to me how people would learn about
that option.

Thanks and hope that helps,
Jonathan


Re: [PATCH] Makefile: detect errors in running spatch

2017-03-10 Thread Jeff King
On Fri, Mar 10, 2017 at 06:03:47PM +0100, René Scharfe wrote:

> > This shell code is getting a bit unwieldy to stick inside the Makefile,
> > with all the line continuation and $-escaping.  It might be worth moving
> > it into a helper script.
> 
> There is one for the kernel 
> (https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/scripts/coccicheck).
> It's quite big, though.

Yeah, there's a lot going on there that I don't think we care about
(though I am new to coccinelle, so maybe I would grow to appreciate the
features).

I was thinking of just moving the current Makefile snippet into a
script. That lets us avoid the irritating quoting. And we can use things
like functions, which would make the $?-handling in the loop less
tedious (because we can return straight out of the loop).

> > I don't know if that would make the patches harder to apply. The results
> > aren't full patches, so I assume you usually do some kind of munging on
> > them?
> 
> They work with patch -p0.

Hrm, you're right. I tried it earlier based on the commit message from
the original "coccicheck" commit, but I got "only garbage was found".
But now it works. I must have screwed it up (perhaps tab-completion
stopped at "copy.cocci" instead of "copy.cocci.patch").

> >   make coccicheck SPATCH='spatch --in-place'
> 
> Using SPATCH_FLAGS for adding an option in such case would be a bit simpler.

That too. :)

> We can get rid of the loop by using the spatch options --use-gitgrep and
> --dir.  I can't find the former one in the docs, though, so I'm not sure if
> it only works with certain versions or what exactly it is even doing.  It
> seems to have the side effect of producing git-style patches (applicable
> with patch -p1) at least.

I have no objections to pursuing that. But I think with my patch, it's
workable for now, so there's no rush.

-Peff


Re: [PATCH 02/10] pack-objects: add --partial-by-size=n --partial-special

2017-03-10 Thread Jeff King
On Fri, Mar 10, 2017 at 09:58:23AM -0800, Brandon Williams wrote:

> > A while back when we discussed whether to allow symlinks for
> > .gitattributes, etc, I think the consensus was to treat the whole
> > ".git*" namespace consistently. I haven't followed up with patches yet,
> > but my plan was to go that route.
> 
> Well if I remember correctly you sent out some patches for
> .gitattributes but I got in the way with the refactoring work! :)

True. :) But those were the old method that tries to treat
.gitattributes specially, by using O_NOFOLLOW in the attribute code (but
only for in-tree files, naturally).

I think we ended up deciding that it would be better to just disallow
symlink .gitattributes (and .git*) from entering the index, the way we
disallow ".git".

-Peff


Re: [PATCH 02/10] pack-objects: add --partial-by-size=n --partial-special

2017-03-10 Thread Brandon Williams
On 03/09, Jeff King wrote:
> On Wed, Mar 08, 2017 at 03:21:11PM -0500, Jeff Hostetler wrote:
> 
> > > And not ".gitmodules"?
> > > 
> > > What happens when we later add ".gitsomethingelse"?
> > > 
> > > Do we have to worry about the case where the set of git "special
> > > files" (can we have a better name for them please, by the way?)
> > > understood by the sending side and the receiving end is different?
> > > 
> > > I have a feeling that a mode that makes anything whose name begins
> > > with ".git" excempt from the size based cutoff may generally be
> > > easier to handle.
> > 
> > I forgot about ".gitmodules".  The more I think about it, maybe
> > we should always include them (or anything starting with ".git*")
> > and ignore the size, since they are important for correct behavior.
> 
> I'm also in favor of staking out ".git*" as "this is special and belongs
> to Git".

I agree, .git* files should probably be the bare minimum of files
included.  Especially since things like .gitattributes can affect things
like checkout.

> 
> A while back when we discussed whether to allow symlinks for
> .gitattributes, etc, I think the consensus was to treat the whole
> ".git*" namespace consistently. I haven't followed up with patches yet,
> but my plan was to go that route.

Well if I remember correctly you sent out some patches for
.gitattributes but I got in the way with the refactoring work! :)

-- 
Brandon Williams


Re: [PATCH v1] Travis: also test on 32-bit Linux

2017-03-10 Thread Jeff King
On Fri, Mar 10, 2017 at 05:20:13PM +0100, René Scharfe wrote:

> > I think this misses the other two cases: (*dst, src) and (*dst, *src).
> 
> ... and that's why I left them out.  You can't get dst vs. *dst wrong with
> structs (at least not without the compiler complaining); only safe
> transformations are included in this round.

I don't think the transformation could be wrong without the original
code being wrong.

I'm also not sure why mine would be unsafe and yours would be safe. It
seems like the other way around, because mine will do:

  *dst = ...

which would fail unless "dst" is a pointer. Whereas in yours, we end up
with:

  dst = ...

and somebody mistaking pointer assignment for a struct copy would not
notice.
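
To spell the trap out with a tiny made-up example (both functions compile,
which is exactly the problem):

	struct item { int value; };

	static void fill(struct item *dst, struct item *src)
	{
		*dst = *src;	/* the intended struct copy, same as memcpy(dst, src, sizeof(*dst)) */
	}

	static void fill_oops(struct item *dst, struct item *src)
	{
		dst = src;	/* compiles too, but only repoints the local parameter; nothing is copied */
	}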

But either way, I don't think we can have a rule like "you can use
struct assignment only if you don't have a pointer for the destination".
That's too arcane to expect developers and reviewers to follow. We
should either allow struct assignment or not.

Or have I totally misunderstood your point?

-Peff


Re: [PATCH] Makefile: detect errors in running spatch

2017-03-10 Thread René Scharfe

Am 10.03.2017 um 09:31 schrieb Jeff King:

The "make coccicheck" target runs spatch against each source
file. But it does so in a for loop, so "make" never sees the
exit code of spatch. Worse, it redirects stderr to a log
file, so the user has no indication of any failure. And then
to top it all off, because we touched the patch file's
mtime, make will refuse to repeat the command because it
thinks the target is up-to-date.

So for example:

  $ make coccicheck SPATCH=does-not-exist
  SPATCH contrib/coccinelle/free.cocci
  SPATCH contrib/coccinelle/qsort.cocci
  SPATCH contrib/coccinelle/xstrdup_or_null.cocci
  SPATCH contrib/coccinelle/swap.cocci
  SPATCH contrib/coccinelle/strbuf.cocci
  SPATCH contrib/coccinelle/object_id.cocci
  SPATCH contrib/coccinelle/array.cocci
  $ make coccicheck SPATCH=does-not-exist
  make: Nothing to be done for 'coccicheck'.

With this patch, you get:

  $ make coccicheck SPATCH=does-not-exist
   SPATCH contrib/coccinelle/free.cocci
  /bin/sh: 4: does-not-exist: not found
  Makefile:2338: recipe for target 'contrib/coccinelle/free.cocci.patch' failed
  make: *** [contrib/coccinelle/free.cocci.patch] Error 1


That's nice.  The current version is just a contraption that does the 
bare minimum of work.



It also dumps the log on failure, so any errors from spatch
itself (like syntax errors in our .cocci files) will be seen
by the user.

Signed-off-by: Jeff King 
---
This shell code is getting a bit unwieldy to stick inside the Makefile,
with all the line continuation and $-escaping.  It might be worth moving
it into a helper script.


There is one for the kernel 
(https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/scripts/coccicheck). 
 It's quite big, though.



It also doesn't help that shells are awkward at passing status out of a
for-loop. I think the most "make-ish" way of doing this would actually
be to lose the for loop and have a per-cocci-per-source target.

I don't know if that would make the patches harder to apply. The results
aren't full patches, so I assume you usually do some kind of munging on
them?


They work with patch -p0.

We can get rid of the loop by using the spatch options --use-gitgrep and 
--dir.  I can't find the former one in the docs, though, so I'm not sure 
if it only works with certain versions or what exactly it is even doing. 
 It seems to have the side effect of producing git-style patches 
(applicable with patch -p1) at least.



I resorted to:

  make coccicheck SPATCH='spatch --in-place'


Using SPATCH_FLAGS for adding an option in such case would be a bit simpler.


 Makefile | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 9ec6065cc..d97633892 100644
--- a/Makefile
+++ b/Makefile
@@ -2336,9 +2336,17 @@ check: common-cmds.h
 C_SOURCES = $(patsubst %.o,%.c,$(C_OBJ))
 %.cocci.patch: %.cocci $(C_SOURCES)
@echo '' SPATCH $<; \
+   ret=0; \
for f in $(C_SOURCES); do \
-   $(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS); \
-   done >$@ 2>$@.log; \
+   $(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS) || \
+   { ret=$$?; break; }; \
+   done >$@+ 2>$@.log; \
+   if test $$ret != 0; \
+   then \
+   cat $@.log; \
+   exit 1; \
+   fi; \
+   mv $@+ $@; \
if test -s $@; \
then \
echo '' SPATCH result: $@; \



Re: Git Vendor Support

2017-03-10 Thread Konstantin Khomoutov
On Fri, 10 Mar 2017 16:13:20 +
"COLLINS, ROGER W GG-12 USAF NASIC/SCPW" 
wrote:

> ALCON,
> 
> Is there a specific group or vendor backing Git?  As part of our
> internal approval process, our organization requires that we list a
> vendor that provides active support (i.e. patches) for the Git
> software.

As Git is a volunteer-driven free-software project, there are no vendors
behind it.

Looks like you basically have two options to fulfill the requirements
of your organization:

 * Contact the Software Freedom Conservancy [1] which is an official
   body serving as an aegis for Git and a bunch of other F/OSS projects.
   They are not a vendor, but at least they are a legal entity and are
   able to do law-speak.

 * Sign a contract with any company offering support for Git or
   its Git-based solutions.  Atlassian might be an option (I don't know
   for sure -- just speculating, and I'm not affiliated with them
   in any regard).

1. https://sfconservancy.org/about/contact/


Re: Git Vendor Support

2017-03-10 Thread Stefan Beller
On Fri, Mar 10, 2017 at 8:13 AM, COLLINS, ROGER W GG-12 USAF
NASIC/SCPW  wrote:
> ALCON,
>
> Is there a specific group or vendor backing Git?

https://sfconservancy.org/ takes care of the financial needs of the community.

> active support

I guess companies that make money primarily via Git hosting
(e.g. one of Github, GitLab, Atlassian, Bitbucket) *may* be interested
in active support.

Thanks,
Stefan


Re: [PATCH] branch: honor --abbrev/--no-abbrev in --list mode

2017-03-10 Thread Junio C Hamano
Jakub Narębski  writes:

> W dniu 08.03.2017 o 23:16, Junio C Hamano pisze:
>  
>> diff --git a/builtin/branch.c b/builtin/branch.c
>> index cbaa6d03c0..537c47811a 100644
>> --- a/builtin/branch.c
>> +++ b/builtin/branch.c
>> @@ -335,9 +335,18 @@ static char *build_format(struct ref_filter *filter, 
>> int maxwidth, const char *r
>>  branch_get_color(BRANCH_COLOR_CURRENT));
>>  
>>  if (filter->verbose) {
>> +struct strbuf obname = STRBUF_INIT;
>> +
>> +if (filter->abbrev < 0)
>> +strbuf_addf(, "%%(objectname:short)");
>> +else if (!filter->abbrev)
>> +strbuf_addf(, "%%(objectname)");
>> +else
>> +strbuf_addf(, " %%(objectname:short=%d) ", 
>> filter->abbrev);
>   ^   ^
> I wonder why the last one has leading space --/ and trailing one -/
> The rest (for default abbrev and for no abbrev do not).

Thanks for spotting; that's just an editor cruft that shouldn't have
been there.




Re: [PATCH v3 0/9] Fix the early config

2017-03-10 Thread Junio C Hamano
Jeff King  writes:

>> > > Yes, exactly.  It would have been less confusing if I picked something
>> > > that passed nongit_ok. Like hash-object:
>> 
>> ... or like testing the early config directly?
>
> I was trying to demonstrate that the problem existed already without
> your patch series.

Yup, they are not new breakages introduced by Dscho's change.

>> From: Johannes Schindelin 
>> Subject: [PATCH] t1309: document cases where we would want early config not 
>> to
>>  die()
>> 
>> Jeff King came up with a couple examples that demonstrate how the new
>> read_early_config() that looks harder for the current .git/ directory
>> could die() in an undesirable way.
>> 
>> Let's add those cases to the test script, to document what we would like
>> to happen when early config encounters problems.
>
> Yep, these all look fine.

OK.  Thanks, both.


Git Vendor Support

2017-03-10 Thread COLLINS, ROGER W GG-12 USAF NASIC/SCPW
ALCON,

Is there a specific group or vendor backing Git?  As part of our internal
approval process, our organization requires that we list a vendor that provides
active support (i.e. patches) for the Git software.

Thanks,
Roger


Re: [PATCH v1] Travis: also test on 32-bit Linux

2017-03-10 Thread René Scharfe

Am 10.03.2017 um 09:18 schrieb Jeff King:

On Fri, Mar 10, 2017 at 01:14:16AM +0100, René Scharfe wrote:


  2. Ones which just copy a single object, like:

   memcpy(&dst, &src, sizeof(dst));

 Perhaps we should be using struct assignment like:

   dst = src;

 here. It's safer and it should give the compiler more room to
 optimize. The only downside is that if you have pointers, it is
 easy to write "dst = src" when you meant "*dst = *src".


Compilers can usually inline memcpy(3) calls, but assignments are
shorter and more pleasing to the eye, and we get a type check for
free.  How about this?


Yeah, I mostly wasn't sure how people felt about "shorter and more
pleasing". It _is_ shorter and there's less to get wrong. But the
memcpy() screams "hey, I am making a copy" and is idiomatic to at least
a certain generation of C programmers.

I guess something like COPY(dst, src) removes the part that you can get
wrong, while still screaming copy. It's not idiomatic either, but at
least it stands out. I dunno.


Yes ...


I think this misses the other two cases: (*dst, src) and (*dst, *src).


... and that's why I left them out.  You can't get dst vs. *dst wrong 
with structs (at least not without the compiler complaining); only safe 
transformations are included in this round.


René


Re: [PATCH] blame: move blame_entry duplication to add_blame_entry()

2017-03-10 Thread René Scharfe

Am 10.03.2017 um 09:32 schrieb Jeff King:

On Fri, Mar 10, 2017 at 01:12:59AM +0100, René Scharfe wrote:


All callers of add_blame_entry() allocate and copy the second argument.
Let the function do it for them, reducing code duplication.


I assume you found this due to the DUPLICATE() discussion elsewhere.


Indeed, your grep command in that discussion showed them.

René


Re: [RFC][PATCH] index-pack: add testcases found using AFL

2017-03-10 Thread Vegard Nossum

On 10/03/2017 16:15, Vegard Nossum wrote:

I've used AFL to generate a corpus of pack files that maximises the edge
coverage for 'git index-pack'.

This is a supplement to (and not a replacement for) the regular test cases
where we know exactly what each test is checking for. These testcases are
more useful for avoiding regressions in edge cases or as a starting point
for future fuzzing efforts.

To see the output of running 'git index-pack' on each file, you can do
something like this:

  make -C t GIT_TEST_OPTS="--run=34 --verbose" t5300-pack-object.sh

I observe the following coverage changes (for t5300 only):

  path                  old%  new%    pp

  builtin/index-pack.c  74.3  76.6   2.3
  pack-write.c          79.8  80.4    .6
  patch-delta.c         67.4  81.4  14.0
  usage.c               26.6  35.5   8.9
  wrapper.c             42.0  46.1   4.1
  zlib.c                58.7  64.1   5.4


And if you add this simple patch on top (sorry, I didn't think of it
until after I'd sent the previous e-mail):

diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
index 19e02ffc2..db705ba5c 100755
--- a/t/t5300-pack-object.sh
+++ b/t/t5300-pack-object.sh
@@ -425,8 +425,10 @@ test_expect_success 'index-pack  works in 
non-repo' '

 test_expect_success 'index-pack edge coverage' '
for pack in "$TEST_DIRECTORY"/t5300/*.pack
do
-   rm -rf "${pack%.pack}.idx" &&
-   test_might_fail git index-pack $pack
+   rm -rf "${pack%.pack}.idx" tmp.pack tmp.idx &&
+   test_might_fail git index-pack $pack &&
+   test_might_fail git index-pack --strict $pack &&
+   test_might_fail git index-pack --stdin --fix-thin 
tmp.pack < $pack

done
 '


you get this change to the coverage profile instead:

path                  old%  new%    pp

alloc.c               58.1  67.4   9.3
builtin/index-pack.c  74.3  80.7   6.4
commit.c              13.9  17.4   3.5
date.c                 3.5   4.2    .7
fsck.c                15.7  33.7  18.0
object.c              56.0  58.7   2.7
pack-write.c          79.8  81.4   1.6
patch-delta.c         67.4  81.4  14.0
path.c                31.6  32.1    .5
sha1_file.c           48.9  49.6    .7
tag.c                  3.7  16.8  13.1
tree.c                36.6  37.5    .9
usage.c               26.6  35.5   8.9
wrapper.c             42.0  46.1   4.1
zlib.c                58.7  64.1   5.4

Of course, it's likely some of those gains can be found in other
testcases outside t5300 -- also, coverage isn't everything. Still seems
like a nice gain with very little effort.


Vegard


Re: Stable GnuPG interface, git should use GPGME

2017-03-10 Thread Ævar Arnfjörð Bjarmason
On Fri, Mar 10, 2017 at 11:00 AM, Bernhard E. Reiter
 wrote:
> Dear Git-Devs,

I haven't contributed to Git's GPG code, but I'm taking the liberty of
CC-ing some people who have.

> git uses a pipe-and-exec approach to running a GnuPG binary
> as written in the documentation [1]:
>
> gpg.program
>Use this custom program instead of "gpg" found on $PATH when making
>or verifying a PGP signature. The program must support the same
>command-line interface as GPG
>
> please consider using libgpgme to interface with GnuPG, because the gpg
> command-line interface is not considered an official API to GnuPG by the
> GnuPG devs and is thus potentially unstable.
>
> == Details
>
> I'm involved in GnuPG development. For most applications, using libgpgme is
> what the GnuPG devs would recommend; also see
>
>   https://wiki.gnupg.org/APIs .
>
> GnuPG devs are making a good effort to keep the command-line
> interface stable, though that is not guaranteed. Git is only using a small part
> of the interface, so the risk of keeping the current approach is small.
> Still I believe git's stability and usability would profit from moving to
> libgpgme, especially with the coming move to GnuPG 2.2, better diagnostic
> messages and cross-platform usage.
>
> == Usability problem with `gpg2` vs `gpg`
>
> My use case today was signing: git found the `gpg` binary on $PATH by
> default and the command failed. The reason is that I have `gpg2` installed
> and most applications use it right away. So git failed signing because
> the .gnupg configuration of the user was not ready for the old `gpg`, which is
> still installed on Debian GNU/Linux for the operating system's own purposes. If
> git had used libgpgme, gpgme would have chosen the most up-to-date
> version of `gpg` available (or configured) without me intervening via
> gpg.program. Now because of this problem you could add a check for `gpg2`
> and fall back to `gpg`, but even better would be to move to libgpgme. >:)

I'm on Debian but haven't had these issues. What's your gpg & gpg2
--version & Debian release? And what in particular failed?

And what git version was this? I see we've had a couple of workarounds
for gpg2, in particular Linus's v2.8.4-1-gb624a3e67f, but if you have
v2.10.0 or later that won't fix whatever issue you had.

Using the library sounds good, but a shorter-term immediate fix would
be to figure out what bug you encountered in our use of the
command-line version, and see if we've fixed that already or not.
Regardless of what we do with a gpg library in the future some distros
might want to backport such a small patch if we can come up with it.

> Best Regards and thanks for maintaining Git as Free Software,
> Bernhard
>
> == how to respond
>
> ps: Please copy me on replies as I am not on git@vger.kernel.org.
> pps: I've copied gnupg-devel@ so they can see I've sent this report; you don't
> have to.
>
>
> [1]
> https://github.com/git/git/blob/3bc53220cb2dcf709f7a027a3f526befd021d858/Documentation/config.txt
> search for 'gpg.program'.
>
> --
> www.intevation.de/~bernhard (CEO) +49 541 33 508 3-3
> Intevation GmbH, Osnabrück, Germany; Amtsgericht Osnabrück, HRB 18998
> Owned and run by Frank Koormann, Bernhard Reiter, Dr. Jan-Oliver Wagner


Re: [PATCH v2] t2027: avoid using pipes

2017-03-10 Thread Prathamesh Chavan
On Thu, Mar 9, 2017 at 6:00 PM, Christian Couder
 wrote:
> On Thu, Mar 9, 2017 at 10:53 AM, Prathamesh Chavan  wrote:
>> Whenever a git command is present upstream of a pipe, its failure
>> gets masked by the pipe, so this arrangement should be avoided when testing
>> the upstream git command. By writing the output of the git command to
>> a file instead, we can test the exit codes of both commands, since a failure
>> exit code from either one is able to stop the && chain.
>>
>> Signed-off-by: Prathamesh 
>> ---
>
> Please add in Cc those who previously commented on the patch.

Actually I initially used submitGit to send patches, where there was
no option of
adding cc to the patch. But after your comment I have switched to git send-email
and git format-patch for sending patches.

>
>>  t/t2027-worktree-list.sh | 14 +++---
>>  1 file changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/t/t2027-worktree-list.sh b/t/t2027-worktree-list.sh
>> index 848da5f..daa7a04 100755
>> --- a/t/t2027-worktree-list.sh
>> +++ b/t/t2027-worktree-list.sh
>> @@ -31,7 +31,7 @@ test_expect_success '"list" all worktrees from main' '
>> test_when_finished "rm -rf here && git worktree prune" &&
>> git worktree add --detach here master &&
>> echo "$(git -C here rev-parse --show-toplevel) $(git rev-parse 
>> --short HEAD) (detached HEAD)" >>expect &&
>> -   git worktree list | sed "s/  */ /g" >actual &&
>> +   git worktree list >out && sed "s/  */ /g" <out >actual &&
>
> I still think that it would be better if the 'sed' commend was on its
> own line like this:
>
> +   git worktree list >out &&
> +   sed "s/  */ /g" <out >actual &&
>
>> test_cmp expect actual
>>  '


Re: BUG: "git branch --contains " does nothing, silently fails

2017-03-10 Thread Ævar Arnfjörð Bjarmason
On Fri, Mar 10, 2017 at 1:42 PM, Jeff King  wrote:
> On Fri, Mar 10, 2017 at 11:43:15AM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> Ran into this when preparing my --no-contains series, this is a long
>> standing bug:
>>
>> $ ./git branch -D test; ./git branch --contains v2.8.0 test; echo
>> $?; git rev-parse test
>> error: branch 'test' not found.
>> 0
>> test
>> fatal: ambiguous argument 'test': unknown revision or path not in
>> the working tree.
>
> Isn't that asking to list branches starting with "test" that contain
> v2.8.0? There are none to report (since you just made sure to delete
> it), so the empty output is correct.
>
>> I.e. this should return an error like "git tag" does:
>>
>> $ ./git tag -d test; ./git tag --contains v2.8.0 test; echo $?;
>> git rev-parse test
>> error: tag 'test' not found.
>> fatal: --contains option is only allowed with -l.
>> 128
>> test
>> fatal: ambiguous argument 'test': unknown revision or path not in
>> the working tree.
>
> The difference between "branch" and "tag" here is that "branch
> --contains" implies "--list" (and the argument becomes a pattern).
> Whereas "tag --contains" just detects the situation and complains.
>
> If anything, I'd think we would consider teaching "tag" to behave more
> like "branch".

Yes you're right, sorry about the noise, e.g. this works:

git branch --contains v2.8.0 'a*'
git branch --contains v2.8.0 --list 'a*'

Whereas with "git tag" you always need -l as you point out.


Re: BUG: "git branch --contains " does nothing, silently fails

2017-03-10 Thread Jeff King
On Fri, Mar 10, 2017 at 11:43:15AM +0100, Ævar Arnfjörð Bjarmason wrote:

> Ran into this when preparing my --no-contains series, this is a long
> standing bug:
> 
> $ ./git branch -D test; ./git branch --contains v2.8.0 test; echo
> $?; git rev-parse test
> error: branch 'test' not found.
> 0
> test
> fatal: ambiguous argument 'test': unknown revision or path not in
> the working tree.

Isn't that asking to list branches starting with "test" that contain
v2.8.0? There are none to report (since you just made sure to delete
it), so the empty output is correct.

> I.e. this should return an error like "git tag" does:
> 
> $ ./git tag -d test; ./git tag --contains v2.8.0 test; echo $?;
> git rev-parse test
> error: tag 'test' not found.
> fatal: --contains option is only allowed with -l.
> 128
> test
> fatal: ambiguous argument 'test': unknown revision or path not in
> the working tree.

The difference between "branch" and "tag" here is that "branch
--contains" implies "--list" (and the argument becomes a pattern).
Whereas "tag --contains" just detects the situation and complains.

If anything, I'd think we would consider teaching "tag" to behave more
like "branch".

-Peff


Re: [PATCH v2] ref-filter: Add --no-contains option to tag/branch/for-each-ref

2017-03-10 Thread Ævar Arnfjörð Bjarmason
On Fri, Mar 10, 2017 at 12:46 PM, Ævar Arnfjörð Bjarmason
 wrote:
> On Thu, Mar 9, 2017 at 9:31 PM, Christian Couder
>  wrote:
>> On Thu, Mar 9, 2017 at 9:02 PM, Ævar Arnfjörð Bjarmason
>>  wrote:
>>>
>>> diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
>>> index 525737a5d8..4938496194 100644
>>> --- a/Documentation/git-tag.txt
>>> +++ b/Documentation/git-tag.txt
>>> @@ -12,7 +12,7 @@ SYNOPSIS
>>>  'git tag' [-a | -s | -u ] [-f] [-m  | -F ]
>>>  [ | ]
>>>  'git tag' -d ...
>>> -'git tag' [-n[]] -l [--contains ] [--points-at ]
>>> +'git tag' [-n[]] -l [--[no-]contains ] [--points-at ]
>>> [--column[=] | --no-column] [--create-reflog] 
>>> [--sort=]
>>> [--format=] [--[no-]merged []] [...]
>>>  'git tag' -v [--format=] ...
>>> @@ -124,6 +124,10 @@ This option is only applicable when listing tags 
>>> without annotation lines.
>>> Only list tags which contain the specified commit (HEAD if not
>>> specified).
>>>
>>> +--no-contains []::
>>> +   Only list tags which don't contain the specified commit (HEAD if
>>> +   not secified).
>>
>> s/secified/specified/
>>
>>> +
>>>  --points-at ::
>>> Only list tags of the given object.
>>>
>>> diff --git a/builtin/branch.c b/builtin/branch.c
>>> index 94f7de7fa5..e8d534604c 100644
>>> --- a/builtin/branch.c
>>> +++ b/builtin/branch.c
>>> @@ -548,7 +548,9 @@ int cmd_branch(int argc, const char **argv, const char 
>>> *prefix)
>>> OPT_SET_INT('r', "remotes", , N_("act on 
>>> remote-tracking branches"),
>>> FILTER_REFS_REMOTES),
>>> OPT_CONTAINS(_commit, N_("print only branches 
>>> that contain the commit")),
>>> +   OPT_NO_CONTAINS(_commit, N_("print only branches 
>>> that don't contain the commit")),
>>> OPT_WITH(_commit, N_("print only branches that 
>>> contain the commit")),
>>> +   OPT_WITHOUT(_commit, N_("print only branches 
>>> that don't contain the commit")),
>>
>> s/with_commit/no_commit/
>
> Thanks!
>
> FWIW this is the current status of my WIP v3. I noticed a couple of
> other issues where --contains was mentioned without --no-contains.
>
> diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
> index 4938496194..d9243bf5e4 100644
> --- a/Documentation/git-tag.txt
> +++ b/Documentation/git-tag.txt
> @@ -129 +129 @@ This option is only applicable when listing tags
> without annotation lines.
> -   not secified).
> +   not specified).
> diff --git a/builtin/branch.c b/builtin/branch.c
> index e8d534604c..dd96319726 100644
> --- a/builtin/branch.c
> +++ b/builtin/branch.c
> @@ -553 +553 @@ int cmd_branch(int argc, const char **argv, const char 
> *prefix)
> -   OPT_WITHOUT(_commit, N_("print only
> branches that don't contain the commit")),
> +   OPT_WITHOUT(_commit, N_("print only branches
> that don't contain the commit")),
> diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
> index b1ae2388e6..a11542c4fd 100644
> --- a/builtin/for-each-ref.c
> +++ b/builtin/for-each-ref.c
> @@ -12 +12 @@ static char const * const for_each_ref_usage[] = {
> -   N_("git for-each-ref [--contains []]"),
> +   N_("git for-each-ref [(--contains | --no-contains) []]"),
> diff --git a/builtin/tag.c b/builtin/tag.c
> index d83674e3e6..57289132a9 100644
> --- a/builtin/tag.c
> +++ b/builtin/tag.c
> @@ -25 +25 @@ static const char * const git_tag_usage[] = {
> -   N_("git tag -l [-n[]] [--contains ] [--points-at 
> ]"
> +   N_("git tag -l [-n[]] [--[no-]contains ]
> [--points-at ]"

Add to that:

diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index 8ad5719962..bec3d0fb42 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -1804 +1804 @@ EOF"
-   test_line_count ">" 70 actual
+   test_line_count ">" 10 actual

The tests currently fail with v2 if gpg isn't installed, which will
cause the number to dip below 70; I'm just setting it to a much more
conservative 10, but maybe this should just be "test -s actual" ...

>
> These last two hunks are going to bust the i18n files for most
> languages. Junio/Jiang, in cases like these where I could fix those up
> with a search/replace on po/* without knowing the languages in
> question (since it's purely changing e.g. --contains to
> --[no-]contains), what do you prefer to do, have that as part of this
> patch, or do it after the fact through the normal i18n maintenance
> process?


Re: [PATCH v2] ref-filter: Add --no-contains option to tag/branch/for-each-ref

2017-03-10 Thread Ævar Arnfjörð Bjarmason
On Thu, Mar 9, 2017 at 9:31 PM, Christian Couder
 wrote:
> On Thu, Mar 9, 2017 at 9:02 PM, Ævar Arnfjörð Bjarmason
>  wrote:
>>
>> diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
>> index 525737a5d8..4938496194 100644
>> --- a/Documentation/git-tag.txt
>> +++ b/Documentation/git-tag.txt
>> @@ -12,7 +12,7 @@ SYNOPSIS
>>  'git tag' [-a | -s | -u ] [-f] [-m  | -F ]
>>  [ | ]
>>  'git tag' -d ...
>> -'git tag' [-n[]] -l [--contains ] [--points-at ]
>> +'git tag' [-n[]] -l [--[no-]contains ] [--points-at ]
>> [--column[=] | --no-column] [--create-reflog] [--sort=]
>> [--format=] [--[no-]merged []] [...]
>>  'git tag' -v [--format=] ...
>> @@ -124,6 +124,10 @@ This option is only applicable when listing tags 
>> without annotation lines.
>> Only list tags which contain the specified commit (HEAD if not
>> specified).
>>
>> +--no-contains []::
>> +   Only list tags which don't contain the specified commit (HEAD if
>> +   not secified).
>
> s/secified/specified/
>
>> +
>>  --points-at ::
>> Only list tags of the given object.
>>
>> diff --git a/builtin/branch.c b/builtin/branch.c
>> index 94f7de7fa5..e8d534604c 100644
>> --- a/builtin/branch.c
>> +++ b/builtin/branch.c
>> @@ -548,7 +548,9 @@ int cmd_branch(int argc, const char **argv, const char 
>> *prefix)
>> OPT_SET_INT('r', "remotes", , N_("act on 
>> remote-tracking branches"),
>> FILTER_REFS_REMOTES),
>> OPT_CONTAINS(_commit, N_("print only branches 
>> that contain the commit")),
>> +   OPT_NO_CONTAINS(_commit, N_("print only branches 
>> that don't contain the commit")),
>> OPT_WITH(_commit, N_("print only branches that 
>> contain the commit")),
>> +   OPT_WITHOUT(_commit, N_("print only branches 
>> that don't contain the commit")),
>
> s/with_commit/no_commit/

Thanks!

FWIW this is the current status of my WIP v3. I noticed a couple of
other issues where --contains was mentioned without --no-contains.

diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
index 4938496194..d9243bf5e4 100644
--- a/Documentation/git-tag.txt
+++ b/Documentation/git-tag.txt
@@ -129 +129 @@ This option is only applicable when listing tags
without annotation lines.
-   not secified).
+   not specified).
diff --git a/builtin/branch.c b/builtin/branch.c
index e8d534604c..dd96319726 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -553 +553 @@ int cmd_branch(int argc, const char **argv, const char *prefix)
-   OPT_WITHOUT(_commit, N_("print only
branches that don't contain the commit")),
+   OPT_WITHOUT(_commit, N_("print only branches
that don't contain the commit")),
diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index b1ae2388e6..a11542c4fd 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -12 +12 @@ static char const * const for_each_ref_usage[] = {
-   N_("git for-each-ref [--contains []]"),
+   N_("git for-each-ref [(--contains | --no-contains) []]"),
diff --git a/builtin/tag.c b/builtin/tag.c
index d83674e3e6..57289132a9 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -25 +25 @@ static const char * const git_tag_usage[] = {
-   N_("git tag -l [-n[]] [--contains ] [--points-at ]"
+   N_("git tag -l [-n[]] [--[no-]contains ]
[--points-at ]"


These last two hunks are going to bust the i18n files for most
languages. Junio/Jiang, in cases like these where I could fix those up
with a search/replace on po/* without knowing the languages in
question (since it's purely changing e.g. --contains to
--[no-]contains), what do you prefer to do, have that as part of this
patch, or do it after the fact through the normal i18n maintenance
process?


Re: [PATCH] branch & tag: Add a --no-contains option

2017-03-10 Thread Ævar Arnfjörð Bjarmason
On Thu, Mar 9, 2017 at 3:55 PM, Jeff King  wrote:
> On Thu, Mar 09, 2017 at 03:52:09PM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> -   filter->with_commit_tag_algo = 1;
>> +   if ((filter->merge_commit + filter->with_commit +
>> filter->no_commit) > 1)
>> +   filter->with_commit_tag_algo = 0;
>> +   else
>> +   filter->with_commit_tag_algo = 1;
>> filter_refs(, filter, FILTER_REFS_TAGS);
>> ref_array_sort(sorting, );
>>
>> I think I'll amend the tip of my WIP v2 to have something like that,
>> and then we can follow-up with making these cases where you supply
>> multiple options faster.
>
> Yeah, that's another option.  I think you may find that it's unbearably
> slow if you have a lot of tags.

It's what I'm going for in my v2 currently. I think it's a rare enough
(and new) use-case that it's OK to slow it down for now & investigate
your line of patching separately.

>> > Looking at this, I'm pretty sure that using "--contains" with "--merged"
>> > has similar problems, as they both use the UNINTERESTING bit. So even
>> > without your patch, there is a lurking bug.
>>
>> I'm currently running this:
>> https://gist.github.com/avar/45cf288ce7cdc43e7395c6cbf9a98d68
>
> Cute. I'll be curious if it turns up anything.

Currently at ~500 attempts out of 5K with no bad results.


[GSoC] Discussion of "Submodule related work" project

2017-03-10 Thread Valery Tolstov

Have some questions about "Submodule related work" project

First of all, I would like to add this task to the project, if I
take it:

https://public-inbox.org/git/1488913150.881...@smtp.yandex.ru/T/
What do you think about this task?

> Cleanup our test suite. Do not use a repo itself as a submodule for 
itself


Not being quite familiar with submodules yet, why is this considered
a problem (i.e. using a repo as a submodule of itself)?

> (Advanced datastructure knowledge required?) Protect submodule from 
gc-ing

> interesting HEADS.

Can you provide a small example that shows the problem, please?
And why is advanced data structure knowledge expected?

Maybe you have something else about this project to say.

Thanks,
 Valery Tolstov




BUG: "git branch --contains " does nothing, silently fails

2017-03-10 Thread Ævar Arnfjörð Bjarmason
Ran into this when preparing my --no-contains series, this is a long
standing bug:

$ ./git branch -D test; ./git branch --contains v2.8.0 test; echo
$?; git rev-parse test
error: branch 'test' not found.
0
test
fatal: ambiguous argument 'test': unknown revision or path not in
the working tree.

I.e. this should return an error like "git tag" does:

$ ./git tag -d test; ./git tag --contains v2.8.0 test; echo $?;
git rev-parse test
error: tag 'test' not found.
fatal: --contains option is only allowed with -l.
128
test
fatal: ambiguous argument 'test': unknown revision or path not in
the working tree.

I briefly looked through builtin/branch.c, couldn't find a trivial way
to patch it, unfamiliar with the code, didn't want to forget about it,
hence this E-Mail.


Stable GnuPG interface, git should use GPGME

2017-03-10 Thread Bernhard E. Reiter
Dear Git-Devs,

git uses a pipe-and-exec approach to running a GnuPG binary
as written in the documentation [1]:

gpg.program
   Use this custom program instead of "gpg" found on $PATH when making
   or verifying a PGP signature. The program must support the same
   command-line interface as GPG

please consider using libgpgme to interface with GnuPG, because the gpg
command-line interface is not considered an official API to GnuPG by the
GnuPG devs and is thus potentially unstable.

== Details

I'm involved in GnuPG development. For most applications, using libgpgme is
what the GnuPG devs would recommend; also see

  https://wiki.gnupg.org/APIs .

GnuPG devs are making a good effort to keep the command-line
interface stable, though that is not guaranteed. Git is only using a small part
of the interface, so the risk of keeping the current approach is small.
Still I believe git's stability and usability would profit from moving to
libgpgme, especially with the coming move to GnuPG 2.2, better diagnostic
messages and cross-platform usage.
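
To give an idea of the shape of the API, here is a minimal, illustrative
sketch of creating a detached signature through GPGME (error handling and
reading the signature back out of "sig" are omitted, and the wrapper
function itself is made up; only the gpgme_* calls are the library API):

	#include <gpgme.h>

	static gpgme_error_t sign_buffer(const char *buf, size_t len,
					 gpgme_data_t *sig)
	{
		gpgme_ctx_t ctx;
		gpgme_data_t in;
		gpgme_error_t err;

		gpgme_check_version(NULL);	/* mandatory one-time library init */
		err = gpgme_new(&ctx);
		if (err)
			return err;
		gpgme_set_armor(ctx, 1);	/* like gpg --armor */
		gpgme_data_new_from_mem(&in, buf, len, 0);
		gpgme_data_new(sig);
		err = gpgme_op_sign(ctx, in, *sig, GPGME_SIG_MODE_DETACH);
		gpgme_data_release(in);
		gpgme_release(ctx);
		return err;
	}

GPGME also picks the gpg backend itself, which is what avoids the gpg vs.
gpg2 problem described below.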

== Usability problem with `gpg2` vs `gpg`

My use case today was signing: git found the `gpg` binary on $PATH by
default and the command failed. The reason is that I have `gpg2` installed
and most applications use it right away. So git failed signing because
the .gnupg configuration of the user was not ready for the old `gpg`, which is
still installed on Debian GNU/Linux for the operating system's own purposes. If
git had used libgpgme, gpgme would have chosen the most up-to-date
version of `gpg` available (or configured) without me intervening via
gpg.program. Now because of this problem you could add a check for `gpg2`
and fall back to `gpg`, but even better would be to move to libgpgme. >:)
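
(As a stopgap, the gpg.program knob quoted above already covers this
particular case, e.g. `git config --global gpg.program gpg2`.)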

Best Regards and thanks for maintaining Git as Free Software,
Bernhard

== how to respond

ps: Please copy me on replies as I am not on git@vger.kernel.org. 
pps: I've copied gnupg-devel@ so they can see I've sent this report; you don't
have to.


[1] 
https://github.com/git/git/blob/3bc53220cb2dcf709f7a027a3f526befd021d858/Documentation/config.txt
search for 'gpg.program'.

-- 
www.intevation.de/~bernhard (CEO) +49 541 33 508 3-3
Intevation GmbH, Osnabrück, Germany; Amtsgericht Osnabrück, HRB 18998
Owned and run by Frank Koormann, Bernhard Reiter, Dr. Jan-Oliver Wagner




[Patch v2] t2027: avoid using pipes

2017-03-10 Thread Prathamesh Chavan
From: Prathamesh 

Whenever a git command is present upstream of a pipe, its failure
gets masked by the pipe, so this arrangement should be avoided when testing
the upstream git command. By writing the output of the git command to
a file instead, we can test the exit codes of both commands, since a failure
exit code from either one is able to stop the && chain.

Signed-off-by: Prathamesh 
---
The new version of the patch includes the suggested change of putting the git
command that was upstream of the grep command on its own line. Some more
changes similar to that one were made as well.
Another reason for the newer version is to correct the previous mistake
of not including the patch version, as well as to get more familiar with
the patch submission process.


 t/t2027-worktree-list.sh | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/t/t2027-worktree-list.sh b/t/t2027-worktree-list.sh
index 848da5f36..a3e77fee5 100755
--- a/t/t2027-worktree-list.sh
+++ b/t/t2027-worktree-list.sh
@@ -31,7 +31,8 @@ test_expect_success '"list" all worktrees from main' '
test_when_finished "rm -rf here && git worktree prune" &&
git worktree add --detach here master &&
echo "$(git -C here rev-parse --show-toplevel) $(git rev-parse --short 
HEAD) (detached HEAD)" >>expect &&
-   git worktree list | sed "s/  */ /g" >actual &&
+   git worktree list >out &&
+   sed "s/  */ /g" <out >actual &&
test_cmp expect actual
 '
 
@@ -40,7 +41,8 @@ test_expect_success '"list" all worktrees from linked' '
test_when_finished "rm -rf here && git worktree prune" &&
git worktree add --detach here master &&
echo "$(git -C here rev-parse --show-toplevel) $(git rev-parse --short 
HEAD) (detached HEAD)" >>expect &&
-   git -C here worktree list | sed "s/  */ /g" >actual &&
+   git -C here worktree list >out &&
+   sed "s/  */ /g" <out >actual &&
test_cmp expect actual
 '
 
@@ -73,7 +75,8 @@ test_expect_success '"list" all worktrees from bare main' '
git -C bare1 worktree add --detach ../there master &&
echo "$(pwd)/bare1 (bare)" >expect &&
echo "$(git -C there rev-parse --show-toplevel) $(git -C there 
rev-parse --short HEAD) (detached HEAD)" >>expect &&
-   git -C bare1 worktree list | sed "s/  */ /g" >actual &&
+   git -C bare1 worktree list >out &&
+   sed "s/  */ /g" <out >actual &&
test_cmp expect actual
 '
 
@@ -96,7 +99,8 @@ test_expect_success '"list" all worktrees from linked with a 
bare main' '
git -C bare1 worktree add --detach ../there master &&
echo "$(pwd)/bare1 (bare)" >expect &&
echo "$(git -C there rev-parse --show-toplevel) $(git -C there 
rev-parse --short HEAD) (detached HEAD)" >>expect &&
-   git -C there worktree list | sed "s/  */ /g" >actual &&
+   git -C there worktree list >out &&
+   sed "s/  */ /g" <out >actual &&
test_cmp expect actual
 '
 
@@ -118,9 +122,11 @@ test_expect_success 'broken main worktree still at the 
top' '
cd linked &&
echo "worktree $(pwd)" >expected &&
echo "ref: .broken" >../.git/HEAD &&
-   git worktree list --porcelain | head -n 3 >actual &&
+   git worktree list --porcelain >out &&
+   head -n 3 out >actual &&
test_cmp ../expected actual &&
-   git worktree list | head -n 1 >actual.2 &&
+   git worktree list >out &&
+   head -n 1 out >actual.2 &&
grep -F "(error)" actual.2
)
 '
@@ -134,7 +140,8 @@ test_expect_success 'linked worktrees are sorted' '
test_commit new &&
git worktree add ../first &&
git worktree add ../second &&
-   git worktree list --porcelain | grep ^worktree >actual
+   git worktree list --porcelain >out &&
+   grep ^worktree out >actual
) &&
cat >expected <<-EOF &&
worktree $(pwd)/sorted/main
-- 
2.11.0



Re: [PATCH] blame: move blame_entry duplication to add_blame_entry()

2017-03-10 Thread Jeff King
On Fri, Mar 10, 2017 at 01:12:59AM +0100, René Scharfe wrote:

> All callers of add_blame_entry() allocate and copy the second argument.
> Let the function do it for them, reducing code duplication.

I assume you found this due to the DUPLICATE() discussion elsewhere.
Regardless of the results of that discussion, I think this is a nice
simplification.
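
For readers following along, the shape of the change is roughly this
(a simplified sketch with made-up variable names, not the actual patch):

	/* before: every caller allocated and copied the template entry itself */
	struct blame_entry *e = xmalloc(sizeof(*e));
	memcpy(e, &tmpl, sizeof(*e));
	add_blame_entry(sb, e);

	/* after: hand over the template and let add_blame_entry() duplicate it */
	add_blame_entry(sb, &tmpl);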

-Peff


[PATCH] Makefile: detect errors in running spatch

2017-03-10 Thread Jeff King
The "make coccicheck" target runs spatch against each source
file. But it does so in a for loop, so "make" never sees the
exit code of spatch. Worse, it redirects stderr to a log
file, so the user has no indication of any failure. And then
to top it all off, because we touched the patch file's
mtime, make will refuse to repeat the command because it
thinks the target is up-to-date.

So for example:

  $ make coccicheck SPATCH=does-not-exist
  SPATCH contrib/coccinelle/free.cocci
  SPATCH contrib/coccinelle/qsort.cocci
  SPATCH contrib/coccinelle/xstrdup_or_null.cocci
  SPATCH contrib/coccinelle/swap.cocci
  SPATCH contrib/coccinelle/strbuf.cocci
  SPATCH contrib/coccinelle/object_id.cocci
  SPATCH contrib/coccinelle/array.cocci
  $ make coccicheck SPATCH=does-not-exist
  make: Nothing to be done for 'coccicheck'.

With this patch, you get:

  $ make coccicheck SPATCH=does-not-exist
   SPATCH contrib/coccinelle/free.cocci
  /bin/sh: 4: does-not-exist: not found
  Makefile:2338: recipe for target 'contrib/coccinelle/free.cocci.patch' failed
  make: *** [contrib/coccinelle/free.cocci.patch] Error 1

It also dumps the log on failure, so any errors from spatch
itself (like syntax errors in our .cocci files) will be seen
by the user.

Signed-off-by: Jeff King 
---
This shell code is getting a bit unwieldy to stick inside the Makefile,
with all the line continuation and $-escaping.  It might be worth moving
it into a helper script.

It also doesn't help that shells are awkward at passing status out of a
for-loop. I think the most "make-ish" way of doing this would actually
be to lose the for loop and have a per-cocci-per-source target.

I don't know if that would make the patches harder to apply. The results
aren't full patches, so I assume you usually do some kind of munging on
them? I resorted to:

  make coccicheck SPATCH='spatch --in-place'

 Makefile | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 9ec6065cc..d97633892 100644
--- a/Makefile
+++ b/Makefile
@@ -2336,9 +2336,17 @@ check: common-cmds.h
 C_SOURCES = $(patsubst %.o,%.c,$(C_OBJ))
 %.cocci.patch: %.cocci $(C_SOURCES)
@echo '' SPATCH $<; \
+   ret=0; \
for f in $(C_SOURCES); do \
-   $(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS); \
-   done >$@ 2>$@.log; \
+   $(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS) || \
+   { ret=$$?; break; }; \
+   done >$@+ 2>$@.log; \
+   if test $$ret != 0; \
+   then \
+   cat $@.log; \
+   exit 1; \
+   fi; \
+   mv $@+ $@; \
if test -s $@; \
then \
echo '' SPATCH result: $@; \
-- 
2.12.0.450.gd7e60cc16


Re: [PATCH v1] Travis: also test on 32-bit Linux

2017-03-10 Thread Jeff King
On Fri, Mar 10, 2017 at 01:14:16AM +0100, René Scharfe wrote:

> >   2. Ones which just copy a single object, like:
> > 
> >memcpy(&dst, &src, sizeof(dst));
> > 
> >  Perhaps we should be using struct assignment like:
> > 
> >dst = src;
> > 
> >  here. It's safer and it should give the compiler more room to
> >  optimize. The only downside is that if you have pointers, it is
> >  easy to write "dst = src" when you meant "*dst = *src".
> 
> Compilers can usually inline memcpy(3) calls, but assignments are
> shorter and more pleasing to the eye, and we get a type check for
> free.  How about this?

Yeah, I mostly wasn't sure how people felt about "shorter and more
pleasing". It _is_ shorter and there's less to get wrong. But the
memcpy() screams "hey, I am making a copy" and is idiomatic to at least
a certain generation of C programmers.

I guess something like COPY(dst, src) removes the part that you can get
wrong, while still screaming copy. It's not idiomatic either, but at
least it stands out. I dunno.
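
To make the idea concrete, something along these lines (purely
illustrative, no such macro exists in git.git; it assumes a compile-time
assertion helper like BUILD_ASSERT_OR_ZERO() from git-compat-util.h):

	/*
	 * Takes the two objects themselves, not pointers to them, so the
	 * sizeof(dst) vs. sizeof(*dst) mistake cannot happen, and the size
	 * check refuses to compile if the two sides disagree.  It cannot
	 * stop someone passing two pointers when they meant the pointed-to
	 * objects, though.
	 */
	#define COPY(dst, src) \
		memcpy(&(dst), &(src), sizeof(dst) + \
		       BUILD_ASSERT_OR_ZERO(sizeof(dst) == sizeof(src)))

	/* e.g. COPY(*new_entry, *tmpl); instead of memcpy(new_entry, tmpl, sizeof(*tmpl)); */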

> diff --git a/contrib/coccinelle/copy.cocci b/contrib/coccinelle/copy.cocci
> new file mode 100644
> index 00..f0d883932a
> --- /dev/null
> +++ b/contrib/coccinelle/copy.cocci
> @@ -0,0 +1,31 @@
> +@@
> +type T;
> +T dst;
> +T src;
> [...]
> +type T;
> +T dst;
> +T *src;

I think this misses the other two cases: (*dst, src) and (*dst, *src).

Perhaps squash this in?

---
 builtin/blame.c   |  2 +-
 builtin/pack-redundant.c  |  4 ++--
 contrib/coccinelle/copy.cocci | 32 
 diff.c|  4 ++--
 dir.c |  2 +-
 fast-import.c |  6 +++---
 line-log.c|  2 +-
 7 files changed, 42 insertions(+), 10 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index f7aa95f4b..d0d917b37 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -680,7 +680,7 @@ static void dup_entry(struct blame_entry ***queue,
 {
origin_incref(src->suspect);
origin_decref(dst->suspect);
-   memcpy(dst, src, sizeof(*src));
+   *dst = *src;
dst->next = **queue;
**queue = dst;
*queue = &dst->next;
diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
index 72c815844..ac1d8111e 100644
--- a/builtin/pack-redundant.c
+++ b/builtin/pack-redundant.c
@@ -207,7 +207,7 @@ static inline struct pack_list * pack_list_insert(struct 
pack_list **pl,
   struct pack_list *entry)
 {
struct pack_list *p = xmalloc(sizeof(struct pack_list));
-   memcpy(p, entry, sizeof(struct pack_list));
+   *p = *entry;
p->next = *pl;
*pl = p;
return p;
@@ -451,7 +451,7 @@ static void minimize(struct pack_list **min)
while (perm) {
if (is_superset(perm->pl, missing)) {
new_perm = xmalloc(sizeof(struct pll));
-   memcpy(new_perm, perm, sizeof(struct pll));
+   *new_perm = *perm;
new_perm->next = perm_ok;
perm_ok = new_perm;
}
diff --git a/contrib/coccinelle/copy.cocci b/contrib/coccinelle/copy.cocci
index f0d883932..da9969c84 100644
--- a/contrib/coccinelle/copy.cocci
+++ b/contrib/coccinelle/copy.cocci
@@ -29,3 +29,35 @@ T *src;
 - memcpy(&dst, src, sizeof(T));
 + dst = *src;
 )
+
+@@
+type T;
+T *dst;
+T src;
+@@
+(
+- memcpy(dst, &src, sizeof(*dst));
++ *dst = src;
+|
+- memcpy(dst, &src, sizeof(src));
++ *dst = src;
+|
+- memcpy(dst, &src, sizeof(T));
++ *dst = src;
+)
+
+@@
+type T;
+T *dst;
+T *src;
+@@
+(
+- memcpy(dst, src, sizeof(*dst));
++ *dst = *src;
+|
+- memcpy(dst, src, sizeof(*src));
++ *dst = *src;
+|
+- memcpy(dst, src, sizeof(T));
++ *dst = *src;
+)
diff --git a/diff.c b/diff.c
index 051761be4..fbd68ecd0 100644
--- a/diff.c
+++ b/diff.c
@@ -1169,7 +1169,7 @@ static void init_diff_words_data(struct emit_callback 
*ecbdata,
 {
int i;
struct diff_options *o = xmalloc(sizeof(struct diff_options));
-   memcpy(o, orig_opts, sizeof(struct diff_options));
+   *o = *orig_opts;
 
ecbdata->diff_words =
xcalloc(1, sizeof(struct diff_words_data));
@@ -3359,7 +3359,7 @@ static void run_checkdiff(struct diff_filepair *p, struct 
diff_options *o)
 
 void diff_setup(struct diff_options *options)
 {
-   memcpy(options, &default_diff_options, sizeof(*options));
+   *options = default_diff_options;
 
options->file = stdout;
 
diff --git a/dir.c b/dir.c
index 4541f9e14..d3d0039bf 100644
--- a/dir.c
+++ b/dir.c
@@ -2499,7 +2499,7 @@ static int read_one_dir(struct untracked_cache_dir 
**untracked_,
if (next > rd->end)
return -1;
*untracked_ = untracked = xmalloc(st_add(sizeof(*untracked), len));
-   memcpy(untracked, &ud, sizeof(ud));
+   *untracked = ud;
memcpy(untracked->name, data, len +