date:20161102

Add configuration option 'core.allowProtocol' to allow users to create a
whitelist of allowed protocols for fetch/push/clone in their gitconfig.

For git-submodule.sh, fallback to default whitelist only if the user
hasn't explicitly set `GIT_ALLOW_PROTOCOL` or doesn't have a whitelist
in their gitconfig.

Signed-off-by: Brandon Williams 
---
 Documentation/config.txt|  8 
 Documentation/git.txt   |  6 --
 git-submodule.sh|  3 ++-
 t/lib-proto-disable.sh  | 27 +++
 t/t5815-submodule-protos.sh | 22 ++
 transport.c |  6 ++
 6 files changed, 69 insertions(+), 3 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 27069ac..78c97e3 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -455,6 +455,14 @@ core.sshCommand::
the `GIT_SSH_COMMAND` environment variable and is overridden
when the environment variable is set.
 
+core.allowProtocol::
+   Provide a colon-separated list of protocols which are allowed to be
+   used with fetch/push/clone.  Any protocol not mentioned will be
+   disallowed (i.e., this is a whitelist, not a blacklist).  If the
+   `GIT_ALLOW_PROTOCOL` environment variable is set, it is used as the
+   protocol whitelist instead of this config option.  If neither is set,
+   all protocols are enabled. See git(1) for more details.
+
 core.ignoreStat::
If true, Git will avoid using lstat() calls to detect if files have
changed by setting the "assume-unchanged" bit for those tracked files
diff --git a/Documentation/git.txt b/Documentation/git.txt
index ab7215e..a86ec16 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -1155,8 +1155,10 @@ of clones and fetches.
restrict recursive submodule initialization from an untrusted
repository. Any protocol not mentioned will be disallowed (i.e.,
this is a whitelist, not a blacklist). If the variable is not
-   set at all, all protocols are enabled.  The protocol names
-   currently used by git are:
+   set at all, all protocols are enabled.  The exception to this is when
+   running `git-submodule` which will use a default list of known-safe
+   protocols (file:git:http:https:ssh) in the event no whitelist is
+   provided.  The protocol names currently used by git are:
 
  - `file`: any local file-based path (including `file://` URLs,
or local paths)
diff --git a/git-submodule.sh b/git-submodule.sh
index a024a13..bfbfb8e 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -27,7 +27,8 @@ cd_to_toplevel
 #
 # If the user has already specified a set of allowed protocols,
 # we assume they know what they're doing and use that instead.
-: ${GIT_ALLOW_PROTOCOL=file:git:http:https:ssh}
+whitelist=$(git config core.allowProtocol) || whitelist=file:git:http:https:ssh
+: ${GIT_ALLOW_PROTOCOL=$whitelist}
 export GIT_ALLOW_PROTOCOL
 
 command=
diff --git a/t/lib-proto-disable.sh b/t/lib-proto-disable.sh
index b0917d9..e8820a6 100644
--- a/t/lib-proto-disable.sh
+++ b/t/lib-proto-disable.sh
@@ -62,6 +62,33 @@ test_proto () {
test_must_fail git clone --bare "$url" tmp.git
)
'
+
+   # Run tests again using the gitconfig method for setting a whitelist
+   test_expect_success "clone $1 (enabled)" '
+   rm -rf tmp.git &&
+   git -c core.allowProtocol="$proto" clone --bare "$url" tmp.git
+   '
+
+   test_expect_success "fetch $1 (enabled)" '
+   git -C tmp.git -c core.allowProtocol="$proto" fetch
+   '
+
+   test_expect_success "push $1 (enabled)" '
+   git -C tmp.git -c core.allowProtocol="$proto" push origin 
HEAD:pushed
+   '
+
+   test_expect_success "push $1 (disabled)" '
+   test_must_fail git -C tmp.git -c core.allowProtocol=none push 
origin HEAD:pushed
+   '
+
+   test_expect_success "fetch $1 (disabled)" '
+   test_must_fail git -C tmp.git -c core.allowProtocol=none fetch
+   '
+
+   test_expect_success "clone $1 (disabled)" '
+   rm -rf tmp.git &&
+   test_must_fail git -C tmp.git -c core.allowProtocol=none clone 
--bare "$url" tmp.git
+   '
 }
 
 # set up an ssh wrapper that will access $host/$repo in the
diff --git a/t/t5815-submodule-protos.sh b/t/t5815-submodule-protos.sh
index 06f55a1..f85e71c 100755
--- a/t/t5815-submodule-protos.sh
+++ b/t/t5815-submodule-protos.sh
@@ -40,4 +40,26 @@ test_expect_success 'user can override whitelist' '
GIT_ALLOW_PROTOCOL=ext git -C dst submodule update ext-module
 '
 
+test_expect_success 'reset dst repo for config tests' '
+   rm -rf dst &&
+   git clone . dst &&
+   git -C dst submodule init
+'
+
+test_expect_success 'update of ssh not allowed when not in config whitelist' '
+   test_must_fail git -C

Re: [PATCH] transport: add core.allowProtocol config option

2016-11-02 Thread Blake Burkhart

Thanks for CCing me.

I haven't looked at this implementation in detail, but it would be
good to move this configuration into the config system because I think
we can more easily provide a default safe configuration.

It would be nice to use this to introduce a default list of
whitelisted protocols that even applies to `git clone`. I strongly
think we need to find a way to have git-remote-ext disabled by
default. This could be a way to do it.

On Wed, Nov 2, 2016 at 7:22 PM, Jonathan Nieder  wrote:
> That reminds me: external tools also set GIT_ALLOW_PROTOCOL when the
> user hasn't set it explicitly, like git-submodule.sh does.  E.g.
> repo ,
> mercurial 
> .
> Other external tools consume GIT_ALLOW_PROTOCOL, like 'go get'
> .
> Can we make it more convenient for them to support this configuration
> too?

Most of these are my fault too. I encouraged git-repo and mercurial to
use GIT_ALLOW_PROTOCOL to avoid security issues from git-remote-ext.

-- 
Blake Burkhart

Re: [PATCH v11 13/14] convert: add filter..process option

2016-11-02 Thread Lars Schneider


> On 2 Nov 2016, at 18:03, Johannes Sixt  wrote:
> 
>> Am 17.10.2016 um 01:20 schrieb larsxschnei...@gmail.com:
>> +# Compare two files and ensure that `clean` and `smudge` respectively are
>> +# called at least once if specified in the `expect` file. The actual
>> +# invocation count is not relevant because their number can vary.
>> +# c.f. 
>> http://public-inbox.org/git/xmqqshv18i8i@gitster.mtv.corp.google.com/
>> +test_cmp_count () {
>> +expect=$1
>> +actual=$2
>> +for FILE in "$expect" "$actual"
>> +do
>> +sort "$FILE" | uniq -c | sed "s/^[ ]*//" |
>> +sed "s/^\([0-9]\) IN: clean/x IN: clean/" |
>> +sed "s/^\([0-9]\) IN: smudge/x IN: smudge/" >"$FILE.tmp" &&
> 
> This is not sufficiently portable. Some versions of uniq write the
> count left-adjusted, not right-adjusted. How about this on top:
> 
> diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
> index a20b9f58e3..f60858c517 100755
> --- a/t/t0021-conversion.sh
> +++ b/t/t0021-conversion.sh
> @@ -40,10 +40,9 @@ test_cmp_count () {
>actual=$2
>for FILE in "$expect" "$actual"
>do
> -sort "$FILE" | uniq -c | sed "s/^[ ]*//" |
> -sed "s/^\([0-9]\) IN: clean/x IN: clean/" |
> -sed "s/^\([0-9]\) IN: smudge/x IN: smudge/" >"$FILE.tmp" &&
> -mv "$FILE.tmp" "$FILE"
> +sort "$FILE" | uniq -c |
> +sed -e "s/^ *[0-9][0-9]* *IN: /x IN: /" >"$FILE.tmp" &&

This looks good (thanks for cleaning up the redundant clean/smudge stuff - that 
was a refactoring artifact!). One minor nit: doesn't sed understand '[0-9]+' ?

> +mv "$FILE.tmp" "$FILE" || return

Why '|| return' here?

>done &&
>test_cmp "$expect" "$actual"
> }

Thank you,
Lars

Re: [PATCH] transport: add core.allowProtocol config option

2016-11-02 Thread Jonathan Nieder

(+peff and bburky, who introduced GIT_ALLOW_PROTOCOL)
Brandon Williams wrote:

> Add configuration option 'core.allowProtocol' to allow users to create a
> whitelist of allowed protocols for fetch/push/clone in their gitconfig.

Ooh.

This would be especially useful at $DAYJOB, where there is a custom
sso:// protocol that is often used by submodules.  Using an envvar to
whitelist it globally is painful because

 - it disables other protocols even when explicitly requested on a
   plain "git clone" command line by the user.  By comparison, the
   built-in git-submodule.sh whitelist only applies to submodules.

 - platform-specific instructions to set an environment variable can
   be more difficult than "just set this git configuration"

Another difficulty with setting GIT_ALLOW_PROTOCOL globally is that it
requires copy/pasting the default value from upstream and then adding
the values I want.  There's no straightforward way to get the current
value and add to it, in case I want to benefit from future upstream
fixes to the default list.

That is, would it be possible to use something like

[protocol "sso"]
allow = always

instead of

[core]
allowProtocol = file:git:http:https::sso

?

[...]
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -27,7 +27,8 @@ cd_to_toplevel
>  #
>  # If the user has already specified a set of allowed protocols,
>  # we assume they know what they're doing and use that instead.
> -: ${GIT_ALLOW_PROTOCOL=file:git:http:https:ssh}
> +config_whitelist=$(git config core.allowProtocol)
> +: ${GIT_ALLOW_PROTOCOL=${config_whitelist:-file:git:http:https:ssh}}

optional: To avoid config parsing when GIT_ALLOW_PROTOCOL is already
set, could do something like

 if ! test "${GIT_ALLOW_PROTOCOL+set}"
 then
GIT_ALLOW_PROTOCOL=$(
git config --name-only --get-regexp 'protocol\..*\.allow' 
always |
sed -e 's/^protocol.//' -e 's/.allow$//' |
tr '\n' ':'
)
GIT_ALLOW_PROTOCOL=${GIT_ALLOW_PROTOCOL%:}
: ${GIT_ALLOW_PROTOCOL:=file:git:http:https:ssh}
 fi

[...]
> --- a/transport.c
> +++ b/transport.c
> @@ -652,7 +652,7 @@ static const struct string_list *protocol_whitelist(void)
>  
>   if (enabled < 0) {
>   const char *v = getenv("GIT_ALLOW_PROTOCOL");
> - if (v) {
> + if (v || !git_config_get_value("core.allowProtocol", )) {
>   string_list_split(, v, ':', -1);

This has the effect of always disabling other protocols when
core.allowProtocol is set.  Is that intended?

Like the default list used by submodule, I'd be happiest if this only
applied to repositories cloned implicitly instead of those passed
directly to 'git clone'.

That reminds me: external tools also set GIT_ALLOW_PROTOCOL when the
user hasn't set it explicitly, like git-submodule.sh does.  E.g.
repo ,
mercurial 
.
Other external tools consume GIT_ALLOW_PROTOCOL, like 'go get'
.
Can we make it more convenient for them to support this configuration
too?

An example approach would be a GIT_ALLOW_PROTOCOL var returned by
"git var".

That way git-submodule.sh could do

: ${GIT_ALLOW_PROTOCOL=$(git var GIT_ALLOW_PROTOCOL)}

and it would just work.  Other tools could do the same, with a
fallback to the current default until new enough git is in widespread
use.

Thanks and hope that helps,
Jonathan

Re: [PATCH] transport: add core.allowProtocol config option

On Wed, Nov 02, 2016 at 04:46:13PM -0700, Brandon Williams wrote:

> > > I thought at first we'd have to deal with leaking "v", but "get_value"
> > > is the "raw" version that gives you the uninterpreted value. I think
> > > that means it may give you NULL, though if we see an implicit bool like:
> > > 
> > >   [core]
> > >   allowProtocol
> > > 
> > > That's nonsense, of course, but we would still segfault. I
> > > think the easiest way to test is:
> > > 
> > >   git -c core.allowProtocol fetch
> > > 
> > > which seems to segfault for me with this patch.
> > 
> > what is the desired behavior when a user provides a config in a way that
> > isn't intended?
> 
> oh...I can just drop in git_config_get_string_const() instead.

Yes, it will call git_config_string(), which will make sure there's an
actual value and die otherwise. But note that it also duplicates the
string, so you'd have to deal with freeing it.

-Peff

Re: [PATCH] transport: add core.allowProtocol config option

> > > diff --git a/transport.c b/transport.c
> > > index d57e8de..b1098cd 100644
> > > --- a/transport.c
> > > +++ b/transport.c
> > > @@ -652,7 +652,7 @@ static const struct string_list 
> > > *protocol_whitelist(void)
> > >  
> > >   if (enabled < 0) {
> > >   const char *v = getenv("GIT_ALLOW_PROTOCOL");
> > > - if (v) {
> > > + if (v || !git_config_get_value("core.allowProtocol", )) {
> > >   string_list_split(, v, ':', -1);
> > >   string_list_sort();
> > >   enabled = 1;
> > 
> > I thought at first we'd have to deal with leaking "v", but "get_value"
> > is the "raw" version that gives you the uninterpreted value. I think
> > that means it may give you NULL, though if we see an implicit bool like:
> > 
> >   [core]
> >   allowProtocol
> > 
> > That's nonsense, of course, but we would still segfault. I
> > think the easiest way to test is:
> > 
> >   git -c core.allowProtocol fetch
> > 
> > which seems to segfault for me with this patch.
> 
> what is the desired behavior when a user provides a config in a way that
> isn't intended?

oh...I can just drop in git_config_get_string_const() instead.

-- 
Brandon Williams

Re: [PATCH] transport: add core.allowProtocol config option

On 11/02, Jeff King wrote:
> On Wed, Nov 02, 2016 at 07:05:39PM -0400, Jeff King wrote:
> 
> > > +core.allowProtocol::
> > > + Provide a colon-separated list of protocols which are allowed to be
> > > + used with fetch/push/clone. This is useful to restrict recursive
> > > + submodule initialization from an untrusted repository. Any protocol not
> > > + mentioned will be disallowed (i.e., this is a whitelist, not a
> > > + blacklist). If the variable is not set at all, all protocols are
> > > + enabled. If the `GIT_ALLOW_PROTOCOL` enviornment variable is set, it is
> > > + used as the protocol whitelist instead of this config option.
> > 
> > The "not set at all, all protocols are enabled" bit is not quite
> > correct, is it? It is true for a top-level fetch, but not for submodule
> > recursion (and especially since you are talking about submodule
> > recursion immediately before, it is rather confusing).
> 
> Heh, just saw that you copied this straight from the discussion of
> GIT_ALLOW_PROTOCOL. What idiot wrote the original? :)
> 
> It might be worth fixing both places (or possibly just fixing the
> original and phrasing this one as "If GIT_ALLOW_PROTOCOL is not set, use
> this as the default value; see git(1) for details").
> 
> -Peff

haha K I'll fix the original as well.

-- 
Brandon Williams

Re: [PATCH] transport: add core.allowProtocol config option

On 11/02, Jeff King wrote:
> On Wed, Nov 02, 2016 at 03:20:47PM -0700, Brandon Williams wrote:
> 
> > Add configuration option 'core.allowProtocol' to allow users to create a
> > whitelist of allowed protocols for fetch/push/clone in their gitconfig.
> > 
> > For git-submodule.sh, fallback to default whitelist only if the user
> > hasn't explicitly set `GIT_ALLOW_PROTOCOL` or doesn't have a whitelist
> > in their gitconfig.
> 
> This says "what", but not "why". What's the use case?
> 
> I can see somebody wanting to pare down the whitelist further (e.g.,
> because they are carrying ssh credentials that they don't want to use on
> behalf of a malicious repo). But in general I'd expect this setting to
> be a function of the environment you're operating in, and not the
> on-disk config.
> 
> Or is the intent to broaden it for cases where you have a clone that
> uses some non-standard protocol, and you want it to Just Work on
> subsequent recursive fetches?
> 
> > +core.allowProtocol::
> > +   Provide a colon-separated list of protocols which are allowed to be
> > +   used with fetch/push/clone. This is useful to restrict recursive
> > +   submodule initialization from an untrusted repository. Any protocol not
> > +   mentioned will be disallowed (i.e., this is a whitelist, not a
> > +   blacklist). If the variable is not set at all, all protocols are
> > +   enabled. If the `GIT_ALLOW_PROTOCOL` enviornment variable is set, it is
> > +   used as the protocol whitelist instead of this config option.
> 
> The "not set at all, all protocols are enabled" bit is not quite
> correct, is it? It is true for a top-level fetch, but not for submodule
> recursion (and especially since you are talking about submodule
> recursion immediately before, it is rather confusing).

Yeah stefan mentioned this to me.  I simply copied the documentaion from
GIT_ALLOW_PROTOCOL, perhaps that should be updated as well?

> 
> > --- a/git-submodule.sh
> > +++ b/git-submodule.sh
> > @@ -27,7 +27,8 @@ cd_to_toplevel
> >  #
> >  # If the user has already specified a set of allowed protocols,
> >  # we assume they know what they're doing and use that instead.
> > -: ${GIT_ALLOW_PROTOCOL=file:git:http:https:ssh}
> > +config_whitelist=$(git config core.allowProtocol)
> > +: ${GIT_ALLOW_PROTOCOL=${config_whitelist:-file:git:http:https:ssh}}
> 
> The original uses "=" without a ":" so that an empty variable takes
> precedence over the stock list (i.e., allowing nothing). Would you want
> the same behavior for the config variable? I.e.:
> 
>   # this should probably allow nothing, right?
>   git config core.allowProtocol ""
> 
> I think you'd have to check the return code of "git config" to
> distinguish those cases.

Oh, I didn't think of that case.  That can be done easy enough, just
makes the code a bit more verbose.

> 
> > diff --git a/transport.c b/transport.c
> > index d57e8de..b1098cd 100644
> > --- a/transport.c
> > +++ b/transport.c
> > @@ -652,7 +652,7 @@ static const struct string_list 
> > *protocol_whitelist(void)
> >  
> > if (enabled < 0) {
> > const char *v = getenv("GIT_ALLOW_PROTOCOL");
> > -   if (v) {
> > +   if (v || !git_config_get_value("core.allowProtocol", )) {
> > string_list_split(, v, ':', -1);
> > string_list_sort();
> > enabled = 1;
> 
> I thought at first we'd have to deal with leaking "v", but "get_value"
> is the "raw" version that gives you the uninterpreted value. I think
> that means it may give you NULL, though if we see an implicit bool like:
> 
>   [core]
>   allowProtocol
> 
> That's nonsense, of course, but we would still segfault. I
> think the easiest way to test is:
> 
>   git -c core.allowProtocol fetch
> 
> which seems to segfault for me with this patch.

what is the desired behavior when a user provides a config in a way that
isn't intended?

-- 
Brandon Williams

[PATCH 2/3] submodule-config: rename commit_sha1 to commit_or_tree

It is also possible to pass in a tree hash to lookup a submodule config.
Make it clear by naming the variables accrodingly. Looking up a submodule
config by tree hash will come in handy in a later patch.

Signed-off-by: Stefan Beller 
---
 Documentation/technical/api-submodule-config.txt |  4 +-
 submodule-config.c   | 47 
 submodule-config.h   |  4 +-
 t/t7411-submodule-config.sh  | 11 ++
 4 files changed, 39 insertions(+), 27 deletions(-)

diff --git a/Documentation/technical/api-submodule-config.txt 
b/Documentation/technical/api-submodule-config.txt
index 941fa178dd..81921e477b 100644
--- a/Documentation/technical/api-submodule-config.txt
+++ b/Documentation/technical/api-submodule-config.txt
@@ -47,11 +47,11 @@ Functions
Can be passed to the config parsing infrastructure to parse
local (worktree) submodule configurations.
 
-`const struct submodule *submodule_from_path(const unsigned char *commit_sha1, 
const char *path)`::
+`const struct submodule *submodule_from_path(const unsigned char 
*commit_or_tree, const char *path)`::
 
Lookup values for one submodule by its commit_sha1 and path.
 
-`const struct submodule *submodule_from_name(const unsigned char *commit_sha1, 
const char *name)`::
+`const struct submodule *submodule_from_name(const unsigned char 
*commit_or_tree, const char *name)`::
 
The same as above but lookup by name.
 
diff --git a/submodule-config.c b/submodule-config.c
index 15ffab6af4..4c5f5d074b 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -263,12 +263,12 @@ int parse_push_recurse_submodules_arg(const char *opt, 
const char *arg)
return parse_push_recurse(opt, arg, 1);
 }
 
-static void warn_multiple_config(const unsigned char *commit_sha1,
+static void warn_multiple_config(const unsigned char *commit_or_tree,
 const char *name, const char *option)
 {
const char *commit_string = "WORKTREE";
-   if (commit_sha1)
-   commit_string = sha1_to_hex(commit_sha1);
+   if (commit_or_tree)
+   commit_string = sha1_to_hex(commit_or_tree);
warning("%s:.gitmodules, multiple configurations found for "
"'submodule.%s.%s'. Skipping second one!",
commit_string, name, option);
@@ -276,7 +276,7 @@ static void warn_multiple_config(const unsigned char 
*commit_sha1,
 
 struct parse_config_parameter {
struct submodule_cache *cache;
-   const unsigned char *commit_sha1;
+   const unsigned char *commit_or_tree;
const unsigned char *gitmodules_sha1;
int overwrite;
 };
@@ -300,7 +300,7 @@ static int parse_config(const char *var, const char *value, 
void *data)
if (!value)
ret = config_error_nonbool(var);
else if (!me->overwrite && submodule->path)
-   warn_multiple_config(me->commit_sha1, submodule->name,
+   warn_multiple_config(me->commit_or_tree, 
submodule->name,
"path");
else {
if (submodule->path)
@@ -314,7 +314,7 @@ static int parse_config(const char *var, const char *value, 
void *data)
int die_on_error = is_null_sha1(me->gitmodules_sha1);
if (!me->overwrite &&
submodule->fetch_recurse != RECURSE_SUBMODULES_NONE)
-   warn_multiple_config(me->commit_sha1, submodule->name,
+   warn_multiple_config(me->commit_or_tree, 
submodule->name,
"fetchrecursesubmodules");
else
submodule->fetch_recurse = parse_fetch_recurse(
@@ -324,7 +324,7 @@ static int parse_config(const char *var, const char *value, 
void *data)
if (!value)
ret = config_error_nonbool(var);
else if (!me->overwrite && submodule->ignore)
-   warn_multiple_config(me->commit_sha1, submodule->name,
+   warn_multiple_config(me->commit_or_tree, 
submodule->name,
"ignore");
else if (strcmp(value, "untracked") &&
 strcmp(value, "dirty") &&
@@ -340,7 +340,7 @@ static int parse_config(const char *var, const char *value, 
void *data)
if (!value) {
ret = config_error_nonbool(var);
} else if (!me->overwrite && submodule->url) {
-   warn_multiple_config(me->commit_sha1, submodule->name,
+   warn_multiple_config(me->commit_or_tree, 
submodule->name,
"url");
} else {
free((void *) submodule->url);
@@ -351,21 +351,21 @@ static int

[PATCH 3/3] submodule-config: clarify parsing of null_sha1 element

Signed-off-by: Stefan Beller 
---
 Documentation/technical/api-submodule-config.txt | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/Documentation/technical/api-submodule-config.txt 
b/Documentation/technical/api-submodule-config.txt
index 81921e477b..1df7a827ff 100644
--- a/Documentation/technical/api-submodule-config.txt
+++ b/Documentation/technical/api-submodule-config.txt
@@ -55,8 +55,11 @@ Functions
 
The same as above but lookup by name.
 
-If given the null_sha1 as commit_sha1 the local configuration of a
-submodule will be returned (e.g. consolidated values from local git
+Whenever a submodule configuration is parsed in `parse_submodule_config_option`
+via e.g. `gitmodules_config()`, it will be overwrite the entry with the sha1
+zeroed out.  So in the normal case, when HEAD:.gitmodules is parsed first and
+then overlayed with the repository configuration, the null_sha1 entry contains
+the local configuration of a submodule (e.g. consolidated values from local git
 configuration and the .gitmodules file in the worktree).
 
 For an example usage see test-submodule-config.c.
-- 
2.10.2.621.g399b625.dirty

[PATCH 0/3] submodule-config: clarify/cleanup docs and header

A small series that would have helped me understand the submodule config
once again.

Thanks,
Stefan

Stefan Beller (3):
  submodule config: inline config_from_{name, path}
  submodule-config: rename commit_sha1 to commit_or_tree
  submodule-config: clarify parsing of null_sha1 element

 Documentation/technical/api-submodule-config.txt | 11 +++--
 submodule-config.c   | 59 ++--
 submodule-config.h   |  4 +-
 t/t7411-submodule-config.sh  | 11 +
 4 files changed, 44 insertions(+), 41 deletions(-)

-- 
2.10.2.621.g399b625.dirty

[PATCH 1/3] submodule config: inline config_from_{name, path}

There is no other user of config_from_{name, path}, such that there is no
reason for the existence of these one liner functions. Just inline these
to increase readability.

Signed-off-by: Stefan Beller 
---
 submodule-config.c | 16 ++--
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/submodule-config.c b/submodule-config.c
index 098085be69..15ffab6af4 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -471,18 +471,6 @@ static const struct submodule *config_from(struct 
submodule_cache *cache,
return submodule;
 }
 
-static const struct submodule *config_from_path(struct submodule_cache *cache,
-   const unsigned char *commit_sha1, const char *path)
-{
-   return config_from(cache, commit_sha1, path, lookup_path);
-}
-
-static const struct submodule *config_from_name(struct submodule_cache *cache,
-   const unsigned char *commit_sha1, const char *name)
-{
-   return config_from(cache, commit_sha1, name, lookup_name);
-}
-
 static void ensure_cache_init(void)
 {
if (is_cache_init)
@@ -508,14 +496,14 @@ const struct submodule *submodule_from_name(const 
unsigned char *commit_sha1,
const char *name)
 {
ensure_cache_init();
-   return config_from_name(_submodule_cache, commit_sha1, name);
+   return config_from(_submodule_cache, commit_sha1, name, 
lookup_name);
 }
 
 const struct submodule *submodule_from_path(const unsigned char *commit_sha1,
const char *path)
 {
ensure_cache_init();
-   return config_from_path(_submodule_cache, commit_sha1, path);
+   return config_from(_submodule_cache, commit_sha1, path, 
lookup_path);
 }
 
 void submodule_free(void)
-- 
2.10.2.621.g399b625.dirty

Re: [PATCH] transport: add core.allowProtocol config option

On Wed, Nov 02, 2016 at 07:05:39PM -0400, Jeff King wrote:

> > +core.allowProtocol::
> > +   Provide a colon-separated list of protocols which are allowed to be
> > +   used with fetch/push/clone. This is useful to restrict recursive
> > +   submodule initialization from an untrusted repository. Any protocol not
> > +   mentioned will be disallowed (i.e., this is a whitelist, not a
> > +   blacklist). If the variable is not set at all, all protocols are
> > +   enabled. If the `GIT_ALLOW_PROTOCOL` enviornment variable is set, it is
> > +   used as the protocol whitelist instead of this config option.
> 
> The "not set at all, all protocols are enabled" bit is not quite
> correct, is it? It is true for a top-level fetch, but not for submodule
> recursion (and especially since you are talking about submodule
> recursion immediately before, it is rather confusing).

Heh, just saw that you copied this straight from the discussion of
GIT_ALLOW_PROTOCOL. What idiot wrote the original? :)

It might be worth fixing both places (or possibly just fixing the
original and phrasing this one as "If GIT_ALLOW_PROTOCOL is not set, use
this as the default value; see git(1) for details").

-Peff

Re: [PATCH] transport: add core.allowProtocol config option

On Wed, Nov 02, 2016 at 03:20:47PM -0700, Brandon Williams wrote:

> Add configuration option 'core.allowProtocol' to allow users to create a
> whitelist of allowed protocols for fetch/push/clone in their gitconfig.
> 
> For git-submodule.sh, fallback to default whitelist only if the user
> hasn't explicitly set `GIT_ALLOW_PROTOCOL` or doesn't have a whitelist
> in their gitconfig.

This says "what", but not "why". What's the use case?

I can see somebody wanting to pare down the whitelist further (e.g.,
because they are carrying ssh credentials that they don't want to use on
behalf of a malicious repo). But in general I'd expect this setting to
be a function of the environment you're operating in, and not the
on-disk config.

Or is the intent to broaden it for cases where you have a clone that
uses some non-standard protocol, and you want it to Just Work on
subsequent recursive fetches?

> +core.allowProtocol::
> + Provide a colon-separated list of protocols which are allowed to be
> + used with fetch/push/clone. This is useful to restrict recursive
> + submodule initialization from an untrusted repository. Any protocol not
> + mentioned will be disallowed (i.e., this is a whitelist, not a
> + blacklist). If the variable is not set at all, all protocols are
> + enabled. If the `GIT_ALLOW_PROTOCOL` enviornment variable is set, it is
> + used as the protocol whitelist instead of this config option.

The "not set at all, all protocols are enabled" bit is not quite
correct, is it? It is true for a top-level fetch, but not for submodule
recursion (and especially since you are talking about submodule
recursion immediately before, it is rather confusing).

> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -27,7 +27,8 @@ cd_to_toplevel
>  #
>  # If the user has already specified a set of allowed protocols,
>  # we assume they know what they're doing and use that instead.
> -: ${GIT_ALLOW_PROTOCOL=file:git:http:https:ssh}
> +config_whitelist=$(git config core.allowProtocol)
> +: ${GIT_ALLOW_PROTOCOL=${config_whitelist:-file:git:http:https:ssh}}

The original uses "=" without a ":" so that an empty variable takes
precedence over the stock list (i.e., allowing nothing). Would you want
the same behavior for the config variable? I.e.:

  # this should probably allow nothing, right?
  git config core.allowProtocol ""

I think you'd have to check the return code of "git config" to
distinguish those cases.

> diff --git a/transport.c b/transport.c
> index d57e8de..b1098cd 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -652,7 +652,7 @@ static const struct string_list *protocol_whitelist(void)
>  
>   if (enabled < 0) {
>   const char *v = getenv("GIT_ALLOW_PROTOCOL");
> - if (v) {
> + if (v || !git_config_get_value("core.allowProtocol", )) {
>   string_list_split(, v, ':', -1);
>   string_list_sort();
>   enabled = 1;

I thought at first we'd have to deal with leaking "v", but "get_value"
is the "raw" version that gives you the uninterpreted value. I think
that means it may give you NULL, though if we see an implicit bool like:

  [core]
  allowProtocol

That's nonsense, of course, but we would still segfault. I
think the easiest way to test is:

  git -c core.allowProtocol fetch

which seems to segfault for me with this patch.

-Peff

Re: [PATCH] transport: add core.allowProtocol config option

On 11/02, Stefan Beller wrote:

> > This is useful to restrict recursive
> > +   submodule initialization from an untrusted repository.
> 
> ok. Though as a user submodules may not spring to mind immediately here.
> I think this is generally useful, too. e.g. an admin could put this in
> the system wide
> config to prevent certain protocols from being used.

Oh I pretty much copied the description from what exists for
`GIT_ALLOW_PROTOCOL` which included this bit about submodules.

> > Any protocol not
> > +   mentioned will be disallowed
> 
> For the regular fetch/clone/pull case. For the submodule case we still
> fall back to
> the hardcoded list of known good things?

Yep! This is done by explicitly setting GIT_ALLOW_PROTOCOL to the
hardcoded list if the user hasn't supplied a whitelist.

> 
> > (i.e., this is a whitelist, not a
> > +   blacklist).
> 
> That is very explicit, I'd drop it. However this inspires bike
> shedding on the name:
> What about core.protocolWhitelist instead?

Simply to keep the name similar to the env variable that already exists
for this functionality.

> So the env var is of higher priority than this config.

> > -: ${GIT_ALLOW_PROTOCOL=file:git:http:https:ssh}
> > +config_whitelist=$(git config core.allowProtocol)
> 
> So first we lookup the configured protocols.
> 
> > +: ${GIT_ALLOW_PROTOCOL=${config_whitelist:-file:git:http:https:ssh}}
> 
> Then if they are not configured use the current hard coded white list.

The lookup of the configured whitelist is done first but wont be used
unless GIT_ALLOW_PROTOCOL is unset.  If neither is set it will fallback
to the hardcoded list.

> Do we have any tests for this that could be extended? (Otherwise we'd
> maybe want to add a test for both the regular case as well as a forbidden
> submodule?)
> 

I can write a couple tests for a v2 of the patch.

-- 
Brandon Williams

Re: [PATCH] transport: add core.allowProtocol config option

On Wed, Nov 2, 2016 at 3:20 PM, Brandon Williams  wrote:
> Add configuration option 'core.allowProtocol' to allow users to create a
> whitelist of allowed protocols for fetch/push/clone in their gitconfig.
>
> For git-submodule.sh, fallback to default whitelist only if the user
> hasn't explicitly set `GIT_ALLOW_PROTOCOL` or doesn't have a whitelist
> in their gitconfig.
>
> Signed-off-by: Brandon Williams 
> ---
>  Documentation/config.txt | 9 +
>  git-submodule.sh | 3 ++-
>  transport.c  | 2 +-
>  3 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 27069ac..7f83e40 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -455,6 +455,15 @@ core.sshCommand::
> the `GIT_SSH_COMMAND` environment variable and is overridden
> when the environment variable is set.
>
> +core.allowProtocol::
> +   Provide a colon-separated list of protocols which are allowed to be
> +   used with fetch/push/clone.

ok.

> This is useful to restrict recursive
> +   submodule initialization from an untrusted repository.

ok. Though as a user submodules may not spring to mind immediately here.
I think this is generally useful, too. e.g. an admin could put this in
the system wide
config to prevent certain protocols from being used.

> Any protocol not
> +   mentioned will be disallowed

For the regular fetch/clone/pull case. For the submodule case we still
fall back to
the hardcoded list of known good things?

> (i.e., this is a whitelist, not a
> +   blacklist).

That is very explicit, I'd drop it. However this inspires bike
shedding on the name:
What about core.protocolWhitelist instead?

>  If the variable is not set at all, all protocols are
> +   enabled. If the `GIT_ALLOW_PROTOCOL` enviornment variable is set, it 
> is
> +   used as the protocol whitelist instead of this config option.

So the env var is of higher priority than this config.

> +
>  core.ignoreStat::
> If true, Git will avoid using lstat() calls to detect if files have
> changed by setting the "assume-unchanged" bit for those tracked files
> diff --git a/git-submodule.sh b/git-submodule.sh
> index a024a13..ad94c75 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -27,7 +27,8 @@ cd_to_toplevel
>  #
>  # If the user has already specified a set of allowed protocols,
>  # we assume they know what they're doing and use that instead.
> -: ${GIT_ALLOW_PROTOCOL=file:git:http:https:ssh}
> +config_whitelist=$(git config core.allowProtocol)

So first we lookup the configured protocols.

> +: ${GIT_ALLOW_PROTOCOL=${config_whitelist:-file:git:http:https:ssh}}

Then if they are not configured use the current hard coded white list.

>  export GIT_ALLOW_PROTOCOL
>
>  command=
> diff --git a/transport.c b/transport.c
> index d57e8de..b1098cd 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -652,7 +652,7 @@ static const struct string_list *protocol_whitelist(void)
>
> if (enabled < 0) {
> const char *v = getenv("GIT_ALLOW_PROTOCOL");
> -   if (v) {
> +   if (v || !git_config_get_value("core.allowProtocol", )) {

This implementation matches what the config promised, I would think.

Do we have any tests for this that could be extended? (Otherwise we'd
maybe want to add a test for both the regular case as well as a forbidden
submodule?)


> string_list_split(, v, ':', -1);
> string_list_sort();
> enabled = 1;
> --
> 2.8.0.rc3.226.g39d4020
>

Re: RFE: Discard hunks during `git add -p`

On Wed, Nov 02, 2016 at 06:11:14PM -0400, Jeff King wrote:

> > Being able to discard hunks (reset working copy to index contents) 
> > during add-p would alleviate the (quite broad) hard reset.
> 
> As Konstantin pointed out, you can already discard interactively with
> "git checkout -p". It might be nice to be able to do both in the same
> run, and turn the "yes/no" decision into "yes/no/discard".
> 
> In theory it should be easy, as the same code drives the hunk selector
> for both commands. It's just a matter of which command we feed the
> selected hunks to. I don't know if there would be corner cases around
> hunk-editing and splitting, though. The "add" phase should never touch
> the working tree file itself, so any hunks present from the initial list
> should still apply cleanly during the "discard" phase.

The patch is something like the one below, which worked for me in a very
trivial test. I won't be surprised if there are some corner cases it's
missing. At the very least, coalesce_overlapping_hunks() needs to learn
about the differences between "apply" and "discard" hunks (and not
coalesce them!).

I don't have immediate plans for this, so if somebody wants to pick it
up and run with it, be my guest.

-Peff

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index ee3d81269..43651435a 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -109,6 +109,7 @@ my %patch_modes = (
PARTICIPLE => 'staging',
FILTER => 'file-only',
IS_REVERSE => 0,
+   DISCARD => sub { apply_patch 'apply -R', @_; },
},
'stash' => {
DIFF => 'diff-index -p HEAD',
@@ -1325,6 +1326,11 @@ sub patch_update_file {
my ($prev, $next, $other, $undecided, $i);
$other = '';
 
+   my $discard = exists $patch_mode_flavour{DISCARD};
+   if ($discard) {
+   $other .= ',D';
+   }
+
if ($num <= $ix) {
$ix = 0;
}
@@ -1384,6 +1390,9 @@ sub patch_update_file {
elsif ($line =~ /^n/i) {
$hunk[$ix]{USE} = 0;
}
+   elsif ($discard && $line =~ /^D/) {
+   $hunk[$ix]{USE} = -1;
+   }
elsif ($line =~ /^a/i) {
while ($ix < $num) {
if (!defined $hunk[$ix]{USE}) {
@@ -1539,9 +1548,12 @@ sub patch_update_file {
 
my $n_lofs = 0;
my @result = ();
+   my @discard = ();
for (@hunk) {
-   if ($_->{USE}) {
+   if ($_->{USE} > 0) {
push @result, @{$_->{TEXT}};
+   } elsif ($_->{USE} < 0) {
+   push @discard, @{$_->{TEXT}};
}
}
 
@@ -1552,6 +1564,13 @@ sub patch_update_file {
refresh();
}
 
+   if (@discard) {
+   my @patch = reassemble_patch($head->{TEXT}, @discard);
+   my $apply_routine = $patch_mode_flavour{DISCARD};
+   &$apply_routine(@patch);
+   refresh();
+   }
+
print "\n";
return $quit;
 }

Re: send-email garbled header with trailing doublequote in email

2016-11-02 Thread Andrea Arcangeli

On Wed, Nov 02, 2016 at 06:04:37PM -0400, Jeff King wrote:
> Nope, it looks exactly as --dry-run reports it.

My sendmail is postfix 3.1.0.

> To see exactly what is being sent out, try:
> 
> -- >8 --
> 
> cat >/tmp/foo <<\EOF
> #!/bin/sh
> echo "args: $*"
> sed 's/^/stdin: /'
> EOF
> 
> chmod +x /tmp/foo
> 
> git send-email --smtp-server=/tmp/foo --to=whatever
> 
> -- 8< --

Right it's the same as --dry-run:

stdin: To: "what ever" " <.>

There's not my hostname and not removed space. If I add more addresses
they also go in the second line with a leading space and they're not cut.

So this must be postfix then that out of the blue decided to garble it
in a strange way while parsing the input... The removal of all
whitespaces s/what ever/whatever/ especially I've no idea how it
decided to do so.

Can you reproduce with postfix as sendmail at least? If you can
reproduce also see what happens if you add another --to.

[PATCH] transport: add core.allowProtocol config option

Add configuration option 'core.allowProtocol' to allow users to create a
whitelist of allowed protocols for fetch/push/clone in their gitconfig.

For git-submodule.sh, fallback to default whitelist only if the user
hasn't explicitly set `GIT_ALLOW_PROTOCOL` or doesn't have a whitelist
in their gitconfig.

Signed-off-by: Brandon Williams 
---
 Documentation/config.txt | 9 +
 git-submodule.sh | 3 ++-
 transport.c  | 2 +-
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 27069ac..7f83e40 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -455,6 +455,15 @@ core.sshCommand::
the `GIT_SSH_COMMAND` environment variable and is overridden
when the environment variable is set.
 
+core.allowProtocol::
+   Provide a colon-separated list of protocols which are allowed to be
+   used with fetch/push/clone. This is useful to restrict recursive
+   submodule initialization from an untrusted repository. Any protocol not
+   mentioned will be disallowed (i.e., this is a whitelist, not a
+   blacklist). If the variable is not set at all, all protocols are
+   enabled. If the `GIT_ALLOW_PROTOCOL` enviornment variable is set, it is
+   used as the protocol whitelist instead of this config option.
+
 core.ignoreStat::
If true, Git will avoid using lstat() calls to detect if files have
changed by setting the "assume-unchanged" bit for those tracked files
diff --git a/git-submodule.sh b/git-submodule.sh
index a024a13..ad94c75 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -27,7 +27,8 @@ cd_to_toplevel
 #
 # If the user has already specified a set of allowed protocols,
 # we assume they know what they're doing and use that instead.
-: ${GIT_ALLOW_PROTOCOL=file:git:http:https:ssh}
+config_whitelist=$(git config core.allowProtocol)
+: ${GIT_ALLOW_PROTOCOL=${config_whitelist:-file:git:http:https:ssh}}
 export GIT_ALLOW_PROTOCOL
 
 command=
diff --git a/transport.c b/transport.c
index d57e8de..b1098cd 100644
--- a/transport.c
+++ b/transport.c
@@ -652,7 +652,7 @@ static const struct string_list *protocol_whitelist(void)
 
if (enabled < 0) {
const char *v = getenv("GIT_ALLOW_PROTOCOL");
-   if (v) {
+   if (v || !git_config_get_value("core.allowProtocol", )) {
string_list_split(, v, ':', -1);
string_list_sort();
enabled = 1;
-- 
2.8.0.rc3.226.g39d4020

Re: RFE: Discard hunks during `git add -p`

On Wed, Nov 02, 2016 at 02:46:04PM +0100, Jan Engelhardt wrote:

> Current version: 2.10.2
> Example workflow:
> 
> * I would do a global substitution across a source tree, e.g. `perl -i 
>   -pe 's{OLD_FOO\(x\)}{NEW_BAR(x, 0)}' *.c`
> * Using `git add -p`, I would verify each of the substitutions that they 
>   make sense in their respective locations, and, based on that,
>   answer "y" or "n" to the interactive prompting to stage good hunks.
> * When done with add-p, I would commit the so-staged hunks,
>   and then use `git reset --hard` to discard all changes that were 
>   not acknowledged during add-p.
> 
> Being able to discard hunks (reset working copy to index contents) 
> during add-p would alleviate the (quite broad) hard reset.

As Konstantin pointed out, you can already discard interactively with
"git checkout -p". It might be nice to be able to do both in the same
run, and turn the "yes/no" decision into "yes/no/discard".

In theory it should be easy, as the same code drives the hunk selector
for both commands. It's just a matter of which command we feed the
selected hunks to. I don't know if there would be corner cases around
hunk-editing and splitting, though. The "add" phase should never touch
the working tree file itself, so any hunks present from the initial list
should still apply cleanly during the "discard" phase.

-Peff

Re: send-email garbled header with trailing doublequote in email

On Wed, Nov 02, 2016 at 10:38:05PM +0100, Andrea Arcangeli wrote:

> > But in either case, in my test, the actual email address is still
> > extracted correctly and fed to the MTA, so the mail is delivered.
> 
> The email is delivered to the right email for me too, but to see the
> problem you need to check the email itself, and look how the To: field
> looks in the actual email in your mail client or web interface.
> 
> Don't you see whatever@yourhostname without spaces and @yourhostname
> instead of the email domain?

Nope, it looks exactly as --dry-run reports it.

To see exactly what is being sent out, try:

-- >8 --

cat >/tmp/foo <<\EOF
#!/bin/sh
echo "args: $*"
sed 's/^/stdin: /'
EOF

chmod +x /tmp/foo

git send-email --smtp-server=/tmp/foo --to=whatever

-- 8< --

Mine shows the same header as --dry-run. Which makes sense, because the
code is literally just dumping the $header variable, which is the same
thing that gets sent to the sendmail binary (or the SMTP server if that
is in use).

So my guess would be that either an MTA in the routing path is garbling
it (or possibly mailing list software). or maybe even the eventual MUA,
though it sounds like you checked the raw bytes.

> If you still can't reproduce, maybe it's a different perl
> Mail::Address, I'm using dev-perl/MailTools-2.140.0

Mine is 2.13, which I would imagine is comparable.

> If --dry-run complained and failed instead of passing and then sending
> an email with garbled To/Cc list, it'd be more than enough. Either
> that or it should generate a mail header that matches the output of
> --dry-run so the review of --dry-run becomes meaningful again.

OK. I think we get that first part right. The problem is that the
garbling is such that somebody in the middle is unhappy with it (which
makes sense; it's syntactically bogus). So the tool is there to see the
problem in --dry-run, but of course it's rather subtle.

In theory we should be able to detect and complain about bogus syntax
like this, but right now we don't re-parse the addresses at all. We rely
on Mail::Address to produce valid output, and it doesn't seem to be
doing so here.

-Peff

Re: [ANNOUNCE] Git v2.11.0-rc0

2016-11-02 Thread Michael Haggerty

On 11/01/2016 10:38 PM, Junio C Hamano wrote:
> Junio C Hamano  writes:
> 
>> If it involves renaming and swapping, however, we may be talking
>> about a change that we cannot deliver with confidence in less than a
>> week before -rc1, which sounds, eh, suboptimal.
>>
>> FWIW, here is how the removal of compaction without renaming looks
>> like.
> 
> And here is _with_ renaming.  I think "compaction heuristics" is a
> much better phrase to call "heuristics used during the diff hunk
> compaction process" without specifying how that heuristics work
> (like taking hints in the indentation levels).  If we are to retire
> one while keeping the other, compaction-heuristics should be the
> name we give and keep for the surviving one.

Personally, I'm not a fan of the name "compaction heuristics".

The name *seems* to make sense because it affects the behavior of a
function called `xdl_change_compact()`. But that means nothing to end
users. The option doesn't affect how the diff is "compacted" in the
usual sense of the word; the slider optimization usually doesn't change
the number of lines in the diff at all [1].

Moreover, if we ever want to add another heuristic (for example, imagine
a language-aware algorithm that is based on parsing the file), we would
want a different name for that option, at least temporarily. From that
point of view, it makes a little bit of sense for the name of the option
to hint at the particular heuristic being used.

That being said, I don't think it is a big deal either way, and I can
live with either name. I rather hope that this option will become the
default soon, and hopefully after that hardly anybody will need to learn
its name.

Regarding making it the default: given that it is presumably too late in
this release cycle to make this option the default behavior, I suggest
that it be made the default early in the next release cycle so that it
gets a lot of testing, and people have enough time to voice any objections.

Regarding your patches themselves, once the old compaction heuristic is
removed (with or without renaming), you can also get rid of the
`blank_lines` local variable in `xdl_change_compact()`, and also the
function `is_blank_line()` in the same source file.

Michael

[1] The only exceptions are when it causes context lines for adjacent
diff hunks to join/split, or makes the context lines for a diff hunk
extend past the beginning/end of the file. But these are uninteresting
side-effects; it doesn't *try* to do or avoid either of these things.

Re: send-email garbled header with trailing doublequote in email

2016-11-02 Thread Andrea Arcangeli

Hello,

On Wed, Nov 02, 2016 at 05:11:18PM -0400, Jeff King wrote:
> In fact, it is not even git that does this, but rather what
> Mail::Address happens to output (though git has fallback routines if
> that module isn't available that do the same thing).

If it's something in perl it should be fixed there indeed. I didn't
debug what exactly was wrong, so I sent it here first.

> But in either case, in my test, the actual email address is still
> extracted correctly and fed to the MTA, so the mail is delivered.

The email is delivered to the right email for me too, but to see the
problem you need to check the email itself, and look how the To: field
looks in the actual email in your mail client or web interface.

Don't you see whatever@yourhostname without spaces and @yourhostname
instead of the email domain?

If you still can't reproduce, maybe it's a different perl
Mail::Address, I'm using dev-perl/MailTools-2.140.0

>From who receives the email it just looks garbled, but --dry-run
showed a correct To/Cc list so it was undetectable that it would end
up garbled during the real email delivery.

> So I'm not sure what you wanted to happen that didn't. Did you want
> --dry-run to complain about the bogus address? Did the message not get
> delivered for you? Something else?

If --dry-run complained and failed instead of passing and then sending
an email with garbled To/Cc list, it'd be more than enough. Either
that or it should generate a mail header that matches the output of
--dry-run so the review of --dry-run becomes meaningful again.

Thanks,
Andrea

Re: send-email garbled header with trailing doublequote in email

On Wed, Nov 02, 2016 at 09:27:09PM +0100, Andrea Arcangeli wrote:

> send-email gets confused with one trailing " at the end of the
> email. Details and how to reproduce below, it breaks also with
> upstream git version 2.10.2.dirty.

I'm not quite sure what happened, and what you wanted to happen. In your
example:

> FYI: apparently I hit a git bug in this submit... reproducible with
> the below command:
> 
> git send-email -1 --to '"what ever" <--you...@--email--.com>"'

The "to" address is slightly bogus here because of the extra
double-quote. That gets turned into a slightly bogus rfc2822 header:

> *snip*
> Dry-OK. Log says:
> To: "what ever" " <--you...@--email--.com>

Which is funny, but matches what we do with other addresses that are
invalid according to the rfc. E.g., there was a discussion recently on:

  Stable  [4.8+]

which has historically been converted to:

  "Stable [4.8+]" 

In fact, it is not even git that does this, but rather what
Mail::Address happens to output (though git has fallback routines if
that module isn't available that do the same thing).

But in either case, in my test, the actual email address is still
extracted correctly and fed to the MTA, so the mail is delivered.

So I'm not sure what you wanted to happen that didn't. Did you want
--dry-run to complain about the bogus address? Did the message not get
delivered for you? Something else?

-Peff

[ANNOUNCE] Git for Windows 2.10.2

2016-11-02 Thread Johannes Schindelin

Dear Git users,

It is my pleasure to announce that Git for Windows 2.10.2 is available from:

https://git-for-windows.github.io/

Changes since Git for Windows v2.10.1(2) (October 13th 2016)

Git for windows v2.10.1(2) was a MinGit-only release (i.e. there was no
Git for windows installer for that version).

New Features

• Comes with Git v2.10.2.
• Comes with Git Credential Manager v1.8.1.
• Comes with cURL v7.51.0.
• Git for Windows can now be built easily with Visual C++ 2015.
• The installer now logs post-install errors more verbosely.
• A new option asks the installer to skip installation if Git's files
are in use.
• A new option asks the installer to quietly skip downgrading Git for
Windows, without indicating failure.
• There is now an explicit option for symbolic link support,
including a link to a more verbose explanation of the issue.

Bug Fixes

• when upgrading Git for Windows, SSH agent processes are now
auto-terminated.
• When trying to install/upgrade on a Windows version that is no
longer supported, we now refuse to do so.

Filename | SHA-256
| ---
Git-2.10.2-64-bit.exe |
57238ebf86f8b51e32cab44bb91c8060e04774676b77b9fb92f78e7bc7e9a179
Git-2.10.2-32-bit.exe |
8b15054a4ea2dd5d2be0471a430d8ae7c772b94e1a1048221083a0040011011c
PortableGit-2.10.2-64-bit.7z.exe |
101314826892480043d5b11989726fc8ee448991eb7b0a1c61aca751161bc15b
PortableGit-2.10.2-32-bit.7z.exe |
edc616817e96a6f15246bb0dd93c97e53d38d3b2a0b7375f26bd0bd082c41a73
MinGit-2.10.2-64-bit.zip |
a10de5d5a8e71e207eff20d833ca56902a0668940e3def5f21d089e4f533ea40
MinGit-2.10.2-32-bit.zip |
2b191598bcb37565a2b80729ef8d00c03df02b75f6b9d012080c458999f89cc0
Git-2.10.2-64-bit.tar.bz2 |
7df449ac1813876b5a2e9215d94bca9458c2e0870c65e5b78bd7fc2a448a2a90
Git-2.10.2-32-bit.tar.bz2 |
3f8d0bebc53fabad863b8fe352a317fde61833efa4750f96656cf95017a621e8

Ciao,
Johannes

--
You received this message because you are subscribed to the Google Groups
"git-for-windows" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to git-for-windows+unsubscr...@googlegroups.com.
To post to this group, send email to git-for-wind...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/git-for-windows/20161102204358.6856-1-johannes.schindelin%40gmx.de.
For more options, visit https://groups.google.com/d/optout.

Re: [PATCH 4/4] t0021: fix filehandle usage on older perl

2016-11-02 Thread Torsten Bögershausen


> +use IO::File;

That did the trick (the J6T version work as well)

Thanks for the fast responses.

send-email garbled header with trailing doublequote in email

2016-11-02 Thread Andrea Arcangeli

Hello,

send-email gets confused with one trailing " at the end of the
email. Details and how to reproduce below, it breaks also with
upstream git version 2.10.2.dirty.

Feel free to CC me if you need any further info.

Thanks,
Andrea

- Forwarded message from Andrea Arcangeli  -

Date: Wed, 2 Nov 2016 21:07:02 +0100
From: Andrea Arcangeli 
To: linux...@kvack.org
Subject: Re: [PATCH 00/33] userfaultfd tmpfs/hugetlbfs/non-cooperative
User-Agent: Mutt/1.7.1 (2016-10-04)

FYI: apparently I hit a git bug in this submit... reproducible with
the below command:

git send-email -1 --to '"what ever" <--you...@--email--.com>"'

after replacing --you...@email--.com with your own email.

*snip*
Dry-OK. Log says:
To: "what ever" " <--you...@--email--.com>
*snip*
X-Mailer: git-send-email 2.7.3

Result: OK

It's not ok if the --dry-run outputs the above with a fine header, but
the actual header in the email data is different. Of course I tested
--dry-run twice and it was fine like the above is fine as well.

*snip*

- End forwarded message -

Re: [PATCH 0/5] disallow symlinks for .gitignore and .gitattributes

On Wed, Nov 02, 2016 at 12:46:28PM -0700, Stefan Beller wrote:

> On Wed, Nov 2, 2016 at 6:04 AM, Jeff King  wrote:
> 
> >
> >  attr.c| 58 
> > ---
> 
> $ git diff --stat origin/master..origin/sb/attr  |grep attr.c
>  attr.c |  531 +-
> 
> From a cursory read of your series this may result in a merge
> conflict, but would be
> easily fixable (changed signature of functions that clash).

Yeah, I knew you guys were doing some refactoring of the attribute code
elsewhere, but hadn't actually seen how bad the damage was. I just did
the merge with sb/attr, though, and the conflicts are quite trivial
(mostly s/1/flags/ in bootstrap_attr_stack()).

I'm happy to re-roll on a different base if sb/attr graduates first, but
I suspect Junio can just resolve the conflicts at merge time.

-Peff

Re: [PATCH 0/5] disallow symlinks for .gitignore and .gitattributes

On Wed, Nov 2, 2016 at 6:04 AM, Jeff King  wrote:

>
>  attr.c| 58 
> ---

$ git diff --stat origin/master..origin/sb/attr  |grep attr.c
 attr.c |  531 +-

>From a cursory read of your series this may result in a merge
conflict, but would be
easily fixable (changed signature of functions that clash).

[PATCH 4/4] t0021: fix filehandle usage on older perl

The rot13-filter.pl script calls methods on implicitly
defined filehandles (STDOUT, and the result of an open()
call).  Prior to perl 5.13, these methods are not
automatically loaded, and perl will complain with:

  Can't locate object method "flush" via package "IO::Handle"

Let's explicitly load IO::File (which inherits from
IO::Handle). That's more than we need for just "flush", but
matches what perl has done since:

  
http://perl5.git.perl.org/perl.git/commit/15e6cdd91beb4cefae4b65e855d68cf64766965d

Signed-off-by: Jeff King 
---
I see J6t solved this a week ago using "FileHandle". These days that is
basically a compatibility synonym for IO::File. I think both should be
available pretty much everywhere, so I went with IO::File for the
reasons above. But if that doesn't work for some reason, switching to
"use FileHandle" should be OK, too.

I don't have an old enough perl easily available to test it either way.

 t/t0021/rot13-filter.pl | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index e3ea58e1e..4d5697ee5 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -21,6 +21,7 @@
 
 use strict;
 use warnings;
+use IO::File;
 
 my $MAX_PACKET_CONTENT_SIZE = 65516;
 my @capabilities= @ARGV;
-- 
2.11.0.rc0.258.gf434c15

[PATCH 3/4] t0021: use $PERL_PATH for rot13-filter.pl

The rot13-filter.pl script hardcodes "#!/usr/bin/perl", and
does not respect $PERL_PATH at all. That is a problem if the
system does not have perl at that path, or if it has a perl
that is too old to run a complicated script like the
rot13-filter (but PERL_PATH points to a more modern one).

We can fix this by using write_script() to create a new copy
of the script with the correct #!-line. In theory we could
move the whole script inside t0021-conversion.sh rather than
having it as an auxiliary file, but it's long enough that
it just makes things harder to read.

As a bonus, we can stop using the full path to the script in
the filter-process config we add (because the trash
directory is in our PATH). Not only is this shorter, but it
sidesteps any shell-quoting issues. The original was broken
when $TEST_DIRECTORY contained a space, because it was
interpolated in the outer script.

Signed-off-by: Jeff King 
---
 t/t0021-conversion.sh   | 19 +++
 t/t0021/rot13-filter.pl |  1 -
 2 files changed, 11 insertions(+), 9 deletions(-)
 mode change 100755 => 100644 t/t0021/rot13-filter.pl

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index c1ad20c61..a8fa52148 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -13,6 +13,9 @@ tr \
   'nopqrstuvwxyzabcdefghijklmNOPQRSTUVWXYZABCDEFGHIJKLM'
 EOF
 
+write_script rot13-filter.pl "$PERL_PATH" \
+   <"$TEST_DIRECTORY"/t0021/rot13-filter.pl
+
 generate_random_characters () {
LEN=$1
NAME=$2
@@ -341,7 +344,7 @@ test_expect_success 'diff does not reuse worktree files 
that need cleaning' '
 '
 
 test_expect_success PERL 'required process filter should filter data' '
-   test_config_global filter.protocol.process 
"$TEST_DIRECTORY/t0021/rot13-filter.pl clean smudge" &&
+   test_config_global filter.protocol.process "rot13-filter.pl clean 
smudge" &&
test_config_global filter.protocol.required true &&
rm -rf repo &&
mkdir repo &&
@@ -434,7 +437,7 @@ test_expect_success PERL 'required process filter should 
filter data' '
 
 test_expect_success PERL 'required process filter takes precedence' '
test_config_global filter.protocol.clean false &&
-   test_config_global filter.protocol.process 
"$TEST_DIRECTORY/t0021/rot13-filter.pl clean" &&
+   test_config_global filter.protocol.process "rot13-filter.pl clean" &&
test_config_global filter.protocol.required true &&
rm -rf repo &&
mkdir repo &&
@@ -459,7 +462,7 @@ test_expect_success PERL 'required process filter takes 
precedence' '
 '
 
 test_expect_success PERL 'required process filter should be used only for 
"clean" operation only' '
-   test_config_global filter.protocol.process 
"$TEST_DIRECTORY/t0021/rot13-filter.pl clean" &&
+   test_config_global filter.protocol.process "rot13-filter.pl clean" &&
rm -rf repo &&
mkdir repo &&
(
@@ -494,7 +497,7 @@ test_expect_success PERL 'required process filter should be 
used only for "clean
 '
 
 test_expect_success PERL 'required process filter should process multiple 
packets' '
-   test_config_global filter.protocol.process 
"$TEST_DIRECTORY/t0021/rot13-filter.pl clean smudge" &&
+   test_config_global filter.protocol.process "rot13-filter.pl clean 
smudge" &&
test_config_global filter.protocol.required true &&
 
rm -rf repo &&
@@ -554,7 +557,7 @@ test_expect_success PERL 'required process filter should 
process multiple packet
 '
 
 test_expect_success PERL 'required process filter with clean error should 
fail' '
-   test_config_global filter.protocol.process 
"$TEST_DIRECTORY/t0021/rot13-filter.pl clean smudge" &&
+   test_config_global filter.protocol.process "rot13-filter.pl clean 
smudge" &&
test_config_global filter.protocol.required true &&
rm -rf repo &&
mkdir repo &&
@@ -573,7 +576,7 @@ test_expect_success PERL 'required process filter with 
clean error should fail'
 '
 
 test_expect_success PERL 'process filter should restart after unexpected write 
failure' '
-   test_config_global filter.protocol.process 
"$TEST_DIRECTORY/t0021/rot13-filter.pl clean smudge" &&
+   test_config_global filter.protocol.process "rot13-filter.pl clean 
smudge" &&
rm -rf repo &&
mkdir repo &&
(
@@ -624,7 +627,7 @@ test_expect_success PERL 'process filter should restart 
after unexpected write f
 '
 
 test_expect_success PERL 'process filter should not be restarted if it signals 
an error' '
-   test_config_global filter.protocol.process 
"$TEST_DIRECTORY/t0021/rot13-filter.pl clean smudge" &&
+   test_config_global filter.protocol.process "rot13-filter.pl clean 
smudge" &&
rm -rf repo &&
mkdir repo &&
(
@@ -663,7 +666,7 @@ test_expect_success PERL 'process filter should not be 
restarted if it signals a
 '
 
 test_expect_success PERL 'process filter abort stops processing of all further 
files' '
-

[PATCH 2/4] t0021: put $TEST_ROOT in $PATH

We create a rot13.sh script in the trash directory, but need
to call it by its full path when we have moved our cwd to
another directory. Let's just put $TEST_ROOT in our $PATH so
that the script is always found.

This is a minor convenience for rot13.sh, but will be a
major one when we switch rot13-filter.pl to a script in the
same directory, as it means we will not have to deal with
shell quoting inside the filter-process config.

Signed-off-by: Jeff King 
---
 t/t0021-conversion.sh | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index dfde22549..c1ad20c61 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -5,6 +5,7 @@ test_description='blob conversion via gitattributes'
 . ./test-lib.sh
 
 TEST_ROOT="$(pwd)"
+PATH=$TEST_ROOT:$PATH
 
 write_script <<\EOF "$TEST_ROOT/rot13.sh"
 tr \
@@ -64,7 +65,7 @@ test_cmp_exclude_clean () {
 # is equal to the committed content.
 test_cmp_committed_rot13 () {
test_cmp "$1" "$2" &&
-   "$TEST_ROOT/rot13.sh" <"$1" >expected &&
+   rot13.sh <"$1" >expected &&
git cat-file blob :"$2" >actual &&
test_cmp expected actual
 }
@@ -513,7 +514,7 @@ test_expect_success PERL 'required process filter should 
process multiple packet
for FILE in "$TEST_ROOT"/*.file
do
cp "$FILE" . &&
-   "$TEST_ROOT/rot13.sh" <"$FILE" >"$FILE.rot13"
+   rot13.sh <"$FILE" >"$FILE.rot13"
done &&
 
echo "*.file filter=protocol" >.gitattributes &&
@@ -616,7 +617,7 @@ test_expect_success PERL 'process filter should restart 
after unexpected write f
 
# Smudge failed
! test_cmp smudge-write-fail.o smudge-write-fail.r &&
-   "$TEST_ROOT/rot13.sh" expected &&
+   rot13.sh expected &&
git cat-file blob :smudge-write-fail.r >actual &&
test_cmp expected actual
)
-- 
2.11.0.rc0.258.gf434c15

[PATCH 1/4] t0021: use write_script to create rot13 shell script

This avoids us fooling around with $SHELL_PATH and the
executable bit ourselves.

Signed-off-by: Jeff King 
---
 t/t0021-conversion.sh | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index a20b9f58e..dfde22549 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -6,13 +6,11 @@ test_description='blob conversion via gitattributes'
 
 TEST_ROOT="$(pwd)"
 
-cat <"$TEST_ROOT/rot13.sh"
-#!$SHELL_PATH
+write_script <<\EOF "$TEST_ROOT/rot13.sh"
 tr \
   'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ' \
   'nopqrstuvwxyzabcdefghijklmNOPQRSTUVWXYZABCDEFGHIJKLM'
 EOF
-chmod +x "$TEST_ROOT/rot13.sh"
 
 generate_random_characters () {
LEN=$1
-- 
2.11.0.rc0.258.gf434c15

[PATCH 0/4] t0021 perl portability fixups

This series should fix the portability problem you saw (in the final
patch), and fixes the failure to use PERL_PATH along the way.

  [1/4]: t0021: use write_script to create rot13 shell script
  [2/4]: t0021: put $TEST_ROOT in $PATH
  [3/4]: t0021: use $PERL_PATH for rot13-filter.pl
  [4/4]: t0021: fix filehandle usage on older perl

 t/t0021-conversion.sh   | 30 --
 t/t0021/rot13-filter.pl |  2 +-
 2 files changed, 17 insertions(+), 15 deletions(-)
 mode change 100755 => 100644 t/t0021/rot13-filter.pl

Re: [PATCH v11 13/14] convert: add filter..process option

2016-11-02 Thread Johannes Sixt

Am 17.10.2016 um 01:20 schrieb larsxschnei...@gmail.com:
> +# Compare two files and ensure that `clean` and `smudge` respectively are
> +# called at least once if specified in the `expect` file. The actual
> +# invocation count is not relevant because their number can vary.
> +# c.f. 
> http://public-inbox.org/git/xmqqshv18i8i@gitster.mtv.corp.google.com/
> +test_cmp_count () {
> + expect=$1
> + actual=$2
> + for FILE in "$expect" "$actual"
> + do
> + sort "$FILE" | uniq -c | sed "s/^[ ]*//" |
> + sed "s/^\([0-9]\) IN: clean/x IN: clean/" |
> + sed "s/^\([0-9]\) IN: smudge/x IN: smudge/" 
> >"$FILE.tmp" &&

This is not sufficiently portable. Some versions of uniq write the
count left-adjusted, not right-adjusted. How about this on top:

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index a20b9f58e3..f60858c517 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -40,10 +40,9 @@ test_cmp_count () {
actual=$2
for FILE in "$expect" "$actual"
do
-   sort "$FILE" | uniq -c | sed "s/^[ ]*//" |
-   sed "s/^\([0-9]\) IN: clean/x IN: clean/" |
-   sed "s/^\([0-9]\) IN: smudge/x IN: smudge/" 
>"$FILE.tmp" &&
-   mv "$FILE.tmp" "$FILE"
+   sort "$FILE" | uniq -c |
+   sed -e "s/^ *[0-9][0-9]* *IN: /x IN: /" >"$FILE.tmp" &&
+   mv "$FILE.tmp" "$FILE" || return
done &&
test_cmp "$expect" "$actual"
 }

-- Hannes

Re: What's cooking in git.git (Oct 2016, #09; Mon, 31)

2016-11-02 Thread Lars Schneider

On 2 Nov 2016, at 17:04, Torsten Bögershausen  wrote:

>> * ls/filter-process (2016-10-17) 14 commits
>>  (merged to 'next' on 2016-10-19 at ffd0de042c)
> 
> Some (late, as I recently got a new battery for the Mac OS 10.6 test system) 
> comments:
> t0021 failes here:
> 
> 
> Can't locate object method "flush" via package "IO::Handle" at 
> /Users/tb/projects/git/git.next/t/t0021/rot13-filter.pl line 90.
> fatal: The remote end hung up unexpectedly
> 
> 
> perl itself is 5.10 and we use the one shipped with Mac OS.
> Why that ?

Thanks for reporting! It surprises me that flush is not available. I don't have 
a 10.6 system to tests. Where you able to reproduce the problem on a newer 
system with an older Perl, too?

@Martin: do you have a clue?

I can't debug the issue right now (only on mobile) but I will look into it on 
Friday!

> t0021 uses the hard-coded path:
> t0021/rot13-filter.pl (around line 345) and the nice macro
> PERL_PATH from the Makefile is fully t

Can you check the line number? Rot13-filter.pl has only 192 lines. I'll look 
into the PERL_PATH. 

> Commenting out the different "flush" makes the test hang, and I haven't 
> digged further.

That would be expected because the pl file writes output to a file which is 
checked by the calling sh script.

Thanks,
Lars

Re: What's cooking in git.git (Oct 2016, #09; Mon, 31)

2016-11-02 Thread Johannes Sixt

Am 02.11.2016 um 18:04 schrieb Torsten Bögershausen:
>> * ls/filter-process (2016-10-17) 14 commits
>>   (merged to 'next' on 2016-10-19 at ffd0de042c)
> 
> Some (late, as I recently got a new battery for the Mac OS 10.6 test system) 
> comments:
> t0021 failes here:
> 
> 
> Can't locate object method "flush" via package "IO::Handle" at 
> /Users/tb/projects/git/git.next/t/t0021/rot13-filter.pl line 90.
> fatal: The remote end hung up unexpectedly
> 
> 
> perl itself is 5.10 and we use the one shipped with Mac OS.
> Why that ?
> t0021 uses the hard-coded path:
> t0021/rot13-filter.pl (around line 345) and the nice macro
> PERL_PATH from the Makefile is fully ignored.
> 
> Commenting out the different "flush" makes the test hang, and I haven't 
> digged further.
> 

https://public-inbox.org/git/e8deda5f-11a6-1463-4fc5-25454084c...@kdbg.org/

-- Hannes

Re: What's cooking in git.git (Oct 2016, #09; Mon, 31)

On Wed, Nov 02, 2016 at 05:04:15PM +, Torsten Bögershausen wrote:

> > * ls/filter-process (2016-10-17) 14 commits
> >   (merged to 'next' on 2016-10-19 at ffd0de042c)
> 
> Some (late, as I recently got a new battery for the Mac OS 10.6 test system) 
> comments:
> t0021 failes here:
> 
> 
> Can't locate object method "flush" via package "IO::Handle" at 
> /Users/tb/projects/git/git.next/t/t0021/rot13-filter.pl line 90.
> fatal: The remote end hung up unexpectedly
> 
> 
> perl itself is 5.10 and we use the one shipped with Mac OS.

Wow, haven't seen that bug in a while[1]. The problem is that STDIN is a
filehandle object, but older versions of perl do not automatically load
IO::Handle to get all of the methods. This was fixed in perl 5.13.

We can work around it with:

  use IO::Handle;

at the top of the script. That should work everywhere, as IO::Handle has
been part of the core system for ages. But another option would be to
just turn on autoflush, with:

  $| = 1;

at the top of the script (though it looks like we flush $debug, too, so
we'd probably need to "select $debug; $| = 1" there, too). The "use"
command is preferable IMHO.

> Why that ?
> t0021 uses the hard-coded path:
> t0021/rot13-filter.pl (around line 345) and the nice macro
> PERL_PATH from the Makefile is fully ignored.

Yeah, we should be using PERL_PATH. Doing so inside the filter config
value is probably a pain due to shell quoting issues. But we could
use write_script() to get a local copy.

I'll see if I can work up a patch.

-Peff

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=261953

[PATCH v3 2/5] commit: make ignore_non_trailer take buf/len

Make ignore_non_trailer take a buf/len pair instead of struct strbuf.

Signed-off-by: Jonathan Tan 
---
 builtin/commit.c |  2 +-
 commit.c | 22 +++---
 commit.h |  2 +-
 trailer.c|  2 +-
 4 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 8976c3d..887ccc7 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -790,7 +790,7 @@ static int prepare_to_commit(const char *index_file, const 
char *prefix,
strbuf_stripspace(, 0);
 
if (signoff)
-   append_signoff(, ignore_non_trailer(), 0);
+   append_signoff(, ignore_non_trailer(sb.buf, sb.len), 0);
 
if (fwrite(sb.buf, 1, sb.len, s->fp) < sb.len)
die_errno(_("could not write commit template"));
diff --git a/commit.c b/commit.c
index 856fd4a..2cf8515 100644
--- a/commit.c
+++ b/commit.c
@@ -1649,7 +1649,7 @@ const char *find_commit_header(const char *msg, const 
char *key, size_t *out_len
 }
 
 /*
- * Inspect sb and determine the true "end" of the log message, in
+ * Inspect the given string and determine the true "end" of the log message, in
  * order to find where to put a new Signed-off-by: line.  Ignored are
  * trailing comment lines and blank lines, and also the traditional
  * "Conflicts:" block that is not commented out, so that we can use
@@ -1659,37 +1659,37 @@ const char *find_commit_header(const char *msg, const 
char *key, size_t *out_len
  * Returns the number of bytes from the tail to ignore, to be fed as
  * the second parameter to append_signoff().
  */
-int ignore_non_trailer(struct strbuf *sb)
+int ignore_non_trailer(const char *buf, size_t len)
 {
int boc = 0;
int bol = 0;
int in_old_conflicts_block = 0;
 
-   while (bol < sb->len) {
-   char *next_line;
+   while (bol < len) {
+   const char *next_line = memchr(buf + bol, '\n', len - bol);
 
-   if (!(next_line = memchr(sb->buf + bol, '\n', sb->len - bol)))
-   next_line = sb->buf + sb->len;
+   if (!next_line)
+   next_line = buf + len;
else
next_line++;
 
-   if (sb->buf[bol] == comment_line_char || sb->buf[bol] == '\n') {
+   if (buf[bol] == comment_line_char || buf[bol] == '\n') {
/* is this the first of the run of comments? */
if (!boc)
boc = bol;
/* otherwise, it is just continuing */
-   } else if (starts_with(sb->buf + bol, "Conflicts:\n")) {
+   } else if (starts_with(buf + bol, "Conflicts:\n")) {
in_old_conflicts_block = 1;
if (!boc)
boc = bol;
-   } else if (in_old_conflicts_block && sb->buf[bol] == '\t') {
+   } else if (in_old_conflicts_block && buf[bol] == '\t') {
; /* a pathname in the conflicts block */
} else if (boc) {
/* the previous was not trailing comment */
boc = 0;
in_old_conflicts_block = 0;
}
-   bol = next_line - sb->buf;
+   bol = next_line - buf;
}
-   return boc ? sb->len - boc : 0;
+   return boc ? len - boc : 0;
 }
diff --git a/commit.h b/commit.h
index afd14f3..9c12abb 100644
--- a/commit.h
+++ b/commit.h
@@ -355,7 +355,7 @@ extern const char *find_commit_header(const char *msg, 
const char *key,
  size_t *out_len);
 
 /* Find the end of the log message, the right place for a new trailer. */
-extern int ignore_non_trailer(struct strbuf *sb);
+extern int ignore_non_trailer(const char *buf, size_t len);
 
 typedef void (*each_mergetag_fn)(struct commit *commit, struct 
commit_extra_header *extra,
 void *cb_data);
diff --git a/trailer.c b/trailer.c
index dc525e3..9d7765e 100644
--- a/trailer.c
+++ b/trailer.c
@@ -840,7 +840,7 @@ static int find_trailer_end(struct strbuf **lines, int 
patch_start)
 
for (i = 0; i < patch_start; i++)
strbuf_addbuf(, lines[i]);
-   ignore_bytes = ignore_non_trailer();
+   ignore_bytes = ignore_non_trailer(sb.buf, sb.len);
strbuf_release();
for (i = patch_start - 1; i >= 0 && ignore_bytes > 0; i--)
ignore_bytes -= lines[i]->len;
-- 
2.8.0.rc3.226.g39d4020

[PATCH v3 3/5] trailer: avoid unnecessary splitting on lines

trailer.c currently splits lines while processing a buffer (and also
rejoins lines when needing to invoke ignore_non_trailer).

Avoid such line splitting, except when generating the strings
corresponding to trailers (for ease of use by clients - a subsequent
patch will allow other components to obtain the layout of a trailer
block in a buffer, including the trailers themselves). The main purpose
of this is to make it easy to return pointers into the original buffer
(for a subsequent patch), but this also significantly reduces the number
of memory allocations required.

Signed-off-by: Jonathan Tan 
---
 trailer.c | 194 --
 1 file changed, 100 insertions(+), 94 deletions(-)

diff --git a/trailer.c b/trailer.c
index 9d7765e..afbff4b 100644
--- a/trailer.c
+++ b/trailer.c
@@ -102,12 +102,12 @@ static int same_trailer(struct trailer_item *a, struct 
arg_item *b)
return same_token(a, b) && same_value(a, b);
 }
 
-static inline int contains_only_spaces(const char *str)
+static inline int is_blank_line(const char *str)
 {
const char *s = str;
-   while (*s && isspace(*s))
+   while (*s && *s != '\n' && isspace(*s))
s++;
-   return !*s;
+   return !*s || *s == '\n';
 }
 
 static inline void strbuf_replace(struct strbuf *sb, const char *a, const char 
*b)
@@ -700,51 +700,71 @@ static void process_command_line_args(struct list_head 
*arg_head,
free(cl_separators);
 }
 
-static struct strbuf **read_input_file(const char *file)
+static void read_input_file(struct strbuf *sb, const char *file)
 {
-   struct strbuf **lines;
-   struct strbuf sb = STRBUF_INIT;
-
if (file) {
-   if (strbuf_read_file(, file, 0) < 0)
+   if (strbuf_read_file(sb, file, 0) < 0)
die_errno(_("could not read input file '%s'"), file);
} else {
-   if (strbuf_read(, fileno(stdin), 0) < 0)
+   if (strbuf_read(sb, fileno(stdin), 0) < 0)
die_errno(_("could not read from stdin"));
}
+}
 
-   lines = strbuf_split(, '\n');
+static const char *next_line(const char *str)
+{
+   const char *nl = strchrnul(str, '\n');
+   return nl + !!*nl;
+}
 
-   strbuf_release();
+/*
+ * Return the position of the start of the last line. If len is 0, return -1.
+ */
+static int last_line(const char *buf, size_t len)
+{
+   int i;
+   if (len == 0)
+   return -1;
+   if (len == 1)
+   return 0;
+   /*
+* Skip the last character (in addition to the null terminator),
+* because if the last character is a newline, it is considered as part
+* of the last line anyway.
+*/
+   i = len - 2;
 
-   return lines;
+   for (; i >= 0; i--) {
+   if (buf[i] == '\n')
+   return i + 1;
+   }
+   return 0;
 }
 
 /*
- * Return the (0 based) index of the start of the patch or the line
- * count if there is no patch in the message.
+ * Return the position of the start of the patch or the length of str if there
+ * is no patch in the message.
  */
-static int find_patch_start(struct strbuf **lines, int count)
+static int find_patch_start(const char *str)
 {
-   int i;
+   const char *s;
 
-   /* Get the start of the patch part if any */
-   for (i = 0; i < count; i++) {
-   if (starts_with(lines[i]->buf, "---"))
-   return i;
+   for (s = str; *s; s = next_line(s)) {
+   if (starts_with(s, "---"))
+   return s - str;
}
 
-   return count;
+   return s - str;
 }
 
 /*
- * Return the (0 based) index of the first trailer line or count if
- * there are no trailers. Trailers are searched only in the lines from
- * index (count - 1) down to index 0.
+ * Return the position of the first trailer line or len if there are no
+ * trailers.
  */
-static int find_trailer_start(struct strbuf **lines, int count)
+static int find_trailer_start(const char *buf, size_t len)
 {
-   int start, end_of_title, only_spaces = 1;
+   const char *s;
+   int end_of_title, l, only_spaces = 1;
int recognized_prefix = 0, trailer_lines = 0, non_trailer_lines = 0;
/*
 * Number of possible continuation lines encountered. This will be
@@ -756,13 +776,13 @@ static int find_trailer_start(struct strbuf **lines, int 
count)
int possible_continuation_lines = 0;
 
/* The first paragraph is the title and cannot be trailers */
-   for (start = 0; start < count; start++) {
-   if (lines[start]->buf[0] == comment_line_char)
+   for (s = buf; s < buf + len; s = next_line(s)) {
+   if (s[0] == comment_line_char)
continue;
-   if (contains_only_spaces(lines[start]->buf))
+   if (is_blank_line(s))

[PATCH v3 5/5] sequencer: use trailer's trailer layout

Make sequencer use trailer.c's trailer layout definition, as opposed to
parsing the footer by itself. This makes "commit -s", "cherry-pick -x",
and "format-patch --signoff" consistent with trailer, allowing
non-trailer lines and multiple-line trailers in trailer blocks under
certain conditions, and therefore suppressing the extra newline in those
cases.

Consistency with trailer extends to respecting trailer configs.  Tests
have been included to show that.

Signed-off-by: Jonathan Tan 
---
 sequencer.c  | 75 +---
 t/t3511-cherry-pick-x.sh | 16 +--
 t/t4014-format-patch.sh  | 37 
 t/t7501-commit.sh| 36 +++
 4 files changed, 95 insertions(+), 69 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 5fd75f3..d64c973 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -16,6 +16,7 @@
 #include "refs.h"
 #include "argv-array.h"
 #include "quote.h"
+#include "trailer.h"
 
 #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
 
@@ -56,30 +57,6 @@ static const char *get_todo_path(const struct replay_opts 
*opts)
return git_path_todo_file();
 }
 
-static int is_rfc2822_line(const char *buf, int len)
-{
-   int i;
-
-   for (i = 0; i < len; i++) {
-   int ch = buf[i];
-   if (ch == ':')
-   return 1;
-   if (!isalnum(ch) && ch != '-')
-   break;
-   }
-
-   return 0;
-}
-
-static int is_cherry_picked_from_line(const char *buf, int len)
-{
-   /*
-* We only care that it looks roughly like (cherry picked from ...)
-*/
-   return len > strlen(cherry_picked_prefix) + 1 &&
-   starts_with(buf, cherry_picked_prefix) && buf[len - 1] == ')';
-}
-
 /*
  * Returns 0 for non-conforming footer
  * Returns 1 for conforming footer
@@ -89,49 +66,25 @@ static int is_cherry_picked_from_line(const char *buf, int 
len)
 static int has_conforming_footer(struct strbuf *sb, struct strbuf *sob,
int ignore_footer)
 {
-   char prev;
-   int i, k;
-   int len = sb->len - ignore_footer;
-   const char *buf = sb->buf;
-   int found_sob = 0;
-
-   /* footer must end with newline */
-   if (!len || buf[len - 1] != '\n')
-   return 0;
+   struct trailer_info info;
+   int i;
+   int found_sob = 0, found_sob_last = 0;
 
-   prev = '\0';
-   for (i = len - 1; i > 0; i--) {
-   char ch = buf[i];
-   if (prev == '\n' && ch == '\n') /* paragraph break */
-   break;
-   prev = ch;
-   }
+   trailer_info_get(, sb->buf);
 
-   /* require at least one blank line */
-   if (prev != '\n' || buf[i] != '\n')
+   if (info.trailer_start == info.trailer_end)
return 0;
 
-   /* advance to start of last paragraph */
-   while (i < len - 1 && buf[i] == '\n')
-   i++;
-
-   for (; i < len; i = k) {
-   int found_rfc2822;
-
-   for (k = i; k < len && buf[k] != '\n'; k++)
-   ; /* do nothing */
-   k++;
+   for (i = 0; i < info.trailer_nr; i++)
+   if (sob && !strncmp(info.trailers[i], sob->buf, sob->len)) {
+   found_sob = 1;
+   if (i == info.trailer_nr - 1)
+   found_sob_last = 1;
+   }
 
-   found_rfc2822 = is_rfc2822_line(buf + i, k - i - 1);
-   if (found_rfc2822 && sob &&
-   !strncmp(buf + i, sob->buf, sob->len))
-   found_sob = k;
+   trailer_info_release();
 
-   if (!(found_rfc2822 ||
- is_cherry_picked_from_line(buf + i, k - i - 1)))
-   return 0;
-   }
-   if (found_sob == i)
+   if (found_sob_last)
return 3;
if (found_sob)
return 2;
diff --git a/t/t3511-cherry-pick-x.sh b/t/t3511-cherry-pick-x.sh
index 9cce5ae..bf0a5c9 100755
--- a/t/t3511-cherry-pick-x.sh
+++ b/t/t3511-cherry-pick-x.sh
@@ -25,9 +25,8 @@ Signed-off-by: B.U. Thor "
 
 mesg_broken_footer="$mesg_no_footer
 
-The signed-off-by string should begin with the words Signed-off-by followed
-by a colon and space, and then the signers name and email address. e.g.
-Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
+This is not recognized as a footer because Myfooter is not a recognized token.
+Myfooter: A.U. Thor "
 
 mesg_with_footer_sob="$mesg_with_footer
 Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>"
@@ -112,6 +111,17 @@ test_expect_success 'cherry-pick -s inserts blank line 
after non-conforming foot
test_cmp expect actual
 '
 
+test_expect_success 'cherry-pick -s recognizes trailer config' '
+   pristine_detach initial &&
+   git -c

[PATCH v3 4/5] trailer: have function to describe trailer layout

Create a function that, taking a string, describes the position of its
trailer block (if available) and the contents thereof, and make trailer
use it. This makes it easier for other Git components, in the future, to
interpret trailer blocks in the same way as trailer.

In a subsequent patch, another component will be made to use this.

Signed-off-by: Jonathan Tan 
---
 trailer.c | 118 +++---
 trailer.h |  25 +
 2 files changed, 107 insertions(+), 36 deletions(-)

diff --git a/trailer.c b/trailer.c
index afbff4b..bc6893b 100644
--- a/trailer.c
+++ b/trailer.c
@@ -46,6 +46,8 @@ static LIST_HEAD(conf_head);
 
 static char *separators = ":";
 
+static int configured;
+
 #define TRAILER_ARG_STRING "$ARG"
 
 static const char *git_generated_prefixes[] = {
@@ -546,6 +548,17 @@ static int git_trailer_config(const char *conf_key, const 
char *value, void *cb)
return 0;
 }
 
+static void ensure_configured(void)
+{
+   if (configured)
+   return;
+
+   /* Default config must be setup first */
+   git_config(git_trailer_default_config, NULL);
+   git_config(git_trailer_config, NULL);
+   configured = 1;
+}
+
 static const char *token_from_item(struct arg_item *item, char *tok)
 {
if (item->conf.key)
@@ -873,59 +886,43 @@ static int process_input_file(FILE *outfile,
  const char *str,
  struct list_head *head)
 {
-   int patch_start, trailer_start, trailer_end;
+   struct trailer_info info;
struct strbuf tok = STRBUF_INIT;
struct strbuf val = STRBUF_INIT;
-   struct trailer_item *last = NULL;
-   struct strbuf *trailer, **trailer_lines, **ptr;
+   int i;
 
-   patch_start = find_patch_start(str);
-   trailer_end = find_trailer_end(str, patch_start);
-   trailer_start = find_trailer_start(str, trailer_end);
+   trailer_info_get(, str);
 
/* Print lines before the trailers as is */
-   fwrite(str, 1, trailer_start, outfile);
+   fwrite(str, 1, info.trailer_start - str, outfile);
 
-   if (!ends_with_blank_line(str, trailer_start))
+   if (!info.blank_line_before_trailer)
fprintf(outfile, "\n");
 
-   /* Parse trailer lines */
-   trailer_lines = strbuf_split_buf(str + trailer_start,
-trailer_end - trailer_start,
-'\n',
-0);
-   for (ptr = trailer_lines; *ptr; ptr++) {
+   for (i = 0; i < info.trailer_nr; i++) {
int separator_pos;
-   trailer = *ptr;
-   if (trailer->buf[0] == comment_line_char)
-   continue;
-   if (last && isspace(trailer->buf[0])) {
-   struct strbuf sb = STRBUF_INIT;
-   strbuf_addf(, "%s\n%s", last->value, trailer->buf);
-   strbuf_strip_suffix(, "\n");
-   free(last->value);
-   last->value = strbuf_detach(, NULL);
+   char *trailer = info.trailers[i];
+   if (trailer[0] == comment_line_char)
continue;
-   }
-   separator_pos = find_separator(trailer->buf, separators);
+   separator_pos = find_separator(trailer, separators);
if (separator_pos >= 1) {
-   parse_trailer(, , NULL, trailer->buf,
+   parse_trailer(, , NULL, trailer,
  separator_pos);
-   last = add_trailer_item(head,
-   strbuf_detach(, NULL),
-   strbuf_detach(, NULL));
+   add_trailer_item(head,
+strbuf_detach(, NULL),
+strbuf_detach(, NULL));
} else {
-   strbuf_addbuf(, trailer);
+   strbuf_addstr(, trailer);
strbuf_strip_suffix(, "\n");
add_trailer_item(head,
 NULL,
 strbuf_detach(, NULL));
-   last = NULL;
}
}
-   strbuf_list_free(trailer_lines);
 
-   return trailer_end;
+   trailer_info_release();
+
+   return info.trailer_end - str;
 }
 
 static void free_all(struct list_head *head)
@@ -976,9 +973,7 @@ void process_trailers(const char *file, int in_place, int 
trim_empty, struct str
int trailer_end;
FILE *outfile = stdout;
 
-   /* Default config must be setup first */
-   git_config(git_trailer_default_config, NULL);
-   git_config(git_trailer_config, NULL);
+   ensure_configured();

[PATCH v3 0/5] Make other git commands use trailer layout

This is the same as v2 except that in 1/5, the comment about
find_separators has been moved from the commit message to the code
itself. Also, a trailing whitespace and unused variable fix.

Jonathan Tan (5):
  trailer: be stricter in parsing separators
  commit: make ignore_non_trailer take buf/len
  trailer: avoid unnecessary splitting on lines
  trailer: have function to describe trailer layout
  sequencer: use trailer's trailer layout

 builtin/commit.c |   2 +-
 commit.c |  22 ++--
 commit.h |   2 +-
 sequencer.c  |  75 +++-
 t/t3511-cherry-pick-x.sh |  16 ++-
 t/t4014-format-patch.sh  |  37 +-
 t/t7501-commit.sh|  36 ++
 trailer.c| 299 +--
 trailer.h|  25 
 9 files changed, 316 insertions(+), 198 deletions(-)

-- 
2.8.0.rc3.226.g39d4020

[PATCH v3 1/5] trailer: be stricter in parsing separators

Currently, a line is interpreted to be a trailer line if it contains a
separator. Make parsing stricter by requiring the text on the left of
the separator, if not the empty string, to be of the "" form.

Signed-off-by: Jonathan Tan 
---
 trailer.c | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/trailer.c b/trailer.c
index f0ecde2..dc525e3 100644
--- a/trailer.c
+++ b/trailer.c
@@ -563,15 +563,30 @@ static int token_matches_item(const char *tok, struct 
arg_item *item, int tok_le
 }
 
 /*
- * Return the location of the first separator in line, or -1 if there is no
- * separator.
+ * If the given line is of the form
+ * "..." or "...", return the
+ * location of the separator. Otherwise, return -1.
+ *
+ * The separator-starts-line case (in which this function returns 0) is
+ * distinguished from the non-well-formed-line case (in which this function
+ * returns -1) because some callers of this function need such a distinction.
  */
 static int find_separator(const char *line, const char *separators)
 {
-   int loc = strcspn(line, separators);
-   if (!line[loc])
-   return -1;
-   return loc;
+   int whitespace_found = 0;
+   const char *c;
+   for (c = line; *c; c++) {
+   if (strchr(separators, *c))
+   return c - line;
+   if (!whitespace_found && (isalnum(*c) || *c == '-'))
+   continue;
+   if (c != line && (*c == ' ' || *c == '\t')) {
+   whitespace_found = 1;
+   continue;
+   }
+   break;
+   }
+   return -1;
 }
 
 /*
-- 
2.8.0.rc3.226.g39d4020

Re: What's cooking in git.git (Oct 2016, #09; Mon, 31)

2016-11-02 Thread Torsten Bögershausen

> * ls/filter-process (2016-10-17) 14 commits
>   (merged to 'next' on 2016-10-19 at ffd0de042c)

Some (late, as I recently got a new battery for the Mac OS 10.6 test system) 
comments:
t0021 failes here:


Can't locate object method "flush" via package "IO::Handle" at 
/Users/tb/projects/git/git.next/t/t0021/rot13-filter.pl line 90.
fatal: The remote end hung up unexpectedly


perl itself is 5.10 and we use the one shipped with Mac OS.
Why that ?
t0021 uses the hard-coded path:
t0021/rot13-filter.pl (around line 345) and the nice macro
PERL_PATH from the Makefile is fully ignored.

Commenting out the different "flush" makes the test hang, and I haven't digged 
further.

Re: How can I tell, from a script, if “git cherry-pick” fails?

2016-11-02 Thread Pranit Bauva

Hey Kevin,

On Wed, Nov 2, 2016 at 8:44 PM, Kevin Layer  wrote:
> If the cherry-pick fails due to a merge conflict, it just returns an
> exit status of 0.  I have a script that does a series of cherry-picks
> and I need to know if they succeed.

Well, I haven't checked what it returns in git 2.10, but you can
always redirect the stdout and stderr to the output and grep for the
text which it shows in the actual output. Here[1] is an example of how
to do it.

[1]: 
https://github.com/git/git/blob/master/t/t3507-cherry-pick-conflict.sh#L42-L55

Regards,
Pranit Bauva

Re: How can I tell, from a script, if “git cherry-pick” fails?

2016-11-02 Thread Kevin Layer

Nevermind.  It's working as it should.  The script that was doing the
cherry-pick was doing it in an if and I neglected to exit with a
non-zero status.  Sorry for the noise.

On Wed, Nov 2, 2016 at 8:14 AM, Kevin Layer  wrote:
> If the cherry-pick fails due to a merge conflict, it just returns an
> exit status of 0.  I have a script that does a series of cherry-picks
> and I need to know if they succeed.
>
> I'm sure this has been covered before.
>
> Using git version 1.8.3.1.
>
> Thank you.
>
> Kevin

How can I tell, from a script, if “git cherry-pick” fails?

2016-11-02 Thread Kevin Layer

If the cherry-pick fails due to a merge conflict, it just returns an
exit status of 0.  I have a script that does a series of cherry-picks
and I need to know if they succeed.

I'm sure this has been covered before.

Using git version 1.8.3.1.

Thank you.

Kevin

Re: RFE: Discard hunks during `git add -p`

2016-11-02 Thread Konstantin Khomoutov

On Wed, 2 Nov 2016 14:46:04 +0100 (CET)
Jan Engelhardt  wrote:

> Current version: 2.10.2
> Example workflow:
> 
> * I would do a global substitution across a source tree, e.g. `perl
> -i -pe 's{OLD_FOO\(x\)}{NEW_BAR(x, 0)}' *.c`
> * Using `git add -p`, I would verify each of the substitutions that
> they make sense in their respective locations, and, based on that,
>   answer "y" or "n" to the interactive prompting to stage good hunks.
> * When done with add-p, I would commit the so-staged hunks,
>   and then use `git reset --hard` to discard all changes that were 
>   not acknowledged during add-p.
> 
> Being able to discard hunks (reset working copy to index contents) 
> during add-p would alleviate the (quite broad) hard reset.

Couldn't you just do

  git checkout -- .

after staging your approved changes?

To selectively zap uncommitted changes from your working tree, you could
do

  git checkout --patch -- .


I'm not sure overloading `git add` with a "reverse" action is a good
idea.  I'm actually prefer pragmatism over conceptual purity but I'm
not sure the prospective gain here is clear.

RFE: Discard hunks during `git add -p`

2016-11-02 Thread Jan Engelhardt


Current version: 2.10.2
Example workflow:

* I would do a global substitution across a source tree, e.g. `perl -i 
  -pe 's{OLD_FOO\(x\)}{NEW_BAR(x, 0)}' *.c`
* Using `git add -p`, I would verify each of the substitutions that they 
  make sense in their respective locations, and, based on that,
  answer "y" or "n" to the interactive prompting to stage good hunks.
* When done with add-p, I would commit the so-staged hunks,
  and then use `git reset --hard` to discard all changes that were 
  not acknowledged during add-p.

Being able to discard hunks (reset working copy to index contents) 
during add-p would alleviate the (quite broad) hard reset.

Similar approach:

* global substitution
* Using `git add -p`, some hunks may warrant some more editing, doable 
  with the "e" command. The index would be updated with the extra
  change, but the working copy be left as-is.
* When rerunning `git add -p` in such a state, a difference is shown 
  again for such edited spots, which I would like to discard (bring 
  the working copy into sync with index).

[PATCH 5/5] exclude: do not respect symlinks for in-tree .gitignore

Like .gitattributes, we would like to make sure that
.gitignore files are handled consistently whether read from
the index or from the filesystem. We can do so by using
O_NOFOLLOW when opening the files.

Signed-off-by: Jeff King 
---
 dir.c  |  9 +++--
 t/t0008-ignores.sh | 29 +
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/dir.c b/dir.c
index 4fa1ca109..cf3fde005 100644
--- a/dir.c
+++ b/dir.c
@@ -693,6 +693,7 @@ static void invalidate_directory(struct untracked_cache *uc,
 
 /* Flags for add_excludes() */
 #define EXCLUDE_CHECK_INDEX (1<<0)
+#define EXCLUDE_NOFOLLOW (1<<1)
 
 /*
  * Given a file with name "fname", read it (either from disk, or from
@@ -713,7 +714,11 @@ static int add_excludes(const char *fname, const char 
*base, int baselen,
size_t size = 0;
char *buf, *entry;
 
-   fd = open(fname, O_RDONLY);
+   if (flags & EXCLUDE_NOFOLLOW)
+   fd = open_nofollow(fname, O_RDONLY);
+   else
+   fd = open(fname, O_RDONLY);
+
if (fd < 0 || fstat(fd, ) < 0) {
if (errno != ENOENT)
warn_on_inaccessible(fname);
@@ -1130,7 +1135,7 @@ static void prep_exclude(struct dir_struct *dir, const 
char *base, int baselen)
strbuf_addstr(, dir->exclude_per_dir);
el->src = strbuf_detach(, NULL);
add_excludes(el->src, el->src, stk->baselen, el,
-EXCLUDE_CHECK_INDEX,
+EXCLUDE_CHECK_INDEX | EXCLUDE_NOFOLLOW,
 untracked ? _stat : NULL);
}
/*
diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh
index d27f438bf..7348b8e6a 100755
--- a/t/t0008-ignores.sh
+++ b/t/t0008-ignores.sh
@@ -841,4 +841,33 @@ test_expect_success 'info/exclude trumps 
core.excludesfile' '
test_cmp expect actual
 '
 
+test_expect_success SYMLINKS 'set up ignore file for symlink tests' '
+   echo "*" >ignore
+'
+
+test_expect_success SYMLINKS 'symlinks respected in core.excludesFile' '
+   test_when_finished "rm symlink" &&
+   ln -s ignore symlink &&
+   test_config core.excludesFile "$(pwd)/symlink" &&
+   echo file >expect &&
+   git check-ignore file >actual &&
+   test_cmp expect actual
+'
+
+test_expect_success SYMLINKS 'symlinks respected in info/exclude' '
+   test_when_finished "rm .git/info/exclude" &&
+   ln -sf ../../ignore .git/info/exclude &&
+   echo file >expect &&
+   git check-ignore file >actual &&
+   test_cmp expect actual
+'
+
+test_expect_success SYMLINKS 'symlinks not respected in-tree' '
+   test_when_finished "rm .gitignore" &&
+   ln -sf ignore .gitignore &&
+   >expect &&
+   test_must_fail git check-ignore file >actual &&
+   test_cmp expect actual
+'
+
 test_done
-- 
2.11.0.rc0.258.gf434c15

[PATCH 4/5] attr: do not respect symlinks for in-tree .gitattributes

The attributes system may sometimes read in-tree files from
the filesystem, and sometimes from the index. In the latter
case, we do not resolve symbolic links (and are not likely
to ever start doing so). Let's open filesystem links with
O_NOFOLLOW so that the two cases behave consistently.

As a bonus, this means that git will not follow such
symlinks to read and parse out-of-tree paths. It's unlikely
that this could have any security implications (a malicious
repository can already feed arbitrary content to the
attribute parser, and any disclosure of the out-of-tree
contents happens only to stderr). But it's one less oddball
thing to worry about.

Note that O_NOFOLLOW only prevents following links for the
path itself, not intermediate directories in the path.  At
first glance, it seems like

  ln -s /some/path in-repo

might still look at "in-repo/.gitattributes", following the
symlink to "/some/path/.gitattributes". However, if
"in-repo" is a symbolic link, then we know that it has no
git paths below it, and will never look at its
.gitattributes file.

We will continue to support out-of-tree symbolic links
(e.g., in $GIT_DIR/info/attributes); this just affects
in-tree links. When a symbolic link is encountered, the
contents are ignored and a warning is printed. POSIX
specifies ELOOP in this case, so the user would generally
see something like:

  warning: unable to access '.gitattributes': Too many levels of symbolic links

Signed-off-by: Jeff King 
---
 attr.c| 17 +
 t/t0003-attributes.sh | 31 +++
 2 files changed, 44 insertions(+), 4 deletions(-)

diff --git a/attr.c b/attr.c
index 79bd89226..abf375c8a 100644
--- a/attr.c
+++ b/attr.c
@@ -153,6 +153,7 @@ static const char blank[] = " \t\r\n";
 
 /* Flags usable in read_attr() and parse_attr_line() family of functions. */
 #define READ_ATTR_MACRO_OK (1<<0)
+#define READ_ATTR_NOFOLLOW (1<<1)
 
 /*
  * Parse a whitespace-delimited attribute state (i.e., "attr",
@@ -371,16 +372,24 @@ static struct index_state *use_index;
 
 static struct attr_stack *read_attr_from_file(const char *path, unsigned flags)
 {
-   FILE *fp = fopen(path, "r");
+   int fd;
+   FILE *fp;
struct attr_stack *res;
char buf[2048];
int lineno = 0;
 
-   if (!fp) {
+   if (flags & READ_ATTR_NOFOLLOW)
+   fd = open_nofollow(path, O_RDONLY);
+   else
+   fd = open(path, O_RDONLY);
+
+   if (fd < 0) {
if (errno != ENOENT && errno != ENOTDIR)
warn_on_inaccessible(path);
return NULL;
}
+   fp = xfdopen(fd, "r");
+
res = xcalloc(1, sizeof(*res));
while (fgets(buf, sizeof(buf), fp)) {
char *bufp = buf;
@@ -528,7 +537,7 @@ static void bootstrap_attr_stack(void)
}
 
if (!is_bare_repository() || direction == GIT_ATTR_INDEX) {
-   elem = read_attr(GITATTRIBUTES_FILE, flags);
+   elem = read_attr(GITATTRIBUTES_FILE, flags | 
READ_ATTR_NOFOLLOW);
elem->origin = xstrdup("");
elem->originlen = 0;
elem->prev = attr_stack;
@@ -620,7 +629,7 @@ static void prepare_attr_stack(const char *path, int dirlen)
strbuf_add(, path, cp - path);
strbuf_addch(, '/');
strbuf_addstr(, GITATTRIBUTES_FILE);
-   elem = read_attr(pathbuf.buf, 0);
+   elem = read_attr(pathbuf.buf, READ_ATTR_NOFOLLOW);
strbuf_setlen(, cp - path);
elem->origin = strbuf_detach(, 
>originlen);
elem->prev = attr_stack;
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index f0fbb4255..bd01945e5 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -297,4 +297,35 @@ test_expect_success 'bare repository: test 
info/attributes' '
)
 '
 
+check_symlink () {
+   echo "$1: symlink: $2" >expect &&
+   git check-attr symlink "$1" >actual &&
+   test_cmp expect actual
+}
+
+test_expect_success SYMLINKS 'set up attr file for symlink tests' '
+   echo "* symlink" >attr
+'
+
+test_expect_success SYMLINKS 'symlinks respected in core.attributesFile' '
+   test_when_finished "rm symlink" &&
+   ln -s attr symlink &&
+   test_config core.attributesFile "$(pwd)/symlink" &&
+   check_symlink file set
+'
+
+test_expect_success SYMLINKS 'symlinks respected in info/attributes' '
+   test_when_finished "rm .git/info/attributes" &&
+   ln -s ../../attr .git/info/attributes &&
+   check_symlink file set
+'
+
+test_expect_success SYMLINKS 'symlinks not respected in-tree' '
+   test_when_finished "rm .gitattributes" &&
+   ln -sf attr .gitattributes &&
+   mkdir subdir &&
+   ln -sf ../attr subdir/.gitattributes &&
+   check_symlink subdir/file unspecified
+'
+

[PATCH 3/5] exclude: convert "check_index" into a flags field

We pass the "check_index" flag through the variants of
add_excludes(). Let's turn this into a full flags bit-field,
so that we can add more flags to it without affecting the
function signature.

Note that only one caller actually needs to use the new flag
name, as the rest all were passing "0" already.

Signed-off-by: Jeff King 
---
 dir.c | 13 +
 dir.h |  2 +-
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/dir.c b/dir.c
index bfa8c8a9a..4fa1ca109 100644
--- a/dir.c
+++ b/dir.c
@@ -691,6 +691,9 @@ static void invalidate_directory(struct untracked_cache *uc,
dir->dirs[i]->recurse = 0;
 }
 
+/* Flags for add_excludes() */
+#define EXCLUDE_CHECK_INDEX (1<<0)
+
 /*
  * Given a file with name "fname", read it (either from disk, or from
  * the index if "check_index" is non-zero), parse it and store the
@@ -701,9 +704,10 @@ static void invalidate_directory(struct untracked_cache 
*uc,
  * ss_valid is non-zero, "ss" must contain good value as input.
  */
 static int add_excludes(const char *fname, const char *base, int baselen,
-   struct exclude_list *el, int check_index,
+   struct exclude_list *el, unsigned flags,
struct sha1_stat *sha1_stat)
 {
+   int check_index = !!(flags & EXCLUDE_CHECK_INDEX);
struct stat st;
int fd, i, lineno = 1;
size_t size = 0;
@@ -787,9 +791,9 @@ static int add_excludes(const char *fname, const char 
*base, int baselen,
 
 int add_excludes_from_file_to_list(const char *fname, const char *base,
   int baselen, struct exclude_list *el,
-  int check_index)
+  unsigned flags)
 {
-   return add_excludes(fname, base, baselen, el, check_index, NULL);
+   return add_excludes(fname, base, baselen, el, flags, NULL);
 }
 
 struct exclude_list *add_exclude_list(struct dir_struct *dir,
@@ -1125,7 +1129,8 @@ static void prep_exclude(struct dir_struct *dir, const 
char *base, int baselen)
strbuf_addbuf(, >basebuf);
strbuf_addstr(, dir->exclude_per_dir);
el->src = strbuf_detach(, NULL);
-   add_excludes(el->src, el->src, stk->baselen, el, 1,
+   add_excludes(el->src, el->src, stk->baselen, el,
+EXCLUDE_CHECK_INDEX,
 untracked ? _stat : NULL);
}
/*
diff --git a/dir.h b/dir.h
index 97c83bb38..ba7eb924c 100644
--- a/dir.h
+++ b/dir.h
@@ -239,7 +239,7 @@ extern int is_excluded(struct dir_struct *dir, const char 
*name, int *dtype);
 extern struct exclude_list *add_exclude_list(struct dir_struct *dir,
 int group_type, const char *src);
 extern int add_excludes_from_file_to_list(const char *fname, const char *base, 
int baselen,
- struct exclude_list *el, int 
check_index);
+ struct exclude_list *el, unsigned 
flags);
 extern void add_excludes_from_file(struct dir_struct *, const char *fname);
 extern void parse_exclude_pattern(const char **string, int *patternlen, 
unsigned *flags, int *nowildcardlen);
 extern void add_exclude(const char *string, const char *base,
-- 
2.11.0.rc0.258.gf434c15

[PATCH 2/5] attr: convert "macro_ok" into a flags field

The attribute code can have a rather deep callstack, through
which we have to pass the "macro_ok" flag. In anticipation
of adding other flags, let's convert this to a generic
bit-field.

Signed-off-by: Jeff King 
---
 attr.c | 43 ---
 1 file changed, 24 insertions(+), 19 deletions(-)

diff --git a/attr.c b/attr.c
index 1fcf042b8..79bd89226 100644
--- a/attr.c
+++ b/attr.c
@@ -151,6 +151,9 @@ struct match_attr {
 
 static const char blank[] = " \t\r\n";
 
+/* Flags usable in read_attr() and parse_attr_line() family of functions. */
+#define READ_ATTR_MACRO_OK (1<<0)
+
 /*
  * Parse a whitespace-delimited attribute state (i.e., "attr",
  * "-attr", "!attr", or "attr=value") from the string starting at src.
@@ -200,7 +203,7 @@ static const char *parse_attr(const char *src, int lineno, 
const char *cp,
 }
 
 static struct match_attr *parse_attr_line(const char *line, const char *src,
- int lineno, int macro_ok)
+ int lineno, unsigned flags)
 {
int namelen;
int num_attr, i;
@@ -215,7 +218,7 @@ static struct match_attr *parse_attr_line(const char *line, 
const char *src,
namelen = strcspn(name, blank);
if (strlen(ATTRIBUTE_MACRO_PREFIX) < namelen &&
starts_with(name, ATTRIBUTE_MACRO_PREFIX)) {
-   if (!macro_ok) {
+   if (!(flags & READ_ATTR_MACRO_OK)) {
fprintf(stderr, "%s not allowed: %s:%d\n",
name, src, lineno);
return NULL;
@@ -339,11 +342,11 @@ static void handle_attr_line(struct attr_stack *res,
 const char *line,
 const char *src,
 int lineno,
-int macro_ok)
+unsigned flags)
 {
struct match_attr *a;
 
-   a = parse_attr_line(line, src, lineno, macro_ok);
+   a = parse_attr_line(line, src, lineno, flags);
if (!a)
return;
ALLOC_GROW(res->attrs, res->num_matches + 1, res->alloc);
@@ -358,14 +361,15 @@ static struct attr_stack *read_attr_from_array(const char 
**list)
 
res = xcalloc(1, sizeof(*res));
while ((line = *(list++)) != NULL)
-   handle_attr_line(res, line, "[builtin]", ++lineno, 1);
+   handle_attr_line(res, line, "[builtin]", ++lineno,
+READ_ATTR_MACRO_OK);
return res;
 }
 
 static enum git_attr_direction direction;
 static struct index_state *use_index;
 
-static struct attr_stack *read_attr_from_file(const char *path, int macro_ok)
+static struct attr_stack *read_attr_from_file(const char *path, unsigned flags)
 {
FILE *fp = fopen(path, "r");
struct attr_stack *res;
@@ -382,13 +386,13 @@ static struct attr_stack *read_attr_from_file(const char 
*path, int macro_ok)
char *bufp = buf;
if (!lineno)
skip_utf8_bom(, strlen(bufp));
-   handle_attr_line(res, bufp, path, ++lineno, macro_ok);
+   handle_attr_line(res, bufp, path, ++lineno, flags);
}
fclose(fp);
return res;
 }
 
-static struct attr_stack *read_attr_from_index(const char *path, int macro_ok)
+static struct attr_stack *read_attr_from_index(const char *path, unsigned 
flags)
 {
struct attr_stack *res;
char *buf, *sp;
@@ -406,34 +410,34 @@ static struct attr_stack *read_attr_from_index(const char 
*path, int macro_ok)
;
more = (*ep == '\n');
*ep = '\0';
-   handle_attr_line(res, sp, path, ++lineno, macro_ok);
+   handle_attr_line(res, sp, path, ++lineno, flags);
sp = ep + more;
}
free(buf);
return res;
 }
 
-static struct attr_stack *read_attr(const char *path, int macro_ok)
+static struct attr_stack *read_attr(const char *path, unsigned flags)
 {
struct attr_stack *res;
 
if (direction == GIT_ATTR_CHECKOUT) {
-   res = read_attr_from_index(path, macro_ok);
+   res = read_attr_from_index(path, flags);
if (!res)
-   res = read_attr_from_file(path, macro_ok);
+   res = read_attr_from_file(path, flags);
}
else if (direction == GIT_ATTR_CHECKIN) {
-   res = read_attr_from_file(path, macro_ok);
+   res = read_attr_from_file(path, flags);
if (!res)
/*
 * There is no checked out .gitattributes file there, 
but
 * we might have it in the index.  We allow operation 
in a
 * sparsely checked out work tree, so read from it.
 */
-   res = read_attr_from_index(path,

[PATCH 1/5] add open_nofollow() helper

Some callers of open() would like to optionally use
O_NOFOLLOW, but it is not available on all platforms. We
could abstract this by publicly defining O_NOFOLLOW to 0 on
those platforms, but that leaves us no room for any
workarounds (e.g., by checking the file type via lstat()).

Instead, let's abstract it into its own function. We don't
implement any workarounds here, but it it would be easy to
add them later.

Signed-off-by: Jeff King 
---
I didn't add the workaround because I think the current callers are OK
with "best effort", and doing it non-racily is quite tricky (though we
might also be OK with a racy version; we are not trying to beat
/tmp races, but just making sure a checkout that we did is sane).

 git-compat-util.h | 3 +++
 wrapper.c | 8 
 2 files changed, 11 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index 87237b092..a2cc33ebc 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1080,6 +1080,9 @@ int access_or_die(const char *path, int mode, unsigned 
flag);
 /* Warn on an inaccessible file that ought to be accessible */
 void warn_on_inaccessible(const char *path);
 
+/* Open with O_NOFOLLOW, if available on this platform */
+int open_nofollow(const char *path, int flags);
+
 #ifdef GMTIME_UNRELIABLE_ERRORS
 struct tm *git_gmtime(const time_t *);
 struct tm *git_gmtime_r(const time_t *, struct tm *);
diff --git a/wrapper.c b/wrapper.c
index e7f197996..6bc7f24f5 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -679,3 +679,11 @@ void sleep_millisec(int millisec)
 {
poll(NULL, 0, millisec);
 }
+
+#ifndef O_NOFOLLOW
+#define O_NOFOLLOW 0
+#endif
+int open_nofollow(const char *path, int flags)
+{
+   return open(path, flags | O_NOFOLLOW);
+}
-- 
2.11.0.rc0.258.gf434c15

[PATCH 0/5] disallow symlinks for .gitignore and .gitattributes