Re: [PATCHv3 02/11] run-command: report failure for degraded output just once

2015-11-04 Thread Johannes Sixt

On 04.11.2015 21:14, Stefan Beller wrote:

On Wed, Nov 4, 2015 at 10:14 AM, Junio C Hamano  wrote:

Imagine that we are running two things A and B at the same time.  We
ask poll(2) and it says both A and B have some data ready to be
read, and we try to read from A.  strbuf_read_once() would try to
read up to 8K, relying on the fact that you earlier set the IO to be
nonblock.  It will get stuck reading from A without allowing output
from B to drain.  B's write may get stuck because we are not reading
from it, and would cause B to stop making progress.

What if the other sides of the connection from A and B are talking
with each other,


I am not sure if we want to allow this ever. How would that work with
jobs==1? How do we guarantee to have A and B running at the same time?


I think that a scenario where A and B are communicating is rather 
far-fetched. We are talking about parallelizing independent tasks. I 
would not worry.


-- Hannes



Re: [PATCH v6 25/25] refs: break out ref conflict checks

2015-11-04 Thread David Turner
On Tue, 2015-11-03 at 08:40 +0100, Michael Haggerty wrote:
> + * extras and skip must be sorted lists of reference names. Either one
> + * can be NULL, signifying the empty list.
> + */

My version had:

"skip can be NULL; extras cannot."

The first thing that function does is:
string_list_find_insert_index(extras, dirname, 0)

And that crashes when extras is null.  So I think my version is correct
here.

Other than that, I've reviewed both the patches themselves and the
overall diff and everything looks good to me.



Re: [PATCHv3 02/11] run-command: report failure for degraded output just once

2015-11-04 Thread Junio C Hamano
Johannes Sixt  writes:

> I think that a scenario where A and B are communicating is rather
> far-fetched. We are talking about parallelizing independent tasks. I
> would not worry.

I wouldn't worry too much if this were merely a hack that is only
applicable to submodules, but I do not think it is a healthy
attitude to dismiss potential problem as far-fetched without
thinking things through, when you are designing what goes into
the run-command API.

I'd grant you that a complete deadlock is unlikely to be a problem
on its own.  Somewhere somebody will eventually time out and unblock
the deadlock anyway.

But the symptom does not have to be as severe as a total deadlock to
be problematic.  If we block B (and other tasks) by not reading from
them quickly because we are blocked on reading from A, which may
take forever (in timescale of B and other tasks) to feed us enough
to satisfy strbuf_read_once(), we are wasting resource by spawning B
(and other tasks) early when we are not prepared to service them
well, on both our end and on the other side of the connection.

By the way, A and B do not have to be directly communicating to
deadlock.  The underlying system, either the remote end or the local
end or even a relaying system in between (think: network) can
throttle to cause the same symptom without A and B knowing (which
was the example I gave).




Re: [PATCHv3 02/11] run-command: report failure for degraded output just once

2015-11-04 Thread Stefan Beller
On Wed, Nov 4, 2015 at 12:42 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> Another approach would be to test if we can set to non blocking and if
>> that is not possible, do not buffer it, but redirect the subcommand
>> directly to stderr of the calling process.
>>
>> if (set_nonblocking(pp->children[i].process.err) < 0) {
>>	pp->children[i].process.err = 2;
>>	degraded_parallelism = 1;
>> }
>>
>> and once we observe the degraded_parallelism flag, we can only
>> schedule a maximum of one job at a time, having direct output?
>
> I would even say that on a platform that is _capable_ of setting fd
> non-blocking, we should signal a grave error and die if an attempt
> to do so fails, period.

So more like:

if (platform_capable_non_blocking_IO())
	set_nonblocking_or_die(pp->children[i].process.err);
else
	pp->children[i].process.err = 2; /* ugly intermixed output is possible */

>
> On the other hand, on a platform that is known to be incapable
> (e.g. lacks SETFL or NONBLOCK), we have two options.
>
> 1. If we can arrange to omit the intermediary buffer processing
>without butchering the flow of the main logic with many
>#ifdef..#endif, then that would make a lot of sense to do so, and
>running the processes in parallel with mixed output might be OK.
>It may not be very nice, but should be an acceptable compromise.

From what I hear this kind of output is very annoying. (One of the
main complaints of repo users, besides missing atomic fetch transactions.)

>
> 2. If we need to sprinkle conditional compilation all over the place
>to do so, then I do not think it is worth it.  Instead, we should
>keep a single code structure, and forbid setting numtasks to more
>than one, which would also remove the need for nonblock IO.

So additional to the code above, we can add the
platform_capable_non_blocking_IO() condition to either the ramp up process,
or have a

if (!platform_capable_non_blocking_IO())
	pp.max_processes = 1;

in the init phase. Then we have only two places that deal with the
problem, and no #ifdefs elsewhere.

>
> Either way, bringing "parallelism with sequential output" to
> platforms without nonblock IO can be left for a later day, when we
> find either (1) a good approach that does not require nonblock IO to
> do this, or (2) a good approach to do a nonblock IO on these
> platforms (we know about Windows, but there may be others; I dunno).
>


[PATCH 1/2] run-parallel: rename set_nonblocking to set_nonblocking_or_die

2015-11-04 Thread Stefan Beller
Discussion showed that a warning is not enough; we need to die in case
of blocking output, as blocking output can lock up the child processes.

Junio wrote:
> Imagine that we are running two things A and B at the same time.  We
> ask poll(2) and it says both A and B have some data ready to be
> read, and we try to read from A.  strbuf_read_once() would try to
> read up to 8K, relying on the fact that you earlier set the IO to be
> nonblock.  It will get stuck reading from A without allowing output
> from B to drain.  B's write may get stuck because we are not reading
> from it, and would cause B to stop making progress.

> What if the other sides of the connection from A and B are talking
> with each other, and B's non-progress caused the processing for A on
> the other side of the connection to block, causing it not to produce
> more output to allow us to make progress reading from A (so that
> eventually we can give B a chance to drain its output)?  Imagine A
> and B are pushes to the same remote, B may be pushing a change to a
> submodule while A may be pushing a matching change to its
> superproject, and the server may be trying to make sure that the
> submodule update completes and updates the ref before making the
> superproject's tree that binds that updated submodule's commit
> available, for example?  Can we make any progress from that point?

Signed-off-by: Stefan Beller 
---
 run-command.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/run-command.c b/run-command.c
index 0a3c24e..86fbe50 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1006,15 +1006,13 @@ static void pp_cleanup(struct parallel_processes *pp)
sigchain_pop_common();
 }
 
-static void set_nonblocking(int fd)
+static void set_nonblocking_or_die(int fd)
 {
int flags = fcntl(fd, F_GETFL);
if (flags < 0)
-   warning("Could not get file status flags, "
-   "output will be degraded");
+   die("Could not get file status flags");
else if (fcntl(fd, F_SETFL, flags | O_NONBLOCK))
-   warning("Could not set file status flags, "
-   "output will be degraded");
+   die("Could not set file status flags");
 }
 
 /* returns
@@ -1052,7 +1050,7 @@ static int pp_start_one(struct parallel_processes *pp)
return code ? -1 : 1;
}
 
-   set_nonblocking(pp->children[i].process.err);
+   set_nonblocking_or_die(pp->children[i].process.err);
 
pp->nr_processes++;
pp->children[i].in_use = 1;
-- 
2.6.1.247.ge8f2a41.dirty



[PATCH 2/2] run-parallel: run sequentially if nonblocking I/O is unavailable

2015-11-04 Thread Stefan Beller
Windows has neither O_NONBLOCK nor F_GETFL defined, so we cannot run in
parallel there. Instead the children output directly to our stderr and
we run one child at a time.

Bonus: We now set process.err = -1 ourselves, which we previously expected
the get_next_task callback to do. It is easy to forget that part in the
callback, leading to hard-to-debug errors.

Signed-off-by: Stefan Beller 
---
 run-command.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/run-command.c b/run-command.c
index 86fbe50..19de253 100644
--- a/run-command.c
+++ b/run-command.c
@@ -958,6 +958,10 @@ static struct parallel_processes *pp_init(int n,
if (n < 1)
n = online_cpus();
 
+#if !(defined (O_NONBLOCK) && defined (F_GETFL))
+   n = 1;
+#endif
+
pp->max_processes = n;
pp->data = data;
if (!get_next_task)
@@ -1006,6 +1010,7 @@ static void pp_cleanup(struct parallel_processes *pp)
sigchain_pop_common();
 }
 
+#if defined (O_NONBLOCK) && defined (F_GETFL)
 static void set_nonblocking_or_die(int fd)
 {
int flags = fcntl(fd, F_GETFL);
@@ -1014,6 +1019,7 @@ static void set_nonblocking_or_die(int fd)
else if (fcntl(fd, F_SETFL, flags | O_NONBLOCK))
die("Could not set file status flags");
 }
+#endif
 
 /* returns
  *  0 if a new task was started.
@@ -1031,6 +1037,12 @@ static int pp_start_one(struct parallel_processes *pp)
if (i == pp->max_processes)
die("BUG: bookkeeping is hard");
 
+#if defined (O_NONBLOCK) && defined (F_GETFL)
+   pp->children[i].process.err = -1;
+#else
+   pp->children[i].process.err = 2;
+#endif
+
if (!pp->get_next_task(&pp->children[i].data,
   &pp->children[i].process,
   &pp->children[i].err,
@@ -1049,8 +1061,9 @@ static int pp_start_one(struct parallel_processes *pp)
strbuf_reset(>children[i].err);
return code ? -1 : 1;
}
-
+#if defined (O_NONBLOCK) && defined (F_GETFL)
set_nonblocking_or_die(pp->children[i].process.err);
+#endif
 
pp->nr_processes++;
pp->children[i].in_use = 1;
-- 
2.6.1.247.ge8f2a41.dirty



[PATCH 0/2] Missing O_NONBLOCK under Windows (was: git.git as of tonight)

2015-11-04 Thread Stefan Beller
The first patch is a general fixup as per discussion,
the second patch will make Git compile on Windows again (hopefully; not tested).

The number of #ifdefs seems acceptable to me; opinions on that?

This has been developed on top of d075d2604c0f9 (Merge branch 
'rs/daemon-plug-child-leak'
into sb/submodule-parallel-update), but should also apply on top of 
origin/sb/submodule-parallel-fetch

Thanks,
Stefan

Stefan Beller (2):
  run-parallel: rename set_nonblocking to set_nonblocking_or_die
  run-parallel: run sequentially if nonblocking I/O is unavailable

 run-command.c | 25 ++---
 1 file changed, 18 insertions(+), 7 deletions(-)

-- 
2.6.1.247.ge8f2a41.dirty



Re: [PATCHv3 02/11] run-command: report failure for degraded output just once

2015-11-04 Thread Stefan Beller
On Wed, Nov 4, 2015 at 1:19 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> So more like:
>>
>> if (platform_capable_non_blocking_IO())
>>	set_nonblocking_or_die(pp->children[i].process.err);
>> else
>>	pp->children[i].process.err = 2; /* ugly intermixed output is possible */
>
> When I mentioned #if..#endif, I didn't mean it as a dogmatic
> "conditional compilation is wrong" sense.  It was more about "the
> high-level flow of logic should not have to know too much about
> platform peculiarities".  As the platform_capable_non_blocking_IO()
> function would be a constant function after you compile it for a
> single platform, if you add 10 instances of such if/else in a patch
> that adds 250 lines, unless the change is to add a set of lowest
> level helpers to be called from the higher-level flow of logic so
> that the callers do not have to know about the platform details,
> that's just as bad as adding 10 instances of #if..#endif.

Right. Yeah I was just outlining the program flow as I understood it.

Specifically:
 * It's fine if the platform doesn't support non-blocking IO,
 * but if the platform claims to support it and then fails, that is a hard error.
How is this worse than the platform claiming not to support it at all?

I mean, another solution would be to try to set non-blocking IO, and if
that fails, put the process-to-be-started into a special queue, which
starts the process once the stderr channel is no longer occupied (i.e.
the process that currently owns the output has ended, or there is no
such process).

That way we would fall back to no parallelism on Windows and could even
fall back gracefully on other systems that suddenly have problems
setting non-blocking IO. (Though I suspect that isn't required here.)

>
>>> On the other hand, on a platform that is known to be incapable
>>> (e.g. lacks SETFL or NONBLOCK), we have two options.
>>>
>>> 1. If we can arrange to omit the intermediary buffer processing
>>>without butchering the flow of the main logic with many
>>>#ifdef..#endif, then that would make a lot of sense to do so, and
>>>running the processes in parallel with mixed output might be OK.
>>>It may not be very nice, but should be an acceptable compromise.
>>
>> From what I hear this kind of output is very annoying. (One of the
>> main complaints of repo users, besides missing atomic fetch transactions.)
>
> When (1) "parallelism with sequential output" is the desired
> outcome, (2) on some platforms we haven't found a way to achieve
> both, and (3) a non-sequential output is unacceptable, then
> parallelism has to give :-(.

Ok, then I will go that route.

>
> I was getting an impression from your "not buffer" suggestion that
> "sequential output" would be the one that can be sacrificed, but
> that is OK.  Until we find a way to achieve both at the same time,
> achieving only either one or the other is better than achieving
> nothing.

I was not sure which is better to sacrifice. Sacrificing parallelism sounds
safer to me (no apparent change compared to before).

>
>>> Either way, bringing "parallelism with sequential output" to
>>> platforms without nonblock IO can be left for a later day, when we
>>> find either (1) a good approach that does not require nonblock IO to
>>> do this, or (2) a good approach to do a nonblock IO on these
>>> platforms (we know about Windows, but there may be others; I dunno).


Re: [PATCHv3 02/11] run-command: report failure for degraded output just once

2015-11-04 Thread Jeff King
On Wed, Nov 04, 2015 at 01:01:53PM -0800, Junio C Hamano wrote:

> But the symptom does not have to be as severe as a total deadlock to
> be problematic.  If we block B (and other tasks) by not reading from
> them quickly because we are blocked on reading from A, which may
> take forever (in timescale of B and other tasks) to feed us enough
> to satisfy strbuf_read_once(), we are wasting resource by spawning B
> (and other tasks) early when we are not prepared to service them
> well, on both our end and on the other side of the connection.

I'm not sure I understand this line of reasoning. It is entirely
possible that I have not been paying close enough attention and am
missing something subtle, so please feel free to hit me with the clue
stick.

But why would we ever block reading from A? If poll() reported to us
that "A" is ready to read, and we call strbuf_read_once(), we will make
a _single_ read call (which was, after all, the point of adding
strbuf_read_once in the first place).

So even if descriptor "A" isn't non-blocking, why would we block? Only
if the OS told us we are ready to read via poll(), but we are somehow
not (which, AFAIK, would be a bug in the OS).

So I'm not sure I see why we need to be non-blocking at all here. If we
are correctly hitting poll() and doing a single read on anybody who
claims to be ready (rather than trying to soak up all of their available
data), then we should never block, and we should never starve one
process (even without blocking, we could be in a busy loop slurping from
A and starving B, but by hitting the descriptors in round-robin for each
poll(), we make sure they all progress).
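
(A minimal sketch of that poll-and-single-read round robin, for
illustration only; it assumes POSIX poll(2), and the child_fd[] array
and the fixed bound are made up for the example.)

#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* One scheduling round: at most one read(2) per ready child,
 * so no single child can starve the others. */
static void drain_ready(int *child_fd, int nr_children, char *buf, size_t len)
{
	struct pollfd pfds[64];	/* assume nr_children <= 64 in this sketch */
	int i;

	for (i = 0; i < nr_children; i++) {
		pfds[i].fd = child_fd[i];
		pfds[i].events = POLLIN;
	}
	if (poll(pfds, nr_children, -1) < 0)
		return;	/* e.g. EINTR: just retry on the next round */
	for (i = 0; i < nr_children; i++) {
		if (!(pfds[i].revents & (POLLIN | POLLHUP)))
			continue;
		/* exactly one read(2) per ready child, then move on */
		if (read(child_fd[i], buf, len) < 0 && errno != EAGAIN)
			continue;	/* error handling omitted in this sketch */
	}
}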

What am I missing?

-Peff


Re: [PATCHv3 02/11] run-command: report failure for degraded output just once

2015-11-04 Thread Junio C Hamano
Stefan Beller  writes:

> Another approach would be to test if we can set to non blocking and if
> that is not possible, do not buffer it, but redirect the subcommand
> directly to stderr of the calling process.
>
> if (set_nonblocking(pp->children[i].process.err) < 0) {
>	pp->children[i].process.err = 2;
>	degraded_parallelism = 1;
> }
>
> and once we observe the degraded_parallelism flag, we can only
> schedule a maximum of one job at a time, having direct output?

I would even say that on a platform that is _capable_ of setting fd
non-blocking, we should signal a grave error and die if an attempt
to do so fails, period.

On the other hand, on a platform that is known to be incapable
(e.g. lacks SETFL or NONBLOCK), we have two options.

1. If we can arrange to omit the intermediary buffer processing
   without butchering the flow of the main logic with many
   #ifdef..#endif, then that would make a lot of sense to do so, and
   running the processes in parallel with mixed output might be OK.
   It may not be very nice, but should be an acceptable compromise.

2. If we need to sprinkle conditional compilation all over the place
   to do so, then I do not think it is worth it.  Instead, we should
   keep a single code structure, and forbid setting numtasks to more
   than one, which would also remove the need for nonblock IO.

Either way, bringing "parallelism with sequential output" to
platforms without nonblock IO can be left for a later day, when we
find either (1) a good approach that does not require nonblock IO to
do this, or (2) a good approach to do a nonblock IO on these
platforms (we know about Windows, but there may be others; I dunno).



Re: [PATCHv3 02/11] run-command: report failure for degraded output just once

2015-11-04 Thread Junio C Hamano
Stefan Beller  writes:

> So more like:
>
> if (platform_capable_non_blocking_IO())
>	set_nonblocking_or_die(pp->children[i].process.err);
> else
>	pp->children[i].process.err = 2; /* ugly intermixed output is possible */

When I mentioned #if..#endif, I didn't mean it as a dogmatic
"conditional compilation is wrong" sense.  It was more about "the
high-level flow of logic should not have to know too much about
platform peculiarities".  As the platform_capable_non_blocking_IO()
function would be a constant function after you compile it for a
single platform, if you add 10 instances of such if/else in a patch
that adds 250 lines, unless the change is to add a set of lowest
level helpers to be called from the higher-level flow of logic so
that the callers do not have to know about the platform details,
that's just as bad as adding 10 instances of #if..#endif.
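
(For illustration, one shape such a lowest-level helper could take; a
sketch only, assuming git's struct child_process with its int err
field, and a hypothetical helper name.)

#include "git-compat-util.h"
#include "run-command.h"

/* Callers in the main flow never see the platform detail. */
static void pp_prepare_child_err(struct child_process *child)
{
#if defined(O_NONBLOCK) && defined(F_GETFL)
	child->err = -1;	/* capture via pipe; made nonblocking later */
#else
	child->err = 2;		/* no nonblock IO: write straight to stderr */
#endif
}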

>> On the other hand, on a platform that is known to be incapable
>> (e.g. lacks SETFL or NONBLOCK), we have two options.
>>
>> 1. If we can arrange to omit the intermediary buffer processing
>>without butchering the flow of the main logic with many
>>#ifdef..#endif, then that would make a lot of sense to do so, and
>>running the processes in parallel with mixed output might be OK.
>>It may not be very nice, but should be an acceptable compromise.
>
> From what I hear this kind of output is very annoying. (One of the
> main complaints of repo users, besides missing atomic fetch transactions.)

When (1) "parallelism with sequential output" is the desired
outcome, (2) on some platforms we haven't found a way to achieve
both, and (3) a non-sequential output is unacceptable, then
parallelism has to give :-(.

I was getting an impression from your "not buffer" suggestion that
"sequential output" would be the one that can be sacrificed, but
that is OK.  Until we find a way to achieve both at the same time,
achieving only either one or the other is better than achieving
nothing.

>> Either way, bringing "parallelism with sequential output" to
>> platforms without nonblock IO can be left for a later day, when we
>> find either (1) a good approach that does not require nonblock IO to
>> do this, or (2) a good approach to do a nonblock IO on these
>> platforms (we know about Windows, but there may be others; I dunno).


Re: When a file was locked by some program, git will work stupidly

2015-11-04 Thread Lasse Makholm
On 2 November 2015 at 15:33, Randall S. Becker  wrote:
>
> On November-01-15 11:57 PM dayong xie wrote:
>>To be specific
>>In my Unity project, there is a native plugin, and plugin's extension is
>>.dll, and this plugin is under git version control, when Unity is running,
>>the plugin file will be locked.
>>If i merge another branch, which contains modification of the plugin, git
>>will report error, looks like:
>>error: unable to unlink old 'xxx/xxx.dll' (Permission denied)
>>This is not bad, however, the unfinished merge action will not revert by
>>git, a lot of changes produced in repository.
>>usually it makes me crazy, even worse, some of my partners are not good at
>>using git.
>>Of course, this problem can be avoided by quit Unity, but not every time we
>>can remember. In my opinion, git should revert the unfinished action when
>>the error occurred, not just stop.
>
> What version of Unity (or Unity Pro) are you using? Is this experienced with 
> the Commit in MonoDevelop/Visual Studio or are you using a separate git 
> client?
>
> I have found similar issues in some versions of Unity and MonoDevelop or 
> Visual Studio not saving all files, especially the project files until you 
> have fully exited - nothing to do with git, but your git commits may not 
> contain complete images of your change.
>
> When I use git with Unity, I either have the source committed through 
> MonoDevelop (no issues) if my changes are source-only, or if I have assets 
> and project changes, then I do exit completely so that I am sure Unity 
> flushes everything to disk and I can get a single atomic commit with all the 
> Unity and C# bits using SourceTree or gitk.

Ouch. That sounds broken. Do you have a case number with Unity for
this? I work at Unity (though not on the editor) and can ask around...

> OTOH I'm not sure you really should be storing build products out of 
> MonoDevelop or Visual Studio in your git repository. If the DLL can be 
> rebuilt automatically on the client - usual answer is yes - then let it. 
> Handle the release build separately - at least in a separate branch that does 
> not involve having the Unity editor open to get in the way.
>
> In my environment, I have added *.dll (and other stuff) to .gitignore so that 
> I do not track dll changes - they get built on demand instead. There are 
> recommended ways of using git in the Unity forums.

The way I'm reading this, the plugin comes from a third party and is
not built as part of the project. I could be wrong though.

Dayong - do you have a case number with Unity for your problem?

And just to make this not completely off-topic: I agree that git
should ideally do a proper rollback in case of a failed (as opposed to
conflicted) merge, but I'm not sure what the precedent is for handling
failure cases like this.

Perhaps this is more easily solved, for now at least, by helping the
people who are not comfortable with git set up an alias as an
easier-to-remember way of saying e.g. git reset --hard HEAD ?

/Lasse

>
> Cheers,
> Randall


Re: [PATCHv3 02/11] run-command: report failure for degraded output just once

2015-11-04 Thread Junio C Hamano
Jeff King  writes:

> So I'm not sure I see why we need to be non-blocking at all here, if we
> are correctly hitting poll() and doing a single read on anybody who
> claims to be ready (rather than trying to soak up all of their available
> data), then we should never block, and we should never starve one
> process (even without blocking, we could be in a busy loop slurping from
> A and starve B, but by hitting the descriptors in round-robin for each
> poll(), we make sure they all progress).
>
> What am I missing?

I've always assumed that the original reason why we wanted to set
the fd to nonblock was because poll(2) only tells us there is
something to read (even a single byte), and the xread_nonblock()
call strbuf_read_once() makes with the default size of 8KB is
allowed to consume all available bytes and then get stuck waiting
for the remainder of 8KB before returning.
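
(For reference, strbuf_read_once() in this series is essentially the
following single-read helper; quoted from memory, so treat it as a
sketch.)

ssize_t strbuf_read_once(struct strbuf *sb, int fd, size_t hint)
{
	ssize_t cnt;

	strbuf_grow(sb, hint ? hint : 8192);
	cnt = xread_nonblock(fd, sb->buf + sb->len, sb->alloc - sb->len - 1);
	if (cnt > 0)
		strbuf_setlen(sb, sb->len + cnt);
	return cnt;
}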

If the read(2) in xread_nonblock() always returns as soon as we
receive as much as there is data available without waiting for any
more, ignoring the size of the buffer (rather, taking the size of
the buffer only as the upper bound), then there is no need for
nonblock anywhere.

So perhaps the original reasoning of doing nonblock was faulty, you
are saying?


File owner/group and git

2015-11-04 Thread David Turner
In unpack-trees.c, in verify_uptodate_1, we check ie_match_stat.  This
returns OWNER_CHANGED if a file has changed ownership since the index
was updated.  Do we actually care about that particular case?  Or really
anything other than DATA_CHANGED?

(We noticed this due to a bug in our watchman code, which fakes up the
owners, but it's a general question).



Re: File owner/group and git

2015-11-04 Thread Junio C Hamano
David Turner  writes:

> In unpack-trees.c, in verify_uptodate_1, we check ie_match_stat.  This
> returns OWNER_CHANGED if a file has changed ownership since the index
> was updated.  Do we actually care about that particular case?  Or really
> anything other than DATA_CHANGED?

That's 10-year-old code and there aren't that many people left
who can explain the original rationale, I am afraid ;-)

In general, "Do we actually care?" is not the question we ask in
this area of the code.  "Does it help us to catch real changes, or
does it change spuriously to make it too unreliable a signal to be
useful?" is the question that drives the design of this part of the
system.

DATA_CHANGED is "we know the contents are different without even
looking at the data".  If the size is different from the last time
we hashed the data, the contents must have changed.  The inverse is
not true (and that is half of the "racy git" issue).

The other *_CHANGED bits are finely classified only because originally
we didn't really know which ones were useful to treat as notable change
events, and the "changed" variable had a sufficient number of bits to
hold different classifications, so that we could pick and choose which
ones we truly care about.  We knew MTIME was useful in the sense that
even if the size is the same, an updated mtime is a good enough
indication that the stuff has changed, even to the "make" utility.
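
(For reference, the classification bits in cache.h; quoted from
memory, so treat as a sketch.)

#define MTIME_CHANGED	0x0001
#define CTIME_CHANGED	0x0002
#define OWNER_CHANGED	0x0004
#define MODE_CHANGED	0x0008
#define INODE_CHANGED	0x0010
#define DATA_CHANGED	0x0020
#define TYPE_CHANGED	0x0040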

INODE and CTIME are not so stable on some filesystems (e.g. inum may
not be stable on a network share across remount) and in some
environments (e.g. some virus scanners touch ctime to mark scanned
files, cf. 1ce4790b), and would trigger false positives too often to
be useful.  We always paid attention to them initially, but there
are configurations to tell Git not to raise them these days.

OWNER probably falls into a category that is stable enough to be
useful, as the most likely way for it to change is not by running
"chown" on the file in-place (which does not change the contents),
but by running "mv" to drop another file owned by somebody else to
the original location (which likely does change the contents).  At
the same time, "mv" a different file into the path would likely
trigger changes to INODE and MTIME as well, so it cannot be more
than a belt-and-suspenders measure to catch modification.  In that
sense ignoring OWNER would not hurt too much.

If it changes spuriously to make it too unreliable a signal to be
useful, it certainly is OK to introduce a knob to ignore it.  It
might even make sense to ignore it unconditionally if the false hit
happens too frequently, but offhand my gut reaction is that there
may be something wrong in the environment (i.e. system outside Git
in which Git runs) if owner/group changes spuriously to cause
issues.



Re: [PATCH v6 25/25] refs: break out ref conflict checks

2015-11-04 Thread Michael Haggerty
On 11/04/2015 10:01 PM, David Turner wrote:
> On Tue, 2015-11-03 at 08:40 +0100, Michael Haggerty wrote:
>> + * extras and skip must be sorted lists of reference names. Either one
>> + * can be NULL, signifying the empty list.
>> + */
> 
> My version had:
> 
> "skip can be NULL; extras cannot."
> 
> The first thing that function does is:
> string_list_find_insert_index(extras, dirname, 0)
> 
> And that crashes when extras is null.  So I think my version is correct
> here.

We're talking about the function find_descendant_ref(), which was added
in this patch, right? Because the first thing that function does is

+   if (!extras)
+   return NULL;

(This guard was in your version, too.) Also, the callsite doesn't
protect against extras==NULL. So either we're talking about two
different things here, or I disagree with you.

> Other than that, I've reviewed both the patches themselves and the
> overall diff and everything looks good to me.

Thanks!

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu



Re: [PATCH 1/2] run-parallel: rename set_nonblocking to set_nonblocking_or_die

2015-11-04 Thread Junio C Hamano
Torsten Bögershausen  writes:

> (Jumping into an old discussion, I may be off topic)

I think this is exactly the latest "I wonder" from Peff, to which I
said "well, perhaps we didn't need nonblock after all from the
beginning".

>> And this works regardless of whether the fd is blocking or not, so from
>> that point of view, set_nonblocking() is not needed at all.
>>
>> The major question is whether poll() works under Windows (and I
>> haven't found time to dig further)

;-)


[PATCH v3 2/4] upload-pack: strip refs before calling ref_is_hidden()

2015-11-04 Thread Lukas Fleischer
Make hideRefs handling in upload-pack consistent with the behavior
described in the documentation by stripping refs before comparing them
with prefixes in hideRefs.

Signed-off-by: Lukas Fleischer 
---
 upload-pack.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index d0bc3ca..4ca960e 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -692,7 +692,7 @@ static int mark_our_ref(const char *refname, const struct object_id *oid)
 {
struct object *o = lookup_unknown_object(oid->hash);
 
-   if (ref_is_hidden(refname)) {
+   if (refname && ref_is_hidden(refname)) {
o->flags |= HIDDEN_REF;
return 1;
}
@@ -703,7 +703,7 @@ static int mark_our_ref(const char *refname, const struct object_id *oid)
 static int check_ref(const char *refname, const struct object_id *oid,
 int flag, void *cb_data)
 {
-   mark_our_ref(refname, oid);
+   mark_our_ref(strip_namespace(refname), oid);
return 0;
 }
 
@@ -726,7 +726,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
const char *refname_nons = strip_namespace(refname);
struct object_id peeled;
 
-   if (mark_our_ref(refname, oid))
+   if (mark_our_ref(refname_nons, oid))
return 0;
 
if (capabilities) {
-- 
2.6.2



[PATCH v3 4/4] t5509: add basic tests for hideRefs

2015-11-04 Thread Lukas Fleischer
Test whether regular and full hideRefs patterns work as expected when
namespaces are used.

Helped-by: Eric Sunshine 
Signed-off-by: Lukas Fleischer 
---
 t/t5509-fetch-push-namespaces.sh | 41 
 1 file changed, 41 insertions(+)

diff --git a/t/t5509-fetch-push-namespaces.sh b/t/t5509-fetch-push-namespaces.sh
index cc0b31f..bc44ac3 100755
--- a/t/t5509-fetch-push-namespaces.sh
+++ b/t/t5509-fetch-push-namespaces.sh
@@ -82,4 +82,45 @@ test_expect_success 'mirroring a repository using a ref namespace' '
)
 '
 
+test_expect_success 'hide namespaced refs with transfer.hideRefs' '
+   GIT_NAMESPACE=namespace \
+   git -C pushee -c transfer.hideRefs=refs/tags \
+   ls-remote "ext::git %s ." >actual &&
+   printf "$commit1\trefs/heads/master\n" >expected &&
+   test_cmp expected actual
+'
+
+test_expect_success 'check that transfer.hideRefs does not match unstripped refs' '
+   GIT_NAMESPACE=namespace \
+   git -C pushee -c 
transfer.hideRefs=refs/namespaces/namespace/refs/tags \
+   ls-remote "ext::git %s ." >actual &&
+   printf "$commit1\trefs/heads/master\n" >expected &&
+   printf "$commit0\trefs/tags/0\n" >>expected &&
+   printf "$commit1\trefs/tags/1\n" >>expected &&
+   test_cmp expected actual
+'
+
+test_expect_success 'hide full refs with transfer.hideRefs' '
+   GIT_NAMESPACE=namespace \
+   git -C pushee -c transfer.hideRefs="^refs/namespaces/namespace/refs/tags" \
+   ls-remote "ext::git %s ." >actual &&
+   printf "$commit1\trefs/heads/master\n" >expected &&
+   test_cmp expected actual
+'
+
+test_expect_success 'try to update a hidden ref' '
+   test_config -C pushee transfer.hideRefs refs/heads/master &&
+   test_must_fail git -C original push pushee-namespaced master
+'
+
+test_expect_success 'try to update a ref that is not hidden' '
+   test_config -C pushee transfer.hideRefs refs/namespaces/namespace/refs/heads/master &&
+   git -C original push pushee-namespaced master
+'
+
+test_expect_success 'try to update a hidden full ref' '
+   test_config -C pushee transfer.hideRefs "^refs/namespaces/namespace/refs/heads/master" &&
+   test_must_fail git -C original push pushee-namespaced master
+'
+
 test_done
-- 
2.6.2



[PATCH v3 3/4] Add support for matching full refs in hideRefs

2015-11-04 Thread Lukas Fleischer
In addition to matching stripped refs, one can now add hideRefs patterns
that the full (unstripped) ref is matched against. To distinguish
between stripped and full matches, those new patterns must be prefixed
with a circumflex (^).

This commit also removes support for the undocumented and unintended
hideRefs settings "have" (suppressing all "have" lines) and
"capabilities^{}" (suppressing the capabilities line).

Signed-off-by: Lukas Fleischer 
---
 Documentation/config.txt |  3 ++-
 builtin/receive-pack.c   | 27 +--
 refs.c   | 15 ---
 refs.h   | 10 +-
 upload-pack.c| 13 -
 5 files changed, 52 insertions(+), 16 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 74a81e0..d816338 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2691,7 +2691,8 @@ prefix `refs/heads/master` is specified in `transfer.hideRefs` and the current
 namespace is `foo`, then `refs/namespaces/foo/refs/heads/master` is omitted
 from the advertisements but `refs/heads/master` and
 `refs/namespaces/bar/refs/heads/master` are still advertised as so-called
-"have" lines.
+"have" lines. In order to match refs before stripping, add a `^` in front of
+the ref name. If you combine `!` and `^`, `!` must be specified first.
 
 transfer.unpackLimit::
When `fetch.unpackLimit` or `receive.unpackLimit` are
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index bcb624b..f06f70a 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -195,9 +195,6 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 
 static void show_ref(const char *path, const unsigned char *sha1)
 {
-   if (ref_is_hidden(path))
-   return;
-
if (sent_capabilities) {
packet_write(1, "%s %s\n", sha1_to_hex(sha1), path);
} else {
@@ -219,9 +216,14 @@ static void show_ref(const char *path, const unsigned char *sha1)
}
 }
 
-static int show_ref_cb(const char *path, const struct object_id *oid, int flag, void *unused)
+static int show_ref_cb(const char *path_full, const struct object_id *oid,
+  int flag, void *unused)
 {
-   path = strip_namespace(path);
+   const char *path = strip_namespace(path_full);
+
+   if (ref_is_hidden(path, path_full))
+   return 0;
+
/*
 * Advertise refs outside our current namespace as ".have"
 * refs, so that the client can use them to minimize data
@@ -1195,16 +1197,29 @@ static int iterate_receive_command_list(void *cb_data, unsigned char sha1[20])
 
 static void reject_updates_to_hidden(struct command *commands)
 {
+   struct strbuf refname_full = STRBUF_INIT;
+   size_t prefix_len;
struct command *cmd;
 
+   strbuf_addstr(&refname_full, get_git_namespace());
+   prefix_len = refname_full.len;
+
for (cmd = commands; cmd; cmd = cmd->next) {
-   if (cmd->error_string || !ref_is_hidden(cmd->ref_name))
+   if (cmd->error_string)
+   continue;
+
+   strbuf_setlen(&refname_full, prefix_len);
+   strbuf_addstr(&refname_full, cmd->ref_name);
+
+   if (!ref_is_hidden(cmd->ref_name, refname_full.buf))
continue;
if (is_null_sha1(cmd->new_sha1))
cmd->error_string = "deny deleting a hidden ref";
else
cmd->error_string = "deny updating a hidden ref";
}
+
+   strbuf_release(&refname_full);
 }
 
 static int should_process_cmd(struct command *cmd)
diff --git a/refs.c b/refs.c
index 72d96ed..892fffb 100644
--- a/refs.c
+++ b/refs.c
@@ -321,7 +321,7 @@ int parse_hide_refs_config(const char *var, const char *value, const char *section)
return 0;
 }
 
-int ref_is_hidden(const char *refname)
+int ref_is_hidden(const char *refname, const char *refname_full)
 {
int i;
 
@@ -329,6 +329,7 @@ int ref_is_hidden(const char *refname)
return 0;
for (i = hide_refs->nr - 1; i >= 0; i--) {
const char *match = hide_refs->items[i].string;
+   const char *subject;
int neg = 0;
int len;
 
@@ -337,10 +338,18 @@ int ref_is_hidden(const char *refname)
match++;
}
 
-   if (!starts_with(refname, match))
+   if (*match == '^') {
+   subject = refname_full;
+   match++;
+   } else {
+   subject = refname;
+   }
+
+   /* refname can be NULL when namespaces are used. */
+   if (!subject || !starts_with(subject, match))
continue;
len = strlen(match);
-   if (!refname[len] || refname[len] == '/')
+   if (!subject[len] || subject[len] == '/')

[PATCH v3 0/4] Improve hideRefs when used with namespaces

2015-11-04 Thread Lukas Fleischer
This is a first set of patches improving documentation and behavior of
the transfer.hideRefs feature as discussed in [1]. In particular,
hideRefs is changed to generally match stripped refs by default and
match full refs when prefixed with "^". The documentation is updated
accordingly. Basic tests are added.

Changes since v2:

* Reword the additions to the documentation as suggested by Junio.
* Fix the return value in show_ref_cb().

[1] http://marc.info/?l=git&m=144604694223920

Lukas Fleischer (4):
  Document the semantics of hideRefs with namespaces
  upload-pack: strip refs before calling ref_is_hidden()
  Add support for matching full refs in hideRefs
  t5509: add basic tests for hideRefs

 Documentation/config.txt |  9 +
 builtin/receive-pack.c   | 27 --
 refs.c   | 15 ---
 refs.h   | 10 +-
 t/t5509-fetch-push-namespaces.sh | 41 
 upload-pack.c| 13 -
 6 files changed, 100 insertions(+), 15 deletions(-)

-- 
2.6.2



Re: [PATCH 1/2] run-parallel: rename set_nonblocking to set_nonblocking_or_die

2015-11-04 Thread Torsten Bögershausen

On 11/04/2015 11:43 PM, Stefan Beller wrote:

Discussion showed that a warning is not enough; we need to die in case
of blocking output, as blocking output can lock up the child processes.

Junio wrote:

Imagine that we are running two things A and B at the same time.  We
ask poll(2) and it says both A and B have some data ready to be
read, and we try to read from A.  strbuf_read_once() would try to
read up to 8K, relying on the fact that you earlier set the IO to be
nonblock.

(Jumping into an old discussion, I may be off topic)

When we want to make really sure not to get stuck, we can do something like this:

ssize_t xread_nonblock(int fd, void *buf, size_t len)
{
	ssize_t nr;
	if (len > MAX_IO_SIZE)
		len = MAX_IO_SIZE;
	nr = read(fd, buf, len);
	if (nr < 0) {
		if (errno == EWOULDBLOCK)
			errno = EAGAIN;
	}
	return nr;
}

Once poll() returns with POLLIN set on a fd, we can safely call read() once
without getting stuck.

"Data can be read without blocking".

And this works regardless of whether the fd is blocking or not, so from that
point of view, set_nonblocking() is not needed at all.


The major question is whether poll() works under Windows (and I haven't
found time to dig further).




[PATCH v3 1/4] Document the semantics of hideRefs with namespaces

2015-11-04 Thread Lukas Fleischer
Right now, there is no clear definition of how transfer.hideRefs should
behave when a namespace is set. Explain that hideRefs prefixes match
stripped names in that case. This is how hideRefs patterns are currently
handled in receive-pack.

Signed-off-by: Lukas Fleischer 
---
 Documentation/config.txt | 8 
 1 file changed, 8 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 1204072..74a81e0 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2684,6 +2684,14 @@ You may also include a `!` in front of the ref name to 
negate the entry,
 explicitly exposing it, even if an earlier entry marked it as hidden.
 If you have multiple hideRefs values, later entries override earlier ones
 (and entries in more-specific config files override less-specific ones).
++
+If a namespace is set, the namespace prefix is stripped from each reference
+before references are matched against hideRefs patterns. For example, if the
+prefix `refs/heads/master` is specified in `transfer.hideRefs` and the current
+namespace is `foo`, then `refs/namespaces/foo/refs/heads/master` is omitted
+from the advertisements but `refs/heads/master` and
+`refs/namespaces/bar/refs/heads/master` are still advertised as so-called
+"have" lines.
 
 transfer.unpackLimit::
When `fetch.unpackLimit` or `receive.unpackLimit` are
-- 
2.6.2



Re: [PATCHv3 02/11] run-command: report failure for degraded output just once

2015-11-04 Thread Jeff King
On Wed, Nov 04, 2015 at 06:05:17PM -0800, Junio C Hamano wrote:

> I've always assumed that the original reason why we wanted to set
> the fd to nonblock was because poll(2) only tells us there is
> something to read (even a single byte), and the xread_nonblock()
> call strbuf_read_once() makes with the default size of 8KB is
> allowed to consume all available bytes and then get stuck waiting
> for the remainder of 8KB before returning.
> 
> If the read(2) in xread_nonblock() always returns as soon as we
> receive as much as there is data available without waiting for any
> more, ignoring the size of the buffer (rather, taking the size of
> the buffer only as the upper bound), then there is no need for
> nonblock anywhere.

This latter paragraph was my impression of how pipe reading generally
worked, for blocking or non-blocking. That is, if there is data, both
cases return what we have (up to the length specified by the user), and
it is only when there is _no_ data that we might choose to block.

It's easy to verify experimentally. E.g.:

  perl -e 'while(1) { syswrite(STDOUT, "a", 1); sleep(1); }' |
  strace perl -e 'while(1) { sysread(STDIN, my $buf, 1024) }'

should show a series of 1-byte reads. But of course that only shows that
it works on my system[1], not everywhere.

POSIX implies it is the case in the definition of read[2] in two ways:

  1. The O_NONBLOCK behavior for pipes is mentioned only when dealing
 with empty pipes.

  2. Later, it says:

   The value returned may be less than nbyte if the number of bytes
   left in the file is less than nbyte, if the read() request was
   interrupted by a signal, or if the file is a pipe or FIFO or
   special file and has fewer than nbyte bytes immediately available
   for reading.

 That is not explicit, but the "immediately" there seems to imply
 it.

> So perhaps the original reasoning of doing nonblock was faulty, you
> are saying?

Exactly. And therefore a convenient way to deal with the portability
issue is to get rid of it. :)

-Peff


Re: About global --progress option

2015-11-04 Thread Junio C Hamano
Edmundo Carmona Antoranz  writes:

> Which would be the correct development path?
>
> - Two-step process: First step, implement global --[no-]progress at
> the git level and also support the option from the builtins that
> already have it. Let it live like that for some releases (so that
> people have the time to dump using the builtin option and start using
> the global option) and then on the second step dump the builtin
> options and keep the global one.
>
> or
>
> - A single step that would remove --[no-]progress from all builtins
> that support it and would place it as a global git option?

Assuming by "dump" you mean "removal of support", I doubt either is
correct.

Existing users know that the way to squelch progress output from
their "git clone" is with "git clone --no-progress".  Introducing
another way to say the same thing, i.e. "git --no-progress clone" is
one thing, but I do not see what value you are adding to the system
by breaking what they know has worked forever and forcing them to
use "git --no-progress clone".  Why should they learn that the
"--no-checkout" option for example has to come after "clone" and
the "--no-progress" option has to come before?  Why should they
adjust their scripts and their finger memory to your whim?

Besides, I am not convinced that you are bringing in a net positive
value by allowing "git --no-progress clone".  You would need to
think what to do when you get two conflicting options (e.g. should
"git --no-progress clone --progress" give progress?  Why?), you
would need to explain to the users why the resulting code works the
way you made it work (most likely, "do nothing special") when the
global one is given to a command that does not give any progress
output, and you would need to make sure whatever answers you would
give to these questions are implemented consistently.  And you would
need more code to do so.  It is unclear to me what value the end
users get out of all that effort, if any, and if the value offered
to the end users outweigh the added complexity, additional code that
has to be maintained, and additional risk to introduce unnecessary
bugs.

A lot more useful thing to do in the same area with a lot smaller
amount of effort would be to find an operation that sometimes take a
long time to perform that does not show the progress report and
teach the codepath involved in the operation to show progress, I
would think.


Re: About global --progress option

2015-11-04 Thread Edmundo Carmona Antoranz
On Thu, Nov 5, 2015 at 12:11 AM, Junio C Hamano  wrote:
> Besides, I am not convinced that you are bringing in a net positive
> value by allowing "git --no-progress clone".  You would need to
> think what to do when you get two conflicting options (e.g. should
> "git --no-progress clone --progress" give progress?  Why?), you
> would need to explain to the users why the resulting code works the
> way you made it work (most likely, "do nothing special") when the
> global one is given to a command that does not give any progress
> output, and you would need to make sure whatever answers you would
> give to these questions are implemented consistently.  And you would
> need more code to do so.  It is unclear to me what value the end
> users get out of all that effort, if any, and if the value offered
> to the end users outweigh the added complexity, additional code that
> has to be maintained, and additional risk to introduce unnecessary
> bugs.

The contradictory case of git --progress whatever --no-progress (or
vice versa) was going to be my follow-up question. Dude, don't get too
far ahead in the conversation :-)

On the technical side, I think the global --progress option (and
removing the option from the builtins) would not add complexity but
the opposite: setting the flag would be done at the "main git" level,
and all the builtins would then forget about it and use progress
regardless (because deciding _if_ progress should be shown would not
be up to them anymore), so the code in the builtins to support the
option would be gone. It would certainly be more complex while keeping
both global and builtin options alive. Anyway, I do understand your
concerns and will stand down on the topic (as in global --progress
who???).

> A lot more useful thing to do in the same area with a lot smaller
> amount of effort would be to find an operation that sometimes take a
> long time to perform that does not show the progress report and
> teach the codepath involved in the operation to show progress, I
> would think.

Sounds interesting and on-topic (different direction). Do you have a
list of those paths that I could work on? At least a couple off the
top of your head? I can't complain about the progress I get in the
workflows I follow (although I could keep a closer eye out to catch
some operation that I know is taking place and is not showing any
progress; I'm used to it, so I don't complain about the lack of
progress). Right now I'm thinking about cherry-picking... sometimes it
can take a few seconds (or more, depending on the performance of some
of the boxes that I work on) and getting some progress there would be
nice. Will take a look at it. Nice way to get familiarized with the
code, by the way.

Thank you very much, Junio!


About global --progress option

2015-11-04 Thread Edmundo Carmona Antoranz
Hi, everybody!

Coming back from my patch about the --[no-]progress option for
checkout (it's already in next :-)), I'd like to build on the idea
of a single --[no-]progress option for git as a whole.

So, in order to know what/how things should be developed, let's start
with a tiny survey

Which would be the correct development path?

- Two-step process: First step, implement global --[no-]progress at
the git level and also support the option from the builtins that
already have it. Let it live like that for some releases (so that
people have the time to dump using the builtin option and start using
the global option) and then on the second step dump the builtin
options and keep the global one.

or

- A single step that would remove --[no-]progress from all builtins
that support it and would place it as a global git option?


Thanks in advance.


Re: [PATCH 1/2] run-parallel: rename set_nonblocking to set_nonblocking_or_die

2015-11-04 Thread Junio C Hamano
Junio C Hamano  writes:

> Torsten Bögershausen  writes:
>
>> (Jumping into an old discussion, I may be off topic)
>
> I think this is exactly the latest "I wonder" from Peff, to which I
> said "well, perhaps we didn't need nonblock after all from the
> beginning".
>
>> And this work regardless if the fd blocking or not, so from that point of 
>> view,
>> the set_nonblocking() is not needed at all.
>>
>> The major question is, if the poll() works under Windows, (and I
>> haven't found time to dig further)
>
> ;-)

Having read your message, I notice these disturbing passages in my
copy of the manual pages.

poll(2) has

BUGS
   See the discussion of spurious readiness notifications under
   the BUGS section of select(2).

and then select(2) has

   Under Linux, select() may report a socket file descriptor as
   "ready for reading", while nevertheless a subsequent read
   blocks.  This could for example happen when data has arrived
   but upon examination has wrong checksum and is discarded.
   There may be other circumstances in which a file descriptor
   is spuriously reported as ready.  Thus it may be safer to use
   O_NONBLOCK on sockets that should not block.

The world is not all Linux, so there may be systems where we do not
have to worry about the bug described here, but there still are some
Linux systems in use in the world, so we cannot assume the bug
described above will not matter, either.

So I am not convinced that set_nonblocking() is unnecessary X-<.
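
(That is the failure mode O_NONBLOCK defends against: with it set, a
spurious POLLIN costs an EAGAIN instead of a hang. A sketch, assuming
POSIX; the helper name is hypothetical.)

#include <errno.h>
#include <unistd.h>

/* After poll() reports POLLIN: with O_NONBLOCK set, a spurious
 * readiness notification yields -1/EAGAIN instead of blocking. */
static ssize_t read_one_nonblock(int fd, void *buf, size_t len)
{
	ssize_t nr;
	do {
		nr = read(fd, buf, len);
	} while (nr < 0 && errno == EINTR);
	if (nr < 0 && errno == EWOULDBLOCK)
		errno = EAGAIN;	/* caller treats this as "poll again later" */
	return nr;
}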



[PATCH v4 3/3] Move all the SHA1 implementations into one directory

2015-11-04 Thread atousa . p
From: Atousa Pahlevan Duprat 

The various SHA1 implementations were spread around in 3 directories.
This makes it easier to understand what implementations are
available at a glance.

Signed-off-by: Atousa Pahlevan Duprat 
---
 Makefile  |  10 +-
 block-sha1/sha1.c | 251 --
 block-sha1/sha1.h |  22 -
 cache.h   |   2 +-
 compat/sha1-chunked.c |  19 
 compat/sha1-chunked.h |   2 -
 ppc/sha1.c|  72 ---
 ppc/sha1.h|  25 -
 ppc/sha1ppc.S | 224 
 sha1/blk-sha1.c   | 251 ++
 sha1/blk-sha1.h   |  22 +
 sha1/chk-sha1.c   |  19 
 sha1/chk-sha1.h   |   2 +
 sha1/ppc-sha1.c   |  72 +++
 sha1/ppc-sha1.h   |  25 +
 sha1/ppc-sha1asm.S| 224 
 16 files changed, 621 insertions(+), 621 deletions(-)
 delete mode 100644 block-sha1/sha1.c
 delete mode 100644 block-sha1/sha1.h
 delete mode 100644 compat/sha1-chunked.c
 delete mode 100644 compat/sha1-chunked.h
 delete mode 100644 ppc/sha1.c
 delete mode 100644 ppc/sha1.h
 delete mode 100644 ppc/sha1ppc.S
 create mode 100644 sha1/blk-sha1.c
 create mode 100644 sha1/blk-sha1.h
 create mode 100644 sha1/chk-sha1.c
 create mode 100644 sha1/chk-sha1.h
 create mode 100644 sha1/ppc-sha1.c
 create mode 100644 sha1/ppc-sha1.h
 create mode 100644 sha1/ppc-sha1asm.S

diff --git a/Makefile b/Makefile
index 6a4ca59..94f74d7 100644
--- a/Makefile
+++ b/Makefile
@@ -1345,12 +1345,12 @@ ifdef APPLE_COMMON_CRYPTO
 endif
 
 ifdef BLK_SHA1
-   SHA1_HEADER = "block-sha1/sha1.h"
-   LIB_OBJS += block-sha1/sha1.o
+   SHA1_HEADER = "sha1/blk-sha1.h"
+   LIB_OBJS += sha1/blk-sha1.o
 else
 ifdef PPC_SHA1
-   SHA1_HEADER = "ppc/sha1.h"
-   LIB_OBJS += ppc/sha1.o ppc/sha1ppc.o
+   SHA1_HEADER = "sha1/ppc-sha1.h"
+   LIB_OBJS += sha1/ppc-sha1.o sha1/ppc-sha1asm.o
 else
 ifdef APPLE_COMMON_CRYPTO
COMPAT_CFLAGS += -DCOMMON_DIGEST_FOR_OPENSSL
@@ -1363,7 +1363,7 @@ endif
 endif
 
 ifdef SHA1_MAX_BLOCK_SIZE
-   LIB_OBJS += compat/sha1-chunked.o
+   LIB_OBJS += sha1/chk-sha1.o
BASIC_CFLAGS += -DSHA1_MAX_BLOCK_SIZE="$(SHA1_MAX_BLOCK_SIZE)"
 endif
 ifdef NO_PERL_MAKEMAKER
diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
deleted file mode 100644
index 22b125c..000
--- a/block-sha1/sha1.c
+++ /dev/null
@@ -1,251 +0,0 @@
-/*
- * SHA1 routine optimized to do word accesses rather than byte accesses,
- * and to avoid unnecessary copies into the context array.
- *
- * This was initially based on the Mozilla SHA1 implementation, although
- * none of the original Mozilla code remains.
- */
-
-/* this is only to get definitions for memcpy(), ntohl() and htonl() */
-#include "../git-compat-util.h"
-
-#include "sha1.h"
-
-#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
-
-/*
- * Force usage of rol or ror by selecting the one with the smaller constant.
- * It _can_ generate slightly smaller code (a constant of 1 is special), but
- * perhaps more importantly it's possibly faster on any uarch that does a
- * rotate with a loop.
- */
-
-#define SHA_ASM(op, x, n) ({ unsigned int __res; __asm__(op " %1,%0":"=r" (__res):"i" (n), "0" (x)); __res; })
-#define SHA_ROL(x,n)   SHA_ASM("rol", x, n)
-#define SHA_ROR(x,n)   SHA_ASM("ror", x, n)
-
-#else
-
-#define SHA_ROT(X,l,r) (((X) << (l)) | ((X) >> (r)))
-#define SHA_ROL(X,n)   SHA_ROT(X,n,32-(n))
-#define SHA_ROR(X,n)   SHA_ROT(X,32-(n),n)
-
-#endif
-
-/*
- * If you have 32 registers or more, the compiler can (and should)
- * try to change the array[] accesses into registers. However, on
- * machines with less than ~25 registers, that won't really work,
- * and at least gcc will make an unholy mess of it.
- *
- * So to avoid that mess which just slows things down, we force
- * the stores to memory to actually happen (we might be better off
- * with a 'W(t)=(val);asm("":"+m" (W(t))' there instead, as
- * suggested by Artur Skawina - that will also make gcc unable to
- * try to do the silly "optimize away loads" part because it won't
- * see what the value will be).
- *
- * Ben Herrenschmidt reports that on PPC, the C version comes close
- * to the optimized asm with this (ie on PPC you don't want that
- * 'volatile', since there are lots of registers).
- *
- * On ARM we get the best code generation by forcing a full memory barrier
- * between each SHA_ROUND, otherwise gcc happily get wild with spilling and
- * the stack frame size simply explode and performance goes down the drain.
- */
-
-#if defined(__i386__) || defined(__x86_64__)
-  #define setW(x, val) (*(volatile unsigned int *)(x) = (val))
-#elif defined(__GNUC__) && defined(__arm__)
-  #define setW(x, val) do { W(x) = (val); __asm__("":::"memory"); } while (0)
-#else
-  #define setW(x, 

Re: [PATCHv3 02/11] run-command: report failure for degraded output just once

2015-11-04 Thread Junio C Hamano
Jeff King  writes:

> POSIX implies it is the case in the definition of read[2] in two ways:
>
>   1. The O_NONBLOCK behavior for pipes is mentioned only when dealing
>  with empty pipes.
>
>   2. Later, it says:
>
>The value returned may be less than nbyte if the number of bytes
>left in the file is less than nbyte, if the read() request was
>interrupted by a signal, or if the file is a pipe or FIFO or
>special file and has fewer than nbyte bytes immediately available
>for reading.
>
>  That is not explicit, but the "immediately" there seems to imply
>  it.

We were reading the same book, but I was more worried about that
"may" there; it merely tells the caller of read(2) not to be alarmed
when the call returned without filling the entire buffer, without
mandating the implementation of read(2) never to block.

Having said that,...

>> So perhaps the original reasoning of doing nonblock was faulty, you
>> are saying?
>
> Exactly. And therefore a convenient way to deal with the portability
> issue is to get rid of it. :)

... I do like the simplification you alluded to in the other
message.  Not having to worry about the nonblock (at least until it
is found problematic in the real world) is a very good first step,
especially because the approach allows us to collectively make
progress by letting all of us in various platforms build and
experiment with "something that works".


[PATCH v4 1/3] Provide another level of abstraction for the SHA1 utilities.

2015-11-04 Thread atousa . p
From: Atousa Pahlevan Duprat 

The git source uses git_SHA1_Update() and friends to call
into the code that computes the hashes.  This can then be
mapped directly to an implementation that computes the hash,
such as platform_SHA1_Update(); or as we will do in a subsequent
patch, it can be mapped to something more complex that will in turn call
into the platform's SHA implementation.

Signed-off-by: Atousa Pahlevan Duprat 
---
 cache.h | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/cache.h b/cache.h
index a9aaa03..a934a2e 100644
--- a/cache.h
+++ b/cache.h
@@ -12,10 +12,21 @@
 
 #include SHA1_HEADER
 #ifndef git_SHA_CTX
-#define git_SHA_CTX	SHA_CTX
-#define git_SHA1_Init	SHA1_Init
-#define git_SHA1_Update	SHA1_Update
-#define git_SHA1_Final	SHA1_Final
+
+/* platform's underlying implementation of SHA1, could be OpenSSL,
+   blk_SHA, Apple CommonCrypto, etc...  */
+#define platform_SHA_CTX   SHA_CTX
+#define platform_SHA1_Init SHA1_Init
+#define platform_SHA1_Update   SHA1_Update
+#define platform_SHA1_FinalSHA1_Final
+
+/* git may call platform's underlying implementation of SHA1 directly,
+   or may call it through a wrapper */
+#define git_SHA_CTX	platform_SHA_CTX
+#define git_SHA1_Init	platform_SHA1_Init
+#define git_SHA1_Update	platform_SHA1_Update
+#define git_SHA1_Final	platform_SHA1_Final
+
 #endif
 
 #include <zlib.h>
-- 
2.4.9 (Apple Git-60)



Re: [PATCH] test: accept death by SIGPIPE as a valid failure mode

2015-11-04 Thread Jeff King
On Fri, Oct 30, 2015 at 02:22:14PM -0700, Junio C Hamano wrote:

> On a local host, the object/history transport code often talks over
> pipe with the other side.  The other side may notice some (expected)
> failure, send the error message either to our process or to the
> standard error and hang up.  In such codepaths, if timing were not
> unfortunate, our side would receive the report of (expected) failure
> from the other side over the pipe and die().  Otherwise, our side
> may still be trying to talk to it and would die with a SIGPIPE.
> 
> This was observed as an intermittent breakage in t5516 by a few
> people.
> 
> In the real-life scenario, either mode of death exits with a
> non-zero status, and the user would learn that the command failed.
> The test_must_fail helper should also know that dying with SIGPIPE
> is one of the valid failure modes when we are expecting the tested
> operation to notice problem and fail.

Sorry for the slow review; before commenting I wanted to dig into
whether this SIGPIPE ambiguity was avoidable in the first place.

I think the answer is "probably not". We do call write_or_die() pretty
consistently in the network-aware programs. So we could ignore SIGPIPE,
and then we would catch EPIPE (of course, we convert that into SIGPIPE
in many places, but we do not have to do so). But since the SIGPIPE
behavior is global, that carries the risk of us failing to check a write
against some other descriptor. It's probably not worth it.
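For concreteness, a rough sketch of that alternative (emphatically not
what git does today): ignore SIGPIPE process-wide and surface the
failure as EPIPE from write(), which every write site must then check.

#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Done once at startup; the global scope is exactly the risk noted
 * above: after this, *every* write in the process must check its
 * return value.
 */
static void tolerate_broken_pipes(void)
{
	signal(SIGPIPE, SIG_IGN);
}

/*
 * Returns 0 on success, -1 on error; EPIPE in errno means the reader
 * hung up, which the caller can report instead of dying by signal.
 */
static int write_fully(int fd, const char *buf, size_t len)
{
	while (len) {
		ssize_t n = write(fd, buf, len);
		if (n < 0 && errno == EINTR)
			continue;
		if (n <= 0)
			return -1;
		buf += n;
		len -= n;
	}
	return 0;
}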

Teaching the tests to handle both cases seems like a reasonable
workaround. Changing test_must_fail covers a lot of cases; I wondered if
there are other tests that would not want to silently cover up a SIGPIPE
death. But I could not really think of a plausible reason.

So I think your patch is the best thing to do.

-Peff


[PATCH v4 2/3] Limit the size of the data block passed to SHA1_Update()

2015-11-04 Thread atousa . p
From: Atousa Pahlevan Duprat 

Use the previous commit's abstraction mechanism for SHA1 to
support a chunked implementation of SHA1_Update() which limits
the amount of data in the chunk passed to SHA1_Update().

This is enabled by using the SHA1_MAX_BLOCK_SIZE envvar to
specify chunk size.  When using Apple's CommonCrypto library
this is enabled and set to 1GiB.

Signed-off-by: Atousa Pahlevan Duprat 
---
 Makefile | 13 +
 cache.h  | 17 +++--
 compat/apple-common-crypto.h |  4 
 compat/sha1-chunked.c| 19 +++
 compat/sha1-chunked.h|  2 ++
 5 files changed, 53 insertions(+), 2 deletions(-)
 create mode 100644 compat/sha1-chunked.c
 create mode 100644 compat/sha1-chunked.h

diff --git a/Makefile b/Makefile
index 04c2231..6a4ca59 100644
--- a/Makefile
+++ b/Makefile
@@ -141,6 +141,10 @@ all::
 # Define PPC_SHA1 environment variable when running make to make use of
 # a bundled SHA1 routine optimized for PowerPC.
 #
+# Define SHA1_MAX_BLOCK_SIZE to limit the amount of data that will be hashed
+# in one call to the platform's SHA1_Update(). e.g. APPLE_COMMON_CRYPTO
+# wants 'SHA1_MAX_BLOCK_SIZE=1024L*1024L*1024L' defined.
+#
 # Define NEEDS_CRYPTO_WITH_SSL if you need -lcrypto when using -lssl (Darwin).
 #
 # Define NEEDS_SSL_WITH_CRYPTO if you need -lssl when using -lcrypto (Darwin).
@@ -1335,6 +1339,11 @@ ifdef NO_POSIX_GOODIES
BASIC_CFLAGS += -DNO_POSIX_GOODIES
 endif
 
+ifdef APPLE_COMMON_CRYPTO
+   # Apple CommonCrypto requires chunking
+   SHA1_MAX_BLOCK_SIZE = 1024L*1024L*1024L
+endif
+
 ifdef BLK_SHA1
SHA1_HEADER = "block-sha1/sha1.h"
LIB_OBJS += block-sha1/sha1.o
@@ -1353,6 +1362,10 @@ endif
 endif
 endif
 
+ifdef SHA1_MAX_BLOCK_SIZE
+   LIB_OBJS += compat/sha1-chunked.o
+   BASIC_CFLAGS += -DSHA1_MAX_BLOCK_SIZE="$(SHA1_MAX_BLOCK_SIZE)"
+endif
 ifdef NO_PERL_MAKEMAKER
export NO_PERL_MAKEMAKER
 endif
diff --git a/cache.h b/cache.h
index a934a2e..182ac62 100644
--- a/cache.h
+++ b/cache.h
@@ -11,7 +11,7 @@
 #include "string-list.h"
 
 #include SHA1_HEADER
-#ifndef git_SHA_CTX
+#ifndef platform_SHA_CTX
 
 /* platform's underlying implementation of SHA1, could be OpenSSL,
blk_SHA, Apple CommonCrypto, etc...  */
@@ -20,15 +20,28 @@
 #define platform_SHA1_Update   SHA1_Update
 #define platform_SHA1_FinalSHA1_Final
 
+#endif
+
 /* git may call platform's underlying implementation of SHA1 directly,
or may call it through a wrapper */
+
+#ifndef git_SHA_CTX
+
+#ifdef SHA1_MAX_BLOCK_SIZE
+#include "compat/sha1-chunked.h"
+#define git_SHA_CTX	platform_SHA_CTX
+#define git_SHA1_Init	platform_SHA1_Init
+#define git_SHA1_Update	git_SHA1_Update_Chunked
+#define git_SHA1_Final	platform_SHA1_Final
+#else
 #define git_SHA_CTX	platform_SHA_CTX
 #define git_SHA1_Init	platform_SHA1_Init
 #define git_SHA1_Update	platform_SHA1_Update
 #define git_SHA1_Final	platform_SHA1_Final
-
 #endif
 
+#endif 
+
 #include <zlib.h>
 typedef struct git_zstream {
z_stream z;
diff --git a/compat/apple-common-crypto.h b/compat/apple-common-crypto.h
index c8b9b0e..d3fb264 100644
--- a/compat/apple-common-crypto.h
+++ b/compat/apple-common-crypto.h
@@ -16,6 +16,10 @@
 #undef TYPE_BOOL
 #endif
 
+#ifndef SHA1_MAX_BLOCK_SIZE
+#error Using Apple Common Crypto library requires setting SHA1_MAX_BLOCK_SIZE
+#endif
+
 #ifdef APPLE_LION_OR_NEWER
 #define git_CC_error_check(pattern, err) \
do { \
diff --git a/compat/sha1-chunked.c b/compat/sha1-chunked.c
new file mode 100644
index 000..6adfcfd
--- /dev/null
+++ b/compat/sha1-chunked.c
@@ -0,0 +1,19 @@
+#include "cache.h"
+
+int git_SHA1_Update_Chunked(platform_SHA_CTX *c, const void *data, size_t len)
+{
+   size_t nr;
+   size_t total = 0;
+   const char *cdata = (const char*)data;
+
+   while (len) {
+   nr = len;
+   if (nr > SHA1_MAX_BLOCK_SIZE)
+   nr = SHA1_MAX_BLOCK_SIZE;
+   platform_SHA1_Update(c, cdata, nr);
+   total += nr;
+   cdata += nr;
+   len -= nr;
+   }
+   return total;
+}
diff --git a/compat/sha1-chunked.h b/compat/sha1-chunked.h
new file mode 100644
index 000..7b2df28
--- /dev/null
+++ b/compat/sha1-chunked.h
@@ -0,0 +1,2 @@
+
+int git_SHA1_Update_Chunked(platform_SHA_CTX *c, const void *data, size_t len);
-- 
2.4.9 (Apple Git-60)



Re: [PATCH 1/2] run-parallel: rename set_nonblocking to set_nonblocking_or_die

2015-11-04 Thread Jeff King
On Wed, Nov 04, 2015 at 10:19:43PM -0800, Junio C Hamano wrote:

> Having read your message, I notice these disturbing passages in my
> copy of manual pages.
> 
> poll(2) has
> 
> BUGS
>See the discussion of spurious readiness notifications under
>the BUGS section of select(2).
> 
> and then select(2) has
> 
>Under Linux, select() may report a socket file descriptor as
>"ready for read‐ ing", while nevertheless a subsequent read
>blocks.  This could for example happen when data has arrived
>but upon examination has wrong checksum and is discarded.
>There may be other circumstances in which a file descriptor
>is spuriously reported as ready.  Thus it may be safer to use
>O_NONBLOCK on sock‐ ets that should not block.
> 
> The world is not all Linux, so there may be systems that we do not
> have to worry about the bug described here, but there still are some
> Linux systems in use in the world, so we cannot assume the bug
> described above will not matter, either.
> 
> So I am not convinced that set_nonblocking() is unnecessary X-<.

Yes, I've heard of this bug, and I wouldn't be surprised if it is
present on many systems. But notice that it is talking about sockets,
whereas here I think we are dealing only with pipes. Pipes generally
have much simpler and more predictable behavior, and their behaviors are
more closely mandated by POSIX.

Of course that's all hearsay, and I can't quote you chapter and verse of
POSIX (as if that would be enough, anyway) to prove it. But personally,
I would be comfortable dropping the non-blocking bits completely from
Stefan's series and adding them back in later if anybody can actually
find a real-world example that doesn't conform.

Of course, simply omitting the O_NONBLOCK bits on Windows would also
work if we can guarantee the Windows behavior. It means we're carrying
around extra code that may or may not be doing anything useful, but it's
not _that_ much code.

-Peff


Re: About global --progress option

2015-11-04 Thread Jeff King
On Thu, Nov 05, 2015 at 12:47:22AM -0600, Edmundo Carmona Antoranz wrote:

> On the technical side, I think the global --progress option (and
> removing the option from the builtins) would not add complexity but
> the opposite because setting the flag would be done at the "main git"
> level and then all the builtins would just forget about it and would
> use progress regardless (cause deciding _if_ progress should be shown
> or not won't be up to them anymore) so the code from the builtins to
> support the option would be gone. It would certainly be more complex
> while keeping global and builtin options alive. Anyway, I do
> understand your concerns and will stand down on the topic (as in
> global --progress who???).

I think you are missing one important element, which is that git
programs do not all share a memory space with the main git binary. So
you cannot simply set the "progress" variable in the main program and
expect everybody to see it. Running "git foo" may invoke a separate
"git-foo" program, written in another language entirely.

For that reason, options to the main git binary typically set an
environment variable which is communicated to all sub-processes. For an
example, see how "--literal-pathspecs" is implemented. So it actually
does add some complexity.
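To make that mechanism concrete, here is a minimal sketch modeled on
the --literal-pathspecs handling; note that "GIT_PROGRESS" is a
variable name invented for illustration, not an existing git interface:

#include <stdlib.h>
#include <string.h>

/*
 * In the main git binary: turn the global option into an environment
 * variable, which child processes (builtins and external git-foo
 * scripts alike) inherit.
 */
static void handle_progress_option(const char *arg)
{
	if (!strcmp(arg, "--progress"))
		setenv("GIT_PROGRESS", "1", 1);
	else if (!strcmp(arg, "--no-progress"))
		setenv("GIT_PROGRESS", "0", 1);
}

/*
 * In any command that can show progress: consult the environment,
 * falling back to the command's own default (e.g. isatty(2)).
 */
static int progress_requested(int fallback)
{
	const char *v = getenv("GIT_PROGRESS");
	return v ? atoi(v) : fallback;
}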

That being said, the environment variable can be a good thing. For
example, imagine I have a script "git-foo" which runs several commands,
including "git-fetch". It's cumbersome for "git-foo" to take a
"--progress" option and then pass that down to "git-fetch". If you could
instead run:

  git --no-progress foo

and have that flag magically propagate to any git sub-programs which
care about showing progress, then that could perhaps make the feature
worthwhile (I say perhaps because while it seems plausible to me, I have
not heard of anyone actually wanting this feature in practice).

But adding in "git --progress" is an orthogonal decision to removing
support for "git  --progress". I do not see any big advantage to
the latter at this point, and a lot of potential negatives as we break
scripts and user expectations.

-Peff


Re: [PATCH v2 4/4] t5509: add basic tests for hideRefs

2015-11-04 Thread Jeff King
On Wed, Nov 04, 2015 at 02:36:54PM -0500, Eric Sunshine wrote:

> > +test_expect_success 'try to update a hidden ref' '
> > +   test_config -C pushee transfer.hideRefs refs/heads/master &&
> > +   test_must_fail git -C original push pushee-namespaced master
> 
> In above tests, you use -c to set the configuration temporarily for
> the git invocation, but not in this and following tests. Is that
> because the -c isn't visible to sub-commands which git-push invokes?
> (Genuine question; I want to make sure I understand the reasoning.)

Yes, we explicitly clear "-c" variables when we cross repo boundaries.
You can do it like:

  git push --receive-pack='git -c transfer.hideRefs=...  receive-pack'

but that is probably more obfuscated than using test_config.

I was going to complain that "test_config -C" does not actually work,
but somehow I missed 5fafc07 (test-lib-functions: support "test_config
-C  ...", 2015-09-05) going by. Very cool.

-Peff


Re: [PATCH] Limit the size of the data block passed to SHA1_Update()

2015-11-04 Thread Eric Sunshine
On Tuesday, November 3, 2015,  wrote:
> [PATCH] Limit the size of the data block passed to SHA1_Update()

As an aid for reviewers, please include the version number of the
patch, such as [PATCH vN] where "N" is the revision. format-patch's -v
option can help automate this.

> Some implementations of SHA_Updates have inherent limits
> on the max chunk size. SHA1_MAX_BLOCK_SIZE can be defined
> to set the max chunk size supported, if required.  This is
> enabled for OSX CommonCrypto library and set to 1GiB.
>
> Signed-off-by: Atousa Pahlevan Duprat 
> ---

It is helpful to reviewers if you can describe here, below the "---"
line, how this version of the patch differs from the previous one.
More below...

> diff --git a/Makefile b/Makefile
> @@ -136,11 +136,15 @@ all::
>  # to provide your own OpenSSL library, for example from MacPorts.
>  #
>  # Define BLK_SHA1 environment variable to make use of the bundled
> -# optimized C SHA1 routine.
> +# optimized C SHA1 routine.  This implies NO_APPLE_COMMON_CRYPTO.
>  #
>  # Define PPC_SHA1 environment variable when running make to make use of
>  # a bundled SHA1 routine optimized for PowerPC.
>  #
> +# Define SHA1_MAX_BLOCK_SIZE if your SHA1_Update() implementation can
> +# hash only a limited amount of data in one call (e.g. APPLE_COMMON_CRYPTO
> +# may want 'SHA1_MAX_BLOCK_SIZE=1024L*1024L*1024L' defined).
> +#
>  # Define NEEDS_CRYPTO_WITH_SSL if you need -lcrypto when using -lssl 
> (Darwin).
>  #
>  # Define NEEDS_SSL_WITH_CRYPTO if you need -lssl when using -lcrypto 
> (Darwin).
> @@ -986,6 +990,10 @@ ifeq (no,$(USE_PARENS_AROUND_GETTEXT_N))
>  endif
>  endif
>
> +ifdef BLK_SHA1
> +   NO_APPLE_COMMON_CRYPTO=1
> +endif

I think Junio already mentioned[1] that this change (and its
corresponding documentation update above) likely is undesirable, but I
just wanted to mention that you would typically want to split such a
change out to a separate patch since it's unrelated to the overall
intention of the current patch. More below...

[1]: http://article.gmane.org/gmane.comp.version-control.git/280859

>  ifeq ($(uname_S),Darwin)
> ifndef NO_FINK
> ifeq ($(shell test -d /sw/lib && echo y),y)
> @@ -1346,6 +1354,8 @@ else
>  ifdef APPLE_COMMON_CRYPTO
> COMPAT_CFLAGS += -DCOMMON_DIGEST_FOR_OPENSSL
> SHA1_HEADER = <CommonCrypto/CommonDigest.h>
> +   # Apple CommonCrypto requires chunking
> +   SHA1_MAX_BLOCK_SIZE = 1024L*1024L*1024L
>  else
> SHA1_HEADER = <openssl/sha.h>
> EXTLIBS += $(LIB_4_CRYPTO)
> @@ -1353,6 +1363,10 @@ endif
>  endif
>  endif
>
> +ifdef SHA1_MAX_BLOCK_SIZE
> +   LIB_OBJS += compat/sha1_chunked.o
> +   BASIC_CFLAGS += -DSHA1_MAX_BLOCK_SIZE="$(SHA1_MAX_BLOCK_SIZE)"
> +endif
> diff --git a/block-sha1/sha1.h b/block-sha1/sha1.h
> @@ -18,5 +18,5 @@ void blk_SHA1_Final(unsigned char hashout[20], blk_SHA_CTX *ctx);
>
>  #define git_SHA_CTX	blk_SHA_CTX
>  #define git_SHA1_Init	blk_SHA1_Init
> -#define git_SHA1_Update	blk_SHA1_Update
> +#define platform_SHA1_Update	blk_SHA1_Update
>  #define git_SHA1_Final	blk_SHA1_Final
> diff --git a/cache.h b/cache.h
> index 79066e5..e345e38 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -10,12 +10,21 @@
>  #include "trace.h"
>  #include "string-list.h"
>
> +/* platform's underlying implementation of SHA1 */
>  #include SHA1_HEADER
>  #ifndef git_SHA_CTX
> -#define git_SHA_CTX	SHA_CTX
> -#define git_SHA1_Init	SHA1_Init
> -#define git_SHA1_Update	SHA1_Update
> -#define git_SHA1_Final	SHA1_Final
> +#define git_SHA_CTX	SHA_CTX
> +#define git_SHA1_Init	SHA1_Init
> +#define platform_SHA1_Update	SHA1_Update
> +#define git_SHA1_Final	SHA1_Final
> +#endif

It's a bit ugly that this special-cases only "update" with the
"platform_" abstraction. It _might_ be preferable to generalize this
for all the SHA1 operations. If so, that's something you'd want to do
as a separate preparatory patch specifically aimed at adding this new
abstraction layer.

(In fact, even if you decide only to special-case "update", that still
might deserve a separate preparatory patch since it's a conceptually
distinct change from the overall focus of this patch, and would make
this patch less noisy, thus easier to review.)

> +/* choose chunked implementation or not */
> +#ifdef SHA1_MAX_BLOCK_SIZE
> +int git_SHA1_Update_Chunked(SHA_CTX *c, const void *data, size_t len);
> +#define git_SHA1_Update   git_SHA1_Update_Chunked
> +#else
> +#define git_SHA1_Update   platform_SHA1_Update
>  #endif


Re: [PATCH 2/2] http: use credential API to handle proxy authentication

2015-11-04 Thread Junio C Hamano
Knut Franke  writes:

> @@ -337,6 +342,24 @@ static void var_override(const char **var, char *value)
>  
>  static void init_curl_proxy_auth(CURL *result)
>  {
> + if (proxy_auth.username) {
> + struct strbuf s = STRBUF_INIT;

Having this variable triggers compilation error with newer libcurl
version as it is only used in the #else clause X-<.

> + if (!proxy_auth.password)
> + credential_fill(&proxy_auth);
> +#if LIBCURL_VERSION_NUM >= 0x071301
> + curl_easy_setopt(result, CURLOPT_PROXYUSERNAME,
> + proxy_auth.username);
> + curl_easy_setopt(result, CURLOPT_PROXYPASSWORD,
> + proxy_auth.password);
> +#else
> + strbuf_addstr_urlencode(&s, proxy_auth.username, 1);
> + strbuf_addch(&s, ':');
> + strbuf_addstr_urlencode(&s, proxy_auth.password, 1);
> + curl_proxyuserpwd = strbuf_detach(&s, NULL);
> + curl_easy_setopt(result, CURLOPT_PROXYUSERPWD, curl_proxyuserpwd);
> +#endif
> + }

It is probably easier to follow the flow of the logic of the primary
interface (i.e. init_curl_proxy_auth()) if you split this part into
its own helper function that deals with implementation detail, e.g.

 http.c | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/http.c b/http.c
index 1f269b0..fad2ba5 100644
--- a/http.c
+++ b/http.c
@@ -340,24 +340,30 @@ static void var_override(const char **var, char *value)
}
 }
 
-static void init_curl_proxy_auth(CURL *result)
+static void set_proxyauth_name_password(CURL *result)
 {
-   if (proxy_auth.username) {
-   struct strbuf s = STRBUF_INIT;
-   if (!proxy_auth.password)
-   credential_fill(&proxy_auth);
 #if LIBCURL_VERSION_NUM >= 0x071301
curl_easy_setopt(result, CURLOPT_PROXYUSERNAME,
proxy_auth.username);
curl_easy_setopt(result, CURLOPT_PROXYPASSWORD,
proxy_auth.password);
 #else
+   struct strbuf s = STRBUF_INIT;
+
	strbuf_addstr_urlencode(&s, proxy_auth.username, 1);
	strbuf_addch(&s, ':');
	strbuf_addstr_urlencode(&s, proxy_auth.password, 1);
	curl_proxyuserpwd = strbuf_detach(&s, NULL);
	curl_easy_setopt(result, CURLOPT_PROXYUSERPWD, curl_proxyuserpwd);
 #endif
+}
+
+static void init_curl_proxy_auth(CURL *result)
+{
+   if (proxy_auth.username) {
+   if (!proxy_auth.password)
+   credential_fill(&proxy_auth);
+   set_proxyauth_name_password(result);
}
 
	var_override(&http_proxy_authmethod, getenv("GIT_HTTP_PROXY_AUTHMETHOD"));



Re: [PATCH 1/2] prepare_packed_git(): refactor garbage reporting in pack directory

2015-11-04 Thread Doug Kelly
On Wed, Nov 4, 2015 at 1:35 PM, Junio C Hamano  wrote:
> Doug Kelly  writes:
>
>> I think the patches I sent (a bit prematurely) address the
>> remaining comments... I did find there was a relevant test in
>> t5304 already, so I added a new test in the same section (and
>> cleaned up some of the garbage it wasn't removing before).  I'm
>> not sure if it's poor form to move tests around like this, but I
>> figured it might be best to keep them logically grouped.
>
> OK, will queue as I didn't spot anything glaringly wrong ;-)
>
> I did wonder if we want to say anything about .bitmap files, though.
> If there is one without matching .idx and .pack, shouldn't we report
> just like we report .idx without .pack (or vice versa)?
>
> Thanks.

I think you're right -- this would be something worth following up on.
At least, t5304 doesn't cover this case explicitly, but when I tried
adding an empty bitmap with a bogus name, I did see a "no
corresponding .idx or .pack" error, similar to the stale .keep file.

I'd trust your (and Jeff's) knowledge on this far more than my own,
but would it be a bad idea to clean up .keep and .bitmap files if the
.idx/.pack pair are missing?  I think we may have had a discussion
previously on how things along these lines might be racey -- but I
don't know what order the .keep file is created in relation to the
.idx/.pack.


Re: [PATCH 1/2] prepare_packed_git(): refactor garbage reporting in pack directory

2015-11-04 Thread Jeff King
On Wed, Nov 04, 2015 at 11:35:52AM -0800, Junio C Hamano wrote:

> Doug Kelly  writes:
> 
> > I think the patches I sent (a bit prematurely) address the
> > remaining comments... I did find there was a relevant test in
> > t5304 already, so I added a new test in the same section (and
> > cleaned up some of the garbage it wasn't removing before).  I'm
> > not sure if it's poor form to move tests around like this, but I
> > figured it might be best to keep them logically grouped.
> 
> OK, will queue as I didn't spot anything glaringly wrong ;-)
> 
> I did wonder if we want to say anything about .bitmap files, though.
> If there is one without matching .idx and .pack, shouldn't we report
> just like we report .idx without .pack (or vice versa)?

Yeah, I think so. The logic should really extend to anything without a
matching .pack. And I think the sane rule is probably:

  If we have pack-$sha.$ext, but not pack-$sha.pack, then:

1. if $ext is known to us as a cache that can be regenerated from the
   .pack (i.e., .idx, .bitmap), then delete it

2. if $ext is known to us as precious, do nothing (there is nothing
   in this category right now, though)

3. if $ext is not known to us, warn but do not delete (in case a
   future version adds something precious)

The conservatism in (3) is the right thing to do, I think, but I doubt
it will ever matter, because we probably cannot ever add non-cache
auxiliary files to the pack. Old versions of git would not delete such
precious files, but nor would they carry them forward during a repack.
So short of a repo-version bump, I think we are effectively limited to
adding only caches which can be re-generated from an original .pack.
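A sketch of how that three-way rule could look; the extension table is
an assumption written out for illustration, not a list git ships:

#include <string.h>

enum garbage_action {
	DELETE_CACHE,	/* regenerable from the .pack: safe to remove */
	KEEP_PRECIOUS,	/* known precious: leave alone (empty set today) */
	WARN_UNKNOWN	/* unknown extension: warn, do not delete */
};

static enum garbage_action classify_orphan(const char *ext)
{
	static const char *cache_exts[] = { ".idx", ".bitmap", NULL };
	int i;

	for (i = 0; cache_exts[i]; i++)
		if (!strcmp(ext, cache_exts[i]))
			return DELETE_CACHE;
	/* case 2 would be checked here, if anything precious existed */
	return WARN_UNKNOWN;
}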

-Peff


Re: O_NONBLOCK under Windows (was: git.git as of tonight)

2015-11-04 Thread Torsten Bögershausen
On 11/04/2015 12:00 AM, Stefan Beller wrote:
> On Tue, Nov 3, 2015 at 1:03 PM, Johannes Sixt  wrote:
>
The #ifdef assumes that Windows never will have O_NONBLOCK/F_GETFL.

Does the following make more sense ?
#if defined (O_NONBLOCK) && defined (F_GETFL)

Or may be:
#ifndef NO_O_NONBLOCK
>> #ifndef GIT_WINDOWS_NATIVE
>> int flags = fcntl(fd, F_GETFL);
>> if (flags < 0)
>> warning("Could not get file status flags, "
>> "output will be degraded");
>> else if (fcntl(fd, F_SETFL, flags | O_NONBLOCK))
>> #endif
>> warning("Could not set file status flags, "
>> "output will be degraded");
>> }



Re: [PATCH 1/2] prepare_packed_git(): refactor garbage reporting in pack directory

2015-11-04 Thread Jeff King
On Wed, Nov 04, 2015 at 01:56:38PM -0600, Doug Kelly wrote:

> > I did wonder if we want to say anything about .bitmap files, though.
> > If there is one without matching .idx and .pack, shouldn't we report
> > just like we report .idx without .pack (or vice versa)?
> 
> I think you're right -- this would be something worth following up on.
> At least, t5304 doesn't cover this case explicitly, but when I tried
> adding an empty bitmap with a bogus name, I did see a "no
> corresponding .idx or .pack" error, similar to the stale .keep file.

Yeah, that should be a harmless warning (although note that because the bitmap
code only really handles a single bitmap, it can prevent loading of the
"real" bitmap; so you'd want to clean it up, for sure).

> I'd trust your (and Jeff's) knowledge on this far more than my own,
> but would it be a bad idea to clean up .keep and .bitmap files if the
> .idx/.pack pair are missing?  I think we may have had a discussion
> previously on how things along these lines might be racey -- but I
> don't know what order the .keep file is created in relation to the
> .idx/.pack.

Definitely cleaning up the .bitmap is sane and not racy (it's in the
same boat as the .idx, I think).

.keep files are more tricky. I'd have to go over the receive-pack code
to confirm, but I think they _are_ racy. That is, receive-pack will
create them as a lockfile before moving the pack into place. That's OK,
though, if we use mtimes to give ourselves a grace period (I haven't
looked at your series yet).

But moreover, .keep files can be created manually by the user. If the
pack they referenced goes away, they are not really serving any purpose.
But it's possible that the user would want to salvage the content of the
file, or know that it was there.

So I'd argue we should leave them. Or at least leave ones that do not
have the generic "{receive,fetch}-pack $pid on $host" comment in them,
which were clearly created as lockfiles.
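If one did want to reap stale lockfile-style .keep files, the mtime
grace period could look roughly like this; the two-week window is an
assumption borrowed from gc's usual expiry conventions, not a value
anyone proposed in this thread:

#include <sys/stat.h>
#include <time.h>

/*
 * Treat path as removable garbage only once it has been sitting
 * around longer than the grace period, so that a receive-pack which
 * is about to move its pack into place does not get raced.
 */
static int old_enough_to_remove(const char *path)
{
	struct stat st;
	const time_t grace = 14 * 24 * 60 * 60;	/* assumed: two weeks */

	if (stat(path, &st) < 0)
		return 0;	/* already gone, or unreadable: leave it */
	return st.st_mtime < time(NULL) - grace;
}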

-Peff


Re: O_NONBLOCK under Windows (was: git.git as of tonight)

2015-11-04 Thread Stefan Beller
On Wed, Nov 4, 2015 at 11:59 AM, Torsten Bögershausen  wrote:
> On 11/04/2015 12:00 AM, Stefan Beller wrote:
>> On Tue, Nov 3, 2015 at 1:03 PM, Johannes Sixt  wrote:
>>
> The #ifdef assumes that Windows never will have O_NONBLOCK/F_GETFL.
>
> Does the following make more sense ?
> #if defined (O_NONBLOCK) && defined (F_GETFL)
>
> Or may be:
> #ifndef NO_O_NONBLOCK
>>> #ifndef GIT_WINDOWS_NATIVE
>>> int flags = fcntl(fd, F_GETFL);
>>> if (flags < 0)
>>> warning("Could not get file status flags, "
>>> "output will be degraded");
>>> else if (fcntl(fd, F_SETFL, flags | O_NONBLOCK))
>>> #endif
>>> warning("Could not set file status flags, "
>>> "output will be degraded");
>>> }

Reading Junio's answer to the resent patch[1], I am currently debating
if this is the right way to go anyway. As Junio points out, this is
not a warning but rather a critical issue, such that we'd maybe rather
die(...) than just warning(...), which would make the discussion about
the correct condition in the #ifdef moot.

[1] [PATCHv3 02/11] run-command: report failure for degraded output just once


Re: [PATCH 1/2] prepare_packed_git(): refactor garbage reporting in pack directory

2015-11-04 Thread Doug Kelly
On Wed, Nov 4, 2015 at 2:02 PM, Jeff King  wrote:
> On Wed, Nov 04, 2015 at 01:56:38PM -0600, Doug Kelly wrote:
>
>> > I did wonder if we want to say anything about .bitmap files, though.
>> > If there is one without matching .idx and .pack, shouldn't we report
>> > just like we report .idx without .pack (or vice versa)?
>>
>> I think you're right -- this would be something worth following up on.
>> At least, t5304 doesn't cover this case explicitly, but when I tried
>> adding an empty bitmap with a bogus name, I did see a "no
>> corresponding .idx or .pack" error, similar to the stale .keep file.
>
> Yeah, that should be harmless warning (although note because the bitmap
> code only really handles a single bitmap, it can prevent loading of the
> "real" bitmap; so you'd want to clean it up, for sure).
>
>> I'd trust your (and Jeff's) knowledge on this far more than my own,
>> but would it be a bad idea to clean up .keep and .bitmap files if the
>> .idx/.pack pair are missing?  I think we may have had a discussion
>> previously on how things along these lines might be racey -- but I
>> don't know what order the .keep file is created in relation to the
>> .idx/.pack.
>
> Definitely cleaning up the .bitmap is sane and not racy (it's in the
> same boat as the .idx, I think).
>
> .keep files are more tricky. I'd have to go over the receive-pack code
> to confirm, but I think they _are_ racy. That is, receive-pack will
> create them as a lockfile before moving the pack into place. That's OK,
> though, if we use mtimes to give ourselves a grace period (I haven't
> looked at your series yet).
>
> But moreover, .keep files can be created manually by the user. If the
> pack they referenced goes away, they are not really serving any purpose.
> But it's possible that the user would want to salvage the content of the
> file, or know that it was there.
>
> So I'd argue we should leave them. Or at least leave ones that do not
> have the generic "{receive,fetch}-pack $pid on $host comment in them,
> which were clearly created as lockfiles.
>
> -Peff

Currently there's no mtime-guarding logic (I dug up that conversation
earlier, though, but after I'd done the respin on this series)... OK,
in that case, I'll create a separate patch that tests/cleans up
.bitmap, but doesn't touch .keep.  This might be a small series since
I think the logic for finding pack garbage doesn't know anything about
.bitmap per se, so it's looking like I'll extend that relevant code,
before adding the handling in gc and appropriate tests.


Re: [PATCHv3 02/11] run-command: report failure for degraded output just once

2015-11-04 Thread Stefan Beller
On Wed, Nov 4, 2015 at 10:14 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> The warning message is cluttering the output itself,
>> so just report it once.
>>
>> Signed-off-by: Stefan Beller 
>> ---
>>  run-command.c | 20 ++--
>>  1 file changed, 14 insertions(+), 6 deletions(-)
>>
>> diff --git a/run-command.c b/run-command.c
>> index 7c00c21..3ae563f 100644
>> --- a/run-command.c
>> +++ b/run-command.c
>> @@ -1012,13 +1012,21 @@ static void pp_cleanup(struct parallel_processes *pp)
>>
>>  static void set_nonblocking(int fd)
>>  {
>> + static int reported_degrade = 0;
>>   int flags = fcntl(fd, F_GETFL);
>> - if (flags < 0)
>> - warning("Could not get file status flags, "
>> - "output will be degraded");
>> - else if (fcntl(fd, F_SETFL, flags | O_NONBLOCK))
>> - warning("Could not set file status flags, "
>> - "output will be degraded");
>> + if (flags < 0) {
>> + if (!reported_degrade) {
>> + warning("Could not get file status flags, "
>> + "output will be degraded");
>> + reported_degrade = 1;
>> + }
>> + } else if (fcntl(fd, F_SETFL, flags | O_NONBLOCK)) {
>> + if (!reported_degrade) {
>> + warning("Could not set file status flags, "
>> + "output will be degraded");
>> + reported_degrade = 1;
>> + }
>> + }
>>  }
>
> Imagine that we are running two things A and B at the same time.  We
> ask poll(2) and it says both A and B have some data ready to be
> read, and we try to read from A.  strbuf_read_once() would try to
> read up to 8K, relying on the fact that you earlier set the IO to be
> nonblock.  It will get stuck reading from A without allowing output
> from B to drain.  B's write may get stuck because we are not reading
> from it, and would cause B to stop making progress.
>
> What if the other sides of the connection from A and B are talking
> with each other,

I am not sure if we want to allow this ever. How would that work with
jobs==1? How do we guarantee to have A and B running at the same time?
In a later version of the parallel processing we may have some other ramping
up mechanisms, such as: "First run only one process until it outputted at least
250 bytes", which would also produce such a lock. So instead a time based ramp
up may be better. But my general concern is how many guarantees we are
selling here. Maybe the documentation needs to explicitly state that the
processes cannot talk to each other, or at least should assume that
stdout/err may block.

> and B's non-progress caused the processing for A on
> the other side of the connection to block, causing it not to produce
> more output to allow us to make progress reading from A (so that
> eventually we can give B a chance to drain its output)?  Imagine A
> and B are pushes to the same remote, B may be pushing a change to a
> submodule while A may be pushing a matching change to its
> superproject, and the server may be trying to make sure that the
> submodule update completes and updates the ref before making the
> superproject's tree that binds that updated submodule's commit
> availble, for example?  Can we make any progress from that point?
>
> I am not convinced that the failure to set nonblock IO is merely
> "output will be degraded".  It feels more like a fatal error if we
> are driving more than one task at the same time.
>

Another approach would be to test if we can set to non blocking and if
that is not possible, do not buffer it, but redirect the subcommand
directly to stderr of the calling process.

if (set_nonblocking(pp->children[i].process.err) < 0) {
	pp->children[i].process.err = 2;
	degraded_parallelism = 1;
}

and once we observe the degraded_parallelism flag, we can only
schedule a maximum of one job at a time, having direct output?


Re: [PATCH 1/2] prepare_packed_git(): refactor garbage reporting in pack directory

2015-11-04 Thread Jeff King
On Wed, Nov 04, 2015 at 02:08:21PM -0600, Doug Kelly wrote:

> Currently there's no mtime-guarding logic (I dug up that conversation
> earlier, though, but after I'd done the respin on this series)... OK,
> in that case, I'll create a separate patch that tests/cleans up
> .bitmap, but doesn't touch .keep.  This might be a small series since
> I think the logic for finding pack garbage doesn't know anything about
> .bitmap per-se, so it's looking like I'll extend that relevant code,
> before adding the handling in gc and appropriate tests.

I'd hoped you could reuse the list of extensions found in
builtin/repack.c (e.g., see remove_redundant_pack). But I guess that is
not connected with the garbage-reporting code. And anyway, the simple
list probably does not carry sufficient information (it does not know
that ".keep" is potentially more precious than ".idx", for example).

-Peff


Re: [PATCH 1/2] prepare_packed_git(): refactor garbage reporting in pack directory

2015-11-04 Thread Junio C Hamano
Doug Kelly  writes:

> I think the patches I sent (a bit prematurely) address the
> remaining comments... I did find there was a relevant test in
> t5304 already, so I added a new test in the same section (and
> cleaned up some of the garbage it wasn't removing before).  I'm
> not sure if it's poor form to move tests around like this, but I
> figured it might be best to keep them logically grouped.

OK, will queue as I didn't spot anything glaringly wrong ;-)

I did wonder if we want to say anything about .bitmap files, though.
If there is one without matching .idx and .pack, shouldn't we report
just like we report .idx without .pack (or vice versa)?

Thanks.


Re: [PATCH v2 4/4] t5509: add basic tests for hideRefs

2015-11-04 Thread Eric Sunshine
On Tuesday, November 3, 2015, Lukas Fleischer  wrote:
> Test whether regular and full hideRefs patterns work as expected when
> namespaces are used.
>
> Helped-by: Eric Sunshine 
> Signed-off-by: Lukas Fleischer 
> ---
> diff --git a/t/t5509-fetch-push-namespaces.sh b/t/t5509-fetch-push-namespaces.sh
> @@ -82,4 +82,45 @@ test_expect_success 'mirroring a repository using a ref namespace' '
> +test_expect_success 'hide namespaced refs with transfer.hideRefs' '
> +   GIT_NAMESPACE=namespace \
> +   git -C pushee -c transfer.hideRefs=refs/tags \
> +   ls-remote "ext::git %s ." >actual &&
> +   printf "$commit1\trefs/heads/master\n" >expected &&
> +   test_cmp expected actual
> +'
> +
> +test_expect_success 'check that transfer.hideRefs does not match unstripped refs' '
> +   GIT_NAMESPACE=namespace \
> +   git -C pushee -c transfer.hideRefs=refs/namespaces/namespace/refs/tags \
> +   ls-remote "ext::git %s ." >actual &&
> +   printf "$commit1\trefs/heads/master\n" >expected &&
> +   printf "$commit0\trefs/tags/0\n" >>expected &&
> +   printf "$commit1\trefs/tags/1\n" >>expected &&
> +   test_cmp expected actual
> +'
> +
> +test_expect_success 'hide full refs with transfer.hideRefs' '
> +   GIT_NAMESPACE=namespace \
> +   git -C pushee -c transfer.hideRefs="^refs/namespaces/namespace/refs/tags" \
> +   ls-remote "ext::git %s ." >actual &&
> +   printf "$commit1\trefs/heads/master\n" >expected &&
> +   test_cmp expected actual
> +'
> +
> +test_expect_success 'try to update a hidden ref' '
> +   test_config -C pushee transfer.hideRefs refs/heads/master &&
> +   test_must_fail git -C original push pushee-namespaced master

In above tests, you use -c to set the configuration temporarily for
the git invocation, but not in this and following tests. Is that
because the -c isn't visible to sub-commands which git-push invokes?
(Genuine question; I want to make sure I understand the reasoning.)

> +'
> +
> +test_expect_success 'try to update a ref that is not hidden' '
> +   test_config -C pushee transfer.hideRefs refs/namespaces/namespace/refs/heads/master &&
> +   git -C original push pushee-namespaced master
> +'
> +
> +test_expect_success 'try to update a hidden full ref' '
> +   test_config -C pushee transfer.hideRefs "^refs/namespaces/namespace/refs/heads/master" &&
> +   test_must_fail git -C original push pushee-namespaced master
> +'
> +
>  test_done
> --
> 2.6.2
>


Re: [PATCH 1/2] http: allow selection of proxy authentication method

2015-11-04 Thread Junio C Hamano
Knut Franke  writes:

> diff --git a/http.c b/http.c
> index 7da76ed..a786802 100644
> --- a/http.c
> +++ b/http.c
> @@ -305,6 +326,40 @@ static void init_curl_http_auth(CURL *result)
>  #endif
>  }
>  
> +/* assumes *var is free-able */

This is not just "assumes", but it is wrong for the callers to pass
unfreeable memory, so

/* *var must be free-able */

> +static void var_override(const char **var, char *value)
> +{
> + if (value) {
> + free((void*) *var);

There may be a similar whitespace damage but I happened to notice
this one.

free((void *)*var);

> +static void init_curl_proxy_auth(CURL *result)
> +{
> + var_override(&http_proxy_authmethod, getenv("GIT_HTTP_PROXY_AUTHMETHOD"));

If your libcurl does not understand CURLOPT_PROXYAUTH, do you need
to do this var_override()?  Shouldn't this be inside the #if..#endif
below?

> +
> +#if LIBCURL_VERSION_NUM >= 0x070a07 /* CURLOPT_PROXYAUTH and CURLAUTH_ANY */
> + if (http_proxy_authmethod) {
> +...
> + else
> + curl_easy_setopt(result, CURLOPT_PROXYAUTH, CURLAUTH_ANY);
> +#endif
> +}

Other than that, looks cleanly done.  Thanks.


Re: [PATCH v2 3/4] Add support for matching full refs in hideRefs

2015-11-04 Thread Junio C Hamano
Junio C Hamano  writes:

> The semantics of "hideRefs" used to be that "refs can be hidden from
> ls-remote" (i.e. prevents fetching, avoids contaminating "--mirror")
> and "refs can be hidden from 'push'", but the objects being known
> tips of histories that are complete, they still participate in
> common ancestor discovery.  
>
>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>> index 3da97a1..91ed6a5 100644
>> --- a/Documentation/config.txt
>> +++ b/Documentation/config.txt
>> @@ -2690,7 +2690,8 @@ the prefix `refs/heads/master` is specified in 
>> `transfer.hideRefs` and the
>>  current namespace is `foo`, then `refs/namespaces/foo/refs/heads/master` is
>>  omitted from the advertisements but `refs/heads/master` and
>>  `refs/namespaces/bar/refs/heads/master` are still advertised as so-called
>> -"have" lines.
>> +"have" lines. In order to match refs before stripping, add a `^` in front of
>> +the ref name. If you combine `!` and `^`, `!` must be specified first.
>
> I think the changes in the above two hunks prevent the hidden refs
> from participating in common ancestor discovery (which is probably a
> good thing), making the above description stale.

Ah, I was confused.  The description only says "refs from outer
namespaces are not advertised, but still participate the common
ancestor discovery in the form of .have lines", and moving the
location where ref-is-hidden is called in this patch does not affect
that fact at all.

So this looks good as-is.

Thanks.


Re: [PATCH 2/2] http: use credential API to handle proxy authentication

2015-11-04 Thread Eric Sunshine
On Wednesday, November 4, 2015, Knut Franke wrote:
> Currently, the only way to pass proxy credentials to curl is by including them
> in the proxy URL. Usually, this means they will end up on disk unencrypted, 
> one
> way or another (by inclusion in ~/.gitconfig, shell profile or history). Since
> proxy authentication often uses a domain user, credentials can be security
> sensitive; therefore, a safer way of passing credentials is desirable.
>
> If the configured proxy contains a username but not a password, query the
> credential API for one. Also, make sure we approve/reject proxy credentials
> properly.
>
> For consistency reasons, add parsing of http_proxy/https_proxy/all_proxy
> environment variables, which would otherwise be evaluated as a fallback by curl.
> Without this, we would have different semantics for git configuration and
> environment variables.
>
> Signed-off-by: Knut Franke 
> Helped-by: Junio C Hamano 
> Helped-by: Eric Sunshine 
> ---
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> @@ -1593,9 +1593,13 @@ help.htmlPath::
>
>  http.proxy::
> Override the HTTP proxy, normally configured using the 'http_proxy',
> -   'https_proxy', and 'all_proxy' environment variables (see
> -   `curl(1)`).  This can be overridden on a per-remote basis; see
> -   remote..proxy
> +   'https_proxy', and 'all_proxy' environment variables (see `curl(1)`). 
> In
> +   addition to the syntax understood by curl, it is possible to specify a
> +   proxy string with a user name but no password, in which case git will
> +   attempt to acquire one in the same way it does for other credentials. 
> See
> +   linkgit:gitcredentials[7] for more information. The syntax thus is
> +   '[protocol://][user[:password]@]proxyhost[:port]'. This can be 
> overridden
> +   on a per-remote basis; see remote..proxy

s/$/./ maybe?

>  http.proxyAuthMethod::
> Set the method with which to authenticate against the HTTP proxy. This
> diff --git a/http.c b/http.c
> @@ -337,6 +342,24 @@ static void var_override(const char **var, char *value)
>
>  static void init_curl_proxy_auth(CURL *result)
>  {
> +   if (proxy_auth.username) {
> +   struct strbuf s = STRBUF_INIT;
> +   if (!proxy_auth.password)
> +   credential_fill(&proxy_auth);
> +#if LIBCURL_VERSION_NUM >= 0x071301
> +   curl_easy_setopt(result, CURLOPT_PROXYUSERNAME,
> +   proxy_auth.username);
> +   curl_easy_setopt(result, CURLOPT_PROXYPASSWORD,
> +   proxy_auth.password);

The strbuf does not get released in this #if branch, but since no
content was added, it doesn't need to be released, thus nothing is
leaked. Good.

Does the compiler warn about unused variable 's' in this #if branch?

> +#else
> +   strbuf_addstr_urlencode(&s, proxy_auth.username, 1);
> +   strbuf_addch(&s, ':');
> +   strbuf_addstr_urlencode(&s, proxy_auth.password, 1);
> +   curl_proxyuserpwd = strbuf_detach(&s, NULL);
> +   curl_easy_setopt(result, CURLOPT_PROXYUSERPWD, curl_proxyuserpwd);

And this branch detaches content from the strbuf, and that memory is
released later in http_cleanup(), so also nothing is leaked, thus all
is good.

> +#endif
> +   }
> +
> var_override(&http_proxy_authmethod, getenv("GIT_HTTP_PROXY_AUTHMETHOD"));
>
>  #if LIBCURL_VERSION_NUM >= 0x070a07 /* CURLOPT_PROXYAUTH and CURLAUTH_ANY */


Re: Bug: stash save -u removes (some) ignored files

2015-11-04 Thread Felipe Sateler
On 4 November 2015 at 03:23, Atousa Duprat  wrote:
>> felipe@felipe:testgit% git stash save -u
>
> This does the following:
> $ GIT_TRACE=1 git stash save -u
> [...]
> 21:59:10.606094 git.c:348   trace: built-in: git 'clean' '--force' '--quiet' '-d'
>
> git-clean -d removes untracked directories in addition to untracked files.
> Should 'git stash save -u' issue a 'git clean -d' or simply a 'git clean'?

It appears to be intentionally done[1]. Maybe the problem is that
git-clean -d should not remove untracked directories that contain
ignored files?


[1] http://article.gmane.org/gmane.comp.version-control.git/180214

-- 

Saludos,
Felipe Sateler


Re: [PATCH] Limit the size of the data block passed to SHA1_Update()

2015-11-04 Thread Junio C Hamano
Junio C Hamano  writes:

>> ifdef BLK_SHA1
>> SHA1_HEADER = "block-sha1/sha1.h"
>> LIB_OBJS += block-sha1/sha1.o
>> else
>> ifdef PPC_SHA1
>> SHA1_HEADER = "ppc/sha1.h"
>> LIB_OBJS += ppc/sha1.o ppc/sha1ppc.o
>> else
>> ifdef APPLE_COMMON_CRYPTO
>> COMPAT_CFLAGS += -DCOMMON_DIGEST_FOR_OPENSSL
>> SHA1_HEADER = <CommonCrypto/CommonDigest.h>
>> SHA1_MAX_BLOCK_SIZE = 1024L*1024L*1024L
>> else
>> SHA1_HEADER = <openssl/sha.h>
>> EXTLIBS += $(LIB_4_CRYPTO)
>> endif
>>
>> which seems to imply that BLK_SHA1 and APPLE_COMMON_CRYPTO are
>> mutually exclusive?
>
> Yes, you are correct that these two cannot be used at the same time.
> In general (not limited to BLK_SHA1 and APPLE_COMMON_CRYPTO) you can
> pick only _one_ underlying SHA-1 implementation to use with the
> system.

Our "seems to imply" above is a faulty reading.  The four lines I
wrote are not incorrect per-se, but this exchange was misleading.

When BLK_SHA1 is defined, the above fragment from the Makefile is
only saying "CommonCrypto may or may not be used for any other
purposes, but for SHA-1 hashing, I'll use our own block-sha1/
implementation."  It does not say anything about how the system
favours CommonCrypto over OpenSSL with APPLE_COMMON_CRYPTO.

And it is legit to define both APPLE_COMMON_CRYPTO and BLK_SHA1; the
resulting build would still use the SSL-related functions (which other
people may use from OpenSSL) from CommonCrypto.  Filipe Cabecinhas
pointed out that such a configuration works (and solves the issue
that triggered this thread) very early in the thread.

I haven't looked carefully at the latest version of your patch, but
I just wanted to make sure that I didn't mislead you into adding an
unnecessary "we check if both APPLE_COMMON_CRYPTO and BLK_SHA1 are
defined and error out because they are incompatible" check.  Sorry
for an earlier message that may have been confusing.

Thanks.


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: drop connectivity check for local clones

2015-11-04 Thread Jeff King
[I'm cc-ing the list, since I think this answer is of general interest.]

On Wed, Nov 04, 2015 at 11:00:34AM +0100, Matej Buday wrote:

> I have a question somewhat regarding this old commit of yours:
> https://github.com/git/git/commit/125a05fd0b45416558923b753f6418c24208d443
> 
> Let me preface this by saying that I don't completely understand what the
> connectivity check does...

One of the invariants git tries to maintain in a repository is that for
any object reachable from a ref (i.e., a branch or tag), we have all of
the ancestor objects. So if you have commit 125a05, you also have the
parent, and its parent, and so on, down to the root.

When we fetch or clone from a remote repository, it sends us some
objects, and we plan to point one of our refs at it. But rather than
trust that the remote sent us everything we need to maintain that
invariant, we actually walk the graph to make sure that is the case.
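
In sketch form, the check amounts to the following (hypothetical
types and a toy recursive walk; the real code walks actual objects
via the rev-list machinery and does not revisit what it has seen):

  struct commit_node {
          int present;                  /* do we have this object? */
          struct commit_node **parents; /* NULL-terminated list */
  };

  /* Return 1 iff every ancestor of the tip exists locally. */
  static int connected(const struct commit_node *tip)
  {
          struct commit_node **p;

          if (!tip->present)
                  return 0; /* a missing object breaks the invariant */
          for (p = tip->parents; p && *p; p++)
                  if (!connected(*p))
                          return 0;
          return 1;
  }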

This can catch bugs or transfer errors early. So the operation is safer,
at the expense of spending some CPU time.

We skip it for local disk-to-disk clones. We trust the source clone
more, and since the point of a local clone is to be very fast, the
safety/CPU tradeoff doesn't make as much sense.

> Well, the question is -- is this check necessary
> for local clones that use the --reference option?

Sort of. If you say:

  git clone --reference /some/local/repo git://some-remote-repo

Then we do check the incoming objects from some-remote-repo. However,
there is an optimization we don't do: we could assume that everything in
/some/local/repo is fine, and stop traversing there. So if you fetch
only a few objects from the remote, that is all you would check.

The optimization would look something like this:

  https://github.com/peff/git/commit/1254ff54b49eff19ec8a09c36e3edd24d490cae1

I wrote that last year, but haven't actually submitted the patch yet.
There are two reasons:

  1. It needs minor cleanup due to the sha1/oid transition that is
 ongoing (see the "ugh" comment). I think this could be fixed by
 refactoring some of the callback interfaces, but I haven't gotten
 around to it.

  2. Using alternates to optimize can backfire at a certain scale. If
 you have a very large number of refs in the alternate repository,
 just accessing and processing those refs can be more expensive than
 walking the history graph in the first place.

 This is the case for us at GitHub, where our alternates have the
 refs for _all_ of the forks of a given project. So I would want
 some flag to turn this behavior off.

 Of course, we are in an exceptional circumstance at GitHub, and
 that is no reason the topic cannot go upstream (we already carry
 custom patches to disable alternates for things like receive-pack,
 and could do the same here).

 So that is not a good reason not to submit, only an explanation why
 I have not yet bothered to spend the time on it. :)

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 00/11] Expose the submodule parallelism to the user

2015-11-04 Thread Junio C Hamano
Stefan Beller  writes:

> Where does it apply?
> ---
> This series applies on top of d075d2604c0f92045caa8d5bb6ab86cf4921a4ae (Merge
> branch 'rs/daemon-plug-child-leak' into sb/submodule-parallel-update) and 
> replaces
> the previous patches in sb/submodule-parallel-update
>
> What does it do?
> ---
> This series should finish the ongoing efforts of parallelizing
> submodule network traffic. The patches contain tests for clone,
> fetch and submodule update to use the actual parallelism both via the
> command line and via a configured option. I decided to go with
> "submodule.jobs" for all three for now.

The order of patches and where the series builds makes me suspect
that I have been expecting too much from the "parallel-fetch" topic.

I've been hoping that it would be useful for the project as a whole
to polish the other topic and make it available to a wider audience
sooner by itself (both from "end users get improved Git early"
aspect and from "the core machinery to be reused in follow-up
improvements are made closer to perfection sooner" perspective).  So
I've been expecting the "Let's fix it on Windows" change to come directly
on top of sb/submodule-parallel-fetch to make that topic usable
before everything else.  Other patches in this series may require
the child_process_cleanup() change, so they may be applied on top of
the merge between sb/submodule-parallel-fetch (updated for Windows)
and rs/daemon-plug-child-leak topic.

That does not seem to be what's happening here (note: I am not
complaining; I am just trying to make sure expectation matches
reality).  Am I reading you correctly?

I think sb/submodule-parallel-fetch + sb/submodule-parallel-update
as a single topic would need more time to mature to be in a tagged
release than we have in the remainder of this cycle.  It is likely
that the former topic has a chance to get rebased after 2.7 happens.
And that would allow us to (1) use the child_process_cleanup() from
the get-go instead of _deinit and to (2) get the machinery right both
for UNIX and Windows from the get-go.  Which would make the result
easier to understand.  As this is one of the more important areas,
it matters to keep the resulting code and the rationale behind it
understandable by reading "log --reverse -p".

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 0/2]

2015-11-04 Thread Knut Franke
Changes in the fourth iteration:

* update Documentation/technical/api-remote.txt to reflect the addition to
  struct remote

* update documentation of http.proxy in Documentation/config.txt to mention the
  possibility of having git fill in missing proxy password

* fix decl-after-stmt

* generalize env_override helper to var_override

* more coding style corrections

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] http: allow selection of proxy authentication method

2015-11-04 Thread Knut Franke
CURLAUTH_ANY does not work with proxies which answer unauthenticated requests
with a 307 redirect to an error page instead of a 407 listing supported
authentication methods. Therefore, allow the authentication method to be set
using the environment variable GIT_HTTP_PROXY_AUTHMETHOD or configuration
variables http.proxyAuthmethod and remote.<name>.proxyAuthmethod (in analogy
to http.proxy and remote.<name>.proxy).

The following values are supported:

* anyauth (default)
* basic
* digest
* negotiate
* ntlm

Signed-off-by: Knut Franke 
Helped-by: Junio C Hamano 
Helped-by: Eric Sunshine 
---
 Documentation/config.txt               | 26 ++++++++++++++++++
 Documentation/technical/api-remote.txt |  4 +++
 http.c                                 | 65 ++++++++++++++++++++++++++++++---
 remote.c                               |  3 ++
 remote.h                               |  1 +
 5 files changed, 96 insertions(+), 3 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 391a0c3..4ad7f74 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1597,6 +1597,27 @@ http.proxy::
`curl(1)`).  This can be overridden on a per-remote basis; see
remote.<name>.proxy
 
+http.proxyAuthMethod::
+   Set the method with which to authenticate against the HTTP proxy. This
+   only takes effect if the configured proxy string contains a user name part
+   (i.e. is of the form 'user@host' or 'user@host:port'). This can be
+   overridden on a per-remote basis; see `remote.<name>.proxyAuthMethod`.
+   Both can be overridden by the 'GIT_HTTP_PROXY_AUTHMETHOD' environment
+   variable.  Possible values are:
++
+--
+* `anyauth` - Automatically pick a suitable authentication method. It is
+  assumed that the proxy answers an unauthenticated request with a 407
+  status code and one or more Proxy-authenticate headers with supported
+  authentication methods. This is the default.
+* `basic` - HTTP Basic authentication
+* `digest` - HTTP Digest authentication; this prevents the password from being
+  transmitted to the proxy in clear text
+* `negotiate` - GSS-Negotiate authentication (compare the --negotiate option
+  of `curl(1)`)
+* `ntlm` - NTLM authentication (compare the --ntlm option of `curl(1)`)
+--
+
 http.cookieFile::
File containing previously stored cookie lines which should be used
in the Git http session, if they match the server. The file format
@@ -2390,6 +2411,11 @@ remote.<name>.proxy::
the proxy to use for that remote.  Set to the empty string to
disable proxying for that remote.
 
+remote.<name>.proxyAuthMethod::
+   For remotes that require curl (http, https and ftp), the method to use for
+   authenticating against the proxy in use (probably set in
+   `remote.<name>.proxy`). See `http.proxyAuthMethod`.
+
 remote..fetch::
The default set of "refspec" for linkgit:git-fetch[1]. See
linkgit:git-fetch[1].
diff --git a/Documentation/technical/api-remote.txt b/Documentation/technical/api-remote.txt
index 2cfdd22..f10941b 100644
--- a/Documentation/technical/api-remote.txt
+++ b/Documentation/technical/api-remote.txt
@@ -51,6 +51,10 @@ struct remote
 
The proxy to use for curl (http, https, ftp, etc.) URLs.
 
+`http_proxy_authmethod`::
+
+   The method used for authenticating against `http_proxy`.
+
 struct remotes can be found by name with remote_get(), and iterated
 through with for_each_remote(). remote_get(NULL) will return the
 default remote, given the current branch and configuration.
diff --git a/http.c b/http.c
index 7da76ed..a786802 100644
--- a/http.c
+++ b/http.c
@@ -63,6 +63,24 @@ static long curl_low_speed_limit = -1;
 static long curl_low_speed_time = -1;
 static int curl_ftp_no_epsv;
 static const char *curl_http_proxy;
+static const char *http_proxy_authmethod;
+static struct {
+   const char *name;
+   long curlauth_param;
+} proxy_authmethods[] = {
+   { "basic", CURLAUTH_BASIC },
+   { "digest", CURLAUTH_DIGEST },
+   { "negotiate", CURLAUTH_GSSNEGOTIATE },
+   { "ntlm", CURLAUTH_NTLM },
+#ifdef LIBCURL_CAN_HANDLE_AUTH_ANY
+   { "anyauth", CURLAUTH_ANY },
+#endif
+   /*
+    * CURLAUTH_DIGEST_IE has no corresponding command-line option in
+    * curl(1) and is not included in CURLAUTH_ANY, so we leave it out
+    * here, too
+    */
+};
 static const char *curl_cookie_file;
 static int curl_save_cookies;
 struct credential http_auth = CREDENTIAL_INIT;
@@ -257,6 +275,9 @@ static int http_options(const char *var, const char *value, void *cb)
if (!strcmp("http.proxy", var))
return git_config_string(&curl_http_proxy, var, value);
 
+   if (!strcmp("http.proxyauthmethod", var))
+   return git_config_string(&http_proxy_authmethod, var, value);
+
if (!strcmp("http.cookiefile", var))
return git_config_string(&curl_cookie_file, var, value);
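
For context, a table like proxy_authmethods above is typically
consumed by a short case-insensitive scan. A sketch with the table
restated locally (lookup_authmethod is a made-up name; the patch's
actual lookup may differ):

  #include <stddef.h>
  #include <strings.h>
  #include <curl/curl.h>

  #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

  static const struct {
          const char *name;
          long curlauth_param;
  } methods[] = {
          { "basic", CURLAUTH_BASIC },
          { "digest", CURLAUTH_DIGEST },
          { "negotiate", CURLAUTH_GSSNEGOTIATE },
          { "ntlm", CURLAUTH_NTLM },
  };

  /* Map a configured name to its CURLAUTH_* value; -1 if unknown. */
  static long lookup_authmethod(const char *name)
  {
          size_t i;

          for (i = 0; i < ARRAY_SIZE(methods); i++)
                  if (!strcasecmp(name, methods[i].name))
                          return methods[i].curlauth_param;
          return -1; /* caller can warn and fall back to CURLAUTH_ANY */
  }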

[PATCH 2/2] http: use credential API to handle proxy authentication

2015-11-04 Thread Knut Franke
Currently, the only way to pass proxy credentials to curl is by including them
in the proxy URL. Usually, this means they will end up on disk unencrypted, one
way or another (by inclusion in ~/.gitconfig, shell profile or history). Since
proxy authentication often uses a domain user, credentials can be security
sensitive; therefore, a safer way of passing credentials is desirable.

If the configured proxy contains a username but not a password, query the
credential API for one. Also, make sure we approve/reject proxy credentials
properly.

For consistency reasons, add parsing of http_proxy/https_proxy/all_proxy
environment variables, which would otherwise be evaluated as a fallback by curl.
Without this, we would have different semantics for git configuration and
environment variables.

Signed-off-by: Knut Franke 
Helped-by: Junio C Hamano 
Helped-by: Eric Sunshine 
---
 Documentation/config.txt | 10 +++++++---
 http.c                   | 72 ++++++++++++++++++++++++++++++++++++++++++++-
 http.h                   |  1 +
 3 files changed, 79 insertions(+), 4 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 4ad7f74..320ce9a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1593,9 +1593,13 @@ help.htmlPath::
 
 http.proxy::
Override the HTTP proxy, normally configured using the 'http_proxy',
-   'https_proxy', and 'all_proxy' environment variables (see
-   `curl(1)`).  This can be overridden on a per-remote basis; see
-   remote.<name>.proxy
+   'https_proxy', and 'all_proxy' environment variables (see `curl(1)`). In
+   addition to the syntax understood by curl, it is possible to specify a
+   proxy string with a user name but no password, in which case git will
+   attempt to acquire one in the same way it does for other credentials. See
+   linkgit:gitcredentials[7] for more information. The syntax thus is
+   '[protocol://][user[:password]@]proxyhost[:port]'. This can be overridden
+   on a per-remote basis; see remote.<name>.proxy
 
 http.proxyAuthMethod::
Set the method with which to authenticate against the HTTP proxy. This
diff --git a/http.c b/http.c
index a786802..714a59a 100644
--- a/http.c
+++ b/http.c
@@ -81,6 +81,8 @@ static struct {
 * here, too
 */
 };
+static struct credential proxy_auth = CREDENTIAL_INIT;
+static const char *curl_proxyuserpwd;
 static const char *curl_cookie_file;
 static int curl_save_cookies;
 struct credential http_auth = CREDENTIAL_INIT;
@@ -178,6 +180,9 @@ static void finish_active_slot(struct active_request_slot *slot)
 #else
slot->results->auth_avail = 0;
 #endif
+
+   curl_easy_getinfo(slot->curl, CURLINFO_HTTP_CONNECTCODE,
+   &slot->results->http_connectcode);
}
 
/* Run callback if appropriate */
@@ -337,6 +342,24 @@ static void var_override(const char **var, char *value)
 
 static void init_curl_proxy_auth(CURL *result)
 {
+   if (proxy_auth.username) {
+   struct strbuf s = STRBUF_INIT;
+   if (!proxy_auth.password)
+   credential_fill(&proxy_auth);
+#if LIBCURL_VERSION_NUM >= 0x071301
+   curl_easy_setopt(result, CURLOPT_PROXYUSERNAME,
+   proxy_auth.username);
+   curl_easy_setopt(result, CURLOPT_PROXYPASSWORD,
+   proxy_auth.password);
+#else
+   strbuf_addstr_urlencode(&s, proxy_auth.username, 1);
+   strbuf_addch(&s, ':');
+   strbuf_addstr_urlencode(&s, proxy_auth.password, 1);
+   curl_proxyuserpwd = strbuf_detach(&s, NULL);
+   curl_easy_setopt(result, CURLOPT_PROXYUSERPWD, curl_proxyuserpwd);
+#endif
+   }
+
var_override(&http_proxy_authmethod, getenv("GIT_HTTP_PROXY_AUTHMETHOD"));
 
 #if LIBCURL_VERSION_NUM >= 0x070a07 /* CURLOPT_PROXYAUTH and CURLAUTH_ANY */
@@ -518,8 +541,42 @@ static CURL *get_curl_handle(void)
curl_easy_setopt(result, CURLOPT_USE_SSL, CURLUSESSL_TRY);
 #endif
 
+   /*
+    * CURL also examines these variables as a fallback; but we need to query
+    * them here in order to decide whether to prompt for missing password (cf.
+    * init_curl_proxy_auth()).
+    *
+    * Unlike many other common environment variables, these are historically
+    * lowercase only. It appears that CURL did not know this and implemented
+    * only uppercase variants, which was later corrected to take both - with
+    * the exception of http_proxy, which is lowercase only also in CURL. As
+    * the lowercase versions are the historical quasi-standard, they take
+    * precedence here, as in CURL.
+    */
+   if (!curl_http_proxy) {
+   if (!strcmp(http_auth.protocol, "https")) {
+   var_override(&curl_http_proxy, getenv("HTTPS_PROXY"));
+
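
Schematically, the precedence described in the comment amounts to the
following (a sketch only, assuming var_override() overwrites its
target just when the new value is non-NULL; the actual patch may
differ in details):

  if (!curl_http_proxy) {
          if (!strcmp(http_auth.protocol, "https")) {
                  /* uppercase first, so the lowercase variant wins */
                  var_override(&curl_http_proxy, getenv("HTTPS_PROXY"));
                  var_override(&curl_http_proxy, getenv("https_proxy"));
          } else {
                  /* http_proxy is lowercase-only, as in curl itself */
                  var_override(&curl_http_proxy, getenv("http_proxy"));
          }
          if (!curl_http_proxy) {
                  var_override(&curl_http_proxy, getenv("ALL_PROXY"));
                  var_override(&curl_http_proxy, getenv("all_proxy"));
          }
  }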

Re: [PATCHv3 00/11] Expose the submodule parallelism to the user

2015-11-04 Thread Stefan Beller
On Wed, Nov 4, 2015 at 9:54 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> Where does it apply?
>> ---
>> This series applies on top of d075d2604c0f92045caa8d5bb6ab86cf4921a4ae (Merge
>> branch 'rs/daemon-plug-child-leak' into sb/submodule-parallel-update) and 
>> replaces
>> the previous patches in sb/submodule-parallel-update
>>
>> What does it do?
>> ---
>> This series should finish the ongoing efforts of parallelizing
>> submodule network traffic. The patches contain tests for clone,
>> fetch and submodule update to use the actual parallelism both via the
>> command line and via a configured option. I decided to go with
>> "submodule.jobs" for all three for now.
>
> The order of patches and where the series builds makes me suspect
> that I have been expecting too much from the "parallel-fetch" topic.
>
> I've been hoping that it would be useful for the project as a whole
> to polish the other topic and make it available to wider audience
> sooner by itself (both from "end users get improved Git early"
> aspect and from "the core machinery to be reused in follow-up
> improvements are made closer to perfection sooner" perspective).  So
> I've been expecting the "Let's fix it on Windows" change to come directly
> on top of sb/submodule-parallel-fetch to make that topic usable
> before everything else.

I can resend the patches on top of sb/submodule-parallel-fetch
(though looking at sb/submodule-parallel-fetch..d075d2604c0f920
[Merge branch 'rs/daemon-plug-child-leak' into sb/submodule-parallel-update]
I don't expect conflicts, so it would be a verbatim resend)


> Other patches in this series may require
> the child_process_cleanup() change, so they may be applied on top of
> the merge between sb/submodule-parallel-fetch (updated for Windows)
> and rs/daemon-plug-child-leak topic.

I assumed the rs/daemon-plug-child-leak topic is not a feature but a cleanup,
which is why I would have expected a sb/submodule-parallel-fetch-for-windows
branch pointing at maybe the third patch of the series, on top of
rs/daemon-plug-child-leak.

>
> That does not seem to be what's happening here (note: I am not
> complaining; I am just trying to make sure expectation matches
> reality).  Am I reading you correctly?

I really wanted to send out just one series, my bad. The ordering made
sense to me (first the run-command related fixes, then the new features
in later patches).

>
> I think sb/submodule-parallel-fetch + sb/submodule-parallel-update
> as a single topic would need more time to mature to be in a tagged
> release than we have in the remainder of this cycle.

I agree on that.

>  It is likely
> that the former topic has a chance to get rebased after 2.7 happens.
> And that would allow us to (1) use the child_process_cleanup() from
> the get-go instead of _deinit and to (2) get the machinery right both
> for UNIX and Windows from the get-go.  Which would make the result
> easier to understand.  As this is one of the more important areas,
> it matters to keep the resulting code and the rationale behind it
> understandable by reading "log --reverse -p".

So you are saying that reading the Windows cleanup patch
before the s/deinit/clear/ patch by Rene makes it way easier to understand?
Which is why you would prefer another history (merging an updated
sb/submodule-parallel-fetch again into rs/daemon-plug-child-leak or even
sb/submodule-parallel-update)?

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 02/11] run-command: report failure for degraded output just once

2015-11-04 Thread Junio C Hamano
Stefan Beller  writes:

> The warning message is cluttering the output itself,
> so just report it once.
>
> Signed-off-by: Stefan Beller 
> ---
>  run-command.c | 20 ++--
>  1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/run-command.c b/run-command.c
> index 7c00c21..3ae563f 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -1012,13 +1012,21 @@ static void pp_cleanup(struct parallel_processes *pp)
>  
>  static void set_nonblocking(int fd)
>  {
> + static int reported_degrade = 0;
>   int flags = fcntl(fd, F_GETFL);
> - if (flags < 0)
> - warning("Could not get file status flags, "
> - "output will be degraded");
> - else if (fcntl(fd, F_SETFL, flags | O_NONBLOCK))
> - warning("Could not set file status flags, "
> - "output will be degraded");
> + if (flags < 0) {
> + if (!reported_degrade) {
> + warning("Could not get file status flags, "
> + "output will be degraded");
> + reported_degrade = 1;
> + }
> + } else if (fcntl(fd, F_SETFL, flags | O_NONBLOCK)) {
> + if (!reported_degrade) {
> + warning("Could not set file status flags, "
> + "output will be degraded");
> + reported_degrade = 1;
> + }
> + }
>  }

Imagine that we are running two things A and B at the same time.  We
ask poll(2) and it says both A and B have some data ready to be
read, and we try to read from A.  strbuf_read_once() would try to
read up to 8K, relying on the fact that you earlier set the IO to be
nonblock.  It will get stuck reading from A without allowing output
from B to drain.  B's write may get stuck because we are not reading
from it, and would cause B to stop making progress.
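
Schematically, the loop under discussion has this shape (a standalone
sketch, not the actual run-command code):

  #include <poll.h>
  #include <stdio.h>
  #include <unistd.h>

  /* Service every fd that poll() reports readable, once per wakeup. */
  static void service(struct pollfd *pfd, int nr)
  {
          char buf[8192];
          int i;

          if (poll(pfd, nr, -1) < 0)
                  return;
          for (i = 0; i < nr; i++) {
                  size_t filled = 0;

                  if (!(pfd[i].revents & POLLIN))
                          continue;
                  /*
                   * A reader that insists on filling all 8K sleeps here
                   * once the pipe runs dry, unless the fd is O_NONBLOCK,
                   * in which case read() returns -1/EAGAIN and we move
                   * on to the next ready fd instead of starving it.
                   */
                  while (filled < sizeof(buf)) {
                          ssize_t n = read(pfd[i].fd, buf + filled,
                                           sizeof(buf) - filled);
                          if (n <= 0)
                                  break; /* EOF, error, or EAGAIN */
                          filled += (size_t)n;
                  }
                  fwrite(buf, 1, filled, stderr);
          }
  }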

What if the other sides of the connection from A and B are talking
with each other, and B's non-progress caused the processing for A on
the other side of the connection to block, causing it not to produce
more output to allow us to make progress reading from A (so that
eventually we can give B a chance to drain its output)?  Imagine A
and B are pushes to the same remote, B may be pushing a change to a
submodule while A may be pushing a matching change to its
superproject, and the server may be trying to make sure that the
submodule update completes and updates the ref before making the
superproject's tree that binds that updated submodule's commit
available, for example?  Can we make any progress from that point?

I am not convinced that the failure to set nonblock IO is merely
"output will be degraded".  It feels more like a fatal error if we
are driving more than one task at the same time.





--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 00/11] Expose the submodule parallelism to the user

2015-11-04 Thread Junio C Hamano
Stefan Beller  writes:

> So you are saying that reading the Windows cleanup patch
> before the s/deinit/clear/ Patch by Rene makes it way easier to understand?

No.

The run-parallel API added in parallel-fetch needs to be fixed up
with a separate "oops, that was not friendly to Windows" patch
(because the topic is already in 'next'; my bad for merging
prematurely), and that is the primary concern I have for those who
later want to learn how it was designed by going through
"log --reverse -p".

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html