Re: [PATCH 3/3] remote-curl: sanity check ref advertisement from server

2013-02-17 Thread Jonathan Nieder
Jeff King wrote:
> On Sun, Feb 17, 2013 at 03:05:34AM -0800, Jonathan Nieder wrote:
>> Jeff King wrote:

>>> +   if (verify_ref_advertisement(last->buf, last->len) < 0)
>>> +   die("ref advertisement is invalid at %s", refs_url);
>>
>> Won't this error out with
>>
>>  protocol error: bad line length character: ERR
>>
>> instead of the current more helpful behavior for ERR lines?
>
> I don't think so. Don't ERR lines appear inside their own packets?

Yes, I misread get_remote_heads for some reason.  Thanks for checking.

[...]
> The one thing we do also check, though, is that we end with a flush
> packet. So depending on what servers produce, it may mean we trigger
> this complaint instead of passing the ERR along to fetch-pack.
>
> Rather than doing this fake syntactic verification, I wonder if we
> should simply call get_remote_heads, which does a more thorough check

I'm not sure whether servers are expected to send a flush after an
ERR packet.  The only codepath I know of in git itself that sends
such packets is git-daemon, which does not flush after the error (but
is not used in the stateless-rpc case).  http-backend uses HTTP error
codes for its errors.

If I am reading get_remote_heads correctly, calling it with the
following tweak should work ok.  The extra thread is just to feed a
string into a fd-based interface and could be avoided for "list", too,
if it costs too much.

diff --git i/connect.c w/connect.c
index 49e56ba3..55ee7de7 100644
--- i/connect.c
+++ w/connect.c
@@ -68,7 +68,8 @@ struct ref **get_remote_heads(int in, struct ref **list,
 {
int got_at_least_one_head = 0;
 
-   *list = NULL;
+   if (list)
+   *list = NULL;
for (;;) {
struct ref *ref;
unsigned char old_sha1[20];
@@ -92,6 +93,9 @@ struct ref **get_remote_heads(int in, struct ref **list,
die("protocol error: expected sha/ref, got '%s'", 
buffer);
name = buffer + 41;
 
+   if (!list)
+   continue;
+
name_len = strlen(name);
if (len != name_len + 41) {
free(server_capabilities);
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] remote-curl: sanity check ref advertisement from server

2013-02-17 Thread Jeff King
On Sun, Feb 17, 2013 at 03:05:34AM -0800, Jonathan Nieder wrote:

> Jeff King wrote:
> 
> > If the smart HTTP response from the server is truncated for
> > any reason, we will get an incomplete ref advertisement. If
> > we then feed this incomplete list to "fetch-pack", one of a
> > few things may happen:
> >
> >   1. If the truncation is in a packet header, fetch-pack
> >  will notice the bogus line and complain.
> >
> >   2. If the truncation is inside a packet, fetch-pack will
> >  keep waiting for us to send the rest of the packet,
> >  which we never will.
> 
> Mostly harmless since the operator could hit ^C, but still unpleasant.

Fetching is not always interactive.  The deadlock I ran into (and again,
I am not sure if this fixes it or not, but it is _a_ deadlock) was on a
server farm doing a large number of "fetch && checkout && deploy"
operations. Only some of them hung, but it took a while to figure out
what was going on.

> [...]
> > This fortunately doesn't happen in the normal fetching
> > workflow, because git-fetch first uses the "list" command,
> > which feeds the refs to get_remote_heads, which does notice
> > the error. However, you can trigger it by sending a direct
> > "fetch" to the remote-curl helper.
> 
> Ah.  Would a test for this make sense?

A test would be great, if you can devise a way to reliably produce
truncated git output (but still valid http output). In the real-world
problem I had, I believe the truncation was caused by an intermediate
reverse proxy that hit a timeout. I simulated truncation by using netcat
to replay munged http headers and git output.

I suspect the simplest portable thing would be a static file of
truncated git output, served by apache, which would need custom
configuration to serve it with the correct content-type header. It
seemed like a lot of test infrastructure to check for a very specific
thing, so I abandoned trying to make a test.

> > +   if (verify_ref_advertisement(last->buf, last->len) < 0)
> > +   die("ref advertisement is invalid at %s", refs_url);
> 
> Won't this error out with
> 
>   protocol error: bad line length character: ERR
> 
> instead of the current more helpful behavior for ERR lines?

I don't think so. Don't ERR lines appear inside their own packets? We
are just verifying that our packets are syntactically correct here, and
my reading of get_remote_heads is that the ERR appears inside the
packetized data.

The one thing we do also check, though, is that we end with a flush
packet. So depending on what servers produce, it may mean we trigger
this complaint instead of passing the ERR along to fetch-pack.

Rather than doing this fake syntactic verification, I wonder if we
should simply call get_remote_heads, which does a more thorough check
(and is what we _would_ call in the list case, and what fetch-pack will
call once we pass data to it). It's slightly less efficient, in that it
starts a new thread and actually builds the linked list of refs. But it
probably isn't that big a deal (and normal operation does a "list" first
which does that _anyway_).

> Same stylistic comment about "what would it mean for the return value
> to be positive?" as in patch 2/3.

Same response. :)

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] remote-curl: sanity check ref advertisement from server

2013-02-17 Thread Jonathan Nieder
Jeff King wrote:

> If the smart HTTP response from the server is truncated for
> any reason, we will get an incomplete ref advertisement. If
> we then feed this incomplete list to "fetch-pack", one of a
> few things may happen:
>
>   1. If the truncation is in a packet header, fetch-pack
>  will notice the bogus line and complain.
>
>   2. If the truncation is inside a packet, fetch-pack will
>  keep waiting for us to send the rest of the packet,
>  which we never will.

Mostly harmless since the operator could hit ^C, but still unpleasant.

[...]
> This fortunately doesn't happen in the normal fetching
> workflow, because git-fetch first uses the "list" command,
> which feeds the refs to get_remote_heads, which does notice
> the error. However, you can trigger it by sending a direct
> "fetch" to the remote-curl helper.

Ah.  Would a test for this make sense?

[...]
> --- a/remote-curl.c
> +++ b/remote-curl.c
[...]
> @@ -174,6 +183,9 @@ static struct discovery* discover_refs(const char 
> *service)
>   die("smart-http metadata lines are invalid at %s",
>   refs_url);
>  
> + if (verify_ref_advertisement(last->buf, last->len) < 0)
> + die("ref advertisement is invalid at %s", refs_url);

Won't this error out with

protocol error: bad line length character: ERR

instead of the current more helpful behavior for ERR lines?

Same stylistic comment about "what would it mean for the return value
to be positive?" as in patch 2/3.

Aside from those two details, the idea looks sane, though.  Good
catch, and thanks for a pleasant read.

Good night,
Jonathan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] remote-curl: sanity check ref advertisement from server

2013-02-15 Thread Jeff King
If the smart HTTP response from the server is truncated for
any reason, we will get an incomplete ref advertisement. If
we then feed this incomplete list to "fetch-pack", one of a
few things may happen:

  1. If the truncation is in a packet header, fetch-pack
 will notice the bogus line and complain.

  2. If the truncation is inside a packet, fetch-pack will
 keep waiting for us to send the rest of the packet,
 which we never will.

  3. If the truncation is at a packet boundary, fetch-pack
 will keep waiting for us to send the next packet, which
 we never will.

As a result, fetch-pack hangs, waiting for input. However,
remote-curl believes it has sent all of the advertisement,
and therefore waits for fetch-pack to speak. The two
processes end up in a deadlock.

This fortunately doesn't happen in the normal fetching
workflow, because git-fetch first uses the "list" command,
which feeds the refs to get_remote_heads, which does notice
the error. However, you can trigger it by sending a direct
"fetch" to the remote-curl helper.

We can make this more robust by verifying that the packet
stream we got from the server does indeed parse correctly
and ends with a flush packet, which means that what
fetch-pack receives will at least be syntactically correct.

The normal non-stateless-rpc case does not have to deal with
this problem; it detects a truncation by getting EOF on the
file descriptor before it has read all data. So it is
tempting to think that we can solve this by closing the
descriptor after relaying the server's advertisement.
Unfortunately, in the stateless rpc case, we need to keep
the descriptor to fetch-pack open in order to pass more data
to it.

We could solve that by using two descriptors, but our
run-command interface does not support that (and modifying
it to create more pipes would make life hard for the Windows
port of git).

Signed-off-by: Jeff King 
---
 remote-curl.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/remote-curl.c b/remote-curl.c
index 73134f5..c7647c7 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -101,6 +101,15 @@ static int read_packets_until_flush(char **buf, size_t 
*len)
}
 }
 
+static int verify_ref_advertisement(char *buf, size_t len)
+{
+   /*
+* Our function parameters are copies, so we do not
+* have to care that read_packets will increment our pointers.
+*/
+   return read_packets_until_flush(&buf, &len);
+}
+
 static struct discovery* discover_refs(const char *service)
 {
struct strbuf exp = STRBUF_INIT;
@@ -174,6 +183,9 @@ static struct discovery* discover_refs(const char *service)
die("smart-http metadata lines are invalid at %s",
refs_url);
 
+   if (verify_ref_advertisement(last->buf, last->len) < 0)
+   die("ref advertisement is invalid at %s", refs_url);
+
last->proto_git = 1;
}
 
-- 
1.8.1.2.11.g1a2f572
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html