Re: [PATCH 3/3] remote-curl: sanity check ref advertisement from server
Jeff King wrote: > On Sun, Feb 17, 2013 at 03:05:34AM -0800, Jonathan Nieder wrote: >> Jeff King wrote: >>> + if (verify_ref_advertisement(last->buf, last->len) < 0) >>> + die("ref advertisement is invalid at %s", refs_url); >> >> Won't this error out with >> >> protocol error: bad line length character: ERR >> >> instead of the current more helpful behavior for ERR lines? > > I don't think so. Don't ERR lines appear inside their own packets? Yes, I misread get_remote_heads for some reason. Thanks for checking. [...] > The one thing we do also check, though, is that we end with a flush > packet. So depending on what servers produce, it may mean we trigger > this complaint instead of passing the ERR along to fetch-pack. > > Rather than doing this fake syntactic verification, I wonder if we > should simply call get_remote_heads, which does a more thorough check I'm not sure whether servers are expected to send a flush after an ERR packet. The only codepath I know of in git itself that sends such packets is git-daemon, which does not flush after the error (but is not used in the stateless-rpc case). http-backend uses HTTP error codes for its errors. If I am reading get_remote_heads correctly, calling it with the following tweak should work ok. The extra thread is just to feed a string into a fd-based interface and could be avoided for "list", too, if it costs too much. diff --git i/connect.c w/connect.c index 49e56ba3..55ee7de7 100644 --- i/connect.c +++ w/connect.c @@ -68,7 +68,8 @@ struct ref **get_remote_heads(int in, struct ref **list, { int got_at_least_one_head = 0; - *list = NULL; + if (list) + *list = NULL; for (;;) { struct ref *ref; unsigned char old_sha1[20]; @@ -92,6 +93,9 @@ struct ref **get_remote_heads(int in, struct ref **list, die("protocol error: expected sha/ref, got '%s'", buffer); name = buffer + 41; + if (!list) + continue; + name_len = strlen(name); if (len != name_len + 41) { free(server_capabilities); -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] remote-curl: sanity check ref advertisement from server
On Sun, Feb 17, 2013 at 03:05:34AM -0800, Jonathan Nieder wrote: > Jeff King wrote: > > > If the smart HTTP response from the server is truncated for > > any reason, we will get an incomplete ref advertisement. If > > we then feed this incomplete list to "fetch-pack", one of a > > few things may happen: > > > > 1. If the truncation is in a packet header, fetch-pack > > will notice the bogus line and complain. > > > > 2. If the truncation is inside a packet, fetch-pack will > > keep waiting for us to send the rest of the packet, > > which we never will. > > Mostly harmless since the operator could hit ^C, but still unpleasant. Fetching is not always interactive. The deadlock I ran into (and again, I am not sure if this fixes it or not, but it is _a_ deadlock) was on a server farm doing a large number of "fetch && checkout && deploy" operations. Only some of them hung, but it took a while to figure out what was going on. > [...] > > This fortunately doesn't happen in the normal fetching > > workflow, because git-fetch first uses the "list" command, > > which feeds the refs to get_remote_heads, which does notice > > the error. However, you can trigger it by sending a direct > > "fetch" to the remote-curl helper. > > Ah. Would a test for this make sense? A test would be great, if you can devise a way to reliably produce truncated git output (but still valid http output). In the real-world problem I had, I believe the truncation was caused by an intermediate reverse proxy that hit a timeout. I simulated truncation by using netcat to replay munged http headers and git output. I suspect the simplest portable thing would be a static file of truncated git output, served by apache, which would need custom configuration to serve it with the correct content-type header. It seemed like a lot of test infrastructure to check for a very specific thing, so I abandoned trying to make a test. > > + if (verify_ref_advertisement(last->buf, last->len) < 0) > > + die("ref advertisement is invalid at %s", refs_url); > > Won't this error out with > > protocol error: bad line length character: ERR > > instead of the current more helpful behavior for ERR lines? I don't think so. Don't ERR lines appear inside their own packets? We are just verifying that our packets are syntactically correct here, and my reading of get_remote_heads is that the ERR appears inside the packetized data. The one thing we do also check, though, is that we end with a flush packet. So depending on what servers produce, it may mean we trigger this complaint instead of passing the ERR along to fetch-pack. Rather than doing this fake syntactic verification, I wonder if we should simply call get_remote_heads, which does a more thorough check (and is what we _would_ call in the list case, and what fetch-pack will call once we pass data to it). It's slightly less efficient, in that it starts a new thread and actually builds the linked list of refs. But it probably isn't that big a deal (and normal operation does a "list" first which does that _anyway_). > Same stylistic comment about "what would it mean for the return value > to be positive?" as in patch 2/3. Same response. :) -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] remote-curl: sanity check ref advertisement from server
Jeff King wrote: > If the smart HTTP response from the server is truncated for > any reason, we will get an incomplete ref advertisement. If > we then feed this incomplete list to "fetch-pack", one of a > few things may happen: > > 1. If the truncation is in a packet header, fetch-pack > will notice the bogus line and complain. > > 2. If the truncation is inside a packet, fetch-pack will > keep waiting for us to send the rest of the packet, > which we never will. Mostly harmless since the operator could hit ^C, but still unpleasant. [...] > This fortunately doesn't happen in the normal fetching > workflow, because git-fetch first uses the "list" command, > which feeds the refs to get_remote_heads, which does notice > the error. However, you can trigger it by sending a direct > "fetch" to the remote-curl helper. Ah. Would a test for this make sense? [...] > --- a/remote-curl.c > +++ b/remote-curl.c [...] > @@ -174,6 +183,9 @@ static struct discovery* discover_refs(const char > *service) > die("smart-http metadata lines are invalid at %s", > refs_url); > > + if (verify_ref_advertisement(last->buf, last->len) < 0) > + die("ref advertisement is invalid at %s", refs_url); Won't this error out with protocol error: bad line length character: ERR instead of the current more helpful behavior for ERR lines? Same stylistic comment about "what would it mean for the return value to be positive?" as in patch 2/3. Aside from those two details, the idea looks sane, though. Good catch, and thanks for a pleasant read. Good night, Jonathan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/3] remote-curl: sanity check ref advertisement from server
If the smart HTTP response from the server is truncated for any reason, we will get an incomplete ref advertisement. If we then feed this incomplete list to "fetch-pack", one of a few things may happen: 1. If the truncation is in a packet header, fetch-pack will notice the bogus line and complain. 2. If the truncation is inside a packet, fetch-pack will keep waiting for us to send the rest of the packet, which we never will. 3. If the truncation is at a packet boundary, fetch-pack will keep waiting for us to send the next packet, which we never will. As a result, fetch-pack hangs, waiting for input. However, remote-curl believes it has sent all of the advertisement, and therefore waits for fetch-pack to speak. The two processes end up in a deadlock. This fortunately doesn't happen in the normal fetching workflow, because git-fetch first uses the "list" command, which feeds the refs to get_remote_heads, which does notice the error. However, you can trigger it by sending a direct "fetch" to the remote-curl helper. We can make this more robust by verifying that the packet stream we got from the server does indeed parse correctly and ends with a flush packet, which means that what fetch-pack receives will at least be syntactically correct. The normal non-stateless-rpc case does not have to deal with this problem; it detects a truncation by getting EOF on the file descriptor before it has read all data. So it is tempting to think that we can solve this by closing the descriptor after relaying the server's advertisement. Unfortunately, in the stateless rpc case, we need to keep the descriptor to fetch-pack open in order to pass more data to it. We could solve that by using two descriptors, but our run-command interface does not support that (and modifying it to create more pipes would make life hard for the Windows port of git). Signed-off-by: Jeff King --- remote-curl.c | 12 1 file changed, 12 insertions(+) diff --git a/remote-curl.c b/remote-curl.c index 73134f5..c7647c7 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -101,6 +101,15 @@ static int read_packets_until_flush(char **buf, size_t *len) } } +static int verify_ref_advertisement(char *buf, size_t len) +{ + /* +* Our function parameters are copies, so we do not +* have to care that read_packets will increment our pointers. +*/ + return read_packets_until_flush(&buf, &len); +} + static struct discovery* discover_refs(const char *service) { struct strbuf exp = STRBUF_INIT; @@ -174,6 +183,9 @@ static struct discovery* discover_refs(const char *service) die("smart-http metadata lines are invalid at %s", refs_url); + if (verify_ref_advertisement(last->buf, last->len) < 0) + die("ref advertisement is invalid at %s", refs_url); + last->proto_git = 1; } -- 1.8.1.2.11.g1a2f572 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html