Re: [PATCH] packfile: Correct zlib buffer handling
Hi, Sorry about the delay here (bit of a mix-up and didn't reply to the list). (see inline ) On Sun, May 27, 2018 at 9:41 PM, Junio C Hamano wrote: > Junio C Hamano writes: > >> Duy Nguyen writes: >> >>> On Sun, May 27, 2018 at 1:57 AM, Junio C Hamano wrote: (trimming) > > Specifically, I was worried about this assertion: > > Lets rely on the fact that the source buffer will only be fully > consumed when the when the destination buffer is inflated to the > correct size. > > which I think is the exact bad thinking that caused troubles for us > in the past; isn't the explanation in 456cdf6e ("Fix loose object > uncompression check.", 2007-03-19) relevant here? > > - stream.avail_out = size + 1; > + stream.avail_out = size; > ... > stream.next_in = in; > st = git_inflate(, Z_FINISH); > if (!stream.avail_out) > - break; /* the payload is larger than it should be */ > + break; /* done, st indicates if source fully consumed > */ > curpos += stream.next_in - in; > } while (st == Z_OK || st == Z_BUF_ERROR); > git_inflate_end(); > if ((st != Z_STREAM_END) || stream.total_out != size) { > free(buffer); > return NULL; > } > > With minimum stream.avail_out without slack, when !avail_out, i.e. > when we fully filled the output buffer, it could be that we had > correct input that deflates to the correct size, in which case we > are happy---st would say Z_STREAM_END, we would leave the loop > because it is neither OK nor BUF_ERROR, and total_out would report > the size we expected. Or the input zlib stream may have ended with > bytes that express "this concludes the stream", and the input bytes > before that was sufficient to construct the original payload fully, > and we may have just fed the bytes before that "this concludes the > stream" to git_inflate(). > > In such a case, we haven't consumed all the avail_in. We may > already have all the correct output, i.e. !avail_out, but because we > haven't consumed the "this concludes the stream", st is not > STREAM_END in such a case. If I understand correctly your concerned the avail_in is longer than what is required to fill the output buffer.. I'm fairly sure that won't result in a Z_STREAM_END, as you rightfully point out, but the loop _will_ terminate due to the output buffer being full and then since its not Z_STREAM_END the unpack_compressed_entry fails, as it should. > > Our existing while() loop, with one-byte slack in avail_out, would > have let us continue and the next iteration of the loop would have > consumed the input without producing any more output (i.e. avail_out > would have been left to 1 in both of these final two rounds) and we > would have exited the loop. After calling inflate_end(), we would > have noticed STREAM_END and correct size and we would have been > happy. Your assuming that zlib will terminate with an error, but a fully decompressed buffer, because it hasn't consumed the entire input buffer. I don't think that is how it works (its not how the documentation is written, nor the bits of code i've looked at seem to work, which granted i'm not a zlib maintainer). > > The updated code would handle this latter case rather badly, no? We > leave the loop early, notice st is not STREAM_END, and be very > unhappy, because this patch did not give us to consume the very end > of the input stream and left the loop early. Your correct if the above case is a valid zlib behavior then there would be a problem. But, I don't think the termination is dicated by insufficient output space until there is an attempt to utilize that space. > >>> This yields two problems, first a single byte overrun won't be detected >>> properly because the Z_STREAM_END will then be set, but the null >>> terminator will have been overwritten. > > Because we compare total_out and size at the end, we would detect it > as an error in this function, no? Then zlib overwriting NUL would > not be a problem, as we would free the buffer and return NULL, no? > >>> The other problem is that >>> more recent zlib patches have been poisoning the unconsumed portions >>> of the buffers which also overwrites the null, while correctly >>> returning length and status. > > Isn't that a bug in zlib, though? Or do they do that deliberately? > > I think a workaround with lower impact would be to manually restore > NUL at the end of the buffer. I agree, just resetting the NULL its likely safer, and I will repost a patch soon which if nothing else makes git more robust to errant zlib behavior.
Re: [PATCH] packfile: Correct zlib buffer handling
Junio C Hamanowrites: > Duy Nguyen writes: > >> On Sun, May 27, 2018 at 1:57 AM, Junio C Hamano wrote: >>> Duy Nguyen writes: >>> On Sat, May 26, 2018 at 12:56 AM, Jeremy Linton wrote: > @@ -1416,7 +1416,7 @@ static void *unpack_compressed_entry(struct > packed_git *p, > return NULL; > memset(, 0, sizeof(stream)); > stream.next_out = buffer; > - stream.avail_out = size + 1; > + stream.avail_out = size; You may want to include in your commit message a reference to 39eea7bdd9 (Fix incorrect error check while reading deflated pack data - 2009-10-21) which adds this plus one with a fascinating story behind. >>> >>> A bit puzzled---are you saying that this recent patch breaks the old >>> fix and must be done in some other way? >> >> No. I actually wanted to answer that question when I tried to track >> down the commit that adds " + 1" but I did not spend enough time to >> understand the old problem. I guess your puzzle means you didn't think >> it would break anything, which is good. > > No it merely means I am puzzled how the posted patch that goes > directly opposite to what an earlier "fix" did is a correct solution > to anything X-<. Specifically, I was worried about this assertion: Lets rely on the fact that the source buffer will only be fully consumed when the when the destination buffer is inflated to the correct size. which I think is the exact bad thinking that caused troubles for us in the past; isn't the explanation in 456cdf6e ("Fix loose object uncompression check.", 2007-03-19) relevant here? - stream.avail_out = size + 1; + stream.avail_out = size; ... stream.next_in = in; st = git_inflate(, Z_FINISH); if (!stream.avail_out) - break; /* the payload is larger than it should be */ + break; /* done, st indicates if source fully consumed */ curpos += stream.next_in - in; } while (st == Z_OK || st == Z_BUF_ERROR); git_inflate_end(); if ((st != Z_STREAM_END) || stream.total_out != size) { free(buffer); return NULL; } With minimum stream.avail_out without slack, when !avail_out, i.e. when we fully filled the output buffer, it could be that we had correct input that deflates to the correct size, in which case we are happy---st would say Z_STREAM_END, we would leave the loop because it is neither OK nor BUF_ERROR, and total_out would report the size we expected. Or the input zlib stream may have ended with bytes that express "this concludes the stream", and the input bytes before that was sufficient to construct the original payload fully, and we may have just fed the bytes before that "this concludes the stream" to git_inflate(). In such a case, we haven't consumed all the avail_in. We may already have all the correct output, i.e. !avail_out, but because we haven't consumed the "this concludes the stream", st is not STREAM_END in such a case. Our existing while() loop, with one-byte slack in avail_out, would have let us continue and the next iteration of the loop would have consumed the input without producing any more output (i.e. avail_out would have been left to 1 in both of these final two rounds) and we would have exited the loop. After calling inflate_end(), we would have noticed STREAM_END and correct size and we would have been happy. The updated code would handle this latter case rather badly, no? We leave the loop early, notice st is not STREAM_END, and be very unhappy, because this patch did not give us to consume the very end of the input stream and left the loop early. >> This yields two problems, first a single byte overrun won't be detected >> properly because the Z_STREAM_END will then be set, but the null >> terminator will have been overwritten. Because we compare total_out and size at the end, we would detect it as an error in this function, no? Then zlib overwriting NUL would not be a problem, as we would free the buffer and return NULL, no? >> The other problem is that >> more recent zlib patches have been poisoning the unconsumed portions >> of the buffers which also overwrites the null, while correctly >> returning length and status. Isn't that a bug in zlib, though? Or do they do that deliberately? I think a workaround with lower impact would be to manually restore NUL at the end of the buffer.
Re: [PATCH] packfile: Correct zlib buffer handling
Duy Nguyenwrites: > On Sun, May 27, 2018 at 1:57 AM, Junio C Hamano wrote: >> Duy Nguyen writes: >> >>> On Sat, May 26, 2018 at 12:56 AM, Jeremy Linton >>> wrote: @@ -1416,7 +1416,7 @@ static void *unpack_compressed_entry(struct packed_git *p, return NULL; memset(, 0, sizeof(stream)); stream.next_out = buffer; - stream.avail_out = size + 1; + stream.avail_out = size; >>> >>> You may want to include in your commit message a reference to >>> 39eea7bdd9 (Fix incorrect error check while reading deflated pack data >>> - 2009-10-21) which adds this plus one with a fascinating story >>> behind. >> >> A bit puzzled---are you saying that this recent patch breaks the old >> fix and must be done in some other way? > > No. I actually wanted to answer that question when I tried to track > down the commit that adds " + 1" but I did not spend enough time to > understand the old problem. I guess your puzzle means you didn't think > it would break anything, which is good. No it merely means I am puzzled how the posted patch that goes directly opposite to what an earlier "fix" did is a correct solution to anything X-<.
Re: [PATCH] packfile: Correct zlib buffer handling
On Sun, May 27, 2018 at 1:57 AM, Junio C Hamanowrote: > Duy Nguyen writes: > >> On Sat, May 26, 2018 at 12:56 AM, Jeremy Linton >> wrote: >>> @@ -1416,7 +1416,7 @@ static void *unpack_compressed_entry(struct >>> packed_git *p, >>> return NULL; >>> memset(, 0, sizeof(stream)); >>> stream.next_out = buffer; >>> - stream.avail_out = size + 1; >>> + stream.avail_out = size; >> >> You may want to include in your commit message a reference to >> 39eea7bdd9 (Fix incorrect error check while reading deflated pack data >> - 2009-10-21) which adds this plus one with a fascinating story >> behind. > > A bit puzzled---are you saying that this recent patch breaks the old > fix and must be done in some other way? No. I actually wanted to answer that question when I tried to track down the commit that adds " + 1" but I did not spend enough time to understand the old problem. I guess your puzzle means you didn't think it would break anything, which is good. -- Duy
Re: [PATCH] packfile: Correct zlib buffer handling
Duy Nguyenwrites: > On Sat, May 26, 2018 at 12:56 AM, Jeremy Linton > wrote: >> @@ -1416,7 +1416,7 @@ static void *unpack_compressed_entry(struct packed_git >> *p, >> return NULL; >> memset(, 0, sizeof(stream)); >> stream.next_out = buffer; >> - stream.avail_out = size + 1; >> + stream.avail_out = size; > > You may want to include in your commit message a reference to > 39eea7bdd9 (Fix incorrect error check while reading deflated pack data > - 2009-10-21) which adds this plus one with a fascinating story > behind. A bit puzzled---are you saying that this recent patch breaks the old fix and must be done in some other way?
Re: [PATCH] packfile: Correct zlib buffer handling
On Sat, May 26, 2018 at 12:56 AM, Jeremy Lintonwrote: > @@ -1416,7 +1416,7 @@ static void *unpack_compressed_entry(struct packed_git > *p, > return NULL; > memset(, 0, sizeof(stream)); > stream.next_out = buffer; > - stream.avail_out = size + 1; > + stream.avail_out = size; You may want to include in your commit message a reference to 39eea7bdd9 (Fix incorrect error check while reading deflated pack data - 2009-10-21) which adds this plus one with a fascinating story behind. -- Duy
Re: [PATCH] packfile: Correct zlib buffer handling
Jeremy Linton wrote: > The buffer being passed to zlib includes a null terminator that > git needs to keep in place. unpack_compressed_entry() attempts to > detect the case that the source buffer hasn't been fully consumed > by checking to see if the destination buffer has been over consumed. > > This yields two problems, first a single byte overrun won't be detected > properly because the Z_STREAM_END will then be set, but the null > terminator will have been overwritten. The other problem is that > more recent zlib patches have been poisoning the unconsumed portions > of the buffers which also overwrites the null, while correctly > returning length and status. > > Lets rely on the fact that the source buffer will only be fully > consumed when the when the destination buffer is inflated to the > correct size. We can do this by passing zlib the correct buffer size > and properly checking the return status. The latter check actually > already exists if the buffer size is correct. > > Signed-off-by: Jeremy Linton> --- As a little background, earlier today Pavel Cahyna filed a ticket about a regression in a recent zlib update on aarch64 in Fedora[1]. While he was doing that I was just beginning to look at why the git test suite failed fairly badly a build last night[2]. The aarch64 build on Fedora 28 failed, while it succeeded on all other architectures (armv7hl, i686, ppc64, ppc64le, s390x, x86_64). It also passed on newer and older Fedora releases. The difference was that the Fedora 28 zlib build has some aarch64 optimization patches applied[3]. (Those patches are in rawhide/f29 as well, but due to an unrelated issue have not yet made it into the buildroot used for the git build.) I'm woefully unqualified to comment on the patch, but if there are any questions about how this was found, I hope the above background will be helpful. A big thanks to Jeremy for dropping whatever he had planned to do today and dig into this issue. I can only hope it was either more fun or less work than what he hoped to do with his Friday. :) Thanks also to Pavel for finding the source of the failures and filing a detailed bug report to get things moving. [1] https://bugzilla.redhat.com/1582555 [2] https://fedorapeople.org/~tmz/git-aarch64-make-test [3] https://src.fedoraproject.org/rpms/zlib/c/25e9802 > packfile.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/packfile.c b/packfile.c > index 7c1a2519f..245eb3204 100644 > --- a/packfile.c > +++ b/packfile.c > @@ -1416,7 +1416,7 @@ static void *unpack_compressed_entry(struct packed_git > *p, > return NULL; > memset(, 0, sizeof(stream)); > stream.next_out = buffer; > - stream.avail_out = size + 1; > + stream.avail_out = size; > > git_inflate_init(); > do { > @@ -1424,7 +1424,7 @@ static void *unpack_compressed_entry(struct packed_git > *p, > stream.next_in = in; > st = git_inflate(, Z_FINISH); > if (!stream.avail_out) > - break; /* the payload is larger than it should be */ > + break; /* done, st indicates if source fully consumed */ > curpos += stream.next_in - in; > } while (st == Z_OK || st == Z_BUF_ERROR); > git_inflate_end(); -- Todd ~~ An election is coming. Universal peace is declared and the foxes have a sincere interest in prolonging the lives of the poultry. -- T.S. Eliot
Re: [PATCH] packfile: Correct zlib buffer handling
On Fri, May 25, 2018 at 7:17 PM, Jeremy Lintonwrote: > The buffer being passed to zlib includes a null terminator that > git needs to keep in place. unpack_compressed_entry() attempts to > detect the case that the source buffer hasn't been fully consumed > by checking to see if the destination buffer has been over consumed. > > This yields two problems, first a single byte overrun won't be detected > properly because the Z_STREAM_END will then be set, but the null > terminator will have been overwritten. The other problem is that > more recent zlib patches have been poisoning the unconsumed portions > of the buffers which also overwrites the null, while correctly > returning length and status. > > Lets rely on the fact that the source buffer will only be fully s/Lets/Let's/ > consumed when the when the destination buffer is inflated to the s/when the when the/when the/ > correct size. We can do this by passing zlib the correct buffer size > and properly checking the return status. The latter check actually > already exists if the buffer size is correct. > > Signed-off-by: Jeremy Linton
[PATCH] packfile: Correct zlib buffer handling
The buffer being passed to zlib includes a null terminator that git needs to keep in place. unpack_compressed_entry() attempts to detect the case that the source buffer hasn't been fully consumed by checking to see if the destination buffer has been over consumed. This yields two problems, first a single byte overrun won't be detected properly because the Z_STREAM_END will then be set, but the null terminator will have been overwritten. The other problem is that more recent zlib patches have been poisoning the unconsumed portions of the buffers which also overwrites the null, while correctly returning length and status. Lets rely on the fact that the source buffer will only be fully consumed when the when the destination buffer is inflated to the correct size. We can do this by passing zlib the correct buffer size and properly checking the return status. The latter check actually already exists if the buffer size is correct. Signed-off-by: Jeremy Linton--- packfile.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/packfile.c b/packfile.c index 7c1a2519f..245eb3204 100644 --- a/packfile.c +++ b/packfile.c @@ -1416,7 +1416,7 @@ static void *unpack_compressed_entry(struct packed_git *p, return NULL; memset(, 0, sizeof(stream)); stream.next_out = buffer; - stream.avail_out = size + 1; + stream.avail_out = size; git_inflate_init(); do { @@ -1424,7 +1424,7 @@ static void *unpack_compressed_entry(struct packed_git *p, stream.next_in = in; st = git_inflate(, Z_FINISH); if (!stream.avail_out) - break; /* the payload is larger than it should be */ + break; /* done, st indicates if source fully consumed */ curpos += stream.next_in - in; } while (st == Z_OK || st == Z_BUF_ERROR); git_inflate_end(); -- 2.13.6
[PATCH] packfile: Correct zlib buffer handling
The buffer being passed to zlib includes a null terminator that git needs to keep in place. unpack_compressed_entry() attempts to detect the case that the source buffer hasn't been fully consumed by checking to see if the destination buffer has been over consumed. This yields two problems, first a single byte overrun won't be detected properly because the Z_STREAM_END will then be set, but the null terminator will have been overwritten. The other problem is that more recent zlib patches have been poisoning the unconsumed portions of the buffers which also overwrites the null, while correctly returning length and status. Lets rely on the fact that the source buffer will only be fully consumed when the when the destination buffer is inflated to the correct size. We can do this by passing zlib the correct buffer size and properly checking the return status. The latter check actually already exists if the buffer size is correct. Signed-off-by: Jeremy Linton--- packfile.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/packfile.c b/packfile.c index 7c1a2519f..245eb3204 100644 --- a/packfile.c +++ b/packfile.c @@ -1416,7 +1416,7 @@ static void *unpack_compressed_entry(struct packed_git *p, return NULL; memset(, 0, sizeof(stream)); stream.next_out = buffer; - stream.avail_out = size + 1; + stream.avail_out = size; git_inflate_init(); do { @@ -1424,7 +1424,7 @@ static void *unpack_compressed_entry(struct packed_git *p, stream.next_in = in; st = git_inflate(, Z_FINISH); if (!stream.avail_out) - break; /* the payload is larger than it should be */ + break; /* done, st indicates if source fully consumed */ curpos += stream.next_in - in; } while (st == Z_OK || st == Z_BUF_ERROR); git_inflate_end(); -- 2.13.6