I wrote:
> 1.  I do not think it's possible to get Z_STREAM_END from inflate()
> in astreamer_gzip_decompressor_content, because we don't tell it
> that we've reached end-of-stream.

Nah, I'm mistaken about that.  I forgot that a zlib-compressed stream
has its own internal end-of-stream marker, and that's what triggers
the Z_STREAM_END report.

> 2. What Claude noticed I think, but failed to correct accurately, is
> that astreamer_gzip_decompressor_finalize needs to invoke inflate()
> with Z_FINISH, and pump it until it returns Z_STREAM_END.

I spent some time testing this and could not devise a scenario where
there is leftover data still to be decompressed once
astreamer_gzip_decompressor_content finishes.  Closer reading of
zlib.h says that inflate() won't report all the input data as having
been consumed if it runs out of output buffer space, so in practice
the logic in astreamer_gzip_decompressor_content is fine, except that
we're ignoring some error report codes that we shouldn't.  So I think
we should take the proposed patch hunk

-               if (res == Z_STREAM_ERROR)
-                       pg_fatal("could not decompress data: %s", zs->msg);
+               if (res != Z_OK && res != Z_STREAM_END && res != Z_BUF_ERROR)
+                       pg_fatal("could not decompress data: %s",
+                                        zs->msg ? zs->msg : "unknown error");

but I don't like the other bit

+               /* If we've hit the end of the compressed stream, stop. */
+               if (res == Z_STREAM_END)
+                       break;

Paying attention to the amount of data consumed seems just as
good and takes less code.

> 3. It sure looks to me like astreamer_lz4_decompressor_finalize and
> astreamer_zstd_decompressor_finalize have related bugs.  There's no
> provision in them for flushing any buffered data out of those
> libraries, either.

I'm not quite convinced either way about those two.  Perhaps they
are just as aggressive as zlib about dumping decompressed data,
but they might not be.  I did try instrumenting tar_parser_finalize
and failed to observe any misbehavior in the regression tests, but
that's not a very stressful test.

                        regards, tom lane


Reply via email to