This is an automated email from the ASF dual-hosted git repository.

maxyang pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/cloudberry.git
commit 94dddae995a956ef2debd517b89feefabd83c269
Author: hyongtao-db <[email protected]>
AuthorDate: Mon Jul 4 11:26:51 2022 +0800

master: fix gpfdist crash (#13750)

Long log:
When compressed data is written to a full disk, gpfdist returns a 500 error.
After 5 minutes, the session is removed. If data is then written to the
full disk again, gpfdist aborts with:
```
gz_file_write_one_chunk: Assertion `ret1 != (-2)' failed.
```
Users do not expect this, because gpfdist can still serve other requests.

We should not keep assert(ret1 != Z_STREAM_ERROR). If the error occurs,
we check for it and return directly, so that gpfdist does not abort.

The log is shown below:
```
2022-06-27 18:10:55 6837 INFO active segids in session: 0 1 2
2022-06-27 18:10:55 6837 WARN [1:6:1:11] handle_post_request, write error: cannot write into file
2022-06-27 18:10:55 6837 WARN [1:6:1:11] HTTP ERROR: 127.0.0.1 - 500 cannot write into file
2022-06-27 18:10:55 6837 WARN [1:6:1:11] gpfdist read unexpected data after shutdown
// .....................
2022-06-27 18:16:49 6837 INFO remove sessions
2022-06-27 18:16:49 6837 INFO remove out-dated session 812-0000000037.25.0.0:/home/gpadmin/core_analysis/load_files/test_file/demo_str.txt.gz
2022-06-27 18:16:49 6837 INFO free session 812-0000000037.25.0.0:/home/gpadmin/core_analysis/load_files/test_file/demo_str.txt.gz
gpfdist: gfile.c:376: gz_file_write_one_chunk: Assertion `ret1 != (-2)' failed.
```

Signed-off-by: Yongtao Huang<[email protected]>
---
 src/backend/utils/misc/fstream/gfile.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/misc/fstream/gfile.c b/src/backend/utils/misc/fstream/gfile.c
index 2183d7407e..9c63eed6e5 100644
--- a/src/backend/utils/misc/fstream/gfile.c
+++ b/src/backend/utils/misc/fstream/gfile.c
@@ -382,7 +382,11 @@ gz_file_write_one_chunk(gfile_t *fd, int do_flush)
 		z->s.avail_out = COMPRESSION_BUFFER_SIZE;
 		z->s.next_out = z->out;
 		ret1 = deflate(&(z->s), do_flush);    /* no bad return value */
-		assert(ret1 != Z_STREAM_ERROR);  /* state not clobbered */
+		if (ret1 == Z_STREAM_ERROR)
+		{
+			gfile_printf_then_putc_newline("the gz file is unrepaired, stop writing");
+			return -1;
+		}
 		have = COMPRESSION_BUFFER_SIZE - z->s.avail_out;
 
 		if ( write_and_retry(fd, z->out, have) != have )
@@ -391,6 +395,7 @@ gz_file_write_one_chunk(gfile_t *fd, int do_flush)
 			 * presently gfile_close calls gz_file_close only for the on_write case so we don't need
 			 * to handle inflateEnd here
 			 */
+			gfile_printf_then_putc_newline("failed to write, the stream ends");
 			(void)deflateEnd(&(z->s));
 			ret = -1;
 			break;
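
The diff above replaces an assertion with an error return. The following is a minimal, self-contained sketch of that pattern in C, not the gpfdist code itself. It does not use zlib: every name in it (`sketch_stream`, `sketch_compress`, `sketch_end`, `sketch_write_one_chunk`, `SKETCH_STREAM_ERROR`) is a hypothetical stand-in. It also assumes that the stream error arises because the stream was already ended after an earlier failed write, which the diff suggests but the commit message does not state outright.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for compressor return codes; not the zlib API. */
enum { SKETCH_OK = 0, SKETCH_STREAM_ERROR = -2 };

/* Hypothetical compressor handle: `live` is cleared once the stream is ended. */
typedef struct { int live; } sketch_stream;

/* Stand-in for one compress step: reports a stream error if the stream was ended. */
static int sketch_compress(const sketch_stream *s)
{
    return s->live ? SKETCH_OK : SKETCH_STREAM_ERROR;
}

/* Stand-in for tearing the stream down after an unrecoverable write failure. */
static void sketch_end(sketch_stream *s)
{
    s->live = 0;
}

/*
 * Write one chunk. Returns 0 on success and -1 on failure.
 * `disk_full` simulates a failed write to the output file.
 */
static int sketch_write_one_chunk(sketch_stream *s, int disk_full)
{
    /* A NULL handle is a programming error, so an assertion is still appropriate. */
    assert(s != NULL);

    /* A dead stream is a runtime condition: report it instead of aborting. */
    if (sketch_compress(s) == SKETCH_STREAM_ERROR) {
        fprintf(stderr, "stream is unusable, stop writing\n");
        return -1;
    }

    if (disk_full) {
        fprintf(stderr, "failed to write, the stream ends\n");
        sketch_end(s);
        return -1;
    }
    return 0;
}
```

In this sketch, a first call with `disk_full` set ends the stream and returns -1; a later call on the same handle also returns -1 rather than terminating the process, so a server built this way could keep handling other requests. The assertion is kept only for the NULL check, which guards against a caller bug rather than a condition such as a full disk.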
