pg_dump: Fix compression API errorhandling

Compression in pg_dump is abstracted using an API with multiple
implementations which can be selected at runtime by the user.  The
API and its implementations have evolved over time; notable commits
include bf9aa490db, e9960732a9, 84adc8e20, and 0da243fed.  The error
handling defined by the API was however problematic, and the
implementations had a few bugs and/or were not following the API
specification.  This commit modifies the API to ensure that callers
can perform error handling efficiently, and fixes the implementations
so that they all implement the API in the same way.  A full list of
the changes can be seen below.

 * write_func:
   - Make write_func throw an error on all error conditions.  All
     callers of write_func were already checking for success and
     calling pg_fatal on all errors, so we might as well make the API
     support that case directly, with simpler error handling as a
     result.

 * open_func:
   - zstd: Move stream initialization from the open function to the
     read and write functions, as they can have fatal errors.  Also
     ensure to dup the file descriptor like none and gzip.
   - lz4: Ensure to dup the file descriptor like none and gzip.

 * close_func:
   - zstd: Ensure to close the file descriptor even if closing down
     the compressor fails, and clean up state allocation on fclose
     failures.  Make sure to capture errors set by fclose.
   - lz4: Ensure to close the file descriptor even if closing down
     the compressor fails, and instead of calling pg_fatal, log the
     failures using pg_log_error.  Make sure to capture errors set by
     fclose.
   - none: Make sure to catch errors set by fclose.

 * read_func / gets_func:
   - Make read_func unconditionally return the number of read bytes
     instead of making it optional per implementation.
   - lz4: Make sure to throw an error instead of returning -1.
   - gzip: gzread returning zero cannot be assumed to indicate EOF,
     as it is documented to return zero for some types of errors.
   - lz4, zstd: Convert the _read_internal helper functions to not
     call pg_fatal on errors, to be able to handle gets_func
     returning NULL on error.

 * getc_func:
   - zstd: Use an unsigned char rather than an int to read char into.

 * LZ4Stream_init:
   - Make sure to not switch to inited state until we know that
     initialization succeeded, and reset errno just in case.

On top of these changes there are minor comment cleanups and
improvements, as well as an attempt to consistently reset errno in
codepaths where it is inspected.

This work was initiated by a report of API misuse, which turned into
a larger body of work.  As this is an internal API, these changes can
be backpatched into all affected branches.

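The close_func items above share one pattern: try to shut down the
compressor, close the file regardless of whether that succeeded, and
report every failure instead of aborting.  The following is only a
minimal sketch of that pattern and is not taken from the pg_dump
sources.  DemoStream, demo_finish_compressor and demo_close are
hypothetical names, and fprintf(stderr, ...) stands in for
pg_log_error.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a compressor's per-stream state. */
typedef struct DemoStream
{
	FILE	   *fp;				/* underlying file */
	bool		fail_finish;	/* simulate a compressor shutdown failure */
} DemoStream;

/* Hypothetical compressor shutdown; returns false on failure. */
static bool
demo_finish_compressor(DemoStream *s)
{
	return !s->fail_finish;
}

/*
 * Close the stream.  The file is closed even if the compressor could
 * not be shut down cleanly, and both kinds of failure are logged and
 * reported to the caller through the return value.
 */
bool
demo_close(DemoStream *s)
{
	bool		ok = true;

	if (!demo_finish_compressor(s))
	{
		fprintf(stderr, "could not end compression\n");
		ok = false;				/* keep going: the file must still be closed */
	}

	errno = 0;					/* reset so a stale value is never reported */
	if (fclose(s->fp) != 0)
	{
		fprintf(stderr, "could not close file: %s\n", strerror(errno));
		ok = false;
	}
	s->fp = NULL;

	return ok;
}
```

A caller would check the result of demo_close() and decide for itself
whether a failure is fatal, which mirrors the choice to log rather
than call pg_fatal inside the close function.
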
Author: Tom Lane <t...@sss.pgh.pa.us>
Author: Daniel Gustafsson <dan...@yesql.se>
Reported-by: Evgeniy Gorbanev <gorbanyo...@basealt.ru>
Discussion: https://postgr.es/m/517794.1750082...@sss.pgh.pa.us
Backpatch-through: 16

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/e686010c5b47f2e7f7e4e8d31ef69efadbbc4d72

Modified Files
--------------
src/bin/pg_dump/compress_gzip.c       |  35 ++++++---
src/bin/pg_dump/compress_io.c         |   2 +
src/bin/pg_dump/compress_io.h         |  15 ++--
src/bin/pg_dump/compress_lz4.c        |  92 +++++++++++++---------
src/bin/pg_dump/compress_none.c       |  29 +++----
src/bin/pg_dump/compress_zstd.c       | 144 +++++++++++++++++++++-------------
src/bin/pg_dump/pg_backup_archiver.c  |   4 +-
src/bin/pg_dump/pg_backup_directory.c |  52 +++---------
8 files changed, 208 insertions(+), 165 deletions(-)