Hi all,

Thank you for the updated patch.
On Fri, May 22, 2026 at 1:03 PM Nitin Motiani <[email protected]> wrote:
>
> Changed how pipe commands are quoted in the Windows test. The latest
> versions are attached.

I worked on reproducing the current limitation around parallel dumps
and then tested the latest v16 patch adding --pipe support for pg_dump.

To begin with, I verified the existing behavior. For example:

    pg_dump postgres | gzip > dump.sql.gz

works, but does not support parallelism, whereas:

    pg_dump -Fd -j 4 -f dumpdir postgres

    du -sh dumpdir
    21M     dumpdir

requires intermediate disk storage. This demonstrates the current
limitation: users must choose between parallelism and streaming
pipelines.

I then tested the patch introducing --pipe support. The feature is
quite useful for modern workflows where users want to stream dump
output directly to compression or upload pipelines without relying on
intermediate storage.

Basic functionality worked as expected. For example:

    pg_dump -p 55432 -Fd -j 4 --pipe="cat > dump.out" postgres

produced a ~38MB output file, and:

    pg_dump -p 55432 -Fd -j 4 --pipe="gzip > dump.gz" postgres

produced a compressed file (~11MB). The initial contents appeared
valid:

    gunzip -c dump.gz | head
    1
    2
    3
    ...

Also, no intermediate directory was created, confirming that the patch
enables streaming without filesystem-backed staging.

Error handling also behaved correctly. For example:

    --pipe="invalid_cmd"

resulted in:

    pg_dump: error: pipe command failed: command not found

and:

    --pipe="gzip | false"

resulted in:

    pg_dump: error: pipe command failed: child process exited with exit code 1

However, I observed an important issue when using the feature with
multiple parallel workers. Since the pipe command is executed per
output file, using:

    --pipe="gzip > dump.gz"

results in multiple workers invoking independent gzip processes that
all write to the same output file. This leads to corrupted or
truncated output.
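The hazard is not specific to pg_dump. As a rough sketch (not taken
from the patch; the worker count, the file names, and the worker_data
helper are all made up for illustration, and it assumes seq, sed, and
gzip are available), the following imitates four workers that each run
the same redirecting command, first against one shared file and then
against one file per worker:

```shell
#!/bin/sh
# Sketch only: imitates several workers running the same shell command,
# entirely outside pg_dump. All names below are hypothetical.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Each imitation "worker" emits its own stream of numbered lines.
worker_data() {
    seq 1 200000 | sed "s/^/worker $1 line /"
}

# Shared target: every "> shared.gz" opens the file with truncation, and
# each gzip writes from its own offset 0, so writers overwrite each other.
for w in 1 2 3 4; do
    worker_data "$w" | gzip > shared.gz &
done
wait

# The outcome depends on how the writers interleave, so only report it.
if gzip -t shared.gz 2>/dev/null; then
    shared_status="passed gzip -t on this run"
else
    shared_status="failed gzip -t"
fi
echo "shared target: $shared_status"

# Distinct targets: one output file per worker, so no writes overlap.
for w in 1 2 3 4; do
    worker_data "$w" | gzip > "worker_$w.gz" &
done
wait

distinct_ok=0
for w in 1 2 3 4; do
    if gzip -t "worker_$w.gz"; then
        distinct_ok=$((distinct_ok + 1))
    fi
done
echo "distinct targets: $distinct_ok of 4 files passed gzip -t"
```

Whether the shared file passes gzip -t depends on timing, so the script
only reports that result. Even when it passes, the file cannot hold all
four streams. The per-worker files should each pass.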
In my testing of the patch:

    gunzip -c dump.gz > dump.sql

failed with:

    gzip: dump.gz: unexpected end of file

This suggests that concurrent writes to a shared output target are not
coordinated and can result in invalid dumps.

It would be helpful to clarify the expected usage patterns here: for
example, whether users are expected to generate distinct outputs per
worker, or whether safeguards should be implemented to prevent multiple
workers from writing to the same destination.

Additionally, during failure scenarios I observed backend logs such as:

    FATAL: connection to client lost
    Broken pipe

While this is expected when the pipe terminates prematurely, it may be
worth considering whether the error messaging or cleanup behavior can
be made clearer from the user's perspective.

Overall, the feature is valuable and aligns well with modern backup
workflows. However, behavior in multi-worker scenarios with shared pipe
targets may need further clarification or safeguards to avoid data
corruption.

Looking forward to more feedback.

Regards,
Solai
