MisterRaindrop commented on PR #1571: URL: https://github.com/apache/cloudberry/pull/1571#issuecomment-3897151327
> Overall, FDW parallel scan is a direction worth exploring, but this approach is too rough. The core problems are:
>
> 1. locus transition semantics for Gather in an MPP context haven't been thought through, and the changes are too broad.
> 2. FDW is a black box from the database's perspective.
>    For heap tables we have parallel scan (divide work by pages), for AO/AOCS we have parallel scan (divide work by files); the work partitioning is well-defined.
>    But for FDWs, the parallel behavior depends entirely on the FDW's own implementation. If an FDW (say file_fdw) sets parallel_safe = true following planner's parallel logic but doesn't actually implement the DSM parallel callbacks (EstimateDSMForeignScan, InitializeDSMForeignScan, InitializeWorkerForeignScan), then multiple workers will each scan the full dataset, producing duplicate rows.

I'm not very familiar with Cloudberry yet and am still learning.

I agree that an FDW is a black box: its behavior depends largely on how the user implements it. My understanding is that users need to take responsibility for their own implementations.

Also, I should enable Gather only for FDW and keep it false in all other cases. How would this affect the parallel processing advantages of PostgreSQL?

I have also looked into other approaches to FDW parallelism, and currently there seems to be no optimal solution. So, should we aim to implement parallelism that is transparent to users, or are there better approaches? Could you share some ideas?
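To make the duplicate-row concern from the quoted review concrete, here is a minimal standalone C sketch. It is not PostgreSQL or Cloudberry code: `SharedScanState`, `WorkerScan`, `next_row`, and `simulate_scan` are hypothetical names invented for illustration, and the shared counter only stands in for whatever state a real FDW would set up through its DSM callbacks. Workers run one after another here so the result is deterministic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for scan state that an FDW would keep in shared memory. */
typedef struct SharedScanState {
    atomic_size_t next_row;   /* index of the next unclaimed row */
} SharedScanState;

/* Per-worker scan state. shared == NULL models an FDW that is marked
 * parallel safe but never set up any shared coordination. */
typedef struct WorkerScan {
    SharedScanState *shared;
    size_t local_next;        /* private cursor, used only without shared state */
} WorkerScan;

/* Return the next row index for this worker, or nrows when the scan is done. */
static size_t next_row(WorkerScan *scan, size_t nrows)
{
    size_t row;

    if (scan->shared != NULL)
        row = atomic_fetch_add(&scan->shared->next_row, 1); /* claim one row */
    else
        row = scan->local_next++;                           /* private full scan */

    return row < nrows ? row : nrows;
}

/* Run nworkers scans over nrows rows and return the total rows emitted.
 * Workers run sequentially in this sketch; real workers run concurrently,
 * which is why the shared counter is atomic. */
static size_t simulate_scan(size_t nrows, int nworkers, int coordinate)
{
    SharedScanState shared;
    size_t emitted = 0;

    assert(nworkers > 0);
    atomic_init(&shared.next_row, 0);

    for (int w = 0; w < nworkers; w++) {
        WorkerScan scan = { coordinate ? &shared : NULL, 0 };

        while (next_row(&scan, nrows) < nrows)
            emitted++;
    }
    return emitted;
}
```

Under these assumptions, `simulate_scan(100, 4, 0)` emits every row once per worker, while `simulate_scan(100, 4, 1)` emits each row exactly once, which is the difference between an FDW that only claims parallel safety and one that actually partitions the work.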
