avamingli commented on PR #1173:
URL: https://github.com/apache/cloudberry/pull/1173#issuecomment-2989592821

   > we can import some design from `PG parallel DISTINCT` 
   > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=22c4e88ebff408acd52e212543a77158bde59e69 ?
   
   Hi,
   
   I wasn't aware that Postgres supports parallel DISTINCT. It hasn't been 
cherry-picked into CBDB yet, so thanks for pointing it out.
   
   I reviewed the code, and overall it aligns with my design approach. 
Postgres uses a Gather node for a two-phase aggregation, performing the first 
phase in the parallel worker processes. CBDB, however, has no Gather node; 
instead, we place the two-phase aggregation at the top-level node in a 
distributed setting.
   
   Additionally, we must consider data distribution, which requires 
introducing Motion nodes. Even if we incorporated the Postgres code, it would 
still need to be adapted for distributed execution, and the resulting plan 
would be consistent with what our current implementation produces.
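
   To make the comparison concrete, here is a minimal Python sketch of the 
two plan shapes described above. It is an illustration only, not code from 
Postgres or CBDB: the function names (`local_distinct`, `gather_plan`, 
`motion_plan`) are hypothetical, and the exact placement of the Motion nodes 
in the real plan is an assumption.

```python
from collections import defaultdict


def local_distinct(rows):
    """Phase 1: remove duplicates within one worker's or segment's slice."""
    return set(rows)


def gather_plan(worker_slices):
    """Gather-style plan: workers deduplicate, the leader deduplicates again."""
    partials = [local_distinct(s) for s in worker_slices]
    gathered = [row for part in partials for row in part]  # stands in for Gather
    return set(gathered)  # phase 2 on the leader


def motion_plan(segment_slices, n_segments):
    """Distributed plan (assumed shape): redistribute by hash, then deduplicate."""
    partials = [local_distinct(s) for s in segment_slices]
    # Stands in for a redistributing Motion: equal values land on one segment.
    buckets = defaultdict(list)
    for part in partials:
        for row in part:
            buckets[hash(row) % n_segments].append(row)
    finals = [local_distinct(buckets[i]) for i in range(n_segments)]  # phase 2
    # Stands in for the final Motion that collects rows for the client.
    result = set()
    for part in finals:
        result |= part
    return result


slices = [[1, 2, 2, 3], [3, 4, 4], [1, 5]]
assert gather_plan(slices) == motion_plan(slices, 3)
```

   Both functions return the same set of distinct values; only where the 
second phase runs, and how rows reach it, differs.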
   
   While I'm pleased to see similarities between our designs, I believe there's 
no immediate need to integrate this part of the code. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

