2010YOUY01 commented on issue #17789:
URL: https://github.com/apache/datafusion/issues/17789#issuecomment-3355931347

   > Well done with that PR!! At this point I'm not sure the GroupsAccumulator 
is the right thing to do, what do you think? Once your PR is merged, getting 
rid of the usage of ArrayAggAccumulator in the general case for StringAgg might 
be the best thing to do?
   
   Maybe `GroupsAccumulator` still has some room to further improve the 
performance, but it's way more complicated to implement, this looks like a 
low-ROI effort to me.
   
   I think getting rid of `ArrayAggAccumulator` is similar -- native 
implementation should get better performance, however it requires huge effort 
to implement it on all similar aggregate functions, and `ArrayAggAccumulator` 
makes implementing aggregation with `ORDER/DISTINCT` really simple.
   
   I think the most important follow-up task is making `ArrayAggAccumulator` 
faster, the previous 1000x slowdown and memory bloat suggest there might be a 
major inefficiency inside.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to