rok commented on pull request #9683:
URL: https://github.com/apache/arrow/pull/9683#issuecomment-797629920


   > I'm not familiar with this C++ code so I'll let others comment (cc @pitrou
   > @bkietz @michalursa). It looks like the issue is only with ChunkedArrays where
   > the chunks have different dictionaries? My instinct is that, rather than
   > unifying first and then determining unique values/counting/hashing, what if we
   > could do the aggregation on each chunk first and then unify the results? That
   > would be a smaller amount of data to manipulate.
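
The "aggregate each chunk first, then unify the results" idea quoted above can be sketched roughly as follows. This is plain Python, not the Arrow C++ kernels under discussion: each chunk is modeled as a hypothetical `(dictionary, indices)` pair, and the names `count_chunk` and `value_counts_chunked` are made up for illustration only.

```python
from collections import Counter


def count_chunk(dictionary, indices):
    """Count occurrences per dictionary slot within a single chunk."""
    counts = [0] * len(dictionary)
    for i in indices:
        if i is not None:  # None stands in for a null entry
            counts[i] += 1
    return counts


def value_counts_chunked(chunks):
    """Aggregate each chunk on its own indices, then merge by value."""
    merged = Counter()
    for dictionary, indices in chunks:
        per_chunk = count_chunk(dictionary, indices)
        # Unification happens here, on at most len(dictionary) entries
        # per chunk rather than on every row of the chunk.
        for value, n in zip(dictionary, per_chunk):
            if n:
                merged[value] += n
    return dict(merged)


# Two chunks whose dictionaries differ: "b" is slot 1 in the first
# chunk and slot 0 in the second.
chunks = [
    (["a", "b"], [0, 1, 0, None]),
    (["b", "c"], [0, 0, 1]),
]
result = value_counts_chunked(chunks)
```

Under these assumptions the merge step touches one entry per dictionary value per chunk, which is the "smaller amount of data" the suggestion refers to. Whether this carries over to the actual hash kernels is a question for the reviewers cc'd above.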
   
   



