[GitHub] [arrow] rok commented on pull request #9683: ARROW-10403: [C++] Implement unique kernel for dictionary type

GitBox Fri, 12 Mar 2021 09:11:23 -0800


rok commented on pull request #9683:
URL: https://github.com/apache/arrow/pull/9683#issuecomment-797629920



   > I'm not familiar with this C++ code so I'll let others comment (cc @pitrou 
@bkietz @michalursa). It looks like the issue is only with ChunkedArrays where 
the chunks have different dictionaries? My instinct is that, rather than 
unifying the dictionaries first and then determining unique values/counting/hashing, 
we could do the aggregation on each chunk first and then unify the results. That 
would leave a smaller amount of data to manipulate.
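
The "aggregate per chunk, then unify" idea quoted above can be sketched in a few lines. This is only an illustration in plain Python, not the Arrow C++ kernel from the PR: the `Chunk` alias and the functions `unique_in_chunk` and `unique_across_chunks` are hypothetical names, each chunk is modelled as a `(dictionary, indices)` pair, and nulls are ignored.

```python
from typing import Hashable, List, Sequence, Tuple

# Hypothetical stand-in for one dictionary-encoded chunk:
# (dictionary values, indices into that dictionary).
Chunk = Tuple[Sequence[Hashable], Sequence[int]]


def unique_in_chunk(chunk: Chunk) -> List[Hashable]:
    """Return the distinct values referenced in one chunk, in first-seen order.

    Only the integer indices are deduplicated; each dictionary value is
    looked up once per distinct index.
    """
    dictionary, indices = chunk
    seen_indices = set()
    values = []
    for index in indices:
        if index not in seen_indices:
            seen_indices.add(index)
            values.append(dictionary[index])
    return values


def unique_across_chunks(chunks: Sequence[Chunk]) -> List[Hashable]:
    """Merge the per-chunk unique values into one list, in first-seen order.

    The merge step only touches the (usually small) per-chunk results,
    never the full index arrays.
    """
    seen_values = set()
    merged = []
    for chunk in chunks:
        for value in unique_in_chunk(chunk):
            if value not in seen_values:
                seen_values.add(value)
                merged.append(value)
    return merged


# Two chunks whose dictionaries differ; "b" is in a dictionary but never used.
example_chunks = [
    (["a", "b", "c"], [0, 0, 2, 0]),
    (["c", "d"], [1, 0, 1]),
]
result = unique_across_chunks(example_chunks)
```

With the two example chunks, the per-chunk step yields `["a", "c"]` and `["d", "c"]`, so the merge only has to reconcile four values instead of seven indices.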
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
