[ 
https://issues.apache.org/jira/browse/ARROW-12301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rok Mihevc updated ARROW-12301:
-------------------------------
    Description: 
When computing unique for chunked DictionaryArrays, we currently iterate over 
all chunks, unify their dictionaries, and then collect the chunk indices. We 
could avoid the dictionary unification by using a generic hash.

[See discussion here|https://github.com/apache/arrow/pull/9683] and 
[here|https://issues.apache.org/jira/browse/ARROW-10403]

  was:
When computing unique for chunked DictionaryArrays, we currently iterate over 
all chunks, unify their dictionaries, and then collect the chunk indices. We 
could avoid the dictionary unification by using a generic hash.

[See discussion here|https://github.com/apache/arrow/pull/9683] and 
[here|#ARROW-10403]


> [C++][Compute] Use generic hash-aggregate for DictionaryArrays
> --------------------------------------------------------------
>
>                 Key: ARROW-12301
>                 URL: https://issues.apache.org/jira/browse/ARROW-12301
>             Project: Apache Arrow
>          Issue Type: Improvement
>            Reporter: Rok Mihevc
>            Priority: Major
>
> When computing unique for chunked DictionaryArrays, we currently iterate over 
> all chunks, unify their dictionaries, and then collect the chunk indices. We 
> could avoid the dictionary unification by using a generic hash.
> [See discussion here|https://github.com/apache/arrow/pull/9683] and 
> [here|https://issues.apache.org/jira/browse/ARROW-10403]
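
To make the two strategies in the description concrete, here is a minimal pure-Python sketch. It is not Arrow's implementation and does not use the Arrow API: the Chunk type and both function names are hypothetical, each chunk is modelled as a (dictionary values, indices) pair, and nulls are ignored. The first function mirrors the unify-then-collect flow described above; the second shows one possible reading of "generic hash", where the referenced values are hashed directly and no unified dictionary is built.

```python
from typing import Hashable, List, Sequence, Tuple

# Hypothetical stand-in for one DictionaryArray chunk:
# (dictionary values, indices into that dictionary). Nulls are not modelled.
Chunk = Tuple[Sequence[Hashable], Sequence[int]]


def unique_via_unification(chunks: List[Chunk]) -> List[Hashable]:
    """Unify all chunk dictionaries first, then collect the distinct indices."""
    unified: List[Hashable] = []  # the unified dictionary
    position = {}                 # value -> index in the unified dictionary
    transposed = []               # per-chunk indices remapped to the unified dictionary
    for dictionary, indices in chunks:
        remap = []
        for value in dictionary:
            if value not in position:
                position[value] = len(unified)
                unified.append(value)
            remap.append(position[value])
        transposed.append([remap[i] for i in indices])

    seen = set()
    result = []
    for indices in transposed:
        for i in indices:
            if i not in seen:
                seen.add(i)
                result.append(unified[i])
    return result


def unique_via_generic_hash(chunks: List[Chunk]) -> List[Hashable]:
    """Hash the referenced values directly, skipping the unification pass."""
    seen = set()
    result = []
    for dictionary, indices in chunks:
        for i in indices:
            value = dictionary[i]
            if value not in seen:
                seen.add(value)
                result.append(value)
    return result


# Two chunks whose dictionaries overlap but are ordered differently.
chunks = [(["a", "b", "c"], [0, 1, 1]), (["c", "a", "d"], [0, 2, 1])]
```

In this sketch both functions return the distinct values in first-seen order, so they can be compared directly on the sample chunks. The second one never materialises a unified dictionary or remapped indices, which is the saving the issue is after.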



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
