[ https://issues.apache.org/jira/browse/ARROW-9132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney reassigned ARROW-9132:
-----------------------------------

    Assignee: Wes McKinney

> [C++] Support unique kernel for dictionary type
> -----------------------------------------------
>
>                 Key: ARROW-9132
>                 URL: https://issues.apache.org/jira/browse/ARROW-9132
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Python
>    Affects Versions: 0.17.1
>            Reporter: Dave Hirschfeld
>            Assignee: Wes McKinney
>            Priority: Major
>             Fix For: 1.0.0
>
>
> Enabling
> [`strings_as_dictionary`](https://turbodbc.readthedocs.io/en/latest/pages/advanced_usage.html?highlight=strings_as_dictionary#obtaining-apache-arrow-result-sets)
> in `turbodbc` returns a `ChunkedArray` of `dictionary` type (IIUC).
> I'd like to enable this for better performance however it seems not all
> functionality is implemented for `dictionary` types? In particular, `unique`
> seems not to be implemented:
> {code}
> In [40]: nmi.__class__.mro()
> Out[40]: [pyarrow.lib.ChunkedArray, pyarrow.lib._PandasConvertible, object]
> In [41]: nmi.type
> Out[41]: DictionaryType(dictionary<values=string, indices=int32, ordered=0>)
> In [42]: nmi.unique()
> Traceback (most recent call last):
>   File "<ipython-input-42-0fcb7893d5c4>", line 1, in <module>
>     nmi.unique()
>   File "pyarrow\table.pxi", line 307, in pyarrow.lib.ChunkedArray.unique
>   File "pyarrow\error.pxi", line 106, in pyarrow.lib.check_status
> ArrowNotImplementedError: unique not implemented for dictionary<values=string, indices=int32, ordered=0>
> {code}
> It would be very useful if the `dictionary` type supported all the usual
> operations.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)