[ https://issues.apache.org/jira/browse/ARROW-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rok Mihevc updated ARROW-5340:
------------------------------
    External issue URL: https://github.com/apache/arrow/issues/21799

> [C++] See if possible to deduplicate dictionaries in IPC streams in some way
> ----------------------------------------------------------------------------
>
>                 Key: ARROW-5340
>                 URL: https://issues.apache.org/jira/browse/ARROW-5340
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Wes McKinney
>            Priority: Major
>
> As follow-on work to ARROW-3144, there are cases where a dictionary may be
> shared by multiple fields in a RecordBatch.
> The presumption of {{arrow::ipc::DictionaryMemo}} is that there is a 1-to-1
> mapping between fields and dictionaries, and dictionary id assignment occurs
> prior to observing the dictionaries (to know whether or not they are used
> multiple times), so it may not be feasible, or at least not easy.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
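
The id-assignment problem described in the issue can be illustrated with a small standalone sketch. It does not use Arrow: Dictionary, Field, AssignIdsPerField and AssignIdsPerDictionary are hypothetical names invented for this illustration, and the code is not how arrow::ipc::DictionaryMemo is implemented. The first function assigns one id per field without looking at the dictionaries (the 1-to-1 presumption); the second looks at the dictionaries first and reuses an id whenever two fields point at the same dictionary object.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT Arrow types.
using Dictionary = std::vector<std::string>;

struct Field {
  std::string name;
  std::shared_ptr<Dictionary> dictionary;
};

// 1-to-1 scheme: every field gets its own id, assigned before the
// dictionaries are inspected. A dictionary shared by two fields
// therefore ends up with two different ids.
std::vector<std::int64_t> AssignIdsPerField(const std::vector<Field>& fields) {
  std::vector<std::int64_t> ids;
  ids.reserve(fields.size());
  for (std::size_t i = 0; i < fields.size(); ++i) {
    ids.push_back(static_cast<std::int64_t>(i));
  }
  return ids;
}

// Deduplicating scheme (an assumption of this sketch): inspect the
// dictionaries first and hand out one id per distinct dictionary object,
// using pointer identity as the key.
std::vector<std::int64_t> AssignIdsPerDictionary(const std::vector<Field>& fields) {
  std::unordered_map<const Dictionary*, std::int64_t> seen;
  std::vector<std::int64_t> ids;
  ids.reserve(fields.size());
  for (const auto& field : fields) {
    // emplace leaves an existing entry untouched, so a repeated pointer
    // returns the id it was given the first time.
    auto it = seen.emplace(field.dictionary.get(),
                           static_cast<std::int64_t>(seen.size())).first;
    ids.push_back(it->second);
  }
  return ids;
}
```

With three fields where the first two share one dictionary object, the per-field scheme produces three distinct ids, while the per-dictionary scheme gives the first two fields the same id. Pointer identity only catches dictionaries that are literally the same object; two separately allocated dictionaries with equal contents would still receive different ids, and detecting those would require comparing or hashing the values.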