Wes McKinney created ARROW-81:
---------------------------------
Summary: C++: Add a Category nested type
Key: ARROW-81
URL: https://issues.apache.org/jira/browse/ARROW-81
Project: Apache Arrow
Issue Type: New Feature
Components: C++
Reporter: Wes McKinney
Assignee: Wes McKinney
A Category (or "factor") is a dictionary-encoded array whose dictionary has
semantic meaning. The data consists of:
- An array of integer "codes"
- A child array of some other type, known as the "categories" or "levels" of
the array
It is a basic requirement for Python and R, at least, as Arrow C++ consumers,
to have this type. Separately, we should consider what is necessary to be able
to transmit category data in IPCs -- possibly an expansion of the format.
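
For illustration, the codes-plus-categories layout described above could be
sketched roughly as follows. This is a minimal sketch and not a proposed Arrow
C++ API: the names CategorySketch, EncodeCategory and DecodeCategory are
hypothetical, the categories are assumed to be strings, the codes are assumed
to be 32-bit integers, and null handling is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical illustration only; not an Arrow C++ class.
struct CategorySketch {
  std::vector<int32_t> codes;           // one integer code per logical element
  std::vector<std::string> categories;  // the "levels"; position == code
};

// Dictionary-encode the values, assigning codes in order of first appearance.
CategorySketch EncodeCategory(const std::vector<std::string>& values) {
  CategorySketch out;
  std::unordered_map<std::string, int32_t> seen;
  for (const std::string& v : values) {
    auto it = seen.find(v);
    if (it == seen.end()) {
      // First time this value appears: it becomes a new category.
      const int32_t code = static_cast<int32_t>(out.categories.size());
      it = seen.emplace(v, code).first;
      out.categories.push_back(v);
    }
    out.codes.push_back(it->second);
  }
  assert(out.codes.size() == values.size());
  return out;
}

// Recover the logical values by looking each code up in the categories.
std::vector<std::string> DecodeCategory(const CategorySketch& cat) {
  std::vector<std::string> out;
  out.reserve(cat.codes.size());
  for (int32_t code : cat.codes) {
    if (code < 0 || static_cast<std::size_t>(code) >= cat.categories.size()) {
      throw std::out_of_range("code does not index into categories");
    }
    out.push_back(cat.categories[static_cast<std::size_t>(code)]);
  }
  return out;
}
```

With this sketch, encoding the values a, b, a, c, b yields the codes
0, 1, 0, 2, 1 and the categories a, b, c; decoding returns the original values.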
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)