Wes McKinney created ARROW-81:
---------------------------------

             Summary: C++: Add a Category nested type
                 Key: ARROW-81
                 URL: https://issues.apache.org/jira/browse/ARROW-81
             Project: Apache Arrow
          Issue Type: New Feature
          Components: C++
            Reporter: Wes McKinney
            Assignee: Wes McKinney


A Category (or "factor") is a dictionary-encoded array whose dictionary has 
semantic meaning. The data consists of

- An array of integer "codes"
- A child array of some other type, known as the "categories" or "levels" of 
the array
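
As a rough illustration of the layout described above, here is a minimal 
C++ sketch. The names CategoryArray, Encode, and Decode are hypothetical and 
are not part of Arrow C++; the sketch assumes string categories, 32-bit 
codes assigned in order of first appearance, and no null values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical illustration only; this is not an Arrow C++ class.
struct CategoryArray {
  std::vector<int32_t> codes;       // one integer code per logical value
  std::vector<std::string> levels;  // child array: the distinct categories
};

// Dictionary-encode `values`, assigning codes in order of first appearance.
inline CategoryArray Encode(const std::vector<std::string>& values) {
  CategoryArray out;
  std::unordered_map<std::string, int32_t> index;  // level -> code
  for (const auto& v : values) {
    auto it = index.find(v);
    if (it == index.end()) {
      // First time this level is seen: its code is the next free slot.
      it = index.emplace(v, static_cast<int32_t>(out.levels.size())).first;
      out.levels.push_back(v);
    }
    out.codes.push_back(it->second);
  }
  return out;
}

// Recover the logical values by looking each code up in the levels.
inline std::vector<std::string> Decode(const CategoryArray& arr) {
  std::vector<std::string> values;
  values.reserve(arr.codes.size());
  for (int32_t code : arr.codes) {
    // Every code must index into the levels array.
    assert(code >= 0 && static_cast<std::size_t>(code) < arr.levels.size());
    values.push_back(arr.levels[static_cast<std::size_t>(code)]);
  }
  return values;
}
```

Under these assumptions, encoding {"low", "high", "low", "mid", "high"} 
gives codes {0, 1, 0, 2, 1} and levels {"low", "high", "mid"}, and decoding 
returns the original values.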

This type is a basic requirement for at least Python and R as consumers of 
Arrow C++. Separately, we should consider what is necessary to transmit 
category data over IPC, possibly an expansion of the format.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
