Quanlong Huang created ORC-1143:
-----------------------------------

             Summary: [C++] Support reading the PRESENT stream without reading 
the column data
                 Key: ORC-1143
                 URL: https://issues.apache.org/jira/browse/ORC-1143
             Project: ORC
          Issue Type: New Feature
          Components: C++
            Reporter: Quanlong Huang


Queries like "select count(a) from tbl" only need to check whether each 
column value is non-NULL. ORC files already have a PRESENT stream for each 
column (though it is optional), so we can serve such a request by reading only 
the PRESENT stream.
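
As a minimal sketch of why the PRESENT information alone is enough for 
count(a): the struct below is a hypothetical stand-in that mirrors the 
numElements, hasNulls and notNull fields of ColumnVectorBatch. It is not the 
ORC type itself, and countNonNull is an illustrative helper, not an existing 
ORC function.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the null-tracking fields of a column batch.
struct PresenceBatch {
  uint64_t numElements = 0;
  bool hasNulls = false;      // false when every value in the batch is present
  std::vector<char> notNull;  // 1 = value present, 0 = NULL
};

// count(a) for one batch: only the presence flags are consulted,
// the column values themselves are never touched.
uint64_t countNonNull(const PresenceBatch& batch) {
  if (!batch.hasNulls) {
    // No NULLs recorded, so every row counts.
    return batch.numElements;
  }
  uint64_t count = 0;
  for (uint64_t i = 0; i < batch.numElements; ++i) {
    if (batch.notNull[i]) {
      ++count;
    }
  }
  return count;
}
```

Summing countNonNull over all batches of the column would give the result of 
count(a).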

Currently, ReadIntent has two items:
{code:cpp}
enum ReadIntent {
  ReadIntent_ALL = 0,

  // Only read the offsets of selected type. Do not read the children types.
  ReadIntent_OFFSETS = 1
};{code}
We can extend it with a new item such as ReadIntent_PRESENT. The corresponding 
ColumnVectorBatch would then contain only valid notNull results; the column 
data itself would not be read.
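
One possible shape of the extension, as a sketch only: the numeric value 2, 
the PopulatedParts struct and the populatedFor helper are all assumptions made 
for illustration and are not part of the ORC API. The mapping reflects one 
reading of this proposal, namely which parts of a batch a caller could rely on 
for each intent.

```cpp
enum ReadIntent {
  ReadIntent_ALL = 0,

  // Only read the offsets of selected type. Do not read the children types.
  ReadIntent_OFFSETS = 1,

  // Proposed here: only read the PRESENT stream of the selected type.
  // The value 2 is an assumption.
  ReadIntent_PRESENT = 2
};

// Hypothetical summary of which parts of the returned batch are valid.
struct PopulatedParts {
  bool notNull;  // presence flags decoded from the PRESENT stream
  bool offsets;  // offsets of the selected list/map type
  bool values;   // column data, including children types
};

// Maps an intent to the parts of the batch the caller may rely on.
PopulatedParts populatedFor(ReadIntent intent) {
  switch (intent) {
    case ReadIntent_ALL:
      return {true, true, true};
    case ReadIntent_OFFSETS:
      return {true, true, false};
    case ReadIntent_PRESENT:
      return {true, false, false};
  }
  // Unknown intent: promise nothing.
  return {false, false, false};
}
```

Under this reading, a caller that requested ReadIntent_PRESENT would consult 
only notNull and treat everything else in the batch as unspecified.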


