Quanlong Huang created ORC-1143:
-----------------------------------

             Summary: [C++] Support reading the PRESENT stream without reading the column data
                 Key: ORC-1143
                 URL: https://issues.apache.org/jira/browse/ORC-1143
             Project: ORC
          Issue Type: New Feature
          Components: C++
            Reporter: Quanlong Huang

Queries like "select count(a) from tbl" only need to check whether each value of the column is non-NULL. ORC files already have a PRESENT stream for each column (though it is optional). We can serve such queries by reading only the PRESENT stream, without decoding the column data.

Currently, ReadIntent has two values:
{code:java}
enum ReadIntent {
  ReadIntent_ALL = 0,

  // Only read the offsets of selected type. Do not read the children types.
  ReadIntent_OFFSETS = 1
};{code}
We can extend it with a new value such as ReadIntent_PRESENT. The corresponding ColumnVectorBatch would then only have valid notNull results.
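
To make the proposal concrete, here is a minimal sketch of how a caller might consume a batch produced under the new intent. It is an illustration under stated assumptions, not the implementation: ReadIntent_PRESENT and its value 2 are hypothetical, PresentOnlyBatch is a simplified stand-in for ColumnVectorBatch that keeps only the numElements, hasNulls and notNull fields, and countNonNull is a hypothetical helper.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// The two existing values are quoted from the issue; the third is the
// proposed addition (name from the issue, numeric value assumed).
enum ReadIntent {
  ReadIntent_ALL = 0,

  // Only read the offsets of selected type. Do not read the children types.
  ReadIntent_OFFSETS = 1,

  // Hypothetical: only decode the PRESENT stream of the selected type.
  ReadIntent_PRESENT = 2
};

// Simplified stand-in for ColumnVectorBatch (NOT the real class).
// Under the proposed intent, only these fields would carry valid data.
struct PresentOnlyBatch {
  std::size_t numElements = 0;  // number of rows in the batch
  bool hasNulls = false;        // false when there is nothing to mask
  std::vector<char> notNull;    // 1 = value present, 0 = NULL
};

// Hypothetical helper: evaluates count(a) for one batch using only the
// null mask. No column values are touched.
std::uint64_t countNonNull(const PresentOnlyBatch& batch) {
  // If the batch reports no NULLs, every row counts.
  if (!batch.hasNulls) {
    return batch.numElements;
  }
  std::uint64_t count = 0;
  for (std::size_t i = 0; i < batch.numElements; ++i) {
    if (batch.notNull[i] != 0) {
      ++count;
    }
  }
  return count;
}
```

A query engine would sum countNonNull over all batches returned for the column. The hasNulls shortcut matters because the PRESENT stream is optional: when it is absent, the count is simply the number of rows.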