Wes McKinney created PARQUET-473:
------------------------------------

             Summary: Develop external predicate pushdown API for column readers
                 Key: PARQUET-473
                 URL: https://issues.apache.org/jira/browse/PARQUET-473
             Project: Parquet
          Issue Type: New Feature
          Components: parquet-cpp
            Reporter: Wes McKinney

This is well downstream of where we are right now, but we should plan ahead so that scanning Parquet files with externally defined predicates is a primary use case.

I suggest that the most general (and highest-performance) predicate is batch-oriented: the predicate is passed a batch of materialized values from one or more columns and returns an array of booleans, one per row, indicating whether the predicate holds for that row. We can also develop a row-by-row "scalar" predicate API if users need that.
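
To make the two shapes concrete, here is a rough sketch of what a batch-oriented predicate and a scalar predicate might look like. Everything in it is hypothetical: `Int64Batch`, `BatchPredicate`, `ScalarPredicate`, `LiftScalar` and `GreaterThan` are names invented for illustration, not existing or proposed parquet-cpp APIs, and it assumes int64 columns only for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical types for illustration only; none of these names exist in parquet-cpp.

// A batch of already-materialized values from a single column.
struct Int64Batch {
  std::vector<int64_t> values;
};

// Batch-oriented predicate: receives one batch per column (all of equal length)
// and returns one boolean per row.
using BatchPredicate =
    std::function<std::vector<bool>(const std::vector<Int64Batch>&)>;

// Row-by-row "scalar" predicate: receives the values of a single row,
// one entry per column.
using ScalarPredicate = std::function<bool(const std::vector<int64_t>&)>;

// Adapter that lifts a scalar predicate into a batch predicate by
// evaluating it once per row.
BatchPredicate LiftScalar(ScalarPredicate pred) {
  return [pred](const std::vector<Int64Batch>& columns) {
    const std::size_t num_rows =
        columns.empty() ? 0 : columns[0].values.size();
    std::vector<bool> selected(num_rows);
    std::vector<int64_t> row(columns.size());
    for (std::size_t i = 0; i < num_rows; ++i) {
      for (std::size_t c = 0; c < columns.size(); ++c) {
        assert(columns[c].values.size() == num_rows);
        row[c] = columns[c].values[i];
      }
      selected[i] = pred(row);
    }
    return selected;
  };
}

// Example batch predicate over two columns: true where column 0 > column 1.
std::vector<bool> GreaterThan(const std::vector<Int64Batch>& columns) {
  assert(columns.size() == 2);
  const std::vector<int64_t>& a = columns[0].values;
  const std::vector<int64_t>& b = columns[1].values;
  assert(a.size() == b.size());
  std::vector<bool> selected(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    selected[i] = a[i] > b[i];
  }
  return selected;
}
```

In this sketch a scanner would only need to understand `BatchPredicate`; a scalar predicate can be wrapped with `LiftScalar` at the cost of one call per row, which is the overhead the batch form is meant to avoid.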