William Butler created PARQUET-2109:
---------------------------------------
Summary: Parquet Cpp Reader Can Loop Forever If Page Values Overstated
Key: PARQUET-2109
URL: https://issues.apache.org/jira/browse/PARQUET-2109
Project: Parquet
Issue Type: Bug
Components: parquet-cpp
Reporter: William Butler
Assignee: William Butler

If a page header claims more values than the page actually contains, the
Parquet C++ reader can loop forever. HasNext() keeps returning true, but
ReadBatch() has nothing left to read and does not advance the reader state, so
the caller's read loop never terminates. We first noticed the bug via
ScanFileContents(), but it affects any caller that does not check whether
ReadBatch() consumed anything.
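
To illustrate the failure mode and one possible caller-side guard, here is a
minimal sketch. MockColumnReader and ScanAll are hypothetical names invented
for this example and are not the parquet-cpp API; the mock only imitates the
described behavior, where HasNext() trusts the count from the page header
while ReadBatch() can only return the values that are really present.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a column reader. This is NOT the real
// parquet-cpp class; it only mimics the behavior described in the report.
class MockColumnReader {
 public:
  // claimed_values: the count stated in the (possibly corrupt) page header.
  // actual: the values that are really stored in the page.
  MockColumnReader(int64_t claimed_values, std::vector<int32_t> actual)
      : claimed_(claimed_values), data_(std::move(actual)) {}

  // Trusts the header: reports more data as long as fewer values than
  // claimed have been consumed.
  bool HasNext() const { return pos_ < claimed_; }

  // Copies at most batch_size values into out and returns how many were
  // actually read. Returns 0 once the real data is exhausted.
  int64_t ReadBatch(int64_t batch_size, int32_t* out) {
    const int64_t available = static_cast<int64_t>(data_.size()) - pos_;
    const int64_t n =
        std::max<int64_t>(0, std::min<int64_t>(batch_size, available));
    std::copy_n(data_.begin() + pos_, n, out);
    pos_ += n;
    return n;
  }

 private:
  int64_t claimed_;
  std::vector<int32_t> data_;
  int64_t pos_ = 0;
};

// Reads every value and returns the total count. The check on n is the
// guard: without it, an overstated header makes this loop spin forever.
int64_t ScanAll(MockColumnReader& reader, int64_t batch_size) {
  std::vector<int32_t> buffer(static_cast<size_t>(batch_size));
  int64_t total = 0;
  while (reader.HasNext()) {
    const int64_t n = reader.ReadBatch(batch_size, buffer.data());
    if (n == 0) {
      throw std::runtime_error("page header overstates value count");
    }
    total += n;
  }
  return total;
}
```

With a header that claims 5 values for a page holding 3, ScanAll throws
instead of looping. Whether the guard belongs in the caller or inside the
reader itself is a design choice this sketch does not settle.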