shuai-xu opened a new pull request, #2055:
URL: https://github.com/apache/orc/pull/2055
### What changes were proposed in this pull request?
This PR fixes a bug in the C++ reader: if the column statistics in an ORC file
are not fully written and lack the `hasNull` field, users may get wrong results
when reading the file.
For example, take a file with schema `struct<string col1, string col2>` and 10
rows, where every col1 value is non-null and every col2 value is null. The
statistics written by Trino for col1 may be
numberOfValues: 10
stringStatistics {
minimum: "10"
maximum: "100"
sum: 565
}
and for col2 just numberOfValues: 0. Neither has a `hasNull` field. When we
query for rows where col2 is null, we get nothing, although all 10 rows match.
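
The failure mode can be illustrated with a minimal sketch. It assumes the wrong result comes from reading an absent `hasNull` as `false`; that is an interpretation of the description above, not a quote from the patch. `ColumnStats`, `Truth`, `isNullNaive`, and `isNullConservative` are hypothetical names for illustration only and are not part of the ORC C++ API.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical, simplified stand-in for one column's statistics.
// This is NOT the ORC C++ API; it only models the two fields discussed above.
struct ColumnStats {
  uint64_t numberOfValues;      // count of non-null values
  std::optional<bool> hasNull;  // empty when the writer omitted the field
};

// Result of evaluating "col IS NULL" against statistics alone.
enum class Truth { No, Maybe };

// Assumed buggy behavior: an absent hasNull is read as false, so the
// predicate is judged impossible and the data is skipped.
Truth isNullNaive(const ColumnStats& stats) {
  bool hasNull = stats.hasNull.value_or(false);
  return hasNull ? Truth::Maybe : Truth::No;
}

// Conservative behavior: an absent hasNull means "unknown", so the data
// must still be read and filtered row by row.
Truth isNullConservative(const ColumnStats& stats) {
  bool hasNull = stats.hasNull.value_or(true);
  return hasNull ? Truth::Maybe : Truth::No;
}
```

With the col2 statistics from the example (numberOfValues: 0, no `hasNull`), the naive check answers `No` and the rows are pruned, while the conservative check answers `Maybe` and the rows are read.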
### Why are the changes needed?
Without this fix, users reading such files with the C++ reader may get wrong results.
### How was this patch tested?
Added unit tests.
### Was this patch authored or co-authored using generative AI tooling?
No