pitrou opened a new issue, #47053: URL: https://github.com/apache/arrow/issues/47053
### Describe the enhancement requested

The `ValidateRunEndEncodedChildren` function uses ordered comparisons in two places to validate run-end encoded data:

1. To check that there are at least as many values as run ends: https://github.com/apache/arrow/blob/8b2336058c1dd5eba3293ab736cfbe8e0c38dc2b/cpp/src/arrow/util/ree_util.cc#L199-L202
2. To check the last run end against the logical offset and length: https://github.com/apache/arrow/blob/8b2336058c1dd5eba3293ab736cfbe8e0c38dc2b/cpp/src/arrow/util/ree_util.cc#L218-L224

It seems that the current checks can let some programming errors through. An example is https://github.com/apache/arrow/issues/47029, where the JSON C++ reader would read the integration data as having logical length 7 even though the generated run ends were much larger.

Is there a reason for not using equality tests for these checks?

### Component(s)

C++, Integration
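
To make the question about ordered versus equality checks concrete, here is a minimal, self-contained sketch contrasting the two. It is not the code in `ree_util.cc`: the `ReeSketch` struct, the `ValidateOrdered` and `ValidateEqual` functions, and the error strings are all hypothetical names made up for illustration, and the real function operates on Arrow array data rather than a plain vector.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a run-end encoded array.
// This is NOT an Arrow type; it only carries what the two checks need.
struct ReeSketch {
  std::vector<int64_t> run_ends;  // cumulative run ends
  int64_t values_length;          // number of entries in the values child
  int64_t logical_offset;         // logical offset of the parent array
  int64_t logical_length;         // logical length of the parent array
};

// Ordered checks (assumed shape): returns an empty string when the data passes.
std::string ValidateOrdered(const ReeSketch& a) {
  const auto num_run_ends = static_cast<int64_t>(a.run_ends.size());
  const int64_t last_run_end = a.run_ends.empty() ? 0 : a.run_ends.back();
  // There must be at least as many values as run ends.
  if (num_run_ends > a.values_length) {
    return "more run ends than values";
  }
  // The last run end must reach the end of the logical range.
  if (last_run_end < a.logical_offset + a.logical_length) {
    return "last run end is smaller than offset + length";
  }
  return "";
}

// Equality checks (the stricter alternative the issue asks about).
std::string ValidateEqual(const ReeSketch& a) {
  const auto num_run_ends = static_cast<int64_t>(a.run_ends.size());
  const int64_t last_run_end = a.run_ends.empty() ? 0 : a.run_ends.back();
  if (num_run_ends != a.values_length) {
    return "run ends and values differ in length";
  }
  if (last_run_end != a.logical_offset + a.logical_length) {
    return "last run end does not equal offset + length";
  }
  return "";
}
```

With made-up data resembling the case described above, say run ends `{100, 200, 300}`, three values, offset 0 and logical length 7, `ValidateOrdered` reports no error while `ValidateEqual` rejects the input. Whether strict equality is appropriate in every situation, for example for arrays that are slices of a larger one, is exactly the open question.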
