raulcd commented on code in PR #46992:
URL: https://github.com/apache/arrow/pull/46992#discussion_r2300319186

##########
cpp/src/parquet/statistics.h:
##########
@@ -215,12 +220,15 @@ class PARQUET_EXPORT Statistics {
   /// \param[in] has_min_max whether the min/max statistics are set
   /// \param[in] has_null_count whether the null_count statistics are set
   /// \param[in] has_distinct_count whether the distinct_count statistics are set
+  /// \param[in] is_min_value_exact whether the min value is exact
+  /// \param[in] is_max_value_exact whether the max value is exact
   /// \param[in] pool a memory pool to use for any memory allocations, optional
   static std::shared_ptr<Statistics> Make(
       const ColumnDescriptor* descr, const std::string& encoded_min,
       const std::string& encoded_max, int64_t num_values, int64_t null_count,
       int64_t distinct_count, bool has_min_max, bool has_null_count,
-      bool has_distinct_count,
+      bool has_distinct_count, std::optional<bool> is_min_value_exact,
+      std::optional<bool> is_max_value_exact,

Review Comment:
Would it be OK to move the API deprecation (of both the old overload and the one introduced here) to a different issue/PR, together with the creation of a new API that gets rid of the `has_*` flags, moves the existing statistics values to `std::optional`, removes the `MemoryPool`, and potentially gets rid of the encoder? Those are all nice improvements, but they seem slightly out of scope for exposing `is_{min/max}_value_exact` and can be tackled independently (that work could even be done before this PR).
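
To make the shape of such a follow-up API more concrete, here is a minimal sketch of how presence could be carried by `std::optional` instead of separate `has_*` booleans. It is only an illustration under stated assumptions: `EncodedStatisticsArgs`, `LegacyFlags`, and `ToLegacyFlags` are hypothetical names that do not exist in Parquet C++, and the sketch deliberately leaves out `ColumnDescriptor`, the `MemoryPool`, and the encoder.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical argument bundle (not part of the Parquet C++ API).
// An empty optional means "this statistic is not set", which replaces
// the separate has_min_max / has_null_count / has_distinct_count flags.
struct EncodedStatisticsArgs {
  std::optional<std::string> encoded_min;
  std::optional<std::string> encoded_max;
  std::optional<int64_t> null_count;
  std::optional<int64_t> distinct_count;
  std::optional<bool> is_min_value_exact;
  std::optional<bool> is_max_value_exact;
  int64_t num_values = 0;
};

// Hypothetical view of the legacy boolean flags, derived from presence.
struct LegacyFlags {
  bool has_min_max;
  bool has_null_count;
  bool has_distinct_count;
};

// Hypothetical adapter: shows that the has_* flags carry no information
// beyond whether the corresponding optional holds a value.
inline LegacyFlags ToLegacyFlags(const EncodedStatisticsArgs& args) {
  return LegacyFlags{
      args.encoded_min.has_value() && args.encoded_max.has_value(),
      args.null_count.has_value(),
      args.distinct_count.has_value()};
}
```

With a bundle like this, a caller that only knows min/max and the null count would set just those fields; the remaining flags and exactness values stay unset without any extra boolean parameters.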