wgtmac commented on code in PR #46992: URL: https://github.com/apache/arrow/pull/46992#discussion_r2220862555
########## cpp/src/parquet/printer.cc: ########## @@ -166,10 +166,20 @@ void ParquetFilePrinter::DebugPrint(std::ostream& stream, std::list<int> selecte stream << " Values: " << column_chunk->num_values(); if (column_chunk->is_stats_set()) { std::string min = stats->min(), max = stats->max(); + std::string max_exact = + stats->is_max_value_exact.has_value() Review Comment: Is it an overkill to print `unknown`? Could we simplify it as below: ``` std::string max_exact = stats->is_max_value_exact.value_or(false) ? "true" : "false"; ``` ########## cpp/src/parquet/printer.cc: ########## @@ -337,6 +347,22 @@ void ParquetFilePrinter::JSONPrint(std::ostream& stream, std::list<int> selected << R"("Max": ")" << FormatStatValue(descr->physical_type(), max) << "\", " << R"("Min": ")" << FormatStatValue(descr->physical_type(), min) << "\""; + if (stats->is_max_value_exact().has_value()) { + stream << ", " + << R"("IsMaxValueExact": ")" + << (stats->is_max_value_exact().value() ? "true" : "false") << "\""; + } else { + stream << ", " Review Comment: Same question here. ########## cpp/src/parquet/statistics.cc: ########## @@ -613,6 +615,8 @@ class TypedStatisticsImpl : public TypedStatistics<DType> { if (has_min_max) { PlainDecode(encoded_min, &min_); PlainDecode(encoded_max, &max_); + statistics_.is_min_value_exact = true; Review Comment: From the comment above (`Create stats from a thrift Statistics object`), shouldn't we expand the function parameters for these two fields? ########## cpp/src/parquet/statistics.h: ########## @@ -259,6 +264,14 @@ class PARQUET_EXPORT Statistics { /// \brief Plain-encoded maximum value virtual std::string EncodeMax() const = 0; + /// \brief Return the minimum value exact flag if set. + /// It will be true if there was no truncation. + virtual std::optional<bool> is_min_value_exact() const = 0; Review Comment: Should we consider simplifying it as `virtual bool is_min_value_exact() const = 0;` which returns false if the value is missing internally? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org