xiangfu0 commented on code in PR #18185:
URL: https://github.com/apache/pinot/pull/18185#discussion_r3294043410
##########
pinot-controller/src/main/java/org/apache/pinot/controller/util/ServerSegmentMetadataReader.java:
##########
@@ -144,6 +159,50 @@ public TableMetadataInfo getAggregatedTableMetadataFromServer(String tableNameWi
}
return l;
}));
+ // Aggregate per-column compression stats from server responses
+ List<ColumnCompressionStatsInfo> serverColStats = tableMetadataInfo.getColumnCompressionStats();
+ if (serverColStats != null) {
+ for (ColumnCompressionStatsInfo info : serverColStats) {
+ // Skip columns with no meaningful compression data (old raw segments without persisted codec)
+ if (info.getCodec() == null && !info.hasDictionary()) {
+ continue;
+ }
+ String col = info.getColumn();
+ long[] accum = columnCompressionAccum.computeIfAbsent(col, k -> new long[2]);
+ // Only accumulate uncompressed size when it is a real value (not the -1 sentinel from dict columns)
+ if (info.getUncompressedSizeInBytes() >= 0) {
+ accum[0] += info.getUncompressedSizeInBytes();
+ }
+ accum[1] += info.getCompressedSizeInBytes();
+ if (info.getCodec() != null) {
+ columnCodecMap.merge(col, info.getCodec(),
+ (existing, incoming) -> existing.equals(incoming) ? existing : "MIXED");
+ }
+ columnHasDictMap.put(col, info.hasDictionary());
Review Comment:
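Will every `info` for a given column have the same `hasDictionary()` value?
If the segments are mixed (some with a dictionary for the column, some without),
`columnHasDictMap.put` keeps only the last value seen, so the result may differ
depending on the order in which the server responses are processed.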
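
One possible way to address this, sketched below, is to merge the dictionary flag the same way the diff already merges codecs into `"MIXED"`. This is only an illustration, not the PR's code: `DictState`, `ColumnInfo`, and `mergeDictionaryState` are hypothetical names, and `ColumnInfo` stands in for `ColumnCompressionStatsInfo` with just the two fields used here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DictionaryStateMerge {
  /** Hypothetical tri-state; the PR itself stores a plain boolean. */
  enum DictState { ALL, NONE, MIXED }

  /** Hypothetical stand-in for ColumnCompressionStatsInfo (column name and dictionary flag only). */
  static final class ColumnInfo {
    final String column;
    final boolean hasDictionary;

    ColumnInfo(String column, boolean hasDictionary) {
      this.column = column;
      this.hasDictionary = hasDictionary;
    }
  }

  /**
   * Folds per-segment dictionary flags into one state per column.
   * Unlike Map.put, the result does not depend on the order of the inputs:
   * a column becomes MIXED as soon as two entries disagree and stays MIXED afterwards.
   */
  static Map<String, DictState> mergeDictionaryState(List<ColumnInfo> infos) {
    Map<String, DictState> result = new HashMap<>();
    for (ColumnInfo info : infos) {
      DictState incoming = info.hasDictionary ? DictState.ALL : DictState.NONE;
      result.merge(info.column, incoming,
          (existing, next) -> existing == next ? existing : DictState.MIXED);
    }
    return result;
  }
}
```

With this shape, a column reported with a dictionary by one segment and without by another ends up as `MIXED` regardless of which response arrives first, whereas a plain `put` would report whichever value was seen last.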
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]