Copilot commented on code in PR #61845:
URL: https://github.com/apache/doris/pull/61845#discussion_r3004432292
##########
fe/fe-core/src/main/java/org/apache/doris/catalog/TabletStatMgr.java:
##########
@@ -162,7 +164,22 @@ protected void runAfterCatalogReady() {
}
try {
List<Partition> allPartitions = olapTable.getAllPartitions();
- partitionCount += allPartitions.size();
+ int tablePartitionNum = allPartitions.size();
+ partitionCount += tablePartitionNum;
+ // Check if this table's partition count is near the limit (>80%)
+ if (olapTable.getPartitionInfo().enableAutomaticPartition()) {
+ int limit = Config.max_auto_partition_num;
+ if (tablePartitionNum > limit * 8L / 10) {
Review Comment:
`OlapTable.getAllPartitions()` includes temp partitions, but partition limit
enforcement/warnings use `getPartitionNum()` (non-temp). Using
`allPartitions.size()` here can inflate the near-limit gauges (and
partitionCount) due to temp partitions and make the metric inconsistent with
the actual limit checks. Consider computing near-limit using
`olapTable.getPartitionNum()` / `getPartitions()` (or otherwise excluding temp
partitions) while still iterating `getAllPartitions()` for size/stat
aggregation if needed.
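
A minimal sketch of the suggestion above, in plain Java. `TableView` and its fields are hypothetical stand-ins, not the real `OlapTable` API; the only thing taken from the diff is the `limit * 8L / 10` threshold. It illustrates how a table can cross the 80% mark when temp partitions are counted even though its non-temp count is still below it.

```java
import java.util.List;

class NearLimitSketch {
    /** Hypothetical stand-in for a table; not the real OlapTable API. */
    static final class TableView {
        final int nonTempPartitions;
        final int tempPartitions;
        final boolean autoPartition;

        TableView(int nonTempPartitions, int tempPartitions, boolean autoPartition) {
            this.nonTempPartitions = nonTempPartitions;
            this.tempPartitions = tempPartitions;
            this.autoPartition = autoPartition;
        }

        /** Analogous to the size of an "all partitions" list that includes temp partitions. */
        int allPartitionCount() {
            return nonTempPartitions + tempPartitions;
        }
    }

    /** True when the given partition count exceeds 80% of the limit. */
    static boolean isNearLimit(int partitionCount, int limit) {
        // The 8L forces long arithmetic, as in the diff, so a large limit cannot overflow int.
        return partitionCount > limit * 8L / 10;
    }

    /** Counts auto-partition tables whose non-temp partition count is near the limit. */
    static long countNearLimit(List<TableView> tables, int limit) {
        return tables.stream()
                .filter(t -> t.autoPartition)
                .filter(t -> isNearLimit(t.nonTempPartitions, limit))
                .count();
    }

    public static void main(String[] args) {
        int limit = 100; // assumed value, for illustration only
        TableView table = new TableView(75, 10, true);
        System.out.println("all partitions near limit: "
                + isNearLimit(table.allPartitionCount(), limit));
        System.out.println("non-temp partitions near limit: "
                + isNearLimit(table.nonTempPartitions, limit));
    }
}
```

In this sketch the aggregate statistics could still be computed from the full partition list, while only the near-limit check uses the non-temp count.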
##########
fe/fe-core/src/main/java/org/apache/doris/metric/MetricRepo.java:
##########
@@ -1044,15 +1044,16 @@ public Integer getValue() {
GAUGE_AVG_TABLET_SIZE_BYTES = new GaugeMetricImpl<>("avg_tablet_size_bytes", MetricUnit.BYTES, "", 0L);
DORIS_METRIC_REGISTER.addMetrics(GAUGE_AVG_TABLET_SIZE_BYTES);
- // Partition near-limit warning counters
- COUNTER_AUTO_PARTITION_NEAR_LIMIT = new LongCounterMetric("auto_partition_near_limit_count",
+ // Partition near-limit warning gauges (updated by TabletStatMgr periodic scan)
+ GAUGE_AUTO_PARTITION_NEAR_LIMIT = new GaugeMetricImpl<>("auto_partition_near_limit_count",
MetricUnit.NOUNIT,
- "number of times auto partition count exceeded 80% of max_auto_partition_num");
- DORIS_METRIC_REGISTER.addMetrics(COUNTER_AUTO_PARTITION_NEAR_LIMIT);
- COUNTER_DYNAMIC_PARTITION_NEAR_LIMIT = new LongCounterMetric("dynamic_partition_near_limit_count",
+ "number of auto partition tables where partition count exceeded 80% of max_auto_partition_num", 0L);
+ DORIS_METRIC_REGISTER.addMetrics(GAUGE_AUTO_PARTITION_NEAR_LIMIT);
+ GAUGE_DYNAMIC_PARTITION_NEAR_LIMIT = new GaugeMetricImpl<>("dynamic_partition_near_limit_count",
MetricUnit.NOUNIT,
- "number of times dynamic partition count exceeded 80% of max_dynamic_partition_num");
- DORIS_METRIC_REGISTER.addMetrics(COUNTER_DYNAMIC_PARTITION_NEAR_LIMIT);
+ "number of dynamic partition tables where partition count exceeded 80% of max_dynamic_partition_num",
+ 0L);
+ DORIS_METRIC_REGISTER.addMetrics(GAUGE_DYNAMIC_PARTITION_NEAR_LIMIT);
Review Comment:
These metrics keep the same names but change type from counter to gauge. For
Prometheus-based monitoring this is an incompatible change: existing
`rate()`/`increase()` queries stop being meaningful on a gauge, and some scrape
pipelines may report type conflicts or drop samples. If backward compatibility
is required, consider emitting the gauge under a new metric name (or
temporarily exporting both the old counter and the new gauge under different
names) and deprecating the old one with a release note.
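
One way to picture the "export both under different names" option is the toy registry below. It is a sketch only: `MetricMigrationSketch` is a hypothetical class, not `MetricRepo`, and the gauge name `auto_partition_near_limit_tables` is an invented example. The old counter keeps its name and type, the gauge gets a new name, and the registry rejects re-registering an existing name with a different type. The `# TYPE` lines follow the Prometheus text exposition format.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class MetricMigrationSketch {
    enum Type { COUNTER, GAUGE }

    static final class Metric {
        final Type type;
        long value;

        Metric(Type type) {
            this.type = type;
        }
    }

    private final Map<String, Metric> metrics = new LinkedHashMap<>();

    /** Registers a metric; reusing a name with a different type is rejected. */
    void register(String name, Type type) {
        Metric existing = metrics.get(name);
        if (existing != null && existing.type != type) {
            throw new IllegalArgumentException(
                    name + " is already registered as " + existing.type);
        }
        metrics.putIfAbsent(name, new Metric(type));
    }

    /** Counters only ever go up. */
    void increment(String name) {
        require(name, Type.COUNTER).value++;
    }

    /** Gauges are overwritten with the latest observed value. */
    void set(String name, long value) {
        require(name, Type.GAUGE).value = value;
    }

    private Metric require(String name, Type type) {
        Metric m = metrics.get(name);
        if (m == null || m.type != type) {
            throw new IllegalStateException(name + " is not a registered " + type);
        }
        return m;
    }

    /** Renders all metrics with a TYPE line per metric, in registration order. */
    String render() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Metric> e : metrics.entrySet()) {
            String type = e.getValue().type.name().toLowerCase(Locale.ROOT);
            sb.append("# TYPE ").append(e.getKey()).append(' ').append(type).append('\n');
            sb.append(e.getKey()).append(' ').append(e.getValue().value).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        MetricMigrationSketch registry = new MetricMigrationSketch();
        // Old counter keeps its original name and type during the deprecation window.
        registry.register("auto_partition_near_limit_count", Type.COUNTER);
        // New gauge is exported under a new, hypothetical name.
        registry.register("auto_partition_near_limit_tables", Type.GAUGE);
        registry.increment("auto_partition_near_limit_count");
        registry.set("auto_partition_near_limit_tables", 3);
        System.out.print(registry.render());
    }
}
```

With this shape, dashboards built on the counter keep working until the old name is removed, and new alerts can target the gauge directly.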
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]