Copilot commented on code in PR #945:
URL:
https://github.com/apache/skywalking-banyandb/pull/945#discussion_r2697296147
##########
fodc/agent/internal/watchdog/watchdog.go:
##########
@@ -186,11 +194,43 @@ func (w *Watchdog) pollAndForward() error {
return nil
}
-// pollMetrics fetches raw metrics text from endpoint and parses them.
+// pollMetrics fetches raw metrics text from all endpoints and parses them.
func (w *Watchdog) pollMetrics(ctx context.Context) ([]metrics.RawMetric,
error) {
+ if len(w.urls) == 0 {
+ return nil, fmt.Errorf("no metrics endpoints configured")
+ }
+ allMetrics := make([]metrics.RawMetric, 0)
var lastErr error
- currentBackoff := w.retryBackoff
+ for idx, url := range w.urls {
+ containerName := ""
+ if idx < len(w.containerNames) {
+ containerName = w.containerNames[idx]
+ }
Review Comment:
This logic silently uses an empty string for containerName when
containerNames is shorter than urls. This could lead to confusion when
debugging why some metrics lack container_name labels. Consider logging a
warning when containerNames length doesn't match urls length, or document this
fallback behavior clearly.
##########
fodc/proxy/internal/registry/registry.go:
##########
@@ -47,22 +47,24 @@ type Address struct {
// AgentIdentity represents the identity of an agent.
type AgentIdentity struct {
- Labels map[string]string
- IP string
- Role string
- Port int
+ Labels map[string]string
+ IP string
+ Role string
+ PodName string
+ ContainerNames []string
+ Port int
}
// AgentInfo contains information about a registered agent.
type AgentInfo struct {
Labels map[string]string
- RegisteredAt time.Time
- LastHeartbeat time.Time
AgentID string
NodeRole string
Status AgentStatus
- AgentIdentity AgentIdentity
+ RegisteredAt time.Time
+ LastHeartbeat time.Time
PrimaryAddress Address
+ AgentIdentity AgentIdentity
Review Comment:
The field ordering in AgentInfo struct has been changed inconsistently -
some fields are reordered while others aren't. The reordering doesn't follow
any clear pattern (not alphabetical, not by type, not by importance). Consider
keeping the original field order or applying a consistent ordering principle to
avoid unnecessary diff noise.
##########
fodc/agent/internal/cmd/root.go:
##########
@@ -114,18 +120,24 @@ func runFODC(_ *cobra.Command, _ []string) error {
}
log := logger.GetLogger("fodc")
- log.Info().
- Str("endpoint", metricsEndpoint).
- Dur("interval", pollInterval).
- Str("prometheus-listen-addr", prometheusListenAddr).
- Msg("Starting FODC agent")
if pollInterval <= 0 {
return fmt.Errorf("poll-metrics-interval must be greater than
0")
}
- if metricsEndpoint == "" {
- return fmt.Errorf("metrics-endpoint cannot be empty")
+ if len(pollMetricsPorts) == 0 {
+ return fmt.Errorf("poll-metrics-ports cannot be empty")
}
+ if len(containerNames) > 0 && len(containerNames) !=
len(pollMetricsPorts) {
+ return fmt.Errorf("container-name count (%d) must match
poll-metrics-ports count (%d)", len(containerNames), len(pollMetricsPorts))
+ }
Review Comment:
The validation allows container-name to be optional (length 0), but the
watchdog.pollMetrics function at line 206-207 will use an empty string for
containerName when the slice is shorter than urls. This could lead to
inconsistent behavior where some metrics have container_name labels and others
don't. Consider either requiring container-name to always match the count of
poll-metrics-ports, or explicitly document that omitting container-name is
valid and will result in metrics without container_name labels.
```suggestion
}
if len(containerNames) == 0 {
// When no container names are provided, initialize a slice of
the same length
// as pollMetricsPorts so that downstream usage always has
matching lengths.
// Each entry will be an empty string, indicating no explicit
container name.
containerNames = make([]string, len(pollMetricsPorts))
}
```
##########
fodc/proxy/internal/metrics/aggregator.go:
##########
@@ -43,7 +43,6 @@ type AggregatedMetric struct {
Timestamp time.Time
Name string
AgentID string
- NodeRole string
Description string
Value float64
Review Comment:
The AggregatedMetric struct field ordering has been changed, which creates
unnecessary diff noise. The fields NodeRole was removed and the remaining
fields were reordered. Unless there's a specific reason for reordering (e.g.,
memory alignment), consider keeping the original field order for fields that
weren't removed.
##########
fodc/agent/internal/metrics/parse_test.go:
##########
@@ -1063,10 +1063,120 @@ func TestParse_LabelNameWithNumbers(t *testing.T) {
func TestParse_LabelValueWithNumbers(t *testing.T) {
text := `metric_name{label="value123"} 42.0`
- metrics, err := Parse(text)
+ metrics, err := ParseWithAgentLabels(text, "", "", "")
require.NoError(t, err)
require.Len(t, metrics, 1)
require.Len(t, metrics[0].Labels, 1)
assert.Equal(t, "value123", metrics[0].Labels[0].Value)
}
+
+func TestParse_WithAgentIdentityLabels(t *testing.T) {
+ text := `cpu_usage{instance="localhost"} 75.5
+http_requests_total{method="GET"} 100`
+
+ metrics, err := ParseWithAgentLabels(text, "datanode-hot", "test-pod",
"data")
+
+ require.NoError(t, err)
+ require.Len(t, metrics, 2)
+
+ // Verify first metric has agent identity labels
+ cpuMetric := metrics[0]
+ assert.Equal(t, "cpu_usage", cpuMetric.Name)
+ require.Len(t, cpuMetric.Labels, 4) // instance + 3 agent labels
Review Comment:
The test verifies label count but uses magic number 4 in multiple places
(lines 1086, 1119). If the agent labels change in the future, these tests could
become brittle. Consider calculating the expected count dynamically based on
the number of provided agent labels plus original labels, or using a constant.
##########
fodc/agent/internal/proxy/client.go:
##########
@@ -46,28 +46,30 @@ type MetricsRequestFilter struct {
// Client manages connection and communication with the FODC Proxy.
type Client struct {
- connManager *connManager
+ logger *logger.Logger
heartbeatTicker *time.Ticker
flightRecorder *flightrecorder.FlightRecorder
- logger *logger.Logger
+ connManager *connManager
stopCh chan struct{}
reconnectCh chan struct{}
labels map[string]string
- client fodcv1.FODCServiceClient
- registrationStream fodcv1.FODCService_RegisterAgentClient
metricsStream fodcv1.FODCService_StreamMetricsClient
+ registrationStream fodcv1.FODCService_RegisterAgentClient
+ client fodcv1.FODCServiceClient
- proxyAddr string
- nodeIP string
- nodeRole string
- agentID string
+ proxyAddr string
+ nodeIP string
+ nodeRole string
+ podName string
+ agentID string
+ containerNames []string
- nodePort int
heartbeatInterval time.Duration
reconnectInterval time.Duration
- disconnected bool
streamsMu sync.RWMutex // Protects streams only
heartbeatWg sync.WaitGroup // Tracks heartbeat goroutine
+ nodePort int
+ disconnected bool
Review Comment:
The Client struct fields have been extensively reordered, creating
significant diff noise. The reordering doesn't follow a consistent pattern and
makes the changes harder to review. Consider adding new fields (podName,
containerNames) without reordering existing fields, unless there's a specific
technical reason (e.g., memory alignment optimization) that should be
documented.
##########
fodc/agent/internal/watchdog/watchdog.go:
##########
@@ -45,29 +45,37 @@ const (
// Watchdog periodically polls metrics from BanyanDB and forwards them to
Flight Recorder.
type Watchdog struct {
- client *http.Client
- log *logger.Logger
- cancel context.CancelFunc
- recorder MetricsRecorder
- ctx context.Context
- url string
- wg sync.WaitGroup
- mu sync.RWMutex
+ recorder MetricsRecorder
+ log *logger.Logger
+ ctx context.Context
+ cancel context.CancelFunc
+ client *http.Client
+
+ nodeRole string
+ podName string
+ urls []string
+ containerNames []string
+
interval time.Duration
retryBackoff time.Duration
+ mu sync.RWMutex
+ wg sync.WaitGroup
isRunning bool
Review Comment:
The Watchdog struct fields have been reordered, creating unnecessary diff
noise. The reordering groups related fields together (which is good) but
changes the original structure significantly. Consider adding new fields at the
end or in logical groups without extensively reordering existing fields, unless
there's a documented reason for the reorganization.
```suggestion
interval time.Duration
retryBackoff time.Duration
mu sync.RWMutex
wg sync.WaitGroup
isRunning bool
nodeRole string
podName string
urls []string
containerNames []string
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]