mochengqian opened a new pull request, #967: URL: https://github.com/apache/dubbo-go-pixiu/pull/967
## Summary Closes #957. `ClusterStore.prepareClusterConfig` rebuilt `Config.ConsistentHash.Hash` eagerly on every `AddCluster` / `UpdateCluster` / `SetEndpoint` / `DeleteEndpoint`. After #932, that field is only consumed by the **legacy** (non-snapshot) balancer path — all bundled balancers are snapshot-aware and read the snapshot's own healthy hash (`EndpointSnapshot.HealthyConsistentHash`). So the eager rebuild is dead work on the common path, and for a large Maglev table it is milliseconds of wasted CPU on every endpoint churn (e.g. Nacos instances flapping). This PR builds the Config-level hash lazily, only when a legacy balancer first needs it. ## Change - `prepareClusterConfig` invalidates the hash (`c.ConsistentHash.Hash = nil`) instead of rebuilding it. Setting it to nil keeps the legacy path correct after endpoint changes — the next legacy pick rebuilds from current endpoints rather than serving a stale ring. - New `ClusterConfig.EnsureConsistentHash()` builds the hash on first use (nil-check) and reuses it after. - The legacy branch of `pickEndpoint` calls `EnsureConsistentHash()` under the existing `legacyPickMu`, before the `config := *context.Config` copy. The in-`Handler` nil fallback in Maglev/RingHash stays as a safety net. ### Why a nil-check, not `sync.Once` The issue suggested a value `consistentHashOnce sync.Once` field. That trips `go vet` copylocks at the legacy path's existing `config := *context.Config` shallow copy (verified). Since the legacy path is already serialized by `legacyPickMu`, a plain nil-check is sufficient and vet-clean. A fresh `ClusterConfig` from `CloneStore` (whose `Hash` interface does not survive the yaml round-trip, so it deserializes to nil) rebuilds independently on its first legacy pick — the same fresh-store-rebuilds-independently property the issue asked for. ## Risk Per the issue: only code that reads `c.ConsistentHash.Hash` *before* a balancer `Handler` runs would now see nil. Confirmed the only non-test readers are `maglev_hash.go` / `ring_hash.go` (both inside `Handler`, both with a build-on-the-fly nil fallback) and `ConsistentHashForHealthyEndpoints` (snapshot path, has its own fallback). No xDS / adapter / configcenter / hotreload code reads the field. ## Benchmark `BenchmarkSetEndpoint` (added), 64-endpoint Maglev cluster, `MaglevTableSize=65537`, metadata-only re-registration (mirrors discovery churn): | | ns/op | B/op | allocs/op | |---|---|---|---| | before | 1087 | 1721 | 3 | | after | 413 | 481 | 3 | ~62% faster, ~72% less memory per mutation. The win scales with `MaglevTableSize`. ## Test plan - [x] New `TestClusterManager_SetEndpointDefersConsistentHashRebuild` — SetEndpoint churn triggers **zero** hash builds (counting `ConsistentHashInitFunc`); the legacy path's `EnsureConsistentHash` builds **exactly once** across repeated calls. - [x] Updated the two existing consistent-hash assertions (`SetEndpoint...RebuildsConsistentHash`, `DeleteEndpoint...RepairsRuntimeAndConsistentHash`) to trigger the lazy build before inspecting `Config.ConsistentHash.Hash` directly. - [x] `go test -race ./pkg/model/... ./pkg/server/... ./pkg/cluster/loadbalancer/...` - [x] `go vet` clean on all three packages (confirms no copylocks regression) - [x] pre-push local-CI passed -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
