Re: [PR] fix(pd): add timeout and null-safety to getLeaderGrpcAddress() [hugegraph]

via GitHub Tue, 10 Mar 2026 00:09:50 -0700


imbajin commented on code in PR #2961:
URL: https://github.com/apache/hugegraph/pull/2961#discussion_r2909775766



##########
hugegraph-pd/hg-pd-core/src/main/java/org/apache/hugegraph/pd/raft/RaftEngine.java:
##########
@@ -239,8 +241,20 @@ public String getLeaderGrpcAddress() throws 
ExecutionException, InterruptedExcep
             waitingForLeader(10000);
         }
 
-        return 
raftRpcClient.getGrpcAddress(raftNode.getLeaderId().getEndpoint().toString()).get()
-                            .getGrpcAddress();
+        try {
+            RaftRpcProcessor.GetMemberResponse response = raftRpcClient
+                    
.getGrpcAddress(raftNode.getLeaderId().getEndpoint().toString())
+                    .get(config.getRpcTimeout(), TimeUnit.MILLISECONDS);

Review Comment:
   ⚠️ `config.getRpcTimeout()` only bounds the callers 
`CompletableFuture.get(...)` here; it does **not** bound the underlying Bolt 
RPC. `RaftRpcClient` is still initialized with a fresh `new RpcOptions()` and 
`invokeAsync(...)` uses `rpcOptions.getRpcDefaultTimeout()`, so in the 
bridge-network failure case well return from this method quickly but still 
leave the actual RPC running until the client default timeout fires. On a 
follower that keeps redirecting traffic, those in-flight requests can 
accumulate.
   
   Could we wire the configured timeout into `RaftRpcClient` itself (or cancel 
the future on timeout) so the transport timeout and the caller timeout stay 
consistent?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] fix(pd): add timeout and null-safety to getLeaderGrpcAddress() [hugegraph]

Reply via email to