wu-sheng commented on code in PR #777:
URL: https://github.com/apache/skywalking-java/pull/777#discussion_r2517028548


##########
apm-sniffer/apm-agent-core/src/main/java/org/apache/skywalking/apm/agent/core/remote/GRPCChannelManager.java:
##########
@@ -184,21 +184,65 @@ public Channel getChannel() {
      */
     public void reportError(Throwable throwable) {
         if (isNetworkError(throwable)) {
-            reconnect = true;
-            notify(GRPCChannelStatus.DISCONNECT);
+            triggerReconnect();
         }
     }
 
     private void notify(GRPCChannelStatus status) {
-        for (GRPCChannelListener listener : listeners) {
-            try {
-                listener.statusChanged(status);
-            } catch (Throwable t) {
-                LOGGER.error(t, "Fail to notify {} about channel connected.", 
listener.getClass().getName());
+        synchronized (listeners) {
+            for (GRPCChannelListener listener : listeners) {
+                try {
+                    listener.statusChanged(status);
+                } catch (Throwable t) {
+                    LOGGER.error(t, "Fail to notify {} about channel 
connected.", listener.getClass().getName());
+                }
             }
         }
     }
 
+    /**
+     * Create a new gRPC channel to the specified server and reset connection 
state.
+     */
+    private void createNewChannel(String host, int port) throws Exception {
+        if (managedChannel != null) {
+            managedChannel.shutdownNow();
+        }
+
+        managedChannel = GRPCChannel.newBuilder(host, port)
+                                    .addManagedChannelBuilder(new 
StandardChannelBuilder())
+                                    .addManagedChannelBuilder(new 
TLSChannelBuilder())
+                                    .addChannelDecorator(new 
AgentIDDecorator())
+                                    .addChannelDecorator(new 
AuthenticationDecorator())
+                                    .build();
+
+        // Reset reconnectCount after actually rebuilding the channel
+        reconnectCount = 0;
+        notifyConnected();
+    }
+
+    /**
+     * Trigger reconnection by setting reconnect flag and notifying listeners.
+     */
+    private void triggerReconnect() {
+        synchronized (statusLock) {
+            reconnect = true;
+            notify(GRPCChannelStatus.DISCONNECT);
+        }
+    }
+
+    /**
+     * Notify listeners that connection is established without resetting 
reconnectCount.
+     * This is used when the channel appears connected but we want to keep 
monitoring
+     * reconnect attempts in case it's a false positive (half-open connection).
+     */
+    private void notifyConnected() {
+        synchronized (statusLock) {
+            // Don't reset reconnectCount - connection might still be half-open

Review Comment:
   Can't you check connection status as `TRANSIENT_FAILURE`? `isConnected` is 
checking status==`READY`. When it is half cloed, it should not be in READY 
status. 
   
   Could you rechecking the logic? This change and context seem to be not 
consistent with gRPC concept.



##########
apm-sniffer/apm-agent-core/src/main/java/org/apache/skywalking/apm/agent/core/remote/GRPCChannelManager.java:
##########
@@ -184,21 +184,65 @@ public Channel getChannel() {
      */
     public void reportError(Throwable throwable) {
         if (isNetworkError(throwable)) {
-            reconnect = true;
-            notify(GRPCChannelStatus.DISCONNECT);
+            triggerReconnect();
         }
     }
 
     private void notify(GRPCChannelStatus status) {
-        for (GRPCChannelListener listener : listeners) {
-            try {
-                listener.statusChanged(status);
-            } catch (Throwable t) {
-                LOGGER.error(t, "Fail to notify {} about channel connected.", 
listener.getClass().getName());
+        synchronized (listeners) {
+            for (GRPCChannelListener listener : listeners) {
+                try {
+                    listener.statusChanged(status);
+                } catch (Throwable t) {
+                    LOGGER.error(t, "Fail to notify {} about channel 
connected.", listener.getClass().getName());
+                }
             }
         }
     }
 
+    /**
+     * Create a new gRPC channel to the specified server and reset connection 
state.
+     */
+    private void createNewChannel(String host, int port) throws Exception {
+        if (managedChannel != null) {
+            managedChannel.shutdownNow();
+        }
+
+        managedChannel = GRPCChannel.newBuilder(host, port)
+                                    .addManagedChannelBuilder(new 
StandardChannelBuilder())
+                                    .addManagedChannelBuilder(new 
TLSChannelBuilder())
+                                    .addChannelDecorator(new 
AgentIDDecorator())
+                                    .addChannelDecorator(new 
AuthenticationDecorator())
+                                    .build();
+
+        // Reset reconnectCount after actually rebuilding the channel
+        reconnectCount = 0;
+        notifyConnected();
+    }
+
+    /**
+     * Trigger reconnection by setting reconnect flag and notifying listeners.
+     */
+    private void triggerReconnect() {
+        synchronized (statusLock) {
+            reconnect = true;
+            notify(GRPCChannelStatus.DISCONNECT);
+        }
+    }
+
+    /**
+     * Notify listeners that connection is established without resetting 
reconnectCount.
+     * This is used when the channel appears connected but we want to keep 
monitoring
+     * reconnect attempts in case it's a false positive (half-open connection).
+     */
+    private void notifyConnected() {
+        synchronized (statusLock) {
+            // Don't reset reconnectCount - connection might still be half-open

Review Comment:
   Why can't you just add the TRANSIENT_FAILURE status into isNetworkError 
method? Then it could be triggered reconnection when the status changed. 
   If no error received, then, I don't think it is in TRANSIENT_FAILURE(or half 
closed).
   
   Please check the runtime more.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to