zrlw edited a comment on issue #8993:
URL: https://github.com/apache/dubbo/issues/8993#issuecomment-939205361


   The log of SingleRegistryCenterDubboProtocolIntegrationTest, the test that failed with "zookeeper not connected", contains a zk client session timeout warning that belongs to the previous test class, SingleRegistryCenterInjvmIntegrationTest.
   (Build log: https://github.com/apache/dubbo/runs/3835835450?check_suite_focus=true)
   The key excerpts are below:
   ```
   [INFO] Running 
org.apache.dubbo.integration.single.injvm.SingleRegistryCenterInjvmIntegrationTest
  
   <== previous test class, SingleRegistryCenterInjvmIntegrationTest
   [08/10/21 07:35:39:199 UTC] Curator-ConnectionStateManager-0  INFO 
curator.CuratorZookeeperClient:  [DUBBO] Curator zookeeper client instance 
initiated successfully, session id is 100001a38dd0000, dubbo version: 
3.0.4-SNAPSHOT, current host: 172.19.112.1  
   <== zk client session of the previous test class (id: 100001a38dd0000)
   [08/10/21 07:35:39:624 UTC] main  INFO support.RegistryManager:  [DUBBO] 
Close all registries [], dubbo version: 3.0.4-SNAPSHOT, current host: 
172.19.112.1
   <== previous test class closes all registries
   [08/10/21 07:35:39:624 UTC] main  INFO deploy.DefaultApplicationDeployer:  
[DUBBO] Dubbo Application[243.1] has stopped., dubbo version: 3.0.4-SNAPSHOT, 
current host: 172.19.112.1
   <== previous test class: Dubbo application has stopped
   [08/10/21 07:35:39:632 UTC] main  INFO 
registrycenter.ZookeeperRegistryCenter:  [DUBBO] The ZookeeperRegistryCenter 
close successfully., dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1
   <== previous test class closes the zk registry center
   [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.493 
s - in 
org.apache.dubbo.integration.single.injvm.SingleRegistryCenterInjvmIntegrationTest
   <== previous test class ends
   [INFO] Running 
org.apache.dubbo.integration.single.SingleRegistryCenterDubboProtocolIntegrationTest
   
   <== start of SingleRegistryCenterDubboProtocolIntegrationTest, the test class that hits the zk connection failure
   [08/10/21 07:35:39:634 UTC] main  INFO 
registrycenter.ZookeeperRegistryCenter:  [DUBBO] The ZookeeperRegistryCenter is 
starting..., dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1
   <== starts the zk registry center
   [08/10/21 07:35:39:727 UTC] Curator-ConnectionStateManager-0  WARN 
curator.CuratorZookeeperClient:  [DUBBO] Curator zookeeper connection of 
session 100001a38dd0000 timed out. connection timeout value is 3000, session 
expire timeout value is 60000, dubbo version: 3.0.4-SNAPSHOT, current host: 
172.19.112.1
   <== warns that the previous test class's session has timed out (id: 100001a38dd0000)
   [08/10/21 07:35:40:655 UTC] main  INFO 
registrycenter.ZookeeperRegistryCenter:  [DUBBO] The ZookeeperRegistryCenter is 
started successfully, dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1
   ```
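   Problem:
   
   The tearDown of the previous test class, SingleRegistryCenterDubboProtocolIntegrationTest, calls DubboBootstrap.reset(), so all zk clients should have been closed. However, stepping through with a debugger shows that the DubboBootstrap.reset() call in SingleRegistryCenterDubboProtocolIntegrationTest does not invoke the doClose method of CuratorZookeeperClient.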
   
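   I read a post on troubleshooting Curator connection failures. It says Curator's event loop is an endless loop that calls each watcher in turn; if one watcher hangs, none of the events queued after it are processed. In other words, if the zk connection event is queued behind the hung event handler, Curator never gets the chance to process the connected event and set currentConnectionState to connected, so the application's connection times out and fails.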
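
To make the failure mode described in that post concrete, here is a minimal sketch in plain java.util.concurrent. It is not Curator's actual code: the class name HungListenerDemo, the method connectedWithin, and the single-thread executor standing in for the event loop are all assumptions made for illustration. It only shows that when one handler on a single-threaded loop never returns, a "connected" event queued behind it is not delivered before the caller's timeout.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it does not use or reproduce any Curator or Dubbo API.
public class HungListenerDemo {

    /**
     * Queues two events on a single-threaded "event loop": an arbitrary watcher
     * callback, then the "connected" event. Returns true if the connected event
     * was processed within timeoutMs.
     */
    static boolean connectedWithin(boolean firstHandlerHangs, long timeoutMs)
            throws InterruptedException {
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        CountDownLatch connected = new CountDownLatch(1); // stands in for the state flip to "connected"
        CountDownLatch release = new CountDownLatch(1);   // never released while the caller waits
        try {
            // Event 1: a watcher callback that may hang.
            eventLoop.submit(() -> {
                if (firstHandlerHangs) {
                    try {
                        release.await(); // simulates a stuck watcher
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            // Event 2: the "connected" event, queued behind event 1.
            eventLoop.submit(connected::countDown);

            // The caller waits for the connection, like a connection timeout.
            return connected.await(timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            release.countDown();
            eventLoop.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("healthy loop connected: " + connectedWithin(false, 3000));
        System.out.println("hung loop connected: " + connectedWithin(true, 300));
    }
}
```

With a healthy first handler the connected event is processed almost immediately; with a hung first handler the wait expires without it, which matches the timeout symptom in the log above, assuming the post's description of the event loop is accurate.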


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


