[I] [E1008]Reached timeout=10000ms (brpc)

2024-07-08 Thread via GitHub


clee01 opened a new issue, #2686:
URL: https://github.com/apache/brpc/issues/2686

   **Describe the bug**
   After the connection has been idle for a while (about 1.5 h), the next request from the brpc client fails with `[E1008]Reached timeout=1ms`. About 15 minutes later the error stops occurring.
   
   **To Reproduce**
   Leave the brpc client's connection idle for about 1.5 h, then send a request; it fails with `[E1008]Reached timeout=1ms`.
   
   **Expected behavior**
   Requests succeed without an E1008 error.
   
   **Versions**
   OS: centos7.9
   Compiler: gcc 9.3.1
   brpc: -
   protobuf: -
   
   **Additional context/screenshots**
   ``` cpp
   Client::Client() {
     auto config = ConfigManager::GetConfigManager();
   
     brpc::ChannelOptions options;
   
     options.protocol = config->GetConfigParam(
         "risk_control", "protocol", "baidu_std");
     options.connection_type = config->GetConfigParam(
         "risk_control", "connection_type", "");
     options.timeout_ms =
         config->GetConfigParam("risk_control", "timeout_ms", "100");
     options.max_retry =
         config->GetConfigParam("risk_control", "max_retry", "3");
   
     const std::string server = config->GetConfigParam(
         "risk_control", "server", "0.0.0.0:8000");
     const std::string load_balancer =
         config->GetConfigParam("risk_control", "load_balancer", "");
   
     if (channel_.Init(server.c_str(), load_balancer.c_str(), &options) != 0) {
       SLOG_ERROR("", "RiskControl", "Client Init failed", 0, 0);
     }
     stub_.reset(new rc::AsyncRcService_Stub(&channel_));
   }
   ```
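   My preliminary analysis is that the TCP connection had silently gone dead without the application layer noticing, which triggered TCP timeout retransmission.
   
   What I still don't understand is why the TCP connection would go dead in the first place: `lsof` shows the connection in the `established` state on both the client and the server.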
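
If the stale-connection hypothesis above is right, one generic way to make the kernel notice a dead peer during idle periods is TCP keepalive. The following is only a minimal sketch for a plain Linux socket, not brpc code: `EnableTcpKeepalive` is a hypothetical helper name, and the idle, interval, and probe-count values are illustrative assumptions. Whether and how brpc exposes an equivalent setting is not covered in this thread.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper (not part of brpc): enable TCP keepalive on `fd`.
// idle_s:     seconds the connection may sit idle before the first probe
// interval_s: seconds between unanswered probes
// count:      unanswered probes before the kernel reports the connection dead
// Returns true only if every option was applied.
bool EnableTcpKeepalive(int fd, int idle_s, int interval_s, int count) {
    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) != 0) {
        return false;
    }
    // The three options below are Linux-specific.
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) != 0) {
        return false;
    }
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof(interval_s)) != 0) {
        return false;
    }
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) != 0) {
        return false;
    }
    return true;
}
```

For example, `EnableTcpKeepalive(fd, 60, 10, 3)` would ask the kernel to start probing after 60 s of idleness and to give up after three probes sent 10 s apart without a reply, so a dead peer would surface as a socket error instead of a connection that still looks `established`.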
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@brpc.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@brpc.apache.org
For additional commands, e-mail: dev-h...@brpc.apache.org



Re: [I] [E1008]Reached timeout=10000ms (brpc)

2024-09-07 Thread via GitHub


cfycyf commented on issue #2686:
URL: https://github.com/apache/brpc/issues/2686#issuecomment-2335107933

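   Has this been resolved? I'm hitting the same error.
   Scenario:
   1. The brpc server runs on node A.
   2. The brpc client runs on node B; it periodically sends requests to node A and receives the replies.
   Steps:
   1. On node B, run `ifconfig ethX down` (ethX is the NIC on the network that carries the brpc connection).
   2. Wait about 30 s, then run `ifconfig ethX up`.
   
   Expected result:
   brpc communication works normally again about 1 s after the NIC comes back up.
   Actual result:
   About half the time, "[E1008]Reached timeout=" appears in the 8-15 s after the NIC comes back up,
   even though the brpc server itself is healthy.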
   

