xishpo opened a new issue, #3098:
URL: https://github.com/apache/brpc/issues/3098

   **Describe the bug**
   When a service is accessed over short-lived OpenSSL connections, requests occasionally fail with an E1008 timeout error.
   
   **To Reproduce**
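   The problem is relatively easy to trigger when the request/response payload is between 64K and 200K; larger or smaller payloads rarely trigger it.
   The logs show that whenever the client reports E1008, the following holds in IOBuf::cut_multiple_into_SSL_channel:
   the returned nw always equals the full message length, while BIO_flush(wbio) <= 0, BIO_should_write(wbio) > 0,
   and BIO_fd_non_fatal_error(errno) > 0.
   This indicates that unsent data is still sitting in the SSL buffer.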
   iobuf.cpp
   IOBuf::cut_multiple_into_SSL_channel
   #ifndef USE_MESALINK
       // Flush remaining data inside the BIO buffer layer
       BIO* wbio = SSL_get_wbio(ssl);
       if (BIO_wpending(wbio) > 0) {
           int rc = BIO_flush(wbio);
           if (rc <= 0 && BIO_fd_non_fatal_error(errno) == 0) {
               // Fatal error during BIO_flush
               *ssl_error = SSL_ERROR_SYSCALL;
               return rc;
           }
           // <-- execution takes this path (rc <= 0, but the error is non-fatal)
       }
   #else
       int rc = SSL_flush(ssl);
       if (rc <= 0) {
           *ssl_error = SSL_ERROR_SYSCALL;
           return rc;
       }
   #endif

       return nw;
   }
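   Because all of the data has already been written into the SSL session, IsWriteComplete returns true in KeepWrite.
   As a result, DoWrite and cut_multiple_into_SSL_channel are never entered again, so BIO_flush is not called a second time to send the data still held in the SSL layer.
   The last segment of data is therefore never sent to the client, which triggers the E1008 timeout error.
   Calling BIO_flush directly after the IsWriteComplete check in KeepWrite resolves the problem,
   but I am not sure whether this causes other issues.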
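
The stall described above can be reproduced in a small model that does not depend on OpenSSL or brpc. This is only a sketch under stated assumptions: `FakeSocket`, `BufferedLayer`, `WriteCompleteNaive` and `WriteCompleteWithFlush` are hypothetical names invented for illustration, not brpc or OpenSSL APIs. `BufferedLayer` stands in for the buffering BIO under the SSL session, and the two completion checks contrast the current behavior with the idea proposed in this issue.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a non-blocking socket: it accepts at most
// `capacity` bytes, after which writes accept 0 bytes (modeling EAGAIN).
struct FakeSocket {
    std::size_t capacity;
    std::string received;

    std::size_t Write(const std::string& data) {
        std::size_t n = std::min(capacity, data.size());
        received.append(data, 0, n);
        capacity -= n;
        return n;
    }
};

// Hypothetical stand-in for the buffering layer under the SSL session.
struct BufferedLayer {
    FakeSocket* sock;
    std::string pending;  // plays the role of what BIO_wpending() reports

    // Always accepts the whole payload, like nw == full message length.
    std::size_t Write(const std::string& data) {
        pending += data;
        Flush();  // may leave bytes behind; the result is ignored here
        return data.size();
    }

    // Returns true only when nothing is left buffered.
    bool Flush() {
        std::size_t n = sock->Write(pending);
        pending.erase(0, n);
        return pending.empty();
    }
};

// Completion check that only looks at how much the caller handed over.
bool WriteCompleteNaive(std::size_t accepted, std::size_t total) {
    return accepted == total;
}

// Variant in the spirit of the proposal: also require an empty buffer.
bool WriteCompleteWithFlush(std::size_t accepted, std::size_t total,
                            BufferedLayer& layer) {
    return accepted == total && layer.Flush();
}
```

In this model the naive check reports completion while bytes are still buffered, which matches the observed symptom. A real fix would presumably need to wait for the fd to become writable before retrying the flush rather than retrying immediately; that part is not modeled here.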
   
   socket.cpp
   void* Socket::KeepWrite(void* void_arg)
           if (NULL == cur_tail) {
               for (cur_tail = req; cur_tail->next != NULL;
                    cur_tail = cur_tail->next);
           }
           // Return when there's no more WriteRequests and req is completely
           // written.
           if (s->IsWriteComplete(cur_tail, (req == cur_tail), &cur_tail)) {
               CHECK_EQ(cur_tail, req);
               s->ReturnSuccessfulWriteRequest(req);
               return NULL;
           }
       } while (1);

       // Error occurred, release all requests until no new requests.
       s->ReleaseAllFailedWriteRequests(req);
       return NULL;
   }
   
   
   **Expected behavior**
   
   
   
   **Versions**
   OS:rhel8/rhel9
   Compiler:
   brpc:1.9-1.14
   protobuf:
   
   **Additional context/screenshots**
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

