gRPC GitHub issue: https://github.com/grpc/grpc/issues/18554
StackOverflow post: https://stackoverflow.com/questions/55460086/client-channel-unusable-after-a-network-reset

Summary: If a client channel is in the READY state and the network is disconnected, the channel becomes unusable, and the client does not attempt to reconnect to the server once the network connection is re-established. *The channel does not transition from READY to TRANSIENT_FAILURE on a DEADLINE_EXCEEDED error (the deadline is set by my client application).*

What version of gRPC and what language are you using?
1.17.2, C++. The same issue was experienced in version 1.11.x.

What operating system (Linux, Windows,...) and version?
Client running on Ubuntu 16.04. Server running on Windows Enterprise.

What did you do?
Server and client are both started on a connected network. I can successfully make calls and receive responses from the server.

When the network is turned off, the server receives a "Disconnected client - Endpoint read failed" error. Other relevant fields in this debug message: "grpc_status":14 (UNAVAILABLE), "occured_during_write":0, "description":"An established connection was aborted by the software in your host machine". At the time of the network disconnect, the client does not print any logs at all (using GRPC_TRACE=connectivity_state,call_error,op_failure,server_channel,client_channel,channel GRPC_VERBOSITY=DEBUG).

Once the network is turned on again, no logs appear on either the server or the client. Attempting to make a call using the client (send a launch request) results in a repeated DEADLINE_EXCEEDED error. Turning off the network connection at this point does not result in a server-side "Disconnected client" error.

The client context is set to use a deadline (tested with 2 and 10 seconds). Synchronous calls are used in this case.

*Code snippets:*

*/rpc_service.proto*

    syntax = "proto3";

    import "google/rpc/status.proto";

    message RpcRequest { }
    message RpcResponse { }

    service RpcService {
      rpc Call(RpcRequest) returns (RpcResponse);
    }

*/client.cc*

Initialization:

    // One channel and stub are created up front and reused for every call.
    std::unique_ptr<RpcService::Stub> stub_ = RpcService::NewStub(
        ::grpc::CreateChannel(server_endpoint,
                              ::grpc::InsecureChannelCredentials()));

*Sending an RPC request:*

    ::grpc::ClientContext context;
    // Per-call deadline derived from call_timeout_.
    context.set_deadline(
        gpr_time_from_micros(call_timeout_.InMicroseconds(), GPR_TIMESPAN));
    RpcRequest request;
    RpcResponse response;
    // Synchronous (blocking) unary call.
    ::grpc::Status grpc_status = stub_->Call(&context, request, &response);

*/server.cc*

    grpc::ServerBuilder builder;
    builder.AddListeningPort(endpoint, ::grpc::InsecureServerCredentials());
    builder.RegisterService(&rpc_service);
    std::unique_ptr<grpc::Server> grpc_server_ = builder.BuildAndStart();

What did you expect to see?
The client should make a successful call after a network reset.

What did you see instead?
The client fails to receive a response from the server.

Anything else we should know about your project / environment?
When the network connection is re-established and the client fails to receive a response from the server, tcpdump shows the client sending out some packets.

Starting both client and server with the network ON and then unplugging the network does not result in any error messages until a call is attempted. This is the same result as when starting both client and server with the network disconnected. Once a call is attempted, the client transitions from IDLE to CONNECTING and then bounces back and forth between the CONNECTING and TRANSIENT_FAILURE states (attempting to reconnect with exponential backoff) until the connection is re-established.

If the client is started with the network connected but does not send a request, and the network is then disconnected, the server does not get a "Disconnected client" error. Until a call is made, the client stays in the IDLE state.
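
To make the reconnect cadence concrete, here is a minimal sketch of an exponential backoff schedule. It is an illustration under stated assumptions, not gRPC's implementation: the initial delay of 1 second and the multiplier of 1.6 are the defaults listed in gRPC's connection backoff document, the 120 second cap matches the 2 minute maximum mentioned in this report, and jitter is left out. `BackoffSchedule` is a hypothetical helper, not a gRPC API.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not part of gRPC): returns the first `attempts`
// reconnect delays in seconds, without jitter.
// Defaults are assumptions: 1 s initial delay, 1.6 multiplier, 120 s cap.
std::vector<double> BackoffSchedule(int attempts,
                                    double initial_s = 1.0,
                                    double multiplier = 1.6,
                                    double max_s = 120.0) {
  assert(attempts >= 0);
  std::vector<double> delays;
  double current = initial_s;
  for (int i = 0; i < attempts; ++i) {
    delays.push_back(current);
    // Grow the delay geometrically, but never beyond the cap.
    current = std::min(current * multiplier, max_s);
  }
  return delays;
}
```

Under these assumptions the delay grows from 1 second and reaches the 120 second cap on the twelfth attempt, so a client that stays disconnected for a long time may wait up to 2 minutes after the network returns before its next connection attempt.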

If a client is initialized and a call is made on a disconnected network, the client enters the CONNECTING state (retrying with exponential backoff up to a maximum of 2 minutes, during which the client is in the TRANSIENT_FAILURE state). Once the network is connected, the connection is re-established the next time the channel enters the CONNECTING state, and the client then enters the READY state. After this, each call succeeds until the network is reset. Disconnecting the network after the client is in the READY state does not transition the client out of READY.

In summary: Until a call is made, the client stays in the IDLE state regardless of the network status. Once a call is made, the client attempts to connect by entering the CONNECTING state. If no connection can be made, it bounces between the CONNECTING and TRANSIENT_FAILURE states. Once a connection is made, the client goes into the READY state. From here, if the connection is lost, the client does not attempt to enter the CONNECTING state again.

*Similar issue (closed) to the one I'm having:* https://github.com/grpc/grpc/issues/16974

Known fix:
Create a new channel on each call.

Failed fix attempts:
Set GRPC_ARG_HTTP2_MAX_PINGS_WITHOUT_DATA = 0

Questions:
Should the client be able to use the already created channel after a network reset?
Does the channel have to be restarted when a network is reset?
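
The transitions observed in this report can be written down as a small state table. This is only a model of the reported behavior, not gRPC's implementation; the `Event` names and the `Next` function are hypothetical and exist purely for illustration.

```cpp
// Channel states named in the report.
enum class State { kIdle, kConnecting, kTransientFailure, kReady };

// Hypothetical events, invented here to label what triggers each transition.
enum class Event {
  kCallAttempted,
  kConnectSucceeded,
  kConnectFailed,
  kBackoffExpired,
  kNetworkLost
};

// Returns the next state as observed in the report.
State Next(State s, Event e) {
  switch (s) {
    case State::kIdle:
      // Nothing happens until a call is made, whatever the network status.
      if (e == Event::kCallAttempted) return State::kConnecting;
      break;
    case State::kConnecting:
      if (e == Event::kConnectSucceeded) return State::kReady;
      if (e == Event::kConnectFailed) return State::kTransientFailure;
      break;
    case State::kTransientFailure:
      // After the backoff delay the channel tries to connect again.
      if (e == Event::kBackoffExpired) return State::kConnecting;
      break;
    case State::kReady:
      // Reported problem: a lost network does not move the channel out of
      // READY, so no reconnect is ever attempted.
      break;
  }
  return s;  // Any other event leaves the state unchanged.
}
```

In this model, `Next(State::kReady, Event::kNetworkLost)` returns `State::kReady`. The expected behavior would instead be a transition out of READY so that the reconnect cycle can start again.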