grpc github issue: https://github.com/grpc/grpc/issues/18554
StackOverflow post: 
https://stackoverflow.com/questions/55460086/client-channel-unusable-after-a-network-reset

Summary: If a client channel is in a READY state and the network is 
disconnected, the channel becomes unusable and the client will not attempt 
to reconnect to the server once the network connection is re-established. *The 
channel does not transition from a READY state to a TRANSIENT_FAILURE on a 
DEADLINE_EXCEEDED error (deadline set by my client application).*

What version of gRPC and what language are you using?

1.17.2
Same issue experienced in version 1.11.x
C++


What operating system (Linux, Windows,...) and version?

Client running on Ubuntu 16.04.
Server running Windows Enterprise.


What did you do?

Server and client are both started on a connected network. I can 
successfully make calls and receive responses from the server. When the 
network is turned off, the server receives a "Disconnected client - 
Endpoint read failed" error. Some other relevant fields in this debug 
message - "grpc_status":14 (UNAVAILABLE), "occured_during_write":0, 
"description":"An established connection was aborted by the software in 
your host machine".

At the time of the network disconnect, the client does not print any logs 
at all (using 
GRPC_TRACE=connectivity_state,call_error,op_failure,server_channel,client_channel,channel
and GRPC_VERBOSITY=DEBUG).

Once the network is turned on again, no logs appear on either the server 
or the client. Attempting to make a call using the client (send a launch 
request) results in a repeated DEADLINE_EXCEEDED error. Turning off the 
network connection at this time does not result in a server-side 
"Disconnected client" error.

The client context is set to use a deadline (tested with 2 and 10 seconds). 
Synchronous calls are used in this case.


*Code snippets:*
*/rpc_service.proto*
syntax = "proto3";

import "google/rpc/status.proto";

message RpcRequest {
}

message RpcResponse {
}

service RpcService{
rpc Call(RpcRequest) returns (RpcResponse);
}


*/client.cc*
Initialization:

std::unique_ptr<RpcService::Stub> stub_ = RpcService::NewStub(
    ::grpc::CreateChannel(
        server_endpoint, ::grpc::InsecureChannelCredentials()));

*Sending an RPC request:*

::grpc::ClientContext context;
context.set_deadline(
gpr_time_from_micros(call_timeout_.InMicroseconds(), GPR_TIMESPAN));
RpcRequest request;
RpcResponse response;
::grpc::Status grpc_status = stub_->Call(&context, request, &response);

*/server.cc*
grpc::ServerBuilder builder;
builder.AddListeningPort(endpoint, ::grpc::InsecureServerCredentials());
builder.RegisterService(&rpc_service);
std::unique_ptr<grpc::Server> grpc_server_ = builder.BuildAndStart();

What did you expect to see?

Client should make a successful call after a network reset.


What did you see instead?

Client fails to receive a response from the server.


Anything else we should know about your project / environment?

When the network connection is re-established and the client fails to 
receive a response from the server, tcpdump captures the client sending out 
some packets.
Starting up both client and server with network ON, and then unplugging the 
network does not result in any error messages until a call is attempted. 
This is the same result as when starting both client and server with the 
network disconnected. Once a call is attempted the client will transition 
from IDLE to CONNECTING and then begin to bounce back and forth between 
CONNECTING and TRANSIENT_FAILURE states (attempting to reconnect with 
exponential back-off) until the connection is re-established.
If the client is started with the network connected but doesn't send a 
request, and the network is then disconnected, the server doesn't get a 
disconnected client error. Until a call is made, the client stays in the 
IDLE state.
If a client is initialized and a call is made on a disconnected network, 
the client enters the CONNECTING state and retries with exponential backoff 
(up to a maximum of 2 minutes), sitting in the TRANSIENT_FAILURE state 
between attempts. Once the network is connected, the connection is 
re-established the next time the channel enters the CONNECTING state, and 
the client enters the READY state. After this, each call succeeds until the 
network is reset.
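
To make the "exponential backoff up to a max of 2 min" concrete, here is a 
minimal sketch of the nominal wait between reconnect attempts. Only the 
2-minute cap comes from the observations above; the 1-second initial delay 
and the 1.6 multiplier are assumptions taken from gRPC's connection backoff 
document, and jitter is left out so the numbers are deterministic. 
BackoffSchedule is a hypothetical helper, not a gRPC function.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Assumed defaults (from gRPC's connection backoff document); only the
// 120 s cap is confirmed by the behaviour described in this report.
constexpr double kInitialBackoffSec = 1.0;
constexpr double kMultiplier = 1.6;
constexpr double kMaxBackoffSec = 120.0;

// Hypothetical helper: returns the nominal delay (in seconds, no jitter)
// before each of the first `attempts` reconnect attempts.
std::vector<double> BackoffSchedule(int attempts) {
  assert(attempts >= 0);
  std::vector<double> delays;
  double current = kInitialBackoffSec;
  for (int i = 0; i < attempts; ++i) {
    delays.push_back(current);
    // Grow the delay geometrically, but never beyond the cap.
    current = std::min(current * kMultiplier, kMaxBackoffSec);
  }
  return delays;
}
```

Under these assumptions the delay starts at 1 s, grows by a factor of 1.6 
per failed attempt, and stays at 120 s once the cap is reached, which would 
explain why a reconnect can take up to 2 minutes after the network returns.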
Disconnecting the network after the client is in a READY state will not 
transition the client out of a READY state.
In summary: Until a call is made, the client will stay in the IDLE state 
no matter the network status. Once a call is made, the client will attempt 
to make a connection by entering the CONNECTING state. If no connection is 
found, it will bounce between the CONNECTING and TRANSIENT_FAILURE states. 
Once a connection is found, the client will go into the READY state. From 
here, if the connection is lost, the client will not attempt to enter the 
CONNECTING state again.
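
The transitions summarized above can be written down as a small state 
table. This is only a model of the behaviour observed in this report, not 
of what gRPC specifies; the State and Event enums and the Next function are 
hypothetical names chosen for illustration (the state names mirror gRPC's 
connectivity states, minus SHUTDOWN).

```cpp
#include <cassert>

enum class State { IDLE, CONNECTING, TRANSIENT_FAILURE, READY };
enum class Event {
  CallMade,
  ConnectSucceeded,
  ConnectFailed,
  BackoffExpired,
  NetworkLost
};

// Models the *observed* client behaviour described in this report.
State Next(State s, Event e) {
  switch (s) {
    case State::IDLE:
      // Nothing happens until a call is made, whatever the network does.
      return e == Event::CallMade ? State::CONNECTING : s;
    case State::CONNECTING:
      if (e == Event::ConnectSucceeded) return State::READY;
      if (e == Event::ConnectFailed) return State::TRANSIENT_FAILURE;
      return s;
    case State::TRANSIENT_FAILURE:
      // After the backoff delay the channel tries to connect again.
      return e == Event::BackoffExpired ? State::CONNECTING : s;
    case State::READY:
      // The problem: losing the network does not move the channel out of
      // READY, so no reconnect is ever attempted.
      return s;
  }
  return s;
}
```

The expected behaviour would differ only in the READY case, where a lost 
connection should eventually move the channel out of READY so that a 
reconnect is attempted.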


*Similar issue (closed) to the one I’m having:*
https://github.com/grpc/grpc/issues/16974


Known fix

Create a new channel on each call.


Failed fix attempts

Set GRPC_ARG_HTTP2_MAX_PINGS_WITHOUT_DATA = 0


Questions

Should the client be able to use the already created channel after a 
network reset?
Does the channel have to be restarted when a network is reset?

