I am using Go gRPC (grpc-go) v1.40.0 on macOS to communicate with my gRPC
server. I have noticed that if grpc.Dial is called once while the machine's
network state is unstable (say, something prevents DNS resolution of my
target server's host name), gRPC cannot recover and return a usable
grpc.ClientConn for a very long time, even after the network has stabilised
(DNS resolution is working again). This happens despite several
reconnection attempts (ClientConn.Close followed by a new grpc.Dial).
During my attempts to re-dial the connection, I don't see any attempt to
resolve my server's host name, or any TCP traffic to my server, in the
Wireshark trace. I do see a lot of noise from DNS queries for *grpc_config*,
which go unanswered because my server doesn't support them. Dial() doesn't
return any error, but the connection state is always 'idle' rather than
'ready' (as it is when it works).
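
To make the DNS side of this easier to reproduce outside gRPC, here is a minimal sketch that runs the two lookups involved using only the Go standard library. The function names and the "localhost" placeholder host are my own for illustration. The "_grpc_config." prefix is the TXT record name that grpc-go's DNS resolver queries for service config. This only shows what the system resolver returns at a given moment; it does not reproduce gRPC's own resolver or its retry timing.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// serviceConfigName returns the TXT record name that grpc-go's DNS resolver
// queries when it looks for a service config for the given host.
func serviceConfigName(host string) string {
	return "_grpc_config." + host
}

// probe prints the result of a host lookup and of the service-config TXT
// lookup, both through the system's default resolver.
func probe(ctx context.Context, host string) {
	addrs, err := net.DefaultResolver.LookupHost(ctx, host)
	fmt.Printf("host %s: %v (err: %v)\n", host, addrs, err)

	name := serviceConfigName(host)
	txts, err := net.DefaultResolver.LookupTXT(ctx, name)
	fmt.Printf("TXT  %s: %v (err: %v)\n", name, txts, err)
}

func main() {
	// Placeholder host; substitute the real server's host name.
	host := "localhost"

	// Bound both lookups so an unanswered TXT query cannot hang the program.
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()

	probe(ctx, host)
}
```

If the unanswered TXT queries turn out to matter, grpc-go has a grpc.WithDisableServiceConfig() dial option that, per its documentation, hints to the resolver not to fetch service configs. I have not confirmed that it changes the behaviour described here.
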
However, I also noticed that when I pass a custom dialer function (one that
uses net.Dialer with the system's default resolver) as one of the grpc
DialOptions in the call to grpc.Dial(...), the re-attempt succeeds and I
get back a valid connection in the 'ready' state. But if I don't supply a
dialer function and let gRPC manage name resolution and dialing of the
connection, it doesn't recover easily from the bad state; it takes a very
long time, 10-20 minutes.
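
For context, the custom dialer mentioned above looks roughly like the sketch below. The names systemDialer and checkDialer and the timeout values are illustrative, not my exact code. The function has the signature that grpc.WithContextDialer expects, and the self-check dials a throwaway loopback listener so the snippet runs without gRPC or a real server.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// systemDialer (hypothetical name) dials addr over TCP using net.Dialer and
// the system's default resolver. Its signature matches what
// grpc.WithContextDialer accepts:
//   func(context.Context, string) (net.Conn, error)
func systemDialer(ctx context.Context, addr string) (net.Conn, error) {
	d := &net.Dialer{
		Timeout:  5 * time.Second,     // illustrative value
		Resolver: net.DefaultResolver, // explicit, although it is the default
	}
	return d.DialContext(ctx, "tcp", addr)
}

// checkDialer opens a loopback listener on an ephemeral port and verifies
// that systemDialer can connect to it.
func checkDialer() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	defer ln.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	conn, err := systemDialer(ctx, ln.Addr().String())
	if err != nil {
		return err
	}
	return conn.Close()
}

func main() {
	// In the real client the dialer is passed as a dial option, e.g.:
	//   grpc.Dial(target, grpc.WithInsecure(), grpc.WithContextDialer(systemDialer))
	if err := checkDialer(); err != nil {
		fmt.Println("dial failed:", err)
		return
	}
	fmt.Println("dial ok")
}
```
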
Note: we use the DNS scheme for the Go resolver: resolver.SetDefaultScheme("dns")
Can anyone shed some light on grpc-go's connection establishment logic
that would explain this?