Hi grpc people! We have a setup where we're running a gRPC service (written in Go) on GKE, and we're accepting traffic from outside the cluster through nginx ingresses. Our clients all use gRPC Core-based libraries (mostly Ruby) to make calls to the nginx ingress, which load-balances per-call to our backend pods.
The problem with this setup is that whenever the nginx ingresses reload, they drop all client connections, which results in spikes of Unavailable errors from our gRPC clients. There are many nginx ingresses, but they all share a single IP; the incoming TCP connections are routed through a Google Cloud L4 load balancer. Whenever an nginx pod closes a client's TCP connection, the gRPC subchannel treats the backend as unavailable and goes into backoff, even though there are many more nginx pods immediately available to serve traffic. My understanding is that with multiple subchannels, even if one nginx ingress is restarted, the others can continue to serve requests and we shouldn't see Unavailable errors.

My question is: what is the best way to make gRPC Core establish multiple connections to a single IP, so that we can have long-lived connections to multiple nginx ingresses?

Possibilities we've considered:

- DNS round-robin with multiple public IPs on a single A record: we've tested this and it works, but it requires us to manually administer the DNS records and run multiple L4 LBs.
- DNS SRV records: it seems like we could have multiple SRV records with the same hostname, but in my testing this also requires a look-aside load balancer, plus enabling the c-ares DNS resolver, which doesn't seem to be production-ready.
- Hosting a look-aside load balancer: we could host our own LB service, but it's not clear to me how we would avoid the same problem for the LB service itself, since it would sit behind the same nginx ingresses. I also haven't found great documentation on how to set this up.
- Connection pooling in the client: wrapping the Ruby gRPC channels in a library that explicitly establishes multiple channels, each with one subchannel. I've tried to write this, but it's tricky to implement at a high level, and I couldn't get it to perform as well during failures as the DNS round-robin approach.

Are there options I missed? Is there any supported pattern for this?
Has anyone deployed a similar architecture (many clients connecting through nginx on a single public IP)?

Thanks,
Alysha