Thanks Srini. I haven't tested option 2 yet, but since the client is unaware of what is happening, I would expect some request failures/latency spikes until a new connection is established. That's why I would consider it mostly for disaster prevention rather than for general connection balancing. I'm actually now more interested in exploring option 4, as it looks like we can get a safe setup if we keep a proxy in front of the servers and expose a separate proxy port for each server. Can someone recommend a good open-source grpclb implementation? I've found bsm/grpclb <https://github.com/bsm/grpclb>, which looks reasonable, but I wasn't sure if anything else is available.
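For anyone else evaluating option 2, here is roughly what I'd test first, based on the A9 server-side connection management proposal linked below: set MaxConnectionAge on the server so it sends a GOAWAY once a connection gets old, with a grace period for in-flight RPCs. Untested sketch in grpc-go; the durations are made up and would need tuning for your traffic:

```go
package main

import (
	"net"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/keepalive"
)

func main() {
	// After MaxConnectionAge the server sends a GOAWAY; the client then
	// redials through the L4 LB and may land on a different backend.
	// MaxConnectionAgeGrace gives in-flight RPCs time to finish, which
	// should soften the failure/latency concern I mentioned above.
	srv := grpc.NewServer(grpc.KeepaliveParams(keepalive.ServerParameters{
		MaxConnectionAge:      5 * time.Minute,  // made-up value, tune for your workload
		MaxConnectionAgeGrace: 30 * time.Second, // made-up value
	}))

	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		panic(err)
	}
	// Register your services on srv here, then serve.
	_ = srv.Serve(lis)
}
```

I believe grpc-go also adds some jitter to MaxConnectionAge so all connections don't reset at the same instant, but please verify that before relying on it.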
On Friday, February 19, 2021 at 12:50:17 PM UTC-8 Srini Polavarapu wrote:

> Hi,
>
> Option 3 is ideal, but since you don't have that or option 4 available, option 2 is worth exploring. Are the concerns with option 2 based on experiments you have done, or is it just a hunch? This comment <https://github.com/grpc/grpc/issues/12295#issuecomment-650364080> has some relevant info that you could use.
>
> On Thursday, February 18, 2021 at 7:06:37 PM UTC-8 vitaly....@gmail.com wrote:
>
>> Hey folks,
>>
>> I'm trying to solve the problem of even load (or at least connection) distribution between gRPC clients and our backend servers.
>>
>> First, let me describe our setup. We use a network (L4) load balancer in front of our gRPC servers. Clients see one endpoint (the LB) and connect to it. This means that standard client-side load-balancing features like round robin won't work, as there will only be one subchannel for client-server communication.
>>
>> One issue with this approach can be demonstrated by the following example. Say we have 2 servers running and 20 clients connected to them. Initially, since we go through the network load balancer, connections are distributed evenly (or close to it), so roughly 50% of connections go to each server. Now assume the servers reboot one after another, as in a deployment. The server that comes back up first gets all 20 client connections, and the server that comes up later gets zero. This situation won't change unless the client or server periodically drops connections, or more clients connect.
>>
>> I've considered a few options for solving this:
>> 1. Connection management on the client side - do something to reset the channel (like [enterIdle](https://grpc.github.io/grpc-java/javadoc/io/grpc/ManagedChannel.html#enterIdle) in grpc-java). Downside: this feature seems to have been developed for Android, and I can't find similar functionality in grpc-go.
>> 2. Connection management on the server side - drop connections periodically on the server. Downside: this approach is less graceful than the client-side one and may increase request latency and cause request failures on the client side.
>> 3. Use a request-based, gRPC-aware L7 LB, so that clients connect to the LB, which fans out requests to the servers. Downside: I've been told by our infra folks that this is hard to implement in our setup due to the way we use TLS and manage certificates.
>> 4. Expose our servers externally and use grpclb or client-side load balancing. Downside: it seems less secure and would make it harder to protect against DDoS attacks. I think this downside makes the approach unviable.
>>
>> My bias is towards option 3, request-based load balancing, because it allows much more fine-grained control based on load; but since our infra cannot support it at the moment, I might be forced to use option 1 or 2 in the short to mid term. I like option 2 the least, as it might cause latency spikes and errors on the client side.
>>
>> My questions are:
>> 1. Which approach is generally preferable?
>> 2. Are there other options to consider?
>> 3. Is it possible to influence gRPC channel state in grpc-go so as to trigger the resolver and balancer to establish a new connection, similar to what enterIdle does in Java?
>> From what I see in [clientconn.go](https://github.com/grpc/grpc-go/blob/master/clientconn.go), there is no option to change the channel state to idle or trigger a reconnect in some other way.
>> 4. Is there a way to implement server-side connection management cleanly, without severely impacting the client side?
>>
>> Here are links that I found useful for context:
>> grpc/load-balancing.md at master · grpc/grpc <https://github.com/grpc/grpc/blob/master/doc/load-balancing.md>
>> proposal/A9-server-side-conn-mgt.md at master · grpc/proposal <https://github.com/grpc/proposal/blob/master/A9-server-side-conn-mgt.md>
>> proposal/A8-client-side-keepalive.md at master · grpc/proposal <https://github.com/grpc/proposal/blob/master/A8-client-side-keepalive.md>
>> grpc/keepalive.md at master · grpc/grpc <https://github.com/grpc/grpc/blob/master/doc/keepalive.md>
>>
>> Sorry for the long read,
>> Vitaly
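P.S. Re: my question 3, for anyone who finds this thread later: since I couldn't find an enterIdle equivalent in grpc-go, the bluntest client-side workaround I can think of is to periodically redial and swap the ClientConn, so the new dial goes back through the L4 LB. A rough, untested sketch; the rotatingConn wrapper, the rotation interval, and the drain window are all mine, not a grpc-go API:

```go
package main

import (
	"context"
	"sync"
	"time"

	"google.golang.org/grpc"
)

// rotatingConn is a hypothetical wrapper that swaps the *grpc.ClientConn
// on a timer so new dials go back through the L4 LB. Not a grpc-go feature.
type rotatingConn struct {
	mu   sync.RWMutex
	conn *grpc.ClientConn
	addr string
}

func (r *rotatingConn) get() *grpc.ClientConn {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.conn
}

func (r *rotatingConn) rotate(ctx context.Context, every time.Duration) {
	t := time.NewTicker(every)
	defer t.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-t.C:
			// Dial options (TLS etc.) elided; grpc.WithInsecure() is just
			// a placeholder for this sketch.
			fresh, err := grpc.Dial(r.addr, grpc.WithInsecure())
			if err != nil {
				continue // keep the old conn if the redial fails
			}
			r.mu.Lock()
			old := r.conn
			r.conn = fresh
			r.mu.Unlock()
			// Delay the close so in-flight RPCs on the old conn can drain;
			// picking a safe window is the tricky part.
			time.AfterFunc(30*time.Second, func() { old.Close() })
		}
	}
}
```

Callers would have to create stubs from get() per RPC rather than caching them, since a stub holds on to the ClientConn it was created with.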