In general, the connection times are fairly randomly distributed, but not 
always. One case I have seen is an update of the server: we spin up a new 
replica of the server, and it takes all of the connections. As a result, 
another replica spins up shortly afterward due to autoscaling, and this new 
replica takes no connections for 30m.
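
For reference, this kind of periodic connection recycling can be done with 
gRPC's server-side keepalive settings. Below is a minimal sketch in Go, 
assuming grpc-go; the 30m value, grace period, and listen address are 
illustrative rather than our exact configuration:

package main

import (
    "log"
    "net"
    "time"

    "google.golang.org/grpc"
    "google.golang.org/grpc/keepalive"
)

func main() {
    srv := grpc.NewServer(grpc.KeepaliveParams(keepalive.ServerParameters{
        // Close each connection once it is roughly 30 minutes old. grpc-go
        // adds some jitter to this age, so connections created at the same
        // time do not all expire at exactly the same moment.
        MaxConnectionAge: 30 * time.Minute,
        // Give in-flight streams a little time to finish before the
        // connection is forcibly closed.
        MaxConnectionAgeGrace: time.Minute,
    }))

    // Register services here before serving.
    lis, err := net.Listen("tcp", ":50051")
    if err != nil {
        log.Fatalf("listen: %v", err)
    }
    if err := srv.Serve(lis); err != nil {
        log.Fatalf("serve: %v", err)
    }
}

When an aged connection is closed, the client channel re-resolves and 
reconnects, which is what allows load to trickle onto new replicas.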

As far as tuning the autoscaling goes, that is a good point; I will look 
into it.
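
For example, if the cluster supports the HPA "behavior" fields 
(autoscaling/v2), lengthening the scale-down stabilization window should 
give a freshly started replica more time to accumulate connections before 
it is considered for removal. A hypothetical sketch using the Kubernetes 
Go API types; all names and values below are illustrative:

package main

import (
    "encoding/json"
    "fmt"

    autoscalingv2 "k8s.io/api/autoscaling/v2"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func int32Ptr(i int32) *int32 { return &i }

func main() {
    // Hypothetical HPA for the gRPC server Deployment; "grpc-server", the
    // replica bounds, and the one-hour window are placeholders.
    window := int32(3600)
    hpa := &autoscalingv2.HorizontalPodAutoscaler{
        ObjectMeta: metav1.ObjectMeta{Name: "grpc-server"},
        Spec: autoscalingv2.HorizontalPodAutoscalerSpec{
            ScaleTargetRef: autoscalingv2.CrossVersionObjectReference{
                APIVersion: "apps/v1",
                Kind:       "Deployment",
                Name:       "grpc-server",
            },
            MinReplicas: int32Ptr(2),
            MaxReplicas: 10,
            Behavior: &autoscalingv2.HorizontalPodAutoscalerBehavior{
                // Require an hour of sustained low load before removing a
                // replica, instead of the default few minutes.
                ScaleDown: &autoscalingv2.HPAScalingRules{
                    StabilizationWindowSeconds: &window,
                },
            },
        },
    }
    out, _ := json.MarshalIndent(hpa, "", "  ")
    fmt.Println(string(out))
}

I believe there is also a cluster-wide equivalent via the controller 
manager's --horizontal-pod-autoscaler-downscale-stabilization flag.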

On Wednesday, October 16, 2019 at 10:55:20 AM UTC-7, Mark D. Roth wrote:
>
> Do your clients all start up at the same time?  If not, it's not clear to 
> me why your setup wouldn't work.  If the clients' start times are randomly 
> distributed, then if the server closes each connection after 30m, the 
> connection close times should be just as randomly distributed as the client 
> start times, which means that as soon as the new server comes up, clients 
> should start trickling into it.  It may take 30m for the load to fully 
> balance, but the new server should start getting new load immediately, and 
> the load should increase slowly over that 30m period.
>
> I don't know anything about the Kubernetes autoscaling side of things, but 
> maybe there are parameters you can tune there to give the new server more 
> time to accumulate load before Kubernetes kills it?
>
> In general, it's not clear to me that there's a better approach than the 
> one you're already taking.  There's always a bit of tension between load 
> balancing and streaming RPCs, because the whole point of a streaming RPC is 
> that it doesn't go through load balancing for each individual message, 
> which means that all of the messages go to the same backend.
>
> I hope this information is helpful.
>
> On Mon, Oct 7, 2019 at 3:54 PM howardjohn via grpc.io <
> grp...@googlegroups.com> wrote:
>
>> We have a case where we have many clients and few servers, typically 
>> 1000:1 ratio. The traffic is a single bidirectional stream per client.
>>
>> The problem we are seeing is that when a new server comes up, it will 
>> have no clients connected, as they maintain their connection to the other 
>> servers. 
>>
>> This is made worse by Kubernetes autoscaling: since this new server will 
>> have 0 load, it will scale down, and we flip-flop between n and n+1 
>> replicas. 
>> This graph shows this behavior pretty well: 
>> https://snapshot.raintank.io/dashboard/snapshot/SceOCrNpdOr4qmTUk1UHF20xMiNqGk6K?panelId=4&fullscreen&orgId=2
>>
>> As a mitigation against this, we have the server close the connections 
>> every 30m. This is not great, because it takes at least 30 minutes to 
>> balance, and due to the above issue it generally doesn't ever work.
>>
>>
>> I am wondering if there are any best practices for handling this type of 
>> problem?
>>
>> One possible idea we have is the servers sharing load information and 
>> shedding load if they have more than their "fair share" of connections, 
>> but this is pretty complex.
>>
>
>
> -- 
> Mark D. Roth <ro...@google.com>
> Software Engineer
> Google, Inc.
>
