We are building a gRPC service in Python that has a bidirectional streaming endpoint and a unary endpoint.
We want the stream to live indefinitely, so we have set no timeouts, and the streams work as expected. We deploy with Docker and Kubernetes. The problem is scaling: how do you scale a gRPC server whose streams never end? We can't scale on the number of requests, because each client makes only one request and then sends its data as frames on that stream. Right now the server's worker thread pool has a maximum of 100 threads. One quick solution is to raise the maximum number of worker threads and scale on CPU load and memory usage. Is there a better way to do it?
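To make the question concrete, here is a rough sketch of the kind of signal we could scale on instead of request count: the number of streams currently open. It assumes the synchronous Python server, where each open stream holds one worker thread for its whole lifetime, so a pool of 100 threads caps a pod at 100 concurrent streams. The names `ActiveStreamGauge`, `replicas_needed`, and `target_percent` are made up for illustration and are not part of gRPC or Kubernetes.

```python
import math
import threading
from contextlib import contextmanager


class ActiveStreamGauge:
    """Thread-safe count of streams currently open in this server process (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0

    @contextmanager
    def track(self):
        # Wrap the body of the streaming handler in this context manager.
        with self._lock:
            self._active += 1
        try:
            yield
        finally:
            # Runs when the stream ends, whether normally or with an error.
            with self._lock:
                self._active -= 1

    @property
    def active(self):
        with self._lock:
            return self._active


def replicas_needed(total_active_streams, max_workers=100, target_percent=70):
    """Pods needed so each pod uses at most target_percent of its worker threads.

    Assumes one worker thread per open stream; the defaults are illustrative.
    """
    streams_per_pod = max(1, max_workers * target_percent // 100)
    return max(1, math.ceil(total_active_streams / streams_per_pod))
```

In the streaming handler this would be used as `with gauge.track(): ...` around the loop that reads frames. The value of `gauge.active` could then be exported as a custom metric, if the autoscaler in use supports one, so that pods are added as streams accumulate rather than only when CPU or memory climbs.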