Re: [grpc-io] Re: grpc c++ performance - help required

2021-09-01 Thread 'Mark D. Roth' via grpc.io
I'm so sorry for not responding sooner!  For some reason, gmail tagged your
messages as spam, so I didn't see them. :(

On Fri, Aug 27, 2021 at 10:55 PM Sureshbabu Seshadri <
sureshbabu8...@gmail.com> wrote:

> Dear GRPC team,
> Can any one help on this?
>
> On Friday, August 13, 2021 at 12:53:21 PM UTC+5:30 Sureshbabu Seshadri
> wrote:
>
>> Mark,
>> Please find the grpc trace logs in the following link
>> https://drive.google.com/file/d/15y7KzyCtIeAoYSUzyPHpY4gcr7uUnIP0/view?usp=sharing
>>
>> I am not able to upload files directly here. Please note that the
>> profiling is for the same API called in a loop 1000 times; please take a
>> look and let me know.
>>
>> On Thursday, August 12, 2021 at 11:27:16 AM UTC+5:30 Sureshbabu Seshadri
>> wrote:
>>
>>> Thanks Mark, my current profile does not include channel creation time.
>>> Profiling covers only the RPC calls.
>>
>>
Note that when you first create a gRPC channel, it does not actually do any
name resolution or connect to any servers until you either explicitly tell
it to do so (such as by calling
channel->WaitForConnected(gpr_inf_future(GPR_CLOCK_MONOTONIC))) or send the
first RPC on it.  So if you
don't proactively tell the channel to connect but start counting the
elapsed time right before you send the first RPC, then you are actually
including the channel connection time in your benchmark.
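
For illustration, here's a minimal sketch of that pattern; the Echo service,
its Say method, and the target address are just placeholders, not taken from
your actual code:

#include <chrono>
#include <grpcpp/grpcpp.h>
#include "echo.grpc.pb.h"  // hypothetical generated header for an Echo service

void RunBenchmark() {
  auto channel = grpc::CreateChannel("server-host:50051",
                                     grpc::InsecureChannelCredentials());
  // Connect proactively so the timed section below excludes name resolution
  // and connection setup.
  channel->WaitForConnected(gpr_inf_future(GPR_CLOCK_MONOTONIC));
  std::unique_ptr<Echo::Stub> stub = Echo::NewStub(channel);

  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < 1000; ++i) {
    grpc::ClientContext ctx;
    EchoRequest request;
    EchoReply reply;
    grpc::Status status = stub->Say(&ctx, request, &reply);
  }
  auto elapsed = std::chrono::steady_clock::now() - start;
  // 'elapsed' now covers only the 1000 RPCs, not channel startup.
}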

From the trace log above, though, it seems clear that the problem you're
seeing here is not actually channel startup time.  The channel starts to
connect on this line:

I0812 21:30:17.76000  5748 resolving_lb_policy.cc:161]
resolving_lb=01EBA08F00D0: starting name resolution


And it finishes connecting here:

I0812 21:30:17.90300 44244 client_channel.cc:1362]
chand=01EBA08F35B0: update: state=READY picker=01EBA0900B70


So it took the channel only 0.143 seconds to get connected, which means
that's probably not the problem you're seeing here.

Once it did get connected, it looks like it took about 8 seconds to process
1000 RPCs, which does seem quite slow.

Can you share the code you're using for the client and server?



>>> We have an existing code base for IPC which uses a CORBA architecture, and
>>> we are trying to replace it with gRPC. A similar sample in CORBA completes
>>> quickly: 1000 RPCs finish within 2 seconds on the same network.
>>> Hence this is a roadblock for our migration.
>>>
>>> I will execute test with traces enabled and share the logs ASAP
>>>
>>> On Wednesday, August 11, 2021 at 10:38:48 PM UTC+5:30 Mark D. Roth wrote:
>>>
 You can check to see whether the problem is a channel startup problem
 or a latency problem by calling
 channel->WaitForConnected(gpr_inf_future(GPR_CLOCK_MONOTONIC)) before
 you start sending RPCs on the channel.  That call won't return until the
 channel has completed the DNS lookup and established a connection to the
 server, so if you start timing after that, your timing will exclude the
 channel startup time.

 If you see that the channel startup time is high but the RPCs flow
 quickly once the startup time is passed, then the problem is either the DNS
 lookup or establishing a connection to the server.  In that case, please
 try running with the environment variables GRPC_VERBOSITY=DEBUG
 GRPC_TRACE=client_channel_routing,pick_first and share the log, so
 that we can help you figure out which one is the problem.

 If you see that the channel startup time is not that high but that the
 RPCs are actually flowing more slowly over the network, then the problem
 might be network congestion of some sort.

 Also, if the problem does turn out to be channel startup time, note
 that it probably won't matter much in practice, as long as your application
 creates the channel once and reuses it for all of its RPCs.  We do not
 recommend a pattern where you create a channel, send a bunch of RPCs, then
 destroy the channel, and then do that whole thing again later when you need
 to send more RPCs.
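
A minimal sketch of that recommended shape, reusing the same hypothetical Echo
service as above (names are placeholders, not from the original application):
create the channel and stub once at startup and reuse them for every RPC,
rather than rebuilding them around each batch of calls.

#include <memory>
#include <string>
#include <grpcpp/grpcpp.h>
#include "echo.grpc.pb.h"  // hypothetical generated header

// Owns one long-lived channel and stub; all RPCs in the process go through it.
class EchoClient {
 public:
  explicit EchoClient(const std::string& target)
      : channel_(grpc::CreateChannel(target,
                                     grpc::InsecureChannelCredentials())),
        stub_(Echo::NewStub(channel_)) {}

  Echo::Stub* stub() { return stub_.get(); }

 private:
  std::shared_ptr<grpc::Channel> channel_;
  std::unique_ptr<Echo::Stub> stub_;
};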

 I hope this information is helpful.
 On Sunday, August 8, 2021 at 9:34:43 AM UTC-7 sureshb...@gmail.com
 wrote:

> *Environment*
>
>1. Both client and server are C++
>2. Server might be running either locally or in different system
>3. In case of remote server, it is in same network.
>4. Using SYNC C++ server
>5. Unary RPC
>
> Our performance numbers are very low: running 1000 RPC calls (continuous
> calls in a loop, for testing) takes about 10 seconds when the server is
> running on a different PC.
>
> The client creates the channel using the *hostname:portnumber* form, and
> with this approach a local server was also taking a similar 10 seconds for
> 1000 calls. Later we modified channel creation for the local server to use
> *localhost:port*, and performance improved greatly; all the
> 1000 calls completed withi

[grpc-io] Re: Using 0.0.0.0 as server listen address

2021-09-01 Thread 'Yuri Golobokov' via grpc.io
Hi,

Which language are you using for gRPC server?

On Friday, August 27, 2021 at 12:44:50 PM UTC-7 sumuk...@gmail.com wrote:

> After creating a server and adding a listen port with "0.0.0.0:", is
> it possible to figure out which interface gRPC picked to bind to and query
> that?
>
> This is similar to using port 0 and then querying the port number that was 
> dynamically assigned, except for the IP address.
>
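
For reference, the port-0 mechanism mentioned above looks roughly like this in
C++ (a sketch only, since the server language isn't stated in the thread; the
service pointer stands in for a real service implementation):

#include <memory>
#include <grpcpp/grpcpp.h>

// Build a server on an OS-chosen port and report which port was picked.
std::unique_ptr<grpc::Server> StartServer(grpc::Service* service,
                                          int* selected_port) {
  grpc::ServerBuilder builder;
  // Port 0 asks for a dynamically assigned port; the chosen value is written
  // back through the optional out-parameter of AddListeningPort().
  builder.AddListeningPort("0.0.0.0:0", grpc::InsecureServerCredentials(),
                           selected_port);
  builder.RegisterService(service);
  return builder.BuildAndStart();
}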



Re: [grpc-io] Re: Load balancer and resolver with Ruby

2021-09-01 Thread Emmanuel DELMAS
Hi Chen

> Given that client is doing client-side LB with round_robin, is setting
> max_connection_age on the server-side the right way to solve this problem?
> Will clients be able to refresh and reconnect automatically, or do we need
> to recreate the client (the underlying channel) periodically?
I set max_connection_age on the server side and it works well. Nothing else
is needed on the client side. When max_connection_age is reached, a GOAWAY
signal is sent to the client. Each time a client receives a GOAWAY signal,
it automatically re-resolves DNS and creates connections to any new services
as well as a replacement for the one that was closed.
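
For illustration, in C++ these are set as server channel arguments (a sketch
with hypothetical values; the same "grpc.max_connection_age*" arguments are
exposed as channel args by the core-based implementations such as Ruby and
Python):

#include <grpcpp/grpcpp.h>

void ConfigureConnectionAge(grpc::ServerBuilder& builder) {
  // Close each connection after ~5 minutes (value is illustrative), allowing
  // a grace period for in-flight RPCs; clients then reconnect and re-resolve.
  builder.AddChannelArgument(GRPC_ARG_MAX_CONNECTION_AGE_MS, 5 * 60 * 1000);
  builder.AddChannelArgument(GRPC_ARG_MAX_CONNECTION_AGE_GRACE_MS, 30 * 1000);
}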

> Also, the GOAWAY signal is random. Do client implementations need to
> handle this in particular?
What do you mean exactly? I'm not sure I'm able to answer this point.

Regards

*Emmanuel Delmas*
Backend Developer
CSE Member



*19 rue Blanche, 75009 Paris, France*


On Wed, Sep 1, 2021 at 01:43, Chen Song wrote:

> I want to follow up on this thread, as we have similar requirements (forcing
> clients to refresh server addresses from the DNS resolver as new pods are
> launched on K8s), but our client is in Python.
>
> Given that client is doing client-side LB with round_robin, is setting
> max_connection_age on the server-side the right way to solve this problem?
> Will clients be able to refresh and reconnect automatically, or do we need
> to recreate the client (the underlying channel) periodically?
> Also, the GOAWAY signal is random. Do client implementations need to handle
> this in particular?
>
> Chen
> On Wednesday, December 23, 2020 at 4:50:31 AM UTC-5 Emmanuel Delmas wrote:
>
>> > Just curious, how was it determined that the GOAWAY frame wasn't
>> received? Also what are your values of MAX_CONNECTION_AGE and
>> MAX_CONNECTION_AGE_GRACE ?
>>
>> MAX_CONNECTION_AGE and MAX_CONNECTION_AGE_GRACE were infinite, but this
>> week I changed MAX_CONNECTION_AGE to 5 minutes.
>>
>> I followed this documentation to display gRPC logs and see the GOAWAY
>> signal.
>> https://github.com/grpc/grpc/blob/v1.25.x/TROUBLESHOOTING.md
>> https://github.com/grpc/grpc/blob/master/doc/environment_variables.md
>> To reproduce the error, I set up a channel without round robin load
>> balancing (only one subchannel).
>> ExampleService::Stub.new("headless-test-grpc-master.test-grpc.svc.cluster.local:50051",
>> :this_channel_is_insecure, timeout: 5)
>> Then I repeatedly kill the server pod connected to my client. When I see
>> in the logs that the GOAWAY signal is received, a reconnection occurs without
>> any error in my requests. But when the reception of the GOAWAY signal is
>> not logged, no reconnection occurs and I receive a bunch of
>> DeadlineExceeded errors for several minutes.
>> The error still occurs even if I create a new channel. However, if I
>> recreate the channel adding "dns:" at the beginning of the host, it works.
>> ExampleService::Stub.new("dns:headless-test-grpc-master.test-grpc.svc.cluster.local:50051",
>> :this_channel_is_insecure, timeout: 5)
>> The opposite is also true. If I create the channel with "dns:" at the
>> beginning of the host, it can lead to the same failure and I will be able
>> to create a working channel removing the "dns:" at the beginning of the
>> host.
>>
>>
>> *Have you already heard of this kind of issue? Is there some cache in the
>> DNS resolver?*
>>
>> > A guess: one possible thing to look for is if IP packets to/from the
>> pod's address stopped forwarding, rendering the TCP connection to it a
>> "black hole". In that case, a grpc client will, by default, realize that a
>> connection is bad only after the TCP connection times out (typically ~15
>> minutes). You may set keepalive parameters to notice the brokenness of such
>> connections faster -- see references to keepalive in
>> https://github.com/grpc/proposal/blob/master/A9-server-side-conn-mgt.md
>> for more details.
>>
>> Yes. It is like requests go to a black hole. And as you said, it is
>> naturally fixed by itself after around 15 minutes. I will add client-side
>> keepalive to make it shorter. But even with 1 minute instead of 15, I need
>> to find another workaround in order to avoid degraded service for my
>> customers.
>>
>> Thank you.
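
For illustration, on a C++ client the keepalive settings discussed above are
channel arguments; a minimal sketch with hypothetical intervals and a
placeholder target, since the clients in this thread are Ruby and Python:

#include <memory>
#include <string>
#include <grpcpp/grpcpp.h>

std::shared_ptr<grpc::Channel> MakeChannelWithKeepalive(
    const std::string& target) {
  grpc::ChannelArguments args;
  // Probe the connection with an HTTP/2 keepalive ping after 60s without
  // activity, and consider it dead if no ack arrives within 20s.
  args.SetInt(GRPC_ARG_KEEPALIVE_TIME_MS, 60 * 1000);
  args.SetInt(GRPC_ARG_KEEPALIVE_TIMEOUT_MS, 20 * 1000);
  return grpc::CreateCustomChannel(
      target, grpc::InsecureChannelCredentials(), args);
}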
>>
>> On Tuesday, December 22, 2020 at 21:34:32 UTC+1, apo...@google.com wrote:
>>
>>> > It happens that sometimes, the GOAWAY signal isn't received by the
>>> client.
>>>
>>> Just curious, how was it determined that the GOAWAY frame wasn't
>>> received? Also what are your values of MAX_CONNECTION_AGE and
>>> MAX_CONNECTION_AGE_GRACE ?
>>>
>>> A guess: one possible thing to look for is if IP packets to/from the
>>> pod's address stopped forwarding, rendering the TCP connection to it a
>>> "black hole". In that case, a grpc client will, by default, realize that a
>>> connection is bad only after the TCP connection times out (typically ~15
>>> minutes). You may set keepalive parameters to notice the broken