Thanks, Jeff, for the suggestions. Disabling Nagle did not help. By the way, I guess this algorithm is independent of the application, so if our old software performs well and gRPC does not, some tuning may be required on the gRPC side. Correct me if I am wrong.

On Sunday, September 5, 2021 at 8:11:23 PM UTC+5:30 Jeff wrote:
> Also, I would be careful posting a wireshark pcap to a public forum. The pcap contains lots of information a hacker could potentially use to attack your servers.
>
> On Sun, Sep 5, 2021 at 12:32 AM Jeff Steger <be2...@gmail.com> wrote:
>
>> You either have network congestion or the Nagle algorithm is interfering. Try disabling Nagle and/or running traceroute.
>>
>> On Fri, Sep 3, 2021 at 1:47 PM Sureshbabu Seshadri <sureshb...@gmail.com> wrote:
>>
>>> Thanks Mark, parallelizing is not an option for my project. As I initially mentioned, this project is a conversion of CORBA IPC into gRPC, and CORBA is able to complete 1000 RPCs in 1 second vs 10 seconds in gRPC.
>>>
>>> I will send gRPC logs with TCP tracing enabled soon. If you prefer wireshark, I need to learn the tool usage and will send it later.
>>>
>>> On Friday, September 3, 2021 at 9:58:02 PM UTC+5:30 Mark D. Roth wrote:
>>>
>>>> I don't see anything obviously wrong with your code.
>>>>
>>>> Since this test is sending RPCs serially instead of in parallel, it's possible that there are too many network round-trips happening here, each one of which would increase latency because the next operation is blocking on the previous one. Can you try running with the environment variables GRPC_VERBOSITY=DEBUG GRPC_TRACE=tcp? Or, alternatively, getting a wireshark capture of the network communication? That might help us see how many round-trips are happening here.
>>>>
>>>> You might also consider whether sending a bunch of RPCs serially is actually a realistic benchmark for your production workload. You might get better performance by parallelizing the requests from the client.
>>>>
>>>> On Fri, Sep 3, 2021 at 2:29 AM Sureshbabu Seshadri <sureshb...@gmail.com> wrote:
>>>>
>>>>> Thanks Mark.
>>>>> The link below has the source code of my sample. Please let me know if you need any other information to analyze.
>>>>>
>>>>> https://drive.google.com/file/d/12PH65OYwflaPBpa2a-yMcBqSE3xYX9S-/view?usp=sharing
>>>>>
>>>>> On Thursday, September 2, 2021 at 3:45:04 AM UTC+5:30 Mark D. Roth wrote:
>>>>>
>>>>>> I'm so sorry for not responding sooner! For some reason, gmail tagged your messages as spam, so I didn't see them. :(
>>>>>>
>>>>>> On Fri, Aug 27, 2021 at 10:55 PM Sureshbabu Seshadri <sureshb...@gmail.com> wrote:
>>>>>>
>>>>>>> Dear gRPC team,
>>>>>>> Can anyone help with this?
>>>>>>>
>>>>>>> On Friday, August 13, 2021 at 12:53:21 PM UTC+5:30 Sureshbabu Seshadri wrote:
>>>>>>>
>>>>>>>> Mark,
>>>>>>>> Please find the gRPC trace logs at the following link:
>>>>>>>> https://drive.google.com/file/d/15y7KzyCtIeAoYSUzyPHpY4gcr7uUnIP0/view?usp=sharing
>>>>>>>>
>>>>>>>> I am not able to upload files directly here. Please note that the profiling is done for the same API called in a loop 1000 times. Let me know what you find.
>>>>>>>>
>>>>>>>> On Thursday, August 12, 2021 at 11:27:16 AM UTC+5:30 Sureshbabu Seshadri wrote:
>>>>>>>>
>>>>>>>>> Thanks Mark, my current profile does not include channel creation time. Profiling covers only the RPC calls.
>>>>>>
>>>>>> Note that when you first create a gRPC channel, it does not actually do any name resolution or connect to any servers until you either explicitly tell it to do so (such as by calling channel->WaitForConnected(gpr_inf_future(GPR_CLOCK_MONOTONIC))) or send the first RPC on it. So if you don't proactively tell the channel to connect but start counting the elapsed time right before you send the first RPC, then you are actually including the channel connection time in your benchmark.
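A minimal sketch of the measurement approach described above, assuming only the C++ standard library: the connection step runs first and is timed on its own, so the timing of the RPC loop excludes channel startup. `RunSerialBenchmark`, `BenchmarkResult`, `connect`, and `call` are hypothetical names and not part of gRPC. In a real client, `connect` would presumably wrap `channel->WaitForConnected(gpr_inf_future(GPR_CLOCK_MONOTONIC))` and `call` would issue one unary RPC through the generated stub.

```cpp
#include <chrono>
#include <functional>

// Timing for one benchmark run, with setup and the RPC loop kept apart.
struct BenchmarkResult {
  std::chrono::duration<double> connect_time;  // seconds spent in connect()
  std::chrono::duration<double> rpc_time;      // seconds spent in the call loop
  int completed_calls;                         // calls that reported success
};

// Runs `connect` once, then times `num_calls` serial invocations of `call`.
// Both callables are hypothetical placeholders for the real gRPC operations.
inline BenchmarkResult RunSerialBenchmark(const std::function<void()>& connect,
                                          const std::function<bool()>& call,
                                          int num_calls) {
  using Clock = std::chrono::steady_clock;

  const auto before_connect = Clock::now();
  connect();  // blocks until the channel is ready in the real client
  const auto after_connect = Clock::now();

  int ok = 0;
  for (int i = 0; i < num_calls; ++i) {
    if (call()) ++ok;  // each call waits for the previous one to finish
  }
  const auto after_calls = Clock::now();

  return {after_connect - before_connect, after_calls - after_connect, ok};
}
```

If `connect_time` is large and `rpc_time` is small, the cost is in name resolution or connection setup; if it is the other way round, the per-RPC latency is the thing to investigate.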
>>>>>>
>>>>>> From the trace log above, though, it seems clear that the problem you're seeing here is not actually channel startup time. The channel starts to connect on this line:
>>>>>>
>>>>>> I0812 21:30:17.760000000 5748 resolving_lb_policy.cc:161] resolving_lb=000001EBA08F00D0: starting name resolution
>>>>>>
>>>>>> And it finishes connecting here:
>>>>>>
>>>>>> I0812 21:30:17.903000000 44244 client_channel.cc:1362] chand=000001EBA08F35B0: update: state=READY picker=000001EBA0900B70
>>>>>>
>>>>>> So it took the channel only 0.143 seconds to get connected, which means that's probably not the problem you're seeing here.
>>>>>>
>>>>>> Once it did get connected, it looks like it took about 8 seconds to process 1000 RPCs, which does seem quite slow.
>>>>>>
>>>>>> Can you share the code you're using for the client and server?
>>>>>>
>>>>>>>>> We have an existing code base for IPC which uses the CORBA architecture, and we are trying to replace it with gRPC. A similar sample in CORBA completes quickly: 1000 RPCs finish within 2 seconds on the same network. Hence this is a roadblock for our migration.
>>>>>>>>>
>>>>>>>>> I will run the test with traces enabled and share the logs ASAP.
>>>>>>>>>
>>>>>>>>> On Wednesday, August 11, 2021 at 10:38:48 PM UTC+5:30 Mark D. Roth wrote:
>>>>>>>>>
>>>>>>>>>> You can check to see whether the problem is a channel startup problem or a latency problem by calling channel->WaitForConnected(gpr_inf_future(GPR_CLOCK_MONOTONIC)) before you start sending RPCs on the channel. That call won't return until the channel has completed the DNS lookup and established a connection to the server, so if you start timing after that, your timing will exclude the channel startup time.
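The arithmetic applied to the trace log earlier in this thread can be checked with a small helper. This is only a sketch using the C++ standard library; `SecondsSinceMidnight` and `MillisPerCall` are hypothetical names, and the parser assumes the `HH:MM:SS.fraction` time-of-day field seen in the quoted log lines.

```cpp
#include <cstdio>
#include <string>

// Converts the time-of-day field of a trace line ("HH:MM:SS.fraction")
// into seconds since midnight. Returns -1.0 if the text does not parse.
inline double SecondsSinceMidnight(const std::string& hms) {
  int hours = 0;
  int minutes = 0;
  double seconds = 0.0;
  if (std::sscanf(hms.c_str(), "%d:%d:%lf", &hours, &minutes, &seconds) != 3) {
    return -1.0;
  }
  return hours * 3600.0 + minutes * 60.0 + seconds;
}

// Average latency per call, in milliseconds, for `calls` serial RPCs
// that took `total_seconds` in total.
inline double MillisPerCall(double total_seconds, int calls) {
  return total_seconds * 1000.0 / calls;
}
```

Applied to the two quoted timestamps, the difference is about 0.143 seconds of connection time. For the RPC loop, 8 seconds over 1000 serial calls is 8 ms per call, and 10 seconds is 10 ms per call.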
>>>>>>>>>>
>>>>>>>>>> If you see that the channel startup time is high but the RPCs flow quickly once the startup time is passed, then the problem is either the DNS lookup or establishing a connection to the server. In that case, please try running with the environment variables GRPC_VERBOSITY=DEBUG GRPC_TRACE=client_channel_routing,pick_first and share the log, so that we can help you figure out which one is the problem.
>>>>>>>>>>
>>>>>>>>>> If you see that the channel startup time is not that high but that the RPCs are actually flowing more slowly over the network, then the problem might be network congestion of some sort.
>>>>>>>>>>
>>>>>>>>>> Also, if the problem does turn out to be channel startup time, note that it probably won't matter much in practice, as long as your application creates the channel once and reuses it for all of its RPCs. We do not recommend a pattern where you create a channel, send a bunch of RPCs, then destroy the channel, and then do that whole thing again later when you need to send more RPCs.
>>>>>>>>>>
>>>>>>>>>> I hope this information is helpful.
>>>>>>>>>>
>>>>>>>>>> On Sunday, August 8, 2021 at 9:34:43 AM UTC-7 sureshb...@gmail.com wrote:
>>>>>>>>>>
>>>>>>>>>>> *Environment*
>>>>>>>>>>>
>>>>>>>>>>> 1. Both client and server are C++
>>>>>>>>>>> 2. Server might be running either locally or on a different system
>>>>>>>>>>> 3. In case of a remote server, it is on the same network
>>>>>>>>>>> 4. Using SYNC C++ server
>>>>>>>>>>> 5. Unary RPC
>>>>>>>>>>>
>>>>>>>>>>> Our performance numbers are very low: running 1000 RPC calls (continuous calls in a loop for testing) takes about 10 seconds when the server is running on a different PC.
>>>>>>>>>>>
>>>>>>>>>>> The client creates the channel using the *hostname:portnumber* method, and with this approach the local server was also taking a similar 10 seconds for 1000 calls. Later we modified channel creation for the local server to use *localhost:port*, and performance improved considerably: all 1000 calls completed within 300 ms.
>>>>>>>>>>>
>>>>>>>>>>> Based on the above test, we strongly believe DNS resolution is causing the slow performance, since changing the hostname to localhost results in a huge performance gain. However, that is not possible for servers running on a different PC.
>>>>>>>>>>>
>>>>>>>>>>> Can someone help with this? Is DNS the real culprit, or what else can be changed to get good throughput in this case?
>>>>>>>>>>>
>>>>>>>>>>> Please let me know if any other input is required for this.
>>>>>>>>>>>
>>>>>>>>>> --
>>>>>>> You received this message because you are subscribed to a topic in the Google Groups "grpc.io" group.
>>>>>>> To unsubscribe from this topic, visit https://groups.google.com/d/topic/grpc-io/pD3HiDvxymY/unsubscribe.
>>>>>>> To unsubscribe from this group and all its topics, send an email to grpc-io+u...@googlegroups.com.
>>>>>>> To view this discussion on the web visit https://groups.google.com/d/msgid/grpc-io/66ff211a-e86f-47ff-9583-b423877b8f02n%40googlegroups.com
>>>>>>> <https://groups.google.com/d/msgid/grpc-io/66ff211a-e86f-47ff-9583-b423877b8f02n%40googlegroups.com?utm_medium=email&utm_source=footer>.
>>>>>>
>>>>>> --
>>>>>> Mark D. Roth <ro...@google.com>
>>>>>> Software Engineer
>>>>>> Google, Inc.
>>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to a topic in the Google Groups "grpc.io" group.