[ https://issues.apache.org/jira/browse/IMPALA-10139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189718#comment-17189718 ]
Sahil Takiar commented on IMPALA-10139:
---------------------------------------

The "network" time (calculated as {{int64_t network_time_ns = total_time_ns - resp_.receiver_latency_ns()}}) might be a more useful value to compare against the threshold.

> Slow RPC logs can be misleading
> -------------------------------
>
>                 Key: IMPALA-10139
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10139
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Sahil Takiar
>            Priority: Major
>
> The slow RPC logs added in IMPALA-9128 are based on the total time taken to
> successfully complete an RPC. The issue is that there are many reasons why an
> RPC might take a long time to complete. An RPC is considered complete only
> when the receiver has processed that RPC.
> The problem is that, due to the client-driven back-pressure mechanism, it is
> entirely possible that the receiver does not process an RPC
> because {{KrpcDataStreamRecvr::SenderQueue::GetBatch}} just hasn't been
> called yet (it is called indirectly by {{ExchangeNode::GetNext}}).
> This can lead to a flood of slow RPC logs, even though the RPCs might not
> actually be slow themselves. What is worse is that, because of the
> back-pressure mechanism, slowness from the client (e.g. Hue users) will
> propagate across all nodes involved in the query.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
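
The calculation suggested in the comment can be sketched as below. This is only an illustration under stated assumptions, not Impala's actual logging code: {{FakeResponse}}, {{NetworkTimeNs}}, {{IsSlowRpc}} and the 2-second {{kSlowRpcThresholdNs}} are hypothetical names and values chosen for the example. Only the subtraction {{total_time_ns - receiver_latency_ns()}} comes from the comment itself.

```cpp
#include <cstdint>

// Hypothetical stand-in for the RPC response object. Only the accessor name
// receiver_latency_ns() mirrors the expression quoted in the comment.
struct FakeResponse {
  int64_t receiver_latency_ns_value;
  int64_t receiver_latency_ns() const { return receiver_latency_ns_value; }
};

// Assumed threshold for this sketch (2 seconds); not a value from the issue.
constexpr int64_t kSlowRpcThresholdNs = 2000000000LL;

// "Network" time as defined in the comment: total RPC time minus the
// latency reported by the receiver.
int64_t NetworkTimeNs(int64_t total_time_ns, const FakeResponse& resp) {
  return total_time_ns - resp.receiver_latency_ns();
}

// Flags an RPC as slow based on network time rather than total time, so an
// RPC that mostly waited on the receiver side is not reported.
bool IsSlowRpc(int64_t total_time_ns, const FakeResponse& resp) {
  return NetworkTimeNs(total_time_ns, resp) > kSlowRpcThresholdNs;
}
```

With these assumed numbers, an RPC that took 5 s in total but reports 4.5 s of receiver latency has 0.5 s of network time and would not be logged, whereas one with 1 s of receiver latency has 4 s of network time and would be.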