[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
[ https://issues.apache.org/jira/browse/HADOOP-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HADOOP-12325:
-------------------------------
Resolution: Fixed
Fix Version/s: 2.7.4
Status: Resolved (was: Patch Available)

I verified test failures and pushed to branch-2.7.

> RPC Metrics : Add the ability track and log slow RPCs
> -----------------------------------------------------
>
> Key: HADOOP-12325
> URL: https://issues.apache.org/jira/browse/HADOOP-12325
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc, metrics
> Affects Versions: 2.7.1
> Reporter: Anu Engineer
> Assignee: Anu Engineer
> Fix For: 2.8.0, 2.7.4, 3.0.0-alpha1
>
> Attachments: Callers of WritableRpcEngine.call.png,
> HADOOP-12325-branch-2.7.00.patch, HADOOP-12325.001.patch,
> HADOOP-12325.002.patch, HADOOP-12325.003.patch, HADOOP-12325.004.patch,
> HADOOP-12325.005.patch, HADOOP-12325.005.test.patch, HADOOP-12325.006.patch
>
>
> This JIRA proposes to add a counter called RpcSlowCalls and also a
> configuration setting that allows users to log really slow RPCs. Slow RPCs
> are RPCs that fall at the 99th percentile. This is useful to troubleshoot why
> certain services, like the name node, freeze under heavy load.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
[ https://issues.apache.org/jira/browse/HADOOP-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HADOOP-12325:
-------------------------------
Status: Patch Available (was: Reopened)
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
[ https://issues.apache.org/jira/browse/HADOOP-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HADOOP-12325:
-------------------------------
Attachment: HADOOP-12325-branch-2.7.00.patch
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
[ https://issues.apache.org/jira/browse/HADOOP-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaoyu Yao updated HADOOP-12325:
--------------------------------
Resolution: Fixed
Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
Status: Resolved (was: Patch Available)

Thanks [~anu] for the contribution and [~ajisakaa] for the review. I've committed the change to trunk and branch-2.
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
[ https://issues.apache.org/jira/browse/HADOOP-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anu Engineer updated HADOOP-12325:
----------------------------------
Attachment: HADOOP-12325.006.patch

[~ajisakaa] Thanks for your review and changes to the test file. Please see my comments below.

bq. 1. Would you add a whitespace before "took " in the log message?
Fixed.

bq. 2. After running the regression test locally, I can't see any logs about sleep RPC.
On my machine, if I open the file org.apache.hadoop.ipc.TestProtoBufRpc-output.txt in the surefire-reports directory, I am able to see the following line.
{code}
2015-08-24 10:52:16,713 WARN ipc.Server (Server.java:logSlowRpcCalls(438)) - Slow RPC : sleep took 3004 milliseconds to process from client 10.0.1.35:57223
{code}

bq. Attaching a patch to verify that the slow call is logged. Now the test fails.
With the new call
{code}
long after = getLongCounter("RpcSlowCalls", rpcMetrics);
{code}
somehow the mocking layer is still returning the old snapshotted value. I have modified the tests to call the server layer directly, and the tests are now behaving as expected.
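The stale counter described in this message, where a second read of RpcSlowCalls still returns the earlier value, is the kind of symptom one would expect when a test reads from a captured copy of the metrics rather than from the live source. The thread does not state the actual cause, so the following is only a minimal sketch of that distinction. `LiveMetrics` and its methods are hypothetical names for illustration; they are not Hadoop's metrics classes or its test helpers.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative only: contrasts a captured snapshot with a live counter read. */
public class SnapshotVersusLive {

    /** Hypothetical stand-in for a live metrics source. */
    static final class LiveMetrics {
        private final AtomicLong rpcSlowCalls = new AtomicLong();

        void incrSlowRpc() {
            rpcSlowCalls.incrementAndGet();
        }

        long getRpcSlowCalls() {
            return rpcSlowCalls.get();
        }

        /** Copies the current value; the returned map never changes afterwards. */
        Map<String, Long> snapshot() {
            Map<String, Long> copy = new HashMap<>();
            copy.put("RpcSlowCalls", rpcSlowCalls.get());
            return copy;
        }
    }

    public static void main(String[] args) {
        LiveMetrics metrics = new LiveMetrics();

        // Capture the counter once, before the slow call happens.
        Map<String, Long> captured = metrics.snapshot();
        long before = captured.get("RpcSlowCalls");

        // Simulate the server recording one slow call.
        metrics.incrSlowRpc();

        // Re-reading the captured copy still yields the old value...
        long staleAfter = captured.get("RpcSlowCalls");
        // ...while asking the live source directly sees the increment.
        long liveAfter = metrics.getRpcSlowCalls();

        System.out.println("before=" + before
            + " staleAfter=" + staleAfter
            + " liveAfter=" + liveAfter);
    }
}
```

Reading the counter from the live object after the slow call, as the revised tests are said to do by calling the server layer directly, avoids comparing two reads of the same frozen copy.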
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
[ https://issues.apache.org/jira/browse/HADOOP-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira AJISAKA updated HADOOP-12325:
-----------------------------------
Attachment: HADOOP-12325.005.test.patch

Attaching a patch to verify that the slow call is logged. Now the test fails.
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
[ https://issues.apache.org/jira/browse/HADOOP-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HADOOP-12325:
------------------------------------------
Status: Patch Available (was: Open)
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
[ https://issues.apache.org/jira/browse/HADOOP-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HADOOP-12325:
------------------------------------------
Status: Open (was: Patch Available)
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
[ https://issues.apache.org/jira/browse/HADOOP-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anu Engineer updated HADOOP-12325:
----------------------------------
Attachment: HADOOP-12325.005.patch

Support slow RPC logging for WritableRpcEngine as well.
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
[ https://issues.apache.org/jira/browse/HADOOP-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anu Engineer updated HADOOP-12325:
----------------------------------
Attachment: HADOOP-12325.004.patch

Fix Javadoc.
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
[ https://issues.apache.org/jira/browse/HADOOP-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaoyu Yao updated HADOOP-12325:
--------------------------------
Attachment: Callers of WritableRpcEngine.call.png
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
[ https://issues.apache.org/jira/browse/HADOOP-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anu Engineer updated HADOOP-12325:
----------------------------------
Attachment: HADOOP-12325.003.patch

Fixed the CheckStyle issues. Two issues still remain; please ignore them.
* {{Server.java}} - File is too long
* Variable 'logSlowRPC' must be private - it is a Hadoop metric and follows the general pattern in the file.
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
[ https://issues.apache.org/jira/browse/HADOOP-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anu Engineer updated HADOOP-12325:
----------------------------------
Attachment: HADOOP-12325.002.patch

Thanks for the detailed review, [~xyao]. I have attached a revised patch. Please see below for my detailed comments.

bq. 1. Do you miss updating all the caller of ProtobufRpcEngine.call() to pass receiveTime using Time.monotonicNow() instead of the Time.now()?
Fixed. I have reverted to using Time.now() in this patch.

bq. 2. Do we need update WritableRpcEngine.java class with logSlowRpcCalls()?
I could not find any place where we were using WritableRpcEngine for real, hence I did not make that change.

bq. 3. NIT: Can you put the magic number 1024 as final variable like
Fixed.

bq. 4. Can you change the following from
Fixed.

bq. 5. NIT: Rpc -> RPC to be consistent
Fixed.

bq. Can you make the SleepRequestProto accepting a duration parameter instead of the fixed SLEEP_DURATION (1000ms)?
Done.

bq. 7. Is it possible to test with 1K fast calls instead of 10K calls to save test resources without affecting the results?
I had benchmarked these calls, and even with 10K calls the run time is in milliseconds. The reason I was making 10K calls is to make sure that the test exercises the computation properly and with statistical significance.
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
[ https://issues.apache.org/jira/browse/HADOOP-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anu Engineer updated HADOOP-12325:
----------------------------------
Status: Patch Available (was: Open)
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
[ https://issues.apache.org/jira/browse/HADOOP-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anu Engineer updated HADOOP-12325:
----------------------------------
Attachment: HADOOP-12325.001.patch

This patch adds:
* A metric called RpcSlowCalls
* The ability to log slow calls if ipc.server.log.slow.rpc is set to true
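The two additions in this first patch, a slow-call counter and flag-gated logging, can be illustrated with a small self-contained sketch. It is not the code from any of the attached patches. Several things here are assumptions made for illustration: the class and method names are hypothetical, a cutoff of "mean plus three standard deviations" stands in for the "99th percentile" wording of the issue description, the 1024-sample floor only borrows the number mentioned in review, and the counter is incremented even when logging is disabled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of slow-call tracking; not the HADOOP-12325 patch itself. */
public class SlowCallTracker {
    // Assumed: no call is flagged until more than this many samples exist.
    private static final long MIN_SAMPLE_SIZE = 1024;
    // Assumed cutoff: mean + 3 standard deviations of processing time.
    private static final double DEVIATIONS = 3.0;

    private final boolean logSlowRpc;                // stands in for ipc.server.log.slow.rpc
    private final List<String> logged = new ArrayList<>();
    private long count;                              // number of samples seen so far
    private double mean;                             // running mean (Welford's method)
    private double m2;                               // running sum of squared deviations
    private long slowCalls;                          // stands in for the RpcSlowCalls counter

    public SlowCallTracker(boolean logSlowRpc) {
        this.logSlowRpc = logSlowRpc;
    }

    /** Records one call and returns true if it was classified as slow. */
    public synchronized boolean record(String method, long millis) {
        boolean slow = false;
        if (count > MIN_SAMPLE_SIZE) {
            double stdDev = Math.sqrt(m2 / count);
            slow = millis > mean + DEVIATIONS * stdDev;
        }
        if (slow) {
            slowCalls++;
            if (logSlowRpc) {
                // Kept in memory here so the sketch has no logging dependency.
                logged.add("Slow RPC : " + method + " took " + millis
                    + " milliseconds to process");
            }
        }
        // Fold the new sample into the running statistics.
        count++;
        double delta = millis - mean;
        mean += delta / count;
        m2 += delta * (millis - mean);
        return slow;
    }

    public synchronized long slowCalls() {
        return slowCalls;
    }

    public synchronized List<String> loggedLines() {
        return new ArrayList<>(logged);
    }

    public static void main(String[] args) {
        SlowCallTracker tracker = new SlowCallTracker(true);
        // Many fast calls build up the baseline statistics.
        for (int i = 0; i < 2000; i++) {
            tracker.record("ping", 1 + (i % 3));
        }
        // One call far above the baseline.
        tracker.record("sleep", 3000);
        System.out.println("slow calls: " + tracker.slowCalls());
        System.out.println(tracker.loggedLines());
    }
}
```

Running statistics keep the per-call cost constant, which matters on a hot RPC path; a true percentile estimate would need a histogram or a sample reservoir instead.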