[
https://issues.apache.org/jira/browse/HADOOP-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656334#action_12656334
]
rangadi edited comment on HADOOP-4797 at 12/13/08 1:54 PM:
----------------------------------------------------------------
(Edit : minor correction in shell command for running the test)
About CPU: it looks like there is an order-of-magnitude improvement in the CPU
taken by the write operation for a 10MB RPC response with the patch. The attached
micro benchmark TestRpcCpu.patch runs a simple server with a single call
'{{byte[] getBuffer()}}' that just returns a static byte array.
CPU measured via /proc/<pid>/stat for the server, over 10 iterations of the
client fetching a 10MB buffer:
* Without the patch : 38500 (40 seconds per iteration)
* With the patch : 12500 (20 seconds per iteration, more on this later)
* The RPC server takes about one third the CPU.
* Note that both runs have a large fixed CPU cost outside the final write()
operation, so the real CPU benefit around write() alone is much higher.
* Given that the work done by the RPC server and client is the same (10 seconds each):
** Without the patch : write() adds 20-30 seconds
** With the patch : negligible.
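For reference, the CPU numbers above can be sampled like this. This is a hypothetical helper, not part of the patch or the benchmark: utime and stime are fields 14 and 15 of /proc/<pid>/stat, in clock ticks, and the command name (field 2) may contain spaces, so we split after the closing paren.

```java
// Hypothetical helper, not part of TestRpcCpu.patch: parses cumulative
// CPU ticks (utime + stime) from a /proc/<pid>/stat line.
public class ProcStat {
    static long cpuTicks(String statLine) {
        // Field 2 (comm) can contain spaces; everything after the last ')'
        // is whitespace-separated, starting at field 3 (state).
        String rest = statLine.substring(statLine.lastIndexOf(')') + 1).trim();
        String[] f = rest.split("\\s+");
        // utime is field 14 -> f[11]; stime is field 15 -> f[12]
        return Long.parseLong(f[11]) + Long.parseLong(f[12]);
    }
}
```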
It is a bit shocking that it takes 20 seconds to fetch 10MB even with the
patch. I don't think it can be explained by the extra copies (mentioned in
HADOOP-4813); it is mostly a problem in ObjectWritable. I will try out a fix
for that. This patch will show a much better CPU improvement once we improve
ObjectWritable for arrays.
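To illustrate the suspected ObjectWritable overhead for arrays (this is a sketch of the general pattern, not Hadoop's actual serialization code): serializing a byte[] one element at a time pays one method call per byte, whereas a single bulk write is essentially one memcpy.

```java
// Illustrative only: contrasts per-element serialization of a byte[]
// with a single bulk write. Class and method names are made up for
// this example; this is not the ObjectWritable implementation.
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class ArrayWriteDemo {
    // One writeByte() call per element: the slow pattern.
    static byte[] perElement(byte[] buf) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(buf.length);     // length prefix
        for (byte b : buf) {
            out.writeByte(b);         // per-element call overhead
        }
        out.flush();
        return bos.toByteArray();
    }

    // Single bulk write: same bytes on the wire, one copy.
    static byte[] bulk(byte[] buf) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(buf.length);
        out.write(buf);               // bulk copy
        out.flush();
        return bos.toByteArray();
    }
}
```

Both produce identical output; only the CPU cost differs, and for a 10MB response the per-element path does roughly ten million extra calls.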
How to run the test:
{noformat}
$ ant package
# server:
$ bin/hadoop jar build/hadoop-0.20.0-dev-test.jar testrpccpu server
# client: if you run the client on a different machine, set
# "rpc.test.cpu.server.hostname" in conf/hadoop-site.xml
$ bin/hadoop jar build/hadoop-0.20.0-dev-test.jar testrpccpu
{noformat}
> RPC Server can leave a lot of direct buffers
> ---------------------------------------------
>
> Key: HADOOP-4797
> URL: https://issues.apache.org/jira/browse/HADOOP-4797
> Project: Hadoop Core
> Issue Type: Bug
> Components: ipc
> Affects Versions: 0.17.0
> Reporter: Raghu Angadi
> Assignee: Raghu Angadi
> Priority: Blocker
> Fix For: 0.18.3, 0.19.1, 0.20.0
>
> Attachments: HADOOP-4797-branch-18.patch,
> HADOOP-4797-branch-18.patch, HADOOP-4797-branch-18.patch, HADOOP-4797.patch,
> HADOOP-4797.patch, TestRpcCpu.patch
>
>
> The RPC server can unwittingly soft-leak direct buffers. One observed case is
> that one of the namenodes at Yahoo took 40GB of virtual memory though it was
> configured for 24GB. Most of the memory outside the Java heap is expected to
> be direct buffers. This was shown to be caused by how the RPC server reads and
> writes serialized data. The cause and proposed fix are in the following comment.
>