[ https://issues.apache.org/jira/browse/HDFS-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Charles Lamb updated HDFS-7847: ------------------------------- Attachment: HDFS-7847.001.patch @cmccabe, @stack, thanks for the review! bq. DFSClient.java: this change adds three new fields to DFSClient. But they only seem to be used by unit tests. It seems like we should just put these inside the unit test(s) that are using these-- if necessary, by adding a helper method. There's no reason to add more fields to DFSClient. Also remember that when using FileContext, we create new DFSClients all the time. Good point. I've left the existing {code}ClientProtocol namenode{code} field alone. The other 3 proxies are created on-demand by their getters. That means no change in DFSClient instance size. bq. It seems kind of odd to have NameNodeProxies#createProxy create a proxy to the datanode. It's actually a proxy to the NN for the DatanodeProtocol. That's the same protocol that the DN uses to speak with the NN when it's sending (among other things) block reports. bq. In general, when you see "NameNodeProxies" I think "proxies used by the NameNode" and this doesn't fit with that. These are actually proxies used to talk to the NN, not proxies used by the NN. I didn't make the name. bq. Can you give a little more context about why this is a good idea (as opposed to just having some custom code in the unit test or in a unit test util class that creates a proxy) While the name DatanodeProtocol makes us think of an RPC protocol to the datanode, it is in fact yet another one of the many protocols to the namenode which is embodied in the NamenodeProtocols (plural) omnibus interface. The problem this is addressing is that when we are talking to an in-process NN in the NNThroughputBenchmark, then it's easy to get our hands on a NamenodeProtocols instance -- you simply call NameNode.getRpcServer(). However, the idea of this patch is to let you run the benchmark against a non-in-process NN, so there's no NameNode instance to use. That means we have to create RPC proxy objects for each of the NN protocols that we need to use. It would be nice if we could create a single proxy for the omnibus NamenodeProtocols interface, but we can't. Instead, we have to pick and choose the different namenode protocols that we want to use -- ClientProtocol, NamenodeProtocol, RefreshUserMappingProtocol, and DatanodeProtocol -- and create proxies for them. Code to create proxies for the first three of these already existed in NameNodeProxies.java, but we have to add a few new lines to create the DatanodeProtocol proxy. @stack I looked into your (offline) suggestion to try calling through the TinyDatanode, but it's just doing the same thing that my patch does -- it uses the same ClientProtocol instance that the rest of the test uses. TinyDataNode is really just a skeleton and doesn't really borrow much code from the real DN. bq. Of course the NameNode may or may not be remote here. It seems like --nnuri or just --namenode or something like that would be more descriptive. Yeah, I agree. I changed it to -namenode. bq. Instead of this boilerplate, just use StringUtils#popOptionWithArgument. Changed. I was just trying to match the existing code, but the using StringUtils is for the better. {code} - replication, BLOCK_SIZE, null); + replication, BLOCK_SIZE, CryptoProtocolVersion.supported()); {code} bq. This fix is a little bit separate, right? I suppose we can do it in this JIRA, though. Without this, the relevant PBHelper.convert code throws NPE on the supportVersions arg. > Modify NNThroughputBenchmark to be able to operate on a remote NameNode > ----------------------------------------------------------------------- > > Key: HDFS-7847 > URL: https://issues.apache.org/jira/browse/HDFS-7847 > Project: Hadoop HDFS > Issue Type: Sub-task > Affects Versions: HDFS-7836 > Reporter: Colin Patrick McCabe > Assignee: Charles Lamb > Attachments: HDFS-7847.000.patch, HDFS-7847.001.patch, > make_blocks.tar.gz > > > Modify NNThroughputBenchmark to be able to operate on a NN that is not in > process. A followon Jira will modify it some more to allow quantifying native > and java heap sizes, and some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)