[ 
https://issues.apache.org/jira/browse/HDFS-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-7847:
-------------------------------
    Attachment: HDFS-7847.001.patch

@cmccabe, @stack, thanks for the review!

bq. DFSClient.java: this change adds three new fields to DFSClient. But they 
only seem to be used by unit tests. It seems like we should just put these 
inside the unit test(s) that are using these-- if necessary, by adding a helper 
method. There's no reason to add more fields to DFSClient. Also remember that 
when using FileContext, we create new DFSClients all the time.

Good point. I've left the existing {code}ClientProtocol namenode{code} field 
alone. The other 3 proxies are created on-demand by their getters. That means 
no change in DFSClient instance size.

bq. It seems kind of odd to have NameNodeProxies#createProxy create a proxy to 
the datanode.

It's actually a proxy to the NN for the DatanodeProtocol. That's the same 
protocol that the DN uses to speak with the NN when it's sending (among other 
things) block reports.

bq. In general, when you see "NameNodeProxies" I think "proxies used by the 
NameNode" and this doesn't fit with that.

These are actually proxies used to talk to the NN, not proxies used by the NN. 
I didn't make the name.

bq. Can you give a little more context about why this is a good idea (as 
opposed to just having some custom code in the unit test or in a unit test util 
class that creates a proxy)

While the name DatanodeProtocol makes us think of an RPC protocol to the 
datanode, it is in fact yet another one of the many protocols to the namenode 
which is embodied in the NamenodeProtocols (plural) omnibus interface. The 
problem this is addressing is that when we are talking to an in-process NN in 
the NNThroughputBenchmark, then it's easy to get our hands on a 
NamenodeProtocols instance -- you simply call NameNode.getRpcServer(). However, 
the idea of this patch is to let you run the benchmark against a non-in-process 
NN, so there's no NameNode instance to use. That means we have to create RPC 
proxy objects for each of the NN protocols that we need to use.

It would be nice if we could create a single proxy for the omnibus 
NamenodeProtocols interface, but we can't. Instead, we have to pick and choose 
the different namenode protocols that we want to use -- ClientProtocol, 
NamenodeProtocol, RefreshUserMappingProtocol, and DatanodeProtocol -- and 
create proxies for them. Code to create proxies for the first three of these 
already existed in NameNodeProxies.java, but we have to add a few new lines to 
create the DatanodeProtocol proxy.

@stack I looked into your (offline) suggestion to try calling through the 
TinyDatanode, but it's just doing the same thing that my patch does -- it uses 
the same ClientProtocol instance that the rest of the test uses. TinyDataNode 
is really just a skeleton and doesn't really borrow much code from the real DN.

bq. Of course the NameNode may or may not be remote here. It seems like --nnuri 
or just --namenode or something like that would be more descriptive.

Yeah, I agree. I changed it to -namenode.

bq. Instead of this boilerplate, just use StringUtils#popOptionWithArgument.

Changed. I was just trying to match the existing code, but the using 
StringUtils is for the better. 

{code}
-          replication, BLOCK_SIZE, null);
+          replication, BLOCK_SIZE, CryptoProtocolVersion.supported());
{code}

bq. This fix is a little bit separate, right? I suppose we can do it in this 
JIRA, though.

Without this, the relevant PBHelper.convert code throws NPE on the 
supportVersions arg.


> Modify NNThroughputBenchmark to be able to operate on a remote NameNode
> -----------------------------------------------------------------------
>
>                 Key: HDFS-7847
>                 URL: https://issues.apache.org/jira/browse/HDFS-7847
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7836
>            Reporter: Colin Patrick McCabe
>            Assignee: Charles Lamb
>         Attachments: HDFS-7847.000.patch, HDFS-7847.001.patch, 
> make_blocks.tar.gz
>
>
> Modify NNThroughputBenchmark to be able to operate on a NN that is not in 
> process. A followon Jira will modify it some more to allow quantifying native 
> and java heap sizes, and some latency numbers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to