[ 
https://issues.apache.org/jira/browse/HDFS-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16884283#comment-16884283
 ] 

Ayush Saxena edited comment on HDFS-13248 at 7/13/19 5:32 AM:
--------------------------------------------------------------

Hey Brahma!
Honestly, I don't have those stats for an MR job. I guess you are referring to 
the case where the client resides on the same node as the Router.
In that case, yes, correct: technically this won't be a problem, as the Router 
and the client node share the same address. 
This issue is meant to tackle the case where that isn't true and the *client 
resides on a node that has a DN but not a Router*. In that scenario the BPP 
(block placement policy) is satisfied with respect to the Router, not the 
client, so these locality problems occur, and the DNs are sorted with respect 
to the Router, not the client. I would guess there are far more client nodes 
than Routers. Let me know if you need any stats for analysis and we will try to 
gather them.
We had a couple of solutions, all stuck as of now:
* Add proxy address in IPC connection (HADOOP-16254) --> This had some 
security concerns.
* The RouterRPCServer should transfer CallerContext and client IP to 
NamenodeRpcServer (HDFS-13293) --> This tends to be a little opaque, and it has 
a couple more of the problems stated above.
* Favored Nodes --> I guess this is the last patch here: pass the local node as 
a favored node. But this isn't a complete solution. It doesn't take into 
account the fallback when local nodes are unavailable, plus a couple of other 
cases.
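
To make the favored-nodes gap concrete, here is a minimal sketch of one way to 
read it. It is a toy model, not Hadoop code: the class, the method 
`chooseTargets`, and the host names are all made up for illustration, and the 
fallback rule (prefer the DN local to the RPC caller) is an assumption rather 
than the real placement policy.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model only, not Hadoop code. Every name here is hypothetical. */
final class FavoredNodeSketch {

    /**
     * Picks up to `replication` targets: live favored nodes first, then the
     * DN local to the RPC caller, then the remaining live DNs in list order.
     */
    static List<String> chooseTargets(List<String> liveDataNodes,
                                      String rpcCallerHost,
                                      List<String> favoredNodes,
                                      int replication) {
        Set<String> chosen = new LinkedHashSet<>();
        for (String favored : favoredNodes) {
            if (chosen.size() < replication && liveDataNodes.contains(favored)) {
                chosen.add(favored);
            }
        }
        // Assumed fallback: computed relative to the caller, i.e. the Router.
        if (chosen.size() < replication && liveDataNodes.contains(rpcCallerHost)) {
            chosen.add(rpcCallerHost);
        }
        for (String dn : liveDataNodes) {
            if (chosen.size() >= replication) {
                break;
            }
            chosen.add(dn);
        }
        return new ArrayList<>(chosen);
    }

    public static void main(String[] args) {
        String router = "host-a"; // Router co-located with a DN
        String client = "host-c"; // client on a DN host without a Router

        // Client's local DN is up: the favored node wins.
        List<String> allUp = List.of("host-a", "host-b", "host-c");
        System.out.println(chooseTargets(allUp, router, List.of(client), 1));

        // Client's local DN is down: the fallback is chosen around the Router.
        List<String> clientDnDown = List.of("host-a", "host-b");
        System.out.println(chooseTargets(clientDnDown, router, List.of(client), 1));
    }
}
```

In this model the first call picks the client's own host, while the second 
falls back to the Router's host, so the choice is again made relative to the 
Router rather than the client.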

Do take a look if you can help, or give some pointers on any of these 
solutions, or suggest a new one. This has been stalled for quite a long time.


> RBF: Namenode need to choose block location for the client
> ----------------------------------------------------------
>
>                 Key: HDFS-13248
>                 URL: https://issues.apache.org/jira/browse/HDFS-13248
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Wu Weiwei
>            Assignee: Íñigo Goiri
>            Priority: Major
>         Attachments: HDFS-13248.000.patch, HDFS-13248.001.patch, 
> HDFS-13248.002.patch, HDFS-13248.003.patch, HDFS-13248.004.patch, 
> HDFS-13248.005.patch, HDFS-Router-Data-Locality.odt, RBF Data Locality 
> Design.pdf, clientMachine-call-path.jpeg, debug-info-1.jpeg, debug-info-2.jpeg
>
>
> When a put operation is executed via the Router, the NameNode chooses the 
> block location for the Router, not for the real client. This affects the 
> file's locality.
> I think on both the NameNode and the Router we should add a new addBlock 
> method, or add a parameter to the current addBlock method, to pass the real 
> client information.
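
The proposal in the description can be sketched roughly as follows. This is a 
toy model under stated assumptions, not the real `ClientProtocol`: the class, 
both `addBlock` overloads, the helper `chooseLocalOrFirst`, and the host names 
are hypothetical, and placement is reduced to "use the writer's own host if it 
runs a DN, otherwise the first DN".

```java
import java.util.List;

/** Toy model only, not the real ClientProtocol. Every name is hypothetical. */
final class AddBlockSketch {

    /** Current behaviour in this model: place relative to the RPC caller. */
    static String addBlock(List<String> dataNodeHosts, String rpcCallerHost) {
        return chooseLocalOrFirst(dataNodeHosts, rpcCallerHost);
    }

    /**
     * Proposed variant: an extra parameter carries the real client host.
     * When it is null, the RPC caller is used as before.
     */
    static String addBlock(List<String> dataNodeHosts,
                           String rpcCallerHost,
                           String realClientHost) {
        String writer = (realClientHost != null) ? realClientHost : rpcCallerHost;
        return chooseLocalOrFirst(dataNodeHosts, writer);
    }

    /** Assumes a non-empty DN list; prefers the writer's own host. */
    static String chooseLocalOrFirst(List<String> dataNodeHosts, String writerHost) {
        return dataNodeHosts.contains(writerHost) ? writerHost : dataNodeHosts.get(0);
    }

    public static void main(String[] args) {
        List<String> dns = List.of("host-a", "host-b", "host-c");
        String router = "host-a"; // Router co-located with a DN
        String client = "host-c"; // real client, on a DN host without a Router

        // Via the Router today: the NameNode only sees the Router.
        System.out.println(addBlock(dns, router));
        // With the real client passed through: placement follows the client.
        System.out.println(addBlock(dns, router, client));
    }
}
```

The two-argument overload is kept so that callers which do not go through a 
Router behave exactly as before in this model.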


