[ https://issues.apache.org/jira/browse/HDFS-13248?focusedWorklogId=743450&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-743450 ]
ASF GitHub Bot logged work on HDFS-13248:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Mar/22 22:04
            Start Date: 17/Mar/22 22:04
    Worklog Time Spent: 10m
      Work Description: omalley opened a new pull request #4081:
URL: https://github.com/apache/hadoop/pull/4081

   The NN makes decisions that control the locality of data access based
   on the client machine. Currently it identifies the client by the IP
   address of the RPC connection; in an RBF configuration, however, that
   is always the IP address of one of the routers. We previously added
   the client's IP to the caller context in the router, so the NN now has
   the information. This patch makes the NN use the caller context
   information.

   From a security point of view, this patch adds a new configuration
   knob (dfs.namenode.ip-proxy-users) on the NN that defines the list of
   users that can set their client IP address. Sites should add "hdfs"
   (or the account that runs the routers) to
   "dfs.namenode.ip-proxy-users" on the NN to enable this feature.

   Note that the audit log does NOT currently use this information, so
   the client IP in the audit log will be that of the RBF proxy. Sites
   should turn on caller context logging so that the client IP addresses
   are captured.


Issue Time Tracking
-------------------

            Worklog Id:     (was: 743450)
    Remaining Estimate: 0h
            Time Spent: 10m

> RBF: Namenode need to choose block location for the client
> ----------------------------------------------------------
>
>                 Key: HDFS-13248
>                 URL: https://issues.apache.org/jira/browse/HDFS-13248
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Wu Weiwei
>            Assignee: Íñigo Goiri
>            Priority: Major
>         Attachments: HDFS-13248.000.patch, HDFS-13248.001.patch,
> HDFS-13248.002.patch, HDFS-13248.003.patch, HDFS-13248.004.patch,
> HDFS-13248.005.patch, HDFS-Router-Data-Locality.odt, RBF Data Locality
> Design.pdf, clientMachine-call-path.jpeg, debug-info-1.jpeg, debug-info-2.jpeg
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When a put operation is executed via the router, the NameNode chooses the
> block location for the router, not for the real client. This affects the
> file's locality.
> I think that on both the NameNode and the Router we should add a new
> addBlock method, or add a parameter to the current addBlock method, to pass
> the real client information.
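
For sites following the pull request description above, the NameNode-side setup might look roughly like the fragment below. This is only a sketch under stated assumptions: the property name dfs.namenode.ip-proxy-users and the value "hdfs" come from the message itself, while hadoop.caller.context.enabled is not named in the message and is assumed here to be the switch meant by "turn on caller context logging". Which file each property belongs in is likewise an assumption.

```xml
<!-- hdfs-site.xml on the NameNode (file placement is an assumption) -->
<property>
  <name>dfs.namenode.ip-proxy-users</name>
  <!-- "hdfs", or whichever account actually runs the routers -->
  <value>hdfs</value>
</property>

<!-- core-site.xml (assumed switch for caller context logging;
     not named in the message above) -->
<property>
  <name>hadoop.caller.context.enabled</name>
  <value>true</value>
</property>
```

With only the first property set, block placement would use the real client's IP, but the audit log's client IP field would still show the router, as the description notes; the second property is what would let the client address show up in the logged caller context.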