[jira] Updated: (HDFS-894) DatanodeID.ipcPort is not updated when existing node re-registers
[ https://issues.apache.org/jira/browse/HDFS-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HDFS-894:
---------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Todd!

DatanodeID.ipcPort is not updated when existing node re-registers
-----------------------------------------------------------------

                Key: HDFS-894
                URL: https://issues.apache.org/jira/browse/HDFS-894
            Project: Hadoop HDFS
         Issue Type: Bug
         Components: name-node
  Affects Versions: 0.20.1, 0.21.0, 0.22.0
           Reporter: Todd Lipcon
           Assignee: Todd Lipcon
           Priority: Blocker
        Attachments: hdfs-894.txt

FSNamesystem.registerDatanode checks whether a registering node is a re-registration of an old one, based on storage ID. If so, it simply updates the old one with the new registration info. However, the new ipcPort is lost when this happens.

I reproduced this manually by setting up a DN with the IPC port set to 0 (so it picks an ephemeral port) and then restarting the DN. At this point, the NN's view of the ipcPort is stale, and clients will not be able to achieve pipeline recovery.

This should be easy to fix and unit test, but I'm not sure when I'll get to it, so anyone else should feel free to grab it if they get to it first.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
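For readers following the description above, here is a minimal, self-contained sketch of the failure mode. It does not use the Hadoop classes and is not the committed patch: NodeId, Registry, updateRegInfoBuggy and the port numbers are hypothetical stand-ins, and the assumption that the field is dropped inside the "update the old entry" step is inferred from the description only.

```java
import java.util.HashMap;
import java.util.Map;

class ReRegistrationSketch {

    /** Hypothetical, simplified stand-in for a datanode identity record. */
    static class NodeId {
        String name;       // host:port of the data transfer endpoint
        String storageId;  // stable across restarts
        int infoPort;
        int ipcPort;       // may change on restart when configured as 0

        NodeId(String name, String storageId, int infoPort, int ipcPort) {
            this.name = name;
            this.storageId = storageId;
            this.infoPort = infoPort;
            this.ipcPort = ipcPort;
        }

        /** Buggy update: copies the registration info but forgets ipcPort. */
        void updateRegInfoBuggy(NodeId from) {
            name = from.name;
            storageId = from.storageId;
            infoPort = from.infoPort;
        }

        /** Fixed update: also carries over the new ipcPort. */
        void updateRegInfo(NodeId from) {
            updateRegInfoBuggy(from);
            ipcPort = from.ipcPort;
        }
    }

    /** Hypothetical registry keyed by storage ID, standing in for the NN's view. */
    static class Registry {
        private final Map<String, NodeId> byStorageId = new HashMap<>();
        private final boolean copyIpcPort;

        Registry(boolean copyIpcPort) {
            this.copyIpcPort = copyIpcPort;
        }

        NodeId register(NodeId reg) {
            NodeId existing = byStorageId.get(reg.storageId);
            if (existing == null) {
                // First registration: store the record as-is.
                byStorageId.put(reg.storageId, reg);
                return reg;
            }
            // Re-registration: update the old record in place.
            if (copyIpcPort) {
                existing.updateRegInfo(reg);
            } else {
                existing.updateRegInfoBuggy(reg);
            }
            return existing;
        }
    }

    /** Registers a node, "restarts" it with a new IPC port, returns the registry's view. */
    static int ipcPortAfterRestart(boolean copyIpcPort) {
        Registry registry = new Registry(copyIpcPort);
        registry.register(new NodeId("dn1:50010", "DS-1", 50075, 41001));
        return registry.register(new NodeId("dn1:50010", "DS-1", 50075, 41002)).ipcPort;
    }

    public static void main(String[] args) {
        System.out.println("buggy view: " + ipcPortAfterRestart(false)); // 41001 (stale)
        System.out.println("fixed view: " + ipcPortAfterRestart(true));  // 41002
    }
}
```

In the buggy path the registry keeps advertising the pre-restart port, which is the stale view the description says prevents pipeline recovery. A unit test along these lines would register, re-register with a different IPC port, and assert on the stored value.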
[jira] Updated: (HDFS-894) DatanodeID.ipcPort is not updated when existing node re-registers
[ https://issues.apache.org/jira/browse/HDFS-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HDFS-894:
---------------------------

    Fix Version/s: 0.22.0

DatanodeID.ipcPort is not updated when existing node re-registers
-----------------------------------------------------------------

                Key: HDFS-894
                URL: https://issues.apache.org/jira/browse/HDFS-894
            Project: Hadoop HDFS
         Issue Type: Bug
         Components: name-node
  Affects Versions: 0.20.1, 0.21.0, 0.22.0
           Reporter: Todd Lipcon
           Assignee: Todd Lipcon
           Priority: Blocker
            Fix For: 0.22.0
        Attachments: hdfs-894.txt
[jira] Updated: (HDFS-894) DatanodeID.ipcPort is not updated when existing node re-registers
[ https://issues.apache.org/jira/browse/HDFS-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-894:
-----------------------------

    Attachment: hdfs-894.txt

DatanodeID.ipcPort is not updated when existing node re-registers
-----------------------------------------------------------------

                Key: HDFS-894
                URL: https://issues.apache.org/jira/browse/HDFS-894
            Project: Hadoop HDFS
         Issue Type: Bug
  Affects Versions: 0.20.1, 0.21.0, 0.22.0
           Reporter: Todd Lipcon
           Priority: Blocker
        Attachments: hdfs-894.txt
[jira] Updated: (HDFS-894) DatanodeID.ipcPort is not updated when existing node re-registers
[ https://issues.apache.org/jira/browse/HDFS-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-894:
-----------------------------

    Status: Patch Available  (was: Open)

DatanodeID.ipcPort is not updated when existing node re-registers
-----------------------------------------------------------------

                Key: HDFS-894
                URL: https://issues.apache.org/jira/browse/HDFS-894
            Project: Hadoop HDFS
         Issue Type: Bug
  Affects Versions: 0.20.1, 0.21.0, 0.22.0
           Reporter: Todd Lipcon
           Priority: Blocker
        Attachments: hdfs-894.txt
[jira] Updated: (HDFS-894) DatanodeID.ipcPort is not updated when existing node re-registers
[ https://issues.apache.org/jira/browse/HDFS-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-894:
-----------------------------

    Component/s: name-node

This should be fixed in all three current branches. As mentioned in the description, it can prevent the write pipeline from recovering since ClientDatanodeProtocol and InterDatanodeProtocol won't be able to connect.

DatanodeID.ipcPort is not updated when existing node re-registers
-----------------------------------------------------------------

                Key: HDFS-894
                URL: https://issues.apache.org/jira/browse/HDFS-894
            Project: Hadoop HDFS
         Issue Type: Bug
         Components: name-node
  Affects Versions: 0.20.1, 0.21.0, 0.22.0
           Reporter: Todd Lipcon
           Priority: Blocker
        Attachments: hdfs-894.txt