[jira] [Commented] (HDFS-2720) HA : TestStandbyIsHot is failing while copying in_use.lock file from NN1 nameSpaceDirs to NN2 nameSpaceDirs
[ https://issues.apache.org/jira/browse/HDFS-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180422#comment-13180422 ]

Hudson commented on HDFS-2720:

Integrated in Hadoop-Hdfs-HAbranch-build #38 (See [https://builds.apache.org/job/Hadoop-Hdfs-HAbranch-build/38/])
HDFS-2720. Fix MiniDFSCluster HA support to work properly on Windows. Contributed by Uma Maheswara Rao G.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1227284
Files :
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-1623.txt
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java

HA : TestStandbyIsHot is failing while copying in_use.lock file from NN1 nameSpaceDirs to NN2 nameSpaceDirs

Key: HDFS-2720
URL: https://issues.apache.org/jira/browse/HDFS-2720
Project: Hadoop HDFS
Issue Type: Sub-task
Components: ha, test
Affects Versions: HA branch (HDFS-1623)
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Fix For: HA branch (HDFS-1623)
Attachments: HDFS-2720.patch, HDFS-2720.patch

To keep the clusterID the same, we copy the namespaceDirs from the 1st NN to the other NNs. While copying these files, the in_use.lock file may fail to copy on some OSs because the running NN has acquired a lock on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2720) HA : TestStandbyIsHot is failing while copying in_use.lock file from NN1 nameSpaceDirs to NN2 nameSpaceDirs
[ https://issues.apache.org/jira/browse/HDFS-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178029#comment-13178029 ]

Uma Maheswara Rao G commented on HDFS-2720:

Todd, thanks a lot for the review! I am presently on leave and will update the patch soon.

Thanks
Uma
[jira] [Commented] (HDFS-2720) HA : TestStandbyIsHot is failing while copying in_use.lock file from NN1 nameSpaceDirs to NN2 nameSpaceDirs
[ https://issues.apache.org/jira/browse/HDFS-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177371#comment-13177371 ]

Todd Lipcon commented on HDFS-2720:

Small nits:

{code}
+ // Now format 1st NN and copy the storage dirs to remaining all.
{code}
"to remaining all" seems like a typo. "copy the storage directory from that node to the others." would be better. Also I think it's easier to read "first" than "1st".

{code}
+ //Start all Namenodes
{code}
add space after {{//}}

- The change to remove setRpcEngine looks unrelated - that should get cleaned up in trunk so it doesn't present a merge issue in the branch.
[jira] [Commented] (HDFS-2720) HA : TestStandbyIsHot is failing while copying in_use.lock file from NN1 nameSpaceDirs to NN2 nameSpaceDirs
[ https://issues.apache.org/jira/browse/HDFS-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177389#comment-13177389 ]

Eli Collins commented on HDFS-2720:

ATM and I were discussing how to initialize the SBN state yesterday. What we currently do is format the primary, then copy the name dirs to the SBN. How about making the SBN do this automatically on startup? Specifically, on NN startup, if HA and a shared edits dir are configured but there is no local image, the SBN downloads the image from the primary (if the other NN is still standby, it fails to start, as it does currently).
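The startup rule proposed in this comment can be written down as a small decision function. The following is only a sketch of the proposal as worded here, not code from any patch: the class name, the enum, and every parameter are hypothetical, and it assumes that a NameNode with a local image always loads it.

```java
/** Hypothetical sketch of the proposed startup decision; all names are illustrative. */
public class StandbyBootstrapDecision {

    /** What a starting NameNode would do under the proposal. */
    public enum Action { LOAD_LOCAL_IMAGE, DOWNLOAD_IMAGE_FROM_ACTIVE, FAIL_TO_START }

    public static Action decide(boolean haEnabled,
                                boolean sharedEditsDirConfigured,
                                boolean hasLocalImage,
                                boolean otherNnIsActive) {
        // Assumption: an existing local image is always used as-is.
        if (hasLocalImage) {
            return Action.LOAD_LOCAL_IMAGE;
        }
        // No local image: bootstrap from the other NN only when HA and a shared
        // edits dir are configured and that NN is currently active.
        if (haEnabled && sharedEditsDirConfigured && otherNnIsActive) {
            return Action.DOWNLOAD_IMAGE_FROM_ACTIVE;
        }
        // Otherwise fail to start, which the comment describes as the current behaviour.
        return Action.FAIL_TO_START;
    }
}
```

For example, `decide(true, true, false, true)` yields `DOWNLOAD_IMAGE_FROM_ACTIVE`, while the same call with `otherNnIsActive` set to `false` yields `FAIL_TO_START`.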
[jira] [Commented] (HDFS-2720) HA : TestStandbyIsHot is failing while copying in_use.lock file from NN1 nameSpaceDirs to NN2 nameSpaceDirs
[ https://issues.apache.org/jira/browse/HDFS-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177434#comment-13177434 ]

Todd Lipcon commented on HDFS-2720:

That would be a nice improvement... but I think it makes sense to do this small fix that Uma proposed so the tests run on Windows, and then do the "standby initializes from remote active" feature separately?
[jira] [Commented] (HDFS-2720) HA : TestStandbyIsHot is failing while copying in_use.lock file from NN1 nameSpaceDirs to NN2 nameSpaceDirs
[ https://issues.apache.org/jira/browse/HDFS-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177468#comment-13177468 ]

Eli Collins commented on HDFS-2720:

Yup, I'll file a separate jira. Agree wrt the fix for Windows.
[jira] [Commented] (HDFS-2720) HA : TestStandbyIsHot is failing while copying in_use.lock file from NN1 nameSpaceDirs to NN2 nameSpaceDirs
[ https://issues.apache.org/jira/browse/HDFS-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175532#comment-13175532 ]

Todd Lipcon commented on HDFS-2720:

For testing on actual clusters, I've done this by shutting down the active NN, then just rsyncing the storage dir to the standby, then starting the standby.

Your idea of skipping in_use.lock is one solution for MiniDFSCluster. The other solution would be to copy the storage dir to all the standbys before starting the first NN. But that might break addNameNode support in HA - maybe not a big deal since I don't think we use that in HA cluster tests at the moment.
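The "skip in_use.lock" idea mentioned in this comment might look roughly like the helper below. This is a minimal sketch using only java.nio.file, not the code from HDFS-2720.patch; the class and method names are hypothetical, and it assumes the lock file is a regular file named in_use.lock inside the storage directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustrative helper only; not the MiniDFSCluster code from the patch. */
public class StorageDirCopier {

    /** Name of the lock file discussed in this issue. */
    static final String LOCK_FILE = "in_use.lock";

    /** Recursively copies srcDir into dstDir, skipping any entry named in_use.lock. */
    public static void copySkippingLock(Path srcDir, Path dstDir) {
        try (Stream<Path> walk = Files.walk(srcDir)) {
            // Files.walk visits a directory before its children, so each
            // parent directory is created before the files inside it.
            List<Path> entries = walk.collect(Collectors.toList());
            for (Path src : entries) {
                Path name = src.getFileName();
                if (name != null && LOCK_FILE.equals(name.toString())) {
                    continue; // never touch the (possibly locked) lock file
                }
                Path dst = dstDir.resolve(srcDir.relativize(src).toString());
                if (Files.isDirectory(src)) {
                    Files.createDirectories(dst);
                } else {
                    Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Skipping the file by name avoids reading a file that another process may hold locked, which is the failure reported on Windows in this issue. The copied directory then has no lock file until the second NN creates its own.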
[jira] [Commented] (HDFS-2720) HA : TestStandbyIsHot is failing while copying in_use.lock file from NN1 nameSpaceDirs to NN2 nameSpaceDirs
[ https://issues.apache.org/jira/browse/HDFS-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175542#comment-13175542 ]

Uma Maheswara Rao G commented on HDFS-2720:

Thanks a lot, Todd, for the suggestions.

{quote}
For testing on actual clusters, I've done this by shutting down the active NN, then just rsyncing the storage dir to the standby, then starting the standby.
{quote}

I feel we should automate this once we have built the automatic HA, right? We were depending on ZooKeeper to store the namespaceID (in our internal HA), but I am not sure how we can handle it in the Linux HA case. Shall I file a JIRA for it?

Thanks
Uma
[jira] [Commented] (HDFS-2720) HA : TestStandbyIsHot is failing while copying in_use.lock file from NN1 nameSpaceDirs to NN2 nameSpaceDirs
[ https://issues.apache.org/jira/browse/HDFS-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175686#comment-13175686 ]

Uma Maheswara Rao G commented on HDFS-2720:

Updated the patch. First it formats one NN and copies the dirs to the remaining nodes. After this step it starts all NNs.
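The ordering described in this comment (format, copy, then start) can be sketched as below. The `Steps` interface and its method names are hypothetical stand-ins rather than the MiniDFSCluster API. The point is only that every copy happens before any NN is started, so no running NN holds in_use.lock while the directories are copied.

```java
/** Illustrative ordering only; Steps is a hypothetical hook interface. */
public class HaClusterStartOrder {

    /** Hypothetical callbacks standing in for the real cluster operations. */
    public interface Steps {
        void format(int nnIndex);
        void copyStorageDirs(int fromNn, int toNn);
        void start(int nnIndex);
    }

    /** Formats the first NN, copies its storage dirs to the others, then starts all NNs. */
    public static void bringUp(int numNameNodes, Steps steps) {
        if (numNameNodes < 1) {
            throw new IllegalArgumentException("need at least one NameNode");
        }
        // 1. Format only the first NN so that a single clusterID is generated.
        steps.format(0);
        // 2. Copy its storage dirs to every other NN while nothing is running.
        for (int i = 1; i < numNameNodes; i++) {
            steps.copyStorageDirs(0, i);
        }
        // 3. Only now start the NNs; each takes the lock on its own directory.
        for (int i = 0; i < numNameNodes; i++) {
            steps.start(i);
        }
    }
}
```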