[jira] [Updated] (HDFS-5944) LeaseManager:findLeaseWithPrefixPath didn't handle path like /a/b/ right cause SecondaryNameNode failed do checkpoint
[ https://issues.apache.org/jira/browse/HDFS-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Li updated HDFS-5944:
    Status: Patch Available (was: Open)

LeaseManager:findLeaseWithPrefixPath didn't handle path like /a/b/ right cause SecondaryNameNode failed do checkpoint

    Key: HDFS-5944
    URL: https://issues.apache.org/jira/browse/HDFS-5944
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: namenode
    Affects Versions: 2.2.0, 1.2.0
    Reporter: zhaoyunjiong
    Assignee: zhaoyunjiong
    Attachments: HDFS-5944-branch-1.2.patch, HDFS-5944.patch, HDFS-5944.test.txt, HDFS-5944.trunk.patch

In our cluster, we encountered an error like this:

    java.io.IOException: saveLeases found path /XXX/20140206/04_30/_SUCCESS.slc.log but is not under construction.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:6217)
        at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.save(FSImageFormat.java:607)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.saveCurrent(FSImage.java:1004)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:949)

What happened:
Client A opened the file /XXX/20140206/04_30/_SUCCESS.slc.log for write, and kept renewing its lease.
Client B deleted /XXX/20140206/04_30/.
Client C opened the file /XXX/20140206/04_30/_SUCCESS.slc.log for write.
Client C closed the file /XXX/20140206/04_30/_SUCCESS.slc.log.
Then the SecondaryNameNode tried to do a checkpoint and failed, because the lease held by Client A had not been deleted when Client B deleted /XXX/20140206/04_30/.

The reason is a bug in findLeaseWithPrefixPath:

    int srclen = prefix.length();
    if (p.length() == srclen || p.charAt(srclen) == Path.SEPARATOR_CHAR) {
      entries.put(entry.getKey(), entry.getValue());
    }

Here, when prefix is /XXX/20140206/04_30/ and p is /XXX/20140206/04_30/_SUCCESS.slc.log, p.charAt(srclen) is '_' rather than the separator, so the lease is not matched.
The fix is simple; I'll upload a patch later.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
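To make the failing boundary test in the description concrete, here is a minimal stand-alone sketch. It is not the LeaseManager code: the class name `PrefixCheckDemo`, the method `reportedCheck`, and the local `SEPARATOR_CHAR` constant (standing in for `Path.SEPARATOR_CHAR`) are all hypothetical, and the sketch assumes the caller only reaches the check for paths that start with the prefix.

```java
// Hypothetical stand-alone demo of the check quoted in the report.
// It does not use Hadoop classes; SEPARATOR_CHAR stands in for Path.SEPARATOR_CHAR.
final class PrefixCheckDemo {
    static final char SEPARATOR_CHAR = '/';

    // Applies the quoted boundary test to a path p.
    // Assumption: only paths that start with prefix are considered.
    static boolean reportedCheck(String prefix, String p) {
        if (!p.startsWith(prefix)) {
            return false;
        }
        int srclen = prefix.length();
        return p.length() == srclen || p.charAt(srclen) == SEPARATOR_CHAR;
    }

    public static void main(String[] args) {
        String file = "/XXX/20140206/04_30/_SUCCESS.slc.log";
        // Without a trailing slash, charAt(srclen) is '/', so the file matches.
        System.out.println(reportedCheck("/XXX/20140206/04_30", file));   // true
        // With a trailing slash, charAt(srclen) is '_', so the file is missed.
        System.out.println(reportedCheck("/XXX/20140206/04_30/", file));  // false
    }
}
```

The second call is the case from the report: the file sits under the deleted directory, but the check does not select it, so its lease would be left behind.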
[jira] [Updated] (HDFS-5944) LeaseManager:findLeaseWithPrefixPath didn't handle path like /a/b/ right cause SecondaryNameNode failed do checkpoint
[ https://issues.apache.org/jira/browse/HDFS-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Li updated HDFS-5944:
    Attachment: HDFS-5944.trunk.patch

Upload the same trunk patch to trigger the build.

Attachments: HDFS-5944-branch-1.2.patch, HDFS-5944.patch, HDFS-5944.test.txt, HDFS-5944.trunk.patch
[jira] [Updated] (HDFS-5944) LeaseManager:findLeaseWithPrefixPath didn't handle path like /a/b/ right cause SecondaryNameNode failed do checkpoint
[ https://issues.apache.org/jira/browse/HDFS-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhaoyunjiong updated HDFS-5944:
    Attachment: HDFS-5944-branch-1.2.patch
                HDFS-5944.patch

Update patches with unit test.

Attachments: HDFS-5944-branch-1.2.patch, HDFS-5944-branch-1.2.patch, HDFS-5944.patch, HDFS-5944.patch, HDFS-5944.test.txt
[jira] [Updated] (HDFS-5944) LeaseManager:findLeaseWithPrefixPath didn't handle path like /a/b/ right cause SecondaryNameNode failed do checkpoint
[ https://issues.apache.org/jira/browse/HDFS-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhaoyunjiong updated HDFS-5944:
    Attachment: (was: HDFS-5944-branch-1.2.patch)

Attachments: HDFS-5944-branch-1.2.patch, HDFS-5944.patch, HDFS-5944.test.txt
[jira] [Updated] (HDFS-5944) LeaseManager:findLeaseWithPrefixPath didn't handle path like /a/b/ right cause SecondaryNameNode failed do checkpoint
[ https://issues.apache.org/jira/browse/HDFS-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhaoyunjiong updated HDFS-5944:
    Attachment: (was: HDFS-5944.patch)

Attachments: HDFS-5944-branch-1.2.patch, HDFS-5944.patch, HDFS-5944.test.txt
[jira] [Updated] (HDFS-5944) LeaseManager:findLeaseWithPrefixPath didn't handle path like /a/b/ right cause SecondaryNameNode failed do checkpoint
[ https://issues.apache.org/jira/browse/HDFS-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Li updated HDFS-5944:
    Attachment: HDFS-5944.test.txt

Uploaded a unit test for branch-1 that reproduces the bug. Since the error crashes the 2NN JVM, you need to run the test as follows so that the 2NN error is printed to the console for validation:

    $ ant test -Dtest.output=yes -Dtestcase=testname

Attachments: HDFS-5944-branch-1.2.patch, HDFS-5944.patch, HDFS-5944.test.txt
[jira] [Updated] (HDFS-5944) LeaseManager:findLeaseWithPrefixPath didn't handle path like /a/b/ right cause SecondaryNameNode failed do checkpoint
[ https://issues.apache.org/jira/browse/HDFS-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhaoyunjiong updated HDFS-5944:
    Description: "The reason is this a bug in findLeaseWithPrefixPath:" now reads "The reason is a bug in findLeaseWithPrefixPath:". The rest of the description is unchanged.
[jira] [Updated] (HDFS-5944) LeaseManager:findLeaseWithPrefixPath didn't handle path like /a/b/ right cause SecondaryNameNode failed do checkpoint
[ https://issues.apache.org/jira/browse/HDFS-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhaoyunjiong updated HDFS-5944:
    Attachment: HDFS-5944.patch
                HDFS-5944-branch-1.2.patch

This patch is very simple: if prefix ends with '/', subtract 1 from srclen, so that p.charAt(srclen) lands on the separator and the path is handled correctly.

Attachments: HDFS-5944-branch-1.2.patch, HDFS-5944.patch
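The change described in this message (step srclen back by one when prefix ends with '/') can be sketched roughly as follows. This is an illustration of the idea only and is not taken from the attached patches: the class `PrefixCheckFixSketch`, the method `isUnderPrefix`, and the local `SEPARATOR_CHAR` constant (standing in for `Path.SEPARATOR_CHAR`) are hypothetical names, and the `startsWith` guard is an assumption about what the surrounding code already ensures.

```java
// Hypothetical sketch of the described fix; not the code from the attached patches.
final class PrefixCheckFixSketch {
    static final char SEPARATOR_CHAR = '/'; // stands in for Path.SEPARATOR_CHAR

    // Returns true if p is the prefix itself or lies underneath it.
    static boolean isUnderPrefix(String prefix, String p) {
        // Assumption: the real caller only considers paths starting with prefix.
        if (!p.startsWith(prefix)) {
            return false;
        }
        int srclen = prefix.length();
        // If prefix ends with the separator, step back one character so that
        // p.charAt(srclen) inspects that separator instead of the first
        // character of the child name.
        if (srclen > 0 && prefix.charAt(srclen - 1) == SEPARATOR_CHAR) {
            srclen -= 1;
        }
        return p.length() == srclen || p.charAt(srclen) == SEPARATOR_CHAR;
    }

    public static void main(String[] args) {
        String file = "/XXX/20140206/04_30/_SUCCESS.slc.log";
        System.out.println(isUnderPrefix("/XXX/20140206/04_30/", file)); // true
        System.out.println(isUnderPrefix("/XXX/20140206/04_30", file));  // true
        System.out.println(isUnderPrefix("/a/b", "/a/bc/file"));         // false
    }
}
```

With this adjustment, prefixes with and without a trailing slash select the same set of paths, while a sibling such as /a/bc is still rejected for prefix /a/b.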