[jira] [Commented] (HDFS-7443) Datanode upgrade to BLOCKID_BASED_LAYOUT sometimes fails

2014-12-17 Thread Colin Patrick McCabe (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251106#comment-14251106 ]

Colin Patrick McCabe commented on HDFS-7443:


It appears that the old software could sometimes create a duplicate copy of the 
same block in two different {{subdir}} folders on the same volume.  In every 
case we've seen, the two files were identical: two copies of the same block ID 
in separate directories.  This appears to be a bug, since obviously we don't 
want to store the same block twice on the same volume.  It also causes the 
{{EEXIST}} failure on upgrade, since the new block layout has exactly one place 
where each block ID can go.  Unfortunately, the hardlink code doesn't print the 
name of the file that caused the problem, which makes diagnosis more difficult 
than it should be.

One easy way around this is to check each volume for duplicate block IDs 
before upgrading, and manually remove the duplicates.
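
For reference, a minimal sketch of such a pre-upgrade check (the directory 
layout and naming assumptions here are mine; it walks one volume's block tree 
and reports any block file name that appears under more than one {{subdir}}):

{noformat}
import java.io.File;
import java.util.*;

// Walks a pre-upgrade volume (e.g. .../current/BP-.../current/finalized)
// and prints every block file name that appears in more than one directory.
public class DupBlockCheck {
  public static void main(String[] args) {
    Map<String, List<File>> byName = new HashMap<>();
    collect(new File(args[0]), byName);
    for (Map.Entry<String, List<File>> e : byName.entrySet()) {
      if (e.getValue().size() > 1) {
        System.out.println("duplicate " + e.getKey() + ": " + e.getValue());
      }
    }
  }

  private static void collect(File dir, Map<String, List<File>> byName) {
    File[] entries = dir.listFiles();
    if (entries == null) return;
    for (File f : entries) {
      if (f.isDirectory()) {
        collect(f, byName);
      } else if (f.getName().startsWith("blk_")) {
        // block files and their .meta files are both keyed by name here
        byName.computeIfAbsent(f.getName(), k -> new ArrayList<>()).add(f);
      }
    }
  }
}
{noformat}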

We should also consider logging an error message and continuing the upgrade 
process when we encounter this, rather than aborting.
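
Something along these lines (a hypothetical sketch, using 
{{java.nio.file.Files.createLink}} as a stand-in for the native call, and a 
size comparison as a cheap stand-in for a full content check):

{noformat}
import java.io.IOException;
import java.nio.file.*;

// Hypothetical "log and continue" behavior for the upgrade's hard-link step:
// a duplicate whose size matches the copy already linked is logged and
// skipped; a size mismatch still aborts, since that suggests real corruption.
final class LenientLinker {
  static void linkOrSkip(Path existing, Path link) throws IOException {
    try {
      Files.createLink(link, existing);
    } catch (FileAlreadyExistsException e) {
      if (Files.size(link) == Files.size(existing)) {
        System.err.println("Duplicate block file, skipping hard link: "
            + existing + " -> " + link);
      } else {
        throw new IOException("EEXIST with differing sizes: "
            + existing + " -> " + link, e);
      }
    }
  }
}
{noformat}

Note that this would also record the offending file names, addressing the 
diagnostics gap mentioned above.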

[~kihwal], I'm not sure why, in your case, the DataNode retried the hard link 
process multiple times.  I'm also not sure why you ended up with a jumbled 
{{previous.tmp}} directory.  When we reproduced this on CDH5.2, we did not have 
that problem, for whatever reason.

 Datanode upgrade to BLOCKID_BASED_LAYOUT sometimes fails
 

 Key: HDFS-7443
 URL: https://issues.apache.org/jira/browse/HDFS-7443
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Kihwal Lee
Priority: Blocker

 When we upgraded a medium-sized cluster from 2.5 to 2.6, about 4% of the 
 datanodes did not come up.  They tried the data file layout upgrade to 
 BLOCKID_BASED_LAYOUT, introduced in HDFS-6482, but failed.
 All failures were caused by {{NativeIO.link()}} throwing an IOException with 
 {{EEXIST}}.  The datanodes didn't die right away; the upgrade was retried 
 each time block pool initialization was reattempted as {{BPServiceActor}} 
 registered with the namenode.  After many retries, the datanodes terminated. 
 This left {{previous.tmp}} and {{current}} with no {{VERSION}} file in the 
 block pool slice storage directory.
 Although {{previous.tmp}} contained the old {{VERSION}} file, its content was 
 in the new layout and the subdirs were all newly created.  This shouldn't 
 have happened, because the upgrade-recovery logic in {{Storage}} removes 
 {{current}} and renames {{previous.tmp}} to {{current}} before retrying.  
 All successfully upgraded volumes had the old state preserved in their 
 {{previous}} directory.
 In summary, there were two observed issues:
 - Upgrade failure, with {{link()}} failing with {{EEXIST}}
 - {{previous.tmp}} containing not the content of the original {{current}}, 
 but a half-upgraded one
 We did not see this in smaller-scale test clusters.
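
For context, the recovery step the description refers to amounts to the 
following (a simplified sketch of the {{RECOVER_UPGRADE}} handling; the helper 
and its placement are illustrative, not the actual {{Storage}} code):

{noformat}
import java.io.File;
import java.io.IOException;

// Simplified sketch: discard the partially upgraded "current" and restore
// the pre-upgrade state by renaming "previous.tmp" back to "current".
final class UpgradeRecovery {
  static void recoverUpgrade(File storageDir) throws IOException {
    File current = new File(storageDir, "current");
    File previousTmp = new File(storageDir, "previous.tmp");
    if (previousTmp.exists()) {
      deleteRecursively(current);             // drop half-upgraded content
      if (!previousTmp.renameTo(current)) {   // restore the original layout
        throw new IOException("Failed to rename " + previousTmp
            + " to " + current);
      }
    }
  }

  static void deleteRecursively(File f) throws IOException {
    File[] children = f.listFiles();
    if (children != null) {
      for (File c : children) deleteRecursively(c);
    }
    if (f.exists() && !f.delete()) {
      throw new IOException("Failed to delete " + f);
    }
  }
}
{noformat}

If that rename runs as intended, {{previous.tmp}} should never end up with 
new-layout content, which is what makes the second observation above 
surprising.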





[jira] [Commented] (HDFS-7443) Datanode upgrade to BLOCKID_BASED_LAYOUT sometimes fails

2014-12-01 Thread Colin Patrick McCabe (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230223#comment-14230223 ]

Colin Patrick McCabe commented on HDFS-7443:


The {{EEXIST}} error and the modified {{previous.tmp}} seem related.  If we 
somehow tried to upgrade a directory that was already half-upgraded, {{EEXIST}} 
is exactly what we'd expect to see.
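
That failure mode is easy to demonstrate in isolation; creating a hard link at 
a path that already exists fails with {{EEXIST}} (a toy example, not the 
DataNode code):

{noformat}
import java.io.IOException;
import java.nio.file.*;

// Toy demonstration: the second attempt to create a hard link at the same
// target path fails, which is what a re-run over a half-upgraded tree hits.
public class EexistDemo {
  public static void main(String[] args) throws IOException {
    Path src = Files.createTempFile("blk_", ".data");
    Path dst = src.resolveSibling("blk_demo_link.data");
    Files.deleteIfExists(dst);      // clean up any leftover from a prior run
    Files.createLink(dst, src);     // first link succeeds
    try {
      Files.createLink(dst, src);   // target name already exists
    } catch (FileAlreadyExistsException e) {
      System.out.println("EEXIST: " + e.getFile());
    }
  }
}
{noformat}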



[jira] [Commented] (HDFS-7443) Datanode upgrade to BLOCKID_BASED_LAYOUT sometimes fails

2014-11-25 Thread Kihwal Lee (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224617#comment-14224617 ]

Kihwal Lee commented on HDFS-7443:
--

This is the first error seen.

{noformat}
ERROR datanode.DataNode: Initialization failed for Block pool registering (Datanode Uuid unassigned) service to some.host:8020 EEXIST: File exists
{noformat}

This came after several volumes had upgraded successfully. Since the hard-link 
summary was never printed and the failure occurred multiple seconds after the 
upgrade of this volume started (it did not fail right away), the error must 
have come from {{DataStorage.linkBlocks()}} when it checked the task results 
with {{Futures.get()}}.
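
That timing makes sense given how the link work is parallelized: exceptions 
thrown by the worker tasks are captured in their futures and only rethrown 
when the results are collected afterwards. A generic illustration (plain 
{{java.util.concurrent}} code, not the actual {{linkBlocks()}} implementation):

{noformat}
import java.io.IOException;
import java.util.*;
import java.util.concurrent.*;

// An IOException thrown inside a submitted task does not surface at submit
// time; it is stored in the Future and rethrown (wrapped) only on get().
public class DeferredFailureDemo {
  public static void main(String[] args) throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    List<Future<Void>> results = new ArrayList<>();
    for (int i = 0; i < 100; i++) {
      final int n = i;
      results.add(pool.submit(() -> {
        if (n == 42) {
          throw new IOException("EEXIST: File exists");  // simulated link failure
        }
        return null;
      }));
    }
    for (Future<Void> f : results) {
      try {
        f.get();  // the failure is reported here, well after the work started
      } catch (ExecutionException e) {
        System.out.println("surfaced late: " + e.getCause());
      }
    }
    pool.shutdown();
  }
}
{noformat}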

Then it was retried and failed the same way.

{noformat}
INFO common.Storage: Analyzing storage directories for bpid BP-
INFO common.Storage: Recovering storage directory /a/b/hadoop/var/hdfs/data/current/BP- from previous upgrade
INFO common.Storage: Upgrading block pool storage directory /a/b/hadoop/var/hdfs/data/current/BP-
   old LV = -55; old CTime = 12345678.
   new LV = -56; new CTime = 45678989
ERROR datanode.DataNode: Initialization failed for Block pool registering (Datanode Uuid unassigned) service to some.host:8020 EEXIST: File exists
{noformat}

This indicates that {{Storage.analyzeStorage()}} correctly returned 
{{RECOVER_UPGRADE}} and that the partial upgrade was undone before retrying. 
This repeated hundreds of times before the datanode terminated, logging the 
stack trace below.

{noformat}
FATAL datanode.DataNode: Initialization failed for Block pool registering (Datanode Uuid unassigned) service to some.host:8020. Exiting. 
java.io.IOException: EEXIST: File exists
at sun.reflect.GeneratedConstructorAccessor18.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
at com.google.common.util.concurrent.Futures.newFromConstructor(Futures.java:1258)
at com.google.common.util.concurrent.Futures.newWithCause(Futures.java:1218)
at com.google.common.util.concurrent.Futures.wrapAndThrowExceptionOrError(Futures.java:1131)
at com.google.common.util.concurrent.Futures.get(Futures.java:1048)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.linkBlocks(DataStorage.java:999)
at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.linkAllBlocks(BlockPoolSliceStorage.java:594)
at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doUpgrade(BlockPoolSliceStorage.java:403)
at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doTransition(BlockPoolSliceStorage.java:337)
at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:197)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:438)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1312)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1277)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:314)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:221)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:829)
at java.lang.Thread.run(Thread.java:722)
Caused by: EEXIST: File exists
at org.apache.hadoop.io.nativeio.NativeIO.link0(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO.link(NativeIO.java:836)
at org.apache.hadoop.hdfs.server.datanode.DataStorage$2.call(DataStorage.java:991)
at org.apache.hadoop.hdfs.server.datanode.DataStorage$2.call(DataStorage.java:984)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
... 1 more
{noformat}

At this point, {{previous.tmp}} contained the new directory structure, with 
block and meta files placed in the ID-based directories. Some orphaned meta 
and block files were observed.  Restarting the datanode does not reproduce the 
issue, but I suspect data loss, based on the missing files and the number of 
missing blocks.
