J.Andreina created HDFS-7820:
--------------------------------
Summary: Client Write fails after rolling upgrade operation with
"<block_id> already exist in finalized state"
Key: HDFS-7820
URL: https://issues.apache.org/jira/browse/HDFS-7820
Project: Hadoop HDFS
Issue Type: Bug
Reporter: J.Andreina
Assignee: J.Andreina
Steps to Reproduce:
===================
Step 1: Prepare rolling upgrade using "hdfs dfsadmin -rollingUpgrade prepare"
Step 2: Shut down the SNN and NN.
Step 3: Start NN with the "hdfs namenode -rollingUpgrade started" option.
Step 4: Execute "hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT>
upgrade" and restart the Datanode.
Step 5: Write 3 files to HDFS (block IDs assigned: blk_1073741831_1007,
blk_1073741832_1008, blk_1073741833_1009).
Step 6: Shut down both the NN and the DN.
Step 7: Start the NNs with the "hdfs namenode -rollingUpgrade rollback" option.
Start the DNs with the "-rollback" option.
Step 8: Write 2 files to HDFS.
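The steps above can be written down as a small checklist. This is only a sketch in Python, not part of the original report: the function name `repro_plan` is hypothetical, it only lists the commands quoted in the steps (it does not run them), and steps for which the report gives no exact command are left as manual steps.

```python
def repro_plan(datanode_ipc="<DATANODE_HOST:IPC_PORT>"):
    """Return the eight steps as (description, command) pairs.

    command is None where the report names no exact command.
    """
    return [
        ("Step 1: prepare the rolling upgrade",
         "hdfs dfsadmin -rollingUpgrade prepare"),
        ("Step 2: shut down SNN and NN", None),
        ("Step 3: start NN in rolling-upgrade mode",
         "hdfs namenode -rollingUpgrade started"),
        ("Step 4: shut down the Datanode for upgrade, then restart it",
         f"hdfs dfsadmin -shutdownDatanode {datanode_ipc} upgrade"),
        ("Step 5: write 3 files to HDFS", None),
        ("Step 6: shut down both NN and DN", None),
        ("Step 7: roll back NN (start DNs with -rollback)",
         "hdfs namenode -rollingUpgrade rollback"),
        ("Step 8: write 2 files to HDFS", None),
    ]


# Print the checklist; nothing is executed against a cluster.
for description, command in repro_plan():
    print(f"{description}: {command or '(manual step)'}")
```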
Issue:
=======
The client write failed with the exception below (from the Datanode log):
{noformat}
2015-02-23 16:00:12,896 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741832_1008 src: /XXXXXXXXXXX:48545 dest: /XXXXXXXXXXX:50010
2015-02-23 16:00:12,897 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741832_1008 received exception org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741832_1008 already exists in state FINALIZED and thus cannot be created.
{noformat}
Observations:
=============
1. On the Namenode side, block invalidation was sent for only 2 of the 3 blocks
written in Step 5; blk_1073741832_1008 was not invalidated.
{noformat}
15/02/23 14:59:56 INFO BlockStateChange: BLOCK* InvalidateBlocks: add blk_1073741833_1009 to XXXXXXXXXXX:50010
15/02/23 14:59:56 INFO BlockStateChange: BLOCK* InvalidateBlocks: add blk_1073741831_1007 to XXXXXXXXXXX:50010
{noformat}
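The gap in Observation 1 can be checked mechanically. The following is a minimal Python sketch, not part of the original report: it assumes the Namenode log uses the "InvalidateBlocks: add <block> to <datanode>" form shown above, and the helper name `invalidated_blocks` is hypothetical.

```python
import re

# Blocks written in Step 5, while the rolling upgrade was in progress.
WRITTEN_DURING_UPGRADE = {
    "blk_1073741831_1007",
    "blk_1073741832_1008",
    "blk_1073741833_1009",
}

# Namenode log excerpt from Observation 1.
NN_LOG = """\
15/02/23 14:59:56 INFO BlockStateChange: BLOCK* InvalidateBlocks: add blk_1073741833_1009 to XXXXXXXXXXX:50010
15/02/23 14:59:56 INFO BlockStateChange: BLOCK* InvalidateBlocks: add blk_1073741831_1007 to XXXXXXXXXXX:50010
"""


def invalidated_blocks(log_text):
    """Return the set of block names queued for invalidation (hypothetical helper)."""
    # \s+ also tolerates a line break between "add" and the block name.
    return set(re.findall(r"InvalidateBlocks: add\s+(blk_\d+_\d+)", log_text))


# Blocks written during the upgrade that the Namenode never asked to delete.
not_invalidated = WRITTEN_DURING_UPGRADE - invalidated_blocks(NN_LOG)
print(sorted(not_invalidated))  # ['blk_1073741832_1008']
```

The one block left over is the same block named in the ReplicaAlreadyExistsException.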
2. The fsck report shows no information for blk_1073741832_1008.
{noformat}
FSCK started by Rex (auth:SIMPLE) from /XXXXXXXXXXX for path / at Mon Feb 23 16:17:57 CST 2015
/File1: Under replicated BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741825_1001. Target Replicas is 3 but found 1 replica(s).
/File11: Under replicated BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741827_1003. Target Replicas is 3 but found 1 replica(s).
/File2: Under replicated BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741826_1002. Target Replicas is 3 but found 1 replica(s).
/AfterRollback_2: Under replicated BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741831_1007. Target Replicas is 3 but found 1 replica(s).
/Test1: Under replicated BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741828_1004. Target Replicas is 3 but found 1 replica(s).
Status: HEALTHY
Total size: 31620 B
Total dirs: 7
Total files: 6
Total symlinks: 0
Total blocks (validated): 5 (avg. block size 6324 B)
Minimally replicated blocks: 5 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 5 (100.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 1.0
Corrupt blocks: 0
Missing replicas: 10 (66.666664 %)
Number of data-nodes: 1
Number of racks: 1
FSCK ended at Mon Feb 23 16:17:57 CST 2015 in 3 milliseconds
{noformat}
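As a cross-check of the fsck report above, the short Python sketch below (an illustration, not part of the original report) confirms that blk_1073741832_1008 is not among the reported blocks and that the summary numbers are self-consistent. It assumes the "Missing replicas" percentage is taken relative to the expected replica count (blocks times target replication); the variable names are hypothetical.

```python
# (path, block) pairs copied from the fsck report.
FSCK_BLOCKS = {
    "/File1": "blk_1073741825_1001",
    "/File11": "blk_1073741827_1003",
    "/File2": "blk_1073741826_1002",
    "/AfterRollback_2": "blk_1073741831_1007",
    "/Test1": "blk_1073741828_1004",
}

# Block named in the ReplicaAlreadyExistsException.
suspect = "blk_1073741832_1008"
known_to_fsck = suspect in FSCK_BLOCKS.values()

# Every reported block has 1 replica against a target of 3.
target_replicas = 3
found_replicas = 1
total_blocks = len(FSCK_BLOCKS)

missing = total_blocks * (target_replicas - found_replicas)
expected = total_blocks * target_replicas  # assumed denominator
missing_pct = 100.0 * missing / expected

print(known_to_fsck)   # False
print(total_blocks)    # 5
print(missing)         # 10
print(round(missing_pct, 2))
```

Under that assumption the computed values agree with "Total blocks (validated): 5" and "Missing replicas: 10 (66.666664 %)" in the report.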