[ https://issues.apache.org/jira/browse/HDFS-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242382#comment-13242382 ]
Uma Maheswara Rao G commented on HDFS-200:
------------------------------------------

Hi Dhruba,

It looks like the following code creates the problem in one special condition.

{code}
BlockInfo storedBlock = blocksMap.getStoredBlock(block);
+    if (storedBlock == null) {
+      // if the block with a WILDCARD generation stamp matches and the
+      // corresponding file is under construction, then accept this block.
+      // This block has a different generation stamp on the datanode
+      // because of a lease-recovery-attempt.
+      Block nblk = new Block(block.getBlockId());
+      storedBlock = blocksMap.getStoredBlock(nblk);
+      if (storedBlock != null && storedBlock.getINode() != null &&
+          (storedBlock.getGenerationStamp() <= block.getGenerationStamp() ||
+           storedBlock.getINode().isUnderConstruction())) {
+        NameNode.stateChangeLog.info("BLOCK* NameSystem.addStoredBlock: "
+            + "addStoredBlock request received for " + block + " on "
+            + node.getName() + " size " + block.getNumBytes()
+            + " and it belongs to a file under construction. ");
+      } else {
+        storedBlock = null;
+      }
{code}

The events are as follows:
1) DN1->DN2->DN3 are in the pipeline with genstamp 1.
2) The client finishes writing and closes the file.
3) DN3 is killed.
4) The file is reopened for append.
5) The pipeline now contains DN1->DN2 with genstamp 2.
6) The client writes some more data.
7) DN3 is restarted. Its replica is still in the current directory, because the block was already finalized before.
8) DN3 triggers a block report.
9) Since the block with genstamp 1 is no longer in the BlocksMap, the code above falls back to a lookup with a WILDCARD generation stamp, and finds the stored block, which carries the newer genstamp (2).
10) Since the file is under construction, it simply accepts the block and records DN3 in the BlocksMap.

The problem is that if a client is directed to DN3 for a read, the read will fail: the NN hands out the block ID with the latest genstamp (2), but DN3 only has the replica with genstamp 1. Of course, the data is also inconsistent. (A small runnable sketch replaying these events follows the quoted issue text below.)

Thanks,
Uma

> In HDFS, sync() not yet guarantees data available to the new readers
> --------------------------------------------------------------------
>
>                 Key: HDFS-200
>                 URL: https://issues.apache.org/jira/browse/HDFS-200
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 0.20-append
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.20-append, 0.20.205.0
>
>         Attachments: 4379_20081010TC3.java, HDFS-200.20-security.1.patch,
> Reader.java, Reader.java, ReopenProblem.java, Writer.java, Writer.java,
> checkLeases-fix-1.txt, checkLeases-fix-unit-test-1.txt,
> fsyncConcurrentReaders.txt, fsyncConcurrentReaders11_20.txt,
> fsyncConcurrentReaders12_20.txt, fsyncConcurrentReaders13_20.txt,
> fsyncConcurrentReaders14_20.txt, fsyncConcurrentReaders15_20.txt,
> fsyncConcurrentReaders16_20.txt, fsyncConcurrentReaders3.patch,
> fsyncConcurrentReaders4.patch, fsyncConcurrentReaders5.txt,
> fsyncConcurrentReaders6.patch, fsyncConcurrentReaders9.patch,
> hadoop-stack-namenode-aa0-000-12.u.powerset.com.log.gz,
> hdfs-200-ryan-existing-file-fail.txt, hypertable-namenode.log.gz,
> namenode.log, namenode.log, reopen_test.sh
>
> In the append design doc
> (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it
> says:
> * A reader is guaranteed to be able to read data that was 'flushed' before
> the reader opened the file
> However, this feature is not yet implemented. Note that the operation
> 'flushed' is now called "sync".
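To make the failure mode concrete, here is a minimal, self-contained sketch that replays steps 1-10 against the quoted condition. The types here (StoredBlock, accepts) are simplified stand-ins invented for illustration, not the real BlocksMap/INodeFile API:

{code}
// Sketch only: hypothetical simplified types, not the real NameNode classes.
public class StaleReplicaSketch {

  // Stand-in for the block entry held in the BlocksMap.
  static class StoredBlock {
    final long blockId;
    final long genStamp;
    final boolean underConstruction; // stand-in for getINode().isUnderConstruction()

    StoredBlock(long blockId, long genStamp, boolean underConstruction) {
      this.blockId = blockId;
      this.genStamp = genStamp;
      this.underConstruction = underConstruction;
    }
  }

  // The acceptance test from the quoted patch, reduced to its essentials.
  static boolean accepts(StoredBlock stored, long reportedGenStamp) {
    return stored != null
        && (stored.genStamp <= reportedGenStamp // reported replica as new or newer...
            || stored.underConstruction);       // ...or the file is under construction
  }

  public static void main(String[] args) {
    // After the append (steps 4-6): the BlocksMap entry carries genstamp 2
    // and the file is under construction again.
    StoredBlock stored = new StoredBlock(42L, 2L, true);

    // Steps 8-9: DN3 reports its stale, previously finalized replica (genstamp 1).
    long reportedGenStamp = 1L;

    // The genstamp comparison alone would reject it (2 <= 1 is false)...
    System.out.println("genstamp test: " + (stored.genStamp <= reportedGenStamp));
    // ...but the under-construction clause accepts it anyway (step 10).
    System.out.println("accepted: " + accepts(stored, reportedGenStamp));
  }
}
{code}

Running this prints "genstamp test: false" followed by "accepted: true", which is exactly the stale acceptance described in step 10: DN3's genstamp-1 replica is registered against a block whose current genstamp is 2.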
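For background, the quoted issue tracks the design-doc guarantee that data synced before a reader opens the file must be visible to that reader. A minimal sketch of the intended behaviour, assuming the 0.20-era FileSystem API (the path /tmp/sync-demo and the payload are invented for illustration):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SyncVisibility {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path p = new Path("/tmp/sync-demo"); // hypothetical path

    FSDataOutputStream out = fs.create(p);
    byte[] payload = "flushed data".getBytes("UTF-8");
    out.write(payload);
    out.sync(); // the operation the design doc calls 'flushed'

    // Per the design doc, a reader that opens the file *after* the sync
    // (while the writer still has it open) should see the synced bytes.
    // That visibility is the guarantee this issue says is not yet implemented.
    FSDataInputStream in = fs.open(p);
    byte[] buf = new byte[payload.length];
    in.readFully(0, buf);
    System.out.println(new String(buf, "UTF-8"));

    in.close();
    out.close();
    fs.close();
  }
}
{code}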