[ https://issues.apache.org/jira/browse/HDFS-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13043184#comment-13043184 ]
John George commented on HDFS-1907:
-----------------------------------

{noformat}
assert curOff >= blk.getStartOffset() : "Block not found";
{noformat}
The only way I can think of for this assert to be hit is if bytesRead becomes negative and/or the wrong block is returned.

{noformat}
TODO? Assert if the read request exceeds the length of the unfinalized + finalized blocks.
{noformat}
Sorry, but I do not get the point of this TODO. I do not see how it brings consistency. Can you elaborate a little more? I am also a little lost as to why the two asserts are related.

> BlockMissingException upon concurrent read and write: reader was doing file position read while writer is doing write without hflush
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1907
>                 URL: https://issues.apache.org/jira/browse/HDFS-1907
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>   Affects Versions: 0.23.0
>         Environment: Run on a real cluster. Using the latest 0.23 build.
>            Reporter: CW Chung
>            Assignee: John George
>         Attachments: HDFS-1907-2.patch, HDFS-1907-3.patch, HDFS-1907-4.patch, HDFS-1907-5.patch, HDFS-1907-5.patch, HDFS-1907.patch
>
>
> BlockMissingException is thrown under this test scenario:
> Two different processes do concurrent file r/w on the same file: one reads and the other writes.
>   - the writer keeps writing to the file
>   - the reader does positional reads from the beginning of the file to the visible end of file, repeatedly
> The reader is basically doing:
>   byteRead = in.read(currentPosition, buffer, 0, byteToReadThisRound);
> where currentPosition = 0, buffer is a byte array buffer, byteToReadThisRound = 1024*10000;
> Usually it does not fail right away. I have to read, close the file, and re-open the same file a few times to create the problem. I'll post a test program to repro this problem after I've cleaned up my current test program a bit.
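
To make the first assert in the comment above concrete, here is a minimal, self-contained sketch of a block-range walk that checks the same invariant. It is not the HDFS client code: the class BlockOffsetSketch, the Blk type, and the walk method are hypothetical stand-ins, and the loop structure is an assumption made for illustration. It only shows that the invariant curOff >= blk.getStartOffset() holds when the walk starts at the block containing the offset, and breaks when a later (wrong) block is returned.

```java
import java.util.List;

public class BlockOffsetSketch {

    /** Hypothetical stand-in for a located block: start offset and size within the file. */
    public static final class Blk {
        private final long startOffset;
        private final long size;

        public Blk(long startOffset, long size) {
            this.startOffset = startOffset;
            this.size = size;
        }

        public long getStartOffset() { return startOffset; }
        public long getSize() { return size; }
    }

    /**
     * Walks blocks starting at startIdx until `length` bytes from `offset` are covered.
     * Returns true if curOff >= blk.getStartOffset() held for every block visited,
     * false as soon as the invariant is violated.
     */
    public static boolean walk(List<Blk> blocks, int startIdx, long offset, long length) {
        long curOff = offset;
        long remaining = length;
        int blockIdx = startIdx;
        while (remaining > 0) {
            if (blockIdx >= blocks.size()) {
                throw new IllegalStateException("Could not get blocks");
            }
            Blk blk = blocks.get(blockIdx);
            if (curOff < blk.getStartOffset()) {
                return false; // this is where "Block not found" would fire
            }
            // Bytes this block contributes; negative if curOff is past the block's end.
            long bytesRead = blk.getStartOffset() + blk.getSize() - curOff;
            remaining -= bytesRead;
            curOff += bytesRead;
            blockIdx++;
        }
        return true;
    }

    public static void main(String[] args) {
        List<Blk> blocks = List.of(new Blk(0, 100), new Blk(100, 100), new Blk(200, 100));
        // Offset 150 lives in block index 1: the invariant holds.
        System.out.println(walk(blocks, 1, 150, 100));
        // Starting at block index 2 (the wrong block) violates the invariant.
        System.out.println(walk(blocks, 2, 150, 100));
    }
}
```

In this sketch, starting the walk at the correct block returns true, while starting one block too late returns false, which corresponds to the "wrong block is returned" case mentioned in the comment.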
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira