[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875801#action_12875801 ]
sam rash commented on HDFS-1057: -------------------------------- Thanks for the quick review. I understand most of the comments, but have a couple of questions: 1. replicaVisibleLength was here before I made any changes. Why is it not valid? I understood it to be an upper bound on the bytes that could be read from this block. Is it the case that start + length <= replicaVisibleLength and we want to optimize? (the for loop to wait for bytes on disk >= visible length was here before, I just moved it earlier in the constructor) 2. not sure I understand endOffset. This was again a variable that already existed. What I thought you were getting at was the condition to decide if we should use the in-memory checksum or not (which is what you describe). 3. If we don't put the sync set/get method in ReplicaInPipelineInterface, we will have to use an if/else construct on instanceof in BlockReceiver and call one or the other. I can see the argument for keeping the method out of the interface since it is RBW-specific, but on the other hand, it's effectively a no-op for other implementers of the interface and leads to cleaner code (better natural polymorphism then if-else constructs to force it). either way, just wanted to throw that out there as a question of style > Concurrent readers hit ChecksumExceptions if following a writer to very end > of file > ----------------------------------------------------------------------------------- > > Key: HDFS-1057 > URL: https://issues.apache.org/jira/browse/HDFS-1057 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: data-node > Affects Versions: 0.21.0, 0.22.0, 0.20-append > Reporter: Todd Lipcon > Assignee: sam rash > Priority: Blocker > Attachments: conurrent-reader-patch-1.txt, > conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, > hdfs-1057-trunk-1.txt > > > In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before > calling flush(). Therefore, if there is a concurrent reader, it's possible to > race here - the reader will see the new length while those bytes are still in > the buffers of BlockReceiver. Thus the client will potentially see checksum > errors or EOFs. Additionally, the last checksum chunk of the file is made > accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.