[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file

sam rash (JIRA) Wed, 18 May 2011 17:22:32 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035884#comment-13035884
 ]


sam rash commented on HDFS-1057:
--------------------------------

I assume this is a similar problem to the earlier one, where code that opened 
RPC proxies to DNs did not close them in a finally block.  The test failure 
output indicates a socket/fd leak ("Too many open files").

https://issues.apache.org/jira/browse/HDFS-1310
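
For illustration only, here is a minimal sketch of the cleanup pattern described above. It does not use the real Hadoop classes: FakeDatanodeProxy, getBlockLength, stop and the openProxies counter are hypothetical stand-ins for an RPC proxy to a DN and the socket/fd it holds. The point is only that the release call sits in a finally block, so the failure path cannot leak the handle.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ProxyCleanupSketch {
    // Counts proxies that are currently open; stands in for open sockets/fds.
    static final AtomicInteger openProxies = new AtomicInteger();

    // Hypothetical stand-in for an RPC proxy to a datanode (not a Hadoop class).
    static final class FakeDatanodeProxy {
        FakeDatanodeProxy() {
            openProxies.incrementAndGet();
        }

        long getBlockLength(boolean fail) {
            if (fail) {
                throw new IllegalStateException("simulated RPC failure");
            }
            return 42L;
        }

        void stop() {
            openProxies.decrementAndGet();
        }
    }

    // Always releases the proxy, even when the call throws.
    static long queryBlockLength(boolean fail) {
        FakeDatanodeProxy proxy = new FakeDatanodeProxy();
        try {
            return proxy.getBlockLength(fail);
        } finally {
            proxy.stop();
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10000; i++) {
            try {
                queryBlockLength(i % 2 == 0);
            } catch (IllegalStateException expected) {
                // The failure path must not leak the proxy.
            }
        }
        System.out.println("open proxies after loop: " + openProxies.get());
    }
}
```

If stop() were called only after a successful return, every failed call in the loop would leave one proxy open, which is the kind of accumulation that eventually surfaces as "Too many open files".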

The test was succeeding 8 months ago (2010-09-10), so I'd look at commits that 
came after that date.  



> Concurrent readers hit ChecksumExceptions if following a writer to very end 
> of file
> -----------------------------------------------------------------------------------
>
>                 Key: HDFS-1057
>                 URL: https://issues.apache.org/jira/browse/HDFS-1057
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: data-node
>    Affects Versions: 0.20-append, 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: sam rash
>            Priority: Blocker
>             Fix For: 0.20-append, 0.21.0, 0.22.0
>
>         Attachments: HDFS-1057-0.20-append.patch, 
> conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, 
> conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, 
> hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, 
> hdfs-1057-trunk-6.txt
>
>
> In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before 
> calling flush(). Therefore, if there is a concurrent reader, it's possible to 
> race here - the reader will see the new length while those bytes are still in 
> the buffers of BlockReceiver. Thus the client will potentially see checksum 
> errors or EOFs. Additionally, the last checksum chunk of the file is made 
> accessible to readers even though it is not stable.
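
The ordering problem in the description can be sketched roughly as follows. This is not the BlockReceiver code or the attached patch: the class, its fields and the publishBeforeFlush flag are hypothetical, a ByteArrayOutputStream stands in for the block file, and the sketch does not cover the unstable last checksum chunk. It only shows how many advertised bytes are still sitting in the writer's buffer at the moment the length becomes visible.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class FlushBeforePublishSketch {
    // Stands in for the on-disk block file.
    final ByteArrayOutputStream disk = new ByteArrayOutputStream();
    // Buffer in front of the "disk", like the writer's stream buffers.
    final BufferedOutputStream out = new BufferedOutputStream(disk, 4096);
    // Length advertised to concurrent readers.
    volatile long bytesOnDisk = 0;
    long bytesReceived = 0;

    // Returns how many advertised bytes were not yet on "disk" at publish time.
    long receivePacket(byte[] packet, boolean publishBeforeFlush) {
        try {
            out.write(packet, 0, packet.length);
            bytesReceived += packet.length;
            if (publishBeforeFlush) {
                bytesOnDisk = bytesReceived;        // racy: length is visible first
                long gap = bytesOnDisk - disk.size(); // what a reader could observe here
                out.flush();
                return gap;
            }
            out.flush();                            // data reaches the file first
            bytesOnDisk = bytesReceived;            // only then is the length advertised
            return bytesOnDisk - disk.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] packet = new byte[512];
        long racyGap = new FlushBeforePublishSketch().receivePacket(packet, true);
        long safeGap = new FlushBeforePublishSketch().receivePacket(packet, false);
        System.out.println("publish before flush, missing bytes: " + racyGap);
        System.out.println("flush before publish, missing bytes: " + safeGap);
    }
}
```

With the length published first, a reader that trusts bytesOnDisk would try to read 512 bytes that are not in the file yet, which matches the checksum errors or EOFs described above; with the flush first, the advertised length never runs ahead of the data.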

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
