[ https://issues.apache.org/jira/browse/HDFS-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748236#action_12748236 ]

ryan rawson commented on HDFS-200:
----------------------------------

I've been testing this on my 20 node cluster here. Lease recovery can take a 
long time, which is a bit of an issue. The sync seems to be pretty good 
overall; we are recovering most of the edits up until the last flush, and it's 
pretty responsive.

However, I have discovered a new bug. The scenario is like so:
- We roll the logs at under 1 MB each (the block size).
- We now have 18 logs to recover. The first 17 were closed properly; only the 
last one was in mid-write.
- During log recovery, the hbase master calls fs.append(f); out.close(); (see 
the sketch below).
- But the master gets stuck at the out.close(); it can't seem to progress. 
Investigating the logs, it looks like the namenode 'forgets' about the other 2 
replicas for the block (the file is 1 block), and thus we are stuck 
until another replica comes back.
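
For reference, the recovery path boils down to something like this (a minimal 
sketch, not the actual master code; it assumes a FileSystem handle and the log 
Path are already in hand):

    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class LogRecoverySketch {
      // Re-open the half-written log for append to force lease recovery,
      // then close immediately; we only want the file finalized, not to
      // write anything new.
      public static void recoverLog(FileSystem fs, Path f) throws Exception {
        FSDataOutputStream out = fs.append(f);
        out.close(); // hangs here: the namenode has 'forgotten' the other
                     // replicas of the last block, so close() blocks until
                     // a replica re-registers
      }
    }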

I've attached logs, hadoop fsck output, and stack traces from hbase. 

> In HDFS, sync() not yet guarantees data available to the new readers
> --------------------------------------------------------------------
>
>                 Key: HDFS-200
>                 URL: https://issues.apache.org/jira/browse/HDFS-200
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: dhruba borthakur
>            Priority: Blocker
>         Attachments: 4379_20081010TC3.java, fsyncConcurrentReaders.txt, 
> fsyncConcurrentReaders11_20.txt, fsyncConcurrentReaders12_20.txt, 
> fsyncConcurrentReaders13_20.txt, fsyncConcurrentReaders14_20.txt, 
> fsyncConcurrentReaders3.patch, fsyncConcurrentReaders4.patch, 
> fsyncConcurrentReaders5.txt, fsyncConcurrentReaders6.patch, 
> fsyncConcurrentReaders9.patch, 
> hadoop-stack-namenode-aa0-000-12.u.powerset.com.log.gz, 
> hdfs-200-ryan-existing-file-fail.txt, hypertable-namenode.log.gz, 
> namenode.log, namenode.log, Reader.java, Reader.java, reopen_test.sh, 
> ReopenProblem.java, Writer.java, Writer.java
>
>
> In the append design doc 
> (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it 
> says
> * A reader is guaranteed to be able to read data that was 'flushed' before 
> the reader opened the file
> However, this feature is not yet implemented.  Note that the operation 
> 'flushed' is now called "sync".
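
To make the missing guarantee concrete, the expected behavior looks roughly 
like this (my sketch, using the 0.20-era FSDataOutputStream.sync() API; the 
path and byte count are hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SyncVisibilitySketch {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path f = new Path("/tmp/sync-visibility-test"); // hypothetical path

        FSDataOutputStream out = fs.create(f);
        out.writeBytes("hello\n");
        out.sync(); // the design doc's 'flush'; the file stays open

        // Per the design doc, a reader that opens the file *after* the
        // sync() should see the six bytes above; with the current
        // implementation it may see a zero-length file instead.
        FSDataInputStream in = fs.open(f);
        byte[] buf = new byte[6];
        in.readFully(buf); // expected to succeed once this issue is fixed
        in.close();
        out.close();
      }
    }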

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
