[ https://issues.apache.org/jira/browse/HADOOP-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667462#action_12667462 ]

Doug Judd commented on HADOOP-4379:
-----------------------------------

I tried the test again and still no luck.  To recap, here's how the log file is 
created:

out_stream.write(header, 0, 7);
out_stream.sync();
out_stream.write(data, 0, amount);
out_stream.sync();
[...]
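The same write-then-sync pattern can be sketched in self-contained Java against the local filesystem (using java.io rather than Hadoop's FSDataOutputStream, and a hypothetical file name), to show the visibility guarantee a reader would expect after a sync: bytes flushed before the writer dies should be visible on reopen. On a local FS this holds; the bug here is that HDFS's sync() does not yet provide it.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class SyncVisibility {
    public static void main(String[] args) throws IOException {
        File log = new File("0.log");        // hypothetical stand-in for range_txn/0.log
        byte[] header = new byte[7];         // 7-byte header, as in the report
        byte[] data = "payload".getBytes();  // 7 bytes of log data

        FileOutputStream out = new FileOutputStream(log);
        out.write(header, 0, header.length);
        out.getFD().sync();                  // local analogue of out_stream.sync()
        out.write(data, 0, data.length);
        out.getFD().sync();
        // close() is deliberately skipped to mimic 'kill -9' on the writer

        // a new reader checks what is visible after the "crash"
        System.out.println(log.length());    // prints 14: header + data both visible
    }
}
```

With HDFS 0.19 at the time of this report, the analogous read after a killed writer sees only the header (or, with this patch, a 0-byte file), which is exactly the gap being discussed.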

After the test finished, I shut down the Hypertable servers.  This time the 
listing shows the files to be 0 bytes in length (as opposed to 7 bytes with the 
previous patch):

[d...@motherlode000 aol-basic]$ hadoop fs -ls 
/hypertable/servers/10.0.30.1*_38060/log/range_txn
-rw-r--r--   3 doug supergroup          0 2009-01-26 11:40 
/hypertable/servers/10.0.30.102_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          0 2009-01-26 11:40 
/hypertable/servers/10.0.30.104_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          0 2009-01-26 11:40 
/hypertable/servers/10.0.30.106_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          0 2009-01-26 11:40 
/hypertable/servers/10.0.30.108_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          0 2009-01-26 11:40 
/hypertable/servers/10.0.30.110_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          0 2009-01-26 11:40 
/hypertable/servers/10.0.30.112_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          0 2009-01-26 11:40 
/hypertable/servers/10.0.30.114_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          0 2009-01-26 11:40 
/hypertable/servers/10.0.30.116_38060/log/range_txn/0.log

When the RangeServer starts up again, it discovers that the log file 
(range_txn/0.log) does exist, so it starts the recovery process.  However, it 
sees only the 7-byte header; none of the subsequent log appends appear in the 
log file.  So the system starts up without recovering any of the data.

BTW, in this particular circumstance, there is no other writer writing to the 
file when the range server comes up and reads it.  Here's the high-level view 
of what's going on:

RangeServer opens an FSDataOutputStream to the log and starts appending to it
RangeServer is killed with 'kill -9'
RangeServer comes up again and reads the log

In your above note you said, "A reader checks to see if the file is being 
written to by another writer. if so, it fetches the size of the last block from 
the primary datanode."  This is not the case in our test: there is no writer 
writing to the log when we try to read it.

- Doug


> In HDFS, sync() not yet guarantees data available to the new readers
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4379
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4379
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: dhruba borthakur
>             Fix For: 0.19.1
>
>         Attachments: 4379_20081010TC3.java, fsyncConcurrentReaders.txt, 
> fsyncConcurrentReaders3.patch, Reader.java, Writer.java
>
>
> In the append design doc 
> (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it 
> says
> * A reader is guaranteed to be able to read data that was 'flushed' before 
> the reader opened the file
> However, this feature is not yet implemented.  Note that the operation 
> 'flushed' is now called "sync".

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
