[ https://issues.apache.org/jira/browse/HADOOP-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664986#action_12664986 ]

Doug Judd commented on HADOOP-4379:
-----------------------------------

Hi Dhruba,

I've been working with Luke a little on this.  Here are more details.  The log 
that gets written in the test is very small.  The first thing the software does 
when it creates the log is write a 7-byte header.  Then, as the test proceeds, 
the system appends a small entry and then does a sync.  We use the 
FSDataOutputStream class.  The sequence of operations looks something like this:

out_stream.write(data, 0, amount);
out_stream.sync();
[...]
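Fleshed out, the write path looks roughly like the following.  This is a minimal sketch assuming the Hadoop 0.19 FileSystem API (FSDataOutputStream.sync(), later renamed hflush()/hsync()); the class name, path, and header/entry bytes are illustrative, not our actual code:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RangeTxnLogWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Illustrative path; the real logs live under .../log/range_txn/
        Path logPath = new Path("/tmp/range_txn/0.log");
        FSDataOutputStream out = fs.create(logPath);

        // 1. On log creation, write the 7-byte header and sync it.
        byte[] header = new byte[7];          // placeholder header bytes
        out.write(header, 0, header.length);
        out.sync();

        // 2. As the test proceeds: append a small entry, then sync.
        byte[] entry = "entry".getBytes();
        out.write(entry, 0, entry.length);
        out.sync();

        // Per the append design doc, a reader opening the file after this
        // sync should see all synced bytes -- but as described below,
        // 'hadoop fs -ls' keeps reporting 7 bytes until HDFS is restarted.
        out.close();
        fs.close();
    }
}
```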

When the test completes, all of the logs are exactly 7 bytes long.  They remain 
that way even if I wait 10 minutes, or kill the Hypertable Java process and wait 
several more minutes.  Here is the listing:

[d...@motherlode000 aol-basic]$ hadoop fs -ls /hypertable/servers/10.0.30.1*_38060/log/range_txn
-rw-r--r--   3 doug supergroup          7 2009-01-17 19:52 /hypertable/servers/10.0.30.102_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          7 2009-01-17 19:52 /hypertable/servers/10.0.30.104_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          7 2009-01-17 19:52 /hypertable/servers/10.0.30.106_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          7 2009-01-17 19:52 /hypertable/servers/10.0.30.108_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          7 2009-01-17 19:52 /hypertable/servers/10.0.30.110_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          7 2009-01-17 19:52 /hypertable/servers/10.0.30.112_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          7 2009-01-17 19:52 /hypertable/servers/10.0.30.114_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup          7 2009-01-17 19:52 /hypertable/servers/10.0.30.116_38060/log/range_txn/0.log

After shutting down HDFS and restarting it again, the listing looks like this:

[d...@motherlode000 aol-basic]$ hadoop fs -ls /hypertable/servers/10.0.30.1*_38060/log/range_txn
-rw-r--r--   3 doug supergroup        564 2009-01-17 19:52 /hypertable/servers/10.0.30.102_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup         84 2009-01-17 19:52 /hypertable/servers/10.0.30.104_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup       1063 2009-01-17 19:52 /hypertable/servers/10.0.30.106_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup        634 2009-01-17 19:52 /hypertable/servers/10.0.30.108_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup        217 2009-01-17 19:52 /hypertable/servers/10.0.30.110_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup       1943 2009-01-17 19:52 /hypertable/servers/10.0.30.112_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup       1072 2009-01-17 19:52 /hypertable/servers/10.0.30.114_38060/log/range_txn/0.log
-rw-r--r--   3 doug supergroup        525 2009-01-17 19:52 /hypertable/servers/10.0.30.116_38060/log/range_txn/0.log

The last time I ran this test I encountered a problem where it appeared that 
some of our commits were lost.  Here's what I did:

1. ran tests (which create a table with 75274825 cells)
2. kill Hypertable
3. shutdown HDFS
4. restart HDFS
5. restart Hypertable (which re-plays the commit logs)
6. dumped the table

The table dump in #6 came up short (roughly 72M entries instead of the expected 
75,274,825).  It appears that some of the commit logs (a different log from the 
range_txn log) came back incomplete after the restart.
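The property we would want to verify is the one quoted below from the append design doc: a brand-new reader opened after a sync should see every synced byte.  A hedged sketch of such a check (same Hadoop 0.19 API assumptions as the writer snippet above; the class name and argument handling are illustrative):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SyncVisibilityCheck {
    // Count the bytes a fresh reader can actually read, independent of the
    // (possibly stale) length reported by 'hadoop fs -ls'.
    static long visibleLength(FileSystem fs, Path p) throws Exception {
        FSDataInputStream in = fs.open(p);
        byte[] buf = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buf)) > 0) {
            total += n;
        }
        in.close();
        return total;
    }

    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path logPath = new Path(args[0]);  // e.g. .../log/range_txn/0.log
        // After the writer's sync, this count should cover all synced
        // bytes, not just the 7-byte header.
        System.out.println(visibleLength(fs, logPath));
    }
}
```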

Let us know if you want us to run an instrumented version or anything.  We can 
send you the Hadoop log files if that helps.  Thanks!

- Doug


> In HDFS, sync() not yet guarantees data available to the new readers
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4379
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4379
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: dhruba borthakur
>             Fix For: 0.19.1
>
>         Attachments: 4379_20081010TC3.java, fsyncConcurrentReaders.txt
>
>
> In the append design doc 
> (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it 
> says
> * A reader is guaranteed to be able to read data that was 'flushed' before 
> the reader opened the file
> However, this feature is not yet implemented.  Note that the operation 
> 'flushed' is now called "sync".

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.