Don't fill preallocated portion of edits log with 0x00
------------------------------------------------------

                 Key: HDFS-1846
                 URL: https://issues.apache.org/jira/browse/HDFS-1846
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: name-node
    Affects Versions: 0.23.0
            Reporter: Aaron T. Myers
            Assignee: Aaron T. Myers


HADOOP-2330 added a feature to preallocate space in the local file system for 
the NN transaction log. That change seeks past the current end of the file and 
writes out some data, which on most systems results in the intervening data in 
the file being filled with zeros. Most underlying file systems have special 
handling for sparse files, and don't actually allocate blocks on disk for 
blocks of a file which consist completely of 0x00.
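The seek-past-EOF behavior described above can be sketched as follows. This is a hypothetical demo, not the actual HADOOP-2330 code; the file name and sizes are made up for illustration. Seeking past the current end of the file and writing a byte grows the file, and the intervening region reads back as 0x00 while most file systems store it as a hole without allocating disk blocks:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class SparsePreallocate {
    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("edits", ".log");
        try (RandomAccessFile raf = new RandomAccessFile(p.toFile(), "rw")) {
            raf.seek(1024 * 1024);   // jump 1 MB past the current EOF
            raf.write(0);            // forces the file length to grow
        }
        byte[] data = Files.readAllBytes(p);
        // The skipped-over region reads back as zeros even though the
        // file system may not have allocated any blocks for it.
        System.out.println("length=" + data.length);   // 1048577
        System.out.println("first byte=" + data[0]);   // 0
        Files.delete(p);
    }
}
```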

I've seen cases in the wild where the volume an edits dir is on fills up, 
resulting in a partial final transaction being written out to disk. If you 
examine the bytes of this (now corrupt) edits file, you'll see the partial 
final transaction followed by a lot of zeros, suggesting that the preallocation 
previously succeeded before the volume ran out of space. If we instead fill the 
preallocated space with something other than zeros, the file system must 
allocate blocks up front, so the failure would likely surface at preallocation 
time rather than at transaction-writing time, causing the NN to fail earlier 
and without a partial transaction being written out.
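A minimal sketch of the proposed change, assuming a hypothetical preallocate helper (not the real FSEditLog code): writing actual non-zero filler bytes over the preallocated region forces the file system to allocate blocks immediately, so a full volume produces an IOException at preallocation time rather than mid-transaction:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class FilledPreallocate {
    // Hypothetical helper: append `size` bytes of 0xFF filler in chunks.
    // Because real data is written, ENOSPC surfaces here, at preallocation
    // time, instead of later while a transaction is being written out.
    static void preallocate(RandomAccessFile raf, long size) throws IOException {
        byte[] filler = new byte[4096];
        Arrays.fill(filler, (byte) 0xFF);
        raf.seek(raf.length());
        long remaining = size;
        while (remaining > 0) {
            int n = (int) Math.min(filler.length, remaining);
            raf.write(filler, 0, n);  // fails fast if the volume is full
            remaining -= n;
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("edits", ".log");
        try (RandomAccessFile raf = new RandomAccessFile(p.toFile(), "rw")) {
            preallocate(raf, 1024 * 1024);
        }
        byte[] data = Files.readAllBytes(p);
        System.out.println("length=" + data.length);           // 1048576
        System.out.println("first byte=" + (data[0] & 0xFF));  // 255
        Files.delete(p);
    }
}
```

A corrupt edits file then becomes easier to diagnose as well: a valid transaction stream followed by 0xFF filler is unambiguous, whereas trailing zeros cannot be distinguished from a partially written all-zero region.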

I also hypothesize that filling the preallocated space in the edits log with 
something other than 0x00 will result in a performance improvement in NN 
throughput. I haven't tested this yet, but I intend to as part of this JIRA.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira