[ https://issues.apache.org/jira/browse/HDFS-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Colin Patrick McCabe updated HDFS-3510:
---------------------------------------

    Attachment: HDFS-3510-b1.002.patch

* fill with all 0xff, so that we don't have to handle OP_INVALID specially. Also, technically it is undefined what byte pattern ByteBuffer.allocateDirect fills the buffer with, although it is 0 in practice in Oracle implementations. In JDK7 they specified that it zero-fills.

> FSEditLog pre-allocation does not work in branch-1
> --------------------------------------------------
>
>                 Key: HDFS-3510
>                 URL: https://issues.apache.org/jira/browse/HDFS-3510
>             Project: Hadoop HDFS
>          Issue Type: Bug
>   Affects Versions: 1.0.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>             Fix For: 1.0.0
>
>         Attachments: HDFS-3510-b1.001.patch, HDFS-3510-b1.002.patch
>
>
> In the FSEditLog, we want to avoid running out of space in the middle of
> writing an edit log operation to the disk. We do this by a process called
> "preallocation"-- reserving space on the disk for the upcoming edit log
> entries before beginning to write them.
> branch-1 has some major problems with the way it does preallocation. These
> problems can lead to corrupt edit logs when the disk runs out of space during
> an edit log sync operation.
> The problems are:
> * We use FileChannel#write without checking for short writes, but
> WritableByteChannel explicitly documents that they are possible, and the
> FileChannel subclass is silent on the issue.
> * We only try to do preallocation when the current position is less than 4096
> bytes from the end of the file. However, bufReady starts out at 512kb, and
> only gets bigger from there. There is no way that 4kb is enough space to
> reserve.
> * The current code seems to be based on a misunderstanding of how space is
> allocated in files in Linux. In FileChannel#write(ByteBuffer, long), the
> second argument is the offset to start writing at. Since we set this to
> fc.position() + 1024*1024, this means that we *start* writing a megabyte
> after the end of the file. This is guaranteed to create a sparse file on
> Linux, defeating the point of pre-allocation.
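
For illustration only, here is a minimal sketch of how the three problems listed above could be avoided. It is not the code in the attached patch. The class name PreallocSketch, the method names, and the 1 MB chunk size are assumptions made for this example. It loops until FileChannel#write has consumed the whole buffer, reserves as much space as the caller says it is about to write, and starts filling at the current end of the file so that no hole is created. The fill byte is 0xff, as mentioned in the comment above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.Arrays;

public class PreallocSketch {
    // Hypothetical values chosen for this sketch, not taken from the patch.
    static final int FILL_CHUNK = 1024 * 1024;
    static final byte FILL_BYTE = (byte) 0xff;

    /**
     * Writes all remaining bytes of buf starting at the given file offset.
     * A single FileChannel#write call may write fewer bytes than requested,
     * so keep calling until the buffer is drained.
     */
    static void writeFully(FileChannel fc, ByteBuffer buf, long offset)
            throws IOException {
        while (buf.hasRemaining()) {
            offset += fc.write(buf, offset);
        }
    }

    /**
     * Makes sure at least `needed` bytes exist between the channel's current
     * position and the end of the file. Fill data is appended in whole chunks
     * starting exactly at the current end of file, so the file is extended
     * contiguously. The channel's position is not changed.
     */
    static void preallocate(FileChannel fc, long needed) throws IOException {
        long target = fc.position() + needed;
        long end = fc.size();
        if (end >= target) {
            return; // enough space is already reserved
        }
        ByteBuffer fill = ByteBuffer.allocate(FILL_CHUNK);
        Arrays.fill(fill.array(), FILL_BYTE);
        while (end < target) {
            fill.clear(); // reset position and limit; contents stay 0xff
            writeFully(fc, fill, end);
            end += fill.capacity();
        }
    }
}
```

A caller would pass the number of bytes it is about to flush (for example, the size of the ready buffer) as `needed` before each sync, rather than a fixed 4096-byte threshold.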