On Thu, Feb 15, 2018 at 9:49 PM, Bobby Evans <ev...@oath.com.invalid> wrote:
> I don't have the read only mode on disk full feature yet. I will look at
> pulling it back to our fork, but I will also look at fixing the journaling
> in general. Having spoken with the HDFS team here, they have seen a lot of
> scary things that appear similar to this situation when a disk starts to go
> bad. It would probably be in our best interest to guard against some of
> those things on the bookies too.

What scary things are the HDFS team doing?

One thing we do in the journal is preallocate the disk space before we
write to it. I remember, back in the day, this was mostly to get smoother
latency, as the filesystem would get less involved, but it should also
avoid the situation you described in your original email (unless the
filesystem is overcommitting, or there's some strange CoW stuff going on).

Also, I recall some changes that came in from Twitter that would pad each
write to the journal out to the expected block size (I don't think we
queried the actual size). That ensures you never try to rewrite a block,
which could corrupt data if you failed in the middle of a rewrite.

Of course, there's no guarantee that these things are bug-free, but they
should have handled the situation you described.
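
To make the two ideas concrete, here is a rough sketch of preallocation plus padded writes. It is not the actual journal code: it is Python rather than Java for brevity, and every name and size in it (BLOCK_SIZE, PREALLOC_SIZE, preallocate, padded, append) is made up for illustration. In particular, the 512-byte block size is an assumption, not something queried from the device.

```python
import os
import tempfile

BLOCK_SIZE = 512       # assumed block size; not queried from the disk
PREALLOC_SIZE = 4096   # assumed preallocation chunk, kept tiny for the demo


def write_all(fd, data):
    """Write all of data at the current file offset, looping on short writes."""
    written = 0
    while written < len(data):
        written += os.write(fd, data[written:])


def preallocate(fd, offset, length):
    """Reserve disk space for [offset, offset + length) before it is used."""
    try:
        os.posix_fallocate(fd, offset, length)
    except (AttributeError, OSError):
        # Fallback where posix_fallocate is missing or fails: write zeros.
        os.lseek(fd, offset, os.SEEK_SET)
        write_all(fd, b"\0" * length)


def padded(record):
    """Pad a record with zeros up to the next BLOCK_SIZE boundary."""
    remainder = len(record) % BLOCK_SIZE
    if remainder == 0:
        return record
    return record + b"\0" * (BLOCK_SIZE - remainder)


def append(fd, position, allocated, record):
    """Append one padded record; return (new position, new allocated size)."""
    data = padded(record)
    # Grow the preallocated region first, so the write below never has to
    # ask the filesystem for new space.
    while position + len(data) > allocated:
        preallocate(fd, allocated, PREALLOC_SIZE)
        allocated += PREALLOC_SIZE
    os.lseek(fd, position, os.SEEK_SET)
    write_all(fd, data)
    os.fsync(fd)
    # The next record starts on a fresh block, so no block is rewritten.
    return position + len(data), allocated


def demo():
    """Append two records to a temp file; return (position, allocated, size)."""
    fd, path = tempfile.mkstemp()
    try:
        position, allocated = 0, 0
        for record in (b"entry-1", b"x" * 600):
            position, allocated = append(fd, position, allocated, record)
        return position, allocated, os.fstat(fd).st_size
    finally:
        os.close(fd)
        os.remove(path)
```

The point of the split is that running out of space shows up in preallocate(), before any entry bytes are written, and that each record ends on a block boundary so a later write never touches a block that already holds data.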
-Ivan