> One thing that should circumvent this is that the bookie should go
> into readonly mode when it hits 95% full disk.

I think this only applies to the ledgers disk, not to the journal.

And, to answer Bobby, the switch to read-only mode was already
present in 4.3 (again, just for the storage device).

Matteo

On Thu, Feb 15, 2018 at 2:56 PM Ivan Kelly <iv...@apache.org> wrote:

> On Thu, Feb 15, 2018 at 9:49 PM, Bobby Evans <ev...@oath.com.invalid>
> wrote:
> > I don't have the read only mode on disk full feature yet. I will look at
> > pulling it back to our fork, but I will also look at fixing the journaling
> > in general. Having spoken with the HDFS team here, they have seen a lot of
> > scary things that appear similar to this situation when a disk starts to go
> > bad. It would probably be in our best interest to guard against some of
> > those things on the bookies too.
>
> What scary things are the HDFS team doing? One thing we are doing in
> the journal is that we preallocate the disk before we write to it. I
> remember, back in the day, this was mostly to get smoother latency, as
> the filesystem would get less involved, but this should also avoid the
> situation you described in your original email (unless the filesystem
> is overcommitting, or there's some strange CoW stuff going on). Also, I
> recall some changes that came in from Twitter that would pad each
> write to the journal out to the expected block size (I don't think we
> queried the actual size), which would ensure that you didn't try to
> rewrite a block, which could corrupt data if you failed in the middle
> of a rewrite. Of course, there's no guarantee that these things are
> bug free, but they should have handled the situation you described.
>
> -Ivan
>

--
Matteo Merli <mme...@apache.org>
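
For readers following the thread, here is a rough sketch of the padding idea Ivan recalls above: each journal write is rounded up to a multiple of an expected block size, so a later write never has to rewrite a partially filled block. This is not BookKeeper's actual code. The class name, method names, and the 512-byte alignment are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of BookKeeper.
class JournalPadding {
    // Assumed block size; the thread notes the real size was not queried.
    static final int ALIGNMENT = 512;

    // Round length up to the next multiple of alignment.
    static int paddedLength(int length, int alignment) {
        int remainder = length % alignment;
        return remainder == 0 ? length : length + (alignment - remainder);
    }

    // Copy the payload into a zero-filled buffer whose size is aligned.
    static ByteBuffer pad(byte[] payload, int alignment) {
        int total = paddedLength(payload.length, alignment);
        ByteBuffer buf = ByteBuffer.allocate(total); // allocate() zero-fills
        buf.put(payload);
        buf.position(total); // keep the trailing zero padding in the write
        buf.flip();          // ready to be written: position 0, limit total
        return buf;
    }
}
```

With this sketch, a 100-byte entry would occupy one full 512-byte block and a 513-byte entry two blocks, so every write starts on a fresh block boundary. A real journal would also need some way to tell padding apart from entry data on replay, which is not shown here.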