Bob,

On Fri, Nov 12, 2021 at 8:53 PM Bob Peterson <[email protected]> wrote:
> When journals are replayed they start from the last stable point in
> the journal. But an rgrp's glock lvb can be updated before the rgrp is
> stable in the journal, so another node can see newer lvb values that
> reflect changes made after the stable point in the journal.
>
> This patch changes function thaw_glock, which is called after journal
> recovery is complete, on every node, regardless of whether or not the
> node replayed the journal (and therefore the rgrps). There's no good
> way for any given node to determine if its rgrp glocks had been replayed
> by a different node from another node's journal, so it has no way to know
> if its lvbs are still valid. So as soon as it knows recovery is complete
> and the journals have been properly replayed, it zeroes out the lvbs
> for all rgrp glocks. This forces it to re-read the lvb the next time
> the glock is held.

This doesn't make sense to me. When looking at the journal, what matters
is where the journal ends (in other words, the point of the last journal
flush), not where it starts.

We really must make sure that the journal has been flushed to include
all the resource group changes before giving up the resource group
glock. That's when the local LVB changes become visible to the other
nodes. We should never have to go back in history, which is what this
patch essentially allows. If we always go forward, the resource group
LVBs will never "go invalid" in the first place.

Thanks,
Andreas

> Signed-off-by: Bob Peterson <[email protected]>
> ---
>  fs/gfs2/glock.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
> index 8dbd6fe66420..24c101287b70 100644
> --- a/fs/gfs2/glock.c
> +++ b/fs/gfs2/glock.c
> @@ -2161,6 +2161,8 @@ void gfs2_flush_delete_work(struct gfs2_sbd *sdp)
>
>  static void thaw_glock(struct gfs2_glock *gl)
>  {
> +	if (gl->gl_name.ln_type == LM_TYPE_RGRP)
> +		memset(gl->gl_lksb.sb_lvbptr, 0, sizeof(struct gfs2_rgrp_lvb));
>  	if (!test_and_clear_bit(GLF_FROZEN, &gl->gl_flags))
>  		return;
>  	if (!lockref_get_not_dead(&gl->gl_lockref))
> --
> 2.33.1
>
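
To illustrate the ordering argument in the reply (flush the journal before
giving up the resource group glock), here is a minimal toy model in plain C.
It is not GFS2 code: struct toy_rgrp and every toy_* function are
hypothetical names invented for this sketch, and a single free-block counter
stands in for the whole rgrp state. It only shows that, under these
assumptions, an LVB published after a journal flush can never be ahead of
what a journal replay restores.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_rgrp {
	unsigned int mem_free;     /* in-memory free block count on the holder */
	unsigned int journal_free; /* value covered by the last journal flush */
	unsigned int lvb_free;     /* value other nodes see through the LVB */
};

/* Start with memory, journal and LVB all in agreement. */
static void toy_init(struct toy_rgrp *rg, unsigned int free_blocks)
{
	rg->mem_free = free_blocks;
	rg->journal_free = free_blocks;
	rg->lvb_free = free_blocks;
}

/* The glock holder allocates blocks; only in-memory state changes. */
static void toy_alloc(struct toy_rgrp *rg, unsigned int n)
{
	assert(n <= rg->mem_free);
	rg->mem_free -= n;
}

/* A journal flush makes the current in-memory state recoverable. */
static void toy_log_flush(struct toy_rgrp *rg)
{
	rg->journal_free = rg->mem_free;
}

/*
 * Give up the glock, which publishes the LVB to other nodes.
 * flush_first selects the ordering argued for in the reply above.
 */
static void toy_release(struct toy_rgrp *rg, bool flush_first)
{
	if (flush_first)
		toy_log_flush(rg);
	rg->lvb_free = rg->mem_free;
}

/*
 * If the holder crashes now, replay restores journal_free.
 * The LVB is still valid only if it matches that value.
 */
static bool toy_lvb_valid_after_replay(const struct toy_rgrp *rg)
{
	return rg->lvb_free == rg->journal_free;
}
```

With flush_first set, toy_lvb_valid_after_replay() holds after every release,
so there is nothing to zero out after recovery. Without it, the LVB can
describe allocations the journal never recorded, which is the situation the
quoted patch tries to repair after the fact.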
