Hi,

Before this patch, function gfs2_rgrp_bh_get would only update the rgrp
info from the buffer if the GFS2_RDF_UPTODATE flag was clear. This went
back to the days when gfs2 kept rgrp version numbers, and re-read the
buffer_heads constantly, not just when needed.

The problem is, RDF_UPTODATE is a local flag, but lvbs are changed
dynamically throughout the cluster. This is a serious problem when
using the rgrplvb mount option because of scenarios like this:

1. Node A mounts the file system, sets RDF_UPTODATE for rgrp X.
2. Node B mounts the file system, sets RDF_UPTODATE for rgrp X.
3. Node A deletes a large file, freeing up lots of blocks, so the lvb
   gets updated.
4. Node B now re-reads the rgrp, but because it's marked UPTODATE,
   it decides not to update its in-core copy of rgrp X.

At this point, Node B will have the wrong value for rgd->rd_free,
the amount of free space in the rgrp.

But there's no good reason not to grab the most recent values from
the buffer: it only costs us a few cpu cycles to read them.

This patch removes the UPTODATE check in favor of just always reading
the rgrp values in from the buffer we just read.

Signed-off-by: Bob Peterson <rpete...@redhat.com>
---
 fs/gfs2/rgrp.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 1ad3256b9cbc..6eec634eae2d 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -1177,15 +1177,14 @@ static int gfs2_rgrp_bh_get(struct gfs2_rgrpd *rgd)
 		}
 	}
 
-	if (!(rgd->rd_flags & GFS2_RDF_UPTODATE)) {
-		for (x = 0; x < length; x++)
-			clear_bit(GBF_FULL, &rgd->rd_bits[x].bi_flags);
-		gfs2_rgrp_in(rgd, (rgd->rd_bits[0].bi_bh)->b_data);
-		rgd->rd_flags |= (GFS2_RDF_UPTODATE | GFS2_RDF_CHECK);
-		rgd->rd_free_clone = rgd->rd_free;
-		/* max out the rgrp allocation failure point */
-		rgd->rd_extfail_pt = rgd->rd_free;
-	}
+	for (x = 0; x < length; x++)
+		clear_bit(GBF_FULL, &rgd->rd_bits[x].bi_flags);
+	gfs2_rgrp_in(rgd, (rgd->rd_bits[0].bi_bh)->b_data);
+	rgd->rd_flags |= (GFS2_RDF_UPTODATE | GFS2_RDF_CHECK);
+	rgd->rd_free_clone = rgd->rd_free;
+	/* max out the rgrp allocation failure point */
+	rgd->rd_extfail_pt = rgd->rd_free;
+
 	if (cpu_to_be32(GFS2_MAGIC) != rgd->rd_rgl->rl_magic) {
 		rgd->rd_rgl->rl_unlinked = cpu_to_be32(count_unlinked(rgd));
 		gfs2_rgrp_ondisk2lvb(rgd->rd_rgl,