Re: [Cluster-devel] [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction"
On Thu, Jan 26, 2017 at 08:44:55AM +0100, Michal Hocko wrote: > > > I'm convinced the current series is OK, only real life will tell us > > > whether > > > we missed something or not ;) > > > > I would like to extend the changelog of "jbd2: mark the transaction > > context with the scope GFP_NOFS context". > > > > " > > Please note that setups without journal do not suffer from potential > > recursion problems and so they do not need the scope protection because > > neither ->releasepage nor ->evict_inode (which are the only fs entry > > points from the direct reclaim) can reenter a locked context which is > > doing the allocation currently. > > " > > Could you comment on this Ted, please? I guess so there still is one way this could screw us, and it's this reason for GFP_NOFS: - to prevent from stack overflows during the reclaim because the allocation is performed from a deep context already The writepages call stack can be pretty deep. (Especially if we're using ext4 in no journal mode over, say, iSCSI.) How much stack space can get consumed by a reclaim? - Ted
Re: [Cluster-devel] [PATCH 0/5] mkfs.gfs2 device topology-related fixes
- Original Message - | This patch set improves handling of various combinations of device topology | values/io limits. It addresses two main problems: misaligned devices and | devices which report strange optimal/minumum values which can not be used for | resource group alignment. Some tests which failed before this patch set are | added in patch 5. | | Andrew Price (5): | mkfs.gfs2: Add an extended option for device topology testing | mkfs.gfs2: Warn when device is misaligned | mkfs.gfs2: Disable rgrp alignment when dev topology is unsuitable | mkfs.gfs2: Disregard device topology if zero-sized sectors reported | gfs2-utils tests: Add tests for device topology handling in mkfs.gfs2 | | gfs2/mkfs/main_mkfs.c | 65 | ++- | tests/mkfs.at | 8 +++ | 2 files changed, 62 insertions(+), 11 deletions(-) | | -- | 2.9.3 | Hi, Series ACK Regards, Bob Peterson Red Hat File Systems
[Cluster-devel] [PATCH 5/5] gfs2-utils tests: Add tests for device topology handling in mkfs.gfs2
Add some tests that exercise mkfs.gfs2 against various device i/o limits based on odd values seen in the wild. Signed-off-by: Andrew Price --- tests/mkfs.at | 8 1 file changed, 8 insertions(+) diff --git a/tests/mkfs.at b/tests/mkfs.at index d3521c8..c026a76 100644 --- a/tests/mkfs.at +++ b/tests/mkfs.at @@ -101,3 +101,11 @@ AT_CHECK([$GFS_MKFS -p lock_dlm -t "quite_long_cluster_name_test_here:intec34p" AT_CHECK([$GFS_MKFS -p lock_dlm -t "financial_cluster:this_time_we_test_fs_naming_len" $GFS_TGT], 255, [ignore], [ignore]) GFS_FSCK_CHECK([$GFS_MKFS -p lock_dlm -t "a_really_long_named_cluster_here:concurrently_lets_check_fs_len" $GFS_TGT]) AT_CLEANUP + +AT_SETUP([Device i/o limits handling]) +AT_KEYWORDS(mkfs.gfs2 mkfs) +AT_CHECK([$GFS_MKFS -p lock_nolock -o test_topology=0:0:0:0:0 $GFS_TGT], 0, [ignore], [ignore]) +AT_CHECK([$GFS_MKFS -p lock_nolock -o test_topology=7168:512:0:33553920:512 $GFS_TGT], 0, [ignore], [ignore]) +AT_CHECK([$GFS_MKFS -p lock_nolock -o test_topology=7168:512:8192:33553920:512 $GFS_TGT], 0, [ignore], [Warning: device is not properly aligned. This may harm performance. +]) +AT_CLEANUP -- 2.9.3
[Cluster-devel] [PATCH 3/5] mkfs.gfs2: Disable rgrp alignment when dev topology is unsuitable
If optimal_io_size is not a multiple of minimum_io_size then the values are not reliable swidth and sunit values, so disable rgrp stripe alignment in that case. Signed-off-by: Andrew Price --- gfs2/mkfs/main_mkfs.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/gfs2/mkfs/main_mkfs.c b/gfs2/mkfs/main_mkfs.c index db83d1c..e2c4a22 100644 --- a/gfs2/mkfs/main_mkfs.c +++ b/gfs2/mkfs/main_mkfs.c @@ -629,8 +629,14 @@ static lgfs2_rgrps_t rgs_init(struct mkfs_opts *opts, struct gfs2_sbd *sdp) al_off = opts->sunit / sdp->bsize; } } else if (opts->align) { - if ((opts->dev.minimum_io_size > opts->dev.physical_sector_size) && - (opts->dev.optimal_io_size > opts->dev.physical_sector_size)) { + if (opts->dev.optimal_io_size <= opts->dev.physical_sector_size || + opts->dev.minimum_io_size <= opts->dev.physical_sector_size || + (opts->dev.optimal_io_size % opts->dev.minimum_io_size) != 0) { + /* If optimal_io_size is not a multiple of minimum_io_size then + the values are not reliable swidth and sunit values, so disable + rgrp alignment */ + opts->align = 0; + } else { al_base = opts->dev.optimal_io_size / sdp->bsize; al_off = opts->dev.minimum_io_size / sdp->bsize; } -- 2.9.3
[Cluster-devel] [PATCH 0/5] mkfs.gfs2 device topology-related fixes
This patch set improves handling of various combinations of device topology values/io limits. It addresses two main problems: misaligned devices and devices which report strange optimal/minumum values which can not be used for resource group alignment. Some tests which failed before this patch set are added in patch 5. Andrew Price (5): mkfs.gfs2: Add an extended option for device topology testing mkfs.gfs2: Warn when device is misaligned mkfs.gfs2: Disable rgrp alignment when dev topology is unsuitable mkfs.gfs2: Disregard device topology if zero-sized sectors reported gfs2-utils tests: Add tests for device topology handling in mkfs.gfs2 gfs2/mkfs/main_mkfs.c | 65 ++- tests/mkfs.at | 8 +++ 2 files changed, 62 insertions(+), 11 deletions(-) -- 2.9.3
[Cluster-devel] [PATCH 2/5] mkfs.gfs2: Warn when device is misaligned
Normally one layer of the I/O stack will adjust for a non-zero alignment_offset value and expose it to upper layers as zero. If a misalignment is not accounted for, mkfs.gfs2 will see a non-zero alignment_offset and should report the misalignment to the user as it could harm performance. Signed-off-by: Andrew Price --- gfs2/mkfs/main_mkfs.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/gfs2/mkfs/main_mkfs.c b/gfs2/mkfs/main_mkfs.c index a2d2ba9..db83d1c 100644 --- a/gfs2/mkfs/main_mkfs.c +++ b/gfs2/mkfs/main_mkfs.c @@ -495,7 +495,11 @@ static unsigned choose_blocksize(struct mkfs_opts *opts) printf("optimal_io_size: %lu\n", dev->optimal_io_size); printf("physical_sector_size: %lu\n", dev->physical_sector_size); } - + if (dev->got_topol && dev->alignment_offset != 0) { + fprintf(stderr, + _("Warning: device is not properly aligned. This may harm performance.\n")); + dev->physical_sector_size = dev->logical_sector_size; + } if (!opts->got_bsize && dev->got_topol) { if (dev->optimal_io_size <= getpagesize() && dev->optimal_io_size >= dev->minimum_io_size) -- 2.9.3
[Cluster-devel] [PATCH 4/5] mkfs.gfs2: Disregard device topology if zero-sized sectors reported
Guard against physical_sector_size or logical_sector_size being zero. Normally this isn't a problem as the kernel will report these values as the device sector size but the checks are useful for when zero values are passed to mkfs. Signed-off-by: Andrew Price --- gfs2/mkfs/main_mkfs.c | 16 ++-- 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/gfs2/mkfs/main_mkfs.c b/gfs2/mkfs/main_mkfs.c index e2c4a22..0801a4b 100644 --- a/gfs2/mkfs/main_mkfs.c +++ b/gfs2/mkfs/main_mkfs.c @@ -136,6 +136,7 @@ struct mkfs_opts { unsigned got_lockproto:1; unsigned got_locktable:1; unsigned got_device:1; + unsigned got_topol:1; unsigned override:1; unsigned quiet:1; @@ -304,7 +305,8 @@ static void opt_parse_extended(char *str, struct mkfs_opts *opts) } else if (strcmp("test_topology", key) == 0) { if (parse_topology(opts, val) != 0) exit(-1); - opts->dev.got_topol = 1; + opts->got_topol = (opts->dev.logical_sector_size != 0 && + opts->dev.physical_sector_size != 0); } else if (strcmp("help", key) == 0) { print_ext_opts(); exit(0); @@ -487,20 +489,21 @@ static unsigned choose_blocksize(struct mkfs_opts *opts) unsigned int x; unsigned int bsize = opts->bsize; struct mkfs_dev *dev = &opts->dev; + int got_topol = (dev->got_topol || opts->got_topol); - if (dev->got_topol && opts->debug) { + if (got_topol && opts->debug) { printf("alignment_offset: %lu\n", dev->alignment_offset); printf("logical_sector_size: %lu\n", dev->logical_sector_size); printf("minimum_io_size: %lu\n", dev->minimum_io_size); printf("optimal_io_size: %lu\n", dev->optimal_io_size); printf("physical_sector_size: %lu\n", dev->physical_sector_size); } - if (dev->got_topol && dev->alignment_offset != 0) { + if (got_topol && dev->alignment_offset != 0) { fprintf(stderr, _("Warning: device is not properly aligned. This may harm performance.\n")); dev->physical_sector_size = dev->logical_sector_size; } - if (!opts->got_bsize && dev->got_topol) { + if (!opts->got_bsize && got_topol) { if (dev->optimal_io_size <= getpagesize() && dev->optimal_io_size >= dev->minimum_io_size) bsize = dev->optimal_io_size; @@ -887,7 +890,8 @@ static int probe_contents(struct mkfs_dev *dev) dev->minimum_io_size = blkid_topology_get_minimum_io_size(tp); dev->optimal_io_size = blkid_topology_get_optimal_io_size(tp); dev->physical_sector_size = blkid_topology_get_physical_sector_size(tp); - dev->got_topol = 1; + dev->got_topol = (dev->logical_sector_size != 0 && + dev->physical_sector_size != 0); } } @@ -944,7 +948,7 @@ int main(int argc, char *argv[]) opts_get(argc, argv, &opts); opts_check(&opts); - open_dev(&opts.dev, !opts.dev.got_topol); + open_dev(&opts.dev, !opts.got_topol); bsize = choose_blocksize(&opts); if (S_ISREG(opts.dev.stat.st_mode)) { -- 2.9.3
[Cluster-devel] [PATCH 1/5] mkfs.gfs2: Add an extended option for device topology testing
This adds an -o test_topology=v1:v2:v3:v4:v5 extended option which can be used to test how mkfs.gfs2 behaves with different device characteristics. The values it accepts correspond to: 1. alignment_offset 2. logical_sector_size 3. minimum_io_size 4. optimal_io_size 5. physical_sector_size Signed-off-by: Andrew Price --- gfs2/mkfs/main_mkfs.c | 39 ++- 1 file changed, 34 insertions(+), 5 deletions(-) diff --git a/gfs2/mkfs/main_mkfs.c b/gfs2/mkfs/main_mkfs.c index 363878c..a2d2ba9 100644 --- a/gfs2/mkfs/main_mkfs.c +++ b/gfs2/mkfs/main_mkfs.c @@ -256,6 +256,33 @@ static unsigned parse_bool(struct mkfs_opts *opts, const char *key, const char * exit(-1); } +static int parse_topology(struct mkfs_opts *opts, char *str) +{ + char *opt; + unsigned i = 0; + unsigned long *topol[5] = { + &opts->dev.alignment_offset, + &opts->dev.logical_sector_size, + &opts->dev.minimum_io_size, + &opts->dev.optimal_io_size, + &opts->dev.physical_sector_size + }; + + while ((opt = strsep(&str, ":")) != NULL) { + if (i > 4) { + fprintf(stderr, "Too many topology values.\n"); + return 1; + } + *topol[i] = parse_ulong(opts, "test_topology", opt); + i++; + } + if (i < 5) { + fprintf(stderr, "Too few topology values.\n"); + return 1; + } + return 0; +} + static void opt_parse_extended(char *str, struct mkfs_opts *opts) { char *opt; @@ -274,6 +301,10 @@ static void opt_parse_extended(char *str, struct mkfs_opts *opts) opts->got_swidth = 1; } else if (strcmp("align", key) == 0) { opts->align = parse_bool(opts, "align", val); + } else if (strcmp("test_topology", key) == 0) { + if (parse_topology(opts, val) != 0) + exit(-1); + opts->dev.got_topol = 1; } else if (strcmp("help", key) == 0) { print_ext_opts(); exit(0); @@ -854,7 +885,7 @@ static int probe_contents(struct mkfs_dev *dev) return 0; } -static void open_dev(struct mkfs_dev *dev) +static void open_dev(struct mkfs_dev *dev, int withprobe) { int error; @@ -882,9 +913,7 @@ static void open_dev(struct mkfs_dev *dev) fprintf(stderr, _("'%s' is not a block device or regular file\n"), dev->path); exit(1); } - - error = probe_contents(dev); - if (error) + if (withprobe && (probe_contents(dev) != 0)) exit(1); } @@ -905,7 +934,7 @@ int main(int argc, char *argv[]) opts_get(argc, argv, &opts); opts_check(&opts); - open_dev(&opts.dev); + open_dev(&opts.dev, !opts.dev.got_topol); bsize = choose_blocksize(&opts); if (S_ISREG(opts.dev.stat.st_mode)) { -- 2.9.3
Re: [Cluster-devel] [GFS2 PATCH 3/3] GFS2: Reduce contention on gfs2_log_lock
Hi, On 25/01/17 19:58, Bob Peterson wrote: This patch reworks some of the log locks to provide less contention. This is a very complicated patch, with not a lot of explanation (I know there is more detail on the cover letter, but it should be on the patch itself, since the cover letter will not go into the git tree). I'm not sure I understand the reason for switching the locking order here. The original intent was simply to avoid taking the log lock in cases where the bh was already in the log. I would expect gfs2_trans_add_meta to land up looking something like this: lock_buffer(bh); if (buffer_pinned(bh)) { /* If the buffer is pinned, then we know it must have a bd attached and be in the log already */ set_bit(TR_TOUCHED, &tr->tr_flags); goto out; } gfs2_log_lock(sdp); rest of function unchanged here gfs2_log_unlock(sdp); out: unlock_buffer(bh); } That is unless the test in meta_lo_add() "if (!list_empty(&bd->bd_list))" is not equivalent to testing buffer_pinned() for some reason, but even if that is the case, all we need to do is find a better test than "if (buffer_pinned(bh))" I think, which at worst case might mean adding an additional flag to the bh which would follow the on/off the bd->bd_list state, Steve. Signed-off-by: Bob Peterson --- fs/gfs2/aops.c| 23 ++- fs/gfs2/glops.c | 4 ++-- fs/gfs2/log.c | 6 ++--- fs/gfs2/lops.c| 3 ++- fs/gfs2/meta_io.c | 11 + fs/gfs2/trans.c | 67 +-- 6 files changed, 51 insertions(+), 63 deletions(-) diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c index 5a6f52e..d3cd30d 100644 --- a/fs/gfs2/aops.c +++ b/fs/gfs2/aops.c @@ -986,23 +986,13 @@ static sector_t gfs2_bmap(struct address_space *mapping, sector_t lblock) static void gfs2_discard(struct gfs2_sbd *sdp, struct buffer_head *bh) { - struct gfs2_bufdata *bd; - lock_buffer(bh); - gfs2_log_lock(sdp); clear_buffer_dirty(bh); - bd = bh->b_private; - if (bd) { - if (!list_empty(&bd->bd_list) && !buffer_pinned(bh)) - list_del_init(&bd->bd_list); - else - gfs2_remove_from_journal(bh, REMOVE_JDATA); - } + gfs2_remove_from_journal(bh, REMOVE_JDATA); bh->b_bdev = NULL; clear_buffer_mapped(bh); clear_buffer_req(bh); clear_buffer_new(bh); - gfs2_log_unlock(sdp); unlock_buffer(bh); } @@ -1157,7 +1147,6 @@ int gfs2_releasepage(struct page *page, gfp_t gfp_mask) * on dirty buffers like we used to here again. */ - gfs2_log_lock(sdp); spin_lock(&sdp->sd_ail_lock); head = bh = page_buffers(page); do { @@ -1174,25 +1163,27 @@ int gfs2_releasepage(struct page *page, gfp_t gfp_mask) head = bh = page_buffers(page); do { + lock_buffer(bh); bd = bh->b_private; if (bd) { gfs2_assert_warn(sdp, bd->bd_bh == bh); + gfs2_log_lock(sdp); if (!list_empty(&bd->bd_list)) list_del_init(&bd->bd_list); + gfs2_log_unlock(sdp); bd->bd_bh = NULL; bh->b_private = NULL; - kmem_cache_free(gfs2_bufdata_cachep, bd); } - + unlock_buffer(bh); + if (bd) + kmem_cache_free(gfs2_bufdata_cachep, bd); bh = bh->b_this_page; } while (bh != head); - gfs2_log_unlock(sdp); return try_to_free_buffers(page); cannot_release: spin_unlock(&sdp->sd_ail_lock); - gfs2_log_unlock(sdp); return 0; } diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c index 5db59d4..fced99b 100644 --- a/fs/gfs2/glops.c +++ b/fs/gfs2/glops.c @@ -61,7 +61,6 @@ static void __gfs2_ail_flush(struct gfs2_glock *gl, bool fsync, struct buffer_head *bh; const unsigned long b_state = (1UL << BH_Dirty)|(1UL << BH_Pinned)|(1UL << BH_Lock); - gfs2_log_lock(sdp); spin_lock(&sdp->sd_ail_lock); list_for_each_entry_safe_reverse(bd, tmp, head, bd_ail_gl_list) { if (nr_revokes == 0) @@ -72,12 +71,13 @@ static void __gfs2_ail_flush(struct gfs2_glock *gl, bool fsync, continue; gfs2_ail_error(gl, bh); } + gfs2_log_lock(sdp); gfs2_trans_add_revoke(sdp, bd); + gfs2_log_unlock(sdp); nr_revokes--; } GLOCK_BUG_ON(gl, !fsync && atomic_read(&gl->gl_ail_count)); spin_unlock(&sdp->sd_ail_lock); - gfs2_log_unlock(sdp); } diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index 4fb76c0..4d5f3a1 100644 --- a/fs/gfs2/log.c +++ b/fs
Re: [Cluster-devel] [GFS2 PATCH 2/3] GFS2: Inline function meta_lo_add
Hi, Acked-by: Steven Whitehouse Steve. On 25/01/17 19:58, Bob Peterson wrote: This patch simply combines function meta_lo_add with its only caller, trans_add_meta. This makes the code easier to read and will make it easier to reduce contention on gfs2_log_lock. Signed-off-by: Bob Peterson --- fs/gfs2/trans.c | 49 ++--- 1 file changed, 22 insertions(+), 27 deletions(-) diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c index 5d7f5b0..483c5a7 100644 --- a/fs/gfs2/trans.c +++ b/fs/gfs2/trans.c @@ -195,16 +195,35 @@ void gfs2_trans_add_data(struct gfs2_glock *gl, struct buffer_head *bh) unlock_buffer(bh); } -static void meta_lo_add(struct gfs2_sbd *sdp, struct gfs2_bufdata *bd) +void gfs2_trans_add_meta(struct gfs2_glock *gl, struct buffer_head *bh) { + + struct gfs2_sbd *sdp = gl->gl_name.ln_sbd; + struct gfs2_bufdata *bd; struct gfs2_meta_header *mh; struct gfs2_trans *tr; enum gfs2_freeze_state state = atomic_read(&sdp->sd_freeze_state); + lock_buffer(bh); + gfs2_log_lock(sdp); + bd = bh->b_private; + if (bd == NULL) { + gfs2_log_unlock(sdp); + unlock_buffer(bh); + lock_page(bh->b_page); + if (bh->b_private == NULL) + bd = gfs2_alloc_bufdata(gl, bh, &gfs2_buf_lops); + else + bd = bh->b_private; + unlock_page(bh->b_page); + lock_buffer(bh); + gfs2_log_lock(sdp); + } + gfs2_assert(sdp, bd->bd_gl == gl); tr = current->journal_info; set_bit(TR_TOUCHED, &tr->tr_flags); if (!list_empty(&bd->bd_list)) - return; + goto out_unlock; set_bit(GLF_LFLUSH, &bd->bd_gl->gl_flags); set_bit(GLF_DIRTY, &bd->bd_gl->gl_flags); mh = (struct gfs2_meta_header *)bd->bd_bh->b_data; @@ -222,31 +241,7 @@ static void meta_lo_add(struct gfs2_sbd *sdp, struct gfs2_bufdata *bd) mh->mh_jid = cpu_to_be32(sdp->sd_jdesc->jd_jid); list_add(&bd->bd_list, &tr->tr_buf); tr->tr_num_buf_new++; -} - -void gfs2_trans_add_meta(struct gfs2_glock *gl, struct buffer_head *bh) -{ - - struct gfs2_sbd *sdp = gl->gl_name.ln_sbd; - struct gfs2_bufdata *bd; - - lock_buffer(bh); - gfs2_log_lock(sdp); - bd = bh->b_private; - if (bd == NULL) { - gfs2_log_unlock(sdp); - unlock_buffer(bh); - lock_page(bh->b_page); - if (bh->b_private == NULL) - bd = gfs2_alloc_bufdata(gl, bh, &gfs2_buf_lops); - else - bd = bh->b_private; - unlock_page(bh->b_page); - lock_buffer(bh); - gfs2_log_lock(sdp); - } - gfs2_assert(sdp, bd->bd_gl == gl); - meta_lo_add(sdp, bd); +out_unlock: gfs2_log_unlock(sdp); unlock_buffer(bh); }
Re: [Cluster-devel] [GFS2 PATCH 1/3] GFS2: Switch tr_touched to flag in transaction
Hi, This looks like a good change to do anyway, irrespective of the other patches. Acked-by: Steven Whitehouse Steve. On 25/01/17 19:58, Bob Peterson wrote: This patch eliminates the int variable tr_touched in favor of a new flag in the transaction. This is a step toward reducing contention on the gfs2_log_lock spin_lock. Signed-off-by: Bob Peterson --- fs/gfs2/incore.h | 10 +++--- fs/gfs2/log.c | 6 +++--- fs/gfs2/meta_io.c | 6 +++--- fs/gfs2/trans.c | 19 ++- 4 files changed, 23 insertions(+), 18 deletions(-) diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h index 00d8dc2..c45084a 100644 --- a/fs/gfs2/incore.h +++ b/fs/gfs2/incore.h @@ -470,15 +470,19 @@ struct gfs2_quota_data { struct rcu_head qd_rcu; }; +enum { + TR_TOUCHED = 1, + TR_ATTACHED = 2, + TR_ALLOCED = 3, +}; + struct gfs2_trans { unsigned long tr_ip; unsigned int tr_blocks; unsigned int tr_revokes; unsigned int tr_reserved; - unsigned int tr_touched:1; - unsigned int tr_attached:1; - unsigned int tr_alloced:1; + unsigned long tr_flags; unsigned int tr_num_buf_new; unsigned int tr_num_databuf_new; diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index 5028a9d..4fb76c0 100644 --- a/fs/gfs2/log.c +++ b/fs/gfs2/log.c @@ -799,7 +799,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, static void gfs2_merge_trans(struct gfs2_trans *old, struct gfs2_trans *new) { - WARN_ON_ONCE(old->tr_attached != 1); + WARN_ON_ONCE(!test_bit(TR_ATTACHED, &old->tr_flags)); old->tr_num_buf_new += new->tr_num_buf_new; old->tr_num_databuf_new += new->tr_num_databuf_new; @@ -823,9 +823,9 @@ static void log_refund(struct gfs2_sbd *sdp, struct gfs2_trans *tr) if (sdp->sd_log_tr) { gfs2_merge_trans(sdp->sd_log_tr, tr); } else if (tr->tr_num_buf_new || tr->tr_num_databuf_new) { - gfs2_assert_withdraw(sdp, tr->tr_alloced); + gfs2_assert_withdraw(sdp, test_bit(TR_ALLOCED, &tr->tr_flags)); sdp->sd_log_tr = tr; - tr->tr_attached = 1; + set_bit(TR_ATTACHED, &tr->tr_flags); } sdp->sd_log_commited_revoke += tr->tr_num_revoke - tr->tr_num_revoke_rm; diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c index 373639a5..a88a347 100644 --- a/fs/gfs2/meta_io.c +++ b/fs/gfs2/meta_io.c @@ -293,7 +293,7 @@ int gfs2_meta_read(struct gfs2_glock *gl, u64 blkno, int flags, wait_on_buffer(bh); if (unlikely(!buffer_uptodate(bh))) { struct gfs2_trans *tr = current->journal_info; - if (tr && tr->tr_touched) + if (tr && test_bit(TR_TOUCHED, &tr->tr_flags)) gfs2_io_error_bh(sdp, bh); brelse(bh); *bhp = NULL; @@ -320,7 +320,7 @@ int gfs2_meta_wait(struct gfs2_sbd *sdp, struct buffer_head *bh) if (!buffer_uptodate(bh)) { struct gfs2_trans *tr = current->journal_info; - if (tr && tr->tr_touched) + if (tr && test_bit(TR_TOUCHED, &tr->tr_flags)) gfs2_io_error_bh(sdp, bh); return -EIO; } @@ -346,7 +346,7 @@ void gfs2_remove_from_journal(struct buffer_head *bh, int meta) tr->tr_num_buf_rm++; else tr->tr_num_databuf_rm++; - tr->tr_touched = 1; + set_bit(TR_TOUCHED, &tr->tr_flags); was_pinned = 1; brelse(bh); } diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c index 0c1bde3..5d7f5b0 100644 --- a/fs/gfs2/trans.c +++ b/fs/gfs2/trans.c @@ -48,7 +48,7 @@ int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks, tr->tr_blocks = blocks; tr->tr_revokes = revokes; tr->tr_reserved = 1; - tr->tr_alloced = 1; + set_bit(TR_ALLOCED, &tr->tr_flags); if (blocks) tr->tr_reserved += 6 + blocks; if (revokes) @@ -78,7 +78,8 @@ static void gfs2_print_trans(const struct gfs2_trans *tr) { pr_warn("Transaction created at: %pSR\n", (void *)tr->tr_ip); pr_warn("blocks=%u revokes=%u reserved=%u touched=%u\n", - tr->tr_blocks, tr->tr_revokes, tr->tr_reserved, tr->tr_touched); + tr->tr_blocks, tr->tr_revokes, tr->tr_reserved, + test_bit(TR_TOUCHED, &tr->tr_flags)); pr_warn("Buf %u/%u Databuf %u/%u Revoke %u/%u\n", tr->tr_num_buf_new, tr->tr_num_buf_rm, tr->tr_num_databuf_new, tr->tr_num_databuf_rm, @@ -89,12 +90,12 @@ void gfs2_trans_end(struct gfs2_sbd *sdp) { struct gfs2_trans *tr = current->journal_info; s64 nbuf; - int alloced = tr->tr_alloced; + int alloced = test_bit(TR_ALLOCED, &tr->tr_flags); BUG_ON(!tr); current->journal_info = NULL; - if (!tr->tr_t
Re: [Cluster-devel] [PATCH] fs/dlm: Fix kernel memory disclosure
Hi, On 26/01/17 08:54, Vlad Tsyrklevich wrote: Hello, I wanted to ping the list and see if this could get a review. On Mon, Jan 9, 2017 at 8:27 PM, Vlad Tsyrklevich wrote: Clear the 'unused' field to avoid leaking memory to userland in copy_result_to_user(). Signed-off-by: Vlad Tsyrklevich --- fs/dlm/user.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/dlm/user.c b/fs/dlm/user.c index 1ce908c..0570711 100644 --- a/fs/dlm/user.c +++ b/fs/dlm/user.c @@ -138,6 +138,8 @@ static void compat_output(struct dlm_lock_result *res, res32->lksb.sb_flags = res->lksb.sb_flags; res32->lksb.sb_lkid = res->lksb.sb_lkid; res32->lksb.sb_lvbptr = (__u32)(long)res->lksb.sb_lvbptr; + + memset(&res32->unused, 0, sizeof(res32->unused)); } #endif -- 2.7.0 It looks like struct dlm_lksb32 has a hole in it, so it would be safer just to zero the whole of the dlm_lock_result32 before it is written to, rather than trying to find all the holes individually, even if slightly slower (I'm not sure it would be noticeable in reality though) Steve.
Re: [Cluster-devel] [PATCH] fs/dlm: Fix kernel memory disclosure
Hello, I wanted to ping the list and see if this could get a review. On Mon, Jan 9, 2017 at 8:27 PM, Vlad Tsyrklevich wrote: > Clear the 'unused' field to avoid leaking memory to userland in > copy_result_to_user(). > > Signed-off-by: Vlad Tsyrklevich > --- > fs/dlm/user.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/fs/dlm/user.c b/fs/dlm/user.c > index 1ce908c..0570711 100644 > --- a/fs/dlm/user.c > +++ b/fs/dlm/user.c > @@ -138,6 +138,8 @@ static void compat_output(struct dlm_lock_result *res, > res32->lksb.sb_flags = res->lksb.sb_flags; > res32->lksb.sb_lkid = res->lksb.sb_lkid; > res32->lksb.sb_lvbptr = (__u32)(long)res->lksb.sb_lvbptr; > + > + memset(&res32->unused, 0, sizeof(res32->unused)); > } > #endif > > -- > 2.7.0 >