date:20170126

Re: [Cluster-devel] [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction"

2017-01-26 Thread Theodore Ts'o

On Thu, Jan 26, 2017 at 08:44:55AM +0100, Michal Hocko wrote:
> > > I'm convinced the current series is OK, only real life will tell us 
> > > whether
> > > we missed something or not ;)
> > 
> > I would like to extend the changelog of "jbd2: mark the transaction
> > context with the scope GFP_NOFS context".
> > 
> > "
> > Please note that setups without journal do not suffer from potential
> > recursion problems and so they do not need the scope protection because
> > neither ->releasepage nor ->evict_inode (which are the only fs entry
> > points from the direct reclaim) can reenter a locked context which is
> > doing the allocation currently.
> > "
> 
> Could you comment on this Ted, please?

I guess   so there still is one way this could screw us, and it's this 
reason for GFP_NOFS:

- to prevent from stack overflows during the reclaim because
  the allocation is performed from a deep context already

The writepages call stack can be pretty deep.  (Especially if we're
using ext4 in no journal mode over, say, iSCSI.)

How much stack space can get consumed by a reclaim?

- Ted

Re: [Cluster-devel] [PATCH 0/5] mkfs.gfs2 device topology-related fixes

2017-01-26 Thread Bob Peterson

- Original Message -
| This patch set improves handling of various combinations of device topology
| values/io limits. It addresses two main problems: misaligned devices and
| devices which report strange optimal/minumum values which can not be used for
| resource group alignment. Some tests which failed before this patch set are
| added in patch 5.
| 
| Andrew Price (5):
|   mkfs.gfs2: Add an extended option for device topology testing
|   mkfs.gfs2: Warn when device is misaligned
|   mkfs.gfs2: Disable rgrp alignment when dev topology is unsuitable
|   mkfs.gfs2: Disregard device topology if zero-sized sectors reported
|   gfs2-utils tests: Add tests for device topology handling in mkfs.gfs2
| 
|  gfs2/mkfs/main_mkfs.c | 65
|  ++-
|  tests/mkfs.at |  8 +++
|  2 files changed, 62 insertions(+), 11 deletions(-)
| 
| --
| 2.9.3
| 
Hi,

Series ACK

Regards,

Bob Peterson
Red Hat File Systems

[Cluster-devel] [PATCH 5/5] gfs2-utils tests: Add tests for device topology handling in mkfs.gfs2

2017-01-26 Thread Andrew Price

Add some tests that exercise mkfs.gfs2 against various device i/o limits
based on odd values seen in the wild.

Signed-off-by: Andrew Price 
---
 tests/mkfs.at | 8 
 1 file changed, 8 insertions(+)

diff --git a/tests/mkfs.at b/tests/mkfs.at
index d3521c8..c026a76 100644
--- a/tests/mkfs.at
+++ b/tests/mkfs.at
@@ -101,3 +101,11 @@ AT_CHECK([$GFS_MKFS -p lock_dlm -t 
"quite_long_cluster_name_test_here:intec34p"
 AT_CHECK([$GFS_MKFS -p lock_dlm -t 
"financial_cluster:this_time_we_test_fs_naming_len" $GFS_TGT], 255, [ignore], 
[ignore])
 GFS_FSCK_CHECK([$GFS_MKFS -p lock_dlm -t 
"a_really_long_named_cluster_here:concurrently_lets_check_fs_len" $GFS_TGT])
 AT_CLEANUP
+
+AT_SETUP([Device i/o limits handling])
+AT_KEYWORDS(mkfs.gfs2 mkfs)
+AT_CHECK([$GFS_MKFS -p lock_nolock -o test_topology=0:0:0:0:0 $GFS_TGT], 0, 
[ignore], [ignore])
+AT_CHECK([$GFS_MKFS -p lock_nolock -o test_topology=7168:512:0:33553920:512 
$GFS_TGT], 0, [ignore], [ignore])
+AT_CHECK([$GFS_MKFS -p lock_nolock -o test_topology=7168:512:8192:33553920:512 
$GFS_TGT], 0, [ignore], [Warning: device is not properly aligned. This may harm 
performance.
+])
+AT_CLEANUP
-- 
2.9.3

[Cluster-devel] [PATCH 3/5] mkfs.gfs2: Disable rgrp alignment when dev topology is unsuitable

2017-01-26 Thread Andrew Price

If optimal_io_size is not a multiple of minimum_io_size then the values
are not reliable swidth and sunit values, so disable rgrp stripe
alignment in that case.

Signed-off-by: Andrew Price 
---
 gfs2/mkfs/main_mkfs.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/gfs2/mkfs/main_mkfs.c b/gfs2/mkfs/main_mkfs.c
index db83d1c..e2c4a22 100644
--- a/gfs2/mkfs/main_mkfs.c
+++ b/gfs2/mkfs/main_mkfs.c
@@ -629,8 +629,14 @@ static lgfs2_rgrps_t rgs_init(struct mkfs_opts *opts, 
struct gfs2_sbd *sdp)
al_off = opts->sunit / sdp->bsize;
}
} else if (opts->align) {
-   if ((opts->dev.minimum_io_size > 
opts->dev.physical_sector_size) &&
-   (opts->dev.optimal_io_size > 
opts->dev.physical_sector_size)) {
+   if (opts->dev.optimal_io_size <= opts->dev.physical_sector_size 
||
+   opts->dev.minimum_io_size <= opts->dev.physical_sector_size 
||
+   (opts->dev.optimal_io_size % opts->dev.minimum_io_size) != 
0) {
+   /* If optimal_io_size is not a multiple of 
minimum_io_size then
+  the values are not reliable swidth and sunit values, 
so disable
+  rgrp alignment */
+   opts->align = 0;
+   } else {
al_base = opts->dev.optimal_io_size / sdp->bsize;
al_off = opts->dev.minimum_io_size / sdp->bsize;
}
-- 
2.9.3

[Cluster-devel] [PATCH 0/5] mkfs.gfs2 device topology-related fixes

2017-01-26 Thread Andrew Price

This patch set improves handling of various combinations of device topology
values/io limits. It addresses two main problems: misaligned devices and
devices which report strange optimal/minumum values which can not be used for
resource group alignment. Some tests which failed before this patch set are
added in patch 5.

Andrew Price (5):
  mkfs.gfs2: Add an extended option for device topology testing
  mkfs.gfs2: Warn when device is misaligned
  mkfs.gfs2: Disable rgrp alignment when dev topology is unsuitable
  mkfs.gfs2: Disregard device topology if zero-sized sectors reported
  gfs2-utils tests: Add tests for device topology handling in mkfs.gfs2

 gfs2/mkfs/main_mkfs.c | 65 ++-
 tests/mkfs.at |  8 +++
 2 files changed, 62 insertions(+), 11 deletions(-)

-- 
2.9.3

[Cluster-devel] [PATCH 2/5] mkfs.gfs2: Warn when device is misaligned

2017-01-26 Thread Andrew Price

Normally one layer of the I/O stack will adjust for a non-zero
alignment_offset value and expose it to upper layers as zero. If a
misalignment is not accounted for, mkfs.gfs2 will see a non-zero
alignment_offset and should report the misalignment to the user as it
could harm performance.

Signed-off-by: Andrew Price 
---
 gfs2/mkfs/main_mkfs.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gfs2/mkfs/main_mkfs.c b/gfs2/mkfs/main_mkfs.c
index a2d2ba9..db83d1c 100644
--- a/gfs2/mkfs/main_mkfs.c
+++ b/gfs2/mkfs/main_mkfs.c
@@ -495,7 +495,11 @@ static unsigned choose_blocksize(struct mkfs_opts *opts)
printf("optimal_io_size: %lu\n", dev->optimal_io_size);
printf("physical_sector_size: %lu\n", 
dev->physical_sector_size);
}
-
+   if (dev->got_topol && dev->alignment_offset != 0) {
+   fprintf(stderr,
+ _("Warning: device is not properly aligned. This may harm 
performance.\n"));
+   dev->physical_sector_size = dev->logical_sector_size;
+   }
if (!opts->got_bsize && dev->got_topol) {
if (dev->optimal_io_size <= getpagesize() &&
dev->optimal_io_size >= dev->minimum_io_size)
-- 
2.9.3

[Cluster-devel] [PATCH 4/5] mkfs.gfs2: Disregard device topology if zero-sized sectors reported

2017-01-26 Thread Andrew Price

Guard against physical_sector_size or logical_sector_size being zero.
Normally this isn't a problem as the kernel will report these values as
the device sector size but the checks are useful for when zero values
are passed to mkfs.

Signed-off-by: Andrew Price 
---
 gfs2/mkfs/main_mkfs.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/gfs2/mkfs/main_mkfs.c b/gfs2/mkfs/main_mkfs.c
index e2c4a22..0801a4b 100644
--- a/gfs2/mkfs/main_mkfs.c
+++ b/gfs2/mkfs/main_mkfs.c
@@ -136,6 +136,7 @@ struct mkfs_opts {
unsigned got_lockproto:1;
unsigned got_locktable:1;
unsigned got_device:1;
+   unsigned got_topol:1;
 
unsigned override:1;
unsigned quiet:1;
@@ -304,7 +305,8 @@ static void opt_parse_extended(char *str, struct mkfs_opts 
*opts)
} else if (strcmp("test_topology", key) == 0) {
if (parse_topology(opts, val) != 0)
exit(-1);
-   opts->dev.got_topol = 1;
+   opts->got_topol = (opts->dev.logical_sector_size != 0 &&
+  opts->dev.physical_sector_size != 0);
} else if (strcmp("help", key) == 0) {
print_ext_opts();
exit(0);
@@ -487,20 +489,21 @@ static unsigned choose_blocksize(struct mkfs_opts *opts)
unsigned int x;
unsigned int bsize = opts->bsize;
struct mkfs_dev *dev = &opts->dev;
+   int got_topol = (dev->got_topol || opts->got_topol);
 
-   if (dev->got_topol && opts->debug) {
+   if (got_topol && opts->debug) {
printf("alignment_offset: %lu\n", dev->alignment_offset);
printf("logical_sector_size: %lu\n", dev->logical_sector_size);
printf("minimum_io_size: %lu\n", dev->minimum_io_size);
printf("optimal_io_size: %lu\n", dev->optimal_io_size);
printf("physical_sector_size: %lu\n", 
dev->physical_sector_size);
}
-   if (dev->got_topol && dev->alignment_offset != 0) {
+   if (got_topol && dev->alignment_offset != 0) {
fprintf(stderr,
  _("Warning: device is not properly aligned. This may harm 
performance.\n"));
dev->physical_sector_size = dev->logical_sector_size;
}
-   if (!opts->got_bsize && dev->got_topol) {
+   if (!opts->got_bsize && got_topol) {
if (dev->optimal_io_size <= getpagesize() &&
dev->optimal_io_size >= dev->minimum_io_size)
bsize = dev->optimal_io_size;
@@ -887,7 +890,8 @@ static int probe_contents(struct mkfs_dev *dev)
dev->minimum_io_size = 
blkid_topology_get_minimum_io_size(tp);
dev->optimal_io_size = 
blkid_topology_get_optimal_io_size(tp);
dev->physical_sector_size = 
blkid_topology_get_physical_sector_size(tp);
-   dev->got_topol = 1;
+   dev->got_topol = (dev->logical_sector_size != 0 &&
+ dev->physical_sector_size != 0);
}
}
 
@@ -944,7 +948,7 @@ int main(int argc, char *argv[])
opts_get(argc, argv, &opts);
opts_check(&opts);
 
-   open_dev(&opts.dev, !opts.dev.got_topol);
+   open_dev(&opts.dev, !opts.got_topol);
bsize = choose_blocksize(&opts);
 
if (S_ISREG(opts.dev.stat.st_mode)) {
-- 
2.9.3

[Cluster-devel] [PATCH 1/5] mkfs.gfs2: Add an extended option for device topology testing

2017-01-26 Thread Andrew Price

This adds an -o test_topology=v1:v2:v3:v4:v5 extended option which can
be used to test how mkfs.gfs2 behaves with different device
characteristics. The values it accepts correspond to:

1. alignment_offset
2. logical_sector_size
3. minimum_io_size
4. optimal_io_size
5. physical_sector_size

Signed-off-by: Andrew Price 
---
 gfs2/mkfs/main_mkfs.c | 39 ++-
 1 file changed, 34 insertions(+), 5 deletions(-)

diff --git a/gfs2/mkfs/main_mkfs.c b/gfs2/mkfs/main_mkfs.c
index 363878c..a2d2ba9 100644
--- a/gfs2/mkfs/main_mkfs.c
+++ b/gfs2/mkfs/main_mkfs.c
@@ -256,6 +256,33 @@ static unsigned parse_bool(struct mkfs_opts *opts, const 
char *key, const char *
exit(-1);
 }
 
+static int parse_topology(struct mkfs_opts *opts, char *str)
+{
+   char *opt;
+   unsigned i = 0;
+   unsigned long *topol[5] = {
+   &opts->dev.alignment_offset,
+   &opts->dev.logical_sector_size,
+   &opts->dev.minimum_io_size,
+   &opts->dev.optimal_io_size,
+   &opts->dev.physical_sector_size
+   };
+
+   while ((opt = strsep(&str, ":")) != NULL) {
+   if (i > 4) {
+   fprintf(stderr, "Too many topology values.\n");
+   return 1;
+   }
+   *topol[i] = parse_ulong(opts, "test_topology", opt);
+   i++;
+   }
+   if (i < 5) {
+   fprintf(stderr, "Too few topology values.\n");
+   return 1;
+   }
+   return 0;
+}
+
 static void opt_parse_extended(char *str, struct mkfs_opts *opts)
 {
char *opt;
@@ -274,6 +301,10 @@ static void opt_parse_extended(char *str, struct mkfs_opts 
*opts)
opts->got_swidth = 1;
} else if (strcmp("align", key) == 0) {
opts->align = parse_bool(opts, "align", val);
+   } else if (strcmp("test_topology", key) == 0) {
+   if (parse_topology(opts, val) != 0)
+   exit(-1);
+   opts->dev.got_topol = 1;
} else if (strcmp("help", key) == 0) {
print_ext_opts();
exit(0);
@@ -854,7 +885,7 @@ static int probe_contents(struct mkfs_dev *dev)
return 0;
 }
 
-static void open_dev(struct mkfs_dev *dev)
+static void open_dev(struct mkfs_dev *dev, int withprobe)
 {
int error;
 
@@ -882,9 +913,7 @@ static void open_dev(struct mkfs_dev *dev)
fprintf(stderr, _("'%s' is not a block device or regular 
file\n"), dev->path);
exit(1);
}
-
-   error = probe_contents(dev);
-   if (error)
+   if (withprobe && (probe_contents(dev) != 0))
exit(1);
 }
 
@@ -905,7 +934,7 @@ int main(int argc, char *argv[])
opts_get(argc, argv, &opts);
opts_check(&opts);
 
-   open_dev(&opts.dev);
+   open_dev(&opts.dev, !opts.dev.got_topol);
bsize = choose_blocksize(&opts);
 
if (S_ISREG(opts.dev.stat.st_mode)) {
-- 
2.9.3

Re: [Cluster-devel] [GFS2 PATCH 3/3] GFS2: Reduce contention on gfs2_log_lock

2017-01-26 Thread Steven Whitehouse


Hi,


On 25/01/17 19:58, Bob Peterson wrote:

This patch reworks some of the log locks to provide less contention.
This is a very complicated patch, with not a lot of explanation (I know 
there is more detail on the cover letter, but it should be on the patch 
itself, since the cover letter will not go into the git tree). I'm not 
sure I understand the reason for switching the locking order here. The 
original intent was simply to avoid taking the log lock in cases where 
the bh was already in the log.


I would expect gfs2_trans_add_meta to land up looking something like this:

lock_buffer(bh);
if (buffer_pinned(bh)) {
/* If the buffer is pinned, then we know it must have a bd 
attached and be in the log already */

set_bit(TR_TOUCHED, &tr->tr_flags);
goto out;
}
gfs2_log_lock(sdp);

 rest of function unchanged here

gfs2_log_unlock(sdp);
out:
unlock_buffer(bh);
}

That is unless the test in meta_lo_add() "if 
(!list_empty(&bd->bd_list))" is not equivalent to testing 
buffer_pinned() for some reason, but even if that is the case, all we 
need to do is find a better test than "if (buffer_pinned(bh))" I think, 
which at worst case might mean adding an additional flag to the bh which 
would follow the on/off the bd->bd_list state,


Steve.


Signed-off-by: Bob Peterson 
---
  fs/gfs2/aops.c| 23 ++-
  fs/gfs2/glops.c   |  4 ++--
  fs/gfs2/log.c |  6 ++---
  fs/gfs2/lops.c|  3 ++-
  fs/gfs2/meta_io.c | 11 +
  fs/gfs2/trans.c   | 67 +--
  6 files changed, 51 insertions(+), 63 deletions(-)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 5a6f52e..d3cd30d 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -986,23 +986,13 @@ static sector_t gfs2_bmap(struct address_space *mapping, 
sector_t lblock)
  
  static void gfs2_discard(struct gfs2_sbd *sdp, struct buffer_head *bh)

  {
-   struct gfs2_bufdata *bd;
-
lock_buffer(bh);
-   gfs2_log_lock(sdp);
clear_buffer_dirty(bh);
-   bd = bh->b_private;
-   if (bd) {
-   if (!list_empty(&bd->bd_list) && !buffer_pinned(bh))
-   list_del_init(&bd->bd_list);
-   else
-   gfs2_remove_from_journal(bh, REMOVE_JDATA);
-   }
+   gfs2_remove_from_journal(bh, REMOVE_JDATA);
bh->b_bdev = NULL;
clear_buffer_mapped(bh);
clear_buffer_req(bh);
clear_buffer_new(bh);
-   gfs2_log_unlock(sdp);
unlock_buffer(bh);
  }
  
@@ -1157,7 +1147,6 @@ int gfs2_releasepage(struct page *page, gfp_t gfp_mask)

 * on dirty buffers like we used to here again.
 */
  
-	gfs2_log_lock(sdp);

spin_lock(&sdp->sd_ail_lock);
head = bh = page_buffers(page);
do {
@@ -1174,25 +1163,27 @@ int gfs2_releasepage(struct page *page, gfp_t gfp_mask)
  
  	head = bh = page_buffers(page);

do {
+   lock_buffer(bh);
bd = bh->b_private;
if (bd) {
gfs2_assert_warn(sdp, bd->bd_bh == bh);
+   gfs2_log_lock(sdp);
if (!list_empty(&bd->bd_list))
list_del_init(&bd->bd_list);
+   gfs2_log_unlock(sdp);
bd->bd_bh = NULL;
bh->b_private = NULL;
-   kmem_cache_free(gfs2_bufdata_cachep, bd);
}
-
+   unlock_buffer(bh);
+   if (bd)
+   kmem_cache_free(gfs2_bufdata_cachep, bd);
bh = bh->b_this_page;
} while (bh != head);
-   gfs2_log_unlock(sdp);
  
  	return try_to_free_buffers(page);
  
  cannot_release:

spin_unlock(&sdp->sd_ail_lock);
-   gfs2_log_unlock(sdp);
return 0;
  }
  
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c

index 5db59d4..fced99b 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -61,7 +61,6 @@ static void __gfs2_ail_flush(struct gfs2_glock *gl, bool 
fsync,
struct buffer_head *bh;
const unsigned long b_state = (1UL << BH_Dirty)|(1UL << BH_Pinned)|(1UL 
<< BH_Lock);
  
-	gfs2_log_lock(sdp);

spin_lock(&sdp->sd_ail_lock);
list_for_each_entry_safe_reverse(bd, tmp, head, bd_ail_gl_list) {
if (nr_revokes == 0)
@@ -72,12 +71,13 @@ static void __gfs2_ail_flush(struct gfs2_glock *gl, bool 
fsync,
continue;
gfs2_ail_error(gl, bh);
}
+   gfs2_log_lock(sdp);
gfs2_trans_add_revoke(sdp, bd);
+   gfs2_log_unlock(sdp);
nr_revokes--;
}
GLOCK_BUG_ON(gl, !fsync && atomic_read(&gl->gl_ail_count));
spin_unlock(&sdp->sd_ail_lock);
-   gfs2_log_unlock(sdp);
  }
  
  
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c

index 4fb76c0..4d5f3a1 100644
--- a/fs/gfs2/log.c
+++ b/fs

Re: [Cluster-devel] [GFS2 PATCH 2/3] GFS2: Inline function meta_lo_add

2017-01-26 Thread Steven Whitehouse


Hi,

Acked-by: Steven Whitehouse 

Steve.


On 25/01/17 19:58, Bob Peterson wrote:

This patch simply combines function meta_lo_add with its only
caller, trans_add_meta. This makes the code easier to read and
will make it easier to reduce contention on gfs2_log_lock.

Signed-off-by: Bob Peterson 
---
  fs/gfs2/trans.c | 49 ++---
  1 file changed, 22 insertions(+), 27 deletions(-)

diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 5d7f5b0..483c5a7 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -195,16 +195,35 @@ void gfs2_trans_add_data(struct gfs2_glock *gl, struct 
buffer_head *bh)
unlock_buffer(bh);
  }
  
-static void meta_lo_add(struct gfs2_sbd *sdp, struct gfs2_bufdata *bd)

+void gfs2_trans_add_meta(struct gfs2_glock *gl, struct buffer_head *bh)
  {
+
+   struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
+   struct gfs2_bufdata *bd;
struct gfs2_meta_header *mh;
struct gfs2_trans *tr;
enum gfs2_freeze_state state = atomic_read(&sdp->sd_freeze_state);
  
+	lock_buffer(bh);

+   gfs2_log_lock(sdp);
+   bd = bh->b_private;
+   if (bd == NULL) {
+   gfs2_log_unlock(sdp);
+   unlock_buffer(bh);
+   lock_page(bh->b_page);
+   if (bh->b_private == NULL)
+   bd = gfs2_alloc_bufdata(gl, bh, &gfs2_buf_lops);
+   else
+   bd = bh->b_private;
+   unlock_page(bh->b_page);
+   lock_buffer(bh);
+   gfs2_log_lock(sdp);
+   }
+   gfs2_assert(sdp, bd->bd_gl == gl);
tr = current->journal_info;
set_bit(TR_TOUCHED, &tr->tr_flags);
if (!list_empty(&bd->bd_list))
-   return;
+   goto out_unlock;
set_bit(GLF_LFLUSH, &bd->bd_gl->gl_flags);
set_bit(GLF_DIRTY, &bd->bd_gl->gl_flags);
mh = (struct gfs2_meta_header *)bd->bd_bh->b_data;
@@ -222,31 +241,7 @@ static void meta_lo_add(struct gfs2_sbd *sdp, struct 
gfs2_bufdata *bd)
mh->mh_jid = cpu_to_be32(sdp->sd_jdesc->jd_jid);
list_add(&bd->bd_list, &tr->tr_buf);
tr->tr_num_buf_new++;
-}
-
-void gfs2_trans_add_meta(struct gfs2_glock *gl, struct buffer_head *bh)
-{
-
-   struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
-   struct gfs2_bufdata *bd;
-
-   lock_buffer(bh);
-   gfs2_log_lock(sdp);
-   bd = bh->b_private;
-   if (bd == NULL) {
-   gfs2_log_unlock(sdp);
-   unlock_buffer(bh);
-   lock_page(bh->b_page);
-   if (bh->b_private == NULL)
-   bd = gfs2_alloc_bufdata(gl, bh, &gfs2_buf_lops);
-   else
-   bd = bh->b_private;
-   unlock_page(bh->b_page);
-   lock_buffer(bh);
-   gfs2_log_lock(sdp);
-   }
-   gfs2_assert(sdp, bd->bd_gl == gl);
-   meta_lo_add(sdp, bd);
+out_unlock:
gfs2_log_unlock(sdp);
unlock_buffer(bh);
  }

Re: [Cluster-devel] [GFS2 PATCH 1/3] GFS2: Switch tr_touched to flag in transaction

2017-01-26 Thread Steven Whitehouse


Hi,

This looks like a good change to do anyway, irrespective of the other 
patches.


Acked-by: Steven Whitehouse 

Steve.


On 25/01/17 19:58, Bob Peterson wrote:

This patch eliminates the int variable tr_touched in favor of a
new flag in the transaction. This is a step toward reducing contention
on the gfs2_log_lock spin_lock.

Signed-off-by: Bob Peterson 
---
  fs/gfs2/incore.h  | 10 +++---
  fs/gfs2/log.c |  6 +++---
  fs/gfs2/meta_io.c |  6 +++---
  fs/gfs2/trans.c   | 19 ++-
  4 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 00d8dc2..c45084a 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -470,15 +470,19 @@ struct gfs2_quota_data {
struct rcu_head qd_rcu;
  };
  
+enum {

+   TR_TOUCHED = 1,
+   TR_ATTACHED = 2,
+   TR_ALLOCED = 3,
+};
+
  struct gfs2_trans {
unsigned long tr_ip;
  
  	unsigned int tr_blocks;

unsigned int tr_revokes;
unsigned int tr_reserved;
-   unsigned int tr_touched:1;
-   unsigned int tr_attached:1;
-   unsigned int tr_alloced:1;
+   unsigned long tr_flags;
  
  	unsigned int tr_num_buf_new;

unsigned int tr_num_databuf_new;
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 5028a9d..4fb76c0 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -799,7 +799,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock 
*gl,
  
  static void gfs2_merge_trans(struct gfs2_trans *old, struct gfs2_trans *new)

  {
-   WARN_ON_ONCE(old->tr_attached != 1);
+   WARN_ON_ONCE(!test_bit(TR_ATTACHED, &old->tr_flags));
  
  	old->tr_num_buf_new	+= new->tr_num_buf_new;

old->tr_num_databuf_new  += new->tr_num_databuf_new;
@@ -823,9 +823,9 @@ static void log_refund(struct gfs2_sbd *sdp, struct 
gfs2_trans *tr)
if (sdp->sd_log_tr) {
gfs2_merge_trans(sdp->sd_log_tr, tr);
} else if (tr->tr_num_buf_new || tr->tr_num_databuf_new) {
-   gfs2_assert_withdraw(sdp, tr->tr_alloced);
+   gfs2_assert_withdraw(sdp, test_bit(TR_ALLOCED, &tr->tr_flags));
sdp->sd_log_tr = tr;
-   tr->tr_attached = 1;
+   set_bit(TR_ATTACHED, &tr->tr_flags);
}
  
  	sdp->sd_log_commited_revoke += tr->tr_num_revoke - tr->tr_num_revoke_rm;

diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c
index 373639a5..a88a347 100644
--- a/fs/gfs2/meta_io.c
+++ b/fs/gfs2/meta_io.c
@@ -293,7 +293,7 @@ int gfs2_meta_read(struct gfs2_glock *gl, u64 blkno, int 
flags,
wait_on_buffer(bh);
if (unlikely(!buffer_uptodate(bh))) {
struct gfs2_trans *tr = current->journal_info;
-   if (tr && tr->tr_touched)
+   if (tr && test_bit(TR_TOUCHED, &tr->tr_flags))
gfs2_io_error_bh(sdp, bh);
brelse(bh);
*bhp = NULL;
@@ -320,7 +320,7 @@ int gfs2_meta_wait(struct gfs2_sbd *sdp, struct buffer_head 
*bh)
  
  	if (!buffer_uptodate(bh)) {

struct gfs2_trans *tr = current->journal_info;
-   if (tr && tr->tr_touched)
+   if (tr && test_bit(TR_TOUCHED, &tr->tr_flags))
gfs2_io_error_bh(sdp, bh);
return -EIO;
}
@@ -346,7 +346,7 @@ void gfs2_remove_from_journal(struct buffer_head *bh, int 
meta)
tr->tr_num_buf_rm++;
else
tr->tr_num_databuf_rm++;
-   tr->tr_touched = 1;
+   set_bit(TR_TOUCHED, &tr->tr_flags);
was_pinned = 1;
brelse(bh);
}
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 0c1bde3..5d7f5b0 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -48,7 +48,7 @@ int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int 
blocks,
tr->tr_blocks = blocks;
tr->tr_revokes = revokes;
tr->tr_reserved = 1;
-   tr->tr_alloced = 1;
+   set_bit(TR_ALLOCED, &tr->tr_flags);
if (blocks)
tr->tr_reserved += 6 + blocks;
if (revokes)
@@ -78,7 +78,8 @@ static void gfs2_print_trans(const struct gfs2_trans *tr)
  {
pr_warn("Transaction created at: %pSR\n", (void *)tr->tr_ip);
pr_warn("blocks=%u revokes=%u reserved=%u touched=%u\n",
-   tr->tr_blocks, tr->tr_revokes, tr->tr_reserved, tr->tr_touched);
+   tr->tr_blocks, tr->tr_revokes, tr->tr_reserved,
+   test_bit(TR_TOUCHED, &tr->tr_flags));
pr_warn("Buf %u/%u Databuf %u/%u Revoke %u/%u\n",
tr->tr_num_buf_new, tr->tr_num_buf_rm,
tr->tr_num_databuf_new, tr->tr_num_databuf_rm,
@@ -89,12 +90,12 @@ void gfs2_trans_end(struct gfs2_sbd *sdp)
  {
struct gfs2_trans *tr = current->journal_info;
s64 nbuf;
-   int alloced = tr->tr_alloced;
+   int alloced = test_bit(TR_ALLOCED, &tr->tr_flags);
  
  	BUG_ON(!tr);

current->journal_info = NULL;
  
-	if (!tr->tr_t

Re: [Cluster-devel] [PATCH] fs/dlm: Fix kernel memory disclosure

2017-01-26 Thread Steven Whitehouse


Hi,


On 26/01/17 08:54, Vlad Tsyrklevich wrote:

Hello, I wanted to ping the list and see if this could get a review.

On Mon, Jan 9, 2017 at 8:27 PM, Vlad Tsyrklevich  wrote:

Clear the 'unused' field to avoid leaking memory to userland in
copy_result_to_user().

Signed-off-by: Vlad Tsyrklevich 
---
  fs/dlm/user.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/fs/dlm/user.c b/fs/dlm/user.c
index 1ce908c..0570711 100644
--- a/fs/dlm/user.c
+++ b/fs/dlm/user.c
@@ -138,6 +138,8 @@ static void compat_output(struct dlm_lock_result *res,
 res32->lksb.sb_flags = res->lksb.sb_flags;
 res32->lksb.sb_lkid = res->lksb.sb_lkid;
 res32->lksb.sb_lvbptr = (__u32)(long)res->lksb.sb_lvbptr;
+
+   memset(&res32->unused, 0, sizeof(res32->unused));
  }
  #endif

--
2.7.0

It looks like struct dlm_lksb32 has a hole in it, so it would be safer 
just to zero the whole of the dlm_lock_result32 before it is written to, 
rather than trying to find all the holes individually, even if slightly 
slower (I'm not sure it would be noticeable in reality though)


Steve.

Re: [Cluster-devel] [PATCH] fs/dlm: Fix kernel memory disclosure

2017-01-26 Thread Vlad Tsyrklevich

Hello, I wanted to ping the list and see if this could get a review.

On Mon, Jan 9, 2017 at 8:27 PM, Vlad Tsyrklevich  wrote:
> Clear the 'unused' field to avoid leaking memory to userland in
> copy_result_to_user().
>
> Signed-off-by: Vlad Tsyrklevich 
> ---
>  fs/dlm/user.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/fs/dlm/user.c b/fs/dlm/user.c
> index 1ce908c..0570711 100644
> --- a/fs/dlm/user.c
> +++ b/fs/dlm/user.c
> @@ -138,6 +138,8 @@ static void compat_output(struct dlm_lock_result *res,
> res32->lksb.sb_flags = res->lksb.sb_flags;
> res32->lksb.sb_lkid = res->lksb.sb_lkid;
> res32->lksb.sb_lvbptr = (__u32)(long)res->lksb.sb_lvbptr;
> +
> +   memset(&res32->unused, 0, sizeof(res32->unused));
>  }
>  #endif
>
> --
> 2.7.0
>

Re: [Cluster-devel] [PATCH 8/8] Revert "ext4: fix wrong gfp type under transaction"

Re: [Cluster-devel] [PATCH 0/5] mkfs.gfs2 device topology-related fixes

[Cluster-devel] [PATCH 5/5] gfs2-utils tests: Add tests for device topology handling in mkfs.gfs2

[Cluster-devel] [PATCH 3/5] mkfs.gfs2: Disable rgrp alignment when dev topology is unsuitable

[Cluster-devel] [PATCH 0/5] mkfs.gfs2 device topology-related fixes

[Cluster-devel] [PATCH 2/5] mkfs.gfs2: Warn when device is misaligned

[Cluster-devel] [PATCH 4/5] mkfs.gfs2: Disregard device topology if zero-sized sectors reported

[Cluster-devel] [PATCH 1/5] mkfs.gfs2: Add an extended option for device topology testing

Re: [Cluster-devel] [GFS2 PATCH 3/3] GFS2: Reduce contention on gfs2_log_lock

Re: [Cluster-devel] [GFS2 PATCH 2/3] GFS2: Inline function meta_lo_add

Re: [Cluster-devel] [GFS2 PATCH 1/3] GFS2: Switch tr_touched to flag in transaction

Re: [Cluster-devel] [PATCH] fs/dlm: Fix kernel memory disclosure

Re: [Cluster-devel] [PATCH] fs/dlm: Fix kernel memory disclosure

13 matches

Site Navigation

Mail list logo

Footer information