Re: [PATCH] Btrfs: skip commit transaction if we don't have enough pinned bytes

2017-06-03 Thread Holger Hoffstätte
On 06/02/17 20:14, Omar Sandoval wrote:
> On Fri, May 19, 2017 at 11:39:15AM -0600, Liu Bo wrote:
>> We commit transaction in order to reclaim space from pinned bytes because
>> it could process delayed refs, and in may_commit_transaction(), we check
>> first if pinned bytes are enough for the required space, we then check if
>> that plus bytes reserved for delayed insert are enough for the required
>> space.
>>
>> This changes the code to the above logic.
>>
>> Signed-off-by: Liu Bo 
>> ---
>>  fs/btrfs/extent-tree.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>> index e390451c72e6..bded1ddd1bb6 100644
>> --- a/fs/btrfs/extent-tree.c
>> +++ b/fs/btrfs/extent-tree.c
>> @@ -4837,7 +4837,7 @@ static int may_commit_transaction(struct btrfs_fs_info 
>> *fs_info,
>>  
>>  spin_lock(&delayed_rsv->lock);
>>  if (percpu_counter_compare(&space_info->total_bytes_pinned,
>> -   bytes - delayed_rsv->size) >= 0) {
>> +   bytes - delayed_rsv->size) < 0) {
>>  spin_unlock(&delayed_rsv->lock);
>>  return -ENOSPC;
>>  }
> 
> I found this bug in my latest enospc investigation, too. However, the
> total_bytes_pinned counter is pretty broken. E.g., on my laptop:
> 
> $ sync; grep -H '' /sys/fs/btrfs/*/allocation/*/total_bytes_pinned
> /sys/fs/btrfs/f31cc926-37d3-442d-b50a-10c62d47badc/allocation/data/total_bytes_pinned:48693501952
> /sys/fs/btrfs/f31cc926-37d3-442d-b50a-10c62d47badc/allocation/metadata/total_bytes_pinned:-258146304
> /sys/fs/btrfs/f31cc926-37d3-442d-b50a-10c62d47badc/allocation/system/total_bytes_pinned:0
> 
> I have a patch to fix it that I haven't cleaned up yet, below. Without
> it, Bo's fix will probably cause early ENOSPCs. Dave, should we pull
> Bo's patch out of for-next? In any case, I'll get my fix sent out.
[..patch snipped..]

This made me curious since I also found the underflowed metadata counter
on my system. I tried to reproduce it after un/remount (to reset the counters)
and noticed that I could reliably cause the metadata underflow by defragging
a few large subvolumes, which apparently creates enough extent tree movement
that the counter quickly goes bananas. It took some backporting, but with your
patch applied I can defrag away and have so far not seen a single counter
underflow; all of data/metadata/system are always positive after writes and
then reliably drop to 0 after sync/commit. Nice!
Just thought you'd want to know.

Holger
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
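
For readers following the thread, the check that Liu Bo's patch describes can be sketched as a small userspace model. This is only an illustration under stated assumptions: demo_commit_may_help() and its parameters are hypothetical names, not kernel symbols, and the model leaves out the locking and any other conditions the real may_commit_transaction() applies.

```c
#include <stdint.h>

/*
 * Userspace model of the decision described in the patch; not kernel code.
 * Returns 1 if committing the transaction could free enough space,
 * 0 if the caller should give up with -ENOSPC.
 * All names here are hypothetical.
 */
int demo_commit_may_help(int64_t pinned, int64_t bytes, int64_t delayed_size)
{
	/* Enough pinned bytes on their own: a commit can satisfy the request. */
	if (pinned >= bytes)
		return 1;

	/* Otherwise count the delayed reservation towards the request too. */
	if (pinned < bytes - delayed_size)
		return 0;

	return 1;
}
```

With a counter that has underflowed to a negative value, as in the metadata line of Omar's sysfs output, the model always answers 0, which is consistent with his concern about early ENOSPCs.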


[PATCH v4] btrfs-progs: btrfs-convert: Add larger device support

2017-06-03 Thread Lakshmipathi.G
With a larger file system (in this case it's 22TB), ext2fs_open() returns the
EXT2_ET_CANT_USE_LEGACY_BITMAPS error with ext2fs_read_block_bitmap().

To overcome this issue, we need to (a) pass the EXT2_FLAG_64BITS flag to
ext2fs_open, (b) use 64-bit functions like ext2fs_get_block_bitmap_range2,
ext2fs_inode_data_blocks2 and ext2fs_read_ext_attr2, and (c) use 64-bit types
for the btrfs_convert_context fields.

bug: https://bugzilla.kernel.org/show_bug.cgi?id=194795
Signed-off-by: Lakshmipathi.G 
---
 convert/common.h  |  8 
 convert/source-ext2.c | 11 ++-
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/convert/common.h b/convert/common.h
index 0d3adea..2f4ea48 100644
--- a/convert/common.h
+++ b/convert/common.h
@@ -30,10 +30,10 @@ struct btrfs_mkfs_config;
 
 struct btrfs_convert_context {
u32 blocksize;
-   u32 first_data_block;
-   u32 block_count;
-   u32 inodes_count;
-   u32 free_inodes_count;
+   u64 first_data_block;
+   u64 block_count;
+   u64 inodes_count;
+   u64 free_inodes_count;
u64 total_bytes;
char *volume_name;
const struct btrfs_convert_operations *convert_ops;
diff --git a/convert/source-ext2.c b/convert/source-ext2.c
index 1b0576b..275cb89 100644
--- a/convert/source-ext2.c
+++ b/convert/source-ext2.c
@@ -34,8 +34,9 @@ static int ext2_open_fs(struct btrfs_convert_context *cctx, 
const char *name)
ext2_filsys ext2_fs;
ext2_ino_t ino;
u32 ro_feature;
+   int open_flag = EXT2_FLAG_SOFTSUPP_FEATURES | EXT2_FLAG_64BITS;
 
-   ret = ext2fs_open(name, 0, 0, 0, unix_io_manager, &ext2_fs);
+   ret = ext2fs_open(name, open_flag, 0, 0, unix_io_manager, &ext2_fs);
if (ret) {
fprintf(stderr, "ext2fs_open: %s\n", error_message(ret));
return -1;
@@ -148,7 +149,7 @@ static int ext2_read_used_space(struct 
btrfs_convert_context *cctx)
return -ENOMEM;
 
for (i = 0; i < fs->group_desc_count; i++) {
-   ret = ext2fs_get_block_bitmap_range(fs->block_map, blk_itr,
+   ret = ext2fs_get_block_bitmap_range2(fs->block_map, blk_itr,
block_nbytes * 8, block_bitmap);
if (ret) {
error("fail to get bitmap from ext2, %s",
@@ -353,7 +354,7 @@ static int ext2_create_symlink(struct btrfs_trans_handle 
*trans,
int ret;
char *pathname;
u64 inode_size = btrfs_stack_inode_size(btrfs_inode);
-   if (ext2fs_inode_data_blocks(ext2_fs, ext2_inode)) {
+   if (ext2fs_inode_data_blocks2(ext2_fs, ext2_inode)) {
btrfs_set_stack_inode_size(btrfs_inode, inode_size + 1);
ret = ext2_create_file_extents(trans, root, objectid,
btrfs_inode, ext2_fs, ext2_ino,
@@ -627,9 +628,9 @@ static int ext2_copy_extended_attrs(struct 
btrfs_trans_handle *trans,
ret = -ENOMEM;
goto out;
}
-   err = ext2fs_read_ext_attr(ext2_fs, ext2_inode->i_file_acl, buffer);
+   err = ext2fs_read_ext_attr2(ext2_fs, ext2_inode->i_file_acl, buffer);
if (err) {
-   fprintf(stderr, "ext2fs_read_ext_attr: %s\n",
+   fprintf(stderr, "ext2fs_read_ext_attr2: %s\n",
error_message(err));
ret = -1;
goto out;
-- 
2.7.4
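
As a rough illustration of why the u32 fields in btrfs_convert_context are widened, the arithmetic can be sketched as follows. The assumptions are mine, not the patch's: a 4 KiB block size, "22TB" read as 22 TiB, and hypothetical helper names.

```c
#include <stdint.h>

/* Number of blocks in a file system of total_bytes with the given block size. */
uint64_t demo_block_count(uint64_t total_bytes, uint32_t blocksize)
{
	return total_bytes / blocksize;
}

/* Non-zero if the value can be stored in a 32-bit field without loss. */
int demo_fits_in_u32(uint64_t value)
{
	return value <= UINT32_MAX;
}

/* What a u32 field would actually hold after an implicit narrowing. */
uint32_t demo_truncate_u32(uint64_t value)
{
	return (uint32_t)value;
}
```

Under these assumptions, demo_block_count(22ULL << 40, 4096) is 5905580032, which does not fit in 32 bits; a u32 field would silently hold 1610612736 instead.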



[PATCH 06/13] fs: simplify dio_bio_complete

2017-06-03 Thread Christoph Hellwig
Only read bio->bi_error once in the common path.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Bart Van Assche 
---
 fs/direct-io.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index 04247a6c3f73..bb711e4b86c2 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -477,13 +477,12 @@ static int dio_bio_complete(struct dio *dio, struct bio 
*bio)
 {
struct bio_vec *bvec;
unsigned i;
-   int err;
+   int err = bio->bi_error;
 
-   if (bio->bi_error)
+   if (err)
dio->io_error = -EIO;
 
if (dio->is_async && dio->op == REQ_OP_READ && dio->should_dirty) {
-   err = bio->bi_error;
bio_check_pages_dirty(bio); /* transfers ownership */
} else {
bio_for_each_segment_all(bvec, bio, i) {
@@ -494,7 +493,6 @@ static int dio_bio_complete(struct dio *dio, struct bio 
*bio)
set_page_dirty_lock(page);
put_page(page);
}
-   err = bio->bi_error;
bio_put(bio);
}
return err;
-- 
2.11.0



[PATCH 10/13] dm: change ->end_io calling convention

2017-06-03 Thread Christoph Hellwig
Turn the error parameter into a pointer so that target drivers can change
the value, and make sure only DM_ENDIO_* values are returned from the
methods.

Signed-off-by: Christoph Hellwig 
---
 drivers/md/dm-cache-target.c  |  4 ++--
 drivers/md/dm-flakey.c|  8 
 drivers/md/dm-log-writes.c|  4 ++--
 drivers/md/dm-mpath.c | 11 ++-
 drivers/md/dm-raid1.c | 14 +++---
 drivers/md/dm-snap.c  |  4 ++--
 drivers/md/dm-stripe.c| 14 +++---
 drivers/md/dm-thin.c  |  4 ++--
 drivers/md/dm.c   | 36 ++--
 include/linux/device-mapper.h |  2 +-
 10 files changed, 51 insertions(+), 50 deletions(-)

diff --git a/drivers/md/dm-cache-target.c b/drivers/md/dm-cache-target.c
index d682a0511381..c48612e6d525 100644
--- a/drivers/md/dm-cache-target.c
+++ b/drivers/md/dm-cache-target.c
@@ -2820,7 +2820,7 @@ static int cache_map(struct dm_target *ti, struct bio 
*bio)
return r;
 }
 
-static int cache_end_io(struct dm_target *ti, struct bio *bio, int error)
+static int cache_end_io(struct dm_target *ti, struct bio *bio, int *error)
 {
struct cache *cache = ti->private;
unsigned long flags;
@@ -2838,7 +2838,7 @@ static int cache_end_io(struct dm_target *ti, struct bio 
*bio, int error)
bio_drop_shared_lock(cache, bio);
accounted_complete(cache, bio);
 
-   return 0;
+   return DM_ENDIO_DONE;
 }
 
 static int write_dirty_bitset(struct cache *cache)
diff --git a/drivers/md/dm-flakey.c b/drivers/md/dm-flakey.c
index e8f093b323ce..c9539917a59b 100644
--- a/drivers/md/dm-flakey.c
+++ b/drivers/md/dm-flakey.c
@@ -358,12 +358,12 @@ static int flakey_map(struct dm_target *ti, struct bio 
*bio)
return DM_MAPIO_REMAPPED;
 }
 
-static int flakey_end_io(struct dm_target *ti, struct bio *bio, int error)
+static int flakey_end_io(struct dm_target *ti, struct bio *bio, int *error)
 {
struct flakey_c *fc = ti->private;
struct per_bio_data *pb = dm_per_bio_data(bio, sizeof(struct 
per_bio_data));
 
-   if (!error && pb->bio_submitted && (bio_data_dir(bio) == READ)) {
+   if (!*error && pb->bio_submitted && (bio_data_dir(bio) == READ)) {
if (fc->corrupt_bio_byte && (fc->corrupt_bio_rw == READ) &&
all_corrupt_bio_flags_match(bio, fc)) {
/*
@@ -377,11 +377,11 @@ static int flakey_end_io(struct dm_target *ti, struct bio 
*bio, int error)
 * Error read during the down_interval if drop_writes
 * and error_writes were not configured.
 */
-   return -EIO;
+   *error = -EIO;
}
}
 
-   return error;
+   return DM_ENDIO_DONE;
 }
 
 static void flakey_status(struct dm_target *ti, status_type_t type,
diff --git a/drivers/md/dm-log-writes.c b/drivers/md/dm-log-writes.c
index e42264706c59..cc57c7fa1268 100644
--- a/drivers/md/dm-log-writes.c
+++ b/drivers/md/dm-log-writes.c
@@ -664,7 +664,7 @@ static int log_writes_map(struct dm_target *ti, struct bio 
*bio)
return DM_MAPIO_REMAPPED;
 }
 
-static int normal_end_io(struct dm_target *ti, struct bio *bio, int error)
+static int normal_end_io(struct dm_target *ti, struct bio *bio, int *error)
 {
struct log_writes_c *lc = ti->private;
struct per_bio_data *pb = dm_per_bio_data(bio, sizeof(struct 
per_bio_data));
@@ -686,7 +686,7 @@ static int normal_end_io(struct dm_target *ti, struct bio 
*bio, int error)
spin_unlock_irqrestore(&lc->blocks_lock, flags);
}
 
-   return error;
+   return DM_ENDIO_DONE;
 }
 
 /*
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index bf6e49c780d5..ceeeb495d01c 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1517,14 +1517,15 @@ static int multipath_end_io(struct dm_target *ti, 
struct request *clone,
return r;
 }
 
-static int multipath_end_io_bio(struct dm_target *ti, struct bio *clone, int 
error)
+static int multipath_end_io_bio(struct dm_target *ti, struct bio *clone, int 
*error)
 {
struct multipath *m = ti->private;
struct dm_mpath_io *mpio = get_mpio_from_bio(clone);
struct pgpath *pgpath = mpio->pgpath;
unsigned long flags;
+   int r = DM_ENDIO_DONE;
 
-   if (!error || noretry_error(error))
+   if (!*error || noretry_error(*error))
goto done;
 
if (pgpath)
@@ -1533,7 +1534,7 @@ static int multipath_end_io_bio(struct dm_target *ti, 
struct bio *clone, int err
if (atomic_read(&m->nr_valid_paths) == 0 &&
!test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) {
dm_report_EIO(m);
-   error = -EIO;
+   *error = -EIO;
goto done;
}
 
@@ -1546,7 +1547,7 @@ static int multipath_end_io_bio(struct dm_target *ti, 
struct bio *clone, int err
if 
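
The calling convention described in the commit message of 10/13 can be sketched in plain C. This is a userspace model only: the DEMO_ENDIO_* values, demo_end_io() and demo_complete() are hypothetical stand-ins, not the device-mapper definitions.

```c
#include <errno.h>

/* Hypothetical stand-ins for the DM_ENDIO_* return values. */
enum { DEMO_ENDIO_DONE = 0, DEMO_ENDIO_INCOMPLETE = 1 };

/*
 * Hypothetical target hook: it may rewrite *error, but its return value
 * is always one of the DEMO_ENDIO_* codes, never an errno.
 */
int demo_end_io(int *error, int inject_failure)
{
	if (!*error && inject_failure)
		*error = -EIO;
	return DEMO_ENDIO_DONE;
}

/* Hypothetical core: completes with whatever error the hook left behind. */
int demo_complete(int error, int inject_failure)
{
	int r = demo_end_io(&error, inject_failure);

	if (r == DEMO_ENDIO_INCOMPLETE)
		return 0;	/* target kept the I/O; nothing to report yet */
	return error;
}
```

Separating the two channels means the return value only says what to do with the I/O, while the error travels through the pointer.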

[PATCH 12/13] blk-mq: switch ->queue_rq return value to blk_status_t

2017-06-03 Thread Christoph Hellwig
Use the same values that are used for request completion errors as the return
value from ->queue_rq.  BLK_STS_RESOURCE is special cased to cause
a requeue, and all the others are completed as-is.

Signed-off-by: Christoph Hellwig 
---
 block/blk-mq.c| 37 --
 drivers/block/loop.c  |  6 +++---
 drivers/block/mtip32xx/mtip32xx.c | 17 
 drivers/block/nbd.c   | 12 ---
 drivers/block/null_blk.c  |  4 ++--
 drivers/block/rbd.c   |  4 ++--
 drivers/block/virtio_blk.c| 10 +-
 drivers/block/xen-blkfront.c  |  8 
 drivers/md/dm-rq.c|  8 
 drivers/mtd/ubi/block.c   |  6 +++---
 drivers/nvme/host/core.c  | 14 ++---
 drivers/nvme/host/fc.c| 23 +++--
 drivers/nvme/host/nvme.h  |  2 +-
 drivers/nvme/host/pci.c   | 42 +++
 drivers/nvme/host/rdma.c  | 26 +---
 drivers/nvme/target/loop.c| 17 
 drivers/scsi/scsi_lib.c   | 30 ++--
 include/linux/blk-mq.h|  7 ++-
 18 files changed, 131 insertions(+), 142 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index adcc1c0dce6e..7af78b1e9db9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -924,7 +924,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, 
struct list_head *list)
 {
struct blk_mq_hw_ctx *hctx;
struct request *rq;
-   int errors, queued, ret = BLK_MQ_RQ_QUEUE_OK;
+   int errors, queued;
 
if (list_empty(list))
return false;
@@ -935,6 +935,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, 
struct list_head *list)
errors = queued = 0;
do {
struct blk_mq_queue_data bd;
+   blk_status_t ret;
 
rq = list_first_entry(list, struct request, queuelist);
if (!blk_mq_get_driver_tag(rq, &hctx, false)) {
@@ -975,25 +976,20 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, 
struct list_head *list)
}
 
ret = q->mq_ops->queue_rq(hctx, &bd);
-   switch (ret) {
-   case BLK_MQ_RQ_QUEUE_OK:
-   queued++;
-   break;
-   case BLK_MQ_RQ_QUEUE_BUSY:
+   if (ret == BLK_STS_RESOURCE) {
blk_mq_put_driver_tag_hctx(hctx, rq);
list_add(&rq->queuelist, list);
__blk_mq_requeue_request(rq);
break;
-   default:
-   pr_err("blk-mq: bad return on queue: %d\n", ret);
-   case BLK_MQ_RQ_QUEUE_ERROR:
+   }
+
+   if (unlikely(ret != BLK_STS_OK)) {
errors++;
blk_mq_end_request(rq, BLK_STS_IOERR);
-   break;
+   continue;
}
 
-   if (ret == BLK_MQ_RQ_QUEUE_BUSY)
-   break;
+   queued++;
} while (!list_empty(list));
 
hctx->dispatched[queued_to_index(queued)]++;
@@ -1031,7 +1027,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, 
struct list_head *list)
 * - blk_mq_run_hw_queue() checks whether or not a queue has
 *   been stopped before rerunning a queue.
 * - Some but not all block drivers stop a queue before
-*   returning BLK_MQ_RQ_QUEUE_BUSY. Two exceptions are scsi-mq
+*   returning BLK_STS_RESOURCE. Two exceptions are scsi-mq
 *   and dm-rq.
 */
if (!blk_mq_sched_needs_restart(hctx) &&
@@ -1410,7 +1406,7 @@ static void __blk_mq_try_issue_directly(struct request 
*rq, blk_qc_t *cookie,
};
struct blk_mq_hw_ctx *hctx;
blk_qc_t new_cookie;
-   int ret;
+   blk_status_t ret;
 
if (q->elevator)
goto insert;
@@ -1426,18 +1422,19 @@ static void __blk_mq_try_issue_directly(struct request 
*rq, blk_qc_t *cookie,
 * would have done
 */
ret = q->mq_ops->queue_rq(hctx, &bd);
-   if (ret == BLK_MQ_RQ_QUEUE_OK) {
+   switch (ret) {
+   case BLK_STS_OK:
*cookie = new_cookie;
return;
-   }
-
-   if (ret == BLK_MQ_RQ_QUEUE_ERROR) {
+   case BLK_STS_RESOURCE:
+   __blk_mq_requeue_request(rq);
+   goto insert;
+   default:
*cookie = BLK_QC_T_NONE;
-   blk_mq_end_request(rq, BLK_STS_IOERR);
+   blk_mq_end_request(rq, ret);
return;
}
 
-   __blk_mq_requeue_request(rq);
 insert:
blk_mq_sched_insert_request(rq, false, true, false, may_sleep);
 }
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
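
The dispatch behaviour described in the commit message of 12/13 (a resource shortage causes a requeue, every other status completes the request) can be modelled in userspace. Everything below is a hypothetical sketch: the DEMO_STS_* values and function names are illustrative and are not the kernel's blk_status_t definitions.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a few blk_status_t values. */
enum demo_status { DEMO_STS_OK, DEMO_STS_RESOURCE, DEMO_STS_IOERR };

struct demo_result {
	int queued;		/* requests the driver accepted */
	int errors;		/* requests completed with an error */
	size_t remaining;	/* requests left on the list for a later run */
};

/* st[i] is the status a driver's queue function would return for request i. */
struct demo_result demo_dispatch(const enum demo_status *st, size_t n)
{
	struct demo_result res = { 0, 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		if (st[i] == DEMO_STS_RESOURCE)
			break;		/* requeue this one and stop dispatching */
		if (st[i] != DEMO_STS_OK) {
			res.errors++;	/* complete it with an error, keep going */
			continue;
		}
		res.queued++;
	}
	res.remaining = n - i;
	return res;
}

/* Fixed example: OK, I/O error, resource shortage, OK. */
struct demo_result demo_example(void)
{
	static const enum demo_status st[] = {
		DEMO_STS_OK, DEMO_STS_IOERR, DEMO_STS_RESOURCE, DEMO_STS_OK,
	};

	return demo_dispatch(st, sizeof(st) / sizeof(st[0]));
}
```

In the fixed example one request is queued, one is failed, and the last two stay on the list because dispatching stops at the resource shortage.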

[PATCH 08/13] dm mpath: merge do_end_io_bio into multipath_end_io_bio

2017-06-03 Thread Christoph Hellwig
This simplifies the code and especially the error passing a bit and
will help with the next patch.

Signed-off-by: Christoph Hellwig 
---
 drivers/md/dm-mpath.c | 42 +++---
 1 file changed, 15 insertions(+), 27 deletions(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 3df056b73b66..6d5ebb76149d 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1510,24 +1510,24 @@ static int multipath_end_io(struct dm_target *ti, 
struct request *clone,
return r;
 }
 
-static int do_end_io_bio(struct multipath *m, struct bio *clone,
-int error, struct dm_mpath_io *mpio)
+static int multipath_end_io_bio(struct dm_target *ti, struct bio *clone, int 
error)
 {
+   struct multipath *m = ti->private;
+   struct dm_mpath_io *mpio = get_mpio_from_bio(clone);
+   struct pgpath *pgpath = mpio->pgpath;
unsigned long flags;
 
-   if (!error)
-   return 0;   /* I/O complete */
-
-   if (noretry_error(error))
-   return error;
+   if (!error || noretry_error(error))
+   goto done;
 
-   if (mpio->pgpath)
-   fail_path(mpio->pgpath);
+   if (pgpath)
+   fail_path(pgpath);
 
if (atomic_read(&m->nr_valid_paths) == 0 &&
!test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) {
dm_report_EIO(m);
-   return -EIO;
+   error = -EIO;
+   goto done;
}
 
/* Queue for the daemon to resubmit */
@@ -1539,28 +1539,16 @@ static int do_end_io_bio(struct multipath *m, struct 
bio *clone,
if (!test_bit(MPATHF_QUEUE_IO, &m->flags))
queue_work(kmultipathd, &m->process_queued_bios);
 
-   return DM_ENDIO_INCOMPLETE;
-}
-
-static int multipath_end_io_bio(struct dm_target *ti, struct bio *clone, int 
error)
-{
-   struct multipath *m = ti->private;
-   struct dm_mpath_io *mpio = get_mpio_from_bio(clone);
-   struct pgpath *pgpath;
-   struct path_selector *ps;
-   int r;
-
-   BUG_ON(!mpio);
-
-   r = do_end_io_bio(m, clone, error, mpio);
-   pgpath = mpio->pgpath;
+   error = DM_ENDIO_INCOMPLETE;
+done:
if (pgpath) {
-   ps = &pgpath->pg->ps;
+   struct path_selector *ps = &pgpath->pg->ps;
+
if (ps->type->end_io)
ps->type->end_io(ps, &pgpath->path, mpio->nr_bytes);
}
 
-   return r;
+   return error;
 }
 
 /*
-- 
2.11.0



[PATCH 01/13] nvme-lightnvm: use blk_execute_rq in nvme_nvm_submit_user_cmd

2017-06-03 Thread Christoph Hellwig
Instead of reinventing it poorly.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Bart Van Assche 
Reviewed-by: Javier González 
---
 drivers/nvme/host/lightnvm.c | 12 +---
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/drivers/nvme/host/lightnvm.c b/drivers/nvme/host/lightnvm.c
index f5df78ed1e10..f3885b5e56bd 100644
--- a/drivers/nvme/host/lightnvm.c
+++ b/drivers/nvme/host/lightnvm.c
@@ -571,13 +571,6 @@ static struct nvm_dev_ops nvme_nvm_dev_ops = {
.max_phys_sect  = 64,
 };
 
-static void nvme_nvm_end_user_vio(struct request *rq, int error)
-{
-   struct completion *waiting = rq->end_io_data;
-
-   complete(waiting);
-}
-
 static int nvme_nvm_submit_user_cmd(struct request_queue *q,
struct nvme_ns *ns,
struct nvme_nvm_command *vcmd,
@@ -608,7 +601,6 @@ static int nvme_nvm_submit_user_cmd(struct request_queue *q,
rq->timeout = timeout ? timeout : ADMIN_TIMEOUT;
 
rq->cmd_flags &= ~REQ_FAILFAST_DRIVER;
-   rq->end_io_data = &wait;
 
if (ppa_buf && ppa_len) {
ppa_list = dma_pool_alloc(dev->dma_pool, GFP_KERNEL, &ppa_dma);
@@ -662,9 +654,7 @@ static int nvme_nvm_submit_user_cmd(struct request_queue *q,
}
 
 submit:
-   blk_execute_rq_nowait(q, NULL, rq, 0, nvme_nvm_end_user_vio);
-
-   wait_for_completion_io(&wait);
+   blk_execute_rq(q, NULL, rq, 0);
 
if (nvme_req(rq)->flags & NVME_REQ_CANCELLED)
ret = -EINTR;
-- 
2.11.0



[PATCH 07/13] block_dev: propagate bio_iov_iter_get_pages error in __blkdev_direct_IO

2017-06-03 Thread Christoph Hellwig
Once we move the block layer to its own status code we'll still want to
propagate the bio_iov_iter_get_pages error, so restructure __blkdev_direct_IO
to take ret into account when returning the errno.

Signed-off-by: Christoph Hellwig 
---
 fs/block_dev.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 51959936..c1dc393ad6b9 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -334,7 +334,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter 
*iter, int nr_pages)
bool is_read = (iov_iter_rw(iter) == READ), is_sync;
loff_t pos = iocb->ki_pos;
blk_qc_t qc = BLK_QC_T_NONE;
-   int ret;
+   int ret = 0;
 
if ((pos | iov_iter_alignment(iter)) &
(bdev_logical_block_size(bdev) - 1))
@@ -363,7 +363,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter 
*iter, int nr_pages)
 
ret = bio_iov_iter_get_pages(bio, iter);
if (unlikely(ret)) {
-   bio->bi_error = ret;
+   bio->bi_error = -EIO;
bio_endio(bio);
break;
}
@@ -412,7 +412,8 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter 
*iter, int nr_pages)
}
__set_current_state(TASK_RUNNING);
 
-   ret = dio->bio.bi_error;
+   if (!ret)
+   ret = dio->bio.bi_error;
if (likely(!ret))
ret = dio->size;
 
-- 
2.11.0
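
The return-value precedence that 07/13 sets up at the end of __blkdev_direct_IO can be sketched as a standalone function. This is a simplified model: demo_direct_io_result() and its parameters are hypothetical, and the surrounding loop and bio handling are left out.

```c
#include <errno.h>

/*
 * Model of the final "which value do we return" logic:
 * 1. an error from collecting the user pages wins,
 * 2. otherwise an error recorded on the bio,
 * 3. otherwise the number of bytes transferred.
 */
long demo_direct_io_result(int get_pages_ret, int bio_error, long size)
{
	long ret = get_pages_ret;

	if (!ret)
		ret = bio_error;
	if (!ret)
		ret = size;
	return ret;
}
```

For example, demo_direct_io_result(-EFAULT, -EIO, 4096) yields -EFAULT, so the caller sees the original errno rather than the generic -EIO stored in the bio.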



[PATCH 09/13] dm: don't return errnos from ->map

2017-06-03 Thread Christoph Hellwig
Instead use the special DM_MAPIO_KILL return value to return -EIO just
like we do for the request based path.  Note that dm-log-writes returned
-ENOMEM in a few places, which now becomes -EIO instead.  No consumer
treats -ENOMEM special so this shouldn't be an issue (and it should
use a mempool to start with to make guaranteed progress).

Signed-off-by: Christoph Hellwig 
---
 drivers/md/dm-crypt.c |  4 ++--
 drivers/md/dm-flakey.c|  4 ++--
 drivers/md/dm-integrity.c | 12 ++--
 drivers/md/dm-log-writes.c|  4 ++--
 drivers/md/dm-mpath.c | 13 ++---
 drivers/md/dm-raid1.c |  6 +++---
 drivers/md/dm-snap.c  |  8 
 drivers/md/dm-target.c|  2 +-
 drivers/md/dm-verity-target.c |  6 +++---
 drivers/md/dm-zero.c  |  4 ++--
 drivers/md/dm.c   | 16 +++-
 11 files changed, 46 insertions(+), 33 deletions(-)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index ebf9e72d479b..f4b51809db21 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -2795,10 +2795,10 @@ static int crypt_map(struct dm_target *ti, struct bio 
*bio)
 * and is aligned to this size as defined in IO hints.
 */
if (unlikely((bio->bi_iter.bi_sector & ((cc->sector_size >> 
SECTOR_SHIFT) - 1)) != 0))
-   return -EIO;
+   return DM_MAPIO_KILL;
 
if (unlikely(bio->bi_iter.bi_size & (cc->sector_size - 1)))
-   return -EIO;
+   return DM_MAPIO_KILL;
 
io = dm_per_bio_data(bio, cc->per_bio_data_size);
crypt_io_init(io, cc, bio, dm_target_offset(ti, 
bio->bi_iter.bi_sector));
diff --git a/drivers/md/dm-flakey.c b/drivers/md/dm-flakey.c
index 13305a182611..e8f093b323ce 100644
--- a/drivers/md/dm-flakey.c
+++ b/drivers/md/dm-flakey.c
@@ -321,7 +321,7 @@ static int flakey_map(struct dm_target *ti, struct bio *bio)
if (bio_data_dir(bio) == READ) {
if (!fc->corrupt_bio_byte && !test_bit(DROP_WRITES, 
&fc->flags) &&
!test_bit(ERROR_WRITES, &fc->flags))
-   return -EIO;
+   return DM_MAPIO_KILL;
goto map_bio;
}
 
@@ -349,7 +349,7 @@ static int flakey_map(struct dm_target *ti, struct bio *bio)
/*
 * By default, error all I/O.
 */
-   return -EIO;
+   return DM_MAPIO_KILL;
}
 
 map_bio:
diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index c7f7c8d76576..ee78fb471229 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -1352,13 +1352,13 @@ static int dm_integrity_map(struct dm_target *ti, 
struct bio *bio)
DMERR("Too big sector number: 0x%llx + 0x%x > 0x%llx",
  (unsigned long long)dio->range.logical_sector, 
bio_sectors(bio),
  (unsigned long long)ic->provided_data_sectors);
-   return -EIO;
+   return DM_MAPIO_KILL;
}
if (unlikely((dio->range.logical_sector | bio_sectors(bio)) & 
(unsigned)(ic->sectors_per_block - 1))) {
DMERR("Bio not aligned on %u sectors: 0x%llx, 0x%x",
  ic->sectors_per_block,
  (unsigned long long)dio->range.logical_sector, 
bio_sectors(bio));
-   return -EIO;
+   return DM_MAPIO_KILL;
}
 
if (ic->sectors_per_block > 1) {
@@ -1368,7 +1368,7 @@ static int dm_integrity_map(struct dm_target *ti, struct 
bio *bio)
if (unlikely((bv.bv_offset | bv.bv_len) & 
((ic->sectors_per_block << SECTOR_SHIFT) - 1))) {
DMERR("Bio vector (%u,%u) is not aligned on 
%u-sector boundary",
bv.bv_offset, bv.bv_len, 
ic->sectors_per_block);
-   return -EIO;
+   return DM_MAPIO_KILL;
}
}
}
@@ -1383,18 +1383,18 @@ static int dm_integrity_map(struct dm_target *ti, 
struct bio *bio)
wanted_tag_size *= ic->tag_size;
if (unlikely(wanted_tag_size != bip->bip_iter.bi_size)) 
{
DMERR("Invalid integrity data size %u, expected 
%u", bip->bip_iter.bi_size, wanted_tag_size);
-   return -EIO;
+   return DM_MAPIO_KILL;
}
}
} else {
if (unlikely(bip != NULL)) {
DMERR("Unexpected integrity data when using internal 
hash");
-   return -EIO;
+   return DM_MAPIO_KILL;
}
}
 
if (unlikely(ic->mode == 'R') && unlikely(dio->write))
-   return -EIO;
+   return DM_MAPIO_KILL;
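
The idea behind 09/13, that ->map returns only a disposition code and the core turns the "kill" code into -EIO, can be sketched in userspace. The DEMO_MAPIO_* values and both functions are hypothetical stand-ins rather than the device-mapper definitions, and sectors_per_block is assumed to be a power of two.

```c
#include <errno.h>

/* Hypothetical stand-ins for the DM_MAPIO_* return values. */
enum demo_mapio {
	DEMO_MAPIO_SUBMITTED,
	DEMO_MAPIO_REMAPPED,
	DEMO_MAPIO_REQUEUE,
	DEMO_MAPIO_KILL,
};

/* Hypothetical ->map method: reject I/O that is not block aligned. */
enum demo_mapio demo_map(unsigned long long sector, unsigned int sectors_per_block)
{
	if (sector & (sectors_per_block - 1))
		return DEMO_MAPIO_KILL;	/* no errno escapes from ->map */
	return DEMO_MAPIO_REMAPPED;
}

/* Hypothetical core: only here does the kill code become an errno. */
int demo_map_to_errno(enum demo_mapio r)
{
	switch (r) {
	case DEMO_MAPIO_KILL:
		return -EIO;
	default:
		return 0;
	}
}
```

Because every failure funnels through one code, a target that previously returned -ENOMEM would now surface as -EIO, as the commit message notes for dm-log-writes.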
 
   

[PATCH 05/13] fs: remove the unused error argument to dio_end_io()

2017-06-03 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
Reviewed-by: Bart Van Assche 
---
 fs/btrfs/inode.c   | 6 +++---
 fs/direct-io.c | 3 +--
 include/linux/fs.h | 2 +-
 3 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 17cbe9306faf..758b2666885e 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8244,7 +8244,7 @@ static void btrfs_endio_direct_read(struct bio *bio)
kfree(dip);
 
dio_bio->bi_error = bio->bi_error;
-   dio_end_io(dio_bio, bio->bi_error);
+   dio_end_io(dio_bio);
 
if (io_bio->end_io)
io_bio->end_io(io_bio, err);
@@ -8304,7 +8304,7 @@ static void btrfs_endio_direct_write(struct bio *bio)
kfree(dip);
 
dio_bio->bi_error = bio->bi_error;
-   dio_end_io(dio_bio, bio->bi_error);
+   dio_end_io(dio_bio);
bio_put(bio);
 }
 
@@ -8673,7 +8673,7 @@ static void btrfs_submit_direct(struct bio *dio_bio, 
struct inode *inode,
 * Releases and cleans up our dio_bio, no need to bio_put()
 * nor bio_endio()/bio_io_error() against dio_bio.
 */
-   dio_end_io(dio_bio, ret);
+   dio_end_io(dio_bio);
}
if (io_bio)
bio_put(io_bio);
diff --git a/fs/direct-io.c b/fs/direct-io.c
index a04ebea77de8..04247a6c3f73 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -348,13 +348,12 @@ static void dio_bio_end_io(struct bio *bio)
 /**
  * dio_end_io - handle the end io action for the given bio
  * @bio: The direct io bio thats being completed
- * @error: Error if there was one
  *
  * This is meant to be called by any filesystem that uses their own 
dio_submit_t
  * so that the DIO specific endio actions are dealt with after the filesystem
  * has done it's completion work.
  */
-void dio_end_io(struct bio *bio, int error)
+void dio_end_io(struct bio *bio)
 {
struct dio *dio = bio->bi_private;
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 803e5a9b2654..4388ab58843d 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2843,7 +2843,7 @@ enum {
DIO_SKIP_DIO_COUNT = 0x08,
 };
 
-void dio_end_io(struct bio *bio, int error);
+void dio_end_io(struct bio *bio);
 
 ssize_t __blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
 struct block_device *bdev, struct iov_iter *iter,
-- 
2.11.0



[PATCH 02/13] scsi/osd: don't save block errors into req_results

2017-06-03 Thread Christoph Hellwig
We will only have sense data if the command executed and got a SCSI
result, so this is pointless.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Martin K. Petersen 
---
 drivers/scsi/osd/osd_initiator.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/osd/osd_initiator.c b/drivers/scsi/osd/osd_initiator.c
index 8a1b94816419..14785177ce7b 100644
--- a/drivers/scsi/osd/osd_initiator.c
+++ b/drivers/scsi/osd/osd_initiator.c
@@ -477,7 +477,7 @@ static void _set_error_resid(struct osd_request *or, struct 
request *req,
 int error)
 {
or->async_error = error;
-   or->req_errors = scsi_req(req)->result ? : error;
+   or->req_errors = scsi_req(req)->result;
or->sense_len = scsi_req(req)->sense_len;
if (or->sense_len)
memcpy(or->sense, scsi_req(req)->sense, or->sense_len);
-- 
2.11.0



[PATCH 04/13] dm: fix REQ_RAHEAD handling

2017-06-03 Thread Christoph Hellwig
A few (but not all) dm targets use a special EWOULDBLOCK error code for
failing REQ_RAHEAD requests that fail due to a lack of available resources.
But no one else knows about this magic code, and lower level drivers also
don't generate it when failing read-ahead requests for similar reasons.

So remove this special casing and ignore all additional error handling for
REQ_RAHEAD - if this was a real underlying error we'd get a normal read
once the real read comes in.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Bart Van Assche 
---
 drivers/md/dm-raid1.c  | 4 ++--
 drivers/md/dm-stripe.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index a95cbb80fb34..5e30b08b91d9 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -1214,7 +1214,7 @@ static int mirror_map(struct dm_target *ti, struct bio 
*bio)
 */
if (!r || (r == -EWOULDBLOCK)) {
if (bio->bi_opf & REQ_RAHEAD)
-   return -EWOULDBLOCK;
+   return -EIO;
 
queue_bio(ms, bio, rw);
return DM_MAPIO_SUBMITTED;
@@ -1258,7 +1258,7 @@ static int mirror_end_io(struct dm_target *ti, struct bio 
*bio, int error)
if (error == -EOPNOTSUPP)
return error;
 
-   if ((error == -EWOULDBLOCK) && (bio->bi_opf & REQ_RAHEAD))
+   if (bio->bi_opf & REQ_RAHEAD)
return error;
 
if (unlikely(error)) {
diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
index 75152482f3ad..780e95889a7c 100644
--- a/drivers/md/dm-stripe.c
+++ b/drivers/md/dm-stripe.c
@@ -384,7 +384,7 @@ static int stripe_end_io(struct dm_target *ti, struct bio 
*bio, int error)
if (!error)
return 0; /* I/O complete */
 
-   if ((error == -EWOULDBLOCK) && (bio->bi_opf & REQ_RAHEAD))
+   if (bio->bi_opf & REQ_RAHEAD)
return error;
 
if (error == -EOPNOTSUPP)
-- 
2.11.0



[PATCH 03/13] gfs2: remove the unused sd_log_error field

2017-06-03 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
Reviewed-by: Bart Van Assche 
---
 fs/gfs2/incore.h | 1 -
 fs/gfs2/lops.c   | 4 +---
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index b7cf65d13561..aa3d44527fa2 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -815,7 +815,6 @@ struct gfs2_sbd {
atomic_t sd_log_in_flight;
struct bio *sd_log_bio;
wait_queue_head_t sd_log_flush_wait;
-   int sd_log_error;
 
atomic_t sd_reserving_log;
wait_queue_head_t sd_reserving_log_wait;
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index b1f9144b42c7..13ebf15a4db0 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -209,10 +209,8 @@ static void gfs2_end_log_write(struct bio *bio)
struct page *page;
int i;
 
-   if (bio->bi_error) {
-   sdp->sd_log_error = bio->bi_error;
+   if (bio->bi_error)
fs_err(sdp, "Error %d writing to log\n", bio->bi_error);
-   }
 
bio_for_each_segment_all(bvec, bio, i) {
page = bvec->bv_page;
-- 
2.11.0



dedicated error codes for the block layer V3

2017-06-03 Thread Christoph Hellwig
This series introduces a new blk_status_t error code type for the block
layer so that we can have tighter control and explicit semantics for
block layer errors.

All but the last three patches are cleanups that lead to the new type.

The series is mostly limited to the block layer and drivers, and touches
file systems only a little bit.  The only major exception is btrfs, which
does funny things with bios and thus sees a larger amount of propagation
of the new blk_status_t.

A git tree is also available at:

git://git.infradead.org/users/hch/block.git block-errors

gitweb:


http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/block-errors

Note that the two biggest patches didn't make it to linux-block and
linux-btrfs last time.  If you didn't get them they are available in
the git tree above.  Unfortunately there is no easy way to split them
up.

Changes since V2:
 - minor tweaks from reviews

Changes since V1: 
 - keep blk_types.h for now
 - removed a BUG_ON in dm-mpath


Re: [PATCH] generic: test Btrfs delalloc accounting overflow

2017-06-03 Thread Christoph Hellwig
This looks like a btrfs-specific test, and not like a generic one
to me.


Re: [PATCH] generic: test Btrfs delalloc accounting overflow

2017-06-03 Thread Eryu Guan
On Fri, Jun 02, 2017 at 02:46:52PM +0200, David Sterba wrote:
> On Fri, Jun 02, 2017 at 12:07:37PM +0300, Nikolay Borisov wrote:
> > > +# Make sure that we didn't leak any metadata space.
> > > +if [[ $FSTYP = btrfs ]]; then
> > > + uuid="$(findmnt -n -o UUID "$TEST_DIR")"
> > 
> > if we are on btrfs and we don't have findmnt this test will likely fail.
> > Perhaps include a _require_command findmnt
> 
> I think utilities like findmnt should be checked at the beginning of the
> whole testsuite, not in each test that uses them. As findmnt is part of

Agreed. I think we can define a FINDMNT_PROG in common/config and refuse
to run any test if it's missing, as we do for $MOUNT_PROG and other
must-have commands. (There's already a bare call to findmnt in
common/rc; change it to call $FINDMNT_PROG too.)

We can do this in a separate patch.

Thanks,
Eryu

> util-linux, missing it would also mean that eg 'mount' is missing.
> Highly unlikely.
> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html