On Thu, 2018-07-12 at 10:42 -0600, Jens Axboe wrote:
> 
> Hence the patch I sent is wrong, the code actually looks fine. Which
> means we're back to trying to figure out what is going on here. It'd
> be great with a test case...

We don't have an easy test case yet. But the customer has confirmed that
the problem occurs with upstream 4.17.5, too. We also confirmed again
that the problem occurs when the kernel uses the kmalloc() code path in
__blkdev_direct_IO_simple().

My personal suggestion would be to ditch __blkdev_direct_IO_simple()
altogether. After all, it's not _that_ much simpler than
__blkdev_direct_IO(), and it seems to be broken in a subtle way.

However, so far I've only identified a minor problem, see below - it
doesn't explain the data corruption we're seeing.

Martin

-- 
Dr. Martin Wilck <mwi...@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

From d0c74ef1fc73c03983950c496f7490da8aa56671 Mon Sep 17 00:00:00 2001
From: Martin Wilck <mwi...@suse.com>
Date: Fri, 13 Jul 2018 18:38:44 +0200
Subject: [PATCH] fs: fix error exit in __blkdev_direct_IO_simple

Cleanup code was missing in the error return path.
---
 fs/block_dev.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 7ec920e..b82b516 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -218,8 +218,12 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
 	bio.bi_end_io = blkdev_bio_end_io_simple;
 
 	ret = bio_iov_iter_get_pages(&bio, iter);
-	if (unlikely(ret))
+	if (unlikely(ret)) {
+		if (vecs != inline_vecs)
+			kfree(vecs);
+		bio_uninit(&bio);
 		return ret;
+	}
 	ret = bio.bi_iter.bi_size;
 
 	if (iov_iter_rw(iter) == READ) {
-- 
2.17.1
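
For readers who want to see the pattern in isolation, here is a minimal
userspace sketch of the "inline array or heap allocation" idiom that the
patch above touches. It is not the kernel code: process_slots(),
INLINE_SLOTS, counted_alloc(), counted_free() and live_allocs are
hypothetical names made up for the illustration, and malloc()/free()
stand in for kmalloc()/kfree(). It only shows why every return after the
allocation has to go through the same cleanup.

```c
#include <stdlib.h>
#include <string.h>

#define INLINE_SLOTS 4          /* hypothetical size of the on-stack array */

static int live_allocs;         /* outstanding heap buffers, for leak checks */

static void *counted_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_allocs++;
	return p;
}

static void counted_free(void *p)
{
	if (p) {
		free(p);
		live_allocs--;
	}
}

/*
 * Use the on-stack array for small requests and the heap for large ones.
 * A non-zero 'fail' simulates an error in the step that follows the
 * allocation (the role bio_iov_iter_get_pages() plays in the patch).
 * Returns nr_slots on success or a negative value on error.
 */
static int process_slots(size_t nr_slots, int fail)
{
	int inline_slots[INLINE_SLOTS];
	int *slots = inline_slots;
	int ret;

	if (nr_slots > INLINE_SLOTS) {
		slots = counted_alloc(nr_slots * sizeof(*slots));
		if (!slots)
			return -1;
	}

	if (fail) {
		ret = -2;
		goto out;       /* error path shares the cleanup below */
	}

	memset(slots, 0, nr_slots * sizeof(*slots));
	ret = (int)nr_slots;
out:
	if (slots != inline_slots)
		counted_free(slots);
	return ret;
}
```

Calling process_slots(16, 1) takes the heap path and then the simulated
error; with the shared exit label, live_allocs is back at 0 afterwards.
A bare "return ret;" in the error branch would leave it at 1, which is
the kind of leak the patch addresses.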