Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Sat, Dec 20, 2014 at 06:51:33AM +, Al Viro wrote: > > The problem is that the use of ->direct_IO by the swap code is a gross > > layering violation. ->direct_IO is a callback for the filesystem, and > > the swap code need to call ->read_iter instead of ->readpage and > > ->write_tier instead of ->direct_IO, and leave the locking to the > > filesystem. > > The thing is, ->read_iter() and ->write_iter() might decide to fall back to > buffered IO path. XFS is unusual in that respect - there O_DIRECT ends up > with short write in such case. Other filesystems, OTOH... We'll just need a ->swap_activate method that makes sure we really do direct I/O. For all filesystems currently suporting swap just checking that all blocks are allocated (as the ->bmap path already does) should be enough. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Sat, Dec 20, 2014 at 06:51:33AM +, Al Viro wrote: The problem is that the use of -direct_IO by the swap code is a gross layering violation. -direct_IO is a callback for the filesystem, and the swap code need to call -read_iter instead of -readpage and -write_tier instead of -direct_IO, and leave the locking to the filesystem. The thing is, -read_iter() and -write_iter() might decide to fall back to buffered IO path. XFS is unusual in that respect - there O_DIRECT ends up with short write in such case. Other filesystems, OTOH... We'll just need a -swap_activate method that makes sure we really do direct I/O. For all filesystems currently suporting swap just checking that all blocks are allocated (as the -bmap path already does) should be enough. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Sat, Dec 20, 2014 at 06:51:33AM +, Al Viro wrote: > On Mon, Dec 15, 2014 at 08:56:15AM -0800, Christoph Hellwig wrote: > > On Mon, Dec 15, 2014 at 05:27:05PM +0100, Jan Kara wrote: > > > On Sun 14-12-14 21:26:56, Omar Sandoval wrote: > > > > The generic write code locks i_mutex for a direct_IO. Swap-over-NFS > > > > doesn't grab the mutex because nfs_direct_IO doesn't expect i_mutex to > > > > be held, but most direct_IO implementations do. > > > I think you are speaking about direct IO writes only, aren't you? For > > > DIO > > > reads we don't hold i_mutex AFAICS. And also for DIO writes we don't > > > necessarily hold i_mutex - see for example XFS which doesn't take i_mutex > > > for direct IO writes. It uses it's internal rwlock for this (see > > > xfs_file_dio_aio_write()). So I think this is just wrong. > > > > The problem is that the use of ->direct_IO by the swap code is a gross > > layering violation. ->direct_IO is a callback for the filesystem, and > > the swap code need to call ->read_iter instead of ->readpage and > > ->write_tier instead of ->direct_IO, and leave the locking to the > > filesystem. > > The thing is, ->read_iter() and ->write_iter() might decide to fall back to > buffered IO path. XFS is unusual in that respect - there O_DIRECT ends up > with short write in such case. Other filesystems, OTOH... Alright, now what? Using ->direct_IO directly is pretty much a no go because of the different locking conventions as was pointed out. Maybe some "no, really, just direct I/O" iocb flag? -- Omar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Sat, Dec 20, 2014 at 06:51:33AM +, Al Viro wrote: On Mon, Dec 15, 2014 at 08:56:15AM -0800, Christoph Hellwig wrote: On Mon, Dec 15, 2014 at 05:27:05PM +0100, Jan Kara wrote: On Sun 14-12-14 21:26:56, Omar Sandoval wrote: The generic write code locks i_mutex for a direct_IO. Swap-over-NFS doesn't grab the mutex because nfs_direct_IO doesn't expect i_mutex to be held, but most direct_IO implementations do. I think you are speaking about direct IO writes only, aren't you? For DIO reads we don't hold i_mutex AFAICS. And also for DIO writes we don't necessarily hold i_mutex - see for example XFS which doesn't take i_mutex for direct IO writes. It uses it's internal rwlock for this (see xfs_file_dio_aio_write()). So I think this is just wrong. The problem is that the use of -direct_IO by the swap code is a gross layering violation. -direct_IO is a callback for the filesystem, and the swap code need to call -read_iter instead of -readpage and -write_tier instead of -direct_IO, and leave the locking to the filesystem. The thing is, -read_iter() and -write_iter() might decide to fall back to buffered IO path. XFS is unusual in that respect - there O_DIRECT ends up with short write in such case. Other filesystems, OTOH... Alright, now what? Using -direct_IO directly is pretty much a no go because of the different locking conventions as was pointed out. Maybe some no, really, just direct I/O iocb flag? -- Omar -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Mon, Dec 15, 2014 at 08:56:15AM -0800, Christoph Hellwig wrote: > On Mon, Dec 15, 2014 at 05:27:05PM +0100, Jan Kara wrote: > > On Sun 14-12-14 21:26:56, Omar Sandoval wrote: > > > The generic write code locks i_mutex for a direct_IO. Swap-over-NFS > > > doesn't grab the mutex because nfs_direct_IO doesn't expect i_mutex to > > > be held, but most direct_IO implementations do. > > I think you are speaking about direct IO writes only, aren't you? For DIO > > reads we don't hold i_mutex AFAICS. And also for DIO writes we don't > > necessarily hold i_mutex - see for example XFS which doesn't take i_mutex > > for direct IO writes. It uses it's internal rwlock for this (see > > xfs_file_dio_aio_write()). So I think this is just wrong. > > The problem is that the use of ->direct_IO by the swap code is a gross > layering violation. ->direct_IO is a callback for the filesystem, and > the swap code need to call ->read_iter instead of ->readpage and > ->write_tier instead of ->direct_IO, and leave the locking to the > filesystem. The thing is, ->read_iter() and ->write_iter() might decide to fall back to buffered IO path. XFS is unusual in that respect - there O_DIRECT ends up with short write in such case. Other filesystems, OTOH... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Mon, Dec 15, 2014 at 08:56:15AM -0800, Christoph Hellwig wrote: On Mon, Dec 15, 2014 at 05:27:05PM +0100, Jan Kara wrote: On Sun 14-12-14 21:26:56, Omar Sandoval wrote: The generic write code locks i_mutex for a direct_IO. Swap-over-NFS doesn't grab the mutex because nfs_direct_IO doesn't expect i_mutex to be held, but most direct_IO implementations do. I think you are speaking about direct IO writes only, aren't you? For DIO reads we don't hold i_mutex AFAICS. And also for DIO writes we don't necessarily hold i_mutex - see for example XFS which doesn't take i_mutex for direct IO writes. It uses it's internal rwlock for this (see xfs_file_dio_aio_write()). So I think this is just wrong. The problem is that the use of -direct_IO by the swap code is a gross layering violation. -direct_IO is a callback for the filesystem, and the swap code need to call -read_iter instead of -readpage and -write_tier instead of -direct_IO, and leave the locking to the filesystem. The thing is, -read_iter() and -write_iter() might decide to fall back to buffered IO path. XFS is unusual in that respect - there O_DIRECT ends up with short write in such case. Other filesystems, OTOH... -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Thu, Dec 18, 2014 at 10:24:05PM -0800, Omar Sandoval wrote: > + swap_file = file_open_name(name, O_RDWR | O_LARGEFILE | O_DIRECT, 0); > + if (IS_ERR(swap_file) && PTR_ERR(swap_file) == -EINVAL) ITYM if (swap_file == ERR_PTR(-EINVAL)) here... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Wed, Dec 17, 2014 at 10:03:13PM +, Al Viro wrote: > On Wed, Dec 17, 2014 at 10:52:56AM -0800, Christoph Hellwig wrote: > > On Wed, Dec 17, 2014 at 06:58:32AM -0800, Omar Sandoval wrote: > > > See my previous message. If we use O_DIRECT on the original open, then > > > filesystems that implement bmap but not direct_IO will no longer work. > > > These are the ones that I found in my tree: > > > > In the long run I don't think they are worth keeping. But to keep you > > out of that discussion you can just try an open without O_DIRECT if the > > open with the flag failed. > > Umm... That's one possibility, of course (and if swapon(2) is on someone's > hotpath, I really would like to see what the hell they are doing - it has > to be interesting in a sick way). If this is the approach you'd prefer, I'll go ahead and do that for v2. I personally think it looks pretty kludgey, but I'm fine either way: diff --git a/mm/swapfile.c b/mm/swapfile.c index 63f55cc..c1b3073 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -2379,7 +2379,16 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) name = NULL; goto bad_swap; } - swap_file = file_open_name(name, O_RDWR|O_LARGEFILE, 0); + swap_file = file_open_name(name, O_RDWR | O_LARGEFILE | O_DIRECT, 0); + if (IS_ERR(swap_file) && PTR_ERR(swap_file) == -EINVAL) + swap_file = file_open_name(name, O_RDWR | O_LARGEFILE, 0); if (IS_ERR(swap_file)) { error = PTR_ERR(swap_file); swap_file = NULL; > BTW, speaking of read/write vs. swap - what's the story with e.g. AFS > write() checking IS_SWAPFILE() and failing with -EBUSY? Note that > * it's done before acquiring i_mutex, so it isn't race-free > * it's dubious from the POSIX POV - EBUSY isn't in the error > list for write(2). > * other filesystems generally don't have anything of that sort. > NFS does, but local ones do not... > Besides, do we even allow swapfiles on AFS? AFS doesn't implement ->bmap or ->swap_activate, so that code is dead, probably cargo-culted from the NFS code. It seems pretty pointless, not only because it's inconsistent with the local filesystems like you mentioned, but also because it's trivial to bypass with O_DIRECT on NFS: ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from) { struct file *file = iocb->ki_filp; struct inode *inode = file_inode(file); unsigned long written = 0; ssize_t result; size_t count = iov_iter_count(from); loff_t pos = iocb->ki_pos; result = nfs_key_timeout_notify(file, inode); if (result) return result; if (file->f_flags & O_DIRECT) return nfs_file_direct_write(iocb, from, pos); dprintk("NFS: write(%pD2, %zu@%Ld)\n", file, count, (long long) pos); result = -EBUSY; if (IS_SWAPFILE(inode)) goto out_swapfile; I think it's safe to scrap that code. However, this also led me to find that NFS doesn't prevent truncates on an active swapfile. I'm submitting a patch for that now. -- Omar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Wed, Dec 17, 2014 at 10:03:13PM +, Al Viro wrote: On Wed, Dec 17, 2014 at 10:52:56AM -0800, Christoph Hellwig wrote: On Wed, Dec 17, 2014 at 06:58:32AM -0800, Omar Sandoval wrote: See my previous message. If we use O_DIRECT on the original open, then filesystems that implement bmap but not direct_IO will no longer work. These are the ones that I found in my tree: In the long run I don't think they are worth keeping. But to keep you out of that discussion you can just try an open without O_DIRECT if the open with the flag failed. Umm... That's one possibility, of course (and if swapon(2) is on someone's hotpath, I really would like to see what the hell they are doing - it has to be interesting in a sick way). If this is the approach you'd prefer, I'll go ahead and do that for v2. I personally think it looks pretty kludgey, but I'm fine either way: diff --git a/mm/swapfile.c b/mm/swapfile.c index 63f55cc..c1b3073 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -2379,7 +2379,16 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) name = NULL; goto bad_swap; } - swap_file = file_open_name(name, O_RDWR|O_LARGEFILE, 0); + swap_file = file_open_name(name, O_RDWR | O_LARGEFILE | O_DIRECT, 0); + if (IS_ERR(swap_file) PTR_ERR(swap_file) == -EINVAL) + swap_file = file_open_name(name, O_RDWR | O_LARGEFILE, 0); if (IS_ERR(swap_file)) { error = PTR_ERR(swap_file); swap_file = NULL; BTW, speaking of read/write vs. swap - what's the story with e.g. AFS write() checking IS_SWAPFILE() and failing with -EBUSY? Note that * it's done before acquiring i_mutex, so it isn't race-free * it's dubious from the POSIX POV - EBUSY isn't in the error list for write(2). * other filesystems generally don't have anything of that sort. NFS does, but local ones do not... Besides, do we even allow swapfiles on AFS? AFS doesn't implement -bmap or -swap_activate, so that code is dead, probably cargo-culted from the NFS code. It seems pretty pointless, not only because it's inconsistent with the local filesystems like you mentioned, but also because it's trivial to bypass with O_DIRECT on NFS: ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from) { struct file *file = iocb-ki_filp; struct inode *inode = file_inode(file); unsigned long written = 0; ssize_t result; size_t count = iov_iter_count(from); loff_t pos = iocb-ki_pos; result = nfs_key_timeout_notify(file, inode); if (result) return result; if (file-f_flags O_DIRECT) return nfs_file_direct_write(iocb, from, pos); dprintk(NFS: write(%pD2, %zu@%Ld)\n, file, count, (long long) pos); result = -EBUSY; if (IS_SWAPFILE(inode)) goto out_swapfile; I think it's safe to scrap that code. However, this also led me to find that NFS doesn't prevent truncates on an active swapfile. I'm submitting a patch for that now. -- Omar -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Thu, Dec 18, 2014 at 10:24:05PM -0800, Omar Sandoval wrote: + swap_file = file_open_name(name, O_RDWR | O_LARGEFILE | O_DIRECT, 0); + if (IS_ERR(swap_file) PTR_ERR(swap_file) == -EINVAL) ITYM if (swap_file == ERR_PTR(-EINVAL)) here... -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Wed, Dec 17, 2014 at 10:52:56AM -0800, Christoph Hellwig wrote: > On Wed, Dec 17, 2014 at 06:58:32AM -0800, Omar Sandoval wrote: > > See my previous message. If we use O_DIRECT on the original open, then > > filesystems that implement bmap but not direct_IO will no longer work. > > These are the ones that I found in my tree: > > In the long run I don't think they are worth keeping. But to keep you > out of that discussion you can just try an open without O_DIRECT if the > open with the flag failed. Umm... That's one possibility, of course (and if swapon(2) is on someone's hotpath, I really would like to see what the hell they are doing - it has to be interesting in a sick way). Said that, there's an interesting problem with O_DIRECT. It's irrelevant in this case, but it *can* be changed halfway through e.g write(2) and AFAICS we have at least some suspicious codepaths. Look at ext4_file_write_iter(), for example. We check O_DIRECT, then grab some locks, then proceed to look at the results of that check, do some work... and call __generic_file_write_iter(), which checks O_DIRECT again. If it has been cleared (or, probably worse, set) it looks like we'll get an interesting bunch of holes. Should we just labda-expand that call of __generic_file_write_iter() and replace its if (unlikely(file->f_flags & O_DIRECT)) { with if (odirect) to be guaranteed that it'll match the things we'd done before the call? I'm looking through the callchains right now in search of similar places right now, will follow up when I'm done... BTW, speaking of read/write vs. swap - what's the story with e.g. AFS write() checking IS_SWAPFILE() and failing with -EBUSY? Note that * it's done before acquiring i_mutex, so it isn't race-free * it's dubious from the POSIX POV - EBUSY isn't in the error list for write(2). * other filesystems generally don't have anything of that sort. NFS does, but local ones do not... Besides, do we even allow swapfiles on AFS? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Wed, Dec 17, 2014 at 06:58:32AM -0800, Omar Sandoval wrote: > See my previous message. If we use O_DIRECT on the original open, then > filesystems that implement bmap but not direct_IO will no longer work. > These are the ones that I found in my tree: In the long run I don't think they are worth keeping. But to keep you out of that discussion you can just try an open without O_DIRECT if the open with the flag failed. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Wed, Dec 17, 2014 at 12:24:37AM -0800, Christoph Hellwig wrote: > On Wed, Dec 17, 2014 at 08:20:21AM +, Al Viro wrote: > > Where the hell would those other references come from? We open the damn > > thing in sys_swapon(), never put it into descriptor tables, etc. and > > the only reason why we use filp_close() instead of fput() is that we > > would miss ->flush() otherwise. > > > > Said that, why not simply *open* it with O_DIRECT to start with and be done > > with that? It's not as if those guys came preopened by caller - swapon(2) > > gets a pathname and does opening itself. > > Oops, should have dug deeper into the code. For some reason I assumed > the fd is passed in from userspace. > > The suggestion from Al is much better, given that we never do normal > I/O on the swapfile, just the bmap + direct bio submission which I hope > could go away in favor of the direct I/O variant in the long run. See my previous message. If we use O_DIRECT on the original open, then filesystems that implement bmap but not direct_IO will no longer work. These are the ones that I found in my tree: adfs befs bfs ecryptfs efs freevxfs hpfs isofs minix ntfs omfs qnx4 qnx6 sysv ufs Several of these are read only, and I can't imagine that anyone is using a swapfile on any of the rest, but if someone is, this would be a regression, wouldn't it? -- Omar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Wed, Dec 17, 2014 at 08:20:21AM +, Al Viro wrote: > Where the hell would those other references come from? We open the damn > thing in sys_swapon(), never put it into descriptor tables, etc. and > the only reason why we use filp_close() instead of fput() is that we > would miss ->flush() otherwise. > > Said that, why not simply *open* it with O_DIRECT to start with and be done > with that? It's not as if those guys came preopened by caller - swapon(2) > gets a pathname and does opening itself. Oops, should have dug deeper into the code. For some reason I assumed the fd is passed in from userspace. The suggestion from Al is much better, given that we never do normal I/O on the swapfile, just the bmap + direct bio submission which I hope could go away in favor of the direct I/O variant in the long run. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Wed, Dec 17, 2014 at 12:06:10AM -0800, Christoph Hellwig wrote: > > This seems to be more or less equivalent to doing a fcntl(F_SETFL) to > > add the O_DIRECT flag to swap_file (which is a struct file *). Swapoff > > calls filp_close on swap_file, so I don't see why it's necessary to > > clear the flag. > > filp_lose doesn't nessecarily destroy the file structure, there might be > other reference to it, e.g. from dup() or descriptor passing. Where the hell would those other references come from? We open the damn thing in sys_swapon(), never put it into descriptor tables, etc. and the only reason why we use filp_close() instead of fput() is that we would miss ->flush() otherwise. Said that, why not simply *open* it with O_DIRECT to start with and be done with that? It's not as if those guys came preopened by caller - swapon(2) gets a pathname and does opening itself. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Tue, Dec 16, 2014 at 12:56:24AM -0800, Omar Sandoval wrote: > --- a/mm/swapfile.c > +++ b/mm/swapfile.c > @@ -1728,6 +1728,9 @@ static int setup_swap_extents(struct swap_info_struct > *sis, sector_t *span) > } > > if (mapping->a_ops->swap_activate) { > + if (!mapping->a_ops->direct_IO) > + return -EINVAL; > + swap_file->f_flags |= O_DIRECT; > ret = mapping->a_ops->swap_activate(sis, swap_file, span); > if (!ret) { > sis->flags |= SWP_FILE; This needs to hold swap_file->f_lock, but otherwise looks good. > This seems to be more or less equivalent to doing a fcntl(F_SETFL) to > add the O_DIRECT flag to swap_file (which is a struct file *). Swapoff > calls filp_close on swap_file, so I don't see why it's necessary to > clear the flag. filp_lose doesn't nessecarily destroy the file structure, there might be other reference to it, e.g. from dup() or descriptor passing. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Tue, Dec 16, 2014 at 12:56:24AM -0800, Omar Sandoval wrote: --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1728,6 +1728,9 @@ static int setup_swap_extents(struct swap_info_struct *sis, sector_t *span) } if (mapping-a_ops-swap_activate) { + if (!mapping-a_ops-direct_IO) + return -EINVAL; + swap_file-f_flags |= O_DIRECT; ret = mapping-a_ops-swap_activate(sis, swap_file, span); if (!ret) { sis-flags |= SWP_FILE; This needs to hold swap_file-f_lock, but otherwise looks good. This seems to be more or less equivalent to doing a fcntl(F_SETFL) to add the O_DIRECT flag to swap_file (which is a struct file *). Swapoff calls filp_close on swap_file, so I don't see why it's necessary to clear the flag. filp_lose doesn't nessecarily destroy the file structure, there might be other reference to it, e.g. from dup() or descriptor passing. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Wed, Dec 17, 2014 at 12:06:10AM -0800, Christoph Hellwig wrote: This seems to be more or less equivalent to doing a fcntl(F_SETFL) to add the O_DIRECT flag to swap_file (which is a struct file *). Swapoff calls filp_close on swap_file, so I don't see why it's necessary to clear the flag. filp_lose doesn't nessecarily destroy the file structure, there might be other reference to it, e.g. from dup() or descriptor passing. Where the hell would those other references come from? We open the damn thing in sys_swapon(), never put it into descriptor tables, etc. and the only reason why we use filp_close() instead of fput() is that we would miss -flush() otherwise. Said that, why not simply *open* it with O_DIRECT to start with and be done with that? It's not as if those guys came preopened by caller - swapon(2) gets a pathname and does opening itself. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Wed, Dec 17, 2014 at 08:20:21AM +, Al Viro wrote: Where the hell would those other references come from? We open the damn thing in sys_swapon(), never put it into descriptor tables, etc. and the only reason why we use filp_close() instead of fput() is that we would miss -flush() otherwise. Said that, why not simply *open* it with O_DIRECT to start with and be done with that? It's not as if those guys came preopened by caller - swapon(2) gets a pathname and does opening itself. Oops, should have dug deeper into the code. For some reason I assumed the fd is passed in from userspace. The suggestion from Al is much better, given that we never do normal I/O on the swapfile, just the bmap + direct bio submission which I hope could go away in favor of the direct I/O variant in the long run. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Wed, Dec 17, 2014 at 12:24:37AM -0800, Christoph Hellwig wrote: On Wed, Dec 17, 2014 at 08:20:21AM +, Al Viro wrote: Where the hell would those other references come from? We open the damn thing in sys_swapon(), never put it into descriptor tables, etc. and the only reason why we use filp_close() instead of fput() is that we would miss -flush() otherwise. Said that, why not simply *open* it with O_DIRECT to start with and be done with that? It's not as if those guys came preopened by caller - swapon(2) gets a pathname and does opening itself. Oops, should have dug deeper into the code. For some reason I assumed the fd is passed in from userspace. The suggestion from Al is much better, given that we never do normal I/O on the swapfile, just the bmap + direct bio submission which I hope could go away in favor of the direct I/O variant in the long run. See my previous message. If we use O_DIRECT on the original open, then filesystems that implement bmap but not direct_IO will no longer work. These are the ones that I found in my tree: adfs befs bfs ecryptfs efs freevxfs hpfs isofs minix ntfs omfs qnx4 qnx6 sysv ufs Several of these are read only, and I can't imagine that anyone is using a swapfile on any of the rest, but if someone is, this would be a regression, wouldn't it? -- Omar -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Wed, Dec 17, 2014 at 06:58:32AM -0800, Omar Sandoval wrote: See my previous message. If we use O_DIRECT on the original open, then filesystems that implement bmap but not direct_IO will no longer work. These are the ones that I found in my tree: In the long run I don't think they are worth keeping. But to keep you out of that discussion you can just try an open without O_DIRECT if the open with the flag failed. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Wed, Dec 17, 2014 at 10:52:56AM -0800, Christoph Hellwig wrote: On Wed, Dec 17, 2014 at 06:58:32AM -0800, Omar Sandoval wrote: See my previous message. If we use O_DIRECT on the original open, then filesystems that implement bmap but not direct_IO will no longer work. These are the ones that I found in my tree: In the long run I don't think they are worth keeping. But to keep you out of that discussion you can just try an open without O_DIRECT if the open with the flag failed. Umm... That's one possibility, of course (and if swapon(2) is on someone's hotpath, I really would like to see what the hell they are doing - it has to be interesting in a sick way). Said that, there's an interesting problem with O_DIRECT. It's irrelevant in this case, but it *can* be changed halfway through e.g write(2) and AFAICS we have at least some suspicious codepaths. Look at ext4_file_write_iter(), for example. We check O_DIRECT, then grab some locks, then proceed to look at the results of that check, do some work... and call __generic_file_write_iter(), which checks O_DIRECT again. If it has been cleared (or, probably worse, set) it looks like we'll get an interesting bunch of holes. Should we just labda-expand that call of __generic_file_write_iter() and replace its if (unlikely(file-f_flags O_DIRECT)) { with if (odirect) to be guaranteed that it'll match the things we'd done before the call? I'm looking through the callchains right now in search of similar places right now, will follow up when I'm done... BTW, speaking of read/write vs. swap - what's the story with e.g. AFS write() checking IS_SWAPFILE() and failing with -EBUSY? Note that * it's done before acquiring i_mutex, so it isn't race-free * it's dubious from the POSIX POV - EBUSY isn't in the error list for write(2). * other filesystems generally don't have anything of that sort. NFS does, but local ones do not... Besides, do we even allow swapfiles on AFS? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Tue, Dec 16, 2014 at 12:35:43AM -0800, Christoph Hellwig wrote: > On Mon, Dec 15, 2014 at 02:11:00PM -0800, Omar Sandoval wrote: > > Ok, I got the swap code working with ->read_iter/->write_iter without > > too much trouble. I wanted to double check before I submit if there's > > any gotchas involved with adding the O_DIRECT flag to a file pointer > > after it has been opened -- swapon opens the swapfile before we know if > > we're using the SWP_FILE infrastructure, and we need to add O_DIRECT so > > ->{read,write}_iter use direct I/O, but we can't add O_DIRECT to the > > original open without excluding filesystems that support the old bmap > > path but not direct I/O. > > In general just adding O_DIRECT is a problem. However given that the > swap file is locked against any other access while in use it seems ok > in this particular case. Just make sure to clear it on swapoff, and > write a detailed comment explaining the situation. I'll admit that I'm a bit confused. I want to do this: diff --git a/mm/swapfile.c b/mm/swapfile.c index 8798b2e..5145c09 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1728,6 +1728,9 @@ static int setup_swap_extents(struct swap_info_struct *sis, sector_t *span) } if (mapping->a_ops->swap_activate) { + if (!mapping->a_ops->direct_IO) + return -EINVAL; + swap_file->f_flags |= O_DIRECT; ret = mapping->a_ops->swap_activate(sis, swap_file, span); if (!ret) { sis->flags |= SWP_FILE; This seems to be more or less equivalent to doing a fcntl(F_SETFL) to add the O_DIRECT flag to swap_file (which is a struct file *). Swapoff calls filp_close on swap_file, so I don't see why it's necessary to clear the flag. -- Omar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Mon, Dec 15, 2014 at 02:11:00PM -0800, Omar Sandoval wrote: > Ok, I got the swap code working with ->read_iter/->write_iter without > too much trouble. I wanted to double check before I submit if there's > any gotchas involved with adding the O_DIRECT flag to a file pointer > after it has been opened -- swapon opens the swapfile before we know if > we're using the SWP_FILE infrastructure, and we need to add O_DIRECT so > ->{read,write}_iter use direct I/O, but we can't add O_DIRECT to the > original open without excluding filesystems that support the old bmap > path but not direct I/O. In general just adding O_DIRECT is a problem. However given that the swap file is locked against any other access while in use it seems ok in this particular case. Just make sure to clear it on swapoff, and write a detailed comment explaining the situation. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Mon, Dec 15, 2014 at 02:11:00PM -0800, Omar Sandoval wrote: Ok, I got the swap code working with -read_iter/-write_iter without too much trouble. I wanted to double check before I submit if there's any gotchas involved with adding the O_DIRECT flag to a file pointer after it has been opened -- swapon opens the swapfile before we know if we're using the SWP_FILE infrastructure, and we need to add O_DIRECT so -{read,write}_iter use direct I/O, but we can't add O_DIRECT to the original open without excluding filesystems that support the old bmap path but not direct I/O. In general just adding O_DIRECT is a problem. However given that the swap file is locked against any other access while in use it seems ok in this particular case. Just make sure to clear it on swapoff, and write a detailed comment explaining the situation. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Tue, Dec 16, 2014 at 12:35:43AM -0800, Christoph Hellwig wrote: On Mon, Dec 15, 2014 at 02:11:00PM -0800, Omar Sandoval wrote: Ok, I got the swap code working with -read_iter/-write_iter without too much trouble. I wanted to double check before I submit if there's any gotchas involved with adding the O_DIRECT flag to a file pointer after it has been opened -- swapon opens the swapfile before we know if we're using the SWP_FILE infrastructure, and we need to add O_DIRECT so -{read,write}_iter use direct I/O, but we can't add O_DIRECT to the original open without excluding filesystems that support the old bmap path but not direct I/O. In general just adding O_DIRECT is a problem. However given that the swap file is locked against any other access while in use it seems ok in this particular case. Just make sure to clear it on swapoff, and write a detailed comment explaining the situation. I'll admit that I'm a bit confused. I want to do this: diff --git a/mm/swapfile.c b/mm/swapfile.c index 8798b2e..5145c09 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1728,6 +1728,9 @@ static int setup_swap_extents(struct swap_info_struct *sis, sector_t *span) } if (mapping-a_ops-swap_activate) { + if (!mapping-a_ops-direct_IO) + return -EINVAL; + swap_file-f_flags |= O_DIRECT; ret = mapping-a_ops-swap_activate(sis, swap_file, span); if (!ret) { sis-flags |= SWP_FILE; This seems to be more or less equivalent to doing a fcntl(F_SETFL) to add the O_DIRECT flag to swap_file (which is a struct file *). Swapoff calls filp_close on swap_file, so I don't see why it's necessary to clear the flag. -- Omar -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Mon, Dec 15, 2014 at 08:56:15AM -0800, Christoph Hellwig wrote: > On Mon, Dec 15, 2014 at 05:27:05PM +0100, Jan Kara wrote: > > On Sun 14-12-14 21:26:56, Omar Sandoval wrote: > > > The generic write code locks i_mutex for a direct_IO. Swap-over-NFS > > > doesn't grab the mutex because nfs_direct_IO doesn't expect i_mutex to > > > be held, but most direct_IO implementations do. > > I think you are speaking about direct IO writes only, aren't you? For DIO > > reads we don't hold i_mutex AFAICS. And also for DIO writes we don't > > necessarily hold i_mutex - see for example XFS which doesn't take i_mutex > > for direct IO writes. It uses it's internal rwlock for this (see > > xfs_file_dio_aio_write()). So I think this is just wrong. > > The problem is that the use of ->direct_IO by the swap code is a gross > layering violation. ->direct_IO is a callback for the filesystem, and > the swap code need to call ->read_iter instead of ->readpage and > ->write_tier instead of ->direct_IO, and leave the locking to the > filesystem. > Ok, I got the swap code working with ->read_iter/->write_iter without too much trouble. I wanted to double check before I submit if there's any gotchas involved with adding the O_DIRECT flag to a file pointer after it has been opened -- swapon opens the swapfile before we know if we're using the SWP_FILE infrastructure, and we need to add O_DIRECT so ->{read,write}_iter use direct I/O, but we can't add O_DIRECT to the original open without excluding filesystems that support the old bmap path but not direct I/O. -- Omar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Mon, Dec 15, 2014 at 05:27:05PM +0100, Jan Kara wrote: > On Sun 14-12-14 21:26:56, Omar Sandoval wrote: > > The generic write code locks i_mutex for a direct_IO. Swap-over-NFS > > doesn't grab the mutex because nfs_direct_IO doesn't expect i_mutex to > > be held, but most direct_IO implementations do. > I think you are speaking about direct IO writes only, aren't you? For DIO > reads we don't hold i_mutex AFAICS. And also for DIO writes we don't > necessarily hold i_mutex - see for example XFS which doesn't take i_mutex > for direct IO writes. It uses it's internal rwlock for this (see > xfs_file_dio_aio_write()). So I think this is just wrong. The problem is that the use of ->direct_IO by the swap code is a gross layering violation. ->direct_IO is a callback for the filesystem, and the swap code need to call ->read_iter instead of ->readpage and ->write_tier instead of ->direct_IO, and leave the locking to the filesystem. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Sun 14-12-14 21:26:56, Omar Sandoval wrote: > The generic write code locks i_mutex for a direct_IO. Swap-over-NFS > doesn't grab the mutex because nfs_direct_IO doesn't expect i_mutex to > be held, but most direct_IO implementations do. I think you are speaking about direct IO writes only, aren't you? For DIO reads we don't hold i_mutex AFAICS. And also for DIO writes we don't necessarily hold i_mutex - see for example XFS which doesn't take i_mutex for direct IO writes. It uses it's internal rwlock for this (see xfs_file_dio_aio_write()). So I think this is just wrong. Honza > Signed-off-by: Omar Sandoval > --- > mm/page_io.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/mm/page_io.c b/mm/page_io.c > index 955db8b..1630ac0 100644 > --- a/mm/page_io.c > +++ b/mm/page_io.c > @@ -263,6 +263,7 @@ int __swap_writepage(struct page *page, struct > writeback_control *wbc, > if (sis->flags & SWP_FILE) { > struct kiocb kiocb; > struct file *swap_file = sis->swap_file; > + struct inode *inode = file_inode(swap_file); > struct address_space *mapping = swap_file->f_mapping; > struct bio_vec bv = { > .bv_page = page, > @@ -283,9 +284,11 @@ int __swap_writepage(struct page *page, struct > writeback_control *wbc, > > set_page_writeback(page); > unlock_page(page); > + mutex_lock(>i_mutex); > ret = mapping->a_ops->direct_IO(ITER_BVEC | WRITE, > , , > kiocb.ki_pos); > + mutex_unlock(>i_mutex); > if (ret == PAGE_SIZE) { > count_vm_event(PSWPOUT); > ret = 0; > -- > 2.1.3 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Jan Kara SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Sun 14-12-14 21:26:56, Omar Sandoval wrote: The generic write code locks i_mutex for a direct_IO. Swap-over-NFS doesn't grab the mutex because nfs_direct_IO doesn't expect i_mutex to be held, but most direct_IO implementations do. I think you are speaking about direct IO writes only, aren't you? For DIO reads we don't hold i_mutex AFAICS. And also for DIO writes we don't necessarily hold i_mutex - see for example XFS which doesn't take i_mutex for direct IO writes. It uses it's internal rwlock for this (see xfs_file_dio_aio_write()). So I think this is just wrong. Honza Signed-off-by: Omar Sandoval osan...@osandov.com --- mm/page_io.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/mm/page_io.c b/mm/page_io.c index 955db8b..1630ac0 100644 --- a/mm/page_io.c +++ b/mm/page_io.c @@ -263,6 +263,7 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc, if (sis-flags SWP_FILE) { struct kiocb kiocb; struct file *swap_file = sis-swap_file; + struct inode *inode = file_inode(swap_file); struct address_space *mapping = swap_file-f_mapping; struct bio_vec bv = { .bv_page = page, @@ -283,9 +284,11 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc, set_page_writeback(page); unlock_page(page); + mutex_lock(inode-i_mutex); ret = mapping-a_ops-direct_IO(ITER_BVEC | WRITE, kiocb, from, kiocb.ki_pos); + mutex_unlock(inode-i_mutex); if (ret == PAGE_SIZE) { count_vm_event(PSWPOUT); ret = 0; -- 2.1.3 -- To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Jan Kara j...@suse.cz SUSE Labs, CR -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Mon, Dec 15, 2014 at 05:27:05PM +0100, Jan Kara wrote: On Sun 14-12-14 21:26:56, Omar Sandoval wrote: The generic write code locks i_mutex for a direct_IO. Swap-over-NFS doesn't grab the mutex because nfs_direct_IO doesn't expect i_mutex to be held, but most direct_IO implementations do. I think you are speaking about direct IO writes only, aren't you? For DIO reads we don't hold i_mutex AFAICS. And also for DIO writes we don't necessarily hold i_mutex - see for example XFS which doesn't take i_mutex for direct IO writes. It uses it's internal rwlock for this (see xfs_file_dio_aio_write()). So I think this is just wrong. The problem is that the use of -direct_IO by the swap code is a gross layering violation. -direct_IO is a callback for the filesystem, and the swap code need to call -read_iter instead of -readpage and -write_tier instead of -direct_IO, and leave the locking to the filesystem. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
On Mon, Dec 15, 2014 at 08:56:15AM -0800, Christoph Hellwig wrote: On Mon, Dec 15, 2014 at 05:27:05PM +0100, Jan Kara wrote: On Sun 14-12-14 21:26:56, Omar Sandoval wrote: The generic write code locks i_mutex for a direct_IO. Swap-over-NFS doesn't grab the mutex because nfs_direct_IO doesn't expect i_mutex to be held, but most direct_IO implementations do. I think you are speaking about direct IO writes only, aren't you? For DIO reads we don't hold i_mutex AFAICS. And also for DIO writes we don't necessarily hold i_mutex - see for example XFS which doesn't take i_mutex for direct IO writes. It uses it's internal rwlock for this (see xfs_file_dio_aio_write()). So I think this is just wrong. The problem is that the use of -direct_IO by the swap code is a gross layering violation. -direct_IO is a callback for the filesystem, and the swap code need to call -read_iter instead of -readpage and -write_tier instead of -direct_IO, and leave the locking to the filesystem. Ok, I got the swap code working with -read_iter/-write_iter without too much trouble. I wanted to double check before I submit if there's any gotchas involved with adding the O_DIRECT flag to a file pointer after it has been opened -- swapon opens the swapfile before we know if we're using the SWP_FILE infrastructure, and we need to add O_DIRECT so -{read,write}_iter use direct I/O, but we can't add O_DIRECT to the original open without excluding filesystems that support the old bmap path but not direct I/O. -- Omar -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
The generic write code locks i_mutex for a direct_IO. Swap-over-NFS doesn't grab the mutex because nfs_direct_IO doesn't expect i_mutex to be held, but most direct_IO implementations do. Signed-off-by: Omar Sandoval --- mm/page_io.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/mm/page_io.c b/mm/page_io.c index 955db8b..1630ac0 100644 --- a/mm/page_io.c +++ b/mm/page_io.c @@ -263,6 +263,7 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc, if (sis->flags & SWP_FILE) { struct kiocb kiocb; struct file *swap_file = sis->swap_file; + struct inode *inode = file_inode(swap_file); struct address_space *mapping = swap_file->f_mapping; struct bio_vec bv = { .bv_page = page, @@ -283,9 +284,11 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc, set_page_writeback(page); unlock_page(page); + mutex_lock(>i_mutex); ret = mapping->a_ops->direct_IO(ITER_BVEC | WRITE, , , kiocb.ki_pos); + mutex_unlock(>i_mutex); if (ret == PAGE_SIZE) { count_vm_event(PSWPOUT); ret = 0; -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
The generic write code locks i_mutex for a direct_IO. Swap-over-NFS doesn't grab the mutex because nfs_direct_IO doesn't expect i_mutex to be held, but most direct_IO implementations do. Signed-off-by: Omar Sandoval osan...@osandov.com --- mm/page_io.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/mm/page_io.c b/mm/page_io.c index 955db8b..1630ac0 100644 --- a/mm/page_io.c +++ b/mm/page_io.c @@ -263,6 +263,7 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc, if (sis-flags SWP_FILE) { struct kiocb kiocb; struct file *swap_file = sis-swap_file; + struct inode *inode = file_inode(swap_file); struct address_space *mapping = swap_file-f_mapping; struct bio_vec bv = { .bv_page = page, @@ -283,9 +284,11 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc, set_page_writeback(page); unlock_page(page); + mutex_lock(inode-i_mutex); ret = mapping-a_ops-direct_IO(ITER_BVEC | WRITE, kiocb, from, kiocb.ki_pos); + mutex_unlock(inode-i_mutex); if (ret == PAGE_SIZE) { count_vm_event(PSWPOUT); ret = 0; -- 2.1.3 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/