Re: [PATCH 03/11] fs: don't allow splice read/write without explicit ops
On Tue, Aug 18, 2020 at 12:58:07PM -0700, Kees Cook wrote:
> On Tue, Aug 18, 2020 at 09:54:46PM +0200, Christoph Hellwig wrote:
> > On Tue, Aug 18, 2020 at 12:39:34PM -0700, Kees Cook wrote:
> > > On Mon, Aug 17, 2020 at 09:32:04AM +0200, Christoph Hellwig wrote:
> > > > default_file_splice_write is the last piece of generic code that uses
> > > > set_fs to make the uaccess routines operate on kernel pointers. It
> > > > implements a "fallback loop" for splicing from files that do not
> > > > actually provide a proper splice_read method. The usual file systems
> > > > and other high bandwidth instances all provide a ->splice_read, so
> > > > this just removes support for various device drivers and
> > > > procfs/debugfs files. If splice support for any of those turns out
> > > > to be important it can be added back by switching them to the iter
> > > > ops and using generic_file_splice_read.
> > > >
> > > > Signed-off-by: Christoph Hellwig
> > >
> > > This seems a bit disruptive? I feel like this is going to make fuzzers
> > > really noisy (e.g. trinity likes to splice random stuff out of /sys and
> > > /proc).
> >
> > Noisy in the sense of triggering the pr_debug or because they can't
> > handle -EINVAL?
>
> Well, maybe both? I doubt much _expects_ to be using splice, so I'm fine
> with that, but it seems weird not to have a fall-back, especially if
> something would like to splice a file out of there. But, I'm not opposed
> to the change, it just seems like it might cause pain down the road.

The problem is that without pretending a buffer is in user space when it
actually isn't, we can't have a generic fallback. So we'll have to have
specific support - I wrote generic support for seq_file, and willy did for
/proc/sys, but at least the first caused a few problems and a fair amount
of churn, so I'd rather see first if we can get away without it.

>
> --
> Kees Cook
Re: [PATCH 03/11] fs: don't allow splice read/write without explicit ops
On Tue, Aug 18, 2020 at 09:54:46PM +0200, Christoph Hellwig wrote:
> On Tue, Aug 18, 2020 at 12:39:34PM -0700, Kees Cook wrote:
> > On Mon, Aug 17, 2020 at 09:32:04AM +0200, Christoph Hellwig wrote:
> > > default_file_splice_write is the last piece of generic code that uses
> > > set_fs to make the uaccess routines operate on kernel pointers. It
> > > implements a "fallback loop" for splicing from files that do not actually
> > > provide a proper splice_read method. The usual file systems and other
> > > high bandwidth instances all provide a ->splice_read, so this just removes
> > > support for various device drivers and procfs/debugfs files. If splice
> > > support for any of those turns out to be important it can be added back
> > > by switching them to the iter ops and using generic_file_splice_read.
> > >
> > > Signed-off-by: Christoph Hellwig
> >
> > This seems a bit disruptive? I feel like this is going to make fuzzers
> > really noisy (e.g. trinity likes to splice random stuff out of /sys and
> > /proc).
>
> Noisy in the sense of triggering the pr_debug or because they can't
> handle -EINVAL?

Well, maybe both? I doubt much _expects_ to be using splice, so I'm fine
with that, but it seems weird not to have a fall-back, especially if
something would like to splice a file out of there. But, I'm not opposed
to the change, it just seems like it might cause pain down the road.

--
Kees Cook
Re: [PATCH 03/11] fs: don't allow splice read/write without explicit ops
On Tue, Aug 18, 2020 at 12:39:34PM -0700, Kees Cook wrote:
> On Mon, Aug 17, 2020 at 09:32:04AM +0200, Christoph Hellwig wrote:
> > default_file_splice_write is the last piece of generic code that uses
> > set_fs to make the uaccess routines operate on kernel pointers. It
> > implements a "fallback loop" for splicing from files that do not actually
> > provide a proper splice_read method. The usual file systems and other
> > high bandwidth instances all provide a ->splice_read, so this just removes
> > support for various device drivers and procfs/debugfs files. If splice
> > support for any of those turns out to be important it can be added back
> > by switching them to the iter ops and using generic_file_splice_read.
> >
> > Signed-off-by: Christoph Hellwig
>
> This seems a bit disruptive? I feel like this is going to make fuzzers
> really noisy (e.g. trinity likes to splice random stuff out of /sys and
> /proc).

Noisy in the sense of triggering the pr_debug or because they can't
handle -EINVAL?
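For readers wondering what "handling -EINVAL" could look like on the
user-space side, here is a minimal sketch. It is not part of the patch or
of any tool mentioned in this thread: the helper name splice_or_copy() and
the 4096-byte bounce buffer are assumptions made for illustration. It tries
splice(2) first and, if the kernel refuses with EINVAL (as it would for a
file without ->splice_read once this patch is applied), falls back to a
plain read()/write() copy.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* needed for the splice() declaration */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: move up to len bytes from in_fd into the write
 * end of a pipe.  splice() is tried first; only an EINVAL failure
 * triggers the fallback through a user-space buffer.
 * Returns the number of bytes moved, 0 at end of file, or -1 with
 * errno set.
 */
static ssize_t splice_or_copy(int in_fd, int pipe_wr_fd, size_t len)
{
	char buf[4096];		/* bounce buffer size is an arbitrary choice */
	ssize_t n;

	n = splice(in_fd, NULL, pipe_wr_fd, NULL, len, 0);
	if (n >= 0 || errno != EINVAL)
		return n;

	/* Fallback: ordinary read() followed by write(). */
	if (len > sizeof(buf))
		len = sizeof(buf);
	n = read(in_fd, buf, len);
	if (n <= 0)
		return n;
	return write(pipe_wr_fd, buf, (size_t)n);
}
```

A caller would loop on this helper until it returns 0 or -1. The fallback
copies at most one buffer per call and does not retry short writes, so it
is only a starting point.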
Re: [PATCH 03/11] fs: don't allow splice read/write without explicit ops
On Mon, Aug 17, 2020 at 09:32:04AM +0200, Christoph Hellwig wrote:
> default_file_splice_write is the last piece of generic code that uses
> set_fs to make the uaccess routines operate on kernel pointers. It
> implements a "fallback loop" for splicing from files that do not actually
> provide a proper splice_read method. The usual file systems and other
> high bandwidth instances all provide a ->splice_read, so this just removes
> support for various device drivers and procfs/debugfs files. If splice
> support for any of those turns out to be important it can be added back
> by switching them to the iter ops and using generic_file_splice_read.
>
> Signed-off-by: Christoph Hellwig

This seems a bit disruptive? I feel like this is going to make fuzzers
really noisy (e.g. trinity likes to splice random stuff out of /sys and
/proc).

Conceptually, though:

Reviewed-by: Kees Cook

--
Kees Cook
[PATCH 03/11] fs: don't allow splice read/write without explicit ops
default_file_splice_write is the last piece of generic code that uses
set_fs to make the uaccess routines operate on kernel pointers. It
implements a "fallback loop" for splicing from files that do not actually
provide a proper splice_read method. The usual file systems and other
high bandwidth instances all provide a ->splice_read, so this just removes
support for various device drivers and procfs/debugfs files. If splice
support for any of those turns out to be important it can be added back
by switching them to the iter ops and using generic_file_splice_read.

Signed-off-by: Christoph Hellwig
---
 fs/read_write.c    |   2 +-
 fs/splice.c        | 130 +
 include/linux/fs.h |   2 -
 3 files changed, 15 insertions(+), 119 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 702c4301d9eb6b..8c61f67453e3d3 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1077,7 +1077,7 @@ ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
 }
 EXPORT_SYMBOL(vfs_iter_write);

-ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
+static ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
 		  unsigned long vlen, loff_t *pos, rwf_t flags)
 {
 	struct iovec iovstack[UIO_FASTIOV];
diff --git a/fs/splice.c b/fs/splice.c
index d7c8a7c4db07ff..412df7b48f9eb7 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -342,89 +342,6 @@ const struct pipe_buf_operations nosteal_pipe_buf_ops = {
 };
 EXPORT_SYMBOL(nosteal_pipe_buf_ops);

-static ssize_t kernel_readv(struct file *file, const struct kvec *vec,
-			    unsigned long vlen, loff_t offset)
-{
-	mm_segment_t old_fs;
-	loff_t pos = offset;
-	ssize_t res;
-
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	/* The cast to a user pointer is valid due to the set_fs() */
-	res = vfs_readv(file, (const struct iovec __user *)vec, vlen, &pos, 0);
-	set_fs(old_fs);
-
-	return res;
-}
-
-static ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
-				 struct pipe_inode_info *pipe, size_t len,
-				 unsigned int flags)
-{
-	struct kvec *vec, __vec[PIPE_DEF_BUFFERS];
-	struct iov_iter to;
-	struct page **pages;
-	unsigned int nr_pages;
-	unsigned int mask;
-	size_t offset, base, copied = 0;
-	ssize_t res;
-	int i;
-
-	if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
-		return -EAGAIN;
-
-	/*
-	 * Try to keep page boundaries matching to source pagecache ones -
-	 * it probably won't be much help, but...
-	 */
-	offset = *ppos & ~PAGE_MASK;
-
-	iov_iter_pipe(&to, READ, pipe, len + offset);
-
-	res = iov_iter_get_pages_alloc(&to, &pages, len + offset, &base);
-	if (res <= 0)
-		return -ENOMEM;
-
-	nr_pages = DIV_ROUND_UP(res + base, PAGE_SIZE);
-
-	vec = __vec;
-	if (nr_pages > PIPE_DEF_BUFFERS) {
-		vec = kmalloc_array(nr_pages, sizeof(struct kvec), GFP_KERNEL);
-		if (unlikely(!vec)) {
-			res = -ENOMEM;
-			goto out;
-		}
-	}
-
-	mask = pipe->ring_size - 1;
-	pipe->bufs[to.head & mask].offset = offset;
-	pipe->bufs[to.head & mask].len -= offset;
-
-	for (i = 0; i < nr_pages; i++) {
-		size_t this_len = min_t(size_t, len, PAGE_SIZE - offset);
-		vec[i].iov_base = page_address(pages[i]) + offset;
-		vec[i].iov_len = this_len;
-		len -= this_len;
-		offset = 0;
-	}
-
-	res = kernel_readv(in, vec, nr_pages, *ppos);
-	if (res > 0) {
-		copied = res;
-		*ppos += res;
-	}
-
-	if (vec != __vec)
-		kfree(vec);
-out:
-	for (i = 0; i < nr_pages; i++)
-		put_page(pages[i]);
-	kvfree(pages);
-	iov_iter_advance(&to, copied);	/* truncates and discards */
-	return res;
-}
-
 /*
  * Send 'sd->len' bytes to socket from 'sd->file' at position 'sd->pos'
  * using sendpage(). Return the number of bytes sent.
@@ -788,33 +705,6 @@ iter_file_splice_write(struct pipe_inode_info *pipe, struct file *out,

 EXPORT_SYMBOL(iter_file_splice_write);

-static int write_pipe_buf(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
-			  struct splice_desc *sd)
-{
-	int ret;
-	void *data;
-	loff_t tmp = sd->pos;
-
-	data = kmap(buf->page);
-	ret = __kernel_write(sd->u.file, data + buf->offset, sd->len, &tmp);
-	kunmap(buf->page);
-
-	return ret;
-}
-
-static ssize_t default_file_splice_write(struct pipe_inode_info *pipe,
-					 struct file *out, loff_t *ppos,
-					 size_t len, unsigned int flags)
-{