Re: [PATCH][RFC] network splice receive v3
On Thu, Jul 19 2007, YOSHIFUJI Hideaki / 吉藤英明 wrote:
> Hello.
>
> In article <[EMAIL PROTECTED]> (at Wed, 11 Jul 2007 11:19:27 +0200), Jens
> Axboe <[EMAIL PROTECTED]> says:
>
> > @@ -835,6 +835,7 @@ const struct proto_ops inet_stream_ops = {
> >  	.recvmsg = sock_common_recvmsg,
> >  	.mmap = sock_no_mmap,
> >  	.sendpage = tcp_sendpage,
> > +	.splice_read = tcp_splice_read,
> >  #ifdef CONFIG_COMPAT
> >  	.compat_setsockopt = compat_sock_common_setsockopt,
> >  	.compat_getsockopt = compat_sock_common_getsockopt,
>
> Please add similar bits in net/ipv6/af_inet6.c
> unless there are any dependencies on IPv4.
> (And if there are, it is not good.)

There are no specific ipv4 dependencies, it's just an oversight. So
thanks for the clue, I'll add it!

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
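To see why the missing IPv6 entry matters, here is a minimal userspace sketch of a per-family ops table with an optional splice_read hook. Every model_* name is hypothetical, and the -EINVAL return for a family without the hook is an assumption, since the net/socket.c part of the patch is not quoted in this message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical userspace stand-in for the kernel's struct socket. */
struct model_socket;

/* Reduced model of struct proto_ops: only the hook under discussion. */
struct model_proto_ops {
	const char *name;
	ssize_t (*splice_read)(struct model_socket *sock, size_t len);
};

struct model_socket {
	const struct model_proto_ops *ops;
};

/* Stand-in for tcp_splice_read(); pretends every requested byte was spliced. */
static ssize_t model_tcp_splice_read(struct model_socket *sock, size_t len)
{
	(void)sock;
	return (ssize_t)len;
}

/* IPv4 table with the hook wired up, as in the quoted hunk. */
static const struct model_proto_ops model_inet_stream_ops = {
	.name = "inet_stream_ops",
	.splice_read = model_tcp_splice_read,
};

/* IPv6 table before the fix: the hook is left NULL. */
static const struct model_proto_ops model_inet6_stream_ops = {
	.name = "inet6_stream_ops",
};

/* Assumed dispatch policy: reject the call when the family has no hook. */
static ssize_t model_sock_splice_read(struct model_socket *sock, size_t len)
{
	assert(sock != NULL && sock->ops != NULL);
	if (sock->ops->splice_read == NULL)
		return -EINVAL;
	return sock->ops->splice_read(sock, len);
}
```

In this model, a socket backed by model_inet6_stream_ops fails the call even though the TCP code underneath is the same, which is the gap the review comment points at.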
Re: [PATCH][RFC] network splice receive v3
Hello.

In article <[EMAIL PROTECTED]> (at Wed, 11 Jul 2007 11:19:27 +0200), Jens Axboe <[EMAIL PROTECTED]> says:

> @@ -835,6 +835,7 @@ const struct proto_ops inet_stream_ops = {
>  	.recvmsg = sock_common_recvmsg,
>  	.mmap = sock_no_mmap,
>  	.sendpage = tcp_sendpage,
> +	.splice_read = tcp_splice_read,
>  #ifdef CONFIG_COMPAT
>  	.compat_setsockopt = compat_sock_common_setsockopt,
>  	.compat_getsockopt = compat_sock_common_getsockopt,

Please add similar bits in net/ipv6/af_inet6.c
unless there are any dependencies on IPv4.
(And if there are, it is not good.)

--yoshfuji
Re: [PATCH][RFC] network splice receive v3
Hi.

On Fri, Jul 13, 2007 at 02:21:00PM +0200, Jens Axboe ([EMAIL PROTECTED]) wrote:
> > It really looks like the last tree we tested, so if you think additional
> > one will not hurt, feel free to ping, so I will completely rebase
> > testing tree.
>
> It would be great if you could retest! There are some minor changes in
> there, and some extra testing definitely will not hurt.

I've just tested it with 2.6.22 (commit
e1c1e98d2a3f57b22a0d4136c8160e54404aa437) and did not find any problems:
after quite big files were transferred there is no skb leak like the one
observed previously and no crashes (quite a few debug options are turned
on in the config), and the files are correct on both peers, so it works
well.

> -- 
> Jens Axboe

-- 
	Evgeniy Polyakov
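For readers who want to repeat this kind of transfer test, the receive side can be sketched in userspace with splice(2): socket to pipe, then pipe to file. This is only an illustration, not the test tool from the original announcement. The helper name splice_socket_to_fd and the 64 KiB chunk size are assumptions, and the socket's protocol must implement splice_read for the first step to succeed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for splice() and SPLICE_F_MOVE */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: move everything readable from sock_fd into out_fd
 * through an intermediate pipe, without copying through a user buffer.
 * Returns the number of bytes written, or -1 with errno set.
 */
static ssize_t splice_socket_to_fd(int sock_fd, int out_fd)
{
	int pfd[2];
	int saved_errno;
	ssize_t total = 0;

	assert(sock_fd >= 0 && out_fd >= 0);
	if (pipe(pfd) < 0)
		return -1;

	for (;;) {
		/* Socket -> pipe: the step that needs splice_read support. */
		ssize_t in = splice(sock_fd, NULL, pfd[1], NULL, 65536,
				    SPLICE_F_MOVE);
		if (in < 0 && errno == EINTR)
			continue;
		if (in <= 0) {
			if (in < 0)
				total = -1;
			break;	/* 0 means the peer closed the stream */
		}

		/* Pipe -> destination: drain exactly what was queued. */
		while (in > 0) {
			ssize_t out = splice(pfd[0], NULL, out_fd, NULL,
					     (size_t)in, SPLICE_F_MOVE);
			if (out < 0 && errno == EINTR)
				continue;
			if (out <= 0) {
				total = -1;
				goto done;
			}
			in -= out;
			total += out;
		}
	}
done:
	saved_errno = errno;	/* close() must not clobber the error code */
	close(pfd[0]);
	close(pfd[1]);
	errno = saved_errno;
	return total;
}
```

Comparing the output file with the sender's copy (for example by checksum) gives the same kind of correctness check as the one reported above.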
Re: [PATCH][RFC] network splice receive v3
On Thu, Jul 12 2007, Evgeniy Polyakov wrote:
> On Wed, Jul 11, 2007 at 11:19:27AM +0200, Jens Axboe ([EMAIL PROTECTED])
> wrote:
> > Hi,
>
> Hi Jens.
>
> > Here's an updated implementation of tcp network splice receive support.
> > It actually works for me now, no data corruption seen.
> >
> > For the original announcement and how to test it, see:
> >
> > http://marc.info/?l=linux-netdev&m=118103093400770&w=2
> >
> > The splice core changes needed to support this are now merged in
> > 2.6.22-git, so the patchset shrinks to just two patches - one for adding
> > a release hook, and one for the networking changes.
> >
> > The code is also available in the splice-net branch here:
> >
> > git://git.kernel.dk/data/git/linux-2.6-block.git splice-net
> >
> > There's a third experimental patch in there that allows vmsplice
> > directly to user memory, that still needs some work though.
> >
> > Comments, testing welcome!
>
> It looks like you included all bits we found in the previous runs, so
> likely it will work well, but so far I have conflicts merging today's git
> and your tree in include/linux/splice.h, fs/ext2/file.c, fs/splice.c and
> mm/filemap_xip.c. This can be a problem with my tree though.

Hmm, the patch should apply directly to the tree as of when I posted
this original mail, or any later one. I just tried a rebase, and it
rebased fine on top of the current -git as well. So I think the issue is
with your tree, sorry!

> It really looks like the last tree we tested, so if you think additional
> one will not hurt, feel free to ping, so I will completely rebase
> testing tree.

It would be great if you could retest! There are some minor changes in
there, and some extra testing definitely will not hurt.

-- 
Jens Axboe
Re: [PATCH][RFC] network splice receive v3
On Wed, Jul 11, 2007 at 11:19:27AM +0200, Jens Axboe ([EMAIL PROTECTED]) wrote:
> Hi,

Hi Jens.

> Here's an updated implementation of tcp network splice receive support.
> It actually works for me now, no data corruption seen.
>
> For the original announcement and how to test it, see:
>
> http://marc.info/?l=linux-netdev&m=118103093400770&w=2
>
> The splice core changes needed to support this are now merged in
> 2.6.22-git, so the patchset shrinks to just two patches - one for adding
> a release hook, and one for the networking changes.
>
> The code is also available in the splice-net branch here:
>
> git://git.kernel.dk/data/git/linux-2.6-block.git splice-net
>
> There's a third experimental patch in there that allows vmsplice
> directly to user memory, that still needs some work though.
>
> Comments, testing welcome!

It looks like you included all bits we found in the previous runs, so
likely it will work well, but so far I have conflicts merging today's git
and your tree in include/linux/splice.h, fs/ext2/file.c, fs/splice.c and
mm/filemap_xip.c. This can be a problem with my tree though.

It really looks like the last tree we tested, so if you think additional
one will not hurt, feel free to ping, so I will completely rebase
testing tree.

> -- 
> Jens Axboe

-- 
	Evgeniy Polyakov
Re: [PATCH][RFC] network splice receive v3
On Wed, Jul 11 2007, Joel Becker wrote:
> On Wed, Jul 11, 2007 at 11:19:27AM +0200, Jens Axboe wrote:
> > Subject: [PATCH] splice: don't assume regular pages in splice_to_pipe()
> >
> > Allow caller to pass in a release function, there might be
> > other resources that need releasing as well. Needed for
> > network receive.
> >
> > diff --git a/fs/splice.c b/fs/splice.c
> > index 3160951..4b4b501 100644
> > --- a/fs/splice.c
> > +++ b/fs/splice.c
> > @@ -254,11 +254,16 @@ ssize_t splice_to_pipe(struct pipe_inode_info *pipe,
> >  	}
> >
> >  	while (page_nr < spd_pages)
> > -		page_cache_release(spd->pages[page_nr++]);
> > +		spd->spd_release(spd, page_nr++);
>
> 	Rather than requiring the caller set this, shouldn't we just
> allow it? Especially given there is only one non-page user?
>
> 	while (page_nr < spd_pages)
> -		page_cache_release(spd->pages[page_nr++]);
> +		if (spd->spd_release)
> +			spd->spd_release(spd, page_nr++);
> +		else
> +			page_cache_release(spd->pages[page_nr++]);

Certainly possible, I think it's cleaner with it always being set
though. If it grows other out-of-splice.c users, then your change may be
a good idea though.

-- 
Jens Axboe
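The two cleanup policies discussed in this exchange can be compared side by side in a small userspace model. This is a sketch only: struct model_spd and all model_* functions are hypothetical stand-ins for struct splice_pipe_desc, page_cache_release() and the cleanup loop in splice_to_pipe(), and integer counters take the place of real page references.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_PAGES 16

/* Hypothetical userspace model of struct splice_pipe_desc. */
struct model_spd {
	int refs[MODEL_MAX_PAGES];	/* stand-in for per-page references */
	unsigned int nr_pages;
	void (*spd_release)(struct model_spd *, unsigned int);
	int extra_released;		/* counts non-page resources released */
};

/* Default hook: drop only the page reference (models page_cache_release()). */
static void model_release_page(struct model_spd *spd, unsigned int i)
{
	spd->refs[i]--;
}

/* Hook for a caller that also owns another resource per page. */
static void model_release_page_and_extra(struct model_spd *spd, unsigned int i)
{
	spd->refs[i]--;
	spd->extra_released++;
}

/* Variant A (the posted patch): every caller must set the hook. */
static void model_cleanup_required(struct model_spd *spd, unsigned int page_nr)
{
	assert(spd->spd_release != NULL);
	while (page_nr < spd->nr_pages)
		spd->spd_release(spd, page_nr++);
}

/* Variant B (the suggestion): a NULL hook falls back to the page release. */
static void model_cleanup_optional(struct model_spd *spd, unsigned int page_nr)
{
	while (page_nr < spd->nr_pages) {
		if (spd->spd_release)
			spd->spd_release(spd, page_nr++);
		else
			model_release_page(spd, page_nr++);
	}
}
```

Variant A keeps the loop free of branches but makes a forgotten initializer a NULL dereference. Variant B tolerates a missing hook at the cost of a test per page. With only the in-file callers setting the hook, either behaves the same.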
Re: [PATCH][RFC] network splice receive v3
On Wed, Jul 11, 2007 at 11:19:27AM +0200, Jens Axboe wrote:
> Subject: [PATCH] splice: don't assume regular pages in splice_to_pipe()
>
> Allow caller to pass in a release function, there might be
> other resources that need releasing as well. Needed for
> network receive.
>
> diff --git a/fs/splice.c b/fs/splice.c
> index 3160951..4b4b501 100644
> --- a/fs/splice.c
> +++ b/fs/splice.c
> @@ -254,11 +254,16 @@ ssize_t splice_to_pipe(struct pipe_inode_info *pipe,
>  	}
>
>  	while (page_nr < spd_pages)
> -		page_cache_release(spd->pages[page_nr++]);
> +		spd->spd_release(spd, page_nr++);

	Rather than requiring the caller set this, shouldn't we just
allow it? Especially given there is only one non-page user?

	while (page_nr < spd_pages)
-		page_cache_release(spd->pages[page_nr++]);
+		if (spd->spd_release)
+			spd->spd_release(spd, page_nr++);
+		else
+			page_cache_release(spd->pages[page_nr++]);

Joel

-- 

"Any man who is under 30, and is not a liberal, has no heart;
 and any man who is over 30, and is not a conservative, has no brains."
	- Sir Winston Churchill

Joel Becker
Principal Software Developer
Oracle
E-mail: [EMAIL PROTECTED]
Phone: (650) 506-8127
[PATCH][RFC] network splice receive v3
Hi,

Here's an updated implementation of tcp network splice receive support.
It actually works for me now, no data corruption seen.

For the original announcement and how to test it, see:

http://marc.info/?l=linux-netdev&m=118103093400770&w=2

The splice core changes needed to support this are now merged in
2.6.22-git, so the patchset shrinks to just two patches - one for adding
a release hook, and one for the networking changes.

The code is also available in the splice-net branch here:

git://git.kernel.dk/data/git/linux-2.6-block.git splice-net

There's a third experimental patch in there that allows vmsplice
directly to user memory, that still needs some work though.

Comments, testing welcome!

-- 
Jens Axboe

From e59a68f2d7d261b301960b97659910aab8e3d776 Mon Sep 17 00:00:00 2001
From: Jens Axboe <[EMAIL PROTECTED]>
Date: Mon, 11 Jun 2007 13:00:32 +0200
Subject: [PATCH] splice: don't assume regular pages in splice_to_pipe()

Allow caller to pass in a release function, there might be
other resources that need releasing as well. Needed for
network receive.

Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>
---
 fs/splice.c            |    9 ++++++++-
 include/linux/splice.h |    1 +
 2 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index 3160951..4b4b501 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -254,11 +254,16 @@ ssize_t splice_to_pipe(struct pipe_inode_info *pipe,
 	}
 
 	while (page_nr < spd_pages)
-		page_cache_release(spd->pages[page_nr++]);
+		spd->spd_release(spd, page_nr++);
 
 	return ret;
 }
 
+static void spd_release_page(struct splice_pipe_desc *spd, unsigned int i)
+{
+	page_cache_release(spd->pages[i]);
+}
+
 static int
 __generic_file_splice_read(struct file *in, loff_t *ppos,
 			   struct pipe_inode_info *pipe, size_t len,
@@ -277,6 +282,7 @@ __generic_file_splice_read(struct file *in, loff_t *ppos,
 		.partial = partial,
 		.flags = flags,
 		.ops = &page_cache_pipe_buf_ops,
+		.spd_release = spd_release_page,
 	};
 
 	index = *ppos >> PAGE_CACHE_SHIFT;
@@ -1674,6 +1680,7 @@ static long vmsplice_to_pipe(struct file *file, const struct iovec __user *iov,
 		.partial = partial,
 		.flags = flags,
 		.ops = &user_page_pipe_buf_ops,
+		.spd_release = spd_release_page,
 	};
 
 	pipe = pipe_info(file->f_path.dentry->d_inode);
diff --git a/include/linux/splice.h b/include/linux/splice.h
index 2c08456..b8fa41e 100644
--- a/include/linux/splice.h
+++ b/include/linux/splice.h
@@ -54,6 +54,7 @@ struct splice_pipe_desc {
 	int nr_pages;			/* number of pages in map */
 	unsigned int flags;		/* splice flags */
 	const struct pipe_buf_operations *ops;/* ops associated with output pipe */
+	void (*spd_release)(struct splice_pipe_desc *, unsigned int);
 };
 
 typedef int (splice_actor)(struct pipe_inode_info *, struct pipe_buffer *,
-- 
1.5.3.rc0.90.gbaa79

From b62e4a5a3e3220702e837e556427972dc591ff59 Mon Sep 17 00:00:00 2001
From: Jens Axboe <[EMAIL PROTECTED]>
Date: Wed, 20 Jun 2007 09:54:14 +0200
Subject: [PATCH] TCP splice receive support

Support for network splice receive.

Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>
---
 include/linux/net.h    |    3 +
 include/linux/skbuff.h |    5 +
 include/net/tcp.h      |    3 +
 net/core/skbuff.c      |  246
 net/ipv4/af_inet.c     |    1 +
 net/ipv4/tcp.c         |  129 +
 net/socket.c           |   13 +++
 7 files changed, 400 insertions(+), 0 deletions(-)

diff --git a/include/linux/net.h b/include/linux/net.h
index efc4517..472ee12 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -19,6 +19,7 @@
 #define _LINUX_NET_H
 
 #include
+#include
 #include
 
 struct poll_table_struct;
@@ -165,6 +166,8 @@ struct proto_ops {
 				      struct vm_area_struct * vma);
 	ssize_t		(*sendpage)  (struct socket *sock, struct page *page,
 				      int offset, size_t size, int flags);
+	ssize_t		(*splice_read)(struct socket *sock, loff_t *ppos,
+				       struct pipe_inode_info *pipe, size_t len, unsigned int flags);
 };
 
 struct net_proto_family {
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6f0b2f7..177bffc 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1504,6 +1504,11 @@ extern int skb_store_bits(struct sk_buff *skb, int offset,
 extern __wsum	       skb_copy_and_csum_bits(const struct sk_buff *skb,
 					      int offset, u8 *to, int len,
 					      __wsum csum);
+extern int	       skb_splice_bits(struct sk_buff *skb,
+				       unsigned int offset,
+				       struct pipe_inode_info *pipe,
+				       unsigned int len,
+				       unsigned int flags);
 extern void	       skb_copy_and_csum_dev(const struct sk_buff *skb, u8 *to);
 extern void	       skb_split(struct sk_buff *skb,
 				 struct sk_buff *skb1, const u32 len);
diff --git a/include/net/tcp.h b/include/net/tcp.h
index a8af9ae..8e86697 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -308,6 +308,9 @@ extern int tcp_twsk_uniqu