Re: [PATCH][RFC] network splice receive v3

2007-07-19 Thread Jens Axboe
On Thu, Jul 19 2007, YOSHIFUJI Hideaki / 吉藤英明 wrote:
> Hello.
> 
> In article <[EMAIL PROTECTED]> (at Wed, 11 Jul 2007 11:19:27 +0200), Jens 
> Axboe <[EMAIL PROTECTED]> says:
> 
> > @@ -835,6 +835,7 @@ const struct proto_ops inet_stream_ops = {
> >  	.recvmsg   = sock_common_recvmsg,
> >  	.mmap  = sock_no_mmap,
> >  	.sendpage  = tcp_sendpage,
> > +	.splice_read   = tcp_splice_read,
> >  #ifdef CONFIG_COMPAT
> >  	.compat_setsockopt = compat_sock_common_setsockopt,
> >  	.compat_getsockopt = compat_sock_common_getsockopt,
> 
> Please add similar bits in net/ipv6/af_inet6.c
> unless there is any dependency on IPv4.
> (And if there is, that is not good.)

There are no specific IPv4 dependencies, it's just an oversight. So
thanks for the pointer, I'll add it!

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][RFC] network splice receive v3

2007-07-19 Thread YOSHIFUJI Hideaki / 吉藤英明
Hello.

In article <[EMAIL PROTECTED]> (at Wed, 11 Jul 2007 11:19:27 +0200), Jens Axboe 
<[EMAIL PROTECTED]> says:

> @@ -835,6 +835,7 @@ const struct proto_ops inet_stream_ops = {
>   .recvmsg   = sock_common_recvmsg,
>   .mmap  = sock_no_mmap,
>   .sendpage  = tcp_sendpage,
> + .splice_read   = tcp_splice_read,
>  #ifdef CONFIG_COMPAT
>   .compat_setsockopt = compat_sock_common_setsockopt,
>   .compat_getsockopt = compat_sock_common_getsockopt,

Please add similar bits in net/ipv6/af_inet6.c
unless there is any dependency on IPv4.
(And if there is, that is not good.)

--yoshfuji


Re: [PATCH][RFC] network splice receive v3

2007-07-19 Thread Evgeniy Polyakov
Hi.

On Fri, Jul 13, 2007 at 02:21:00PM +0200, Jens Axboe ([EMAIL PROTECTED]) wrote:
> > It really looks like the last tree we tested, so if you think an additional
> > run will not hurt, feel free to ping me and I will completely rebase my
> > testing tree.
> 
> It would be great if you could retest! There are some minor changes in
> there, and some extra testing definitely will not hurt.

I've just tested it with 2.6.22
(commit e1c1e98d2a3f57b22a0d4136c8160e54404aa437) and did not find any
problems: after quite big files were transferred, the previously observed
skb leak is gone, there are no crashes (quite a few debug options are
turned on in the config), and the files are correct on both peers, so it
works well.

> -- 
> Jens Axboe

-- 
Evgeniy Polyakov


Re: [PATCH][RFC] network splice receive v3

2007-07-13 Thread Jens Axboe
On Thu, Jul 12 2007, Evgeniy Polyakov wrote:
> On Wed, Jul 11, 2007 at 11:19:27AM +0200, Jens Axboe ([EMAIL PROTECTED]) 
> wrote:
> > Hi,
> 
> Hi Jens.
> 
> > Here's an updated implementation of tcp network splice receive support.
> > It actually works for me now, no data corruption seen.
> > 
> > For the original announcement and how to test it, see:
> > 
> > http://marc.info/?l=linux-netdev&m=118103093400770&w=2
> > 
> > The splice core changes needed to support this are now merged in
> > 2.6.22-git, so the patchset shrinks to just two patches - one for adding
> > a release hook, and one for the networking changes.
> > 
> > The code is also available in the splice-net branch here:
> > 
> > git://git.kernel.dk/data/git/linux-2.6-block.git splice-net
> > 
> > There's a third experimental patch in there that allows vmsplice
> > directly to user memory, that still needs some work though.
> > 
> > Comments, testing welcome!
> 
> It looks like you included all bits we found in the previous runs, so
> it will likely work well, but so far I have conflicts merging today's git
> and your tree in include/linux/splice.h, fs/ext2/file.c, fs/splice.c and 
> mm/filemap_xip.c. This can be a problem with my tree though.

Hmm, the patch should apply directly to the tree as of when I posted
this original mail, or any later one. I just tried a rebase, and it
rebased fine on top of the current -git as well. So I think the issue is
with your tree, sorry!

> It really looks like the last tree we tested, so if you think an additional
> run will not hurt, feel free to ping me and I will completely rebase my
> testing tree.

It would be great if you could retest! There are some minor changes in
there, and some extra testing definitely will not hurt.

-- 
Jens Axboe



Re: [PATCH][RFC] network splice receive v3

2007-07-12 Thread Evgeniy Polyakov
On Wed, Jul 11, 2007 at 11:19:27AM +0200, Jens Axboe ([EMAIL PROTECTED]) wrote:
> Hi,

Hi Jens.

> Here's an updated implementation of tcp network splice receive support.
> It actually works for me now, no data corruption seen.
> 
> For the original announcement and how to test it, see:
> 
> http://marc.info/?l=linux-netdev&m=118103093400770&w=2
> 
> The splice core changes needed to support this are now merged in
> 2.6.22-git, so the patchset shrinks to just two patches - one for adding
> a release hook, and one for the networking changes.
> 
> The code is also available in the splice-net branch here:
> 
> git://git.kernel.dk/data/git/linux-2.6-block.git splice-net
> 
> There's a third experimental patch in there that allows vmsplice
> directly to user memory, that still needs some work though.
> 
> Comments, testing welcome!

It looks like you included all bits we found in the previous runs, so
it will likely work well, but so far I have conflicts merging today's git
and your tree in include/linux/splice.h, fs/ext2/file.c, fs/splice.c and 
mm/filemap_xip.c. This can be a problem with my tree though.
It really looks like the last tree we tested, so if you think an additional
run will not hurt, feel free to ping me and I will completely rebase my
testing tree.

> -- 
> Jens Axboe


-- 
Evgeniy Polyakov


Re: [PATCH][RFC] network splice receive v3

2007-07-11 Thread Jens Axboe
On Wed, Jul 11 2007, Joel Becker wrote:
> On Wed, Jul 11, 2007 at 11:19:27AM +0200, Jens Axboe wrote:
> > Subject: [PATCH] splice: don't assume regular pages in splice_to_pipe()
> > 
> > Allow caller to pass in a release function, there might be
> > other resources that need releasing as well. Needed for
> > network receive.
> > 
> > diff --git a/fs/splice.c b/fs/splice.c
> > index 3160951..4b4b501 100644
> > --- a/fs/splice.c
> > +++ b/fs/splice.c
> > @@ -254,11 +254,16 @@ ssize_t splice_to_pipe(struct pipe_inode_info *pipe,
> > 	}
> >  
> > 	while (page_nr < spd_pages)
> > -		page_cache_release(spd->pages[page_nr++]);
> > +		spd->spd_release(spd, page_nr++);
> 
>   Rather than requiring the caller set this, shouldn't we just
> allow it?  Especially given there is only one non-page user?
> 
>  	while (page_nr < spd_pages)
> -		page_cache_release(spd->pages[page_nr++]);
> +		if (spd->spd_release)
> +			spd->spd_release(spd, page_nr++);
> +		else
> +			page_cache_release(spd->pages[page_nr++]);

Certainly possible, but I think it's cleaner with it always being set.
If it grows other out-of-splice.c users, then your change may be
a good idea.
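
To make the two options concrete, here is a minimal userspace sketch of the
trade-off discussed above. Everything in it is a hypothetical stand-in
(struct page_model, struct spd_model, page_release_model() and the two
release_unused_*() helpers are invented for illustration and are not the
kernel definitions). Variant A mirrors the patch as posted, where every
caller must set spd_release; variant B mirrors the suggested fallback to the
plain page release when the hook is NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only a reference count. */
struct page_model {
	int refcount;
};

/* Hypothetical, trimmed-down stand-in for struct splice_pipe_desc. */
struct spd_model {
	struct page_model **pages;
	unsigned int nr_pages;
	/* Release hook: called once for each page index that was not used. */
	void (*spd_release)(struct spd_model *, unsigned int);
};

/* Stand-in for page_cache_release(): drop one reference. */
static void page_release_model(struct page_model *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/* Default hook, modeled on spd_release_page() from the patch. */
void spd_release_page_model(struct spd_model *spd, unsigned int i)
{
	page_release_model(spd->pages[i]);
}

/* Variant A (patch as posted): the hook is mandatory and always called. */
void release_unused_mandatory(struct spd_model *spd, unsigned int page_nr)
{
	while (page_nr < spd->nr_pages)
		spd->spd_release(spd, page_nr++);
}

/* Variant B (suggested alternative): a NULL hook means "plain pages". */
void release_unused_optional(struct spd_model *spd, unsigned int page_nr)
{
	while (page_nr < spd->nr_pages) {
		if (spd->spd_release)
			spd->spd_release(spd, page_nr++);
		else
			page_release_model(spd->pages[page_nr++]);
	}
}
```

With variant A a caller that forgets to set the hook crashes on the first
leftover page, which makes the omission obvious; with variant B such a caller
silently gets the page-cache behaviour, which is only right if its buffers
really are plain pages.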

-- 
Jens Axboe



Re: [PATCH][RFC] network splice receive v3

2007-07-11 Thread Joel Becker
On Wed, Jul 11, 2007 at 11:19:27AM +0200, Jens Axboe wrote:
> Subject: [PATCH] splice: don't assume regular pages in splice_to_pipe()
> 
> Allow caller to pass in a release function, there might be
> other resources that need releasing as well. Needed for
> network receive.
> 
> diff --git a/fs/splice.c b/fs/splice.c
> index 3160951..4b4b501 100644
> --- a/fs/splice.c
> +++ b/fs/splice.c
> @@ -254,11 +254,16 @@ ssize_t splice_to_pipe(struct pipe_inode_info *pipe,
>   }
>  
>   while (page_nr < spd_pages)
> - page_cache_release(spd->pages[page_nr++]);
> + spd->spd_release(spd, page_nr++);

Rather than requiring the caller set this, shouldn't we just
allow it?  Especially given there is only one non-page user?

 	while (page_nr < spd_pages)
-		page_cache_release(spd->pages[page_nr++]);
+		if (spd->spd_release)
+			spd->spd_release(spd, page_nr++);
+		else
+			page_cache_release(spd->pages[page_nr++]);

Joel

-- 

"Any man who is under 30, and is not a liberal, has not heart;
 and any man who is over 30, and is not a conservative, has no brains."
 - Sir Winston Churchill 

Joel Becker
Principal Software Developer
Oracle
E-mail: [EMAIL PROTECTED]
Phone: (650) 506-8127


[PATCH][RFC] network splice receive v3

2007-07-11 Thread Jens Axboe
Hi,

Here's an updated implementation of tcp network splice receive support.
It actually works for me now, no data corruption seen.

For the original announcement and how to test it, see:

http://marc.info/?l=linux-netdev&m=118103093400770&w=2

The splice core changes needed to support this are now merged in
2.6.22-git, so the patchset shrinks to just two patches - one for adding
a release hook, and one for the networking changes.

The code is also available in the splice-net branch here:

git://git.kernel.dk/data/git/linux-2.6-block.git splice-net

There's a third experimental patch in there that allows vmsplice
directly to user memory, that still needs some work though.

Comments, testing welcome!
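
For anyone who wants to exercise the receive path from userspace, a minimal
sketch follows. It is not the test tool from the original announcement; the
helper name splice_fd_to_fd() is made up for this example. It moves data from
a source descriptor into a destination descriptor through an intermediate
pipe using splice(2). The first splice() call (source to pipe) is the step
that would need the socket's splice_read support when in_fd is a TCP socket,
so on a kernel without these patches that call is expected to fail for
sockets.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Move up to len bytes from in_fd to out_fd via a temporary pipe.
 * Returns the number of bytes moved, or -1 on error (errno is set by
 * the failing call). Stops early if in_fd reports end of file.
 */
ssize_t splice_fd_to_fd(int in_fd, int out_fd, size_t len)
{
	int pfd[2];
	ssize_t total = 0;
	int saved_errno;

	assert(in_fd >= 0 && out_fd >= 0);

	if (pipe(pfd) < 0)
		return -1;

	while (len > 0) {
		/* Source -> pipe; for a socket this needs splice_read. */
		ssize_t in = splice(in_fd, NULL, pfd[1], NULL, len,
				    SPLICE_F_MOVE);

		if (in < 0 && errno == EINTR)
			continue;
		if (in < 0)
			total = -1;
		if (in <= 0)
			break;	/* error or end of file */
		len -= (size_t)in;

		/* Pipe -> destination; drain all that was just queued. */
		while (in > 0) {
			ssize_t out = splice(pfd[0], NULL, out_fd, NULL,
					     (size_t)in, SPLICE_F_MOVE);

			if (out < 0 && errno == EINTR)
				continue;
			if (out <= 0) {
				total = -1;
				goto done;
			}
			in -= out;
			total += out;
		}
	}
done:
	/* Preserve errno from the failing call across close(). */
	saved_errno = errno;
	close(pfd[0]);
	close(pfd[1]);
	errno = saved_errno;
	return total;
}
```

A caller would typically pass a connected TCP socket as in_fd and an open
file as out_fd, then compare the received file against the sender's copy.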

-- 
Jens Axboe

From e59a68f2d7d261b301960b97659910aab8e3d776 Mon Sep 17 00:00:00 2001
From: Jens Axboe <[EMAIL PROTECTED]>
Date: Mon, 11 Jun 2007 13:00:32 +0200
Subject: [PATCH] splice: don't assume regular pages in splice_to_pipe()

Allow caller to pass in a release function, there might be
other resources that need releasing as well. Needed for
network receive.

Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>
---
 fs/splice.c            |    9 ++++++++-
 include/linux/splice.h |    1 +
 2 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index 3160951..4b4b501 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -254,11 +254,16 @@ ssize_t splice_to_pipe(struct pipe_inode_info *pipe,
 	}
 
 	while (page_nr < spd_pages)
-		page_cache_release(spd->pages[page_nr++]);
+		spd->spd_release(spd, page_nr++);
 
 	return ret;
 }
 
+static void spd_release_page(struct splice_pipe_desc *spd, unsigned int i)
+{
+	page_cache_release(spd->pages[i]);
+}
+
 static int
 __generic_file_splice_read(struct file *in, loff_t *ppos,
 			   struct pipe_inode_info *pipe, size_t len,
@@ -277,6 +282,7 @@ __generic_file_splice_read(struct file *in, loff_t *ppos,
 		.partial = partial,
 		.flags = flags,
 		.ops = &page_cache_pipe_buf_ops,
+		.spd_release = spd_release_page,
 	};
 
 	index = *ppos >> PAGE_CACHE_SHIFT;
@@ -1674,6 +1680,7 @@ static long vmsplice_to_pipe(struct file *file, const struct iovec __user *iov,
 		.partial = partial,
 		.flags = flags,
 		.ops = &user_page_pipe_buf_ops,
+		.spd_release = spd_release_page,
 	};
 
 	pipe = pipe_info(file->f_path.dentry->d_inode);
diff --git a/include/linux/splice.h b/include/linux/splice.h
index 2c08456..b8fa41e 100644
--- a/include/linux/splice.h
+++ b/include/linux/splice.h
@@ -54,6 +54,7 @@ struct splice_pipe_desc {
 	int nr_pages;			/* number of pages in map */
 	unsigned int flags;		/* splice flags */
 	const struct pipe_buf_operations *ops;/* ops associated with output pipe */
+	void (*spd_release)(struct splice_pipe_desc *, unsigned int);
 };
 
 typedef int (splice_actor)(struct pipe_inode_info *, struct pipe_buffer *,
-- 
1.5.3.rc0.90.gbaa79

From b62e4a5a3e3220702e837e556427972dc591ff59 Mon Sep 17 00:00:00 2001
From: Jens Axboe <[EMAIL PROTECTED]>
Date: Wed, 20 Jun 2007 09:54:14 +0200
Subject: [PATCH] TCP splice receive support

Support for network splice receive.

Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>
---
 include/linux/net.h    |    3 +
 include/linux/skbuff.h |    5 +
 include/net/tcp.h      |    3 +
 net/core/skbuff.c      |  246
 net/ipv4/af_inet.c     |    1 +
 net/ipv4/tcp.c         |  129 +
 net/socket.c           |   13 +++
 7 files changed, 400 insertions(+), 0 deletions(-)

diff --git a/include/linux/net.h b/include/linux/net.h
index efc4517..472ee12 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -19,6 +19,7 @@
 #define _LINUX_NET_H
 
 #include 
+#include 
 #include 
 
 struct poll_table_struct;
@@ -165,6 +166,8 @@ struct proto_ops {
   struct vm_area_struct * vma);
 	ssize_t		(*sendpage)  (struct socket *sock, struct page *page,
   int offset, size_t size, int flags);
+	ssize_t 	(*splice_read)(struct socket *sock,  loff_t *ppos,
+   struct pipe_inode_info *pipe, size_t len, unsigned int flags);
 };
 
 struct net_proto_family {
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6f0b2f7..177bffc 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1504,6 +1504,11 @@ extern int	   skb_store_bits(struct sk_buff *skb, int offset,
 extern __wsum	   skb_copy_and_csum_bits(const struct sk_buff *skb,
 	  int offset, u8 *to, int len,
 	  __wsum csum);
+extern int skb_splice_bits(struct sk_buff *skb,
+		unsigned int offset,
+		struct pipe_inode_info *pipe,
+		unsigned int len,
+		unsigned int flags);
 extern void	   skb_copy_and_csum_dev(const struct sk_buff *skb, u8 *to);
 extern void	   skb_split(struct sk_buff *skb,
  struct sk_buff *skb1, const u32 len);
diff --git a/include/net/tcp.h b/include/net/tcp.h
index a8af9ae..8e86697 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -308,6 +308,9 @@ extern int			tcp_twsk_uniqu