Re: [PATCH V3 net-next 2/3] rds: deliver zerocopy completion notification with data

2018-02-26 Thread Santosh Shilimkar

On 2/26/2018 9:11 AM, Sowmini Varadhan wrote:

On (02/26/18 09:07), Santosh Shilimkar wrote:

Just in case you haven't seen yet, Dan Carpenter reported skb deref
warning on previous version of the patch. Not sure why it wasn't sent
on netdev.


yes I saw it, but that was for the previous version of the patch
around code that delivers notifications on sk_error_queue.

This patch series removes the sk_error_queue support to the
smatch warning is not applicable after this patch.


I see it now. Thanks !!

Regards,
Santosh


Re: [PATCH V3 net-next 2/3] rds: deliver zerocopy completion notification with data

2018-02-26 Thread David Miller
From: Sowmini Varadhan 
Date: Mon, 26 Feb 2018 12:11:18 -0500

> on a different note, for some odd reason I'm not seeing this patch series
> on the patch queue, though its showing up in the archives.

Patchwork has been dropping patches lately, I've sent a detailed
report of a case of a BPF patch series to the ozlabs folks, but with
no response so far.


Re: [PATCH V3 net-next 2/3] rds: deliver zerocopy completion notification with data

2018-02-26 Thread Sowmini Varadhan
On (02/26/18 09:07), Santosh Shilimkar wrote:
> Just in case you haven't seen yet, Dan Carpenter reported skb deref
> warning on previous version of the patch. Not sure why it wasn't sent
> on netdev.

yes I saw it, but that was for the previous version of the patch
around code that delivers notifications on sk_error_queue.

This patch series removes the sk_error_queue support to the
smatch warning is not applicable after this patch.

on a different note, for some odd reason I'm not seeing this patch series
on the patch queue, though its showing up in the archives.

--Sowmini


Re: [PATCH V3 net-next 2/3] rds: deliver zerocopy completion notification with data

2018-02-26 Thread Santosh Shilimkar


On 2/25/2018 3:21 PM, Sowmini Varadhan wrote:

This commit is an optimization over commit 01883eda72bd
("rds: support for zcopy completion notification") for PF_RDS sockets.

RDS applications are predominantly request-response transactions, so
it is more efficient to reduce the number of system calls and have
zerocopy completion notification delivered as ancillary data on the
POLLIN channel.

Cookies are passed up as ancillary data (at level SOL_RDS) in a
struct rds_zcopy_cookies when the returned value of recvmsg() is
greater than, or equal to, 0. A max of RDS_MAX_ZCOOKIES may be passed
with each message.

This commit removes support for zerocopy completion notification on
MSG_ERRQUEUE for PF_RDS sockets.

Signed-off-by: Sowmini Varadhan 
---
v2: remove sk_error_queue path; lot of cautionary checks rds_recvmsg_zcookie()
 and callers to make sure we dont remove cookies from the queue and then
 fail to pass it up to caller
v3:
   - bounds check on skb->cb to make sure there is enough room for
 struct rds_zcopy_cookies as well as the rds_znotifier;
   - Refactor cautionary checks in rds_recvmsg_zcookie: if no msg_control
 has been passed, or if there not enough msg_controllen for a
 a rds_zcopy_cookies, return silently (do not return error, as the
 caller may have wanted other ancillary data which may happen to fit
 in the space provided)
   - return bool form rds_recvmsg_zcookie, some other code cleanup


Just in case you haven't seen yet, Dan Carpenter reported skb deref
warning on previous version of the patch. Not sure why it wasn't sent
on netdev.

smatch warnings:
net/rds/recv.c:605 rds_recvmsg_zcookie() warn: variable dereferenced 
before check 'skb' (see line 596)


With that addressed,

Acked-by: Santosh Shilimkar 



[PATCH V3 net-next 2/3] rds: deliver zerocopy completion notification with data

2018-02-25 Thread Sowmini Varadhan
This commit is an optimization over commit 01883eda72bd
("rds: support for zcopy completion notification") for PF_RDS sockets.

RDS applications are predominantly request-response transactions, so
it is more efficient to reduce the number of system calls and have
zerocopy completion notification delivered as ancillary data on the
POLLIN channel.

Cookies are passed up as ancillary data (at level SOL_RDS) in a
struct rds_zcopy_cookies when the returned value of recvmsg() is
greater than, or equal to, 0. A max of RDS_MAX_ZCOOKIES may be passed
with each message.

This commit removes support for zerocopy completion notification on
MSG_ERRQUEUE for PF_RDS sockets.

Signed-off-by: Sowmini Varadhan 
---
v2: remove sk_error_queue path; lot of cautionary checks rds_recvmsg_zcookie()
and callers to make sure we dont remove cookies from the queue and then
fail to pass it up to caller
v3: 
  - bounds check on skb->cb to make sure there is enough room for
struct rds_zcopy_cookies as well as the rds_znotifier; 
  - Refactor cautionary checks in rds_recvmsg_zcookie: if no msg_control
has been passed, or if there not enough msg_controllen for a
a rds_zcopy_cookies, return silently (do not return error, as the
caller may have wanted other ancillary data which may happen to fit
in the space provided)
  - return bool form rds_recvmsg_zcookie, some other code cleanup

 include/uapi/linux/errqueue.h |2 --
 include/uapi/linux/rds.h  |7 +++
 net/rds/af_rds.c  |7 +--
 net/rds/message.c |   38 --
 net/rds/rds.h |2 ++
 net/rds/recv.c|   31 ++-
 6 files changed, 60 insertions(+), 27 deletions(-)

diff --git a/include/uapi/linux/errqueue.h b/include/uapi/linux/errqueue.h
index 28812ed..dc64cfa 100644
--- a/include/uapi/linux/errqueue.h
+++ b/include/uapi/linux/errqueue.h
@@ -20,13 +20,11 @@ struct sock_extended_err {
 #define SO_EE_ORIGIN_ICMP6 3
 #define SO_EE_ORIGIN_TXSTATUS  4
 #define SO_EE_ORIGIN_ZEROCOPY  5
-#define SO_EE_ORIGIN_ZCOOKIE   6
 #define SO_EE_ORIGIN_TIMESTAMPING SO_EE_ORIGIN_TXSTATUS
 
 #define SO_EE_OFFENDER(ee) ((struct sockaddr*)((ee)+1))
 
 #define SO_EE_CODE_ZEROCOPY_COPIED 1
-#defineSO_EE_ORIGIN_MAX_ZCOOKIES   8
 
 /**
  * struct scm_timestamping - timestamps exposed through cmsg
diff --git a/include/uapi/linux/rds.h b/include/uapi/linux/rds.h
index 12e3bca..a66b213 100644
--- a/include/uapi/linux/rds.h
+++ b/include/uapi/linux/rds.h
@@ -104,6 +104,7 @@
 #define RDS_CMSG_MASKED_ATOMIC_CSWP9
 #define RDS_CMSG_RXPATH_LATENCY11
 #defineRDS_CMSG_ZCOPY_COOKIE   12
+#defineRDS_CMSG_ZCOPY_COMPLETION   13
 
 #define RDS_INFO_FIRST 1
 #define RDS_INFO_COUNTERS  1
@@ -317,6 +318,12 @@ struct rds_rdma_notify {
 #define RDS_RDMA_DROPPED   3
 #define RDS_RDMA_OTHER_ERROR   4
 
+#defineRDS_MAX_ZCOOKIES8
+struct rds_zcopy_cookies {
+   __u32 num;
+   __u32 cookies[RDS_MAX_ZCOOKIES];
+};
+
 /*
  * Common set of flags for all RDMA related structs
  */
diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c
index a937f18..f712610 100644
--- a/net/rds/af_rds.c
+++ b/net/rds/af_rds.c
@@ -77,6 +77,7 @@ static int rds_release(struct socket *sock)
rds_send_drop_to(rs, NULL);
rds_rdma_drop_keys(rs);
rds_notify_queue_get(rs, NULL);
+   __skb_queue_purge(>rs_zcookie_queue);
 
spin_lock_bh(_sock_lock);
list_del_init(>rs_item);
@@ -144,7 +145,7 @@ static int rds_getname(struct socket *sock, struct sockaddr 
*uaddr,
  *  -  to signal that a previously congested destination may have become
  * uncongested
  *  -  A notification has been queued to the socket (this can be a congestion
- * update, or a RDMA completion).
+ * update, or a RDMA completion, or a MSG_ZEROCOPY completion).
  *
  * EPOLLOUT is asserted if there is room on the send queue. This does not mean
  * however, that the next sendmsg() call will succeed. If the application tries
@@ -178,7 +179,8 @@ static __poll_t rds_poll(struct file *file, struct socket 
*sock,
spin_unlock(>rs_lock);
}
if (!list_empty(>rs_recv_queue) ||
-   !list_empty(>rs_notify_queue))
+   !list_empty(>rs_notify_queue) ||
+   !skb_queue_empty(>rs_zcookie_queue))
mask |= (EPOLLIN | EPOLLRDNORM);
if (rs->rs_snd_bytes < rds_sk_sndbuf(rs))
mask |= (EPOLLOUT | EPOLLWRNORM);
@@ -513,6 +515,7 @@ static int __rds_create(struct socket *sock, struct sock 
*sk, int protocol)
INIT_LIST_HEAD(>rs_recv_queue);
INIT_LIST_HEAD(>rs_notify_queue);
INIT_LIST_HEAD(>rs_cong_list);
+   skb_queue_head_init(>rs_zcookie_queue);
spin_lock_init(>rs_rdma_lock);
rs->rs_rdma_keys = RB_ROOT;