Re: [PATCH net] rds: Reintroduce statistics counting

2017-08-08 Thread Shamir Rabinovitch
On Tue, Aug 08, 2017 at 11:13:32AM +0200, Håkon Bugge wrote:
> In commit 7e3f2952eeb1 ("rds: don't let RDS shutdown a connection
> while senders are present"), refilling the receive queue was removed
> from rds_ib_recv(), along with the increment of
> s_ib_rx_refill_from_thread.
> 
> Commit 73ce4317bf98 ("RDS: make sure we post recv buffers")
> re-introduces filling the receive queue from rds_ib_recv(), but does
> not add the statistics counter. rds_ib_recv() was later renamed to
> rds_ib_recv_path().
> 
> This commit reintroduces the statistics counting of
> s_ib_rx_refill_from_thread and s_ib_rx_refill_from_cq.
> 
> Signed-off-by: Håkon Bugge <haakon.bu...@oracle.com>
> Reviewed-by: Knut Omang <knut.om...@oracle.com>
> Reviewed-by: Wei Lin Guay <wei.lin.g...@oracle.com>
> ---
>  net/rds/ib_recv.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c
> index e10624a..9722bf8 100644
> --- a/net/rds/ib_recv.c
> +++ b/net/rds/ib_recv.c
> @@ -1015,8 +1015,10 @@ void rds_ib_recv_cqe_handler(struct rds_ib_connection 
> *ic,
>   if (rds_ib_ring_empty(>i_recv_ring))
>   rds_ib_stats_inc(s_ib_rx_ring_empty);
>  
> - if (rds_ib_ring_low(>i_recv_ring))
> + if (rds_ib_ring_low(>i_recv_ring)) {
>   rds_ib_recv_refill(conn, 0, GFP_NOWAIT);
> + rds_ib_stats_inc(s_ib_rx_refill_from_cq);
> + }
>  }
>  
>  int rds_ib_recv_path(struct rds_conn_path *cp)
> @@ -1029,6 +1031,7 @@ int rds_ib_recv_path(struct rds_conn_path *cp)
>   if (rds_conn_up(conn)) {
>   rds_ib_attempt_ack(ic);
>   rds_ib_recv_refill(conn, 0, GFP_KERNEL);
> + rds_ib_stats_inc(s_ib_rx_refill_from_thread);
>   }
>  
>   return ret;
> -- 
> 2.9.3
> 

Reviewed-by: Shamir Rabinovitch <shamir.rabinovi...@oracle.com>



[PATCH RDS v1] rds: debug messages are enabled by default

2016-10-27 Thread shamir . rabinovitch
From: shamir rabinovitch <shamir.rabinovi...@oracle.com>

rds use Kconfig option called "RDS_DEBUG" to enable rds debug messages.
This option cause the rds Makefile to add -DDEBUG to the rds gcc command
line.

When CONFIG_DYNAMIC_DEBUG is enabled, the "DEBUG" macro is used by
include/linux/dynamic_debug.h to decide if dynamic debug prints should
be sent by default to the kernel log.

rds should not enable this macro for production builds. rds dynamic
debug work as expected follow this fix.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovi...@oracle.com>
---
 net/rds/Makefile |2 +-
 net/rds/rds.h|2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/rds/Makefile b/net/rds/Makefile
index 0e72bec..56c7d27 100644
--- a/net/rds/Makefile
+++ b/net/rds/Makefile
@@ -13,5 +13,5 @@ obj-$(CONFIG_RDS_TCP) += rds_tcp.o
 rds_tcp-y :=   tcp.o tcp_connect.o tcp_listen.o tcp_recv.o \
tcp_send.o tcp_stats.o
 
-ccflags-$(CONFIG_RDS_DEBUG):=  -DDEBUG
+ccflags-$(CONFIG_RDS_DEBUG):=  -DRDS_DEBUG
 
diff --git a/net/rds/rds.h b/net/rds/rds.h
index 25532a4..4121e18 100644
--- a/net/rds/rds.h
+++ b/net/rds/rds.h
@@ -33,7 +33,7 @@
 #define KERNEL_HAS_ATOMIC64
 #endif
 
-#ifdef DEBUG
+#ifdef RDS_DEBUG
 #define rdsdebug(fmt, args...) pr_debug("%s(): " fmt, __func__ , ##args)
 #else
 /* sigh, pr_debug() causes unused variable warnings */
-- 
1.7.1



[PATCH v4 1/2] RDS: memory allocated must be align to 8

2016-04-07 Thread Shamir Rabinovitch
Fix issue in 'rds_ib_cong_recv' when accessing unaligned memory
allocated by 'rds_page_remainder_alloc' using uint64_t pointer.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovi...@oracle.com>
---
 net/rds/page.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/rds/page.c b/net/rds/page.c
index 616f21f..e2b5a58 100644
--- a/net/rds/page.c
+++ b/net/rds/page.c
@@ -135,8 +135,8 @@ int rds_page_remainder_alloc(struct scatterlist *scat, 
unsigned long bytes,
if (rem->r_offset != 0)
rds_stats_inc(s_page_remainder_hit);
 
-   rem->r_offset += bytes;
-   if (rem->r_offset == PAGE_SIZE) {
+   rem->r_offset += ALIGN(bytes, 8);
+   if (rem->r_offset >= PAGE_SIZE) {
__free_page(rem->r_page);
rem->r_page = NULL;
}
-- 
1.7.1



[PATCH v4 2/2] RDS: fix congestion map corruption for PAGE_SIZE > 4k

2016-04-07 Thread Shamir Rabinovitch
When PAGE_SIZE > 4k single page can contain 2 RDS fragments. If
'rds_ib_cong_recv' ignore the RDS fragment offset in to the page it
then read the data fragment as far congestion map update and lead to
corruption of the RDS connection far congestion map.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovi...@oracle.com>
---
 net/rds/ib_recv.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c
index 977fb86..abc8cc8 100644
--- a/net/rds/ib_recv.c
+++ b/net/rds/ib_recv.c
@@ -796,7 +796,7 @@ static void rds_ib_cong_recv(struct rds_connection *conn,
 
addr = kmap_atomic(sg_page(>f_sg));
 
-   src = addr + frag_off;
+   src = addr + frag->f_sg.offset + frag_off;
dst = (void *)map->m_page_addrs[map_page] + map_off;
for (k = 0; k < to_copy; k += 8) {
/* Record ports that became uncongested, ie
-- 
1.7.1



Re: [PATCH v2] rds: rds-stress show all zeros after few minutes

2016-04-03 Thread Shamir Rabinovitch
On Thu, Mar 31, 2016 at 04:02:46PM -0400, David Miller wrote:
> From: shamir rabinovitch <shamir.rabinovi...@oracle.com>
> Date: Thu, 31 Mar 2016 02:29:22 -0400
> 
> > Issue can be seen on platforms that use 8K and above page size
> > while rds fragment size is 4K. On those platforms single page is
> > shared between 2 or more rds fragments. Each fragment has its own
> > offset and rds congestion map code need to take this offset to account.
> > Not taking this offset to account lead to reading the data fragment
> > as congestion map fragment and hang of the rds transmit due to far
> > congestion map corruption.
> > 
> > Signed-off-by: shamir rabinovitch <shamir.rabinovi...@oracle.com>
> > 
> > Reviewed-by: Wengang Wang <wen.gang.w...@oracle.com>
> > Reviewed-by: Ajaykumar Hotchandani <ajaykumar.hotchand...@oracle.com>
> > Acked-by: Santosh Shilimkar <santosh.shilim...@oracle.com>
> > Tested-by: Anand Bibhuti <anand.bibh...@oracle.com>
> 
> This doesn't apply cleanly to my current tree, please respin.

Sorry for the trouble.

Re-sent the patch based on net-next master.
Broke the patch according to comments from Santosh Shilimkar.

BR, Shamir


[PATCH v3 1/2] RDS: fix "Kernel unaligned access" on sparc64

2016-04-03 Thread Shamir Rabinovitch
Sparc64 expect 64 bit types to be aligned to 8. This patch ensure that
memory allocated by 'rds_page_remainder_alloc' will be aligned to 8.
This patch fix issue found in 'rds_ib_cong_recv' when accessing
unaligned memory using uint64_t pointer.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovi...@oracle.com>

Reviewed-by: Wengang Wang <wen.gang.w...@oracle.com>
Reviewed-by: Ajaykumar Hotchandani <ajaykumar.hotchand...@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilim...@oracle.com>
Tested-by: Anand Bibhuti <anand.bibh...@oracle.com>
---
 net/rds/page.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/rds/page.c b/net/rds/page.c
index 616f21f..4813e1f 100644
--- a/net/rds/page.c
+++ b/net/rds/page.c
@@ -135,8 +135,9 @@ int rds_page_remainder_alloc(struct scatterlist *scat, 
unsigned long bytes,
if (rem->r_offset != 0)
rds_stats_inc(s_page_remainder_hit);
 
-   rem->r_offset += bytes;
-   if (rem->r_offset == PAGE_SIZE) {
+   /* fix 'Kernel unaligned access' on sparc64 */
+   rem->r_offset += ALIGN(bytes, 8);
+   if (rem->r_offset >= PAGE_SIZE) {
__free_page(rem->r_page);
rem->r_page = NULL;
}
-- 
1.7.1



[PATCH v3 2/2] RDS: fix congestion map corruption for PAGE_SIZE > 8k

2016-04-03 Thread Shamir Rabinovitch
On Sparc64 page size is 8k. So single page contain 2 RDS fragments. If
'rds_ib_cong_recv' ignore the RDS fragment offset in to the page it then
read the data fragment as far congestion map update and lead to
corruption of the RDS connection far congestion map.

This patch require the below patch for Sparc64 platforms:
001f8f8 RDS: fix "Kernel unaligned access" on sparc64

Signed-off-by: Shamir Rabinovitch <shamir.rabinovi...@oracle.com>

Reviewed-by: Wengang Wang <wen.gang.w...@oracle.com>
Reviewed-by: Ajaykumar Hotchandani <ajaykumar.hotchand...@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilim...@oracle.com>
Tested-by: Anand Bibhuti <anand.bibh...@oracle.com>
---
 net/rds/ib_recv.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c
index 977fb86..abc8cc8 100644
--- a/net/rds/ib_recv.c
+++ b/net/rds/ib_recv.c
@@ -796,7 +796,7 @@ static void rds_ib_cong_recv(struct rds_connection *conn,
 
addr = kmap_atomic(sg_page(>f_sg));
 
-   src = addr + frag_off;
+   src = addr + frag->f_sg.offset + frag_off;
dst = (void *)map->m_page_addrs[map_page] + map_off;
for (k = 0; k < to_copy; k += 8) {
/* Record ports that became uncongested, ie
-- 
1.7.1



[PATCH v2] rds: rds-stress show all zeros after few minutes

2016-03-31 Thread shamir rabinovitch
Issue can be seen on platforms that use 8K and above page size
while rds fragment size is 4K. On those platforms single page is
shared between 2 or more rds fragments. Each fragment has its own
offset and rds congestion map code need to take this offset to account.
Not taking this offset to account lead to reading the data fragment
as congestion map fragment and hang of the rds transmit due to far
congestion map corruption.

Signed-off-by: shamir rabinovitch <shamir.rabinovi...@oracle.com>

Reviewed-by: Wengang Wang <wen.gang.w...@oracle.com>
Reviewed-by: Ajaykumar Hotchandani <ajaykumar.hotchand...@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilim...@oracle.com>
Tested-by: Anand Bibhuti <anand.bibh...@oracle.com>
---
 net/rds/ib_recv.c |2 +-
 net/rds/iw_recv.c |2 +-
 net/rds/page.c|5 +++--
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c
index 977fb86..abc8cc8 100644
--- a/net/rds/ib_recv.c
+++ b/net/rds/ib_recv.c
@@ -796,7 +796,7 @@ static void rds_ib_cong_recv(struct rds_connection *conn,
 
addr = kmap_atomic(sg_page(>f_sg));
 
-   src = addr + frag_off;
+   src = addr + frag->f_sg.offset + frag_off;
dst = (void *)map->m_page_addrs[map_page] + map_off;
for (k = 0; k < to_copy; k += 8) {
/* Record ports that became uncongested, ie
diff --git a/net/rds/iw_recv.c b/net/rds/iw_recv.c
index a66d179..62a1738 100644
--- a/net/rds/iw_recv.c
+++ b/net/rds/iw_recv.c
@@ -585,7 +585,7 @@ static void rds_iw_cong_recv(struct rds_connection *conn,
 
addr = kmap_atomic(frag->f_page);
 
-   src = addr + frag_off;
+   src = addr +  frag->f_offset + frag_off;
dst = (void *)map->m_page_addrs[map_page] + map_off;
for (k = 0; k < to_copy; k += 8) {
/* Record ports that became uncongested, ie
diff --git a/net/rds/page.c b/net/rds/page.c
index 5a14e6d..715cbaa 100644
--- a/net/rds/page.c
+++ b/net/rds/page.c
@@ -135,8 +135,9 @@ int rds_page_remainder_alloc(struct scatterlist *scat, 
unsigned long bytes,
if (rem->r_offset != 0)
rds_stats_inc(s_page_remainder_hit);
 
-   rem->r_offset += bytes;
-   if (rem->r_offset == PAGE_SIZE) {
+   /* some hw (e.g. sparc) require aligned memory */
+   rem->r_offset += ALIGN(bytes, 8);
+   if (rem->r_offset >= PAGE_SIZE) {
__free_page(rem->r_page);
rem->r_page = NULL;
}
-- 
1.7.1



Re: [PATCH] rds: rds-stress show all zeros after few minutes

2016-03-31 Thread Shamir Rabinovitch
On Thu, Mar 31, 2016 at 05:23:30PM +0300, Sergei Shtylyov wrote:
> On 3/31/2016 5:20 PM, Shamir Rabinovitch wrote:
> 
> >>>Issue can be seen on platforms that use 8K and above page size
> >>>while rds fragment size is 4K. On those platforms single page is
> >>>shared between 2 or more rds fragments. Each fragment has it's own
> >>
> >>Its.
> >
> >Fixed in next version.
> >Thanks.
> >
> >>
> >>>offeset and rds cong map code need to take this offset to account.
> >>
> >>Offset. What is "cong", congestion?
> >
> >'Offset' is in middle of the sentence so it is OK as-is.
> 
>No, "offeset" is not OK. :-)

Oh. Typo. Sorry. Will fix! :-)

> 
> [...]
> 
> >>>Signed-off-by: shamir rabinovitch <shamir.rabinovi...@oracle.com>
> [...]
> 
> MBR, Sergei
> 


Re: [PATCH] rds: rds-stress show all zeros after few minutes

2016-03-31 Thread Shamir Rabinovitch
On Thu, Mar 31, 2016 at 04:13:35PM +0300, Sergei Shtylyov wrote:
> Hello.
> 
> On 3/31/2016 3:50 AM, shamir rabinovitch wrote:
> 
> >Issue can be seen on platforms that use 8K and above page size
> >while rds fragment size is 4K. On those platforms single page is
> >shared between 2 or more rds fragments. Each fragment has it's own
> 
>Its.

Fixed in next version. 
Thanks.

> 
> >offeset and rds cong map code need to take this offset to account.
> 
>Offset. What is "cong", congestion?

'Offset' is in middle of the sentence so it is OK as-is. 
Cong is short hand of congestion.
It will be replaces 
with the full word in next version.

> 
> >Not taking this offset to account lead to reading the data fragment
> >as congestion map fragment and hang of the rds transmit due to far
> >cong map corruption.
> >
> >Reviewed-by: Wengang Wang <wen.gang.w...@oracle.com>
> >Reviewed-by: Ajaykumar Hotchandani <ajaykumar.hotchand...@oracle.com>
> >Acked-by: Santosh Shilimkar <santosh.shilim...@oracle.com>
> >Tested-by: Anand Bibhuti <anand.bibh...@oracle.com>
> 
>These should be after your sign-off, not before.

Thanks for the comment.
Will be fixed in next version.

> 
> >Signed-off-by: shamir rabinovitch <shamir.rabinovi...@oracle.com>
> [...]
> 
> MBR, Sergei
> 

Thanks for the review.
BR, Shamir


[PATCH] rds: rds-stress show all zeros after few minutes

2016-03-31 Thread shamir rabinovitch
Issue can be seen on platforms that use 8K and above page size
while rds fragment size is 4K. On those platforms single page is
shared between 2 or more rds fragments. Each fragment has it's own
offeset and rds cong map code need to take this offset to account.
Not taking this offset to account lead to reading the data fragment
as congestion map fragment and hang of the rds transmit due to far
cong map corruption.

Reviewed-by: Wengang Wang <wen.gang.w...@oracle.com>
Reviewed-by: Ajaykumar Hotchandani <ajaykumar.hotchand...@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilim...@oracle.com>
Tested-by: Anand Bibhuti <anand.bibh...@oracle.com>

Signed-off-by: shamir rabinovitch <shamir.rabinovi...@oracle.com>
---
 net/rds/ib_recv.c |2 +-
 net/rds/iw_recv.c |2 +-
 net/rds/page.c|5 +++--
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c
index 977fb86..abc8cc8 100644
--- a/net/rds/ib_recv.c
+++ b/net/rds/ib_recv.c
@@ -796,7 +796,7 @@ static void rds_ib_cong_recv(struct rds_connection *conn,
 
addr = kmap_atomic(sg_page(>f_sg));
 
-   src = addr + frag_off;
+   src = addr + frag->f_sg.offset + frag_off;
dst = (void *)map->m_page_addrs[map_page] + map_off;
for (k = 0; k < to_copy; k += 8) {
/* Record ports that became uncongested, ie
diff --git a/net/rds/iw_recv.c b/net/rds/iw_recv.c
index a66d179..62a1738 100644
--- a/net/rds/iw_recv.c
+++ b/net/rds/iw_recv.c
@@ -585,7 +585,7 @@ static void rds_iw_cong_recv(struct rds_connection *conn,
 
addr = kmap_atomic(frag->f_page);
 
-   src = addr + frag_off;
+   src = addr +  frag->f_offset + frag_off;
dst = (void *)map->m_page_addrs[map_page] + map_off;
for (k = 0; k < to_copy; k += 8) {
/* Record ports that became uncongested, ie
diff --git a/net/rds/page.c b/net/rds/page.c
index 5a14e6d..715cbaa 100644
--- a/net/rds/page.c
+++ b/net/rds/page.c
@@ -135,8 +135,9 @@ int rds_page_remainder_alloc(struct scatterlist *scat, 
unsigned long bytes,
if (rem->r_offset != 0)
rds_stats_inc(s_page_remainder_hit);
 
-   rem->r_offset += bytes;
-   if (rem->r_offset == PAGE_SIZE) {
+   /* some hw (e.g. sparc) require aligned memory */
+   rem->r_offset += ALIGN(bytes, 8);
+   if (rem->r_offset >= PAGE_SIZE) {
__free_page(rem->r_page);
rem->r_page = NULL;
}
-- 
1.7.1