Re: [lustre-discuss] RDMA too many fragments/timed out - clients slowing entire filesystem performance

2016-11-01 Thread Brian W. Johanson

Great, thanks Doug!

Quotas are not enabled.

There are a few nodes that were exhibiting the issue fairly consistently.  We 
have recently added 70 clients (~900 total), which seems to have caused this to 
happen more frequently.


-b


On 11/01/2016 07:57 PM, Oucharek, Doug S wrote:

Hi Brian,

You need this patch: http://review.whamcloud.com/#/c/12451.  It has not landed 
to master yet and is off by default.  To activate it, add this module 
parameter line to your nodes (all of them):


options ko2iblnd wrq_sge=2
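
For example (a sketch only, assuming the patched ko2iblnd build from that change 
is installed; an unpatched module will not recognize the parameter), you could 
persist and verify it like this:

# persist the setting so it is picked up on the next module load
echo "options ko2iblnd wrq_sge=2" > /etc/modprobe.d/ko2iblnd.conf
# after LNet/ko2iblnd is reloaded (or after a reboot), confirm it took effect
cat /sys/module/ko2iblnd/parameters/wrq_sge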

The issue is that something is causing an offset to be introduced to the bulk 
transfers.  That causes a misalignment of the source and destination 
fragments.  Due to how the algorithm works, you require twice as many 
descriptors as fragments to do the RDMA operation.  So, you are running 
out of descriptors when you are only halfway done configuring the transfer. 
The above patch creates two sets of descriptors so the second set can be 
utilized in situations like this.  The fix operates on the nodes which are 
doing the bulk transfers.  Since you can both read and write bulk data, you 
need the fix on servers, clients, and LNet routers (basically, everywhere).
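
As a toy illustration only (not Lustre code; it just reuses the 32-fragment 
limit and the 27-fragment transfer reported in the client log below):

limit=32
frags=27   # fragment count from the too-many-fragments message below
echo "aligned:     $frags descriptors needed (fits under $limit)"
echo "with offset: $(( frags * 2 )) descriptors needed (exceeds $limit)"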


Question: are you using the quotas feature and could it be at or approaching a 
limit?  There has been some evidence that the quotas feature could be 
introducing the offset to bulk transfers.
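
If quotas were in use, a quick sanity check of usage against any limits 
(standard lfs commands; substitute a real user/group and mount point) would be 
something like:

lfs quota -u <username> /lustre/mountpoint    # per-user usage vs. limits
lfs quota -g <groupname> /lustre/mountpoint   # per-group usage vs. limits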


Doug

On Nov 1, 2016, at 4:08 PM, Brian W. Johanson wrote:


Centos 7.2
Lustre 2.8.0
ZFS 0.6.5.5
OPA 10.2.0.0.158


The clients and servers are on the same OPA network, no routing.  Once a 
client gets into this state, the filesystem performance drops to a fraction of 
what it is capable of.

The client must be rebooted to clear the issue.

I imagine I am missing a bug in JIRA for this issue; does this look like a 
known issue?



Pertinent debug messages from the server:

0800:0002:34.0:1478026118.277782:0:29892:0:(o2iblnd_cb.c:3109:kiblnd_check_txs_locked()) 
Timed out tx: active_txs, 4 seconds
0800:0002:34.0:1478026118.277785:0:29892:0:(o2iblnd_cb.c:3172:kiblnd_check_conns()) 
Timed out RDMA with 10.4.119.112@o2ib (3): c: 112, oc: 0, rc: 66
0800:0100:34.0:1478026118.277787:0:29892:0:(o2iblnd_cb.c:1913:kiblnd_close_conn_locked()) 
Closing conn to 10.4.119.112@o2ib: error -110(waiting)
0100:0002:34.0:1478026118.277844:0:29892:0:(events.c:447:server_bulk_callback()) 
event type 5, status -103, desc 883e8e8bcc00
0100:0002:34.0:1478026118.288714:0:29892:0:(events.c:447:server_bulk_callback()) 
event type 3, status -103, desc 883e8e8bcc00
0100:0002:34.0:1478026118.299574:0:29892:0:(events.c:447:server_bulk_callback()) 
event type 5, status -103, desc 8810e92e9c00
0100:0002:34.0:1478026118.310434:0:29892:0:(events.c:447:server_bulk_callback()) 
event type 3, status -103, desc 8810e92e9c00



And from the client:

0400:0100:8.0:1477949860.565777:0:3629:0:(lib-move.c:1489:lnet_parse_put()) 
Dropping PUT from 12345-10.4.108.81@o2ib portal 4 match 1549728742532740 
offset 0 length 192: 4
0400:0100:8.0:1477949860.565782:0:3629:0:(lib-move.c:1489:lnet_parse_put()) 
Dropping PUT from 12345-10.4.108.81@o2ib portal 4 match 1549728742532740 
offset 0 length 192: 4
0800:0002:8.0:1477949860.702666:0:3629:0:(o2iblnd_cb.c:1094:kiblnd_init_rdma()) 
RDMA has too many fragments for peer 10.4.108.81@o2ib (32), src idx/frags: 
16/27 dst idx/frags: 16/27
0800:0002:8.0:1477949860.702667:0:3629:0:(o2iblnd_cb.c:1689:kiblnd_reply()) 
Can't setup rdma for GET from 10.4.108.81@o2ib: -90
0100:0002:8.0:1477949860.702669:0:3629:0:(events.c:201:client_bulk_callback()) 
event type 1, status -5, desc 880fd5d9bc00
0800:0002:8.0:1477949860.81:0:3629:0:(o2iblnd_cb.c:1094:kiblnd_init_rdma()) 
RDMA has too many fragments for peer 10.4.108.81@o2ib (32), src idx/frags: 
16/27 dst idx/frags: 16/27
0800:0002:8.0:1477949860.816668:0:3629:0:(o2iblnd_cb.c:1689:kiblnd_reply()) 
Can't setup rdma for GET from 10.4.108.81@o2ib: -90
0100:0002:8.0:1477949860.816669:0:3629:0:(events.c:201:client_bulk_callback()) 
event type 1, status -5, desc 880fd5d9bc00
0400:0100:8.0:1477949861.573660:0:3629:0:(lib-move.c:1489:lnet_parse_put()) 
Dropping PUT from 12345-10.4.108.81@o2ib portal 4 match 1549728742532740 
offset 0 length 192: 4
0400:0100:8.0:1477949861.573664:0:3629:0:(lib-move.c:1489:lnet_parse_put()) 
Dropping PUT from 12345-10.4.108.81@o2ib portal 4 match 1549728742532740 
offset 0 length 192: 4
0400:0100:8.0:1477949861.573667:0:3629:0:(lib-move.c:1489:lnet_parse_put()) 
Dropping PUT from 12345-10.4.108.81@o2ib portal 4 match 1549728742532740 
offset 0 length 192: 4
0400:0100:8.0:1477949861.573669:0:3629:0:(lib-move.c:1489:lnet_parse_put()) 
Dropping PUT from 12345-10.4.108.81@o2ib portal 4 match 1549728742532740 
offset 0 length 192: 4
