Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the 
following link:
https://bugzilla.lustre.org/show_bug.cgi?id=13607



Client: catamount using liblustre 1.4.11

Router: kptllnd - ko2iblnd

Running b_eff_io
(http://www.hlrs.de/organization/par/services/models/mpi/b_eff_io/index_v1.1.html)
fails when attempting to write a chunk size of 1048832 bytes through the router.
The same write succeeds on the local Lustre filesystem on the XT3 (all contained
within Cray portals).
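
For reference, the failing chunk size is 1 MiB plus 256 bytes. The short sketch below only does that arithmetic. The 4096-byte page size and the page-aligned buffer start are assumptions for illustration; neither is stated in this report.

```python
# Assumption: 4096-byte pages and a buffer that starts on a page boundary.
# Neither is stated in the report; they are used only for illustration.
PAGE_SIZE = 4096
MIB = 1 << 20


def describe_chunk(size, page_size=PAGE_SIZE):
    """Return (bytes past the last 1 MiB boundary, page multiple?, pages spanned)."""
    remainder = size % MIB
    is_page_multiple = size % page_size == 0
    pages = -(-size // page_size)  # ceiling division
    return remainder, is_page_multiple, pages


for size in (1048576, 1048832):
    print(size, describe_chunk(size))
```

Under those assumptions, 1048576 bytes is a whole number of pages while 1048832 bytes is not and spills into one extra page.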

A successful run on the local filesystem:

-----+---+------+----+--------+----------+----------+----------------+-------+-------------+-------+-------+------+-----+-----+-----+-----+----------------
num. acc-| pat- |pat-|scheduled     chunk|     chunk|filename        | repeat|  transferred|  meas-|=sum of|time of meas.calls|last |last | measured
of   |ess| tern |tern|  time  |      size|      size|                | factor|   MB of this|  ured |   I/O |barr- |bcast|file-| I/O |barr.| bandwidth
PEs  |   | type |    | [sec]  |   on disk| in memory|                |       |      pattern|  time |       | ier  |     | sync|call |+bcst| of this pattern
-----+---+------+----+--------+----------+----------+----------------+-------+-------------+-------+-------+------+-----+-----+-----+-----+----------------
n=1   a=0 type=0 p= 0 Tp= 0.00 l= 1048576 L= 1048576 i00_001_0        r=    1 S=   1.049 MB t= 0.01 = 0.011+ 0.000+0.000+0.000 0.011 0.000 bw=  92.353 MB/s
n=1   a=0 type=0 p= 1 Tp= 0.40 l= 4194304 L= 4194304 i00_001_0        r=   14 S=  58.720 MB t= 0.38 = 0.379+ 0.000+0.000+0.000 0.027 0.000 bw= 154.821 MB/s
n=1   a=0 type=0 p= 2 Tp= 0.40 l= 1048576 L= 2097152 i00_001_0        r=   22 S=  46.137 MB t= 0.39 = 0.390+ 0.000+0.000+0.000 0.019 0.000 bw= 118.293 MB/s
n=1   a=0 type=0 p= 3 Tp= 0.40 l= 1048576 L= 1048576 i00_001_0        r=   32 S=  33.554 MB t= 0.39 = 0.393+ 0.000+0.000+0.000 0.011 0.000 bw=  85.332 MB/s
n=1   a=0 type=0 p= 4 Tp= 0.20 l=   32768 L= 1048576 i00_001_0        r=   17 S=  17.826 MB t= 0.19 = 0.195+ 0.000+0.000+0.000 0.011 0.000 bw=  91.425 MB/s
n=1   a=0 type=0 p= 5 Tp= 0.20 l=    1024 L= 1048576 i00_001_0        r=   17 S=  17.826 MB t= 0.19 = 0.192+ 0.000+0.000+0.000 0.011 0.000 bw=  92.684 MB/s
n=1   a=0 type=0 p= 6 Tp= 0.20 l=   32776 L= 1048832 i00_001_0        r=   13 S=  13.635 MB t= 0.20 = 0.195+ 0.000+0.000+0.000 0.015 0.000 bw=  69.834 MB/s
n=1   a=0 type=0 p= 7 Tp= 0.20 l=    1032 L= 1056768 i00_001_0        r=   12 S=  12.681 MB t= 0.19 = 0.185+ 0.000+0.000+0.000 0.015 0.000 bw=  68.509 MB/s
n=1   a=0 type=0 p= 8 Tp= 0.20 l= 1048584 L= 1048584 i00_001_0        r=   14 S=  14.680 MB t= 0.20 = 0.198+ 0.000+0.000+0.000 0.012 0.000 bw=  74.275 MB/s
  total pattern type:  S=216.108 MB  t=2.14 t_op=0.00 t_cl=0.00 wbw= 101.238 MB/s b_eff_io_write_scatter= 101.007 MB/s
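
As a sanity check on how to read the table above, the transferred volume S in each row appears to equal the repeat factor r times the in-memory chunk size L, with 1 MB taken as 10^6 bytes (n=1 here). That reading is inferred from the numbers, not taken from the b_eff_io documentation. A small sketch with the values copied from the successful run:

```python
# (pattern, L in bytes, repeat factor r, reported S in MB), copied from the table.
ROWS = [
    (0, 1048576, 1, 1.049),
    (1, 4194304, 14, 58.720),
    (2, 2097152, 22, 46.137),
    (3, 1048576, 32, 33.554),
    (4, 1048576, 17, 17.826),
    (5, 1048576, 17, 17.826),
    (6, 1048832, 13, 13.635),
    (7, 1056768, 12, 12.681),
    (8, 1048584, 14, 14.680),
]
REPORTED_TOTAL_MB = 216.108


def transferred_mb(chunk_bytes, repeats):
    """Assumed relation: S = r * L, with 1 MB = 10**6 bytes."""
    return repeats * chunk_bytes / 1e6


def matches(computed, reported, tol=0.0005):
    """True if the computed value agrees with the 3-decimal reported value."""
    return abs(computed - reported) <= tol


total_bytes = sum(r * size for _, size, r, _ in ROWS)
row_ok = [matches(transferred_mb(size, r), s) for _, size, r, s in ROWS]
total_ok = matches(total_bytes / 1e6, REPORTED_TOTAL_MB)

for (p, size, r, s), ok in zip(ROWS, row_ok):
    print(f"p={p} r*L={transferred_mb(size, r):.3f} MB reported={s:.3f} MB ok={ok}")
print(f"total={total_bytes / 1e6:.3f} MB ok={total_ok}")
```

So pattern 6 moves 13 chunks of 1048832 bytes on the local filesystem without trouble; it is only the routed path that fails with this chunk size.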


The routed filesystem fails on pattern 6:
-----+---+------+----+--------+----------+----------+---------------
num. acc-| pat- |pat-|scheduled     chunk|     chunk|filename
of   |ess| tern |tern|  time  |      size|      size|
PEs  |   | type |    | [sec]  |   on disk| in memory|
-----+---+------+----+--------+----------+----------+--------------

n=1   a=0 type=0 p= 6 Tp= 0.20 l=   32776 L= 1048832 i00_001_0



Router debug:
0000400:2000000:0:1189540683.206990:0:16915:0:(api-ni.c:1082:lnet_startup_lndnis())
Added LNI [EMAIL PROTECTED] [8/768]
0000400:2000000:0:1189540683.265861:0:16915:0:(api-ni.c:1082:lnet_startup_lndnis())
Added LNI [EMAIL PROTECTED] [512/1024]
0000800:020000:0:1189540707.150075:0:16918:0:(o2iblnd_cb.c:1207:kiblnd_init_rdma())
RDMA too fragmented: 128/256 src 128/256 dst frags
0000800:020000:0:1189540707.150086:0:16918:0:(o2iblnd_cb.c:449:kiblnd_handle_rx())
Can't setup rdma for PUT to [EMAIL PROTECTED]: -90
0000800:000400:0:1189540733.893046:0:16918:0:(ptllnd_peer.c:1142:kptllnd_tx_launch())
Refusing to create a new connection to [EMAIL PROTECTED] (non-kernel peer)
0000800:020000:0:1189540914.127475:0:16918:0:(o2iblnd_cb.c:1207:kiblnd_init_rdma())
RDMA too fragmented: 128/256 src 128/256 dst frags
0000800:020000:0:1189540914.127486:0:16918:0:(o2iblnd_cb.c:449:kiblnd_handle_rx())
Can't setup rdma for PUT to [EMAIL PROTECTED]: -90
0000800:020000:0:1189609614.644581:0:16918:0:(o2iblnd_cb.c:1207:kiblnd_init_rdma())
RDMA too fragmented: 128/256 src 128/256 dst frags
0000800:020000:0:1189609614.644594:0:16918:0:(o2iblnd_cb.c:449:kiblnd_handle_rx())
Can't setup rdma for PUT to [EMAIL PROTECTED]: -90
0000800:020000:0:1189609724.267097:0:16918:0:(o2iblnd_cb.c:1207:kiblnd_init_rdma())
RDMA too fragmented: 128/256 src 128/256 dst frags
0000800:020000:0:1189609724.267110:0:16918:0:(o2iblnd_cb.c:449:kiblnd_handle_rx())
Can't setup rdma for PUT to [EMAIL PROTECTED]: -90
0000800:020000:0:1189610024.296970:0:16918:0:(o2iblnd_cb.c:1207:kiblnd_init_rdma())
RDMA too fragmented: 128/256 src 128/256 dst frags
0000800:020000:0:1189610024.296981:0:16918:0:(o2iblnd_cb.c:449:kiblnd_handle_rx())
Can't setup rdma for PUT to [EMAIL PROTECTED]: -90
0000800:020000:0:1189610324.325446:0:16918:0:(o2iblnd_cb.c:1207:kiblnd_init_rdma())
RDMA too fragmented: 128/256 src 128/256 dst frags
0000800:020000:0:1189610324.325458:0:16918:0:(o2iblnd_cb.c:449:kiblnd_handle_rx())
Can't setup rdma for PUT to [EMAIL PROTECTED]: -90
0000800:020000:0:1189610624.352945:0:16918:0:(o2iblnd_cb.c:1207:kiblnd_init_rdma())
RDMA too fragmented: 128/256 src 128/256 dst frags
0000800:020000:0:1189610624.352957:0:16918:0:(o2iblnd_cb.c:449:kiblnd_handle_rx())
Can't setup rdma for PUT to [EMAIL PROTECTED]: -90
Debug log: 17 lines, 17 kept, 0 dropped.
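
To make the repeated errors in the log above easier to tally, here is a minimal parsing sketch. The regular expressions simply mirror the message text shown in the log; the helper names are made up for illustration, and the only errno mapping included is the Linux value 90 (EMSGSIZE, "Message too long"). No interpretation of the four fragment counters is implied.

```python
import re

# Patterns mirror the message text in the router debug log above.
FRAG_RE = re.compile(r"RDMA too fragmented: (\d+)/(\d+) src (\d+)/(\d+) dst frags")
RC_RE = re.compile(r"Can't setup rdma for PUT to .*: (-?\d+)$")

# Linux errno value only; other platforms number errors differently.
LINUX_ERRNO = {90: "EMSGSIZE"}


def parse_fragmented(message):
    """Return the four counters from a 'too fragmented' message, or None."""
    match = FRAG_RE.search(message)
    return tuple(int(g) for g in match.groups()) if match else None


def parse_return_code(message):
    """Return (rc, errno name) from a 'Can't setup rdma' message, or None."""
    match = RC_RE.search(message)
    if match is None:
        return None
    rc = int(match.group(1))
    return rc, LINUX_ERRNO.get(-rc, "unknown")


print(parse_fragmented("RDMA too fragmented: 128/256 src 128/256 dst frags"))
print(parse_return_code("Can't setup rdma for PUT to [EMAIL PROTECTED]: -90"))
```

The -90 return code in every failing PUT therefore corresponds to EMSGSIZE on Linux, which is consistent with the "too fragmented" message logged immediately before it.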
