Hi Thomas,

It is interesting that you have encountered this error without a router.  Good 
information.   I have updated LU-5718 with a link to this discussion.

The original fix that Liang posted to LU-5718 will fix this problem for you (it 
does not assume a router is the cause).  That fix does, however, double the 
amount of memory used per QP.  That is probably not an issue for a client, but 
it can be an issue for a router (as Cray has found).
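
If you want a rough sense of what that doubling would cost on one of your 
nodes, the LNet peers table gives a quick estimate of how many QPs are 
involved.  A minimal sketch, assuming the 2.5.x /proc interface 
(/proc/sys/lnet/peers) is available on your nodes:

  # Each line in the peers table is an LNet peer; with o2iblnd each live
  # connection to a peer uses a QP, so this gives a rough estimate of how
  # many QPs would see their buffers grow under the fix.
  grep -c '@o2ib' /proc/sys/lnet/peers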

Are you using the quota feature?  There is some evidence that it may play a 
role here.
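
On the map_on_demand question: it is set as a module option at load time, so 
the whole Lustre/LNet module stack has to be unloaded before a new value can 
take effect.  A minimal sketch of how I would set and verify it (the file name 
lustre.conf and the value 32 are just examples, adjust for your fabric):

  # /etc/modprobe.d/lustre.conf
  options ko2iblnd map_on_demand=32

  # unmount Lustre and unload all Lustre/LNet modules so the option is
  # re-read on the next load
  umount -a -t lustre
  lustre_rmmod

  # reload, bring LNet up (or simply remount the client), and re-check
  modprobe lustre
  lctl network up
  cat /sys/module/ko2iblnd/parameters/map_on_demand

If it still reads 0 after a clean reload like that, the module is rejecting or 
overriding the option at load time, and the dmesg output from the reload would 
be the first place I would look.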

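On measuring the real fragmentation degree: the console message only reports 
the point at which o2iblnd gives up, so it will never show a number above the 
256 cap, but collecting what it does report at least tells you how often you 
hit the limit and against which peers.  A rough sketch built on the message 
format from your log (the sed pattern is only an example):

  # Pull the peer NID, limit, and reported fragment count out of the
  # "RDMA too fragmented" lines; the counts show where o2iblnd stopped,
  # not how far the transfer would have gone, so they understate the
  # real fragmentation.
  dmesg | sed -n 's/.*RDMA too fragmented for \([^ ]*\) (\([0-9]*\)): \([0-9]*\)\/[0-9]* src.*/\1 \3 of \2/p' | sort | uniq -c
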
Doug

> On Sep 10, 2016, at 12:38 AM, Thomas Roth <t.r...@gsi.de> wrote:
> 
> Hi all,
> 
> we are running Lustre 2.5.3 on Infiniband. We have massive problems with 
> clients being unable to communicate with any number of OSTs, rendering the 
> entire cluster quite unusable.
> 
> Clients show
> > LNetError: 1399:0:(o2iblnd_cb.c:1140:kiblnd_init_rdma()) RDMA too 
> > fragmented for 10.20.0.242@o2ib1 (256): 231/256 src 231/256 dst frags
> > LNetError: 1399:0:(o2iblnd_cb.c:1690:kiblnd_reply()) Can't setup rdma for 
> > GET from 10.20.0.242@o2ib1: -90
> 
> which eventually results in the OSTs at that NID becoming "temporarily 
> unavailable".
> However, the OSTs are never recovered until they are manually evicted or the 
> host is rebooted.
> 
> On the OSS side, this reads
> >  LNetError: 13660:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Timed out RDMA 
> > with 10.20.0.220@o2ib1 (56): c: 7, oc: 0, rc: 7
> 
> 
> We have checked the IB fabric, which shows no errors. Since we are not able 
> to reproduce this effect in a simple way, we have also scrutinized the user 
> code, so far without results.
> 
> Whenever this happens, the connection between client and OSS looks fine under 
> all IB test commands.
> Communication between client and OSS is still going on, but apparently, when 
> Lustre tries to replay the missed transaction, it hits the fragmentation limit 
> again, so the OST never becomes available again.
> 
> If we understand correctly, the map_on_demand parameter should be increased 
> as a workaround.
> The ko2iblnd module seems to provide this parameter,
> > modinfo ko2iblnd
> > parm:           map_on_demand:map on demand (int)
> 
> but no matter what we load the module with, map_on_demand always remains at 
> the default value,
> > cat /sys/module/ko2iblnd/parameters/map_on_demand
> > 0
> 
> Is there any way to understand
> - why this memory fragmentation occurs/becomes so large?
> - how to measure the real fragmentation degree (o2iblnd simply stops at 256, 
> perhaps we are at 1000?)
> - why map_on_demand cannot be changed?
> 
> 
> Of course this all looks very much like LU-5718, but our clients are not 
> behind LNET routers.
> 
> There is one router which connects to the campus network but is not in use. 
> And there are some routers which connect to an older cluster, but of course 
> the old (1.8) clients never show any of these errors.
> 
> 
> Cheers,
> Thomas
> 
> --------------------------------------------------------------------
> Thomas Roth
> Department: HPC
> Location: SB3 1.262
> Phone: +49-6159-71 1453  Fax: +49-6159-71 2986
> 
> GSI Helmholtzzentrum für Schwerionenforschung GmbH
> Planckstraße 1
> 64291 Darmstadt
> www.gsi.de
> 
> Limited liability company (GmbH)
> Registered office: Darmstadt
> Commercial register: Amtsgericht Darmstadt, HRB 1528
> 
> Management: Professor Dr. Karlheinz Langanke
> Ursula Weyrich
> Jörg Blaurock
> 
> Chairman of the Supervisory Board: Staatssekretär Dr. Georg Schütte
> Deputy: Ministerialdirigent Dr. Rolf Bernhardt

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
