On Tue, 2018-09-04 at 08:06 -0700, Pak Lui wrote:
> Hi all,
>
> I am having an issue with the Lustre client pinging the server using
> o2ib. I want to find out if anyone has a suggestion on what could be
> the problem. Thanks in advance.
>
> lustre client pinging to server:
> [root@n0 ~]# lctl ping 192.168.13.8@o2ib
> failed to ping 192.168.13.8@o2ib: Input/output error <<<<<<<
>
> lustre client pinging to server over IPoIB works:
> [root@n0~]# ping -c 1 192.168.13.8
> PING 192.168.13.8 (192.168.13.8) 56(84) bytes of data.
> 64 bytes from 192.168.13.8: icmp_seq=1 ttl=64 time=0.376 ms
>
> lustre client pinging to self or other client works:
> [root@n0 ~]# lctl ping 192.168.13.54@o2ib
> 12345-0@lo
> 12345-192.168.13.54@o2ib
>
> lustre client pinging to self or other client over IPoIB works:
> [root@n0~]# ping -c 1 192.168.13.54
> PING 192.168.13.54 (192.168.13.54) 56(84) bytes of data.
> 64 bytes from 192.168.13.54: icmp_seq=1 ttl=64 time=0.017 ms
>
> The lustre server and client have specified the modprobe for lnet:
> /etc/modprobe.conf
> options lnet networks=o2ib(ib0)
>
> The client reports some error when trying to ping or mount from the
> client to server:
> modprobe lustre lnet
> lctl ping 192.168.13.8@o2ib
> mount -v -t lustre 192.168.13.8@o2ib:/zfs /mnt/zfs
>
> [root@n0 ~]# dmesg|tail
> [589805.093447] Lustre: Lustre: Build Version: 2.11.54
> [589805.272652] LNet: Using FastReg for registration
> [589805.275954] LNet: Added LNI 192.168.13.54@o2ib [8/256/0/180]
> [589813.278370] LNet: 22357:0:(o2iblnd_cb.c:3320:kiblnd_check_conns()) Timed out tx for 192.168.13.186@o2ib: 589813 seconds
> [589835.518404] LustreError: 22463:0:(mgc_request.c:251:do_config_log_add()) MGC192.168.13.8@o2ib: failed processing log, type 1: rc = -5
> [589843.118385] LustreError: 22488:0:(mgc_request.c:601:do_requeue()) failed processing log: -5
> [589866.718389] LustreError: 15c-8: MGC192.168.13.8@o2ib: The configuration from log 'zfs-client' failed (-5). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
> [589866.741623] Lustre: Unmounted zfs-client
> [589867.278516] LustreError: 22463:0:(obd_mount.c:1599:lustre_fill_super()) Unable to mount (-5)
>
> server reports some error during mounting:
> [root@license ~]# Sep 4 07:26:56 license kernel: LNet: 25518:0:(o2iblnd_cb.c:2475:kiblnd_passive_connect()) Can't accept conn from 192.168.13.54@o2ib (version 12): max_frags 16 incompatible without FMR pool (256 wanted)
>
> The lustre server setup:
> [root@license ~]# lfs df -h
> UUID                  bytes    Used  Available  Use%  Mounted on
> zfs-MDT0000_UUID     863.4M    7.5M     853.9M    1%  /mnt/zfs[MDT:0]
> zfs-OST0000_UUID       1.7T   10.0G       1.7T    1%  /mnt/zfs[OST:0]
>
> filesystem_summary:    1.7T   10.0G       1.7T    1%  /mnt/zfs
>
> server: RHEL 7.5 (3.10.0-862.el7.x86_64), MLNX_OFED_LINUX-4.4-2.0.7.0, lustre 2.11.54
> client: RHEL 7.5 (4.14.0-49.el7a.aarch64), MLNX_OFED_LINUX-4.4-2.0.7.0, lustre 2.11.54

It might be helpful to state the Lustre software versions that you have used.

Also, given that this is an Arm client (presumably with a 64K page size) connecting to an x86 server (presumably with a 4K page size), have you added the map_on_demand=16 incantation to the server? I don't have direct experience of this, but I have heard it was needed in some Arm configurations (depending on server/client version):

https://jira.whamcloud.com/browse/LU-10775

Maybe James can advise?

best regards,
Richard
--
[email protected]
Server Software Eco-System
Tel: +1 512 410 9612
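
For reference, map_on_demand is a parameter of the ko2iblnd kernel module, so the setting suggested above would normally go next to the existing lnet options in the server's modprobe configuration. A minimal sketch, assuming the options are kept in a file under /etc/modprobe.d/ (the file name is an assumption, and whether 16 is the right value for this server/client combination is exactly the open question discussed in LU-10775):

```
# /etc/modprobe.d/ko2iblnd.conf  (assumed file name; any modprobe config file works)
# Server-side setting suggested in the reply above; the value is not verified here.
options ko2iblnd map_on_demand=16
```

Module options are only read when the module is loaded, so the Lustre/LNet modules on the server would need to be unloaded and loaded again before the new value is in effect.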
