Re: [lustre-discuss] trouble mounting after a tunefs
I believe this message is benign; it is shown when the MDS first starts. IIRC, it has something to do with the OSTs not being online yet. I get a similar warning on any system I run, for example:

May 31 20:53:56 ie2-mds1.lfs.intl kernel: LustreError: 11-0: demo-MDT-lwp-MDT: Communicating with 0@lo, operation mds_connect failed with -11.

This is from one of our lab systems.

If the MDT shows up as mounted, there may not be a case to answer, although you will still need to verify that your connectivity works as expected :). Check that the storage target is mounted, that the service is started (kernel threads are running), that the content of /proc/fs/lustre/health_check says healthy, etc. Running lctl dl on the MDS should list the services that are up, including the MDT, and lfs check servers on the client should report all targets as active.

Malcolm Cowe
Intel High Performance Data Division

-----Original Message-----
From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf Of John White
Sent: Saturday, June 13, 2015 1:07 AM
To: lustre-discuss@lists.lustre.org
Subject: [lustre-discuss] trouble mounting after a tunefs

Good Morning Folks,

We recently had to add TCP NIDs to an existing o2ib FS. We added the NID to the modprobe.d stuff and tossed the definition of the NID in the failnode and mgsnode params on all OSTs and the MGS + MDT.

When either an o2ib or tcp client tries to mount, the mount command hangs and dmesg repeats:

LustreError: 11-0: brc-MDT-mdc-881036879c00: Communicating with 10.4.250.10@o2ib, operation mds_connect failed with -11.

I fear we may have over-done the parameters, could anyone take a look here and let me know if we need to fix things up (remove params, etc)?

MGS:
Read previous values:
Target: MGS
Index: unassigned
Lustre FS:
Mount type: ldiskfs
Flags: 0x4 (MGS )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:

MDT:
Read previous values:
Target: brc-MDT
Index: 0
Lustre FS: brc
Mount type: ldiskfs
Flags: 0x1001 (MDT no_primnode )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters: mgsnode=10.4.250.11@o2ib,10.0.250.11@tcp:10.4.250.10@o2ib,10.0.250.10@tcp failover.node=10.4.250.10@o2ib,10.0.250.10@tcp:10.4.250.11@o2ib,10.0.250.11@tcp mdt.quota_type=ug

OST(sample):
Read previous values:
Target: brc-OST0002
Index: 2
Lustre FS: brc
Mount type: ldiskfs
Flags: 0x1002 (OST no_primnode )
Persistent mount opts: errors=remount-ro
Parameters: mgsnode=10.4.250.10@o2ib,10.0.250.10@tcp:10.4.250.11@o2ib,10.0.250.11@tcp failover.node=10.4.250.12@o2ib,10.0.250.12@tcp:10.4.250.13@o2ib,10.0.250.13@tcp ost.quota_type=ug

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
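The health_check test mentioned in the reply above can be scripted. The following is only a minimal Python sketch, assuming (as the reply says) that the file contains the word healthy when all is well. The helper names is_healthy and read_health are hypothetical and not part of any Lustre tooling.

```python
from pathlib import Path

# Path quoted in the reply above; it only exists on a node with Lustre loaded.
HEALTH_FILE = Path("/proc/fs/lustre/health_check")


def is_healthy(text: str) -> bool:
    """Return True only when the content is exactly 'healthy'.

    An exact comparison is used so that any other status text,
    including one that merely contains the word, counts as unhealthy.
    """
    return text.strip() == "healthy"


def read_health(path: Path = HEALTH_FILE) -> bool:
    """Read the health file; a missing or unreadable file counts as not healthy."""
    try:
        return is_healthy(path.read_text())
    except OSError:
        return False


if __name__ == "__main__":
    print("healthy" if read_health() else "not healthy (or Lustre not loaded)")
```

This covers only one of the checks listed; lctl dl and lfs check servers still need to be run by hand on the MDS and the client respectively.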
[lustre-discuss] trouble mounting after a tunefs
Good Morning Folks,

We recently had to add TCP NIDs to an existing o2ib FS. We added the NID to the modprobe.d stuff and tossed the definition of the NID in the failnode and mgsnode params on all OSTs and the MGS + MDT.

When either an o2ib or tcp client tries to mount, the mount command hangs and dmesg repeats:

LustreError: 11-0: brc-MDT-mdc-881036879c00: Communicating with 10.4.250.10@o2ib, operation mds_connect failed with -11.

I fear we may have over-done the parameters, could anyone take a look here and let me know if we need to fix things up (remove params, etc)?

MGS:
Read previous values:
Target: MGS
Index: unassigned
Lustre FS:
Mount type: ldiskfs
Flags: 0x4 (MGS )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:

MDT:
Read previous values:
Target: brc-MDT
Index: 0
Lustre FS: brc
Mount type: ldiskfs
Flags: 0x1001 (MDT no_primnode )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters: mgsnode=10.4.250.11@o2ib,10.0.250.11@tcp:10.4.250.10@o2ib,10.0.250.10@tcp failover.node=10.4.250.10@o2ib,10.0.250.10@tcp:10.4.250.11@o2ib,10.0.250.11@tcp mdt.quota_type=ug

OST(sample):
Read previous values:
Target: brc-OST0002
Index: 2
Lustre FS: brc
Mount type: ldiskfs
Flags: 0x1002 (OST no_primnode )
Persistent mount opts: errors=remount-ro
Parameters: mgsnode=10.4.250.10@o2ib,10.0.250.10@tcp:10.4.250.11@o2ib,10.0.250.11@tcp failover.node=10.4.250.12@o2ib,10.0.250.12@tcp:10.4.250.13@o2ib,10.0.250.13@tcp ost.quota_type=ug
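For anyone triaging the same dmesg line, here is a small Python sketch that pulls the fields out of it. Lustre reports return codes as negative errno values, and on Linux errno 11 is EAGAIN. The regular expression and the function name parse_connect_error are assumptions made for illustration, written against the one line quoted above rather than against any documented log format.

```python
import errno
import re

# Pattern written against the single dmesg line quoted in this thread.
LINE_RE = re.compile(
    r"LustreError: (?P<code>\S+): (?P<device>\S+): Communicating with "
    r"(?P<nid>\S+), operation (?P<op>\w+) failed with (?P<rc>-?\d+)"
)


def parse_connect_error(line):
    """Return a dict of fields from a 'Communicating with ...' line, or None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["rc"] = int(fields["rc"])
    # Map the negative return code to the platform's errno name, if known.
    fields["errno_name"] = errno.errorcode.get(abs(fields["rc"]), "unknown")
    return fields


if __name__ == "__main__":
    sample = (
        "LustreError: 11-0: brc-MDT-mdc-881036879c00: Communicating with "
        "10.4.250.10@o2ib, operation mds_connect failed with -11."
    )
    print(parse_connect_error(sample))
```

The errno name is looked up on the machine running the script, so it is only meaningful when that machine uses the same errno numbering as the Lustre node.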
Re: [lustre-discuss] trouble mounting after a tunefs
Hi John,

On the Parameters line, the different nodes should not be separated by ":". Each node should be specified by a separate mgsnode=... or failover.node=... statement. I'm not sure whether separating the two interfaces of each node by "," is correct here, or whether this should be split again into two separate statements.

Best regards,
Martin
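Martin's suggestion can be illustrated on the posted Parameters line. The Python sketch below only rewrites text: it splits every colon-separated mgsnode= or failover.node= value into one statement per node and leaves the comma-separated interfaces alone, since the reply leaves that question open. The function name split_node_params is hypothetical, and nothing here calls tunefs.lustre or confirms what syntax a given Lustre version accepts.

```python
# Keys that, per the reply above, should carry one node per statement.
NODE_KEYS = ("mgsnode", "failover.node")


def split_node_params(parameters: str) -> list[str]:
    """Split 'key=nodeA:nodeB' into 'key=nodeA' and 'key=nodeB' for node keys.

    Other parameters, and the comma-separated NIDs within one node,
    are passed through unchanged.
    """
    result = []
    for token in parameters.split():
        key, sep, value = token.partition("=")
        if sep and key in NODE_KEYS and ":" in value:
            result.extend(f"{key}={node}" for node in value.split(":"))
        else:
            result.append(token)
    return result


if __name__ == "__main__":
    mdt_params = (
        "mgsnode=10.4.250.11@o2ib,10.0.250.11@tcp:10.4.250.10@o2ib,10.0.250.10@tcp "
        "failover.node=10.4.250.10@o2ib,10.0.250.10@tcp:10.4.250.11@o2ib,10.0.250.11@tcp "
        "mdt.quota_type=ug"
    )
    for statement in split_node_params(mdt_params):
        print(statement)
```

Run against the MDT line from the original post, this yields two mgsnode= statements, two failover.node= statements, and the unchanged mdt.quota_type=ug.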