[lustre-discuss] lustre filesystem in hung state
Dear All,

Our Lustre file system goes into a hung state and we are unable to determine the exact cause. Please find the information below and help us identify a fix for this file system kernel hang issue.

Cluster details: the OSS node/server is mounted with the targets below. We are able to mount the clients with the home mounts and they work for some time; after 10-15 minutes all the clients hang and the OSS node gets rebooted. Kindly help.

/dev/mapper/mdt-mgt       19G  446M   17G   3%  /mdt-mgt
/dev/mapper/mdt-home     140G  2.8G  128G   3%  /mdt-home
/dev/mapper/mdt-scratch  140G  759M  130G   1%  /mdt-scratch
/dev/mapper/ost-home     3.7T  2.4T  1.1T  69%  /ost-home

The following Lustre packages are installed on the OSS node:
==
kernel-devel-2.6.32-431.23.3.el6_lustre.x86_64
lustre-debuginfo-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
lustre-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
kernel-firmware-2.6.32-431.23.3.el6_lustre.x86_64
lustre-iokit-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
kernel-2.6.32-431.23.3.el6_lustre.x86_64
lustre-modules-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
lustre-tests-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
kernel-debuginfo-common-x86_64-2.6.32-431.23.3.el6_lustre.x86_64
lustre-osd-ldiskfs-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
kernel-debuginfo-2.6.32-431.23.3.el6_lustre.x86_64
=

Lustre errors:
=
Feb 20 06:22:06 oss1 kernel: Lustre: 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous similar messages
Feb 20 06:29:11 oss1 kernel: LustreError: 137-5: scratch-OST0001_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
Feb 20 06:29:11 oss1 kernel: LustreError: Skipped 16 previous similar messages
Feb 20 06:29:11 oss1 kernel: LustreError: 11-0: scratch-OST0001-osc-MDT: Communicating with 0@lo, operation ost_connect failed with -19.
Feb 20 06:29:11 oss1 kernel: LustreError: Skipped 16 previous similar messages
Feb 20 06:32:42 oss1 kernel: Lustre: 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1550624551/real 0] req@880800be1000 x1625913123994836/t0(0) o8->scratch-OST0003-osc-MDT@192.168.1.5@o2ib:28/4 lens 400/544 e 0 to 1 dl 1550624562 ref 2 fl Rpc:XN/0/ rc 0/-1
Feb 20 06:32:42 oss1 kernel: Lustre: 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 15 previous similar messages
Feb 20 06:39:36 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
Feb 20 06:39:36 oss1 kernel: LustreError: Skipped 17 previous similar messages
Feb 20 06:39:36 oss1 kernel: LustreError: 11-0: scratch-OST0003-osc-MDT: Communicating with 0@lo, operation ost_connect failed with -19.
Feb 20 06:39:36 oss1 kernel: LustreError: Skipped 17 previous similar messages
Feb 20 06:43:12 oss1 kernel: Lustre: 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1550625151/real 0] req@880800dcd000 x1625913123996040/t0(0) o8->scratch-OST0001-osc-MDT@192.168.1.5@o2ib:28/4 lens 400/544 e 0 to 1 dl 1550625192 ref 2 fl Rpc:XN/0/ rc 0/-1
Feb 20 06:43:12 oss1 kernel: Lustre: 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous similar messages
Feb 20 06:50:01 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
Feb 20 06:50:01 oss1 kernel: LustreError: Skipped 15 previous similar messages
Feb 20 06:50:01 oss1 kernel: LustreError: 11-0: scratch-OST0003-osc-MDT: Communicating with 0@lo, operation ost_connect failed with -19.
Feb 20 06:50:01 oss1 kernel: LustreError: Skipped 15 previous similar messages
Feb 20 06:53:57 oss1 kernel: Lustre: 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1550625826/real 0] req@881005e88800 x1625913123997352/t0(0) o8->scratch-OST0003-osc-MDT@192.168.1.5@o2ib:28/4 lens 400/544 e 0 to 1 dl 1550625837 ref 2 fl Rpc:XN/0/ rc 0/-1
Feb 20 06:53:57 oss1 kernel: Lustre: 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 15 previous similar messages
Feb 20 07:00:51 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
Feb 20 07:00:51 oss1 kernel: LustreError: Skipped 17 previous similar messages
Feb 20 07:00:51 oss1 kernel: LustreError: 11-0: scratch-OST0003-osc-MDT: Communicating with 0@lo, operation ost_connect failed with -19.
Feb 20 07:00:51 oss1 kernel: LustreError: Skipped 17 previous similar messages
Feb 20 07:04:32 oss1 kernel: Lustre: 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request s
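[List-reader note: the "-19" in "ost_connect failed with -19" above is a negated Linux errno value. A quick stdlib check (Python used here only to decode the number; it is not part of any Lustre tooling) shows it is ENODEV, "No such device" — which fits the "no target" messages: the scratch OSTs do not appear to be mounted anywhere the MDS can reach them.]

```python
import errno
import os

# Lustre logs return codes as negative errno values;
# "ost_connect failed with -19" therefore means errno 19.
code = 19
print(errno.errorcode[code])  # symbolic name: ENODEV
print(os.strerror(code))      # human-readable: No such device
```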
Re: [lustre-discuss] Migrate MGS to ZFS
PS: it is always a good idea to make a backup of your MDT, since it is relatively small compared to the rest of the filesystem. A full-device "dd" copy doesn't take too long and is the most accurate backup for ldiskfs.

Cheers, Andreas

> On Feb 19, 2019, at 19:31, Andreas Dilger wrote:
>
> Yes, it is possible to migrate the MGS files to another device as you propose. I don't think there is any particular difference if you move it to a separate ldiskfs or ZFS target.
>
> One caveat is that we don't test combined ZFS and ldiskfs targets on the same node, though in theory it would work.
>
> Migrating the MDT from ldiskfs to ZFS is also possible with newer versions of Lustre (2.12 for sure, I don't recall if it is in 2.10 or not). You need to follow a special process to do this, please see the Lustre Operations Manual for details.
>
> Cheers, Andreas
>
>> On Feb 19, 2019, at 17:48, Fernando Pérez wrote:
>>
>> Dear lustre experts.
>>
>> What is the best way to migrate a MGS device to ZFS? Copy the CONFIGS/filesystem_name-* files from the old ldiskfs device to the new ZFS MGS device?
>>
>> Currently we have a combined MDT/MGT under ldiskfs with lustre 2.10.4.
>>
>> We want to upgrade to lustre 2.12.0 and then separate the combined MDT/MGT and migrate MDT and MGT to separate ZFS devices.
>>
>> Regards.
>> =
>> Fernando Pérez
>> Institut de Ciències del Mar (CMIMA-CSIC)
>> Departament Oceanografía Física i Tecnològica
>> Passeig Marítim de la Barceloneta, 37-49
>> 08003 Barcelona
>> Phone: (+34) 93 230 96 35
>> =
>>
>> ___
>> lustre-discuss mailing list
>> lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
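[List-reader note: the full-device "dd" backup Andreas describes can be sketched roughly as below. The MDT must be unmounted while it is copied; the real device would be something like /dev/mapper/mdt-home from the other thread. A scratch temp file stands in for the block device here so the copy-and-verify pattern is demonstrable without real hardware.]

```shell
# Sketch of a raw full-device MDT backup, verified by checksum.
# A temp file stands in for the (unmounted) MDT block device.
src=$(mktemp)   # stand-in for e.g. /dev/mapper/mdt-home
dst=$(mktemp)   # stand-in for the backup image file
dd if=/dev/urandom of="$src" bs=1M count=4 status=none  # fake "device" contents
dd if="$src" of="$dst" bs=1M conv=sync,noerror status=none  # the actual backup copy
sha256sum "$src" "$dst"  # both checksums should match if the copy is faithful
rm -f "$src" "$dst"
```

On a real system you would point if= at the device and of= at an image file on separate storage; conv=noerror keeps the copy going past read errors, which matters more on aging disks.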
Re: [lustre-discuss] Migrate MGS to ZFS
Yes, it is possible to migrate the MGS files to another device as you propose. I don't think there is any particular difference if you move it to a separate ldiskfs or ZFS target.

One caveat is that we don't test combined ZFS and ldiskfs targets on the same node, though in theory it would work.

Migrating the MDT from ldiskfs to ZFS is also possible with newer versions of Lustre (2.12 for sure, I don't recall if it is in 2.10 or not). You need to follow a special process to do this, please see the Lustre Operations Manual for details.

Cheers, Andreas

> On Feb 19, 2019, at 17:48, Fernando Pérez wrote:
>
> Dear lustre experts.
>
> What is the best way to migrate a MGS device to ZFS? Copy the CONFIGS/filesystem_name-* files from the old ldiskfs device to the new ZFS MGS device?
>
> Currently we have a combined MDT/MGT under ldiskfs with lustre 2.10.4.
>
> We want to upgrade to lustre 2.12.0 and then separate the combined MDT/MGT and migrate MDT and MGT to separate ZFS devices.
>
> Regards.
> =
> Fernando Pérez
> Institut de Ciències del Mar (CMIMA-CSIC)
> Departament Oceanografía Física i Tecnològica
> Passeig Marítim de la Barceloneta, 37-49
> 08003 Barcelona
> Phone: (+34) 93 230 96 35
> =
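[List-reader note: a rough sketch of the file-copy step Fernando proposes and Andreas confirms is possible. This is a hypothetical procedure outline, not a tested recipe: every device path, pool name, mount point, and the filesystem name are placeholders, the new MGT is assumed to already be formatted as a ZFS Lustre target, and both targets must be stopped (not serving Lustre) while the configuration logs are copied. Consult the Lustre Operations Manual before doing this on a live system.]

```shell
# Hypothetical sketch: copy the MGS configuration logs from an old
# ldiskfs MGT to a new ZFS-backed MGT. All names below are placeholders.
FSNAME=mylustre                              # placeholder filesystem name
mkdir -p /mnt/mgt_old /mnt/mgt_new
mount -t ldiskfs /dev/old_mgt /mnt/mgt_old   # old combined MDT/MGT device
mount -t zfs mgspool/mgt /mnt/mgt_new        # new ZFS MGT dataset
cp -a /mnt/mgt_old/CONFIGS/"${FSNAME}"-* /mnt/mgt_new/CONFIGS/
umount /mnt/mgt_old
umount /mnt/mgt_new
```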
[lustre-discuss] Migrate MGS to ZFS
Dear lustre experts.

What is the best way to migrate a MGS device to ZFS? Copy the CONFIGS/filesystem_name-* files from the old ldiskfs device to the new ZFS MGS device?

Currently we have a combined MDT/MGT under ldiskfs with lustre 2.10.4.

We want to upgrade to lustre 2.12.0 and then separate the combined MDT/MGT and migrate MDT and MGT to separate ZFS devices.

Regards.
=
Fernando Pérez
Institut de Ciències del Mar (CMIMA-CSIC)
Departament Oceanografía Física i Tecnològica
Passeig Marítim de la Barceloneta, 37-49
08003 Barcelona
Phone: (+34) 93 230 96 35
=