[lustre-discuss] lustre filesystem in hung state

2019-02-19 Thread Anilkumar Naik
Dear All,

The Lustre file system goes into a hung state, and we are unable to determine
the exact cause. Kindly find the information below and help us identify a fix
for the file system kernel hang.

Cluster Details:

The OSS node/server has the targets below mounted. We are able to mount the
home filesystem on the clients and it works for some time, but after 10-15
minutes all clients hang and the OSS node reboots. Kindly help.

/dev/mapper/mdt-mgt        19G  446M   17G   3% /mdt-mgt
/dev/mapper/mdt-home      140G  2.8G  128G   3% /mdt-home
/dev/mapper/mdt-scratch   140G  759M  130G   1% /mdt-scratch
/dev/mapper/ost-home      3.7T  2.4T  1.1T  69% /ost-home

The following Lustre packages are installed on the OSS node.
==
kernel-devel-2.6.32-431.23.3.el6_lustre.x86_64
lustre-debuginfo-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
lustre-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
kernel-firmware-2.6.32-431.23.3.el6_lustre.x86_64
lustre-iokit-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
kernel-2.6.32-431.23.3.el6_lustre.x86_64
lustre-modules-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
lustre-tests-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
kernel-debuginfo-common-x86_64-2.6.32-431.23.3.el6_lustre.x86_64
lustre-osd-ldiskfs-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
kernel-debuginfo-2.6.32-431.23.3.el6_lustre.x86_64
==

Lustre errors:
=
Feb 20 06:22:06 oss1 kernel: Lustre:
6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous
similar messages
Feb 20 06:29:11 oss1 kernel: LustreError: 137-5: scratch-OST0001_UUID: not
available for connect from 0@lo (no target). If you are running an HA pair
check that the target is mounted on the other server.
Feb 20 06:29:11 oss1 kernel: LustreError: Skipped 16 previous similar
messages
Feb 20 06:29:11 oss1 kernel: LustreError: 11-0:
scratch-OST0001-osc-MDT: Communicating with 0@lo, operation ost_connect
failed with -19.
Feb 20 06:29:11 oss1 kernel: LustreError: Skipped 16 previous similar
messages
Feb 20 06:32:42 oss1 kernel: Lustre:
6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
timed out for sent delay: [sent 1550624551/real 0]  req@880800be1000
x1625913123994836/t0(0) o8->scratch-OST0003-osc-MDT@192.168.1.5@o2ib:28/4
lens 400/544 e 0 to 1 dl 1550624562 ref 2 fl Rpc:XN/0/ rc 0/-1
Feb 20 06:32:42 oss1 kernel: Lustre:
6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 15 previous
similar messages
Feb 20 06:39:36 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID: not
available for connect from 0@lo (no target). If you are running an HA pair
check that the target is mounted on the other server.
Feb 20 06:39:36 oss1 kernel: LustreError: Skipped 17 previous similar
messages
Feb 20 06:39:36 oss1 kernel: LustreError: 11-0:
scratch-OST0003-osc-MDT: Communicating with 0@lo, operation ost_connect
failed with -19.
Feb 20 06:39:36 oss1 kernel: LustreError: Skipped 17 previous similar
messages
Feb 20 06:43:12 oss1 kernel: Lustre:
6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
timed out for sent delay: [sent 1550625151/real 0]  req@880800dcd000
x1625913123996040/t0(0) o8->scratch-OST0001-osc-MDT@192.168.1.5@o2ib:28/4
lens 400/544 e 0 to 1 dl 1550625192 ref 2 fl Rpc:XN/0/ rc 0/-1
Feb 20 06:43:12 oss1 kernel: Lustre:
6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous
similar messages
Feb 20 06:50:01 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID: not
available for connect from 0@lo (no target). If you are running an HA pair
check that the target is mounted on the other server.
Feb 20 06:50:01 oss1 kernel: LustreError: Skipped 15 previous similar
messages
Feb 20 06:50:01 oss1 kernel: LustreError: 11-0:
scratch-OST0003-osc-MDT: Communicating with 0@lo, operation ost_connect
failed with -19.
Feb 20 06:50:01 oss1 kernel: LustreError: Skipped 15 previous similar
messages
Feb 20 06:53:57 oss1 kernel: Lustre:
6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
timed out for sent delay: [sent 1550625826/real 0]  req@881005e88800
x1625913123997352/t0(0) o8->scratch-OST0003-osc-MDT@192.168.1.5@o2ib:28/4
lens 400/544 e 0 to 1 dl 1550625837 ref 2 fl Rpc:XN/0/ rc 0/-1
Feb 20 06:53:57 oss1 kernel: Lustre:
6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 15 previous
similar messages
Feb 20 07:00:51 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID: not
available for connect from 0@lo (no target). If you are running an HA pair
check that the target is mounted on the other server.
Feb 20 07:00:51 oss1 kernel: LustreError: Skipped 17 previous similar
messages
Feb 20 07:00:51 oss1 kernel: LustreError: 11-0:
scratch-OST0003-osc-MDT: Communicating with 0@lo, operation ost_connect
failed with -19.
Feb 20 07:00:51 oss1 kernel: LustreError: Skipped 17 previous similar
messages
Feb 20 07:04:32 oss1 kernel: Lustre:
6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request 
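For context on the errors above: return code -19 is ENODEV, which matches the
"no target" text — the MDS cannot find scratch-OST0001/scratch-OST0003 mounted
anywhere. A first-pass check on the OSS might look like the sketch below; this
assumes a working Lustre install, and exact output is site-specific:

```shell
# Hedged diagnostic sketch for "not available for connect ... (no target)".
mount -t lustre        # are the scratch OSTs actually mounted on this node?
lctl dl                # list configured Lustre devices and their states
dmesg | grep -i -e lustre -e lnet | tail -n 50   # recent kernel messages
```

If the scratch OSTs are missing from `mount -t lustre` output, the next step
is usually to check why their backing devices failed to mount.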

Re: [lustre-discuss] Migrate MGS to ZFS

2019-02-19 Thread Andreas Dilger
PS: it is always a good idea to make a backup of your MDT, since it is 
relatively small compared to the rest of the filesystem. A full-device "dd" 
copy doesn't take too long and is the most accurate backup for ldiskfs. 
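That full-device copy can be sketched as below. The device and backup paths
are placeholders for your site, and the MDT must be unmounted first; the demo
lines use a scratch file in place of the real block device so the copy can be
verified end to end:

```shell
# Hedged sketch of a full-device MDT backup with dd. Paths are
# placeholders; unmount the MDT before copying:
#   umount /mnt/mdt
#   dd if=/dev/mdt_dev of=/backup/mdt.img bs=1M
# Demonstrated with a scratch file standing in for the block device:
dd if=/dev/zero of=mdt_dev.img bs=1M count=4 status=none   # stand-in "device"
dd if=mdt_dev.img of=mdt_backup.img bs=1M status=none      # full-device copy
cmp -s mdt_dev.img mdt_backup.img && echo "backup verified"
```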

Cheers, Andreas

> On Feb 19, 2019, at 19:31, Andreas Dilger  wrote:
> 
> Yes, it is possible to migrate the MGS files to another device as you 
> propose. I don't think there is any particular difference if you move it to a 
> separate ldiskfs or ZFS target. 
> 
> One caveat is that we don't test combined ZFS and ldiskfs targets on the same 
> node, though in theory it would work. 
> 
> Migrating the MDT from ldiskfs to ZFS is also possible with newer versions of 
> Lustre (2.12 for sure, I don't recall if it is in 2.10 or not).  You need to 
> follow a special process to do this, please see the Lustre Operations Manual 
> for details. 
> 
> Cheers, Andreas
> 
>> On Feb 19, 2019, at 17:48, Fernando Pérez  wrote:
>> 
>> Dear lustre experts.
>> 
>> What is the best way to migrate an MGS device to ZFS? Copy the 
>> CONFIGS/filesystem_name-* files from the old ldiskfs device to the new ZFS 
>> MGS device?
>> 
>> Currently we have a combined MDT/MGT under ldiskfs with lustre 2.10.4. 
>> 
>> We want to upgrade to lustre 2.12.0 and then separate the combined MDT/MGT 
>> and migrate MDT and MGT to separate ZFS devices. 
>> 
>> Regards.
>> =
>> Fernando Pérez
>> Institut de Ciències del Mar (CMIMA-CSIC)
>> Departament Oceanografía Física i Tecnològica
>> Passeig Marítim de la Barceloneta,37-49
>> 08003 Barcelona
>> Phone:  (+34) 93 230 96 35
>> =
>> 
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Migrate MGS to ZFS

2019-02-19 Thread Andreas Dilger
Yes, it is possible to migrate the MGS files to another device as you propose. 
I don't think there is any particular difference if you move it to a separate 
ldiskfs or ZFS target. 

One caveat is that we don't test combined ZFS and ldiskfs targets on the same 
node, though in theory it would work. 

Migrating the MDT from ldiskfs to ZFS is also possible with newer versions of 
Lustre (2.12 for sure, I don't recall if it is in 2.10 or not).  You need to 
follow a special process to do this, please see the Lustre Operations Manual 
for details. 
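One possible shape of that migration, hedged heavily: rather than hand-copying
the CONFIGS files, a common route is to format a fresh ZFS MGS and regenerate
the configuration logs with a writeconf. The pool and device names below are
placeholders, and the procedure in the Lustre Operations Manual takes
precedence over this sketch:

```shell
# Hedged sketch only -- follow the Lustre Operations Manual for the
# authoritative procedure. Pool/device names are placeholders; all
# targets must be stopped before the writeconf.
mkfs.lustre --mgs --backfstype=zfs mgspool/mgt /dev/sdX
tunefs.lustre --writeconf /dev/mdt_device     # repeat on every MDT/OST
mount -t lustre mgspool/mgt /mnt/mgt          # start the new MGS first
```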

Cheers, Andreas

> On Feb 19, 2019, at 17:48, Fernando Pérez  wrote:
> 
> Dear lustre experts.
> 
> What is the best way to migrate an MGS device to ZFS? Copy the 
> CONFIGS/filesystem_name-* files from the old ldiskfs device to the new ZFS 
> MGS device?
> 
> Currently we have a combined MDT/MGT under ldiskfs with lustre 2.10.4. 
> 
> We want to upgrade to lustre 2.12.0 and then separate the combined MDT/MGT 
> and migrate MDT and MGT to separate ZFS devices. 
> 
> Regards.
> =
> Fernando Pérez
> Institut de Ciències del Mar (CMIMA-CSIC)
> Departament Oceanografía Física i Tecnològica
> Passeig Marítim de la Barceloneta,37-49
> 08003 Barcelona
> Phone:  (+34) 93 230 96 35
> =
> 


[lustre-discuss] Migrate MGS to ZFS

2019-02-19 Thread Fernando Pérez
Dear lustre experts.

What is the best way to migrate an MGS device to ZFS? Copy the 
CONFIGS/filesystem_name-* files from the old ldiskfs device to the new ZFS MGS 
device?

Currently we have a combined MDT/MGT under ldiskfs with lustre 2.10.4. 

We want to upgrade to lustre 2.12.0 and then separate the combined MDT/MGT and 
migrate MDT and MGT to separate ZFS devices. 

Regards.
=
Fernando Pérez
Institut de Ciències del Mar (CMIMA-CSIC)
Departament Oceanografía Física i Tecnològica
Passeig Marítim de la Barceloneta,37-49
08003 Barcelona
Phone:  (+34) 93 230 96 35
=
