Re: [lustre-discuss] lustre 2.5.3 ost not draining
Setting it degraded means the MDS will avoid allocations on that OST unless there aren't enough OSTs to meet the request (e.g. stripe_count = -1), so it should work. That is actually a very interesting workaround for this problem, and it will work for older versions of Lustre as well. It doesn't disable the OST completely, which is fine if you are doing space balancing (and may even be desirable to allow apps that need more bandwidth for a widely striped file), but it isn't good if you are trying to empty the OST completely in order to remove it.

Another approach would be to mark the OST as having no free space, using OBD_FAIL_OST_ENOINO (0x229) fault injection on that OST:

lctl set_param fail_loc=0x229 fail_val=<OST index>

This would cause the OST to return 0 free inodes from OST_STATFS for the specified OST index, and the MDT would skip that OST completely. To disable all of the OSTs on an OSS, use fail_val=-1. It isn't possible to selectively disable a subset of the OSTs using this method.

The OBD_FAIL_OST_ENOINO fail_loc has been available since Lustre 2.2, which covers all of the 2.4+ versions affected by this issue. If this mechanism works for you (it should, as this fail_loc is used during regular testing), I'd be obliged if someone could file an LUDOC bug so the manual can be updated.

Cheers, Andreas

On 2015/07/10, 10:10 AM, "Alexander I Kulyavtsev" wrote:

>I think so, try it.
>We do set an OST degraded on 1.8 when the OST nears 95% full, and we migrate data to another OST.
>On 1.8 lfs_migrate uses 'rm' and objects are indeed deallocated.
>
>Alex
>
>On Jul 10, 2015, at 10:55 AM, Kurt Strosahl wrote:
>
>> Will that let deletes happen against it?
>> w/r,
>> Kurt
>>
>> ----- Original Message -----
>> From: "aik"
>> To: "Kurt Strosahl"
>> Cc: "aik", "Sean Brisbane", lustre-discuss@lists.lustre.org
>> Sent: Friday, July 10, 2015 11:52:00 AM
>> Subject: Re: [lustre-discuss] lustre 2.5.3 ost not draining
>>
>> Hi Kurt,
>>
>> To keep traffic away from an almost-full OST, we usually set the OST to degraded mode as described in the manual:
>>
>>> Handling Degraded OST RAID Arrays
>>>
>>> To mark the OST as degraded, use:
>>> lctl set_param obdfilter.{OST_name}.degraded=1
>>
>> Alex.
>>
>>> On Jul 10, 2015, at 10:13 AM, Kurt Strosahl wrote:
>>>
>>> No, I'm aware of why the ost is getting new writes... it is because I had to set qos_threshold_rr to 100 due to https://jira.hpdd.intel.com/browse/LU-5778 (I have an ost that has to be ignored due to terrible write performance...)
>>>
>>> w/r,
>>> Kurt
>>>
>>> ----- Original Message -----
>>> From: "Sean Brisbane"
>>> To: "Kurt Strosahl"
>>> Cc: "Patrick Farrell", "lustre-discuss@lists.lustre.org"
>>> Sent: Friday, July 10, 2015 11:04:27 AM
>>> Subject: RE: [lustre-discuss] lustre 2.5.3 ost not draining
>>>
>>> Dear Kurt,
>>>
>>> Apologies. After leaving it some number of days it did *not* clean itself up, but I feel that some number of days is long enough to verify that it is a problem. It sounds like you have another issue if the OST is not being marked as full and writes are not being re-allocated to other OSTs. I have that second issue on my system as well, and I have only workarounds to offer you for the problem.
>>>
>>> Thanks,
>>> Sean
>>>
>>> -----Original Message-----
>>> From: Kurt Strosahl [mailto:stros...@jlab.org]
>>> Sent: 10 July 2015 16:01
>>> To: Sean Brisbane
>>> Cc: Patrick Farrell; lustre-discuss@lists.lustre.org
>>> Subject: Re: [lustre-discuss] lustre 2.5.3 ost not draining
>>>
>>> The problem there is that I cannot afford to leave it "some number of days"... it is at 97% full, so new writes are going to it faster than it can clean itself off.
w/r,
Kurt

----- Original Message -----
From: "Sean Brisbane"
To: "Patrick Farrell", "Kurt Strosahl"
Cc: lustre-discuss@lists.lustre.org
Sent: Friday, July 10, 2015 10:44:39 AM
Subject: RE: [lustre-discuss] lustre 2.5.3 ost not draining

Hi,

The 'space not freed' issue also happened to me, and I left it 'some number of days' -- I don't recall how many; it was a while back.

Cheers,
Sean

Cheers, Andreas
--
Andreas Dilger
Lustre Software Architect
Intel High Performance Data Division

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
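The two workarounds discussed in this thread can be summarized as a short command sequence. This is a sketch only: the OST name and index are examples taken from elsewhere in this archive, and the commands require a live Lustre server, so treat it as an ops fragment rather than something to paste blindly.

```shell
# Workaround 1: mark the nearly-full OST as degraded on its OSS, so the
# MDS avoids it for new allocations unless striping demands it.
lctl set_param obdfilter.cmswork-OST0003.degraded=1

# Workaround 2 (Lustre >= 2.2): make the OST report 0 free inodes via
# OBD_FAIL_OST_ENOINO fault injection on the OSS. fail_val is the OST
# index to disable (example: 3), or -1 to disable all OSTs on this OSS.
lctl set_param fail_loc=0x229 fail_val=3

# To undo either workaround once the OST is drained:
lctl set_param obdfilter.cmswork-OST0003.degraded=0
lctl set_param fail_loc=0 fail_val=0
```

The fail_loc variant is the one suited to emptying an OST for removal, since the MDT then skips the OST completely instead of merely deprioritizing it.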
Re: [lustre-discuss] Problems moving an OSS from an old Lustre installation to a new one
> On Jul 28, 2015, at 1:46 AM, Massimo Sgaravatto wrote:
>
> Indeed we forgot to shut down the old MGS/MDS before "attaching" the old OSS to the new one, but I wonder why this caused problems.

If the old MDS contacted the old OSS, and the file system names were still the same between the new and old file systems, maybe the OSS still processed some config log from the old MDS, and this started causing issues.

> Our idea now would be to retry the mkfs operations (this time they would be done with the old MDS switched down).
> Do you see some better options?

Before doing mkfs, I would try running a writeconf to recreate the config logs (making sure the old MDS is powered off). That might clear out any old data and fix things.

In general, I would highly recommend not having two Lustre file systems up and running with the same file system name. It is probably safer just to assign a new name to the new file system in case there are any conflicts (you can still mount the Lustre file system at the same location).

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu
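Rick's writeconf suggestion would look roughly like the sequence below. This is a sketch under stated assumptions: the MDT device path is a placeholder, the OST device path is copied from the mkfs command later in this thread, and the exact procedure should be checked against the manual for your Lustre version before running it.

```shell
# With the OLD MDS powered off and the whole filesystem unmounted:

# 1. Regenerate the configuration log on the combined MGS/MDT
#    (/dev/mdt_device is a placeholder for the real MDT block device).
tunefs.lustre --writeconf /dev/mdt_device

# 2. Regenerate the configuration log on each OST of the filesystem.
tunefs.lustre --writeconf /dev/mapper/MD1200_1p1

# 3. Remount in order: MGS/MDT first, then the OSTs, then the clients.
```

The point of the writeconf is that stale client/MDS entries recorded in the config logs (such as the old MDS's NID) are discarded and rebuilt on the next mount.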
Re: [lustre-discuss] Problems moving an OSS from an old Lustre installation to a new one
I forgot to say that the filesystem names in the old and in the new Lustre installations are the same.

On 28/07/2015 07:17, Massimo Sgaravatto wrote:

Hi,

We are migrating from an old Lustre installation composed of 1 MDS and 2 OSSs to a new Lustre 2.5.3 installation. For this second installation we installed a new MDS + a new OSS from scratch, and we migrated the data from the old Lustre system.

Problems started when we tried to "move" an OSS from the old installation to the new one. For this OSS server we reinstalled the operating system from scratch (keeping the same IP name and number). Then we formatted the OST file systems using commands such as:

mkfs.lustre --reformat --fsname=cmswork --mgsnode=t2-mds-01.lnl.infn.it@tcp0 --ost --param ost.quota_type=ug --index=3 --mkfsoptions='-i 65536' /dev/mapper/MD1200_1p1

(t2-mds-01.lnl.infn.it is the new MDS) and then we mounted the file systems. Apparently this worked.

After a while we realized that the syslog of this "moved" OSS contained messages such as:

Jul 25 10:54:02 t2-oss-03 kernel: Lustre: cmswork-OST0003: haven't heard from client cmswork-MDT-mdtlov_UUID (at 10.60.16.8@tcp) in 232 seconds. I think it's dead, and I am evicting it. exp 8803123bf400, cur 1437814442 expire 1437814292 last 1437814210

10.60.16.8 is the IP of the old MDS! No idea why it was expecting communications from it. At any rate, on this old MDS I unmounted the MGS and MDT file systems.

After a while users started complaining that there were problems with some (not all) files written to the new OSTs, e.g.:

# ls -l /lustre/cmswork/ronchese/pat_ntu/cmssw53B_slc6/dev08tmp/src/PDAnalysis/EDM/bin/ntu.root
ls: cannot access /lustre/cmswork/ronchese/pat_ntu/cmssw53B_slc6/dev08tmp/src/PDAnalysis/EDM/bin/ntu.root: Cannot allocate memory

In the syslog of the client:

Jul 26 08:01:09 t2-ui-13 kernel: LustreError: 11-0: cmswork-OST0003-osc-880818e5: Communicating with 10.60.16.9@tcp, operation ldlm_enqueue failed with -12.
10.60.16.9 is the IP of the "moved" OSS. In its syslog:

Jul 26 08:01:09 t2-oss-03 kernel: LustreError: 8114:0:(ldlm_resource.c:1188:ldlm_resource_get()) cmswork-OST0003: lvbo_init failed for resource 0xb9:0x0: rc = -2
Jul 26 08:01:09 t2-oss-03 kernel: LustreError: 8114:0:(ldlm_resource.c:1188:ldlm_resource_get()) Skipped 1 previous similar message

Reading https://jira.hpdd.intel.com/browse/LU-4034, I guess memory is not the real problem; the problem is that the object was not found on the OST.

Some interesting messages found in the syslog of the "moved" OSS:

Jul 24 14:56:25 t2-oss-03 kernel: Lustre: cmswork-OST0003: Received MDS connection from 10.60.16.8@tcp, removing former export from 10.60.16.38@tcp
Jul 24 14:56:27 t2-oss-03 kernel: Lustre: cmswork-OST0003: already connected client cmswork-MDT-mdtlov_UUID (at 10.60.16.8@tcp) with handle 0xdb376ec08bf7d020. Rejecting client with the same UUID trying to reconnect with handle 0x6dffb49bb9b3bc70

10.60.16.8 is the IP of the old MDS; 10.60.16.38 is the IP of the new MDS.

For the time being we disabled the OSTs hosted on the "moved" OSS so that new objects are not written there.

Any idea what the problem is and how we could recover the system?

Thanks, Massimo
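Disabling the OSTs on the "moved" OSS, as Massimo describes, is typically done on the MDS so that the MDT stops allocating new objects there while existing objects stay readable. A sketch, where the OSC device name is inferred from the OST name in the logs above plus the usual "-osc-MDT0000" suffix:

```shell
# On the MDS: stop new object allocations on the suspect OST.
# (Reads of existing objects are unaffected; only new creates stop.)
lctl --device cmswork-OST0003-osc-MDT0000 deactivate

# Verify the state:
lctl dl | grep cmswork-OST0003

# Re-activate once the OSS is known to be healthy again:
lctl --device cmswork-OST0003-osc-MDT0000 activate
```

Note that deactivation done this way is not persistent across an MDS restart unless recorded with a conf_param, so it is suited to exactly this kind of temporary quarantine.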
Re: [lustre-discuss] quota only in- but not decreasing after upgrading to Lustre 2.5.3
Hi,

It might help to disable quota using tune2fs and re-enable it again at the ext2 level on all devices; see LU-3861. (BTW, you don't need the e2fsprogs mentioned in the bug; there was an official release last year in September.) You have to stop Lustre for the tune2fs run, and it takes some time, because this triggers a quota check in the background (which does not produce any output on the screen).

best regards,
Martin

On 07/28/2015 09:44 AM, Torsten Harenberg wrote:
> a further observation:
>
> a user deleted a ~100MB file:
>
> before:
>
> [root@wnfg001 lustre]# lfs quota -u sandhoff /lustre
> Disk quotas for user sandhoff (uid 11206):
> Filesystem kbytes quota limit grace files quota limit grace
> /lustre 811480188 1811480200 2811480200 - 61077 0 0 -
>
> after:
>
> [root@wnfg001 lustre]# lfs quota -u sandhoff /lustre
> Disk quotas for user sandhoff (uid 11206):
> Filesystem kbytes quota limit grace files quota limit grace
> /lustre 811480188 1811480200 2811480200 - 61076 0 0 -
> [root@wnfg001 lustre]#
>
> so #files decreased by 1, but not the #kbytes.
>
> Furthermore, the lfs quota command is pretty slow:
>
> [root@wnfg001 lustre]# time lfs quota -u sandhoff /lustre
> Disk quotas for user sandhoff (uid 11206):
> Filesystem kbytes quota limit grace files quota limit grace
> /lustre 811480188 1811480200 2811480200 - 61076 0 0 -
>
> real    0m2.441s
> user    0m0.001s
> sys     0m0.004s
> [root@wnfg001 lustre]#
>
> although the system is not overloaded.
>
> Couldn't find anything useful in dmesg:
>
> [root@lustre2 ~]# dmesg | grep quota
> VFS: Disk quotas dquot_6.5.2
> LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on. Opts:
> [root@lustre2 ~]#
>
> [root@lustre3 ~]# dmesg | grep quota
> VFS: Disk quotas dquot_6.5.2
> LDISKFS-fs (dm-6): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-7): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-9): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-13): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-12): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-10): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-6): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-7): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-9): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-13): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-12): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-10): mounted filesystem with ordered data mode. quota=on. Opts:
> [root@lustre3 ~]#
>
> [root@lustre4 ~]# dmesg | grep quota
> VFS: Disk quotas dquot_6.5.2
> LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-8): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-13): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-14): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-8): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-13): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-14): mounted filesystem with ordered data mode. quota=on. Opts:
> LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. quota=on. Opts:
> [root@lustre4 ~]#
>
> Thanks for any hint!
>
> Best regards
>
> Torsten
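Martin's procedure (from LU-3861) amounts to clearing and re-setting the quota feature on each ldiskfs target, which forces a quota re-count on the next step. A sketch, using a placeholder device path; it requires an e2fsprogs release that supports the quota feature, and Lustre must be stopped on the server first:

```shell
# With Lustre stopped on the server, for each ldiskfs target:

tune2fs -O ^quota /dev/mapper/lustre_target   # drop the quota feature
tune2fs -O quota /dev/mapper/lustre_target    # re-enable it; this rebuilds
                                              # the quota accounting, which
                                              # takes a while and prints
                                              # nothing on the screen

# Then remount the target and re-check with: lfs quota -u <user> <mount>
```

Rebuilding the on-disk accounting this way is what corrects usage counters that no longer decrease when files are deleted.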
Re: [lustre-discuss] quota only in- but not decreasing after upgrading to Lustre 2.5.3
a further observation:

a user deleted a ~100MB file:

before:

[root@wnfg001 lustre]# lfs quota -u sandhoff /lustre
Disk quotas for user sandhoff (uid 11206):
Filesystem kbytes quota limit grace files quota limit grace
/lustre 811480188 1811480200 2811480200 - 61077 0 0 -

after:

[root@wnfg001 lustre]# lfs quota -u sandhoff /lustre
Disk quotas for user sandhoff (uid 11206):
Filesystem kbytes quota limit grace files quota limit grace
/lustre 811480188 1811480200 2811480200 - 61076 0 0 -
[root@wnfg001 lustre]#

so #files decreased by 1, but not the #kbytes.

Furthermore, the lfs quota command is pretty slow:

[root@wnfg001 lustre]# time lfs quota -u sandhoff /lustre
Disk quotas for user sandhoff (uid 11206):
Filesystem kbytes quota limit grace files quota limit grace
/lustre 811480188 1811480200 2811480200 - 61076 0 0 -

real    0m2.441s
user    0m0.001s
sys     0m0.004s
[root@wnfg001 lustre]#

although the system is not overloaded.

Couldn't find anything useful in dmesg:

[root@lustre2 ~]# dmesg | grep quota
VFS: Disk quotas dquot_6.5.2
LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on. Opts:
[root@lustre2 ~]#

[root@lustre3 ~]# dmesg | grep quota
VFS: Disk quotas dquot_6.5.2
LDISKFS-fs (dm-6): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-7): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-9): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-13): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-12): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-10): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-6): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-7): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-9): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-13): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-12): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-10): mounted filesystem with ordered data mode. quota=on. Opts:
[root@lustre3 ~]#

[root@lustre4 ~]# dmesg | grep quota
VFS: Disk quotas dquot_6.5.2
LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-8): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-13): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-14): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-8): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-13): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-14): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. quota=on. Opts:
[root@lustre4 ~]#

Thanks for any hint!

Best regards

Torsten

--
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
<> Dr. Torsten Harenberg     torsten.harenb...@cern.ch          <>
<> Bergische Universitaet                                       <>
<> FB C - Physik             Tel.: +49 (0)202 439-3521          <>
<> Gaussstr. 20              Fax : +49 (0)202 439-2811          <>
<> 42097 Wuppertal           @CERN: Bat. 1-1-049                <>
<><><><><><><>< Of course it runs NetBSD http://www.netbsd.org ><>
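To watch whether deletes actually return space to a user's quota, it can help to diff successive `lfs quota` runs programmatically rather than by eye. Below is a minimal, hypothetical parser for the single-line output layout shown in this thread (real `lfs quota` output can wrap a long filesystem path onto its own line, which this sketch does not handle):

```python
def parse_lfs_quota(output):
    """Parse the data line of `lfs quota -u <user> <mount>` output.

    Assumes the one-line layout seen in this thread:
    filesystem kbytes quota limit grace files quota limit grace
    """
    for line in output.splitlines():
        if line.strip().startswith("/"):
            fields = line.split()
            return {
                "filesystem": fields[0],
                "kbytes": int(fields[1]),       # current block usage (KiB)
                "kbytes_quota": int(fields[2]), # soft block limit
                "kbytes_limit": int(fields[3]), # hard block limit
                "files": int(fields[5]),        # current inode usage
            }
    raise ValueError("no quota data line found")


# Sample taken verbatim from the "before" output above:
sample = """Disk quotas for user sandhoff (uid 11206):
Filesystem kbytes quota limit grace files quota limit grace
/lustre 811480188 1811480200 2811480200 - 61077 0 0 -"""

info = parse_lfs_quota(sample)
print(info["kbytes"], info["files"])  # -> 811480188 61077
```

Comparing the `kbytes` field before and after a deletion (as Torsten did by hand) then makes the "files decreased, kbytes did not" symptom easy to spot across many users.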