Re: [lustre-discuss] lustre 2.5.3 ost not draining

2015-07-28 Thread Dilger, Andreas
Setting it degraded means the MDS will avoid allocations on that OST
unless there aren't enough OSTs to meet the request (e.g. stripe_count =
-1), so it should work.

That is actually a very interesting workaround for this problem, and it
will work for older versions of Lustre as well.  It doesn't disable the
OST completely, which is fine if you are doing space balancing (and may
even be desirable to allow apps that need more bandwidth for a widely
striped file), but it isn't good if you are trying to empty the OST
completely to remove it.

It looks like another approach would be to mark the OST as having no free
space using OBD_FAIL_OST_ENOINO (0x229) fault injection on that OST:

   lctl set_param fail_loc=0x229 fail_val=<ost_index>

This would cause the OST to return 0 free inodes from OST_STATFS for the
specified OST index, and the MDT would skip this OST completely.  To
disable all of the OSTs on an OSS, use fail_val = -1.  It isn't possible
to selectively disable a subset of OSTs using this method.  The
OBD_FAIL_OST_ENOINO fail_loc has been available since Lustre 2.2, which
covers all of the 2.4+ versions that are affected by this issue.
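Putting the two workarounds together, here is a hedged sketch of the commands an administrator might run on the OSS; the fsname (testfs) and OST index (3) are placeholders, not values from this thread:

```shell
# On the OSS hosting the full OST: make the OST report 0 free inodes,
# so the MDT skips it entirely when allocating new objects (Lustre >= 2.2).
# fail_val selects the OST index; fail_val=-1 disables all OSTs on this OSS.
lctl set_param fail_loc=0x229 fail_val=3     # hypothetical OST index 3

# Alternative: mark the OST degraded, so the MDS merely avoids it
# (it is still used when e.g. stripe_count=-1 requires every OST):
lctl set_param obdfilter.testfs-OST0003.degraded=1

# Once the OST has been emptied, undo both settings:
lctl set_param fail_loc=0
lctl set_param obdfilter.testfs-OST0003.degraded=0
```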

If this mechanism works for you (it should, as this fail_loc is used
during regular testing) I'd be obliged if someone could file an LUDOC bug
so the manual can be updated.

Cheers, Andreas

On 2015/07/10, 10:10 AM, "Alexander I Kulyavtsev"
 wrote:

>I think so, try it.
>We do set the OST degraded on 1.8 when an OST nears 95%, and we migrate
>data to another OST.
>On 1.8 lfs_migrate uses 'rm' and objects are indeed deallocated.
>
>Alex
>
>On Jul 10, 2015, at 10:55 AM, Kurt Strosahl  wrote:
>
>> Will that let deletes happen against it?
>>w/r,
>> Kurt
>>- Original Message -
>> From: "aik" 
>> To: "Kurt Strosahl" 
>> Cc: "aik" , "Sean Brisbane"
>>, lustre-discuss@lists.lustre.org
>> Sent: Friday, July 10, 2015 11:52:00 AM
>> Subject: Re: [lustre-discuss] lustre 2.5.3 ost not draining
>>Hi Kurt, to keep traffic away from an almost-full OST we usually set the
>>OST in degraded mode as described in the manual:
>>> 
>>> Handling Degraded OST RAID Arrays
>>> 
>>> To mark the OST as degraded, use:
>>> lctl set_param obdfilter.{OST_name}.degraded=1
>>> 
>>> Alex.
>>>On Jul 10, 2015, at 10:13 AM, Kurt Strosahl  wrote:
>>> 
 No, I'm aware of why the ost is getting new writes... it is because I
had to set the qos_threshold_rr to 100 due to
https://jira.hpdd.intel.com/browse/LU-5778  (I have an ost that has to
be ignored due to terrible write performance...)
 
 w/r,
 Kurt
 
 - Original Message -
 From: "Sean Brisbane" 
 To: "Kurt Strosahl" 
 Cc: "Patrick Farrell" ,
"lustre-discuss@lists.lustre.org" 
 Sent: Friday, July 10, 2015 11:04:27 AM
 Subject: RE: [lustre-discuss] lustre 2.5.3 ost not draining
 
 Dear Kurt,
 
 Apologies.  After leaving it some number of days it did *not* clean
itself up, but I feel that some number of days is long enough to
verify 
that it is a problem.
 
 Sounds like you have another issue if the OST is not being marked as
full and writes are not being re-allocated to other OSTs.  I have that
second issue on my system as well, and I have only workarounds to
offer you for the problem.
 
 Thanks,
 Sean
 
 -Original Message-
 From: Kurt Strosahl [mailto:stros...@jlab.org]
 Sent: 10 July 2015 16:01
 To: Sean Brisbane
 Cc: Patrick Farrell; lustre-discuss@lists.lustre.org
 Subject: Re: [lustre-discuss] lustre 2.5.3 ost not draining
 
 The problem there is that I cannot afford to leave it "some number of
days"... it is at 97% full, so new writes are going to it faster than
it can clean itself off.
 
 w/r,
 Kurt
 
 - Original Message -
 From: "Sean Brisbane" 
 To: "Patrick Farrell" , "Kurt Strosahl"

 Cc: lustre-discuss@lists.lustre.org
 Sent: Friday, July 10, 2015 10:44:39 AM
 Subject: RE: [lustre-discuss] lustre 2.5.3 ost not draining
 
 Hi,
 
 The 'space not freed' issue also happened to me, and I left it 'some
number of days'; I don't recall how many, it was a while back.
 
 Cheers,
Sean

-- 
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Problems moving an OSS from an old Lustre installation to a new one

2015-07-28 Thread Mohr Jr, Richard Frank (Rick Mohr)

> On Jul 28, 2015, at 1:46 AM, Massimo Sgaravatto 
>  wrote:
> 
> Indeed we forgot to shutdown the old MGS/MDS before "attaching" the old OSS 
> to the new one, but I wonder why this caused problems.

If the old MDS contacted the old OSS, and the file system names were still the
same between the new and old file systems, maybe the OSS still processed
some configuration data from the old MDS, and this started causing issues.

> Our idea now would be to retry the mkfs operations (this time they would be 
> done with the old MDS switched down)
> Do you see some better options ?

Before doing mkfs, I would try running a writeconf to recreate the config logs 
(making sure the old MDS is powered off).  That might clear out any old data 
and fix things.

In general, I would highly recommend not having two Lustre file systems up and 
running with the same file system name.  It is probably safer just to assign a 
new name to a new file system in case there are any conflicts (although you can 
still mount the Lustre file system at the same location).
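For reference, a sketch of the writeconf procedure under the usual assumptions (device paths are placeholders; consult the Lustre manual section on regenerating configuration logs before trying this on a production system):

```shell
# Stop all clients and unmount every Lustre target first
# (and keep the old MDS powered off).

# On the MDS, regenerate the MGS/MDT configuration log:
umount /mnt/mdt
tunefs.lustre --writeconf /dev/mdt_device      # placeholder device path

# On each OSS, for every OST device:
tunefs.lustre --writeconf /dev/ost_device      # placeholder device path

# Remount in order: MGS/MDT first, then the OSTs, then the clients.
mount -t lustre /dev/mdt_device /mnt/mdt
```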

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu



Re: [lustre-discuss] Problems moving an OSS from an old Lustre installation to a new one

2015-07-28 Thread Massimo Sgaravatto
I forgot to say that the filesystem names in the old and in the new 
Lustre installations are the same.




On 28/07/2015 07:17, Massimo Sgaravatto wrote:

Hi

We are migrating from an old Lustre installation composed of 1 MDS and 2
OSSs to a new Lustre 2.5.3 installation.

For this second installation we installed from scratch a new MDS + a
new OST, and we migrated the data from the old Lustre system.


Problems started when we tried to "move" an OSS from the old installation
to the new one.

For this OSS server we reinstalled the Operating System from scratch
(keeping the same hostname and IP address).
Then for the OSTs we formatted the file systems using commands such as:


  mkfs.lustre --reformat --fsname=cmswork
--mgsnode=t2-mds-01.lnl.infn.it@tcp0 --ost --param ost.quota_type=ug
--index=3 --mkfsoptions='-i 65536' /dev/mapper/MD1200_1p1


(t2-mds-01.lnl.infn.it is the new MDS)

and then we mounted the file systems

Apparently this worked.


After a while we realized that in the syslog of this "moved" OSS there
were messages such as:

Jul 25 10:54:02 t2-oss-03 kernel: Lustre: cmswork-OST0003: haven't heard
from client cmswork-MDT-mdtlov_UUID (at 10.60.16.8@tcp) in 232
seconds. I think it's dead, and I am evicting it. exp 8803123bf400,
cur 1437814442 expire 1437814292 last 1437814210


10.60.16.8 is the IP address of the old MDS!


No idea why it was expecting communications from it!
At any rate, on this old MDS I unmounted the MGS and MDT file systems.


After a while users complained that there were problems with some (not
all) files written on the new OSTs, e.g.:

# ls -l
/lustre/cmswork/ronchese/pat_ntu/cmssw53B_slc6/dev08tmp/src/PDAnalysis/EDM/bin/ntu.root


ls: cannot access
/lustre/cmswork/ronchese/pat_ntu/cmssw53B_slc6/dev08tmp/src/PDAnalysis/EDM/bin/ntu.root:
Cannot allocate memory


In the syslog of the client:

Jul 26 08:01:09 t2-ui-13 kernel: LustreError: 11-0:
cmswork-OST0003-osc-880818e5: Communicating with 10.60.16.9@tcp,
operation ldlm_enqueue failed with -12.


10.60.16.9 is the IP of the "moved" OSS.
In its syslog:


Jul 26 08:01:09 t2-oss-03 kernel: LustreError:
8114:0:(ldlm_resource.c:1188:ldlm_resource_get()) cmswork-OST0003:
lvbo_init failed for resource 0xb9:0x0: rc = -2
Jul 26 08:01:09 t2-oss-03 kernel: LustreError:
8114:0:(ldlm_resource.c:1188:ldlm_resource_get()) Skipped 1 previous
similar message


Reading:

https://jira.hpdd.intel.com/browse/LU-4034

I guess the memory is not the real problem.  The problem is that the
object was not found on the OST.


Some interesting messages found in the syslog of the "moved" OSS:

Jul 24 14:56:25 t2-oss-03 kernel: Lustre: cmswork-OST0003: Received MDS
connection from 10.60.16.8@tcp, removing former export from 10.60.16.38@tcp

Jul 24 14:56:27 t2-oss-03 kernel: Lustre: cmswork-OST0003: already
connected client cmswork-MDT-mdtlov_UUID \
(at 10.60.16.8@tcp) with handle 0xdb376ec08bf7d020. Rejecting client
with the same UUID trying to reconnect with\
  handle 0x6dffb49bb9b3bc70

10.60.16.8 is the IP of the old MDS
10.60.16.38 is the IP of the new MDS


For the time being we disabled the OSTs hosted on the "moved" OSS so that
new objects are not written there.
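For readers in a similar situation: disabling new allocations on an OST is usually done from the MDS; a minimal sketch, assuming the fsname cmswork and OST index 3 from this thread (verify the exact device name with `lctl dl` first):

```shell
# On the MDS: deactivate the MDT's connection to the affected OST, so no
# new objects are placed there; existing objects stay readable by clients.
lctl --device cmswork-OST0003-osc-MDT0000 deactivate

# Re-enable allocations later with:
lctl --device cmswork-OST0003-osc-MDT0000 activate
```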


Any idea what the problem is and how we could recover the system ?



Thanks, Massimo





Re: [lustre-discuss] quota only in- but not decreasing after upgrading to Lustre 2.5.3

2015-07-28 Thread Martin Hecht
Hi,

it might help to disable quota using tune2fs and re-enable it again at
the ext2 level on all devices; see LU-3861.
(BTW you don't need the e2fsprogs mentioned in the bug; there was an
official release last year in September.)

You have to stop Lustre for the tune2fs run, and it takes some time,
because this triggers a quota check in the background (which does not
produce any output on the screen).

best regards,
Martin
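A sketch of what that looks like per device, with a placeholder device path (run only with the Lustre targets unmounted):

```shell
# On each MDT/OST block device, with Lustre stopped:
tune2fs -O ^quota /dev/mapper/lustre_target    # drop the quota feature
tune2fs -O quota /dev/mapper/lustre_target     # re-enable it; this triggers
                                               # a silent quota recomputation
# Then remount the targets and re-check with: lfs quota -u <user> /lustre
```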

On 07/28/2015 09:44 AM, Torsten Harenberg wrote:
> a further observation:
>
> a user deleted a ~100MB file:
>
> before:
>
> [root@wnfg001 lustre]# lfs quota -u sandhoff /lustre
> Disk quotas for user sandhoff (uid 11206):
>  Filesystem  kbytes   quota   limit   grace   files   quota   limit
>   grace
> /lustre 811480188  1811480200 2811480200   -   61077   0
>   0   -
>
> after:
>
> [root@wnfg001 lustre]# lfs quota -u sandhoff /lustre
> Disk quotas for user sandhoff (uid 11206):
>  Filesystem  kbytes   quota   limit   grace   files   quota   limit
>   grace
> /lustre 811480188  1811480200 2811480200   -   61076   0
>   0   -
> [root@wnfg001 lustre]#
>
> so #files decreased by 1, but not the #kbytes.
>
> Furthermore, the lfs quota command is pretty slow:
>
> [root@wnfg001 lustre]# time lfs quota -u sandhoff /lustre
> Disk quotas for user sandhoff (uid 11206):
>  Filesystem  kbytes   quota   limit   grace   files   quota   limit
>   grace
> /lustre 811480188  1811480200 2811480200   -   61076   0
>   0   -
>
> real0m2.441s
> user0m0.001s
> sys 0m0.004s
> [root@wnfg001 lustre]#
>
> although the system is not overloaded.
>
> Couldn't find anything useful in dmesg:
>
> [root@lustre2 ~]# dmesg | grep quota
> VFS: Disk quotas dquot_6.5.2
> LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on.
> Opts:
> [root@lustre2 ~]#
>
> [root@lustre3 ~]# dmesg | grep quota
> VFS: Disk quotas dquot_6.5.2
> LDISKFS-fs (dm-6): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-7): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-9): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-13): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-12): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-10): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-6): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-7): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-9): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-13): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-12): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-10): mounted filesystem with ordered data mode. quota=on.
> Opts:
> [root@lustre3 ~]#
>
> [root@lustre4 ~]# dmesg | grep quota
> VFS: Disk quotas dquot_6.5.2
> LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-8): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-13): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-14): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-8): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-13): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-14): mounted filesystem with ordered data mode. quota=on.
> Opts:
> LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. quota=on.
> Opts:
> [root@lustre4 ~]#
>
> Thanks for any hint!
>
> Best regards
>
>   Torsten
>






Re: [lustre-discuss] quota only in- but not decreasing after upgrading to Lustre 2.5.3

2015-07-28 Thread Torsten Harenberg
a further observation:

a user deleted a ~100MB file:

before:

[root@wnfg001 lustre]# lfs quota -u sandhoff /lustre
Disk quotas for user sandhoff (uid 11206):
 Filesystem  kbytes   quota   limit   grace   files   quota   limit
  grace
/lustre 811480188  1811480200 2811480200   -   61077   0
  0   -

after:

[root@wnfg001 lustre]# lfs quota -u sandhoff /lustre
Disk quotas for user sandhoff (uid 11206):
 Filesystem  kbytes   quota   limit   grace   files   quota   limit
  grace
/lustre 811480188  1811480200 2811480200   -   61076   0
  0   -
[root@wnfg001 lustre]#

so #files decreased by 1, but not the #kbytes.

Furthermore, the lfs quota command is pretty slow:

[root@wnfg001 lustre]# time lfs quota -u sandhoff /lustre
Disk quotas for user sandhoff (uid 11206):
 Filesystem  kbytes   quota   limit   grace   files   quota   limit
  grace
/lustre 811480188  1811480200 2811480200   -   61076   0
  0   -

real0m2.441s
user0m0.001s
sys 0m0.004s
[root@wnfg001 lustre]#

although the system is not overloaded.

Couldn't find anything useful in dmesg:

[root@lustre2 ~]# dmesg | grep quota
VFS: Disk quotas dquot_6.5.2
LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on.
Opts:
[root@lustre2 ~]#

[root@lustre3 ~]# dmesg | grep quota
VFS: Disk quotas dquot_6.5.2
LDISKFS-fs (dm-6): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-7): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-9): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-13): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-12): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-10): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-6): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-7): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-9): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-13): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-12): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-10): mounted filesystem with ordered data mode. quota=on.
Opts:
[root@lustre3 ~]#

[root@lustre4 ~]# dmesg | grep quota
VFS: Disk quotas dquot_6.5.2
LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-8): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-13): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-14): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-8): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-13): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-14): mounted filesystem with ordered data mode. quota=on.
Opts:
LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. quota=on.
Opts:
[root@lustre4 ~]#

Thanks for any hint!

Best regards

  Torsten

-- 
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
<>  <>
<> Dr. Torsten Harenberg torsten.harenb...@cern.ch  <>
<> Bergische Universitaet   <>
<> FB C - Physik Tel.: +49 (0)202 439-3521  <>
<> Gaussstr. 20  Fax : +49 (0)202 439-2811  <>
<> 42097 Wuppertal   @CERN: Bat. 1-1-049<>
<>  <>
<><><><><><><>< Of course it runs NetBSD http://www.netbsd.org ><>