Oh. Doh. I missed the low inode usage that you listed in your first email. Ben was right: that kind of does point to some runaway log files like changelogs. Perhaps you enabled them by accident? Do you have any kind of HSM? It would be good to check the changelog_users regardless.
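For example, on the MDS, something like this should list any registered changelog consumers (old, stalled entries would explain changelogs piling up; an empty list means nothing is retaining them):

lctl get_param mdd.*.changelog_users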

If that doesn't show anything, you probably need to mount your MDT's backend filesystem locally (read-only) and look for where the space is going.
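A sketch of that, assuming the MDT device path from your mkfs.lustre command quoted below (stop the MDT first; ldiskfs is Lustre's patched ext4 backend, so it mounts like a normal local filesystem):

mount -t ldiskfs -o ro /dev/mapper/MGTMDT /mnt/mdt
du -skx /mnt/mdt/* | sort -n    # find the directories eating the space
umount /mnt/mdt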

Chris

On 10/16/2015 10:37 AM, Christopher J. Morrone wrote:
Hi Torsten,

There is no reason to suspect that space usage on the MDT will be the
same as the average space usage on the OSTs.

Your MDT is storing the metadata about _all_ of the files in your Lustre
filesystem.  You can think of this metadata as a whole bunch of
zero-length files with some extended attributes, because under the
covers that is basically what the MDT is storing.
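Purely for illustration (the paths here are made up): on a read-only ldiskfs mount of the MDT, the namespace lives under ROOT/, and a regular file shows up as a zero-length inode whose layout is kept in extended attributes:

ls -l /mnt/mdt/ROOT/somedir/somefile              # size is 0
getfattr -d -m - /mnt/mdt/ROOT/somedir/somefile   # trusted.lov etc.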

So space usage on the MDT will be directly proportional to the total
number of files in your Lustre filesystem.  The size of those files
doesn't really matter much, because the contents of the files are stored
on the OSTs.
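You can see this split directly from any client: lfs df reports space per target and lfs df -i reports inodes per target, so a space-full MDT with plenty of free inodes (or vice versa) stands out immediately:

lfs df -h /lustre
lfs df -i /lustre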

So your issue is that the files in your filesystem are, on average, too
small to fill your OSTs before the MDT runs out of space.  Some
possible solutions are:

* Increase the size of your MDT
* Encourage/require your users to start using larger average file sizes

Granted, neither approach is terribly easy.
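As a rough, back-of-the-envelope illustration (all numbers here are made up; the real per-file overhead depends on your ldiskfs inode size, directory entries, and striping EAs, but a few KiB per file is a common ballpark):

    1 TiB MDT / 4 KiB per file        ~= 268 million files max
    500 TiB OST space / 268 M files   ~= 2 MiB average file size

So with those assumed sizes, files averaging under about 2 MiB would fill the MDT before the OSTs.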

I think that you can pretty safely ignore the changelogs suggestion.

Chris

On 10/16/2015 07:31 AM, Torsten Harenberg wrote:
Am 16.10.2015 um 16:01 schrieb Ben Evans:
Looks like you've got some really large changelogs built up.  Did you
have Robinhood, or some other consumer, running at some point that has
since stalled?

I don't think so, as I have never heard of "Robinhood" in the context
of Lustre.

The setup is pretty simple; the devices were created with

mkfs.lustre --fsname=lustre --mgs --mdt --backfstype=ext4
--failnode=132.195.124.201@tcp --verbose /dev/mapper/MGTMDT

and

mkfs.lustre --fsname=lustre --ost --backfstype=ext4
--failnode=132.195.124.204@tcp --mgsnode=132.195.124.202@tcp
--mgsnode=132.195.124.201@tcp --verbose /dev/mapper/OST0000

respectively.

The only "add-on" we have is quota support. Back in 2.1.5 that was
enabled with:

lfs quotacheck -ug /lustre
lfs quotaon /lustre


The file system is mounted on about 200 nodes and accessed by cluster
users.

Best regards,

   Torsten

