Thanks, I will look into the ZFS quota since we are using ZFS for all storage, 
MDT and OSTs.
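
As a first step on the ZFS side, here is a rough sketch of what I plan to try on the MDS (dataset names are placeholders, and it assumes the pools have the userobj_accounting feature enabled):

zfs groupspace -o name,used,objused mdtpool/mdt0     # per-group object counts on the MDT dataset
zfs groupspace -o name,used,objused ostpool/ost0     # same check on one OST dataset for comparison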

In our case, there is a single MDS/MDT. I have used Robinhood and 'lfs find' (by 
group) to verify what the numbers apparently should be.
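
For reference, the counts came from something along these lines (the gid shown is the one from the quota output below):

lfs find /lustre1 -G 9544 | wc -l             # all entries (files, dirs, symlinks) owned by the group
lfs find /lustre1 -G 9544 --type f | wc -l    # regular files only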

—
Dan Szkola
FNAL

> On Oct 9, 2023, at 10:13 AM, Andreas Dilger <adil...@whamcloud.com> wrote:
> 
> The quota accounting is controlled by the backing filesystem of the OSTs and 
> MDTs.
> 
> For ldiskfs/ext4 you could run e2fsck to re-count all of the inode and block 
> usage. 
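> 
> A minimal sketch of that for an ldiskfs target (device and mount point are 
> placeholders; the target must be unmounted first, and this does not apply to 
> ZFS-backed targets):
> 
> umount /mnt/mdt0
> e2fsck -f -y /dev/mapper/mdt0    # forces a full check and re-counts inode/block usage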
> 
> For ZFS you would have to ask on the ZFS list to see if there is some way to 
> re-count the quota usage. 
> 
> The "inode" quota is accounted from the MDTs, while the "block" quota is 
> accounted from the OSTs. You might be able to see with "lfs quota -v -g 
> group" to see if there is one particular MDT that is returning too many 
> inodes. 
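> 
> For example (group name as used below):
> 
> lfs quota -v -g somegroup /lustre1
> 
> The -v output breaks the usage down per MDT and per OST, so an MDT reporting 
> an inflated "files" count should stand out.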
> 
> Possibly, if you have directories that are striped across many MDTs, that 
> would inflate the used inode count. For example, if every one of the 426k 
> directories reported by RBH were striped across 4 MDTs, then you would see 
> the inode count add up to 3.6M. 
> 
> If that was the case, then I would really, really advise against striping 
> every directory in the filesystem.  That will cause problems far worse than 
> just inflating the inode quota accounting. 
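> 
> You can check how a given directory is striped with something like (the path 
> is a placeholder):
> 
> lfs getdirstripe /lustre1/projects/somegroup
> 
> A stripe count greater than one means the directory consumes an object on 
> each MDT it is striped over.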
> 
> Cheers, Andreas
> 
>> On Oct 9, 2023, at 22:33, Daniel Szkola via lustre-discuss 
>> <lustre-discuss@lists.lustre.org> wrote:
>> 
>> Is there really no way to force a recount of the files used by the quota? 
>> All indications are that we have accounts where files were removed but this 
>> is not reflected in the quota's used-file count. The space used seems 
>> correct, but the inode counts are way too high. There must be a way to 
>> clear these numbers and have a fresh count done.
>> 
>> —
>> Dan Szkola
>> FNAL
>> 
>>> On Oct 4, 2023, at 11:37 AM, Daniel Szkola via lustre-discuss 
>>> <lustre-discuss@lists.lustre.org> wrote:
>>> 
>>> Also, quotas on the OSTs don't add up to near 3 million files either:
>>> 
>>> [root@lustreclient scratch]# ssh ossnode0 lfs quota -g somegroup -I 0 /lustre1
>>> Disk quotas for grp somegroup (gid 9544):
>>>   Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>>>              1394853459       0 1913344192       -  132863       0       0       -
>>> [root@lustreclient scratch]# ssh ossnode0 lfs quota -g somegroup -I 1 /lustre1
>>> Disk quotas for grp somegroup (gid 9544):
>>>   Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>>>              1411579601       0 1963246413       -  120643       0       0       -
>>> [root@lustreclient scratch]# ssh ossnode1 lfs quota -g somegroup -I 2 /lustre1
>>> Disk quotas for grp somegroup (gid 9544):
>>>   Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>>>              1416507527       0 1789950778       -  190687       0       0       -
>>> [root@lustreclient scratch]# ssh ossnode1 lfs quota -g somegroup -I 3 /lustre1
>>> Disk quotas for grp somegroup (gid 9544):
>>>   Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>>>              1636465724       0 1926578117       -  195034       0       0       -
>>> [root@lustreclient scratch]# ssh ossnode2 lfs quota -g somegroup -I 4 /lustre1
>>> Disk quotas for grp somegroup (gid 9544):
>>>   Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>>>              2202272244       0 3020159313       -  185097       0       0       -
>>> [root@lustreclient scratch]# ssh ossnode2 lfs quota -g somegroup -I 5 /lustre1
>>> Disk quotas for grp somegroup (gid 9544):
>>>   Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>>>              1324770165       0 1371244768       -  145347       0       0       -
>>> [root@lustreclient scratch]# ssh ossnode3 lfs quota -g somegroup -I 6 /lustre1
>>> Disk quotas for grp somegroup (gid 9544):
>>>   Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>>>              2892027349       0 3221225472       -  169386       0       0       -
>>> [root@lustreclient scratch]# ssh ossnode3 lfs quota -g somegroup -I 7 /lustre1
>>> Disk quotas for grp somegroup (gid 9544):
>>>   Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>>>              2076201636       0 2474853207       -  171552       0       0       -
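>>> 
>>> For what it's worth, those per-OST file counts sum to 132863 + 120643 + 
>>> 190687 + 195034 + 185097 + 145347 + 169386 + 171552 = 1310609, which lines 
>>> up with the roughly 1.3M files Robinhood reports for the group rather than 
>>> the 3.1M inodes the quota shows.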
>>> 
>>> 
>>> —
>>> Dan Szkola
>>> FNAL
>>> 
>>>> On Oct 4, 2023, at 8:45 AM, Daniel Szkola via lustre-discuss 
>>>> <lustre-discuss@lists.lustre.org> wrote:
>>>> 
>>>> No combination of ossnodek runs has helped with this.
>>>> 
>>>> Again, Robinhood shows 1796104 files for the group, and an 'lfs find -G gid' 
>>>> found 1796104 files as well.
>>>> 
>>>> So why is the quota command showing over 3 million inodes used?
>>>> 
>>>> There must be a way to force it to recount, or to clear all stale quota 
>>>> data and have it regenerated.
>>>> 
>>>> Anyone?
>>>> 
>>>> —
>>>> Dan Szkola
>>>> FNAL
>>>> 
>>>> 
>>>>> On Sep 27, 2023, at 9:42 AM, Daniel Szkola via lustre-discuss 
>>>>> <lustre-discuss@lists.lustre.org> wrote:
>>>>> 
>>>>> We have a Lustre filesystem that we just upgraded to 2.15.3; however, this 
>>>>> problem has been going on for some time.
>>>>> 
>>>>> The quota command shows this:
>>>>> 
>>>>> Disk quotas for grp somegroup (gid 9544):
>>>>> Filesystem    used   quota   limit   grace   files   quota   limit   grace
>>>>>   /lustre1  13.38T     40T     45T       - 3136761* 2621440 3670016 expired
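>>>>> 
>>>>> (That is from something like "lfs quota -h -g somegroup /lustre1"; the 
>>>>> asterisk on the files count means the soft inode limit has been exceeded, 
>>>>> and the grace period shows as expired.)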
>>>>> 
>>>>> The group is not using nearly that many files. We have Robinhood 
>>>>> installed and it shows this:
>>>>> 
>>>>> Using config file '/etc/robinhood.d/lustre1.conf'.
>>>>> group,     type,      count,     volume,   spc_used,   avg_size
>>>>> somegroup,   symlink,      59071,    5.12 MB,  103.16 MB,         91
>>>>> somegroup,       dir,     426619,    5.24 GB,    5.24 GB,   12.87 KB
>>>>> somegroup,      file,    1310414,   16.24 TB,   13.37 TB,   13.00 MB
>>>>> 
>>>>> Total: 1796104 entries, volume: 17866508365925 bytes (16.25 TB), space used: 14704924899840 bytes (13.37 TB)
>>>>> 
>>>>> Any ideas what is wrong here?
>>>>> 
>>>>> —
>>>>> Dan Szkola
>>>>> FNAL

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org