Hello Andreas,

lfs df -i reports 19,204,412 inodes used. When I did the full Robinhood scan, 
it reported scanning 18,673,874 entries, so the two numbers are fairly close.

I don’t have a .lustre directory at the filesystem root. 

Another interesting aspect of this particular issue is that I can run an LFSCK 
via lctl, and every time I get:

layout_repaired: 1468299

But it doesn’t seem to be actually repairing anything, because if I run it 
again I get the same or a similar number.

I run it like this:
lctl lfsck_start -t layout -t namespace -o -M lfsc-MDT0000
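
To see whether those repair counts ever move between runs, I also dump the 
LFSCK stats on the MDT afterwards, something like this (just a sketch; the 
exact parameter names may vary by Lustre version):

lctl get_param -n mdd.lfsc-MDT0000.lfsck_layout
lctl get_param -n mdd.lfsc-MDT0000.lfsck_namespace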

—
Dan Szkola
FNAL


> On Oct 10, 2023, at 10:47 AM, Andreas Dilger <adil...@whamcloud.com> wrote:
> 
> There is a $ROOT/.lustre/lost+found that you could check. 
> 
> What does "lfs df -i" report for the used inode count?  Maybe it is RBH that 
> is reporting the wrong count?
> 
> The other alternative would be to mount the MDT filesystem directly as type 
> ZFS and see what df -i and find report.
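> 
> A rough sketch of what I mean, assuming the MDT dataset is named 
> "mdtpool/mdt0" (substitute whatever "zfs list" shows on your MDS; the 
> mount points below are placeholders) and the MDT is stopped first:
> 
> umount /mnt/lustre/mdt0                # stop the Lustre MDT service
> mount -t zfs -o ro mdtpool/mdt0 /mnt/mdt_inspect
> df -i /mnt/mdt_inspect                 # inode usage as ZFS reports it
> find /mnt/mdt_inspect/ROOT | wc -l     # entries in the filesystem namespace
> umount /mnt/mdt_inspect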
> 
> Cheers, Andreas
> 
>> On Oct 10, 2023, at 22:16, Daniel Szkola via lustre-discuss 
>> <lustre-discuss@lists.lustre.org> wrote:
>> 
>> OK, I disabled quotas, waited for a while, then reenabled them. I still get 
>> the same numbers. The only thing I can think is that somehow the count is 
>> correct, despite the huge difference. Robinhood and find show about 1.7M 
>> files, dirs, and links, while quota shows a bit over 3.1M inodes used. We 
>> only have one MDS and one MGS. Any ideas where the discrepancy may lie? 
>> Orphans? Is there a lost+found area in Lustre?
>> 
>> —
>> Dan Szkola
>> FNAL
>> 
>> 
>>> On Oct 10, 2023, at 8:24 AM, Daniel Szkola <dszk...@fnal.gov> wrote:
>>> 
>>> Hi Robert,
>>> 
>>> Thanks for the response. Do you remember exactly how you did it? Did you 
>>> bring everything down at any point? I know you can do this:
>>> 
>>> lctl conf_param fsname.quota.mdt=none
>>> 
>>> but is that all you did? Did you wait, or bring everything down, before 
>>> reenabling? I’m worried because that allegedly only enables/disables 
>>> enforcement, while space accounting is always on. Andreas stated that 
>>> quotas are controlled by ZFS, but quota support has never been enabled on 
>>> any of the ZFS volumes in our Lustre filesystem.
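>>> 
>>> For reference, the full cycle I have in mind looks something like this, 
>>> run on the MGS with our fsname “lfsc” (a sketch based on the 
>>> quota.mdt/quota.ost tunables):
>>> 
>>> lctl conf_param lfsc.quota.mdt=none
>>> lctl conf_param lfsc.quota.ost=none
>>> # ...wait, or restart the targets...
>>> lctl conf_param lfsc.quota.mdt=ug
>>> lctl conf_param lfsc.quota.ost=ug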
>>> 
>>> —
>>> Dan Szkola
>>> FNAL
>>> 
>>>> On Oct 10, 2023, at 2:17 AM, Redl, Robert <robert.r...@lmu.de> wrote:
>>>> 
>>>> Dear Dan,
>>>> 
>>>> I had a similar problem some time ago. We are also using ZFS for the MDT 
>>>> and the OSTs. For us, the used disk space was reported incorrectly. The 
>>>> problem was fixed by switching quota support off on the MGS and then back 
>>>> on again. 
>>>> 
>>>> Cheers,
>>>> Robert
>>>> 
>>>>> On Oct 9, 2023, at 17:55, Daniel Szkola via lustre-discuss 
>>>>> <lustre-discuss@lists.lustre.org> wrote:
>>>>> 
>>>>> Thanks, I will look into the ZFS quota since we are using ZFS for all 
>>>>> storage, MDT and OSTs.
>>>>> 
>>>>> In our case, there is a single MDS/MDT. I have used Robinhood and lfs 
>>>>> find (by group) to cross-check what the numbers should be.
>>>>> 
>>>>> —
>>>>> Dan Szkola
>>>>> FNAL
>>>>> 
>>>>>> On Oct 9, 2023, at 10:13 AM, Andreas Dilger <adil...@whamcloud.com> 
>>>>>> wrote:
>>>>>> 
>>>>>> The quota accounting is controlled by the backing filesystem of the OSTs 
>>>>>> and MDTs.
>>>>>> 
>>>>>> For ldiskfs/ext4 you could run e2fsck to re-count all of the inode and 
>>>>>> block usage. 
>>>>>> 
>>>>>> For ZFS you would have to ask on the ZFS list to see if there is some 
>>>>>> way to re-count the quota usage. 
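>>>>>> 
>>>>>> One thing that might be worth checking directly on the MDS: newer ZFS 
>>>>>> can report per-group object counts itself (via the userobj_accounting 
>>>>>> feature), which is what the MDT inode accounting is based on. Something 
>>>>>> like the following, with a placeholder dataset name:
>>>>>> 
>>>>>> zfs groupspace -o name,used,objused mdtpool/mdt0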
>>>>>> 
>>>>>> The "inode" quota is accounted from the MDTs, while the "block" quota is 
>>>>>> accounted from the OSTs. You might be able to see with "lfs quota -v -g 
>>>>>> group" to see if there is one particular MDT that is returning too many 
>>>>>> inodes. 
>>>>>> 
>>>>>> Possibly, if you have directories that are striped across many MDTs, 
>>>>>> that would inflate the used inode count. For example, if every one of 
>>>>>> the 426k directories reported by RBH was striped across 4 MDTs, the 
>>>>>> used inode count would add up to about 3.1M, which is close to what 
>>>>>> quota reports. 
>>>>>> 
>>>>>> If that was the case, then I would really, really advise against 
>>>>>> striping every directory in the filesystem.  That will cause problems 
>>>>>> far worse than just inflating the inode quota accounting. 
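>>>>>> 
>>>>>> You can check whether any given directory is striped with something 
>>>>>> like this (the path is just an example):
>>>>>> 
>>>>>> lfs getdirstripe /lustre1/some/dir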
>>>>>> 
>>>>>> Cheers, Andreas
>>>>>> 
>>>>>>> On Oct 9, 2023, at 22:33, Daniel Szkola via lustre-discuss 
>>>>>>> <lustre-discuss@lists.lustre.org> wrote:
>>>>>>> 
>>>>>>> Is there really no way to force a recount of the files counted by the 
>>>>>>> quota? All indications are that we have accounts where files were 
>>>>>>> removed and this is not reflected in the quota’s used-file count. The 
>>>>>>> space used seems correct, but the inodes-used numbers are way too high. 
>>>>>>> There must be a way to clear these numbers and have a fresh count done.
>>>>>>> 
>>>>>>> —
>>>>>>> Dan Szkola
>>>>>>> FNAL
>>>>>>> 
>>>>>>>> On Oct 4, 2023, at 11:37 AM, Daniel Szkola via lustre-discuss 
>>>>>>>> <lustre-discuss@lists.lustre.org> wrote:
>>>>>>>> 
>>>>>>>> Also, quotas on the OSTs don’t add up to anywhere near 3 million files either:
>>>>>>>> 
>>>>>>>> [root@lustreclient scratch]# ssh ossnode0 lfs quota -g somegroup -I 0 /lustre1
>>>>>>>> Disk quotas for grp somegroup (gid 9544):
>>>>>>>> Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>>>>>>>>         1394853459       0 1913344192       -  132863       0       0       -
>>>>>>>> [root@lustreclient scratch]# ssh ossnode0 lfs quota -g somegroup -I 1 /lustre1
>>>>>>>> Disk quotas for grp somegroup (gid 9544):
>>>>>>>> Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>>>>>>>>         1411579601       0 1963246413       -  120643       0       0       -
>>>>>>>> [root@lustreclient scratch]# ssh ossnode1 lfs quota -g somegroup -I 2 /lustre1
>>>>>>>> Disk quotas for grp somegroup (gid 9544):
>>>>>>>> Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>>>>>>>>         1416507527       0 1789950778       -  190687       0       0       -
>>>>>>>> [root@lustreclient scratch]# ssh ossnode1 lfs quota -g somegroup -I 3 /lustre1
>>>>>>>> Disk quotas for grp somegroup (gid 9544):
>>>>>>>> Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>>>>>>>>         1636465724       0 1926578117       -  195034       0       0       -
>>>>>>>> [root@lustreclient scratch]# ssh ossnode2 lfs quota -g somegroup -I 4 /lustre1
>>>>>>>> Disk quotas for grp somegroup (gid 9544):
>>>>>>>> Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>>>>>>>>         2202272244       0 3020159313       -  185097       0       0       -
>>>>>>>> [root@lustreclient scratch]# ssh ossnode2 lfs quota -g somegroup -I 5 /lustre1
>>>>>>>> Disk quotas for grp somegroup (gid 9544):
>>>>>>>> Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>>>>>>>>         1324770165       0 1371244768       -  145347       0       0       -
>>>>>>>> [root@lustreclient scratch]# ssh ossnode3 lfs quota -g somegroup -I 6 /lustre1
>>>>>>>> Disk quotas for grp somegroup (gid 9544):
>>>>>>>> Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>>>>>>>>         2892027349       0 3221225472       -  169386       0       0       -
>>>>>>>> [root@lustreclient scratch]# ssh ossnode3 lfs quota -g somegroup -I 7 /lustre1
>>>>>>>> Disk quotas for grp somegroup (gid 9544):
>>>>>>>> Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
>>>>>>>>         2076201636       0 2474853207       -  171552       0       0       -
>>>>>>>> 
>>>>>>>> 
>>>>>>>> —
>>>>>>>> Dan Szkola
>>>>>>>> FNAL
>>>>>>>> 
>>>>>>>>> On Oct 4, 2023, at 8:45 AM, Daniel Szkola via lustre-discuss 
>>>>>>>>> <lustre-discuss@lists.lustre.org> wrote:
>>>>>>>>> 
>>>>>>>>> No combination of commands run on the OSS nodes has helped with this.
>>>>>>>>> 
>>>>>>>>> Again, Robinhood shows 1,796,104 files for the group, and an 'lfs find 
>>>>>>>>> -G gid' run found 1,796,104 files as well.
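>>>>>>>>> 
>>>>>>>>> For the record, the count came from a command along these lines:
>>>>>>>>> 
>>>>>>>>> lfs find /lustre1 -G 9544 | wc -l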
>>>>>>>>> 
>>>>>>>>> So why is the quota command showing over 3 million inodes used?
>>>>>>>>> 
>>>>>>>>> There must be a way to force it to recount, or to clear all stale 
>>>>>>>>> quota data and have it regenerated.
>>>>>>>>> 
>>>>>>>>> Anyone?
>>>>>>>>> 
>>>>>>>>> —
>>>>>>>>> Dan Szkola
>>>>>>>>> FNAL
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On Sep 27, 2023, at 9:42 AM, Daniel Szkola via lustre-discuss 
>>>>>>>>>> <lustre-discuss@lists.lustre.org> wrote:
>>>>>>>>>> 
>>>>>>>>>> We have a Lustre filesystem that we just upgraded to 2.15.3; however, 
>>>>>>>>>> this problem has been going on for some time.
>>>>>>>>>> 
>>>>>>>>>> The quota command shows this:
>>>>>>>>>> 
>>>>>>>>>> Disk quotas for grp somegroup (gid 9544):
>>>>>>>>>> Filesystem    used   quota   limit   grace   files   quota   limit   grace
>>>>>>>>>> /lustre1  13.38T     40T     45T       - 3136761* 2621440 3670016 expired
>>>>>>>>>> 
>>>>>>>>>> The group is not using nearly that many files. We have Robinhood 
>>>>>>>>>> installed, and it shows this:
>>>>>>>>>> 
>>>>>>>>>> Using config file '/etc/robinhood.d/lustre1.conf'.
>>>>>>>>>> group,     type,      count,     volume,   spc_used,   avg_size
>>>>>>>>>> somegroup,   symlink,      59071,    5.12 MB,  103.16 MB,         91
>>>>>>>>>> somegroup,       dir,     426619,    5.24 GB,    5.24 GB,   12.87 KB
>>>>>>>>>> somegroup,      file,    1310414,   16.24 TB,   13.37 TB,   13.00 MB
>>>>>>>>>> 
>>>>>>>>>> Total: 1796104 entries, volume: 17866508365925 bytes (16.25 TB), 
>>>>>>>>>> space used: 14704924899840 bytes (13.37 TB)
>>>>>>>>>> 
>>>>>>>>>> Any ideas what is wrong here?
>>>>>>>>>> 
>>>>>>>>>> —
>>>>>>>>>> Dan Szkola
>>>>>>>>>> FNAL
>>>> 
>>> 
>> 

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org