Re: [lustre-discuss] RV: Lustre quota issues

2019-07-11 Thread Moreno Diego (ID SIS)
Hi Thomas,

I think one way to get reliable quota values might be to reduce a couple of
tunables:

- osc.*.max_dirty_mb: lowering this reduces the amount of uncommitted data
held in the client cache, and thus the potential for quota inconsistencies.
I have recently been having quota issues on a filesystem with many OSTs where
max_dirty_mb was set to 128MB. Depending on the number of clients you have,
that allows a lot of dirty data, hence quota issues or, better said, false
positives. That is just my understanding of quota vs. cache, though.

- [mds01 ~]# lctl get_param qmt.*.*.soft_least_qunit: a tunable that reduces
the qunit size used between the soft and hard limits, allowing finer tuning of
the quota allowance when usage is in this "danger zone". That should help to
get more reliable values. This applies at least to Lustre 2.10.
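
As a rough illustration of the max_dirty_mb point above: the setting applies
per OSC, i.e. per OST on each client, so a worst-case bound on dirty data is
clients x OSTs x max_dirty_mb. A minimal Python sketch, in which the client
and OST counts are made-up example numbers and not taken from any filesystem
in this thread:

```python
def max_dirty_mib(clients: int, osts: int, max_dirty_mb: int) -> int:
    """Worst-case uncommitted data across all client caches, in MiB.

    Assumes every client holds max_dirty_mb of dirty data for every OST
    at the same time, which is an upper bound rather than a typical value.
    """
    return clients * osts * max_dirty_mb


# Hypothetical cluster: 100 clients, 40 OSTs, max_dirty_mb=128
worst_case = max_dirty_mib(clients=100, osts=40, max_dirty_mb=128)
print(f"{worst_case} MiB = {worst_case / 1024:.0f} GiB")  # 512000 MiB = 500 GiB
```

Even though the bound is rarely reached, it shows why a large max_dirty_mb on
a filesystem with many OSTs leaves room for usage that the quota accounting
has not yet seen.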

I had the same problems you have w.r.t. qunit. The documentation seems
outdated or inaccurate, at least for Lustre 2.10.

Regards,

Diego
 


Re: [lustre-discuss] RV: Lustre quota issues

2019-07-10 Thread Thomas Roth
Yes, I have seen this pattern before.

My guess:

- cetafs-OST0017 through OST0021 are at their limit, hence the * on each of
them, and hence the overall *.
The manual warns that somebody might write more than their allotted amount
because the quota is distributed over the OSTs; this might work the other way
around, too.

- 8k or 20k is far too low.

However, I have no idea how to influence these values.


Regards
Thomas


Re: [lustre-discuss] RV: Lustre quota issues

2019-07-09 Thread Alfonso Pardo
Hi,

If I set the quota to "-b 0 -B 0" and immediately set the quota to 20G again,
I get the same result: quota exceeded.
When I run "lfs quota -v", this is the output:



Disk quotas for group XXX (gid 694):
     Filesystem    used    quota   limit        grace   files   quota   limit   grace
      /mnt/data  2.307G*     20G     20G  6d23h59m58s   39997      10      10       -
cetafs-MDT_UUID
                 25.29M        -      0k            -   39997       -   65536       -
cetafs-OST0014_UUID
                 451.1M        -  452.1M            -       -       -       -       -
cetafs-OST0015_UUID
                 409.7M        -  410.7M            -       -       -       -       -
cetafs-OST0016_UUID
                 429.4M        -  430.4M            -       -       -       -       -
cetafs-OST0017_UUID
                 1.022G*       -  1.022G            -       -       -       -       -
cetafs-OST0018_UUID
                     8k*       -      8k            -       -       -       -       -
cetafs-OST0019_UUID
                    20k*       -     20k            -       -       -       -       -
cetafs-OST001a_UUID
                     8k*       -      8k            -       -       -       -       -
cetafs-OST001b_UUID
                    24k*       -     24k            -       -       -       -       -
quotactl ost28 failed.
quotactl ost29 failed.
quotactl ost30 failed.
quotactl ost31 failed.
cetafs-OST0020_UUID
                   116k*       -    116k            -       -       -       -       -
cetafs-OST0021_UUID
                   136k*       -    136k            -       -       -       -       -
Total allocated inode limit: 65536, total allocated block limit: 2.285G
Some errors happened when getting quota info. Some devices may be not
working or deactivated. The data in "[]" is inaccurate.



As you can see, I have some OSTs deactivated, because I am going to remove
them.

If I set only the hard limit ("limit", with -B) and no soft limit ("quota",
with -b), it works fine and no quota is exceeded; but as soon as I set the
soft limit (-b), the error appears.
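
To make the per-OST numbers above easier to read, here is a small Python
sketch that takes the used/limit pairs copied from the "lfs quota -v" output,
lists the OSTs whose usage has reached the limit shown for them, and sums the
per-OST limits. It assumes binary units (1M = 1024k, 1G = 1024M); that is an
assumption, although the sum then agrees with the reported "total allocated
block limit". The helper names are illustrative only.

```python
# Multipliers to KiB, assuming binary units as printed by lfs
UNIT_KIB = {"k": 1, "M": 1024, "G": 1024 * 1024}


def to_kib(value: str) -> float:
    """Convert an lfs-style size such as '452.1M' or '8k*' to KiB."""
    value = value.rstrip("*")
    return float(value[:-1]) * UNIT_KIB[value[-1]]


# (target, used, limit) as shown in the output above
OSTS = [
    ("cetafs-OST0014", "451.1M", "452.1M"),
    ("cetafs-OST0015", "409.7M", "410.7M"),
    ("cetafs-OST0016", "429.4M", "430.4M"),
    ("cetafs-OST0017", "1.022G", "1.022G"),
    ("cetafs-OST0018", "8k", "8k"),
    ("cetafs-OST0019", "20k", "20k"),
    ("cetafs-OST001a", "8k", "8k"),
    ("cetafs-OST001b", "24k", "24k"),
    ("cetafs-OST0020", "116k", "116k"),
    ("cetafs-OST0021", "136k", "136k"),
]


def at_limit(osts):
    """Names of targets whose usage has reached their per-target limit."""
    return [name for name, used, limit in osts if to_kib(used) >= to_kib(limit)]


def total_limit_gib(osts):
    """Sum of the per-target limits, in GiB."""
    return sum(to_kib(limit) for _, _, limit in osts) / (1024 * 1024)


print(at_limit(OSTS))
print(f"{total_limit_gib(OSTS):.3f}G")  # 2.285G
```

Seven of the ten reachable OSTs come out as being at their limit, which
matches the "*" markers in the output, while the group as a whole is far
below 20G.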




Re: [lustre-discuss] RV: Lustre quota issues

2019-07-08 Thread Thomas Roth
Perhaps the same issue that we see from time to time.

What happens if you remove the quota altogether (-b 0 -B 0) and set it to
20G immediately afterwards?
That made Lustre reconsider and repent in our case.
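
The reset described above could be sketched like this in Python. The sketch
only builds and prints the two "lfs setquota" command lines and does not run
them. The group name "GROUP" is a placeholder (the real name is not given in
this thread), the helper names are made up, and setting soft and hard limit
both to 20G mirrors the values reported in this thread.

```python
import shlex


def setquota_cmd(group: str, soft: str, hard: str, mount: str) -> list[str]:
    """Argument list for setting a group's block soft (-b) and hard (-B) limits."""
    return ["lfs", "setquota", "-g", group, "-b", soft, "-B", hard, mount]


def reset_sequence(group: str, limit: str, mount: str) -> list[list[str]]:
    """First clear the block limits, then set them again to the same value."""
    return [
        setquota_cmd(group, "0", "0", mount),
        setquota_cmd(group, limit, limit, mount),
    ]


# "GROUP" is a hypothetical group name; /mnt/data is the mount point from the thread
for cmd in reset_sequence("GROUP", "20G", "/mnt/data"):
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to check them before running them
by hand, or via subprocess.run(cmd, check=True) on a client with sufficient
privileges.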

Still, I suspect it is connected to the parts of the quota-total attributed to 
each OST:
Try 'lfs quota -v' to see these for each OST.

The manual talks of the 'qunit size', but it is not entirely clear whether
that is a tunable, a value derived from the Lustre/OST size, or a fixed value,
nor how to deal with it.


Regards
Thomas




-- 

Thomas Roth
Department: Informationstechnologie
Location: SB3 2.291
Phone: +49-6159-71 1453  Fax: +49-6159-71 2986


GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de

Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
Managing Directors / Geschäftsführung:
Professor Dr. Paolo Giubellino, Ursula Weyrich, Jörg Blaurock
Chairman of the Supervisory Board / Vorsitzender des GSI-Aufsichtsrats:
State Secretary / Staatssekretär Dr. Georg Schütte
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] RV: Lustre quota issues

2019-07-08 Thread Alfonso Pardo
Hi,

We have an issue with Lustre quotas and groups. We assign storage quotas to
user groups. In general it works fine, but sometimes, for some groups, Lustre
says that the quota limit has been exceeded. But that is not true: if you
run "lfs quota -g  /data", the storage used is below the quota limit:

Disk quotas for group XX (gid 694):
     Filesystem    used    quota   limit        grace   files   quota   limit   grace
      /mnt/data  2.148G*     20G     20G    2d1h31m9s   32733      10      10       -

As you can see in this example, the group has 2.148 GB used and the quota is
set to 20 GB, but Lustre says that the quota has been exceeded (*) and a
grace period has started.

If I raise the quota to 50G, for example, the quota indicator (*) and the
quota errors vanish.

Any idea or suggestion?

Thanks in advance

 

Alfonso Pardo Díaz
CETA·Ciemat
Departamento de Tecnología
Conventual de San Francisco
10200 Trujillo (Cáceres)
Tel: 927 659 317 | Ext: 214
alfonso.pa...@ciemat.es  

  

 
