Re: [Gluster-devel] Fuse mounts and inodes

2017-09-05 Thread Csaba Henk
Thanks Du, nice bit of info! It made me wonder about the following:

- Could setting vfs_cache_pressure to 100 + x then become the default answer
  we give to "glusterfs client high memory usage" type of complaints?
- And then x = ? Was proper performance testing done to see how performance /
  memory consumption changes as vfs_cache_pressure varies?
- vfs_cache_pressure is a system-wide tunable. If 100 + x is ideal for
  GlusterFS, can we take the courage to propose this? Is there no risk of
  thrashing other (disk-based) filesystems' performance?

Csaba

On Wed, Sep 6, 2017 at 6:57 AM, Raghavendra G  wrote:
> Another parallel effort could be trying to configure the number of
> inodes/dentries cached by kernel VFS using /proc/sys/vm interface.
>
> ==
>
> vfs_cache_pressure
> --
>
> This percentage value controls the tendency of the kernel to reclaim
> the memory which is used for caching of directory and inode objects.
>
> At the default value of vfs_cache_pressure=100 the kernel will attempt to
> reclaim dentries and inodes at a "fair" rate with respect to pagecache and
> swapcache reclaim.  Decreasing vfs_cache_pressure causes the kernel to
> prefer
> to retain dentry and inode caches. When vfs_cache_pressure=0, the kernel
> will
> never reclaim dentries and inodes due to memory pressure and this can easily
> lead to out-of-memory conditions. Increasing vfs_cache_pressure beyond 100
> causes the kernel to prefer to reclaim dentries and inodes.
>
> Increasing vfs_cache_pressure significantly beyond 100 may have negative
> performance impact. Reclaim code needs to take various locks to find
> freeable
> directory and inode objects. With vfs_cache_pressure=1000, it will look for
> ten times more freeable objects than there are.
>
> Also we've an article for sysadmins which has a section:
>
> 
>
> With GlusterFS, many users with a lot of storage and many small files
> easily end up using a lot of RAM on the server side due to
> 'inode/dentry' caching, leading to decreased performance when the kernel
> keeps crawling through data-structures on a 40GB RAM system. Changing
> this value higher than 100 has helped many users to achieve fair caching
> and more responsiveness from the kernel.
>
> 
>
> Complete article can be found at:
> https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Linux%20Kernel%20Tuning/
>
> regards,
>
>
> On Tue, Sep 5, 2017 at 5:20 PM, Raghavendra Gowdappa 
> wrote:
>>
>> +gluster-devel
>>
>> Ashish just spoke to me about need of GC of inodes due to some state in
>> inode that is being proposed in EC. Hence adding more people to
>> conversation.
>>
>> > > On 4 September 2017 at 12:34, Csaba Henk  wrote:
>> > >
>> > > > I don't know, depends on how sophisticated GC we need/want/can get
>> > > > by. I
>> > > > guess the complexity will be inherent, ie. that of the algorithm
>> > > > chosen
>> > > > and
>> > > > how we address concurrency & performance impacts, but once that's
>> > > > got
>> > > > right
>> > > > the other aspects of implementation won't be hard.
>> > > >
>> > > > Eg. would it be good just to maintain a simple LRU list?
>> > > >
>> >
>> > Yes. I was also thinking of leveraging lru list. We can invalidate first
>> > "n"
>> > inodes from lru list of fuse inode table.
>> >
>> > >
>> > > That might work for starters.
>> > >
>> > > >
>> > > > Csaba
>> > > >
>> > > > On Mon, Sep 4, 2017 at 8:48 AM, Nithya Balachandran
>> > > > 
>> > > > wrote:
>> > > >
>> > > >>
>> > > >>
>> > > >> On 4 September 2017 at 12:14, Csaba Henk  wrote:
>> > > >>
>> > > >>> Basically how I see the fuse invalidate calls as rescuers of
>> > > >>> sanity.
>> > > >>>
>> > > >>> Normally, when you have lot of certain kind of stuff that tends to
>> > > >>> accumulate, the immediate thought is: let's set up some garbage
>> > > >>> collection
>> > > >>> mechanism, that will take care of keeping the accumulation at bay.
>> > > >>> But
>> > > >>> that's what doesn't work with inodes in a naive way, as they are
>> > > >>> referenced
>> > > >>> from kernel, so we have to keep them around until kernel tells us
>> > > >>> it's
>> > > >>> giving up its reference. However, with the fuse invalidate calls
>> > > >>> we can
>> > > >>> take the initiative and instruct the kernel: "hey, kernel, give up
>> > > >>> your
>> > > >>> references to this thing!"
>> > > >>>
>> > > >>> So we are actually free to implement any kind of inode GC in
>> > > >>> glusterfs,
>> > > >>> just have to take care to add the proper callback to
>> > > >>> fuse_invalidate_*
>> > > >>> and
>> > > >>> we are good to go.
>> > > >>>
>> > > >>>
>> > > >> That sounds good and something we need to do in the near future. Is
>> > > >> this
>> > > >> something that is easy to implement?
>> > > >>
>> > > >>
>> > > >>> Csaba
>> > > >>>
>> > > >>> On Mon, Sep 4, 2017 at 

Re: [Gluster-devel] Fuse mounts and inodes

2017-09-05 Thread Raghavendra G
Another parallel effort could be trying to configure the number of
inodes/dentries cached by kernel VFS using /proc/sys/vm interface.

==

vfs_cache_pressure
--

This percentage value controls the tendency of the kernel to reclaim
the memory which is used for caching of directory and inode objects.

At the default value of vfs_cache_pressure=100 the kernel will attempt to
reclaim dentries and inodes at a "fair" rate with respect to pagecache and
swapcache reclaim.  Decreasing vfs_cache_pressure causes the kernel to prefer
to retain dentry and inode caches. When vfs_cache_pressure=0, the kernel will
never reclaim dentries and inodes due to memory pressure and this can easily
lead to out-of-memory conditions. Increasing vfs_cache_pressure beyond 100
causes the kernel to prefer to reclaim dentries and inodes.

Increasing vfs_cache_pressure significantly beyond 100 may have negative
performance impact. Reclaim code needs to take various locks to find freeable
directory and inode objects. With vfs_cache_pressure=1000, it will look for
ten times more freeable objects than there are.

Also we've an article for sysadmins which has a section:



With GlusterFS, many users with a lot of storage and many small files
easily end up using a lot of RAM on the server side due to
'inode/dentry' caching, leading to decreased performance when the kernel
keeps crawling through data-structures on a 40GB RAM system. Changing
this value higher than 100 has helped many users to achieve fair caching
and more responsiveness from the kernel.



Complete article can be found at:
https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Linux%20Kernel%20Tuning/
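For anyone who wants to experiment with this on a client node, here is a
minimal sketch of how the tunable can be inspected and changed (the value 200
is only an illustration, not a tested recommendation, and the sysctl.d file
name is arbitrary):

  # check the current value
  sysctl vm.vfs_cache_pressure

  # raise it for the running system only
  sysctl -w vm.vfs_cache_pressure=200

  # make it persistent across reboots
  echo 'vm.vfs_cache_pressure = 200' >> /etc/sysctl.d/90-gluster-client.conf
  sysctl --system

Keep in mind this is a system-wide knob, so it changes dentry/inode reclaim
behaviour for every filesystem on the box, not just the glusterfs mount.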

regards,


On Tue, Sep 5, 2017 at 5:20 PM, Raghavendra Gowdappa 
wrote:

> +gluster-devel
>
> Ashish just spoke to me about need of GC of inodes due to some state in
> inode that is being proposed in EC. Hence adding more people to
> conversation.
>
> > > On 4 September 2017 at 12:34, Csaba Henk  wrote:
> > >
> > > > I don't know, depends on how sophisticated GC we need/want/can get
> by. I
> > > > guess the complexity will be inherent, ie. that of the algorithm
> chosen
> > > > and
> > > > how we address concurrency & performance impacts, but once that's got
> > > > right
> > > > the other aspects of implementation won't be hard.
> > > >
> > > > Eg. would it be good just to maintain a simple LRU list?
> > > >
> >
> > Yes. I was also thinking of leveraging lru list. We can invalidate first
> "n"
> > inodes from lru list of fuse inode table.
> >
> > >
> > > That might work for starters.
> > >
> > > >
> > > > Csaba
> > > >
> > > > On Mon, Sep 4, 2017 at 8:48 AM, Nithya Balachandran <
> nbala...@redhat.com>
> > > > wrote:
> > > >
> > > >>
> > > >>
> > > >> On 4 September 2017 at 12:14, Csaba Henk  wrote:
> > > >>
> > > >>> Basically how I see the fuse invalidate calls as rescuers of
> sanity.
> > > >>>
> > > >>> Normally, when you have lot of certain kind of stuff that tends to
> > > >>> accumulate, the immediate thought is: let's set up some garbage
> > > >>> collection
> > > >>> mechanism, that will take care of keeping the accumulation at bay.
> But
> > > >>> that's what doesn't work with inodes in a naive way, as they are
> > > >>> referenced
> > > >>> from kernel, so we have to keep them around until kernel tells us
> it's
> > > >>> giving up its reference. However, with the fuse invalidate calls
> we can
> > > >>> take the initiative and instruct the kernel: "hey, kernel, give up
> your
> > > >>> references to this thing!"
> > > >>>
> > > >>> So we are actually free to implement any kind of inode GC in
> glusterfs,
> > > >>> just have to take care to add the proper callback to
> fuse_invalidate_*
> > > >>> and
> > > >>> we are good to go.
> > > >>>
> > > >>>
> > > >> That sounds good and something we need to do in the near future. Is
> this
> > > >> something that is easy to implement?
> > > >>
> > > >>
> > > >>> Csaba
> > > >>>
> > > >>> On Mon, Sep 4, 2017 at 7:00 AM, Nithya Balachandran
> > > >>>  > > >>> > wrote:
> > > >>>
> > > 
> > > 
> > >  On 4 September 2017 at 10:25, Raghavendra Gowdappa
> > >   > >  > wrote:
> > > 
> > > >
> > > >
> > > > - Original Message -
> > > > > From: "Nithya Balachandran" 
> > > > > Sent: Monday, September 4, 2017 10:19:37 AM
> > > > > Subject: Fuse mounts and inodes
> > > > >
> > > > > Hi,
> > > > >
> > > > > One of the reasons for the memory consumption in gluster fuse
> > > > > mounts
> > > > is the
> > > > > number of inodes in the table which are never kicked out.
> > > > >
> > > > > Is there any way to default to an entry-timeout and
> > > > attribute-timeout value
> > > > > while mounting Gluster using Fuse? Say 60s each so those
> 

Re: [Gluster-devel] docs.gluster.org

2017-09-05 Thread Amye Scavarda
On Mon, Sep 4, 2017 at 1:28 AM, Niels de Vos  wrote:
> On Fri, Sep 01, 2017 at 06:21:38PM -0400, Amye Scavarda wrote:
>> On Fri, Sep 1, 2017 at 9:42 AM, Michael Scherer  wrote:
>> > Le vendredi 01 septembre 2017 à 14:02 +0100, Michael Scherer a écrit :
>> >> Le mercredi 30 août 2017 à 12:11 +0530, Nigel Babu a écrit :
>> >> > Hello,
>> >> >
>> >> > To reduce confusion, we've setup docs.gluster.org pointing to
>> >> > gluster.readthedocs.org. Both URLs will continue to work for the
>> >> > forseeable
>> >> > future.
>> >> >
>> >> > Please update any references that you control to point to
>> >> > docs.gluster.org. At
>> >> > some point in the distant future, we will switch to hosting
>> >> > docs.gluster.org on
>> >> > our own servers.
>> >> >
>> >> > RTD will set up a canonical link to docs.gluster.org[1]. Over time,
>> >> > this will
>> >> > change update the results on search engines to docs.gluster.org.
>> >> > This
>> >> > change
>> >> > will reduce confusion we've had with copies of our docs hosted on
>> >> > RTD.
>> >> >
>> >> > [1]: https://docs.readthedocs.io/en/latest/canonical.html
>> >>
>> >> So , seems TLS certificate is wrong, should we correct the link to be
>> >> http for now ?
>> >
>> > So I opened a few PR/review:
>> > https://github.com/gluster/glusterdocs/pull/259
>> >
>> > https://review.gluster.org/#/c/18182/
>> >
>> > https://github.com/gluster/glusterweb/pull/148
>> >
>> >
>> > --
>> > Michael Scherer
>> > Sysadmin, Community Infrastructure and Platform, OSAS
>> >
>> >
>> > ___
>> > Gluster-devel mailing list
>> > Gluster-devel@gluster.org
>> > http://lists.gluster.org/mailman/listinfo/gluster-devel
>>
>> Fair warning, glusterweb is now deprecated as discussed in Community
>> Meeting on 30 Aug. However, github.com/gluster/glusterweb will be used
>> as a bug tracker moving forward.
>
> Could you please replace the contents of the current glusterweb
> repository with a README explaining this? The old files can stay in the
> history so that we can reference them in case something on the new
> website needs to be re-added.
>
> Being able to send pull requests and get contributions from users that
> way was one of the main reasons to move to GitHub. Is that still
> possible with the new site, in a different repository? I guess WordPress
> has some import/export features, but I don't know if those can get
> nicely combined with a repository on GitHub.
>
> Thanks!
> Niels

So we're actually going to move forward in a different direction with
the glusterweb repository once we get everything cleaned up.

We'll still use the GitHub issue queue as a method of tracking
improvements/bugs/feature requests, but I'd like to get us to a place
where we have a WordPress theme that people can contribute to, along with
the code for custom plugins that we've developed. I'll also be looking for
volunteers who can edit content directly.

I'll create a branch outlining what that could look like and we'll
discuss at the next community meeting.
- amye


-- 
Amye Scavarda | a...@redhat.com | Gluster Community Lead
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Release 3.12: Announced 3.12.0 and 3.12.1 in 5 days!

2017-09-05 Thread Shyam Ranganathan

Hi,

Release 3.12.0 has been announced [1] and the tracker bug for 3.12.1 is now
open [2].


3.12.1 will be tagged on the 10th of Sep, which is about 5 days away, so 
do update the tracker bug with any blockers that are identified.


Thanks,
Shyam

[1] Announce: 
http://lists.gluster.org/pipermail/announce/2017-September/82.html


[2] 3.12.1 tracker: 
https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.12.1

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Quota Used Value Incorrect - Fix now or after upgrade

2017-09-05 Thread Matthew B
Apologies - I copied and pasted the wrong ansible output:

matthew@laptop:~/playbooks$ ansible -i hosts gluster-servers[0:6] -u
matthewb --ask-pass -m shell -b --become-method=sudo --ask-become-pass -a
"getfattr --absolute-names -m . -d -e hex
/mnt/raid6-storage/storage/data/projects/MEOPAR | egrep
'^trusted.glusterfs.quota.size'"
SSH password:
SUDO password[defaults to SSH password]:
gluster02 | SUCCESS | rc=0 >>
trusted.glusterfs.quota.size=0x011ecfa56c05cd6d0006d478
trusted.glusterfs.quota.size.1=0x010ad4a452012a03000150fa

gluster05 | SUCCESS | rc=0 >>
trusted.glusterfs.quota.size=0x0033b8e93804cde90006b1a4
trusted.glusterfs.quota.size.1=0x010dca277c01297d00015005

gluster04 | SUCCESS | rc=0 >>
trusted.glusterfs.quota.size=0xff396f3ec004d7eb00068c62
trusted.glusterfs.quota.size.1=0x0106e6724801138f00012fb2

gluster01 | SUCCESS | rc=0 >>
trusted.glusterfs.quota.size=0x003d4d43480576160006afd2
trusted.glusterfs.quota.size.1=0x0133fe211e05d1610006cfd4

gluster03 | SUCCESS | rc=0 >>
trusted.glusterfs.quota.size=0xfd02acabf003599643e2
trusted.glusterfs.quota.size.1=0x0114e20f5e0113b300012fb2

gluster06 | SUCCESS | rc=0 >>
trusted.glusterfs.quota.size=0xff0c98de440536e400068cf2
trusted.glusterfs.quota.size.1=0x013532664e05e73f0006cfd4

gluster07 | SUCCESS | rc=0 >>
trusted.glusterfs.quota.size=0x01108e51140327c60006bf6d

Thanks,
 -Matthew

On Fri, Sep 1, 2017 at 3:22 PM, Matthew B 
wrote:

> Thanks Sanoj,
>
> Now the brick is showing the correct xattrs:
>
> [root@gluster07 ~]# getfattr --absolute-names -m . -d -e hex 
> /mnt/raid6-storage/storage/data/projects/MEOPAR
> # file: /mnt/raid6-storage/storage/data/projects/MEOPAR
> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
> trusted.gfid=0x7209b677f4b94d82a3820733620e6929
> trusted.glusterfs.6f95525a-94d7-4174-bac4-e1a18fe010a2.xtime=0x599f228800088654
> trusted.glusterfs.dht=0x0001b6db6d41db6db6ee
> trusted.glusterfs.quota.d5a5ecda-7511-4bbb-9b4c-4fcc84e3e1da.contri=0x01108e51140327c60006bf6d
> trusted.glusterfs.quota.dirty=0x3000
> trusted.glusterfs.quota.limit-set=0x0880
> trusted.glusterfs.quota.size=0x01108e51140327c60006bf6d
>
>
> However, the quota listing still shows the old (incorrect) value:
>
>
> [root@gluster07 ~]# gluster volume quota storage list | egrep "MEOPAR "
> /data/projects/MEOPAR  8.5TB 80%(6.8TB)  16384.0PB  10.6TB  No   No
>
>
> I've checked on each of the bricks and they look fine now - is there any
> way to reflect the new value in the quota itself?
>
> matthew@laptop:~/playbooks$ ansible -i hosts gluster-servers[0:6] -u matthewb 
> --ask-pass -m shell -b --become-method=sudo --ask-become-pass -a "getfattr 
> --absolute-names -m . -d -e hex 
> /mnt/raid6-storage/storage/data/projects/comp_support | egrep 
> '^trusted.glusterfs.quota.size\=' | sed 's/trusted.glusterfs.quota.size\=//' 
> | cut -c 1-18 | xargs printf '%d\n'"
> SSH password:
> SUDO password[defaults to SSH password]:
> gluster05 | SUCCESS | rc=0 >>
> 567293059584
>
> gluster04 | SUCCESS | rc=0 >>
> 510784812032
>
> gluster03 | SUCCESS | rc=0 >>
> 939742334464
>
> gluster01 | SUCCESS | rc=0 >>
> 98688324096
>
> gluster02 | SUCCESS | rc=0 >>
> 61449348096
>
> gluster07 | SUCCESS | rc=0 >>
> 29252869632
>
> gluster06 | SUCCESS | rc=0 >>
> 31899410944
>
>
> Thanks,
>  -Matthew
>
> On Fri, Sep 1, 2017 at 4:33 AM, Sanoj Unnikrishnan 
> wrote:
>
>> Hi Mathew,
>>
>> The other option is to explicitly remove the size and contri xattr at the
>> brick path and then do a stat from the mount point.
>>
>>  #setfattr -x trusted.glusterfs.quota.00
>> 00----0001.contri.1 
>>  #setfattr -x trusted.glusterfs.quota.size.1  
>>  #stat 
>>
>> Stat would heal the size and the contri xattr and the dirty xattr would
>> heal only on the next operation on the directory.
>>
>> After this you could set dirty bit and do  a stat again.
>>
>> setxattr -n trusted.glusterfs.quota.dirty -v 0x3100 
>>
>> stat 
>>
>>
>>
>> Regards,
>> Sanoj
>>
>> On Thu, Aug 31, 2017 at 9:12 PM, Matthew B > om> wrote:
>>
>>> Hi Raghavendra,
>>>
>>> I didn't get a chance to implement your suggestions, however it looks
>>> like the dirty bit is no longer set - so presumably the quota should have
>>> been updated, however the quota.size attribute is still incorrect though
>>> slightly different than before. Any other suggestions?
>>>
>>> [root@gluster07 ~]# getfattr --absolute-names -m . -d -e hex
>>> 

Re: [Gluster-devel] Quota Used Value Incorrect - Fix now or after upgrade

2017-09-05 Thread Matthew B
Thanks Sanoj,

Now the brick is showing the correct xattrs:

[root@gluster07 ~]# getfattr --absolute-names -m . -d -e hex
/mnt/raid6-storage/storage/data/projects/MEOPAR
# file: /mnt/raid6-storage/storage/data/projects/MEOPAR
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x7209b677f4b94d82a3820733620e6929
trusted.glusterfs.6f95525a-94d7-4174-bac4-e1a18fe010a2.xtime=0x599f228800088654
trusted.glusterfs.dht=0x0001b6db6d41db6db6ee
trusted.glusterfs.quota.d5a5ecda-7511-4bbb-9b4c-4fcc84e3e1da.contri=0x01108e51140327c60006bf6d
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.limit-set=0x0880
trusted.glusterfs.quota.size=0x01108e51140327c60006bf6d


However, the quota listing still shows the old (incorrect) value:


[root@gluster07 ~]# gluster volume quota storage list | egrep "MEOPAR "
/data/projects/MEOPAR  8.5TB 80%(6.8TB)  16384.0PB  10.6TB  No   No


I've checked on each of the bricks and they look fine now - is there any
way to reflect the new value in the quota itself?

matthew@laptop:~/playbooks$ ansible -i hosts gluster-servers[0:6] -u
matthewb --ask-pass -m shell -b --become-method=sudo --ask-become-pass
-a "getfattr --absolute-names -m . -d -e hex
/mnt/raid6-storage/storage/data/projects/comp_support | egrep
'^trusted.glusterfs.quota.size\=' | sed
's/trusted.glusterfs.quota.size\=//' | cut -c 1-18 | xargs printf
'%d\n'"
SSH password:
SUDO password[defaults to SSH password]:
gluster05 | SUCCESS | rc=0 >>
567293059584

gluster04 | SUCCESS | rc=0 >>
510784812032

gluster03 | SUCCESS | rc=0 >>
939742334464

gluster01 | SUCCESS | rc=0 >>
98688324096

gluster02 | SUCCESS | rc=0 >>
61449348096

gluster07 | SUCCESS | rc=0 >>
29252869632

gluster06 | SUCCESS | rc=0 >>
31899410944
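As an aside, the one-liner above only decodes the first 64-bit field of the
xattr. If I understand the on-disk format correctly, on newer quota versions
trusted.glusterfs.quota.size(.N) packs three big-endian 64-bit values -- used
bytes, file count and directory count -- so a small bash helper like the
sketch below can print all three (the long hex string is a made-up example,
not a value from these bricks):

  decode_quota_size() {
      # strip the leading 0x, then slice the 48 hex digits into 3 x 16
      local hex=${1#0x}
      printf 'size=%d bytes, files=%d, dirs=%d\n' \
          "$((16#${hex:0:16}))" "$((16#${hex:16:16}))" "$((16#${hex:32:16}))"
  }
  decode_quota_size 0x00000000400000000000000000000100000000000000000a
  # -> size=1073741824 bytes, files=256, dirs=10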


Thanks,
 -Matthew

On Fri, Sep 1, 2017 at 4:33 AM, Sanoj Unnikrishnan 
wrote:

> Hi Mathew,
>
> The other option is to explicitly remove the size and contri xattr at the
> brick path and then do a stat from the mount point.
>
>  #setfattr -x 
> trusted.glusterfs.quota.----0001.contri.1
> 
>  #setfattr -x trusted.glusterfs.quota.size.1  
>  #stat 
>
> Stat would heal the size and the contri xattr and the dirty xattr would
> heal only on the next operation on the directory.
>
> After this you could set dirty bit and do  a stat again.
>
> setxattr -n trusted.glusterfs.quota.dirty -v 0x3100 
>
> stat 
>
>
>
> Regards,
> Sanoj
>
> On Thu, Aug 31, 2017 at 9:12 PM, Matthew B  com> wrote:
>
>> Hi Raghavendra,
>>
>> I didn't get a chance to implement your suggestions, however it looks
>> like the dirty bit is no longer set - so presumably the quota should have
>> been updated, however the quota.size attribute is still incorrect though
>> slightly different than before. Any other suggestions?
>>
>> [root@gluster07 ~]# getfattr --absolute-names -m . -d -e hex
>> /mnt/raid6-storage/storage/data/projects/MEOPAR
>> # file: /mnt/raid6-storage/storage/data/projects/MEOPAR
>> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6
>> c6162656c65645f743a733000
>> trusted.gfid=0x7209b677f4b94d82a3820733620e6929
>> trusted.glusterfs.6f95525a-94d7-4174-bac4-e1a18fe010a2.xtime
>> =0x599f228800088654
>> trusted.glusterfs.dht=0x0001b6db6d41db6db6ee
>> trusted.glusterfs.quota.d5a5ecda-7511-4bbb-9b4c-4fcc84e3e1da
>> .contri=0xfa3d7c28f60a9d0a0005fd2f
>> trusted.glusterfs.quota.dirty=0x3000
>> trusted.glusterfs.quota.limit-set=0x0880
>> trusted.glusterfs.quota.size=0xfa3d7c28f60a9
>> d0a0005fd2f
>>
>> Thanks,
>> -Matthew
>>
>> On Mon, Aug 28, 2017 at 8:05 PM, Raghavendra Gowdappa <
>> rgowd...@redhat.com> wrote:
>>
>>>
>>>
>>> - Original Message -
>>> > From: "Matthew B" 
>>> > To: "Sanoj Unnikrishnan" 
>>> > Cc: "Raghavendra Gowdappa" , "Gluster Devel" <
>>> gluster-devel@gluster.org>
>>> > Sent: Monday, August 28, 2017 9:33:25 PM
>>> > Subject: Re: [Gluster-devel] Quota Used Value Incorrect - Fix now or
>>> after upgrade
>>> >
>>> > Hi Sanoj,
>>> >
>>> > Thank you for the information - I have applied the changes you
>>> specified
>>> > above - but I haven't seen any changes in the xattrs on the directory
>>> after
>>> > about 15 minutes:
>>>
>>> I think stat is served from cache - either gluster's md-cache or kernel
>>> attribute cache. For healing to happen we need to force a lookup (which we
>>> had hoped would be issued as part of stat cmd) and this lookup has to reach
>>> marker xlator loaded on bricks. To make sure a lookup on the directory
>>> reaches marker we need to:
>>>
>>> 1. Turn off kernel attribute and entry cache (using --entrytimeout=0 and
>>> --attribute-timeout=0 as options to 

Re: [Gluster-devel] Quota Used Value Incorrect - Fix now or after upgrade

2017-09-05 Thread Matthew B
Hi Raghavendra,

I didn't get a chance to implement your suggestions; however, it looks like
the dirty bit is no longer set, so presumably the quota should have been
updated. The quota.size attribute is still incorrect, though slightly
different than before. Any other suggestions?

[root@gluster07 ~]# getfattr --absolute-names -m . -d -e hex
/mnt/raid6-storage/storage/data/projects/MEOPAR
# file: /mnt/raid6-storage/storage/data/projects/MEOPAR
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x7209b677f4b94d82a3820733620e6929
trusted.glusterfs.6f95525a-94d7-4174-bac4-e1a18fe010a2.xtime=0x599f228800088654
trusted.glusterfs.dht=0x0001b6db6d41db6db6ee
trusted.glusterfs.quota.d5a5ecda-7511-4bbb-9b4c-4fcc84e3e1da.contri=0xfa3d7c28f60a9d0a0005fd2f
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.limit-set=0x0880
trusted.glusterfs.quota.size=0xfa3d7c28f60a9d0a0005fd2f

Thanks,
-Matthew

On Mon, Aug 28, 2017 at 8:05 PM, Raghavendra Gowdappa 
wrote:

>
>
> - Original Message -
> > From: "Matthew B" 
> > To: "Sanoj Unnikrishnan" 
> > Cc: "Raghavendra Gowdappa" , "Gluster Devel" <
> gluster-devel@gluster.org>
> > Sent: Monday, August 28, 2017 9:33:25 PM
> > Subject: Re: [Gluster-devel] Quota Used Value Incorrect - Fix now or
> after upgrade
> >
> > Hi Sanoj,
> >
> > Thank you for the information - I have applied the changes you specified
> > above - but I haven't seen any changes in the xattrs on the directory
> after
> > about 15 minutes:
>
> I think stat is served from cache - either gluster's md-cache or kernel
> attribute cache. For healing to happen we need to force a lookup (which we
> had hoped would be issued as part of stat cmd) and this lookup has to reach
> marker xlator loaded on bricks. To make sure a lookup on the directory
> reaches marker we need to:
>
> 1. Turn off kernel attribute and entry cache (using --entrytimeout=0 and
> --attribute-timeout=0 as options to glusterfs while mounting)
> 2. Turn off md-cache using gluster cli (gluster volume set
> performance.md-cache  off)
> 3. Turn off readdirplus in the entire stack [1]
>
> Once the above steps are done I guess doing a stat results in a lookup on
> the directory witnessed by marker. Once the issue is fixed you can undo the
> above three steps so that performance is not affected in your setup.
>
> [1] http://nongnu.13855.n7.nabble.com/Turning-off-readdirp-in-
> the-entire-stack-on-fuse-mount-td220297.html
>
> >
> > [root@gluster07 ~]# setfattr -n trusted.glusterfs.quota.dirty -v 0x3100
> > /mnt/raid6-storage/storage/data/projects/MEOPAR/
> >
> > [root@gluster07 ~]# stat /mnt/raid6-storage/storage/data/projects/MEOPAR
> >
> > [root@gluster07 ~]# getfattr --absolute-names -m . -d -e hex
> > /mnt/raid6-storage/storage/data/projects/MEOPAR
> > # file: /mnt/raid6-storage/storage/data/projects/MEOPAR
> > security.selinux=0x73797374656d5f753a6f626a6563
> 745f723a756e6c6162656c65645f743a733000
> > trusted.gfid=0x7209b677f4b94d82a3820733620e6929
> > trusted.glusterfs.6f95525a-94d7-4174-bac4-e1a18fe010a2.
> xtime=0x599f228800088654
> > trusted.glusterfs.dht=0x0001b6db6d41db6db6ee
> > trusted.glusterfs.quota.d5a5ecda-7511-4bbb-9b4c-4fcc84e3e1da.contri=
> 0xfa3d7c1ba60a9ccb0005fd2f
> > trusted.glusterfs.quota.dirty=0x3100
> > trusted.glusterfs.quota.limit-set=0x0880
> > trusted.glusterfs.quota.size=0xfa3d7c1ba60a
> 9ccb0005fd2f
> >
> > [root@gluster07 ~]# gluster volume status storage
> > Status of volume: storage
> > Gluster process                               TCP Port  RDMA Port  Online  Pid
> > --------------------------------------------------------------------------------
> > Brick 10.0.231.50:/mnt/raid6-storage/storage  49159     0          Y       2160
> > Brick 10.0.231.51:/mnt/raid6-storage/storage  49153     0          Y       16037
> > Brick 10.0.231.52:/mnt/raid6-storage/storage  49159     0          Y       2298
> > Brick 10.0.231.53:/mnt/raid6-storage/storage  49154     0          Y       9038
> > Brick 10.0.231.54:/mnt/raid6-storage/storage  49153     0          Y       32284
> > Brick 10.0.231.55:/mnt/raid6-storage/storage  49153     0          Y       14840
> > Brick 10.0.231.56:/mnt/raid6-storage/storage  49152     0          Y       29389
> > NFS Server on localhost                       2049      0          Y       29421
> > Quota Daemon on localhost                     N/A       N/A        Y
> > 
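For reference, the three steps Raghavendra describes above could look roughly
like the sketch below on a throwaway client (volume name "storage", server
"gluster01" and the mount point are placeholders; on some versions the
md-cache toggle is exposed as performance.stat-prefetch rather than
performance.md-cache, and the linked thread covers the extra volume-level
readdirp options, so verify the exact keys against your installed version):

  # steps 1 and 3: mount with kernel entry/attribute caching and readdirplus
  # disabled (option names from `glusterfs --help`)
  glusterfs --volfile-server=gluster01 --volfile-id=storage \
            --entry-timeout=0 --attribute-timeout=0 --use-readdirp=no \
            /mnt/storage-heal

  # step 2: turn off md-cache for the volume while healing
  gluster volume set storage performance.md-cache off

  # force a fresh lookup on the directory so marker can heal the quota xattrs
  stat /mnt/storage-heal/data/projects/MEOPAR

  # afterwards, undo the volume option and remove the helper mount
  gluster volume reset storage performance.md-cache
  umount /mnt/storage-heal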

[Gluster-devel] [RFC] Automatic discovery of Gluster Storage Servers

2017-09-05 Thread Niels de Vos
At one point I would like to be able to install Gluster Storage Servers
and have them automatically detected. I'd like to be able to place them
in a Trusted Storage Pool after detection (peer probe). For this, I
started a gluster-zeroconf project on GitHub [1]. It currently is in a
proof-of-concept state, and I would like some feedback on whether this
approach is usable for others besides me.

The current functionality is like this:

- install gluster-zeroconf-avahi on a number of storage servers
- install python2-gluster-zeroconf on at least one of the storage
  servers
- run `gluster-discovery` from the python2-gluster-zeroconf package
- all discovered storage servers will be listed
- run `gluster-discovery probe` to peer probe all discovered servers

A few more details are in the README at [2].
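In other words, once those packages are on the nodes, the expected flow is
simply the following (output format and hostnames are of course specific to
the environment, and may still change while this is a proof of concept):

  # on the node that has python2-gluster-zeroconf installed
  gluster-discovery          # list the storage servers found via Avahi/mDNS
  gluster-discovery probe    # peer probe each discovered server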

Future enhancements could include:

- plugin for GlusterD 2
- detection of unused local disks (spare bricks)
- register newly detected systems at Heketi

Feedback much appreciated!
Niels


1. https://github.com/nixpanic/gluster-zeroconf
2. https://github.com/nixpanic/gluster-zeroconf/blob/master/README.md


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Coverity covscan for 2017-09-05-c0406501 (master branch)

2017-09-05 Thread staticanalysis
GlusterFS Coverity covscan results are available from
http://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2017-09-05-c0406501
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Fuse mounts and inodes

2017-09-05 Thread Raghavendra Gowdappa
+gluster-devel

Ashish just spoke to me about the need for GC of inodes due to some state in the
inode that is being proposed in EC. Hence adding more people to the conversation.

> > On 4 September 2017 at 12:34, Csaba Henk  wrote:
> > 
> > > I don't know, depends on how sophisticated GC we need/want/can get by. I
> > > guess the complexity will be inherent, ie. that of the algorithm chosen
> > > and
> > > how we address concurrency & performance impacts, but once that's got
> > > right
> > > the other aspects of implementation won't be hard.
> > >
> > > Eg. would it be good just to maintain a simple LRU list?
> > >
> 
> Yes. I was also thinking of leveraging lru list. We can invalidate first "n"
> inodes from lru list of fuse inode table.
> 
> > 
> > That might work for starters.
> > 
> > >
> > > Csaba
> > >
> > > On Mon, Sep 4, 2017 at 8:48 AM, Nithya Balachandran 
> > > wrote:
> > >
> > >>
> > >>
> > >> On 4 September 2017 at 12:14, Csaba Henk  wrote:
> > >>
> > >>> Basically how I see the fuse invalidate calls as rescuers of sanity.
> > >>>
> > >>> Normally, when you have lot of certain kind of stuff that tends to
> > >>> accumulate, the immediate thought is: let's set up some garbage
> > >>> collection
> > >>> mechanism, that will take care of keeping the accumulation at bay. But
> > >>> that's what doesn't work with inodes in a naive way, as they are
> > >>> referenced
> > >>> from kernel, so we have to keep them around until kernel tells us it's
> > >>> giving up its reference. However, with the fuse invalidate calls we can
> > >>> take the initiative and instruct the kernel: "hey, kernel, give up your
> > >>> references to this thing!"
> > >>>
> > >>> So we are actually free to implement any kind of inode GC in glusterfs,
> > >>> just have to take care to add the proper callback to fuse_invalidate_*
> > >>> and
> > >>> we are good to go.
> > >>>
> > >>>
> > >> That sounds good and something we need to do in the near future. Is this
> > >> something that is easy to implement?
> > >>
> > >>
> > >>> Csaba
> > >>>
> > >>> On Mon, Sep 4, 2017 at 7:00 AM, Nithya Balachandran
> > >>>  > >>> > wrote:
> > >>>
> > 
> > 
> >  On 4 September 2017 at 10:25, Raghavendra Gowdappa
> >   >  > wrote:
> > 
> > >
> > >
> > > - Original Message -
> > > > From: "Nithya Balachandran" 
> > > > Sent: Monday, September 4, 2017 10:19:37 AM
> > > > Subject: Fuse mounts and inodes
> > > >
> > > > Hi,
> > > >
> > > > One of the reasons for the memory consumption in gluster fuse
> > > > mounts
> > > is the
> > > > number of inodes in the table which are never kicked out.
> > > >
> > > > Is there any way to default to an entry-timeout and
> > > attribute-timeout value
> > > > while mounting Gluster using Fuse? Say 60s each so those entries
> > > will be
> > > > purged periodically?
> > >
> > > Once the entry timeouts, inodes won't be purged. Kernel sends a
> > > lookup
> > > to revalidate the mapping of path to inode. AFAIK, reverse
> > > invalidation
> > > (see inode_invalidate) is the only way to make kernel forget
> > > inodes/attributes.
> > >
> > > Is that something that can be done from the Fuse mount ? Or is this
> >  something that needs to be added to Fuse?
> > 
> > > >
> > > > Regards,
> > > > Nithya
> > > >
> > >
> > 
> > 
> > >>>
> > >>
> > >
> > 
> 
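For completeness: the entry/attribute timeouts discussed above can already be
set per mount today -- they just won't make the kernel forget inodes, as
Raghavendra explains, since expiry only triggers a revalidating lookup. A
quick sketch, with host, volume and mount point as placeholders:

  mount -t glusterfs -o entry-timeout=60,attribute-timeout=60 \
        server1:/myvol /mnt/myvol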
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Quota Used Value Incorrect - Fix now or after upgrade

2017-09-05 Thread Sanoj Unnikrishnan
Hi Matthew,

In order to do the listing we use an auxiliary mount; it could be that this is
returning cached values.
So please try the following.

1) unmount the auxiliary mount for the volume (it will have "--client-pid -5"
in its command line)
.. /var/log/glusterfs/quota-mount-xyz.log -p
/var/run/gluster/xyz.pid --client-pid -5 
2) do a quota list again
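A rough illustration of those two steps (volume name "storage" is a
placeholder; take the actual mount point from the last argument of the
process found in step 1):

  # 1) find the auxiliary quota mount -- the glusterfs process started with
  #    "--client-pid -5" -- and unmount it
  ps ax | grep '[-]-client-pid -5'
  umount /the/mount/point/printed/above

  # 2) run the listing again; the CLI spawns a fresh auxiliary mount
  gluster volume quota storage list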

Regards,
Sanoj


On Sat, Sep 2, 2017 at 3:55 AM, Matthew B 
wrote:

> Apologies - I copied and pasted the wrong ansible output:
>
> matthew@laptop:~/playbooks$ ansible -i hosts gluster-servers[0:6] -u
> matthewb --ask-pass -m shell -b --become-method=sudo --ask-become-pass -a
> "getfattr --absolute-names -m . -d -e hex 
> /mnt/raid6-storage/storage/data/projects/MEOPAR
> | egrep '^trusted.glusterfs.quota.size'"
> SSH password:
> SUDO password[defaults to SSH password]:
> gluster02 | SUCCESS | rc=0 >>
> trusted.glusterfs.quota.size=0x011ecfa56c05
> cd6d0006d478
> trusted.glusterfs.quota.size.1=0x010ad4a45201
> 2a03000150fa
>
> gluster05 | SUCCESS | rc=0 >>
> trusted.glusterfs.quota.size=0x0033b8e93804
> cde90006b1a4
> trusted.glusterfs.quota.size.1=0x010dca277c01
> 297d00015005
>
> gluster04 | SUCCESS | rc=0 >>
> trusted.glusterfs.quota.size=0xff396f3ec004
> d7eb00068c62
> trusted.glusterfs.quota.size.1=0x0106e6724801
> 138f00012fb2
>
> gluster01 | SUCCESS | rc=0 >>
> trusted.glusterfs.quota.size=0x003d4d434805
> 76160006afd2
> trusted.glusterfs.quota.size.1=0x0133fe211e05
> d1610006cfd4
>
> gluster03 | SUCCESS | rc=0 >>
> trusted.glusterfs.quota.size=0xfd02acabf003
> 599643e2
> trusted.glusterfs.quota.size.1=0x0114e20f5e01
> 13b300012fb2
>
> gluster06 | SUCCESS | rc=0 >>
> trusted.glusterfs.quota.size=0xff0c98de4405
> 36e400068cf2
> trusted.glusterfs.quota.size.1=0x013532664e05
> e73f0006cfd4
>
> gluster07 | SUCCESS | rc=0 >>
> trusted.glusterfs.quota.size=0x01108e511403
> 27c60006bf6d
>
> Thanks,
>  -Matthew
>
> On Fri, Sep 1, 2017 at 3:22 PM, Matthew B  > wrote:
>
>> Thanks Sanoj,
>>
>> Now the brick is showing the correct xattrs:
>>
>> [root@gluster07 ~]# getfattr --absolute-names -m . -d -e hex 
>> /mnt/raid6-storage/storage/data/projects/MEOPAR
>> # file: /mnt/raid6-storage/storage/data/projects/MEOPAR
>> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
>> trusted.gfid=0x7209b677f4b94d82a3820733620e6929
>> trusted.glusterfs.6f95525a-94d7-4174-bac4-e1a18fe010a2.xtime=0x599f228800088654
>> trusted.glusterfs.dht=0x0001b6db6d41db6db6ee
>> trusted.glusterfs.quota.d5a5ecda-7511-4bbb-9b4c-4fcc84e3e1da.contri=0x01108e51140327c60006bf6d
>> trusted.glusterfs.quota.dirty=0x3000
>> trusted.glusterfs.quota.limit-set=0x0880
>> trusted.glusterfs.quota.size=0x01108e51140327c60006bf6d
>>
>>
>> However, the quota listing still shows the old (incorrect) value:
>>
>>
>> [root@gluster07 ~]# gluster volume quota storage list | egrep "MEOPAR "
>> /data/projects/MEOPAR  8.5TB 80%(6.8TB)  16384.0PB  10.6TB  No   No
>>
>>
>> I've checked on each of the bricks and they look fine now - is there any
>> way to reflect the new value in the quota itself?
>>
>> matthew@laptop:~/playbooks$ ansible -i hosts gluster-servers[0:6] -u 
>> matthewb --ask-pass -m shell -b --become-method=sudo --ask-become-pass -a 
>> "getfattr --absolute-names -m . -d -e hex 
>> /mnt/raid6-storage/storage/data/projects/comp_support | egrep 
>> '^trusted.glusterfs.quota.size\=' | sed 's/trusted.glusterfs.quota.size\=//' 
>> | cut -c 1-18 | xargs printf '%d\n'"
>> SSH password:
>> SUDO password[defaults to SSH password]:
>> gluster05 | SUCCESS | rc=0 >>
>> 567293059584
>>
>> gluster04 | SUCCESS | rc=0 >>
>> 510784812032
>>
>> gluster03 | SUCCESS | rc=0 >>
>> 939742334464
>>
>> gluster01 | SUCCESS | rc=0 >>
>> 98688324096
>>
>> gluster02 | SUCCESS | rc=0 >>
>> 61449348096
>>
>> gluster07 | SUCCESS | rc=0 >>
>> 29252869632
>>
>> gluster06 | SUCCESS | rc=0 >>
>> 31899410944
>>
>>
>> Thanks,
>>  -Matthew
>>
>> On Fri, Sep 1, 2017 at 4:33 AM, Sanoj Unnikrishnan 
>> wrote:
>>
>>> Hi Mathew,
>>>
>>> The other option is to explicitly remove the size and contri xattr at
>>> the brick path and then do a stat from the mount point.
>>>
>>>  #setfattr -x trusted.glusterfs.quota.00
>>> 00----0001.contri.1 
>>>  #setfattr -x trusted.glusterfs.quota.size.1  
>>>  #stat 
>>>
>>> Stat would heal the size and the contri xattr and the dirty xattr would
>>> heal only on the next operation on the directory.
>>>
>>> After this you 

[Gluster-devel] GD2 demo

2017-09-05 Thread Kaushal M
Hi all,

We had a GD2 demo in the Red Hat Bangalore office, aimed mainly at
developers. The demo was recorded and is available at [1]. This
isn't the best possible demo, but it should give an idea of how
integration with GD2 will happen. Questions and comments are welcome.

~kaushal

[1]: https://bluejeans.com/s/z970R
Requires flash to view. If flash isn't possible, the video can be
downloaded for viewing as well.
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Call for help: Updating xlator options for GD2 (round one)

2017-09-05 Thread Kaushal M
We are beginning the integration of GlusterFS and GD2. The first part
of this process is to update xlator options to include some more
information that GD2 requires.

I've written down a guide to help with this process in a hackmd
document [1]. We will using this document to track progress with the
changes as well.

Please follow the guidelines in [1] and do your changes. As you do your
changes, also keep updating the document (editable link [2]).

If there are questions or comments, let us know here or in the document.

Thanks,
Kaushal

[1]: https://hackmd.io/s/Hy87Y2oYW
[2]:https://hackmd.io/IYDgrA7AbFAMAsBaATBAnAI0fDBGMiGAZlFgCYCmEAxsGchbAMxVA===?both
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel