Re: [Gluster-devel] Spurious failures

2015-09-24 Thread Michael Scherer
On Thursday, 24 September 2015 at 06:50 -0400, Kotresh Hiremath Ravishankar
wrote:
> >>> Ok, this definitely requires some tests and thoughts. Does it only use
> >>> IPv4 too?
> >>> (I guess yes, since IPv6 is removed from the Rackspace build slaves)
>
> Yes!
> 
> Could we know when these settings can be done on all Linux slave machines?
> If it takes some time, we should consider moving all geo-rep test cases
> under bad tests till then.

I will do that this afternoon, now that I have a clear idea of what needs
to be done.
(I already pushed the path change.)

-- 
Michael Scherer
Sysadmin, Community Infrastructure and Platform, OSAS




signature.asc
Description: This is a digitally signed message part
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Spurious failures

2015-09-24 Thread Kotresh Hiremath Ravishankar
>>> Ok, this definitely requires some tests and thoughts. Does it only use
>>> IPv4 too?
>>> (I guess yes, since IPv6 is removed from the Rackspace build slaves)
   
Yes!

Could we know when these settings can be done on all Linux slave machines?
If it takes some time, we should consider moving all geo-rep test cases under
bad tests till then.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Michael Scherer" 
> To: "Kotresh Hiremath Ravishankar" 
> Cc: "Krutika Dhananjay" , "Atin Mukherjee" 
> , "Gaurav Garg"
> , "Aravinda" , "Gluster Devel" 
> 
> Sent: Thursday, 24 September, 2015 1:18:16 PM
> Subject: Re: Spurious failures
> 
> On Thursday, 24 September 2015 at 02:24 -0400, Kotresh Hiremath Ravishankar
> wrote:
> > Hi,
> > 
> > >>> So, is it ok if I restrict that to be used only on 127.0.0.1?
> > I think not; test cases use 'H0' to create volumes:
> >  H0=${H0:=`hostname`};
> > Geo-rep expects passwordless SSH to 'H0'.
> >  
> 
> Ok, this definitely requires some tests and thoughts. Does it only use
> IPv4 too?
> (I guess yes, since IPv6 is removed from the Rackspace build slaves)
> --
> Michael Scherer
> Sysadmin, Community Infrastructure and Platform, OSAS
> 
> 
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Spurious failures

2015-09-24 Thread Kotresh Hiremath Ravishankar
Thank you :) Also, please check that the script I had given passes on all machines.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Michael Scherer" 
> To: "Kotresh Hiremath Ravishankar" 
> Cc: "Krutika Dhananjay" , "Atin Mukherjee" 
> , "Gaurav Garg"
> , "Aravinda" , "Gluster Devel" 
> 
> Sent: Thursday, 24 September, 2015 5:00:43 PM
> Subject: Re: Spurious failures
> 
> On Thursday, 24 September 2015 at 06:50 -0400, Kotresh Hiremath Ravishankar
> wrote:
> > >>> Ok, this definitely requires some tests and thoughts. Does it only use
> > >>> IPv4 too?
> > >>> (I guess yes, since IPv6 is removed from the Rackspace build slaves)
> >
> > Yes!
> > 
> > Could we know when these settings can be done on all Linux slave
> > machines?
> > If it takes some time, we should consider moving all geo-rep test cases
> > under bad tests till then.
> 
> I will do that this afternoon, now that I have a clear idea of what needs
> to be done.
> (I already pushed the path change.)
> 
> --
> Michael Scherer
> Sysadmin, Community Infrastructure and Platform, OSAS
> 
> 
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Spurious failures

2015-09-24 Thread Kotresh Hiremath Ravishankar
Hi,

>>> So, is it ok if I restrict that to be used only on 127.0.0.1?
I think not; test cases use 'H0' to create volumes:
 H0=${H0:=`hostname`};
Geo-rep expects passwordless SSH to 'H0'.
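
A quick way to verify the requirement on a slave is a non-interactive SSH
attempt; the snippet below is only an illustrative sketch, not the
verification script mentioned elsewhere in this thread:

    # illustrative check only:
    # confirm that root can reach H0 over SSH without a password prompt
    H0=${H0:=`hostname`}
    ssh -o BatchMode=yes root@"$H0" true \
        && echo "passwordless SSH to $H0 works" \
        || echo "passwordless SSH to $H0 is NOT set up"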
 

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Michael Scherer" 
> To: "Kotresh Hiremath Ravishankar" 
> Cc: "Krutika Dhananjay" , "Atin Mukherjee" 
> , "Gaurav Garg"
> , "Aravinda" , "Gluster Devel" 
> 
> Sent: Wednesday, 23 September, 2015 5:05:58 PM
> Subject: Re: Spurious failures
> 
> On Wednesday, 23 September 2015 at 06:24 -0400, Kotresh Hiremath
> Ravishankar wrote:
> > Hi Michael,
> > 
> > Please find my replies below.
> > 
> > >>> Root login using a password should be disabled, so no. If that's still
> > >>> working and people use it, that's going to change soon; too many
> > >>> problems with it.
> > 
> >   Ok
> > 
> > >>>Can you be more explicit on where should the user come from so I can
> > >>>properly integrate that ?
> > 
> >   It's just passwordless SSH from root to root on the same host.
> >   1. Generate ssh key:
> > #ssh-keygen
> >   2. Add it to /root/.ssh/authorized_keys
> > #ssh-copy-id -i  root@host
> > 
> >   Requirement from geo-replication:
> > 'ssh root@host' should not ask for a password
> 
> So, is it ok if I restrict that to be used only on 127.0.0.1?
> 
> --
> Michael Scherer
> Sysadmin, Community Infrastructure and Platform, OSAS
> 
> 
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Memory leak in GlusterFS FUSE client

2015-09-24 Thread Oleksandr Natalenko
In our GlusterFS deployment we've encountered something that looks like a
memory leak in the GlusterFS FUSE client.


We use a replicated (×2) GlusterFS volume to store mail (Exim+Dovecot,
Maildir format). Here are the inode stats for both bricks and the mountpoint:


===
Brick 1 (Server 1):

Filesystem                         Inodes     IUsed     IFree      IUse%  Mounted on
/dev/mapper/vg_vd1_misc-lv08_mail  578768144  10954918  567813226  2%     /bricks/r6sdLV08_vd1_mail

Brick 2 (Server 2):

Filesystem                         Inodes     IUsed     IFree      IUse%  Mounted on
/dev/mapper/vg_vd0_misc-lv07_mail  578767984  10954913  567813071  2%     /bricks/r6sdLV07_vd0_mail

Mountpoint (Server 3):

Filesystem          Inodes     IUsed     IFree      IUse%  Mounted on
glusterfs.xxx:mail  578767760  10954915  567812845  2%     /var/spool/mail/virtual
===

The glusterfs.xxx domain has two A records, pointing to Server 1 and Server 2.

Here is volume info:

===
Volume Name: mail
Type: Replicate
Volume ID: f564e85c-7aa6-4170-9417-1f501aa98cd2
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: server1.xxx:/bricks/r6sdLV08_vd1_mail/mail
Brick2: server2.xxx:/bricks/r6sdLV07_vd0_mail/mail
Options Reconfigured:
nfs.rpc-auth-allow: 1.2.4.0/24,4.5.6.0/24
features.cache-invalidation-timeout: 10
performance.stat-prefetch: off
performance.quick-read: on
performance.read-ahead: off
performance.flush-behind: on
performance.write-behind: on
performance.io-thread-count: 4
performance.cache-max-file-size: 1048576
performance.cache-size: 67108864
performance.readdir-ahead: off
===

Soon after mounting and starting Exim/Dovecot, the glusterfs client
process begins to consume a huge amount of RAM:


===
user@server3 ~$ ps aux | grep glusterfs | grep mail
root 28895 14.4 15.0 15510324 14908868 ?   Ssl  Sep03 4310:05 /usr/sbin/glusterfs --fopen-keep-cache --direct-io-mode=disable --volfile-server=glusterfs.xxx --volfile-id=mail /var/spool/mail/virtual

===

That is, ~15 GiB of RAM.
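
The growth is easy to watch by polling the RSS of the mount process; a
trivial sketch (the PID below is just the one from the ps output above,
substitute your own):

    # print the resident set size (KiB) of the glusterfs mount process once a minute
    PID=28895    # taken from the ps output above
    while sleep 60; do
        printf '%s %s KiB\n' "$(date +%T)" "$(ps -o rss= -p "$PID")"
    done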

We've also tried using the mountpoint within a separate KVM VM with 2 or 3
GiB of RAM, and soon after starting the mail daemons the glusterfs client
process was killed by the OOM killer.


Mounting the same share via NFS works just fine. We also see much lower
iowait and load average on the client side with NFS.


We've also tried changing the IO thread count and cache size in order to
limit memory usage, with no luck. As you can see, the total cache size is
4×64 MiB == 256 MiB (compare that to 15 GiB).
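
For reference, these options are changed with ordinary "gluster volume set"
commands; the invocations below are only examples of the kind of tuning
meant here:

    # example values only
    gluster volume set mail performance.io-thread-count 4
    gluster volume set mail performance.cache-size 67108864    # 64 MiB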


Enabling/disabling stat-prefetch, read-ahead and readdir-ahead didn't
help either.


Here are volume memory stats:
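(For anyone who wants to collect the same data, a command along these lines
should produce it; the exact syntax is quoted from memory, so treat it as an
approximation:)

    # dump mallinfo and mempool statistics for the bricks of the volume
    gluster volume status mail mem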

===
Memory status for volume : mail
--
Brick : server1.xxx:/bricks/r6sdLV08_vd1_mail/mail
Mallinfo

Arena: 36859904
Ordblks  : 10357
Smblks   : 519
Hblks: 21
Hblkhd   : 30515200
Usmblks  : 0
Fsmblks  : 53440
Uordblks : 18604144
Fordblks : 18255760
Keepcost : 114112

Mempool Stats
-------------
Name                                     HotCount  ColdCount  PaddedSizeof  AllocCount  MaxAlloc  Misses   Max-StdAlloc
----                                     --------  ---------  ------------  ----------  --------  -------  ------------
mail-server:fd_t                                0       1024           108  30773120  13700
mail-server:dentry_t                        16110        274            84   235676148     16384  1106499          1152
mail-server:inode_t                         16363         21           156   237216876     16384  1876651          1169
mail-trash:fd_t                                 0       1024           108           0         0        0             0
mail-trash:dentry_t                             0      32768            84           0         0        0             0
mail-trash:inode_t                              4      32764           156           4         4        0             0
mail-trash:trash_local_t                        0         64          8628           0         0        0             0
mail-changetimerecorder:gf_ctr_local_t          0         64         16540           0         0        0             0
mail-changelog:rpcsvc_request_t                 0          8          2828           0         0        0             0
mail-changelog:changelog_local_t                0         64           116           0         0        0             0
mail-bitrot-stub:br_stub_local_t                0        512            84  79204400
mail-locks:pl_local_t                           0         32           148  6812757400
mail-upcall:upcall_local_t                      0        512           108           0         0        0             0
mail-marker:marker_local_t                      0        128           332  64980300
mail-quota:quota_local_t                        0         64           476           0         0        0             0
mail-server:rpcsvc_request_t                    0        512          2828  45462533   3400
glusterfs:struct saved_frame

Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client

2015-09-24 Thread Oleksandr Natalenko
I've checked the statedump of the volume in question and haven't found the
large number of iobuf allocations mentioned in that bug report.


However, I've noticed that there are lots of LRU records like this:

===
[conn.1.bound_xl./bricks/r6sdLV07_vd0_mail/mail.lru.1]
gfid=c4b29310-a19d-451b-8dd1-b3ac2d86b595
nlookup=1
fd-count=0
ref=0
ia_type=1
===
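
A rough way to count such entries, assuming the default statedump location
under /var/run/gluster (adjust the path if your build dumps elsewhere):

    # trigger a fresh statedump for the volume, then count the LRU inode records
    gluster volume statedump mail
    grep -c '\.lru\.' /var/run/gluster/*.dump.*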

In fact, there are 16383 of them. I've checked "gluster volume set help"
looking for something LRU-related and found this:


===
Option: network.inode-lru-limit
Default Value: 16384
Description: Specifies the maximum megabytes of memory to be used in the 
inode cache.

===

Is there an error in the description, which says "maximum megabytes of
memory"? Shouldn't it mean the maximum number of LRU records? If not, is it
true that the inode cache could grow up to 16 GiB for a client, and that one
must lower the network.inode-lru-limit value?
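
If lowering it is indeed the intended mitigation, the change itself is a
one-liner; 4096 below is just an arbitrary example value:

    # example only: cap the inode LRU at 4096 entries instead of the default 16384
    gluster volume set mail network.inode-lru-limit 4096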


Another thought: we've enabled write-behind, and the default
write-behind-window-size value is 1 MiB. So, with lots of small files being
written, could the write-behind buffers grow up to
inode-lru-limit × write-behind-window-size = 16384 × 1 MiB = 16 GiB? Could
someone explain this to me?


On 24.09.2015 10:42, Gabi C wrote:

Oh, my bad...
Could it be this one?

https://bugzilla.redhat.com/show_bug.cgi?id=1126831
Anyway, on oVirt+Gluster I experienced similar behavior...

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client

2015-09-24 Thread Oleksandr Natalenko

We use bare GlusterFS installation with no oVirt involved.

On 24.09.2015 10:29, Gabi C wrote:

Google "vdsm memory leak"; it's been discussed on the list last year and
earlier this year...

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Spurious failures

2015-09-24 Thread Michael Scherer
On Thursday, 24 September 2015 at 02:24 -0400, Kotresh Hiremath Ravishankar
wrote:
> Hi,
> 
> >>> So, is it ok if I restrict that to be used only on 127.0.0.1?
> I think not; test cases use 'H0' to create volumes:
>  H0=${H0:=`hostname`};
> Geo-rep expects passwordless SSH to 'H0'.
>  

Ok, this definitely requires some tests and thoughts. Does it only use IPv4
too?
(I guess yes, since IPv6 is removed from the Rackspace build slaves)
-- 
Michael Scherer
Sysadmin, Community Infrastructure and Platform, OSAS




signature.asc
Description: This is a digitally signed message part
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] glusterfs 3.6.6 released

2015-09-24 Thread Raghavendra Bhat
Hi,

glusterfs-3.6.6 has been released, and packages for RHEL/Fedora/CentOS
can be found here:
http://download.gluster.org/pub/gluster/glusterfs/3.6/LATEST/

We request that people running 3.6.x try it out and let us know if there
are any issues.

This release is expected to fix the bugs listed below, reported since 3.6.5
was made available. Thanks to all who submitted patches and reviewed the
changes.

1259578 - [3.6.x] quota usage gets miscalculated when loc->gfid is NULL
1247972 - quota/marker: lk_owner is null while acquiring inodelk in rename
operation
1252072 - POSIX ACLs as used by a FUSE mount can not use more than 32 groups
1256245 - AFR: gluster v restart force or brick process restart doesn't
heal the files
1258069 - gNFSd: NFS mount fails with "Remote I/O error"
1173437 - [RFE] changes needed in snapshot info command's xml output.

Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Spurious failures

2015-09-24 Thread Michael Scherer
On Thursday, 24 September 2015 at 07:59 -0400, Kotresh Hiremath Ravishankar
wrote:
> Thank you :) Also, please check that the script I had given passes on all
> machines.

So it worked everywhere except on slave0 and slave1. I am not sure what is
wrong, or whether they are even used; I will check later.


-- 
Michael Scherer
Sysadmin, Community Infrastructure and Platform, OSAS




signature.asc
Description: This is a digitally signed message part
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel