Re: [Gluster-users] Hundreds of duplicate files

2015-02-20 Thread Olav Peeters

It looks even worse than I had feared... :-(
This really is a crazy bug.

If I understand you correctly, the only sane xattr pairing is that of 
the two 0-bit files, since this is the full list of bricks:


root@gluster01 ~]# gluster volume info

Volume Name: sr_vol01
Type: Distributed-Replicate
Volume ID: c6d6147e-2d91-4d98-b8d9-ba05ec7e4ad6
Status: Started
Number of Bricks: 21 x 2 = 42
Transport-type: tcp
Bricks:
Brick1: gluster01:/export/brick1gfs01
Brick2: gluster02:/export/brick1gfs02
Brick3: gluster01:/export/brick4gfs01
Brick4: gluster03:/export/brick4gfs03
Brick5: gluster02:/export/brick4gfs02
Brick6: gluster03:/export/brick1gfs03
Brick7: gluster01:/export/brick2gfs01
Brick8: gluster02:/export/brick2gfs02
Brick9: gluster01:/export/brick5gfs01
Brick10: gluster03:/export/brick5gfs03
Brick11: gluster02:/export/brick5gfs02
Brick12: gluster03:/export/brick2gfs03
Brick13: gluster01:/export/brick3gfs01
Brick14: gluster02:/export/brick3gfs02
Brick15: gluster01:/export/brick6gfs01
Brick16: gluster03:/export/brick6gfs03
Brick17: gluster02:/export/brick6gfs02
Brick18: gluster03:/export/brick3gfs03
Brick19: gluster01:/export/brick8gfs01
Brick20: gluster02:/export/brick8gfs02
Brick21: gluster01:/export/brick9gfs01
Brick22: gluster02:/export/brick9gfs02
Brick23: gluster01:/export/brick10gfs01
Brick24: gluster03:/export/brick10gfs03
Brick25: gluster01:/export/brick11gfs01
Brick26: gluster03:/export/brick11gfs03
Brick27: gluster02:/export/brick10gfs02
Brick28: gluster03:/export/brick8gfs03
Brick29: gluster02:/export/brick11gfs02
Brick30: gluster03:/export/brick9gfs03
Brick31: gluster01:/export/brick12gfs01
Brick32: gluster02:/export/brick12gfs02
Brick33: gluster01:/export/brick13gfs01
Brick34: gluster02:/export/brick13gfs02
Brick35: gluster01:/export/brick14gfs01
Brick36: gluster03:/export/brick14gfs03
Brick37: gluster01:/export/brick15gfs01
Brick38: gluster03:/export/brick15gfs03
Brick39: gluster02:/export/brick14gfs02
Brick40: gluster03:/export/brick12gfs03
Brick41: gluster02:/export/brick15gfs02
Brick42: gluster03:/export/brick13gfs03


The two 0-bit files are on bricks 35 and 36, as the getfattr output correctly shows.

Another sane pairing could be this (if the first file did not also refer 
to client-34 and client-35):


[root@gluster01 ~]# getfattr -m . -d -e hex 
/export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

getfattr: Removing leading '/' from absolute path names
# file: 
export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-32=0x
trusted.afr.sr_vol01-client-33=0x
trusted.afr.sr_vol01-client-34=0x
trusted.afr.sr_vol01-client-35=0x00010001
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

[root@gluster02 ~]# getfattr -m . -d -e hex 
/export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

getfattr: Removing leading '/' from absolute path names
# file: 
export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-32=0x
trusted.afr.sr_vol01-client-33=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

But why is the security.selinux value different?
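
Decoding the hex at least shows what actually differs; a quick Python snippet, just feeding in the two values from the getfattr output above:

# decode the hex-encoded security.selinux values shown above
for h in ("756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000",
          "73797374656d5f753a6f626a6563745f723a66696c655f743a733000"):
    print(bytes.fromhex(h).rstrip(b"\x00").decode())
# prints:
# unconfined_u:object_r:file_t:s0
# system_u:object_r:file_t:s0

So only the SELinux user field differs (unconfined_u vs. system_u), which presumably just reflects which process created each copy and is probably unrelated to the Gluster problem itself.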


You mention hostname changes...
I noticed that if I list the available shared storage on one of 
the XenServers I get:

uuid ( RO): 272b2366-dfbf-ad47-2a0f-5d5cc40863e3
  name-label ( RW): gluster_store
name-description ( RW): NFS SR [gluster01.irceline.be:/sr_vol01]
host ( RO): 
type ( RO): nfs
content-type ( RO):


If I simply check the mount on Linux:
[root@same_story_on_both_xenserver ~]# mount
gluster02.irceline.be:/sr_vol01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3 on 
/var/run/sr-mount/272b2366-dfbf-ad47-2a0f-5d5cc40863e3 type nfs 
(rw,soft,timeo=133,retrans=2147483647,tcp,noac,addr=192.168.0.72)


Originally the mount was made on gluster01 (IP 192.168.0.71), as the 
name-description in the xe sr-list indicates...
It is as though, when gluster01 was unavailable for a couple of 
minutes, the NFS mount was somehow automatically reconfigured internally 
to gluster02, but NFS cannot do this as far as I know (unless there is 
some fail-over mechanism, which I never configured). There is also no 
load balancing between client and server.
If gluster01 is not available, the gluster volume should simply not have been 
available, end of story... But from the perspective of a client, the NFS mount 
could be to any one of the three gluster nodes, and the client should see 
exactly the same data...


So a rebalance in the current s

Re: [Gluster-users] Hundreds of duplicate files

2015-02-20 Thread Joe Julian


On 02/20/2015 01:47 PM, Olav Peeters wrote:

Thanks Joe,
for the answers!

I was not clear enough about the setup, apparently.
The Gluster cluster consists of 3 nodes with 14 bricks each. The bricks 
are formatted as xfs and mounted locally as xfs. There is one volume, 
type: Distributed-Replicate (replica 2). The configuration is such that 
bricks are mirrored on two different nodes.


The NFS mounts which were alive but not in use during the reboot when the 
problem started are from clients (2 XenServer machines configured as a 
pool - a shared storage set-up). The comparisons I give below are 
between (other) clients mounting via either glusterfs or NFS. The problem 
is similar, with the exception that the first listing (via ls) after a 
fresh mount via NFS actually does find the files with data; a second 
listing only finds the 0-bit file with the same name.


So all the 0-bit files in mode 0644 can be safely removed?

Probably? Is it likely that you have any empty files? I don't know.


Why do I see three files with the same name (and modification 
timestamp etc.) via either a glusterfs or NFS mount from a client? 
Deleting one of the three will probably not solve the issue either... 
this seems to me to be an indexing issue in the gluster cluster.
Very good question. I don't know. The xattrs tell a strange story that I 
haven't seen before. One legit file shows sr_vol01-client-32 and 33. 
This would be normal, assuming the filename hash would put it on that 
replica pair (we can't tell, since the rebalance has changed the hash 
map). Another file shows sr_vol01-client-32, 33, 34, and 35, with pending 
updates scheduled for 35. I have no idea offhand which brick this is (take 
"gluster volume info" and map the client digits to the bricks, offset by 
1: client-35 is brick 36). The last copy is on clients 40 and 41.
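
If it helps, that mapping is mechanical. Here is a rough sketch, assuming it runs on one of the gluster nodes and that the volume is sr_vol01 as above, that pairs the AFR client indices with the brick list from "gluster volume info":

import re
import subprocess

# trusted.afr.<volume>-client-N refers to the (N+1)-th brick listed by
# "gluster volume info", i.e. client-35 is Brick36.
out = subprocess.check_output(["gluster", "volume", "info", "sr_vol01"]).decode()
bricks = {}
for line in out.splitlines():
    m = re.match(r"Brick(\d+): (\S+)", line)
    if m:
        bricks[int(m.group(1)) - 1] = m.group(2)

for n in (32, 33, 34, 35, 40, 41):
    print("client-%d -> %s" % (n, bricks.get(n, "?")))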


I don't know how these files all got on different replica sets. My 
speculations include hostname changes, long-running net-split conditions 
with different dht maps (failed rebalances), moved bricks, load 
balancers between client and server, mercury in retrograde (lol)...


How do I get Gluster to replicate the files correctly, only 2 versions 
of the same file, not three, and on two bricks on different machines?




Identify which replica is correct by using the little python script at 
http://joejulian.name/blog/dht-misses-are-expensive/ to get the hash of 
the filename. Examine the dht map to see which replica pair *should* 
have that hash and remove the others (and their hardlink in .glusterfs). 
There is no 1-liner that's going to do this. I would probably script the 
logic in python, have it print out what it was going to do, check that 
for sanity and, if sane, execute it.
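
For what it's worth, here is a minimal sketch of that logic -- assuming libglusterfs.so.0 is installed and exports gf_dm_hashfn (the hash DHT itself uses), and assuming the on-disk trusted.glusterfs.dht xattr is four big-endian 32-bit values (count, type, start, stop); check both assumptions against your version before trusting it:

import ctypes
import os
import struct
import uuid

libgf = ctypes.CDLL("libglusterfs.so.0")
libgf.gf_dm_hashfn.restype = ctypes.c_uint32

def dht_hash(filename):
    # Davies-Meyer hash DHT uses to place a file by its basename
    name = filename.encode()
    return libgf.gf_dm_hashfn(name, len(name))

def brick_hash_range(brick_parent_dir):
    # hash range (start, stop) this brick claims for the parent directory
    raw = os.getxattr(brick_parent_dir, "trusted.glusterfs.dht")
    _cnt, _type, start, stop = struct.unpack(">IIII", raw)
    return start, stop

def glusterfs_hardlink(brick_root, gfid_hex):
    # path of the .glusterfs hardlink for a given trusted.gfid value
    g = str(uuid.UUID(gfid_hex))
    return os.path.join(brick_root, ".glusterfs", g[0:2], g[2:4], g)

h = dht_hash("3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd")
start, stop = brick_hash_range(
    "/export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3")
print("hash 0x%08x in [0x%08x, 0x%08x]: %s" % (h, start, stop, start <= h <= stop))
print(glusterfs_hardlink("/export/brick13gfs01",
                         "aefd184508414a8f8408f1ab8aa7a417"))

Only the replica pair whose range contains the hash should keep the file; the others (and their .glusterfs hardlinks) are the candidates for removal, after a sanity check.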


But mostly figure out how Bricks 32 and/or 33 can become 34 and/or 35 
and/or 40 and/or 41. That's the root of the whole problem.



Cheers,
Olav




On 20/02/15 21:51, Joe Julian wrote:


On 02/20/2015 12:21 PM, Olav Peeters wrote:
Let's take one file (3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd) as an 
example...
On the 3 nodes where all bricks are formatted as XFS and mounted in 
/export and 272b2366-dfbf-ad47-2a0f-5d5cc40863e3 is the mounting 
point of a NFS shared storage connection from XenServer machines:
Did I just read this correctly? Your bricks are NFS mounts? ie, 
GlusterFS Client <-> GlusterFS Server <-> NFS <-> XFS


[root@gluster01 ~]# find 
/export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec 
ls -la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

Supposedly, this is the actual file.
-rw-r--r--. 2 root root 0 Feb 18 00:51 
/export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
This is not a linkfile. Note it's mode 0644. How it got there with 
those permissions would be a matter of history and would require 
information that's probably lost.


root@gluster02 ~]# find 
/export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec 
ls -la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd


[root@gluster03 ~]# find 
/export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec 
ls -la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 2 root root 0 Feb 18 00:51 
/export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

Same analysis as above.


3 files with information, 2 x a 0-bit file with the same name

Checking the 0-bit files:
[root@gluster01 ~]# getfattr -m . -d -e hex 
/export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

getfattr: Removing leading '/' from absolute path names
# file: 
export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b

Re: [Gluster-users] Hundreds of duplicate files

2015-02-20 Thread Olav Peeters

Thanks Joe,
for the answers!

I was not clear enough about the setup, apparently.
The Gluster cluster consists of 3 nodes with 14 bricks each. The bricks 
are formatted as xfs and mounted locally as xfs. There is one volume, type: 
Distributed-Replicate (replica 2). The configuration is such that bricks 
are mirrored on two different nodes.


The NFS mounts which were alive but not in use during the reboot when the 
problem started are from clients (2 XenServer machines configured as a 
pool - a shared storage set-up). The comparisons I give below are 
between (other) clients mounting via either glusterfs or NFS. The problem is 
similar, with the exception that the first listing (via ls) after a fresh 
mount via NFS actually does find the files with data; a second listing 
only finds the 0-bit file with the same name.


So all the 0-bit files in mode 0644 can be safely removed?

Why do I see three files with the same name (and modification timestamp 
etc.) via either a glusterfs or NFS mount from a client? Deleting one of 
the three will probably not solve the issue either... this seems to me to be an 
indexing issue in the gluster cluster.


How do I get Gluster to replicate the files correctly, only 2 versions 
of the same file, not three, and on two bricks on different machines?


Cheers,
Olav




On 20/02/15 21:51, Joe Julian wrote:


On 02/20/2015 12:21 PM, Olav Peeters wrote:
Let's take one file (3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd) as an 
example...
On the 3 nodes where all bricks are formatted as XFS and mounted in 
/export and 272b2366-dfbf-ad47-2a0f-5d5cc40863e3 is the mounting 
point of a NFS shared storage connection from XenServer machines:
Did I just read this correctly? Your bricks are NFS mounts? ie, 
GlusterFS Client <-> GlusterFS Server <-> NFS <-> XFS


[root@gluster01 ~]# find 
/export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec ls 
-la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

Supposedly, this is the actual file.
-rw-r--r--. 2 root root 0 Feb 18 00:51 
/export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
This is not a linkfile. Note it's mode 0644. How it got there with 
those permissions would be a matter of history and would require 
information that's probably lost.


root@gluster02 ~]# find 
/export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec ls 
-la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd


[root@gluster03 ~]# find 
/export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec ls 
-la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 2 root root 0 Feb 18 00:51 
/export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

Same analysis as above.


3 files with information, 2 x a 0-bit file with the same name

Checking the 0-bit files:
[root@gluster01 ~]# getfattr -m . -d -e hex 
/export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

getfattr: Removing leading '/' from absolute path names
# file: 
export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-34=0x
trusted.afr.sr_vol01-client-35=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

[root@gluster03 ~]# getfattr -m . -d -e hex 
/export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

getfattr: Removing leading '/' from absolute path names
# file: 
export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-34=0x
trusted.afr.sr_vol01-client-35=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

This is not a glusterfs link file since there is no 
"trusted.glusterfs.dht.linkto", am I correct?

You are correct.


And checking the "good" files:

# file: 
export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-32=0x
trusted.afr.sr_vol01-client-33=0x
trusted.afr.sr_vol01-client-34=0x
trusted.afr.sr_vol01-client-35=0x00010001
trusted.gfid=0xaefd184508414a8f8408f1ab

Re: [Gluster-users] Hundreds of duplicate files

2015-02-20 Thread Joe Julian


On 02/20/2015 12:21 PM, Olav Peeters wrote:
Let's take one file (3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd) as an 
example...
On the 3 nodes where all bricks are formatted as XFS and mounted in 
/export and 272b2366-dfbf-ad47-2a0f-5d5cc40863e3 is the mounting point 
of a NFS shared storage connection from XenServer machines:
Did I just read this correctly? Your bricks are NFS mounts? ie, 
GlusterFS Client <-> GlusterFS Server <-> NFS <-> XFS


[root@gluster01 ~]# find 
/export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec ls 
-la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

Supposedly, this is the actual file.
-rw-r--r--. 2 root root 0 Feb 18 00:51 
/export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
This is not a linkfile. Note it's mode 0644. How it got there with those 
permissions would be a matter of history and would require information 
that's probably lost.


root@gluster02 ~]# find 
/export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec ls 
-la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd


[root@gluster03 ~]# find 
/export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ -name '300*' -exec ls 
-la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 2 root root 0 Feb 18 00:51 
/export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

Same analysis as above.


3 files with information, 2 x a 0-bit file with the same name

Checking the 0-bit files:
[root@gluster01 ~]# getfattr -m . -d -e hex 
/export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

getfattr: Removing leading '/' from absolute path names
# file: 
export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-34=0x
trusted.afr.sr_vol01-client-35=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

[root@gluster03 ~]# getfattr -m . -d -e hex 
/export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

getfattr: Removing leading '/' from absolute path names
# file: 
export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-34=0x
trusted.afr.sr_vol01-client-35=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

This is not a glusterfs link file since there is no 
"trusted.glusterfs.dht.linkto", am I correct?

You are correct.
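
Right -- a linkfile would be 0 bytes with mode 1000 (---------T, sticky bit set) and would carry that linkto xattr. If you end up scripting the cleanup, a rough check along these lines (illustrative only; run it against brick paths, not the mount) distinguishes the two cases:

import errno
import os
import stat

def classify(path):
    # distinguish a DHT linkfile from an ordinary (possibly empty) data file on a brick
    st = os.lstat(path)
    try:
        linkto = os.getxattr(path, "trusted.glusterfs.dht.linkto")
    except OSError as e:
        if e.errno != errno.ENODATA:
            raise
        linkto = None
    if st.st_size == 0 and (st.st_mode & stat.S_ISVTX) and linkto:
        return "dht linkfile -> %s" % linkto.rstrip(b"\x00").decode()
    return "plain file: %d bytes, mode %04o" % (st.st_size, stat.S_IMODE(st.st_mode))

print(classify("/export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/"
               "3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd"))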


And checking the "good" files:

# file: 
export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-32=0x
trusted.afr.sr_vol01-client-33=0x
trusted.afr.sr_vol01-client-34=0x
trusted.afr.sr_vol01-client-35=0x00010001
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

[root@gluster02 ~]# getfattr -m . -d -e hex 
/export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

getfattr: Removing leading '/' from absolute path names
# file: 
export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-32=0x
trusted.afr.sr_vol01-client-33=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

[root@gluster03 ~]# getfattr -m . -d -e hex 
/export/brick13gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

getfattr: Removing leading '/' from absolute path names
# file: 
export/brick13gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-40=0x
trusted.afr.sr_vol01-client-41=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417



Seen from a client via a glusterfs mount:
[root@client ~]# ls -al 
/mnt/glusterfs/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/300*
-rw-

Re: [Gluster-users] Hundreds of duplicate files

2015-02-20 Thread Olav Peeters

Hi,
after waiting a really long time (nearly two days) for a heal and a 
rebalance to finish we are left with the following situation:


- the heal did get rid of some of the empty sticky-bit files outside of 
the .glusterfs dir (on the root of each brick), but not all of them


- the duplicates are still there, even after doing a rebalance (and a 
rebalance fix-layout)


Our cluster is:
Type: Distributed-Replicate (over three nodes)
Number of Bricks: 21 x 2 = 42 (replication set to 2)
Transport-type: tcp

Let's take one file (3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd) as an 
example...
On the 3 nodes where all bricks are formatted as XFS and mounted in 
/export and 272b2366-dfbf-ad47-2a0f-5d5cc40863e3 is the mounting point 
of a NFS shared storage connection from XenServer machines:


[root@gluster01 ~]# find /export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ 
-name '300*' -exec ls -la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 2 root root 0 Feb 18 00:51 
/export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd


root@gluster02 ~]# find /export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ 
-name '300*' -exec ls -la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd


[root@gluster03 ~]# find /export/*/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/ 
-name '300*' -exec ls -la {} \;
-rw-r--r--. 2 root root 44332659200 Feb 17 23:55 
/export/brick13gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
-rw-r--r--. 2 root root 0 Feb 18 00:51 
/export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd


3 files with information, 2 x a 0-bit file with the same name

Checking the 0-bit files:
[root@gluster01 ~]# getfattr -m . -d -e hex 
/export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

getfattr: Removing leading '/' from absolute path names
# file: 
export/brick14gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-34=0x
trusted.afr.sr_vol01-client-35=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

[root@gluster03 ~]# getfattr -m . -d -e hex 
/export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

getfattr: Removing leading '/' from absolute path names
# file: 
export/brick14gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-34=0x
trusted.afr.sr_vol01-client-35=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

This is not a glusterfs link file since there is no 
"trusted.glusterfs.dht.linkto", am I correct?


And checking the "good" files:

# file: 
export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-32=0x
trusted.afr.sr_vol01-client-33=0x
trusted.afr.sr_vol01-client-34=0x
trusted.afr.sr_vol01-client-35=0x00010001
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

[root@gluster02 ~]# getfattr -m . -d -e hex 
/export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

getfattr: Removing leading '/' from absolute path names
# file: 
export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-32=0x
trusted.afr.sr_vol01-client-33=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

[root@gluster03 ~]# getfattr -m . -d -e hex 
/export/brick13gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

getfattr: Removing leading '/' from absolute path names
# file: 
export/brick13gfs03/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd

security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.sr_vol01-client-40=0x
trusted.afr.sr_vol01-client-41=0x
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417



Seen from a client via a glusterfs mount:
[root

[Gluster-users] Replica repair

2015-02-20 Thread Sam Giraffe
Hi,

I have a Gluster volume with 20 servers. The volume is set up with a
replica of 2.
Each server has 1 brick on it, so in essence I have 20 bricks, 10 of
which are a replica of the other 10.

One of the servers had a bad hard drive and the brick on that server
stopped responding.
This caused writes to the Gluster volume to slow down.
I am under the impression that one brick crashing should not cause a
problem, so I am not sure why writes slowed down. Any clue here?

Secondly, in order to restore the brick, I had to remove 2 bricks (i.e. 2
servers), since I had set up the volume with a replica of 2. For
removing the 2nd brick I picked a server randomly; is that OK? I was
afraid I might have picked the server that is the replica of the bad
server, in which case I would lose data.

Lastly, do I need to heal the volume after removing both bricks? What
happens to the data on the bricks?

I am using Gluster 3.6


Thank you
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] cannot fuse mount from a cluster node

2015-02-20 Thread Kaushal M
This is a known issue with non-RPM (non-Fedora/EL) packages. The DEB packages and
packages for other distros don't regenerate the volfiles after an upgrade.
So after the upgrade, GlusterD searches for the new volfiles,
which don't exist, and cannot give the clients a volfile, leading to
the mounts failing.

You can look at https://bugzilla.redhat.com/show_bug.cgi?id=1191176 for
more details.

tl;dr stop glusterd, run `glusterd --xlator-option *upgrade=on -N` to
regenerate the volfiles, start glusterd (on all nodes).

~kaushal

On Fri, Feb 20, 2015 at 8:33 PM, Tamas Papp  wrote:

> hi All,
>
> After I rebooted the cluster, Linux clients are working fine,
> but the nodes themselves cannot mount the cluster.
>
>
> 16:01 gl0(pts/0):/var/log/glusterfs$ gluster volume status
> Status of volume: w-vol
> Gluster process                                Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick gl0:/mnt/brick1/data                     49152   Y       1841
> Brick gl1:/mnt/brick1/data                     49152   Y       1368
> Brick gl2:/mnt/brick1/data                     49152   Y       1703
> Brick gl3:/mnt/brick1/data                     49152   Y       1514
> Brick gl4:/mnt/brick1/data                     49152   Y       1354
> NFS Server on localhost                        2049    Y       2986
> NFS Server on gl1                              2049    Y       1373
> NFS Server on gl2                              2049    Y       1708
> NFS Server on gl4                              2049    Y       1359
> NFS Server on gl3                              2049    Y       1525
>
> Task Status of Volume w-vol
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> 16:01 gl0(pts/0):/var/log/glusterfs$ gluster volume info
>
> Volume Name: w-vol
> Type: Distribute
> Volume ID: ebaa67c4-7429-4106-9ab3-dfc85235a2a1
> Status: Started
> Number of Bricks: 5
> Transport-type: tcp
> Bricks:
> Brick1: gl0:/mnt/brick1/data
> Brick2: gl1:/mnt/brick1/data
> Brick3: gl2:/mnt/brick1/data
> Brick4: gl3:/mnt/brick1/data
> Brick5: gl4:/mnt/brick1/data
> Options Reconfigured:
> server.allow-insecure: on
> performance.cache-size: 4GB
> performance.flush-behind: on
> diagnostics.client-log-level: WARNING
>
>
>
>
> [2015-02-20 15:00:17.071186] I [MSGID: 100030] [glusterfsd.c:2018:main]
> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.2
> (args: /usr/sbin/glusterfs --acl --direct-io-mode=disable --use-readdirp=no
> --volfile-server=gl0 --volfile-id=/w-vol /W/Projects)
> [2015-02-20 15:00:17.076517] E [glusterfsd-mgmt.c:1494:mgmt_getspec_cbk]
> 0-glusterfs: failed to get the 'volume file' from server
> [2015-02-20 15:00:17.076575] E [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk]
> 0-mgmt: failed to fetch volume file (key:/w-vol)
> [2015-02-20 15:00:17.076760] W [glusterfsd.c:1194:cleanup_and_exit] (-->
> 0-: received signum (0), shutting down
> [2015-02-20 15:00:17.076791] I [fuse-bridge.c:5599:fini] 0-fuse:
> Unmounting '/W/Projects'.
> [2015-02-20 15:00:17.110711] W [glusterfsd.c:1194:cleanup_and_exit] (-->
> 0-: received signum (15), shutting down
> [2015-02-20 15:01:17.078206] I [MSGID: 100030] [glusterfsd.c:2018:main]
> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.2
> (args: /usr/sbin/glusterfs --acl --direct-io-mode=disable --use-readdirp=no
> --volfile-server=gl0 --volfile-id=/w-vol /W/Projects)
> [2015-02-20 15:01:17.082935] E [glusterfsd-mgmt.c:1494:mgmt_getspec_cbk]
> 0-glusterfs: failed to get the 'volume file' from server
> [2015-02-20 15:01:17.082992] E [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk]
> 0-mgmt: failed to fetch volume file (key:/w-vol)
> [2015-02-20 15:01:17.083173] W [glusterfsd.c:1194:cleanup_and_exit] (-->
> 0-: received signum (0), shutting down
> [2015-02-20 15:01:17.083203] I [fuse-bridge.c:5599:fini] 0-fuse:
> Unmounting '/W/Projects'.
>
>
> $ uname -a
> Linux gl0 3.13.0-45-generic #74-Ubuntu SMP Tue Jan 13 19:36:28 UTC 2015
> x86_64 x86_64 x86_64 GNU/Linux
>
> $ lsb_release -a
> No LSB modules are available.
> Distributor ID: Ubuntu
> Description:    Ubuntu 14.04.2 LTS
> Release:        14.04
> Codename:       trusty
>
>
>
> ii  glusterfs-client  3.6.2-ubuntu1~trusty3  amd64  clustered file-system (client package)
> ii  glusterfs-common  3.6.2-ubuntu1~trusty3  amd64  GlusterFS common libraries and translator modules
> ii  glusterfs-server  3.6.2-ubuntu1~trusty3  amd64  clustered file-system (server package)
>
>
>
> Any idea?
>
>
> 10x
> tamas
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] pb glusterfs 3.4.2 built on Jan 3 2014 12:38:05

2015-02-20 Thread Pierre Léonard

Hi Ghoshal,

That's funny. What's your glusterd version?

glusterd --version

glusterfs 3.5.3 built on Nov 13 2014 11:06:04
It seems that I have different releases of glusterfs.
That could be a problem. I also know that I made an update of that 
computer: a new kernel, and new openssl and glibc.



I remember I once had a problem with the peers files; a guy named Kaushal helped 
me. The files were not good on that same computer. So I found the good 
peers files by analysing all my 14 nodes, and the server restarted.


Today I have checked the peers files but found no evidence of a mistake.

Sincerely,

--
INRA 
*Pierre Léonard*
*Senior IT Manager*
*MetaGenoPolis*
pierre.leon...@jouy.inra.fr 
Tél. : +33 (0)1 34 65 29 78

Centre de recherche INRA
Domaine de Vilvert – Bât. 325 R+1
78 352 Jouy-en-Josas CEDEX
France
www.mgps.eu 

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] pb glusterfs 3.4.2 built on Jan 3 2014 12:38:05

2015-02-20 Thread A Ghoshal
That's funny. What's your glusterd version?

glusterd --version

 -Pierre Léonard  wrote: -

 ===
 To: A Ghoshal , "gluster-users@gluster.org" 

 From: Pierre Léonard 
 Date: 02/20/2015 09:19PM 
 Subject: Re: [Gluster-users] pb glusterfs 3.4.2 built on Jan  3 2014 12:38:05
 ===
Hi Ghoshal,
> Something's wrong with the configuration data in /var/lib/glusterd.
> Try running glusterd with debug:
>
> glusterd --debug
>
> It might have more details.
OK, I have found some bad iptables rules and stopped iptables, but that does not 
solve the problem. It seems to load an rdma transport, 
/usr/lib64/glusterfs/3.5.3/rpc-transport/rdma.so, 
which is not the right release. I also have a folder 
/usr/lib64/glusterfs/3.6.1/rpc-transport/rdma.so.

The debug output follows:

[root@xstoocky06 ~]# glusterd --debug
[2015-02-20 15:34:46.367984] I [glusterfsd.c:1959:main] 0-glusterd: 
Started running glusterd version 3.5.3 (glusterd --debug)
[2015-02-20 15:34:46.368295] D [glusterfsd.c:596:get_volfp] 
0-glusterfsd: loading volume file /etc/glusterfs/glusterd.vol
[2015-02-20 15:34:46.419687] I [glusterd.c:1122:init] 0-management: 
Using /var/lib/glusterd as working directory
[2015-02-20 15:34:46.419820] D 
[glusterd.c:345:glusterd_rpcsvc_options_build] 0-: listen-backlog value: 128
[2015-02-20 15:34:46.420519] D [rpcsvc.c:2183:rpcsvc_init] 
0-rpc-service: RPC service inited.
[2015-02-20 15:34:46.420544] D [rpcsvc.c:1812:rpcsvc_program_register] 
0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, 
Port: 0
[2015-02-20 15:34:46.420601] D [rpc-transport.c:262:rpc_transport_load] 
0-rpc-transport: attempt to load file 
/usr/lib64/glusterfs/3.5.3/rpc-transport/socket.so
[2015-02-20 15:34:46.424797] I [socket.c:3645:socket_init] 
0-socket.management: SSL support is NOT enabled
[2015-02-20 15:34:46.424831] I [socket.c:3660:socket_init] 
0-socket.management: using system polling thread
[2015-02-20 15:34:46.424853] D [name.c:557:server_fill_address_family] 
0-socket.management: option address-family not specified, defaulting to inet
[2015-02-20 15:34:46.424968] D [rpc-transport.c:262:rpc_transport_load] 
0-rpc-transport: attempt to load file 
/usr/lib64/glusterfs/3.5.3/rpc-transport/rdma.so
[2015-02-20 15:34:46.448374] D [rpc-transport.c:300:rpc_transport_load] 
0-rpc-transport: dlsym (gf_rpc_transport_reconfigure) on 
/usr/lib64/glusterfs/3.5.3/rpc-transport/rdma.so: undefined symbol: 
reconfigure
librdmacm: Warning: couldn't read ABI version.
librdmacm: Warning: assuming: 4
librdmacm: Fatal: unable to get RDMA device list
[2015-02-20 15:34:46.448538] W [rdma.c:4194:__gf_rdma_ctx_create] 
0-rpc-transport/rdma: rdma_cm event channel creation failed (No such device)
[2015-02-20 15:34:46.448557] E [rdma.c:4482:init] 0-rdma.management: 
Failed to initialize IB Device
[2015-02-20 15:34:46.448570] E [rpc-transport.c:333:rpc_transport_load] 
0-rpc-transport: 'rdma' initialization failed
[2015-02-20 15:34:46.448680] W [rpcsvc.c:1535:rpcsvc_transport_create] 
0-rpc-service: cannot create listener, initing the transport failed
[2015-02-20 15:34:46.448703] D [rpcsvc.c:1812:rpcsvc_program_register] 
0-rpc-service: New program registered: GlusterD svc peer, Num: 1238437, 
Ver: 2, Port: 0
[2015-02-20 15:34:46.448718] D [rpcsvc.c:1812:rpcsvc_program_register] 
0-rpc-service: New program registered: GlusterD svc cli read-only, Num: 
1238463, Ver: 2, Port: 0
[2015-02-20 15:34:46.448731] D [rpcsvc.c:1812:rpcsvc_program_register] 
0-rpc-service: New program registered: GlusterD svc mgmt, Num: 1238433, 
Ver: 2, Port: 0
[2015-02-20 15:34:46.448744] D [rpcsvc.c:1812:rpcsvc_program_register] 
0-rpc-service: New program registered: Gluster Portmap, Num: 34123456, 
Ver: 1, Port: 0
[2015-02-20 15:34:46.448757] D [rpcsvc.c:1812:rpcsvc_program_register] 
0-rpc-service: New program registered: Gluster Handshake, Num: 14398633, 
Ver: 2, Port: 0
[2015-02-20 15:34:46.448769] D [rpcsvc.c:1812:rpcsvc_program_register] 
0-rpc-service: New program registered: Gluster MGMT Handshake, Num: 
1239873, Ver: 1, Port: 0
[2015-02-20 15:34:46.448828] D [rpcsvc.c:2183:rpcsvc_init] 
0-rpc-service: RPC service inited.
[2015-02-20 15:34:46.448843] D [rpcsvc.c:1812:rpcsvc_program_register] 
0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, 
Port: 0
[2015-02-20 15:34:46.448868] D [rpc-transport.c:262:rpc_transport_load] 
0-rpc-transport: attempt to load file 
/usr/lib64/glusterfs/3.5.3/rpc-transport/socket.so
[2015-02-20 15:34:46.448956] D [socket.c:3533:socket_init] 
0-socket.management: disabling nodelay
[2015-02-20 15:34:46.448980] I [socket.c:3645:socket_init] 
0-socket.management: SSL support is NOT enabled
[2015-02-20 15:34:46.448996] I [socket.c:3660:socket_init] 
0-socket.management: using system polling thread
[2015-02-20 15:34:46.449100] D [rpcsvc.c:1812:rpcsvc_program_register] 
0-rpc-service: New program registered: GlusterD svc cli, Num: 1238463, 
Ver: 2, Port: 0
[2015-02-20 15:34:46.449121] D [rpcsvc.c:1812

[Gluster-users] cannot fuse mount from a cluster node

2015-02-20 Thread Tamas Papp

hi All,

After I rebooted the cluster, Linux clients are working fine,
but the nodes themselves cannot mount the cluster.


16:01 gl0(pts/0):/var/log/glusterfs$ gluster volume status
Status of volume: w-vol
Gluster process                                Port    Online  Pid
------------------------------------------------------------------------------
Brick gl0:/mnt/brick1/data                     49152   Y       1841
Brick gl1:/mnt/brick1/data                     49152   Y       1368
Brick gl2:/mnt/brick1/data                     49152   Y       1703
Brick gl3:/mnt/brick1/data                     49152   Y       1514
Brick gl4:/mnt/brick1/data                     49152   Y       1354
NFS Server on localhost                        2049    Y       2986
NFS Server on gl1                              2049    Y       1373
NFS Server on gl2                              2049    Y       1708
NFS Server on gl4                              2049    Y       1359
NFS Server on gl3                              2049    Y       1525

Task Status of Volume w-vol
------------------------------------------------------------------------------
There are no active volume tasks

16:01 gl0(pts/0):/var/log/glusterfs$ gluster volume info

Volume Name: w-vol
Type: Distribute
Volume ID: ebaa67c4-7429-4106-9ab3-dfc85235a2a1
Status: Started
Number of Bricks: 5
Transport-type: tcp
Bricks:
Brick1: gl0:/mnt/brick1/data
Brick2: gl1:/mnt/brick1/data
Brick3: gl2:/mnt/brick1/data
Brick4: gl3:/mnt/brick1/data
Brick5: gl4:/mnt/brick1/data
Options Reconfigured:
server.allow-insecure: on
performance.cache-size: 4GB
performance.flush-behind: on
diagnostics.client-log-level: WARNING




[2015-02-20 15:00:17.071186] I [MSGID: 100030] [glusterfsd.c:2018:main] 
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.2 
(args: /usr/sbin/glusterfs --acl --direct-io-mode=disable 
--use-readdirp=no --volfile-server=gl0 --volfile-id=/w-vol /W/Projects)
[2015-02-20 15:00:17.076517] E [glusterfsd-mgmt.c:1494:mgmt_getspec_cbk] 
0-glusterfs: failed to get the 'volume file' from server
[2015-02-20 15:00:17.076575] E [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk] 
0-mgmt: failed to fetch volume file (key:/w-vol)
[2015-02-20 15:00:17.076760] W [glusterfsd.c:1194:cleanup_and_exit] (--> 
0-: received signum (0), shutting down
[2015-02-20 15:00:17.076791] I [fuse-bridge.c:5599:fini] 0-fuse: 
Unmounting '/W/Projects'.
[2015-02-20 15:00:17.110711] W [glusterfsd.c:1194:cleanup_and_exit] (--> 
0-: received signum (15), shutting down
[2015-02-20 15:01:17.078206] I [MSGID: 100030] [glusterfsd.c:2018:main] 
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.2 
(args: /usr/sbin/glusterfs --acl --direct-io-mode=disable 
--use-readdirp=no --volfile-server=gl0 --volfile-id=/w-vol /W/Projects)
[2015-02-20 15:01:17.082935] E [glusterfsd-mgmt.c:1494:mgmt_getspec_cbk] 
0-glusterfs: failed to get the 'volume file' from server
[2015-02-20 15:01:17.082992] E [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk] 
0-mgmt: failed to fetch volume file (key:/w-vol)
[2015-02-20 15:01:17.083173] W [glusterfsd.c:1194:cleanup_and_exit] (--> 
0-: received signum (0), shutting down
[2015-02-20 15:01:17.083203] I [fuse-bridge.c:5599:fini] 0-fuse: 
Unmounting '/W/Projects'.



$ uname -a
Linux gl0 3.13.0-45-generic #74-Ubuntu SMP Tue Jan 13 19:36:28 UTC 2015 
x86_64 x86_64 x86_64 GNU/Linux


$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 14.04.2 LTS
Release:        14.04
Codename:       trusty



ii  glusterfs-client  3.6.2-ubuntu1~trusty3  amd64  clustered file-system (client package)
ii  glusterfs-common  3.6.2-ubuntu1~trusty3  amd64  GlusterFS common libraries and translator modules
ii  glusterfs-server  3.6.2-ubuntu1~trusty3  amd64  clustered file-system (server package)




Any idea?


10x
tamas
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] [TSR] Failed tests on glusterfs-3.6.3beta1, ZFS, CentOS 6.6

2015-02-20 Thread Kiran Patil
Please find below the gluster regression test summary report.

Test Summary Report
---
./tests/basic/ec/quota.t
(Wstat: 0 Tests: 22 Failed: 2)
  Failed tests:  16, 20
./tests/basic/quota-anon-fd-nfs.t
(Wstat: 0 Tests: 21 Failed: 1)
  Failed test:  18
./tests/basic/quota.t
(Wstat: 0 Tests: 73 Failed: 4)
  Failed tests:  24, 28, 32, 65
./tests/basic/uss.t
(Wstat: 0 Tests: 158 Failed: 8)
  Failed tests:  37-38, 69-70, 99-100, 127-128
./tests/basic/volume-snapshot.t
(Wstat: 0 Tests: 29 Failed: 2)
  Failed tests:  28-29
./tests/bugs/bug-1023974.t
(Wstat: 0 Tests: 15 Failed: 1)
  Failed test:  12
./tests/bugs/bug-1038598.t
(Wstat: 0 Tests: 28 Failed: 6)
  Failed tests:  17, 21-22, 26-28
./tests/bugs/bug-1045333.t
(Wstat: 0 Tests: 16 Failed: 1)
  Failed test:  15
./tests/bugs/bug-1087198.t
(Wstat: 0 Tests: 26 Failed: 2)
  Failed tests:  18, 23
./tests/bugs/bug-1113975.t
(Wstat: 0 Tests: 13 Failed: 3)
  Failed tests:  11-13
./tests/bugs/bug-1117851.t
(Wstat: 0 Tests: 24 Failed: 1)
  Failed test:  15
./tests/bugs/bug-1161886/bug-1161886.t
(Wstat: 0 Tests: 16 Failed: 4)
  Failed tests:  13-16
./tests/bugs/bug-1162498.t
(Wstat: 0 Tests: 30 Failed: 13)
  Failed tests:  10, 19-30
./tests/bugs/bug-765380.t
(Wstat: 0 Tests: 9 Failed: 1)
  Failed test:  6
./tests/bugs/bug-824753.t
(Wstat: 0 Tests: 16 Failed: 1)
  Failed test:  11
./tests/bugs/bug-948729/bug-948729-mode-script.t
(Wstat: 0 Tests: 23 Failed: 2)
  Failed tests:  19, 23
./tests/bugs/bug-948729/bug-948729.t
(Wstat: 0 Tests: 23 Failed: 2)
  Failed tests:  19, 23
Files=296, Tests=8411, 8656 wallclock secs ( 3.62 usr  1.97 sys +
527.17 cusr 683.51 csys = 1216.27 CPU)
Result: FAIL

Thanks,
Kiran.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] cannot fuse mount

2015-02-20 Thread Tamas Papp

hi All,

After I rebooted the cluster, it cannot fuse mount.
NFS mounts still work fine.


$ gluster volume status
Status of volume: w-vol
Gluster process                                Port    Online  Pid
------------------------------------------------------------------------------
Brick gl0:/mnt/brick1/data                     49152   Y       1841
Brick gl1:/mnt/brick1/data                     49152   Y       1368
Brick gl2:/mnt/brick1/data                     49152   Y       1703
Brick gl3:/mnt/brick1/data                     49152   Y       1514
Brick gl4:/mnt/brick1/data                     49152   Y       1354
NFS Server on localhost                        2049    Y       2986
NFS Server on gl1                              2049    Y       1373
NFS Server on gl2                              2049    Y       1708
NFS Server on gl4                              2049    Y       1359
NFS Server on gl3                              2049    Y       1525

Task Status of Volume w-vol
------------------------------------------------------------------------------
There are no active volume tasks

$ gluster volume info

Volume Name: w-vol
Type: Distribute
Volume ID: ebaa67c4-7429-4106-9ab3-dfc85235a2a1
Status: Started
Number of Bricks: 5
Transport-type: tcp
Bricks:
Brick1: gl0:/mnt/brick1/data
Brick2: gl1:/mnt/brick1/data
Brick3: gl2:/mnt/brick1/data
Brick4: gl3:/mnt/brick1/data
Brick5: gl4:/mnt/brick1/data
Options Reconfigured:
server.allow-insecure: on
performance.cache-size: 4GB
performance.flush-behind: on
diagnostics.client-log-level: WARNING




[2015-02-20 15:00:17.071186] I [MSGID: 100030] [glusterfsd.c:2018:main] 
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.2 
(args: /usr/sbin/glusterfs --acl --direct-io-mode=disable 
--use-readdirp=no --volfile-server=gl0 --volfile-id=/w-vol /W/Projects)
[2015-02-20 15:00:17.076517] E [glusterfsd-mgmt.c:1494:mgmt_getspec_cbk] 
0-glusterfs: failed to get the 'volume file' from server
[2015-02-20 15:00:17.076575] E [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk] 
0-mgmt: failed to fetch volume file (key:/w-vol)
[2015-02-20 15:00:17.076760] W [glusterfsd.c:1194:cleanup_and_exit] (--> 
0-: received signum (0), shutting down
[2015-02-20 15:00:17.076791] I [fuse-bridge.c:5599:fini] 0-fuse: 
Unmounting '/W/Projects'.
[2015-02-20 15:00:17.110711] W [glusterfsd.c:1194:cleanup_and_exit] (--> 
0-: received signum (15), shutting down
[2015-02-20 15:01:17.078206] I [MSGID: 100030] [glusterfsd.c:2018:main] 
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.2 
(args: /usr/sbin/glusterfs --acl --direct-io-mode=disable 
--use-readdirp=no --volfile-server=gl0 --volfile-id=/w-vol /W/Projects)
[2015-02-20 15:01:17.082935] E [glusterfsd-mgmt.c:1494:mgmt_getspec_cbk] 
0-glusterfs: failed to get the 'volume file' from server
[2015-02-20 15:01:17.082992] E [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk] 
0-mgmt: failed to fetch volume file (key:/w-vol)
[2015-02-20 15:01:17.083173] W [glusterfsd.c:1194:cleanup_and_exit] (--> 
0-: received signum (0), shutting down
[2015-02-20 15:01:17.083203] I [fuse-bridge.c:5599:fini] 0-fuse: 
Unmounting '/W/Projects'.



$ uname -a
Linux gl0 3.13.0-45-generic #74-Ubuntu SMP Tue Jan 13 19:36:28 UTC 2015 
x86_64 x86_64 x86_64 GNU/Linux


$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 14.04.2 LTS
Release:        14.04
Codename:       trusty



ii  glusterfs-client  3.6.2-ubuntu1~trusty3  amd64  clustered file-system (client package)
ii  glusterfs-common  3.6.2-ubuntu1~trusty3  amd64  GlusterFS common libraries and translator modules
ii  glusterfs-server  3.6.2-ubuntu1~trusty3  amd64  clustered file-system (server package)



Does anybody have an idea what's going on?


10x
tamas
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] GlusterD uses 50% of RAM

2015-02-20 Thread RASTELLI Alessandro
HTTP URL works fine.
Now shall I restart the glusterd daemon?

-Original Message-
From: Niels de Vos [mailto:nde...@redhat.com] 
Sent: venerdì 20 febbraio 2015 15:47
To: RASTELLI Alessandro
Cc: Atin Mukherjee; gluster-users@gluster.org
Subject: Re: [Gluster-users] GlusterD uses 50% of RAM

On Fri, Feb 20, 2015 at 01:50:38PM +, RASTELLI Alessandro wrote:
> I get this:
> 
> [root@gluster03-mi glusterfs]# git fetch 
> git://review.gluster.org/glusterfs refs/changes/28/9328/4 && git 
> checkout FETCH_HEAD
> fatal: Couldn't find remote ref refs/changes/28/9328/4
> 
> What's wrong with that?

I think anonymous git does not (always) work. You could try fetching over HTTP:

  $ git fetch http://review.gluster.org/glusterfs refs/changes/28/9328/4 && git 
checkout FETCH_HEAD

Niels

> 
> A.
> 
> -Original Message-
> From: Atin Mukherjee [mailto:amukh...@redhat.com]
> Sent: venerdì 20 febbraio 2015 12:54
> To: RASTELLI Alessandro
> Cc: gluster-users@gluster.org
> Subject: Re: [Gluster-users] GlusterD uses 50% of RAM
> 
> From the cmd log history I could see that lots of volume status commands were 
> triggered in parallel. This is a known issue in 3.6 and it causes a 
> memory leak. http://review.gluster.org/#/c/9328/ should solve it.
> 
> ~Atin
> 
> On 02/20/2015 04:36 PM, RASTELLI Alessandro wrote:
> > 10MB log
> > sorry :)
> > 
> > -Original Message-
> > From: Atin Mukherjee [mailto:amukh...@redhat.com]
> > Sent: venerdì 20 febbraio 2015 10:49
> > To: RASTELLI Alessandro; gluster-users@gluster.org
> > Subject: Re: [Gluster-users] GlusterD uses 50% of RAM
> > 
> > Could you please share the cmd_history.log & glusterd log file to analyze 
> > this high memory usage.
> > 
> > ~Atin
> > 
> > On 02/20/2015 03:10 PM, RASTELLI Alessandro wrote:
> >> Hi,
> >> I've noticed that one of our 6 gluster 3.6.2 nodes has its "glusterd" 
> >> process using 50% of RAM; on the other nodes usage is about 5%. Could this be 
> >> a bug?
> >> Should I restart the glusterd daemon?
> >> Thank you
> >> A
> >>
> >> From: Volnei Puttini [mailto:vol...@vcplinux.com.br]
> >> Sent: lunedì 9 febbraio 2015 18:06
> >> To: RASTELLI Alessandro; gluster-users@gluster.org
> >> Subject: Re: [Gluster-users] cannot access to CIFS export
> >>
> >> Hi Alessandro,
> >>
> >> My system:
> >>
> >> CentOS 7
> >>
> >> samba-vfs-glusterfs-4.1.1-37.el7_0.x86_64
> >> samba-winbind-4.1.1-37.el7_0.x86_64
> >> samba-libs-4.1.1-37.el7_0.x86_64
> >> samba-common-4.1.1-37.el7_0.x86_64
> >> samba-winbind-modules-4.1.1-37.el7_0.x86_64
> >> samba-winbind-clients-4.1.1-37.el7_0.x86_64
> >> samba-4.1.1-37.el7_0.x86_64
> >> samba-client-4.1.1-37.el7_0.x86_64
> >>
> >> glusterfs 3.6.2 built on Jan 22 2015 12:59:57
> >>
> >> Try this, work fine for me:
> >>
> >> [GFSVOL]
> >> browseable = No
> >> comment = Gluster share of volume gfsvol
> >> path = /
> >> read only = No
> >> guest ok = Yes
> >> kernel share modes = No
> >> posix locking = No
> >> vfs objects = glusterfs
> >> glusterfs:loglevel = 7
> >> glusterfs:logfile = /var/log/samba/glusterfs-gfstest.log
> >> glusterfs:volume = vgtest
> >> glusterfs:volfile_server = 192.168.2.21
> >>
> >> On 09-02-2015 14:45, RASTELLI Alessandro wrote:
> >> Hi,
> >> I've created and started a new replica volume "downloadstat" with CIFS 
> >> export enabled on GlusterFS 3.6.2.
> >> I can see the following piece has been added automatically to smb.conf:
> >> [gluster-downloadstat]
> >> comment = For samba share of volume downloadstat vfs objects = 
> >> glusterfs glusterfs:volume = downloadstat glusterfs:logfile = 
> >> /var/log/samba/glusterfs-downloadstat.%M.log
> >> glusterfs:loglevel = 7
> >> path = /
> >> read only = no
> >> guest ok = yes
> >>
> >> I restarted smb service, without errors.
> >> When I try to access "\\gluster01-mi\gluster-downloadstat" from a Win7 client,
> >> it asks me for a login (which user do I need to use?) and then gives me the error 
> >> "The network path was not found"
> >> and on Gluster smb.log I see:
> >> [2015/02/09 17:21:13.111639,  0] smbd/vfs.c:173(vfs_init_custom)
> >>   error probing vfs module 'glusterfs': NT_STATUS_UNSUCCESSFUL
> >> [2015/02/09 17:21:13.111709,  0] smbd/vfs.c:315(smbd_vfs_init)
> >>   smbd_vfs_init: vfs_init_custom failed for glusterfs
> >> [2015/02/09 17:21:13.111741,  0] smbd/service.c:902(make_connection_snum)
> >>   vfs_init failed for service gluster-downloadstat
> >>
> >> Can you explain how to fix?
> >> Thanks
> >>
> >> Alessandro
> >>
> >> From: gluster-users-boun...@gluster.org
> >> [mailto:gluster-users-boun...@gluster.org] On Behalf Of David F. Robinson
> >> Sent: domenica 8 febbraio 2015 18:19
> >> To: Gluster Devel;
> >> gluster-users@gluster.org
> >> Subject: [Gluster-users] cannot delete non-empty directory
> >>
> >> I am seeing these messages after I delete large amounts of data using 
> >> gluster 3.6.2.
> >> c

Re: [Gluster-users] GlusterD uses 50% of RAM

2015-02-20 Thread Niels de Vos
On Fri, Feb 20, 2015 at 01:50:38PM +, RASTELLI Alessandro wrote:
> I get this:
> 
> [root@gluster03-mi glusterfs]# git fetch git://review.gluster.org/glusterfs 
> refs/changes/28/9328/4 && git checkout FETCH_HEAD
> fatal: Couldn't find remote ref refs/changes/28/9328/4
> 
> What's wrong with that?

I think anonymous git does not (always) work. You could try fetching
over HTTP:

  $ git fetch http://review.gluster.org/glusterfs refs/changes/28/9328/4 && git 
checkout FETCH_HEAD

Niels

> 
> A.
> 
> -Original Message-
> From: Atin Mukherjee [mailto:amukh...@redhat.com] 
> Sent: venerdì 20 febbraio 2015 12:54
> To: RASTELLI Alessandro
> Cc: gluster-users@gluster.org
> Subject: Re: [Gluster-users] GlusterD uses 50% of RAM
> 
> From the cmd log history I could see that lots of volume status commands were 
> triggered in parallel. This is a known issue in 3.6 and it causes a 
> memory leak. http://review.gluster.org/#/c/9328/ should solve it.
> 
> ~Atin
> 
> On 02/20/2015 04:36 PM, RASTELLI Alessandro wrote:
> > 10MB log
> > sorry :)
> > 
> > -Original Message-
> > From: Atin Mukherjee [mailto:amukh...@redhat.com]
> > Sent: venerdì 20 febbraio 2015 10:49
> > To: RASTELLI Alessandro; gluster-users@gluster.org
> > Subject: Re: [Gluster-users] GlusterD uses 50% of RAM
> > 
> > Could you please share the cmd_history.log & glusterd log file to analyze 
> > this high memory usage.
> > 
> > ~Atin
> > 
> > On 02/20/2015 03:10 PM, RASTELLI Alessandro wrote:
> >> Hi,
> >> I've noticed that one of our 6 gluster 3.6.2 nodes has its "glusterd" 
> >> process using 50% of RAM; on the other nodes usage is about 5%. Could this be 
> >> a bug?
> >> Should I restart the glusterd daemon?
> >> Thank you
> >> A
> >>
> >> From: Volnei Puttini [mailto:vol...@vcplinux.com.br]
> >> Sent: lunedì 9 febbraio 2015 18:06
> >> To: RASTELLI Alessandro; gluster-users@gluster.org
> >> Subject: Re: [Gluster-users] cannot access to CIFS export
> >>
> >> Hi Alessandro,
> >>
> >> My system:
> >>
> >> CentOS 7
> >>
> >> samba-vfs-glusterfs-4.1.1-37.el7_0.x86_64
> >> samba-winbind-4.1.1-37.el7_0.x86_64
> >> samba-libs-4.1.1-37.el7_0.x86_64
> >> samba-common-4.1.1-37.el7_0.x86_64
> >> samba-winbind-modules-4.1.1-37.el7_0.x86_64
> >> samba-winbind-clients-4.1.1-37.el7_0.x86_64
> >> samba-4.1.1-37.el7_0.x86_64
> >> samba-client-4.1.1-37.el7_0.x86_64
> >>
> >> glusterfs 3.6.2 built on Jan 22 2015 12:59:57
> >>
> >> Try this, work fine for me:
> >>
> >> [GFSVOL]
> >> browseable = No
> >> comment = Gluster share of volume gfsvol
> >> path = /
> >> read only = No
> >> guest ok = Yes
> >> kernel share modes = No
> >> posix locking = No
> >> vfs objects = glusterfs
> >> glusterfs:loglevel = 7
> >> glusterfs:logfile = /var/log/samba/glusterfs-gfstest.log
> >> glusterfs:volume = vgtest
> >> glusterfs:volfile_server = 192.168.2.21
> >>
> >> On 09-02-2015 14:45, RASTELLI Alessandro wrote:
> >> Hi,
> >> I've created and started a new replica volume "downloadstat" with CIFS 
> >> export enabled on GlusterFS 3.6.2.
> >> I can see the following piece has been added automatically to smb.conf:
> >> [gluster-downloadstat]
> >> comment = For samba share of volume downloadstat vfs objects = 
> >> glusterfs glusterfs:volume = downloadstat glusterfs:logfile = 
> >> /var/log/samba/glusterfs-downloadstat.%M.log
> >> glusterfs:loglevel = 7
> >> path = /
> >> read only = no
> >> guest ok = yes
> >>
> >> I restarted smb service, without errors.
> >> When I try to access "\\gluster01-mi\gluster-downloadstat" from a Win7 client,
> >> it asks me for a login (which user do I need to use?) and then gives me the error 
> >> "The network path was not found"
> >> and on Gluster smb.log I see:
> >> [2015/02/09 17:21:13.111639,  0] smbd/vfs.c:173(vfs_init_custom)
> >>   error probing vfs module 'glusterfs': NT_STATUS_UNSUCCESSFUL
> >> [2015/02/09 17:21:13.111709,  0] smbd/vfs.c:315(smbd_vfs_init)
> >>   smbd_vfs_init: vfs_init_custom failed for glusterfs
> >> [2015/02/09 17:21:13.111741,  0] smbd/service.c:902(make_connection_snum)
> >>   vfs_init failed for service gluster-downloadstat
> >>
> >> Can you explain how to fix?
> >> Thanks
> >>
> >> Alessandro
> >>
> >> From: gluster-users-boun...@gluster.org
> >> [mailto:gluster-users-boun...@gluster.org] On Behalf Of David F. Robinson
> >> Sent: domenica 8 febbraio 2015 18:19
> >> To: Gluster Devel;
> >> gluster-users@gluster.org
> >> Subject: [Gluster-users] cannot delete non-empty directory
> >>
> >> I am seeing these messages after I delete large amounts of data using 
> >> gluster 3.6.2.
> >> cannot delete non-empty directory: 
> >> old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final
> >>
> >> From the FUSE mount (as root), the directory shows up as empty:
> >>
> >> # pwd
> >> /backup/homegfs/backup.0/old_shelf4/Aegis/!!!Programs/RavenCFD/Storage
> >> 

Re: [Gluster-users] pb glusterfs 3.4.2 built on Jan 3 2014 12:38:05

2015-02-20 Thread A Ghoshal
Something's wrong with the configuration data in /var/lib/glusterd. Try 
running glusterd with debug:

glusterd --debug

It might have more details.



From:   Pierre Léonard 
To: "gluster-users@gluster.org" 
Date:   02/20/2015 08:08 PM
Subject:[Gluster-users] pb glusterfs 3.4.2 built on Jan  3 2014 
12:38:05
Sent by:gluster-users-boun...@gluster.org



Hi All,

I have a problem restarting the glusterd service (release 3.4.2). Some of 
my 14 nodes (CentOS 6.5 and 6.6) have stopped the service, and when I try to 
restart it I get these messages in etc-glusterfs-glusterd.vol.log:

[root@xstoocky10 glusterfs]# cat etc-glusterfs-glusterd.vol.log
[2015-02-20 14:31:22.094851] I [glusterfsd.c:1910:main] 
0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.4.2 
(/usr/sbin/glusterd --pid-file=/var/run/glusterd.pid)
[2015-02-20 14:31:22.099381] I [glusterd.c:961:init] 0-management: Using 
/var/lib/glusterd as working directory
[2015-02-20 14:31:22.103021] I [socket.c:3480:socket_init] 
0-socket.management: SSL support is NOT enabled
[2015-02-20 14:31:22.103056] I [socket.c:3495:socket_init] 
0-socket.management: using system polling thread
[2015-02-20 14:31:22.103949] W [rdma.c:4197:__gf_rdma_ctx_create] 
0-rpc-transport/rdma: rdma_cm event channel creation failed (No such 
device)
[2015-02-20 14:31:22.103980] E [rdma.c:4485:init] 0-rdma.management: 
Failed to initialize IB Device
[2015-02-20 14:31:22.103995] E [rpc-transport.c:320:rpc_transport_load] 
0-rpc-transport: 'rdma' initialization failed
[2015-02-20 14:31:22.104080] W [rpcsvc.c:1389:rpcsvc_transport_create] 
0-rpc-service: cannot create listener, initing the transport failed
[2015-02-20 14:31:24.177993] I 
[glusterd-store.c:1339:glusterd_restore_op_version] 0-glusterd: retrieved 
op-version: 2
[2015-02-20 14:31:24.189994] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-0
[2015-02-20 14:31:24.190047] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-1
[2015-02-20 14:31:24.190070] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-2
[2015-02-20 14:31:24.190090] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-3
[2015-02-20 14:31:24.190109] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-4
[2015-02-20 14:31:24.190128] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-5
[2015-02-20 14:31:24.190147] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-6
[2015-02-20 14:31:24.190166] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-7
[2015-02-20 14:31:24.190185] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-8
[2015-02-20 14:31:24.190203] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-9
[2015-02-20 14:31:24.190222] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-10
[2015-02-20 14:31:24.190242] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-11
[2015-02-20 14:31:24.190261] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-12
[2015-02-20 14:31:24.190280] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-13
[2015-02-20 14:31:24.630365] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-0
[2015-02-20 14:31:24.630416] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-1
[2015-02-20 14:31:24.630439] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-2
[2015-02-20 14:31:24.630460] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-3
[2015-02-20 14:31:24.630479] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-4
[2015-02-20 14:31:24.630499] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-5
[2015-02-20 14:31:24.630518] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-6
[2015-02-20 14:31:24.630538] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-7
[2015-02-20 14:31:24.630557] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-8
[2015-02-20 14:31:24.630577] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-9
[2015-02-20 14:31:24.630597] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-10
[2015-02-20 14:31:24.630617] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-11
[2015-02-20 14:31:24.630636] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-12
[2015-02-20 14:31:24.630668] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-13
[2015-02-20 14:31:24.750892

[Gluster-users] pb glusterfs 3.4.2 built on Jan 3 2014 12:38:05

2015-02-20 Thread Pierre Léonard

Hi All,

I have a problem restarting the glusterd service (release 3.4.2). Some 
of my 14 nodes (CentOS 6.5 and 6.6) have stopped the service, and when I 
try to restart it I get the following messages in etc-glusterfs-glusterd.vol.log:


[root@xstoocky10 glusterfs]# cat etc-glusterfs-glusterd.vol.log
[2015-02-20 14:31:22.094851] I [glusterfsd.c:1910:main] 
0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.4.2 
(/usr/sbin/glusterd --pid-file=/var/run/glusterd.pid)
[2015-02-20 14:31:22.099381] I [glusterd.c:961:init] 0-management: Using 
/var/lib/glusterd as working directory
[2015-02-20 14:31:22.103021] I [socket.c:3480:socket_init] 
0-socket.management: SSL support is NOT enabled
[2015-02-20 14:31:22.103056] I [socket.c:3495:socket_init] 
0-socket.management: using system polling thread
[2015-02-20 14:31:22.103949] W [rdma.c:4197:__gf_rdma_ctx_create] 
0-rpc-transport/rdma: rdma_cm event channel creation failed (No such device)
[2015-02-20 14:31:22.103980] E [rdma.c:4485:init] 0-rdma.management: 
Failed to initialize IB Device
[2015-02-20 14:31:22.103995] E [rpc-transport.c:320:rpc_transport_load] 
0-rpc-transport: 'rdma' initialization failed
[2015-02-20 14:31:22.104080] W [rpcsvc.c:1389:rpcsvc_transport_create] 
0-rpc-service: cannot create listener, initing the transport failed
[2015-02-20 14:31:24.177993] I 
[glusterd-store.c:1339:glusterd_restore_op_version] 0-glusterd: 
retrieved op-version: 2
[2015-02-20 14:31:24.189994] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-0
[2015-02-20 14:31:24.190047] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-1
[2015-02-20 14:31:24.190070] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-2
[2015-02-20 14:31:24.190090] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-3
[2015-02-20 14:31:24.190109] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-4
[2015-02-20 14:31:24.190128] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-5
[2015-02-20 14:31:24.190147] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-6
[2015-02-20 14:31:24.190166] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-7
[2015-02-20 14:31:24.190185] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-8
[2015-02-20 14:31:24.190203] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-9
[2015-02-20 14:31:24.190222] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-10
[2015-02-20 14:31:24.190242] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-11
[2015-02-20 14:31:24.190261] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-12
[2015-02-20 14:31:24.190280] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-13
[2015-02-20 14:31:24.630365] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-0
[2015-02-20 14:31:24.630416] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-1
[2015-02-20 14:31:24.630439] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-2
[2015-02-20 14:31:24.630460] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-3
[2015-02-20 14:31:24.630479] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-4
[2015-02-20 14:31:24.630499] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-5
[2015-02-20 14:31:24.630518] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-6
[2015-02-20 14:31:24.630538] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-7
[2015-02-20 14:31:24.630557] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-8
[2015-02-20 14:31:24.630577] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-9
[2015-02-20 14:31:24.630597] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-10
[2015-02-20 14:31:24.630617] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-11
[2015-02-20 14:31:24.630636] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-12
[2015-02-20 14:31:24.630668] E 
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: 
brick-13
[2015-02-20 14:31:24.750892] I 
[glusterd-handler.c:2818:glusterd_friend_add] 0-management: connect 
returned 0
[2015-02-20 14:31:24.762397] I 
[glusterd-handler.c:2818:glusterd_friend_add] 0-management: connect 
returned 0
[2015-02-20 14:31:24.773862] I 
[glusterd-handler.c:2818:glusterd_friend_add] 0-management: connect 
returned 0
[2015-02-20 14:31:24.785393] I 
[glusterd-handler.c:2818:glu

Re: [Gluster-users] [Gluster-devel] In a replica 2 server, file-updates on one server missing on the other server #Personal#

2015-02-20 Thread A Ghoshal
I found out the reason this happens a few days back. Just to let you 
know...

It seems it has partly to do with the way we handle reboots in our setup. 
When we take down one of our replica servers (for testing/maintenance), we 
kill off the glusterfsd processes to ensure that the bricks are unmounted 
correctly (short of stopping the volume and causing service disruption to 
the mount clients). Let us assume that serv1 is being rebooted. When we 
kill off glusterfsd:

For file-systems that are normally not accessed: 

1. ping between the mount client on serv0 and the brick's glusterfsd on 
serv1 times out. In our system, this ping is configured at 10 seconds. 

2. At this point, the mount client on serv0 destroys the now-defunct TCP 
connection and queries the remote glusterd process for the remote brick's 
port. 

3. But, since by this time serv1 is already down, no response arrives, and 
the local mount client retries the query till serv1 is up once more, upon 
which the glusterd on serv1  responds with the newly allocated port number 
for the brick, and a new connection is thus established.

For frequently accessed file-systems: 

1. It is one of the file operations (read/write) that times out, which 
happens much earlier than the 10-second ping timeout. This results in the 
connection being destroyed and the mount client on serv0 querying the 
remote glusterd for the remote brick's port number. 

2. Because this happens so quickly, glusterd on serv1 is not yet down, and 
also unaware that the local brick is not alive anymore. So, it returns the 
port number of the dead process.

3. For the mount client on serv0, since the query succeeded, it does not 
attempt another port query, but instead tries to connect to the stale port 
number ad infinitum. 

Our solution to this problem is simple - before we kill glusterfsd and 
unmount the bricks, we stop glusterd:

/etc/init.d/glusterd stop

This ensures that the portmap queries by the mount client on serv0 are 
never honored.
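
As a rough sketch, the shutdown order looks like this (the pkill pattern and
the brick mount point are illustrative placeholders, not our exact scripts):

# on the server going down for maintenance (serv1)
/etc/init.d/glusterd stop    # stop glusterd first so it can no longer answer portmap queries
pkill glusterfsd             # then kill the brick processes
umount <brick-mount-point>   # and finally unmount the bricks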

Thanks,
Anirban



From:   A Ghoshal/MUM/TCS
To: Ben England 
Cc: gluster-users@gluster.org
Date:   02/05/2015 04:50 AM
Subject:Re: [Gluster-devel] [Gluster-users] In a replica 2 server, 
file-updates on one server missing on the other server #Personal#
Sent by:A Ghoshal


CC gluster-users.

No, there aren't any firewall rules on our server. As I wrote in one of my 
earlier emails, if I kill the mount client and remount the volume, the 
problem disappears. That is to say, this causes the client to refresh the 
remote port data, and from there everything's fine. Also, we don't use 
gfapi - and bind() is always good.




From:   Ben England 
To: A Ghoshal 
Date:   02/05/2015 04:40 AM
Subject:Re: [Gluster-devel] [Gluster-users] In a replica 2 server, 
file-updates on one server missing on the other server #Personal#



Could it be a problem with iptables blocking connections?  Do an "iptables 
--list" and make sure the gluster ports are allowed through at both ends. 
Also, if you are using libgfapi, be sure you use rpc-auth-allow-insecure 
if you have a lot of gfapi instances, or else you'll run into problems.

- Original Message -
> From: "A Ghoshal" 
> To: "Ben England" 
> Sent: Wednesday, February 4, 2015 6:07:10 PM
> Subject: Re: [Gluster-devel] [Gluster-users] In a replica 2 server, 
file-updates on one server missing on the other
> server #Personal#
> 
> Thanks, Ben, same here :/ I can actually get the port number for glusterfsd
> in any of three ways:
> 
> 1. gluster volume status 
> 2. command line for glusterfsd on target server.
> 3. if you're really paranoid, get the glusterfsd PID and use netstat.
> 
> Looking at the code, it seems to me that the whole thing operates on a
> statd-notify paradigm. Your local mount client registers for notification on
> all remote glusterfsd's. When a remote brick goes down and comes back up, the
> client is notified and then calls portmap to obtain the remote glusterfsd port.
> 
> I see here that both glusterd processes are up. But somehow the port number
> of the remote glusterfsd held by the mount client is now stale - not sure how
> that happens. Now, the client keeps trying to connect on the stale port every
> 3 seconds. It gets a return errno of -111 (-ECONNREFUSED), which clearly
> indicates that there is no listener on the remote host's IP at this port.
> Design-wise, could it indicate to the mount client that the port number
> information needs to be refreshed? Would you say this is a bug of sorts?
> 
> 
> 
> 
> From:   Ben England 
> To: A Ghoshal 
> Date:   02/05/2015 03:59 AM
> Subject:Re: [Gluster-devel] [Gluster-users] In a replica 2 
server,
> file-updates on one server missing on the other server #Personal#
> 
> 
> 
> I thought Gluster was based on ONC RPC, which means there are no fixed
> port numbers except for glusterd (24007).  The client connects to
> Glusterd, reads the volfile, and gets the port numbers of the registered
> glusterfsd processes at that ti
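
For reference, a minimal sketch of the three port lookups mentioned above
(<volname> and <glusterfsd-pid> are placeholders):

gluster volume status <volname>          # 1. shows the port per brick
ps -ef | grep glusterfsd                 # 2. the brick command line normally carries --brick-port
netstat -tlnp | grep <glusterfsd-pid>    # 3. confirm which port that PID is actually listening on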

Re: [Gluster-users] GlusterD uses 50% of RAM

2015-02-20 Thread RASTELLI Alessandro
I get this:

[root@gluster03-mi glusterfs]# git fetch git://review.gluster.org/glusterfs 
refs/changes/28/9328/4 && git checkout FETCH_HEAD
fatal: Couldn't find remote ref refs/changes/28/9328/4

What's wrong with that?

A.
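
One way to narrow it down (a hedged sketch; Gerrit publishes patchsets under
refs/changes/<NN>/<change>/<patchset>, and the patchset you request has to
exist on the server):

# list the patchsets the server actually has for change 9328
git ls-remote git://review.gluster.org/glusterfs "refs/changes/28/9328/*"

# then fetch one of the refs that is listed
git fetch git://review.gluster.org/glusterfs refs/changes/28/9328/<patchset> && git checkout FETCH_HEAD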

-Original Message-
From: Atin Mukherjee [mailto:amukh...@redhat.com] 
Sent: venerdì 20 febbraio 2015 12:54
To: RASTELLI Alessandro
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] GlusterD uses 50% of RAM

From the cmd log history I could see that lots of volume status commands were 
triggered in parallel. This is a known issue in 3.6 and it can cause a memory 
leak. http://review.gluster.org/#/c/9328/ should solve it.

~Atin

On 02/20/2015 04:36 PM, RASTELLI Alessandro wrote:
> 10MB log
> sorry :)
> 
> -Original Message-
> From: Atin Mukherjee [mailto:amukh...@redhat.com]
> Sent: venerdì 20 febbraio 2015 10:49
> To: RASTELLI Alessandro; gluster-users@gluster.org
> Subject: Re: [Gluster-users] GlusterD uses 50% of RAM
> 
> Could you please share the cmd_history.log & glusterd log file to analyze 
> this high memory usage.
> 
> ~Atin
> 
> On 02/20/2015 03:10 PM, RASTELLI Alessandro wrote:
>> Hi,
>> I've noticed that one of our 6 gluster 3.6.2 nodes has "glusterd" 
>> process using 50% of RAM, on the other nodes usage is about 5% This can be a 
>> bug?
>> Should I restart glusterd daemon?
>> Thank you
>> A
>>
>> From: Volnei Puttini [mailto:vol...@vcplinux.com.br]
>> Sent: lunedì 9 febbraio 2015 18:06
>> To: RASTELLI Alessandro; gluster-users@gluster.org
>> Subject: Re: [Gluster-users] cannot access to CIFS export
>>
>> Hi Alessandro,
>>
>> My system:
>>
>> CentOS 7
>>
>> samba-vfs-glusterfs-4.1.1-37.el7_0.x86_64
>> samba-winbind-4.1.1-37.el7_0.x86_64
>> samba-libs-4.1.1-37.el7_0.x86_64
>> samba-common-4.1.1-37.el7_0.x86_64
>> samba-winbind-modules-4.1.1-37.el7_0.x86_64
>> samba-winbind-clients-4.1.1-37.el7_0.x86_64
>> samba-4.1.1-37.el7_0.x86_64
>> samba-client-4.1.1-37.el7_0.x86_64
>>
>> glusterfs 3.6.2 built on Jan 22 2015 12:59:57
>>
>> Try this, work fine for me:
>>
>> [GFSVOL]
>> browseable = No
>> comment = Gluster share of volume gfsvol
>> path = /
>> read only = No
>> guest ok = Yes
>> kernel share modes = No
>> posix locking = No
>> vfs objects = glusterfs
>> glusterfs:loglevel = 7
>> glusterfs:logfile = /var/log/samba/glusterfs-gfstest.log
>> glusterfs:volume = vgtest
>> glusterfs:volfile_server = 192.168.2.21
>>
>> On 09-02-2015 14:45, RASTELLI Alessandro wrote:
>> Hi,
>> I've created and started a new replica volume "downloadstat" with CIFS 
>> export enabled on GlusterFS 3.6.2.
>> I can see the following piece has been added automatically to smb.conf:
>> [gluster-downloadstat]
>> comment = For samba share of volume downloadstat vfs objects = 
>> glusterfs glusterfs:volume = downloadstat glusterfs:logfile = 
>> /var/log/samba/glusterfs-downloadstat.%M.log
>> glusterfs:loglevel = 7
>> path = /
>> read only = no
>> guest ok = yes
>>
>> I restarted smb service, without errors.
>> When I try to access from Win7 client to 
>> "\\gluster01-mi\gluster-downloadstat"
>>  it asks me a login (which user do I need to put?) and then gives me error 
>> "The network path was not found"
>> and on Gluster smb.log I see:
>> [2015/02/09 17:21:13.111639,  0] smbd/vfs.c:173(vfs_init_custom)
>>   error probing vfs module 'glusterfs': NT_STATUS_UNSUCCESSFUL
>> [2015/02/09 17:21:13.111709,  0] smbd/vfs.c:315(smbd_vfs_init)
>>   smbd_vfs_init: vfs_init_custom failed for glusterfs
>> [2015/02/09 17:21:13.111741,  0] smbd/service.c:902(make_connection_snum)
>>   vfs_init failed for service gluster-downloadstat
>>
>> Can you explain how to fix?
>> Thanks
>>
>> Alessandro
>>
>> From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of David F. Robinson
>> Sent: domenica 8 febbraio 2015 18:19
>> To: Gluster Devel;
>> gluster-users@gluster.org
>> Subject: [Gluster-users] cannot delete non-empty directory
>>
>> I am seeing these messages after I delete large amounts of data using 
>> gluster 3.6.2.
>> cannot delete non-empty directory: 
>> old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final
>>
>> From the FUSE mount (as root), the directory shows up as empty:
>>
>> # pwd
>> /backup/homegfs/backup.0/old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final
>>
>> # ls -al
>> total 5
>> d- 2 root root4106 Feb  6 13:55 .
>> drwxrws--- 3  601 dmiller   72 Feb  6 13:55 ..
>>
>> However, when you look at the bricks, the files are still there (none on 
>> brick01bkp, all files are on brick02bkp).  All of the files are 0-length and 
>> have --T permissions.
>> Any suggestions on how to fix this and how to prevent it from happening?
>>
>> #  ls -al
>> /data/brick*/homegfs_bkp/backup.0/old_shelf4/Aegis/\!\!\!Programs/RavenCFD/Storage/Jimmy_Old/sr

Re: [Gluster-users] GlusterD uses 50% of RAM

2015-02-20 Thread Atin Mukherjee
From the cmd log history I could see that lots of volume status commands were
triggered in parallel. This is a known issue in 3.6 and it can cause a
memory leak. http://review.gluster.org/#/c/9328/ should solve it.

~Atin
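
To confirm it is the same leak, a quick sketch (the log path is the stock
default and may differ on your install):

# how big has glusterd grown, in kB of resident memory?
ps -o pid,rss,cmd -C glusterd

# how often were volume status commands issued?
grep -c 'volume status' /var/log/glusterfs/cmd_history.log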

On 02/20/2015 04:36 PM, RASTELLI Alessandro wrote:
> 10MB log
> sorry :)
> 
> -Original Message-
> From: Atin Mukherjee [mailto:amukh...@redhat.com] 
> Sent: venerdì 20 febbraio 2015 10:49
> To: RASTELLI Alessandro; gluster-users@gluster.org
> Subject: Re: [Gluster-users] GlusterD uses 50% of RAM
> 
> Could you please share the cmd_history.log & glusterd log file to analyze 
> this high memory usage.
> 
> ~Atin
> 
> On 02/20/2015 03:10 PM, RASTELLI Alessandro wrote:
>> Hi,
>> I've noticed that one of our 6 gluster 3.6.2 nodes has "glusterd" 
>> process using 50% of RAM, on the other nodes usage is about 5% This can be a 
>> bug?
>> Should I restart glusterd daemon?
>> Thank you
>> A
>>
>> From: Volnei Puttini [mailto:vol...@vcplinux.com.br]
>> Sent: lunedì 9 febbraio 2015 18:06
>> To: RASTELLI Alessandro; gluster-users@gluster.org
>> Subject: Re: [Gluster-users] cannot access to CIFS export
>>
>> Hi Alessandro,
>>
>> My system:
>>
>> CentOS 7
>>
>> samba-vfs-glusterfs-4.1.1-37.el7_0.x86_64
>> samba-winbind-4.1.1-37.el7_0.x86_64
>> samba-libs-4.1.1-37.el7_0.x86_64
>> samba-common-4.1.1-37.el7_0.x86_64
>> samba-winbind-modules-4.1.1-37.el7_0.x86_64
>> samba-winbind-clients-4.1.1-37.el7_0.x86_64
>> samba-4.1.1-37.el7_0.x86_64
>> samba-client-4.1.1-37.el7_0.x86_64
>>
>> glusterfs 3.6.2 built on Jan 22 2015 12:59:57
>>
>> Try this, work fine for me:
>>
>> [GFSVOL]
>> browseable = No
>> comment = Gluster share of volume gfsvol
>> path = /
>> read only = No
>> guest ok = Yes
>> kernel share modes = No
>> posix locking = No
>> vfs objects = glusterfs
>> glusterfs:loglevel = 7
>> glusterfs:logfile = /var/log/samba/glusterfs-gfstest.log
>> glusterfs:volume = vgtest
>> glusterfs:volfile_server = 192.168.2.21
>>
>> On 09-02-2015 14:45, RASTELLI Alessandro wrote:
>> Hi,
>> I've created and started a new replica volume "downloadstat" with CIFS 
>> export enabled on GlusterFS 3.6.2.
>> I can see the following piece has been added automatically to smb.conf:
>> [gluster-downloadstat]
>> comment = For samba share of volume downloadstat vfs objects = 
>> glusterfs glusterfs:volume = downloadstat glusterfs:logfile = 
>> /var/log/samba/glusterfs-downloadstat.%M.log
>> glusterfs:loglevel = 7
>> path = /
>> read only = no
>> guest ok = yes
>>
>> I restarted smb service, without errors.
>> When I try to access from Win7 client to 
>> "\\gluster01-mi\gluster-downloadstat"
>>  it asks me a login (which user do I need to put?) and then gives me error 
>> "The network path was not found"
>> and on Gluster smb.log I see:
>> [2015/02/09 17:21:13.111639,  0] smbd/vfs.c:173(vfs_init_custom)
>>   error probing vfs module 'glusterfs': NT_STATUS_UNSUCCESSFUL
>> [2015/02/09 17:21:13.111709,  0] smbd/vfs.c:315(smbd_vfs_init)
>>   smbd_vfs_init: vfs_init_custom failed for glusterfs
>> [2015/02/09 17:21:13.111741,  0] smbd/service.c:902(make_connection_snum)
>>   vfs_init failed for service gluster-downloadstat
>>
>> Can you explain how to fix?
>> Thanks
>>
>> Alessandro
>>
>> From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of David F. Robinson
>> Sent: domenica 8 febbraio 2015 18:19
>> To: Gluster Devel; 
>> gluster-users@gluster.org
>> Subject: [Gluster-users] cannot delete non-empty directory
>>
>> I am seeing these messages after I delete large amounts of data using 
>> gluster 3.6.2.
>> cannot delete non-empty directory: 
>> old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final
>>
>> From the FUSE mount (as root), the directory shows up as empty:
>>
>> # pwd
>> /backup/homegfs/backup.0/old_shelf4/Aegis/!!!Programs/RavenCFD/Storage
>> /Jimmy_Old/src_vj1.5_final
>>
>> # ls -al
>> total 5
>> d- 2 root root4106 Feb  6 13:55 .
>> drwxrws--- 3  601 dmiller   72 Feb  6 13:55 ..
>>
>> However, when you look at the bricks, the files are still there (none on 
>> brick01bkp, all files are on brick02bkp).  All of the files are 0-length and 
>> have --T permissions.
>> Any suggestions on how to fix this and how to prevent it from happening?
>>
>> #  ls -al 
>> /data/brick*/homegfs_bkp/backup.0/old_shelf4/Aegis/\!\!\!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final
>> /data/brick01bkp/homegfs_bkp/backup.0/old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final:
>> total 4
>> d-+ 2 root root  10 Feb  6 13:55 .
>> drwxrws---+ 3  601 raven 36 Feb  6 13:55 ..
>>
>> /data/brick02bkp/homegfs_bkp/backup.0/old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final:
>> total 8
>> d-+ 3 root root  4096 Dec 31  1969 .
>> drwxrws---+ 3  601 raven   36 Feb  6 13:55 ..
>> ---

Re: [Gluster-users] [TSR] Failed tests on glusterfs-3.6.3beta1, ZFS, CentOS 6.6

2015-02-20 Thread Kiran Patil
I reran the above failed tests on ext4, and below are the ones that still failed.

tests/basic/quota-anon-fd-nfs.t
tests/basic/volume-snapshot.t
tests/bugs/bug-1045333.t
tests/bugs/bug-1087198.t
tests/bugs/bug-1113975.t
tests/bugs/bug-1117851.t
tests/bugs/bug-1161886/bug-1161886.t
tests/bugs/bug-1162498.t
tests/bugs/bug-765380.t
tests/bugs/bug-824753.t

Thanks,
Kiran.
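
For anyone who wants to reproduce, a minimal sketch of rerunning a single
failed test from the glusterfs source tree (assuming the standard regression
harness prerequisites are installed; the path is a placeholder):

cd /path/to/glusterfs-source            # the tree the build came from
prove -vf tests/bugs/bug-1045333.t      # rerun one failed test with verbose output
./run-tests.sh                          # or rerun the whole regression suite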


On Fri, Feb 20, 2015 at 4:40 PM, Kiran Patil  wrote:
> Please find the below gluster regression test summary report.
>
> Test Summary Report
> ---
> ./tests/basic/ec/quota.t
> (Wstat: 0 Tests: 22 Failed: 2)
>   Failed tests:  16, 20
> ./tests/basic/quota-anon-fd-nfs.t
> (Wstat: 0 Tests: 21 Failed: 1)
>   Failed test:  18
> ./tests/basic/quota.t
> (Wstat: 0 Tests: 73 Failed: 4)
>   Failed tests:  24, 28, 32, 65
> ./tests/basic/uss.t
> (Wstat: 0 Tests: 158 Failed: 8)
>   Failed tests:  37-38, 69-70, 99-100, 127-128
> ./tests/basic/volume-snapshot.t
> (Wstat: 0 Tests: 29 Failed: 2)
>   Failed tests:  28-29
> ./tests/bugs/bug-1023974.t
> (Wstat: 0 Tests: 15 Failed: 1)
>   Failed test:  12
> ./tests/bugs/bug-1038598.t
> (Wstat: 0 Tests: 28 Failed: 6)
>   Failed tests:  17, 21-22, 26-28
> ./tests/bugs/bug-1045333.t
> (Wstat: 0 Tests: 16 Failed: 1)
>   Failed test:  15
> ./tests/bugs/bug-1087198.t
> (Wstat: 0 Tests: 26 Failed: 2)
>   Failed tests:  18, 23
> ./tests/bugs/bug-1113975.t
> (Wstat: 0 Tests: 13 Failed: 3)
>   Failed tests:  11-13
> ./tests/bugs/bug-1117851.t
> (Wstat: 0 Tests: 24 Failed: 1)
>   Failed test:  15
> ./tests/bugs/bug-1161886/bug-1161886.t
> (Wstat: 0 Tests: 16 Failed: 4)
>   Failed tests:  13-16
> ./tests/bugs/bug-1162498.t
> (Wstat: 0 Tests: 30 Failed: 13)
>   Failed tests:  10, 19-30
> ./tests/bugs/bug-765380.t
> (Wstat: 0 Tests: 9 Failed: 1)
>   Failed test:  6
> ./tests/bugs/bug-824753.t
> (Wstat: 0 Tests: 16 Failed: 1)
>   Failed test:  11
> ./tests/bugs/bug-948729/bug-948729-mode-script.t
> (Wstat: 0 Tests: 23 Failed: 2)
>   Failed tests:  19, 23
> ./tests/bugs/bug-948729/bug-948729.t
> (Wstat: 0 Tests: 23 Failed: 2)
>   Failed tests:  19, 23
> Files=296, Tests=8411, 8656 wallclock secs ( 3.62 usr  1.97 sys +
> 527.17 cusr 683.51 csys = 1216.27 CPU)
> Result: FAIL
>
> Thanks,
> Kiran.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterD uses 50% of RAM

2015-02-20 Thread Atin Mukherjee
Could you please share the cmd_history.log & glusterd log file to
analyze this high memory usage.

~Atin
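
On a stock install both files live under /var/log/glusterfs (a sketch; adjust
if your log directory differs):

ls -lh /var/log/glusterfs/cmd_history.log \
       /var/log/glusterfs/etc-glusterfs-glusterd.vol.log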

On 02/20/2015 03:10 PM, RASTELLI Alessandro wrote:
> Hi,
> I've noticed that one of our 6 gluster 3.6.2 nodes has "glusterd" process 
> using 50% of RAM, on the other nodes usage is about 5%
> This can be a bug?
> Should I restart glusterd daemon?
> Thank you
> A
> 
> From: Volnei Puttini [mailto:vol...@vcplinux.com.br]
> Sent: lunedì 9 febbraio 2015 18:06
> To: RASTELLI Alessandro; gluster-users@gluster.org
> Subject: Re: [Gluster-users] cannot access to CIFS export
> 
> Hi Alessandro,
> 
> My system:
> 
> CentOS 7
> 
> samba-vfs-glusterfs-4.1.1-37.el7_0.x86_64
> samba-winbind-4.1.1-37.el7_0.x86_64
> samba-libs-4.1.1-37.el7_0.x86_64
> samba-common-4.1.1-37.el7_0.x86_64
> samba-winbind-modules-4.1.1-37.el7_0.x86_64
> samba-winbind-clients-4.1.1-37.el7_0.x86_64
> samba-4.1.1-37.el7_0.x86_64
> samba-client-4.1.1-37.el7_0.x86_64
> 
> glusterfs 3.6.2 built on Jan 22 2015 12:59:57
> 
> Try this, work fine for me:
> 
> [GFSVOL]
> browseable = No
> comment = Gluster share of volume gfsvol
> path = /
> read only = No
> guest ok = Yes
> kernel share modes = No
> posix locking = No
> vfs objects = glusterfs
> glusterfs:loglevel = 7
> glusterfs:logfile = /var/log/samba/glusterfs-gfstest.log
> glusterfs:volume = vgtest
> glusterfs:volfile_server = 192.168.2.21
> 
> On 09-02-2015 14:45, RASTELLI Alessandro wrote:
> Hi,
> I've created and started a new replica volume "downloadstat" with CIFS export 
> enabled on GlusterFS 3.6.2.
> I can see the following piece has been added automatically to smb.conf:
> [gluster-downloadstat]
> comment = For samba share of volume downloadstat
> vfs objects = glusterfs
> glusterfs:volume = downloadstat
> glusterfs:logfile = /var/log/samba/glusterfs-downloadstat.%M.log
> glusterfs:loglevel = 7
> path = /
> read only = no
> guest ok = yes
> 
> I restarted smb service, without errors.
> When I try to access from Win7 client to 
> "\\gluster01-mi\gluster-downloadstat"
>  it asks me a login (which user do I need to put?) and then gives me error 
> "The network path was not found"
> and on Gluster smb.log I see:
> [2015/02/09 17:21:13.111639,  0] smbd/vfs.c:173(vfs_init_custom)
>   error probing vfs module 'glusterfs': NT_STATUS_UNSUCCESSFUL
> [2015/02/09 17:21:13.111709,  0] smbd/vfs.c:315(smbd_vfs_init)
>   smbd_vfs_init: vfs_init_custom failed for glusterfs
> [2015/02/09 17:21:13.111741,  0] smbd/service.c:902(make_connection_snum)
>   vfs_init failed for service gluster-downloadstat
> 
> Can you explain how to fix?
> Thanks
> 
> Alessandro
> 
> From: 
> gluster-users-boun...@gluster.org 
> [mailto:gluster-users-boun...@gluster.org] On Behalf Of David F. Robinson
> Sent: domenica 8 febbraio 2015 18:19
> To: Gluster Devel; gluster-users@gluster.org
> Subject: [Gluster-users] cannot delete non-empty directory
> 
> I am seeing these messages after I delete large amounts of data using 
> gluster 3.6.2.
> cannot delete non-empty directory: 
> old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final
> 
> From the FUSE mount (as root), the directory shows up as empty:
> 
> # pwd
> /backup/homegfs/backup.0/old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final
> 
> # ls -al
> total 5
> d- 2 root root4106 Feb  6 13:55 .
> drwxrws--- 3  601 dmiller   72 Feb  6 13:55 ..
> 
> However, when you look at the bricks, the files are still there (none on 
> brick01bkp, all files are on brick02bkp).  All of the files are 0-length and 
> have --T permissions.
> Any suggestions on how to fix this and how to prevent it from happening?
> 
> #  ls -al 
> /data/brick*/homegfs_bkp/backup.0/old_shelf4/Aegis/\!\!\!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final
> /data/brick01bkp/homegfs_bkp/backup.0/old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final:
> total 4
> d-+ 2 root root  10 Feb  6 13:55 .
> drwxrws---+ 3  601 raven 36 Feb  6 13:55 ..
> 
> /data/brick02bkp/homegfs_bkp/backup.0/old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final:
> total 8
> d-+ 3 root root  4096 Dec 31  1969 .
> drwxrws---+ 3  601 raven   36 Feb  6 13:55 ..
> -T  5  601 raven0 Nov 20 00:08 read_inset.f.gz
> -T  5  601 raven0 Nov 20 00:08 readbc.f.gz
> -T  5  601 raven0 Nov 20 00:08 readcn.f.gz
> -T  5  601 raven0 Nov 20 00:08 readinp.f.gz
> -T  5  601 raven0 Nov 20 00:08 readinp_v1_2.f.gz
> -T  5  601 raven0 Nov 20 00:08 readinp_v1_3.f.gz
> -T  5  601 raven0 Nov 20 00:08 rotatept.f.gz
> d-+ 2 root root   118 Feb  6 13:54 save1
> -T  5  601 raven0 Nov 20 00:08 sepvec.f.gz
> -T  5  601 raven0 Nov 20 00:08 shadow.f.gz
> -T  5  601 raven0 Nov 20 00:08 snksrc.f.gz
> -T  5  601 raven

[Gluster-users] GlusterD uses 50% of RAM

2015-02-20 Thread RASTELLI Alessandro
Hi,
I've noticed that one of our 6 gluster 3.6.2 nodes has the "glusterd" process 
using 50% of RAM, while on the other nodes usage is about 5%.
Could this be a bug?
Should I restart the glusterd daemon?
Thank you
A

From: Volnei Puttini [mailto:vol...@vcplinux.com.br]
Sent: lunedì 9 febbraio 2015 18:06
To: RASTELLI Alessandro; gluster-users@gluster.org
Subject: Re: [Gluster-users] cannot access to CIFS export

Hi Alessandro,

My system:

CentOS 7

samba-vfs-glusterfs-4.1.1-37.el7_0.x86_64
samba-winbind-4.1.1-37.el7_0.x86_64
samba-libs-4.1.1-37.el7_0.x86_64
samba-common-4.1.1-37.el7_0.x86_64
samba-winbind-modules-4.1.1-37.el7_0.x86_64
samba-winbind-clients-4.1.1-37.el7_0.x86_64
samba-4.1.1-37.el7_0.x86_64
samba-client-4.1.1-37.el7_0.x86_64

glusterfs 3.6.2 built on Jan 22 2015 12:59:57

Try this, it works fine for me:

[GFSVOL]
browseable = No
comment = Gluster share of volume gfsvol
path = /
read only = No
guest ok = Yes
kernel share modes = No
posix locking = No
vfs objects = glusterfs
glusterfs:loglevel = 7
glusterfs:logfile = /var/log/samba/glusterfs-gfstest.log
glusterfs:volume = vgtest
glusterfs:volfile_server = 192.168.2.21
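
A quick way to sanity-check such a share locally before trying from Windows
(a sketch; the share name matches the example above and guest access is
assumed):

smbclient -L localhost -N                 # list the shares smbd is exporting
smbclient //localhost/GFSVOL -N -c 'ls'   # guest ok = Yes, so no password should be needed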

On 09-02-2015 14:45, RASTELLI Alessandro wrote:
Hi,
I've created and started a new replica volume "downloadstat" with CIFS export 
enabled on GlusterFS 3.6.2.
I can see the following piece has been added automatically to smb.conf:
[gluster-downloadstat]
comment = For samba share of volume downloadstat
vfs objects = glusterfs
glusterfs:volume = downloadstat
glusterfs:logfile = /var/log/samba/glusterfs-downloadstat.%M.log
glusterfs:loglevel = 7
path = /
read only = no
guest ok = yes

I restarted smb service, without errors.
When I try to access "\\gluster01-mi\gluster-downloadstat" from a Win7 client, 
it asks me for a login (which user do I need to use?) and then gives me the 
error "The network path was not found",
and on Gluster smb.log I see:
[2015/02/09 17:21:13.111639,  0] smbd/vfs.c:173(vfs_init_custom)
  error probing vfs module 'glusterfs': NT_STATUS_UNSUCCESSFUL
[2015/02/09 17:21:13.111709,  0] smbd/vfs.c:315(smbd_vfs_init)
  smbd_vfs_init: vfs_init_custom failed for glusterfs
[2015/02/09 17:21:13.111741,  0] smbd/service.c:902(make_connection_snum)
  vfs_init failed for service gluster-downloadstat

Can you explain how to fix?
Thanks

Alessandro

From: 
gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of David F. Robinson
Sent: domenica 8 febbraio 2015 18:19
To: Gluster Devel; gluster-users@gluster.org
Subject: [Gluster-users] cannot delete non-empty directory

I am seeing these messages after I delete large amounts of data using gluster 
3.6.2.
cannot delete non-empty directory: 
old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final

From the FUSE mount (as root), the directory shows up as empty:

# pwd
/backup/homegfs/backup.0/old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final

# ls -al
total 5
d- 2 root root4106 Feb  6 13:55 .
drwxrws--- 3  601 dmiller   72 Feb  6 13:55 ..

However, when you look at the bricks, the files are still there (none on 
brick01bkp, all files are on brick02bkp).  All of the files are 0-length and 
have --T permissions.
Any suggestions on how to fix this and how to prevent it from happening?

#  ls -al 
/data/brick*/homegfs_bkp/backup.0/old_shelf4/Aegis/\!\!\!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final
/data/brick01bkp/homegfs_bkp/backup.0/old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final:
total 4
d-+ 2 root root  10 Feb  6 13:55 .
drwxrws---+ 3  601 raven 36 Feb  6 13:55 ..

/data/brick02bkp/homegfs_bkp/backup.0/old_shelf4/Aegis/!!!Programs/RavenCFD/Storage/Jimmy_Old/src_vj1.5_final:
total 8
d-+ 3 root root  4096 Dec 31  1969 .
drwxrws---+ 3  601 raven   36 Feb  6 13:55 ..
-T  5  601 raven0 Nov 20 00:08 read_inset.f.gz
-T  5  601 raven0 Nov 20 00:08 readbc.f.gz
-T  5  601 raven0 Nov 20 00:08 readcn.f.gz
-T  5  601 raven0 Nov 20 00:08 readinp.f.gz
-T  5  601 raven0 Nov 20 00:08 readinp_v1_2.f.gz
-T  5  601 raven0 Nov 20 00:08 readinp_v1_3.f.gz
-T  5  601 raven0 Nov 20 00:08 rotatept.f.gz
d-+ 2 root root   118 Feb  6 13:54 save1
-T  5  601 raven0 Nov 20 00:08 sepvec.f.gz
-T  5  601 raven0 Nov 20 00:08 shadow.f.gz
-T  5  601 raven0 Nov 20 00:08 snksrc.f.gz
-T  5  601 raven0 Nov 20 00:08 source.f.gz
-T  5  601 raven0 Nov 20 00:08 step.f.gz
-T  5  601 raven0 Nov 20 00:08 stoprog.f.gz
-T  5  601 raven0 Nov 20 00:08 summer6.f.gz
-T  5  601 raven0 Nov 20 00:08 totforc.f.gz
-T  5  601 raven0 Nov 20 00:08 tritet.f.gz
-T  5  601 raven0 Nov 20 00:08 wallrsd.f.gz
-T  5  601 raven0 Nov 20 00:08 wheat.f.

[Gluster-users] Changelogs on gluster 3.6

2015-02-20 Thread Félix de Lelelis
Hi,

Is there any way to get information about the last changelog that was
applied on the slave and master nodes in geo-replication?

Thanks
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Split-brain info gluster 3.6

2015-02-20 Thread Ravishankar N
This is fixed in http://review.gluster.org/9459 and should be available 
in 3.7.
As a workaround, you can restart the self-heal daemon process (gluster v 
start <volname> force). This should clear its history.
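
In full, the workaround looks like this (a sketch; <volname> is a placeholder
for your volume):

gluster volume start <volname> force            # restarts the self-heal daemon, volume stays online
gluster volume heal <volname> info split-brain  # the already-resolved entries should no longer appear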


Thanks,
Ravi

On 02/20/2015 01:43 PM, Félix de Lelelis wrote:

Hi,

I generated a split-brain condition and resolved it, but with "gluster 
volume heal vol_name info split-brain" I can still see the entries that were 
already resolved:


Number of entries: 3
at                    path on brick
---
2015-02-19 17:13:08 /split
2015-02-19 17:14:09 /split
2015-02-19 17:15:10 /split

Brick srv-vln-des2-priv1:/gfs-to-snap/prueba/brick1/brick
Number of entries: 4
at                    path on brick
---
2015-02-19 17:09:32 /split
2015-02-19 17:13:08 /split
2015-02-19 17:14:09 /split
2015-02-19 17:15:10 /split

How can I reset those entries?

Thanks


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Split-brain info gluster 3.6

2015-02-20 Thread Félix de Lelelis
Hi,

I generated a split-brain condition and resolved it, but with "gluster
volume heal vol_name info split-brain" I can still see the entries that were
already resolved:

Number of entries: 3
at                    path on brick
---
2015-02-19 17:13:08 /split
2015-02-19 17:14:09 /split
2015-02-19 17:15:10 /split

Brick srv-vln-des2-priv1:/gfs-to-snap/prueba/brick1/brick
Number of entries: 4
at                    path on brick
---
2015-02-19 17:09:32 /split
2015-02-19 17:13:08 /split
2015-02-19 17:14:09 /split
2015-02-19 17:15:10 /split

How can I reset those entries?

Thanks
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users