Re: [Gluster-users] AttributeError: python: undefined symbol: gf_changelog_register

2015-05-25 Thread Kotresh Hiremath Ravishankar
Hi Marco,

'gf_changelog_register' is an API exposed by the shared library 'libgfchangelog.so'.
Please check whether 'libgfchangelog.so' is visible to the runtime linker by using the
following command.

#ldconfig -p | grep libgfchangelog

If it is not found, locate where libgfchangelog.so is installed and run ldconfig
on that directory.

e.g., If found at /usr/local/lib/libgfchangelog.so,

#ldconfig /usr/local/lib

After this, confirm that the library is cached by using the first command above and
try restarting geo-replication.

Let us know if the library is cached and you still face this issue.
Hope this helps!
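
Side note: the "python: undefined symbol" wording is the ctypes symptom of exactly this.
When the library name cannot be resolved, ctypes ends up holding a handle to the python
binary itself, and the symbol lookup then fails against that binary. A minimal sketch of
that behaviour (an illustration only, not the exact gsyncd code; the find_library call is
an assumption):

    from ctypes import CDLL
    from ctypes.util import find_library

    name = find_library("gfchangelog")   # None when ldconfig does not know the library
    lib = CDLL(name)                     # CDLL(None) dlopens the running python binary instead
    print(lib.gf_changelog_register)     # AttributeError: python: undefined symbol: gf_changelog_register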

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Marco" 
> To: Gluster-users@gluster.org
> Sent: Tuesday, May 26, 2015 4:25:26 AM
> Subject: [Gluster-users] AttributeError: python: undefined symbol:
> gf_changelog_register
> 
> Hello all.
> 
> I have an issue when I'm trying to populate a geo-replication volume:
> 
> [2015-05-25 23:59:26.666712] I [monitor(monitor):129:monitor] Monitor:
> 
> [2015-05-25 23:59:26.667079] I [monitor(monitor):130:monitor] Monitor:
> starting gsyncd worker
> [2015-05-25 23:59:26.762124] I [gsyncd(/gluster):532:main_i] :
> syncing: gluster://localhost:volume1 ->
> ssh://r...@gluster3.marcobaldo.ch:gluster://localhost:volume1_slave
> [2015-05-25 23:59:29.611541] I [master(/gluster):58:gmaster_builder]
> : setting up xsync change detection mode
> [2015-05-25 23:59:29.612349] I [master(/gluster):357:__init__] _GMaster:
> using 'rsync' as the sync engine
> [2015-05-25 23:59:29.613812] I [master(/gluster):58:gmaster_builder]
> : setting up changelog change detection mode
> [2015-05-25 23:59:29.614294] I [master(/gluster):357:__init__] _GMaster:
> using 'rsync' as the sync engine
> [2015-05-25 23:59:29.616271] I [master(/gluster):1103:register]
> _GMaster: xsync temp directory:
> /var/run/gluster/volume1/ssh%3A%2F%2Froot%40192.168.178.233%3Agluster%3A%2F%2F127.0.0.1%3Avolume1_slave/1077eb0027f1f616115bcb74a330d1c2/xsync
> [2015-05-25 23:59:29.648611] E
> [syncdutils(/gluster):240:log_raise_exception] : FAIL:
> Traceback (most recent call last):
>   File "/usr/lib/glusterfs/python/syncdaemon/gsyncd.py", line 150, in main
> main_i()
>   File "/usr/lib/glusterfs/python/syncdaemon/gsyncd.py", line 542, in main_i
> local.service_loop(*[r for r in [remote] if r])
>   File "/usr/lib/glusterfs/python/syncdaemon/resource.py", line 1175, in
> service_loop
> g2.register()
>   File "/usr/lib/glusterfs/python/syncdaemon/master.py", line 1077, in
> register
> workdir, logfile, 9, 5)
>   File "/usr/lib/glusterfs/python/syncdaemon/resource.py", line 614, in
> changelog_register
> Changes.cl_register(cl_brick, cl_dir, cl_log, cl_level, retries)
>  File "/usr/lib/glusterfs/python/syncdaemon/libgfchangelog.py", line 23,
> in cl_register
> ret = cls._get_api('gf_changelog_register')(brick, path,
>   File "/usr/lib/glusterfs/python/syncdaemon/libgfchangelog.py", line
> 19, in _get_api
> return getattr(cls.libgfc, call)
>   File "/usr/lib64/python2.7/ctypes/__init__.py", line 378, in __getattr__
> func = self.__getitem__(name)
>   File "/usr/lib64/python2.7/ctypes/__init__.py", line 383, in __getitem__
> func = self._FuncPtr((name_or_ordinal, self))
> AttributeError: python: undefined symbol: gf_changelog_register
> [2015-05-25 23:59:29.650513] I [syncdutils(/gluster):192:finalize]
> : exiting.
> [2015-05-25 23:59:30.613435] I [monitor(monitor):157:monitor] Monitor:
> worker(/gluster) died in startup phase
> 
> 
> COMMANDS
> 
> # gluster volume geo-replication volume1
> gluster3.marcobaldo.ch::volume1_slave start
> Starting geo-replication session between volume1 &
> gluster3.marcobaldo.ch::volume1_slave has been successful
> 
> # gluster volume geo-replication volume1
> gluster3.marcobaldo.ch::volume1_slave status
>  
> MASTER NODEMASTER VOLMASTER BRICK
> SLAVESTATUS CHECKPOINT
> STATUSCRAWL STATUS
> ---
> fs2volume1   /gluster
> gluster3.marcobaldo.ch::volume1_slaveInitializing...
> N/A  N/A
> fs1volume1   /gluster
> gluster3.marcobaldo.ch::volume1_slaveInitializing...
> N/A  N/A
> 
> and after a few seconds
> 
> # gluster volume geo-replication volume1
> gluster3.marcobaldo.ch::volume1_slave status
>  
> MASTER NODEMASTER VOLMASTER BRICK
> SLAVESTATUSCHECKPOINT STATUS
> CRAWL STATUS
> --
> fs2volume1   /gluster
> gluster3.marcobaldo.ch::volume1_slavefaultyN/A
> N/A
> fs1volume1   /gluster
> gluster

Re: [Gluster-users] [Gluster-devel] Gluster 3.7.0 released

2015-05-25 Thread Atin Mukherjee


On 05/26/2015 03:12 AM, Ted Miller wrote:
> 
> From: Niels de Vos 
> Sent: Monday, May 25, 2015 4:44 PM
> 
> On Mon, May 25, 2015 at 06:49:26PM +, Ted Miller wrote:
>>
>> 
>> From: Humble Devassy Chirammal 
>> Sent: Monday, May 18, 2015 9:37 AM
>> Hi All,
>>
>> GlusterFS 3.7.0 RPMs for RHEL, CentOS, Fedora and packages for Debian are 
>> available at download.gluster.org [1].
>>
>> [1] http://download.gluster.org/pub/gluster/glusterfs/3.7/3.7.0/
>>
>> --Humble
>>
>>
>> On Thu, May 14, 2015 at 2:49 PM, Vijay Bellur 
>> mailto:vbel...@redhat.com>> wrote:
>>
>> Hi All,
>>
>> I am happy to announce that Gluster 3.7.0 is now generally available. 3.7.0 
>> contains several
>>
>> [snip]
>>
>> Cheers,
>> Vijay
>>
>> [snip]
>>
>> What happened to packages for RHEL/Centos 5?  I have the (probably
>> unusual--added gluster to existing servers) setup of running a replica
>> 3 cluster where two nodes run on Centos 6 and one is still on Centos
>> 5.  This is a personal setup, and I have been using
>> http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-5/x86_64/repodata/repomod.xml
>> as my repo.  It has worked fine for a while, but this time the two
>> Centos 6 nodes updated to 3.7, but the Centos 5 node got left behind
>> at 3.6.3.
> 
> Packages for RHEL/CentOS-5 are not available yet. These will follow
> later. There are some changes needed to be able to build the packages on
> EL5. Because we are currently stabilizing our CI/regression tests, we do
> not merge any other changes. Until we provide packages in our
> repository, you could apply patch http://review.gluster.org/10803
> yourself and build the EL5 version. I expect that we will do a release
> in 2-3 weeks which will have EL5 RPMs too.
> 
> I have no idea about the problem below, it sounds like something the
> GlusterD developers could help with.
> 
> Niels
> 
>> Command 'gluster volume status' on the C5 machine makes everything
>> look fine:
>>
>> Status of volume: ISO2
>> Gluster process   PortOnline  Pid
>> --
>> Brick 10.x.x.2:/bricks/01/iso249162   Y   4679
>> Brick 10.x.x.4:/bricks/01/iso249183   Y   6447
>> Brick 10.x.x.9:/bricks/01/iso249169   Y   1985
>>
>> But the same command on either of the C6 machines shows the C5 machine
>> (10.x.x.2) missing in action (though it does recognize that there are
>> NFS and heal daemons there):
>>
>> Status of volume: ISO2
>> Gluster process TCP Port  RDMA Port  Online  Pid
>> --
>> Brick 10.41.65.4:/bricks/01/iso249183 0  Y   6447
>> Brick 10.41.65.9:/bricks/01/iso249169 0  Y   1985
>> NFS Server on localhost 2049  0  Y   2279
>> Self-heal Daemon on localhost   N/A   N/AY   2754
>> NFS Server on 10.41.65.22049  0  Y   4757
>> Self-heal Daemon on 10.41.65.2  N/A   N/AY   4764
>> NFS Server on 10.41.65.42049  0  Y   6543
>> Self-heal Daemon on 10.41.65.4  N/A   N/AY   6551
>>
>> So, is this just an oversight (I hope), or has support for C5 been dropped?
>> If support for C5 is gone, how do I downgrade my Centos6 machines back
>> to 3.6.x? (I know how to change the repo, but the actual sequence of
>> yum commands and gluster commands is unknown to me).
Could you attach the glusterd log files of the 10.x.x.2 machine and of the node
from which you triggered volume status? Could you also share the gluster
volume info output of all the nodes?
>>
>> Ted Miller
>> Elkhart, IN, USA
> 
> 
> Thanks for the information.  As long as I know it is coming, I can improvise 
> and hang on.
> 
> I am assuming that the problem with the .2 machine not being seen is a result 
> of running a cluster with a version split.
> 
> Ted Miller
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
> 

-- 
~Atin
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] AttributeError: python: undefined symbol: gf_changelog_register

2015-05-25 Thread Marco
Hello all.

I have an issue when I'm trying to populate a geo-replication volume:

[2015-05-25 23:59:26.666712] I [monitor(monitor):129:monitor] Monitor:

[2015-05-25 23:59:26.667079] I [monitor(monitor):130:monitor] Monitor:
starting gsyncd worker
[2015-05-25 23:59:26.762124] I [gsyncd(/gluster):532:main_i] :
syncing: gluster://localhost:volume1 ->
ssh://r...@gluster3.marcobaldo.ch:gluster://localhost:volume1_slave
[2015-05-25 23:59:29.611541] I [master(/gluster):58:gmaster_builder]
: setting up xsync change detection mode
[2015-05-25 23:59:29.612349] I [master(/gluster):357:__init__] _GMaster:
using 'rsync' as the sync engine
[2015-05-25 23:59:29.613812] I [master(/gluster):58:gmaster_builder]
: setting up changelog change detection mode
[2015-05-25 23:59:29.614294] I [master(/gluster):357:__init__] _GMaster:
using 'rsync' as the sync engine
[2015-05-25 23:59:29.616271] I [master(/gluster):1103:register]
_GMaster: xsync temp directory:
/var/run/gluster/volume1/ssh%3A%2F%2Froot%40192.168.178.233%3Agluster%3A%2F%2F127.0.0.1%3Avolume1_slave/1077eb0027f1f616115bcb74a330d1c2/xsync
[2015-05-25 23:59:29.648611] E
[syncdutils(/gluster):240:log_raise_exception] : FAIL:
Traceback (most recent call last):
  File "/usr/lib/glusterfs/python/syncdaemon/gsyncd.py", line 150, in main
main_i()
  File "/usr/lib/glusterfs/python/syncdaemon/gsyncd.py", line 542, in main_i
local.service_loop(*[r for r in [remote] if r])
  File "/usr/lib/glusterfs/python/syncdaemon/resource.py", line 1175, in
service_loop
g2.register()
  File "/usr/lib/glusterfs/python/syncdaemon/master.py", line 1077, in
register
workdir, logfile, 9, 5)
  File "/usr/lib/glusterfs/python/syncdaemon/resource.py", line 614, in
changelog_register
Changes.cl_register(cl_brick, cl_dir, cl_log, cl_level, retries)
 File "/usr/lib/glusterfs/python/syncdaemon/libgfchangelog.py", line 23,
in cl_register
ret = cls._get_api('gf_changelog_register')(brick, path,
  File "/usr/lib/glusterfs/python/syncdaemon/libgfchangelog.py", line
19, in _get_api
return getattr(cls.libgfc, call)
  File "/usr/lib64/python2.7/ctypes/__init__.py", line 378, in __getattr__
func = self.__getitem__(name)
  File "/usr/lib64/python2.7/ctypes/__init__.py", line 383, in __getitem__
func = self._FuncPtr((name_or_ordinal, self))
AttributeError: python: undefined symbol: gf_changelog_register
[2015-05-25 23:59:29.650513] I [syncdutils(/gluster):192:finalize]
: exiting.
[2015-05-25 23:59:30.613435] I [monitor(monitor):157:monitor] Monitor:
worker(/gluster) died in startup phase


COMMANDS

# gluster volume geo-replication volume1
gluster3.marcobaldo.ch::volume1_slave start
Starting geo-replication session between volume1 &
gluster3.marcobaldo.ch::volume1_slave has been successful

# gluster volume geo-replication volume1
gluster3.marcobaldo.ch::volume1_slave status
 
MASTER NODEMASTER VOLMASTER BRICK   
SLAVESTATUS CHECKPOINT
STATUSCRAWL STATUS   
---
fs2volume1   /gluster   
gluster3.marcobaldo.ch::volume1_slaveInitializing...   
N/A  N/A
fs1volume1   /gluster   
gluster3.marcobaldo.ch::volume1_slaveInitializing...   
N/A  N/A

and after a few seconds

# gluster volume geo-replication volume1
gluster3.marcobaldo.ch::volume1_slave status
 
MASTER NODEMASTER VOLMASTER BRICK   
SLAVESTATUSCHECKPOINT STATUS   
CRAWL STATUS   
--
fs2volume1   /gluster   
gluster3.marcobaldo.ch::volume1_slavefaultyN/A 
N/A
fs1volume1   /gluster   
gluster3.marcobaldo.ch::volume1_slavefaultyN/A  N/A


VOLUMES
**

Volume Name: volume1
Type: Replicate
Volume ID: 0952d1ce-f62c-40b6-809a-4e193db0f1f9
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: gluster1.marcobaldo.ch:/gluster
Brick2: gluster2.marcobaldo.ch:/gluster
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
nfs.disable: off
 
Volume Name: volume1_slave
Type: Distribute
Volume ID: b0b161d8-a642-4d41-808e-2bb076989f78
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: gluster3.marcobaldo.ch:/gluster_slave


VERSION
*

# glusterd -V
glusterfs 3.5.2 built on *bleep*
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2013 Red Hat, Inc. 
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of

Re: [Gluster-users] [Gluster-devel] Gluster 3.7.0 released

2015-05-25 Thread Ted Miller

From: Niels de Vos 
Sent: Monday, May 25, 2015 4:44 PM

On Mon, May 25, 2015 at 06:49:26PM +, Ted Miller wrote:
>
> 
> From: Humble Devassy Chirammal 
> Sent: Monday, May 18, 2015 9:37 AM
> Hi All,
>
> GlusterFS 3.7.0 RPMs for RHEL, CentOS, Fedora and packages for Debian are 
> available at download.gluster.org [1].
>
> [1] http://download.gluster.org/pub/gluster/glusterfs/3.7/3.7.0/
>
> --Humble
>
>
> On Thu, May 14, 2015 at 2:49 PM, Vijay Bellur 
> mailto:vbel...@redhat.com>> wrote:
>
> Hi All,
>
> I am happy to announce that Gluster 3.7.0 is now generally available. 3.7.0 
> contains several
>
> [snip]
>
> Cheers,
> Vijay
>
> [snip]
>
> What happened to packages for RHEL/Centos 5?  I have the (probably
> unusual--added gluster to existing servers) setup of running a replica
> 3 cluster where two nodes run on Centos 6 and one is still on Centos
> 5.  This is a personal setup, and I have been using
> http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-5/x86_64/repodata/repomod.xml
> as my repo.  It has worked fine for a while, but this time the two
> Centos 6 nodes updated to 3.7, but the Centos 5 node got left behind
> at 3.6.3.

Packages for RHEL/CentOS-5 are not available yet. These will follow
later. There are some changes needed to be able to build the packages on
EL5. Because we are currently stabilizing our CI/regression tests, we do
not merge any other changes. Until we provide packages in our
repository, you could apply patch http://review.gluster.org/10803
yourself and build the EL5 version. I expect that we will do a release
in 2-3 weeks which will have EL5 RPMs too.

I have no idea about the problem below, it sounds like something the
GlusterD developers could help with.

Niels

> Command 'gluster volume status' on the C5 machine makes everything
> look fine:
>
> Status of volume: ISO2
> Gluster process   PortOnline  Pid
> --
> Brick 10.x.x.2:/bricks/01/iso249162   Y   4679
> Brick 10.x.x.4:/bricks/01/iso249183   Y   6447
> Brick 10.x.x.9:/bricks/01/iso249169   Y   1985
>
> But the same command on either of the C6 machines shows the C5 machine
> (10.x.x.2) missing in action (though it does recognize that there are
> NFS and heal daemons there):
>
> Status of volume: ISO2
> Gluster process TCP Port  RDMA Port  Online  Pid
> --
> Brick 10.41.65.4:/bricks/01/iso249183 0  Y   6447
> Brick 10.41.65.9:/bricks/01/iso249169 0  Y   1985
> NFS Server on localhost 2049  0  Y   2279
> Self-heal Daemon on localhost   N/A   N/AY   2754
> NFS Server on 10.41.65.22049  0  Y   4757
> Self-heal Daemon on 10.41.65.2  N/A   N/AY   4764
> NFS Server on 10.41.65.42049  0  Y   6543
> Self-heal Daemon on 10.41.65.4  N/A   N/AY   6551
>
> So, is this just an oversight (I hope), or has support for C5 been dropped?
> If support for C5 is gone, how do I downgrade my Centos6 machines back
> to 3.6.x? (I know how to change the repo, but the actual sequence of
> yum commands and gluster commands is unknown to me).
>
> Ted Miller
> Elkhart, IN, USA


Thanks for the information.  As long as I know it is coming, I can improvise 
and hang on.

I am assuming that the problem with the .2 machine not being seen is a result 
of running a cluster with a version split.

Ted Miller
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] Gluster 3.7.0 released

2015-05-25 Thread Niels de Vos
On Mon, May 25, 2015 at 06:49:26PM +, Ted Miller wrote:
> 
> 
> From: Humble Devassy Chirammal 
> Sent: Monday, May 18, 2015 9:37 AM
> Hi All,
> 
> GlusterFS 3.7.0 RPMs for RHEL, CentOS, Fedora and packages for Debian are 
> available at download.gluster.org [1].
> 
> [1] http://download.gluster.org/pub/gluster/glusterfs/3.7/3.7.0/
> 
> --Humble
> 
> 
> On Thu, May 14, 2015 at 2:49 PM, Vijay Bellur 
> mailto:vbel...@redhat.com>> wrote:
> 
> Hi All,
> 
> I am happy to announce that Gluster 3.7.0 is now generally available. 3.7.0 
> contains several
> 
> [snip]
> 
> Cheers,
> Vijay
> 
> [snip]
> 
> What happened to packages for RHEL/Centos 5?  I have the (probably
> unusual--added gluster to existing servers) setup of running a replica
> 3 cluster where two nodes run on Centos 6 and one is still on Centos
> 5.  This is a personal setup, and I have been using
> http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-5/x86_64/repodata/repomod.xml
> as my repo.  It has worked fine for a while, but this time the two
> Centos 6 nodes updated to 3.7, but the Centos 5 node got left behind
> at 3.6.3.

Packages for RHEL/CentOS-5 are not available yet. These will follow
later. There are some changes needed to be able to build the packages on
EL5. Because we are currently stabilizing our CI/regression tests, we do
not merge any other changes. Until we provide packages in our
repository, you could apply patch http://review.gluster.org/10803
yourself and build the EL5 version. I expect that we will do a release
in 2-3 weeks which will have EL5 RPMs too.

I have no idea about the problem below, it sounds like something the
GlusterD developers could help with.

Niels

> Command 'gluster volume status' on the C5 machine makes everything
> look fine:
> 
> Status of volume: ISO2
> Gluster process   PortOnline  Pid
> --
> Brick 10.x.x.2:/bricks/01/iso249162   Y   4679
> Brick 10.x.x.4:/bricks/01/iso249183   Y   6447
> Brick 10.x.x.9:/bricks/01/iso249169   Y   1985
> 
> But the same command on either of the C6 machines shows the C5 machine
> (10.x.x.2) missing in action (though it does recognize that there are
> NFS and heal daemons there):
> 
> Status of volume: ISO2
> Gluster process TCP Port  RDMA Port  Online  Pid
> --
> Brick 10.41.65.4:/bricks/01/iso249183 0  Y   6447
> Brick 10.41.65.9:/bricks/01/iso249169 0  Y   1985
> NFS Server on localhost 2049  0  Y   2279
> Self-heal Daemon on localhost   N/A   N/AY   2754
> NFS Server on 10.41.65.22049  0  Y   4757
> Self-heal Daemon on 10.41.65.2  N/A   N/AY   4764
> NFS Server on 10.41.65.42049  0  Y   6543
> Self-heal Daemon on 10.41.65.4  N/A   N/AY   6551
> 
> So, is this just an oversight (I hope), or has support for C5 been dropped?
> If support for C5 is gone, how do I downgrade my Centos6 machines back
> to 3.6.x? (I know how to change the repo, but the actual sequence of
> yum commands and gluster commands is unknown to me).
> 
> Ted Miller
> Elkhart, IN, USA

> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] Gluster 3.7.0 released

2015-05-25 Thread Ted Miller


From: Humble Devassy Chirammal 
Sent: Monday, May 18, 2015 9:37 AM
Hi All,

GlusterFS 3.7.0 RPMs for RHEL, CentOS, Fedora and packages for Debian are 
available at download.gluster.org [1].

[1] http://download.gluster.org/pub/gluster/glusterfs/3.7/3.7.0/

--Humble


On Thu, May 14, 2015 at 2:49 PM, Vijay Bellur 
mailto:vbel...@redhat.com>> wrote:

Hi All,

I am happy to announce that Gluster 3.7.0 is now generally available. 3.7.0 
contains several

[snip]

Cheers,
Vijay

[snip]

What happened to packages for RHEL/Centos 5?  I have the (probably 
unusual--added gluster to existing servers) setup of running a replica 3 
cluster where two nodes run on Centos 6 and one is still on Centos 5.  This is 
a personal setup, and I have been using 
http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-5/x86_64/repodata/repomod.xml
 as my repo.  It has worked fine for a while, but this time the two Centos 6 
nodes updated to 3.7, but the Centos 5 node got left behind at 3.6.3.  Command 
'gluster volume status' on the C5 machine makes everything look fine:

Status of volume: ISO2
Gluster process   PortOnline  Pid
--
Brick 10.x.x.2:/bricks/01/iso249162   Y   4679
Brick 10.x.x.4:/bricks/01/iso249183   Y   6447
Brick 10.x.x.9:/bricks/01/iso249169   Y   1985

But the same command on either of the C6 machines shows the C5 machine 
(10.x.x.2) missing in action (though it does recognize that there are NFS and 
heal daemons there):

Status of volume: ISO2
Gluster process TCP Port  RDMA Port  Online  Pid
--
Brick 10.41.65.4:/bricks/01/iso249183 0  Y   6447
Brick 10.41.65.9:/bricks/01/iso249169 0  Y   1985
NFS Server on localhost 2049  0  Y   2279
Self-heal Daemon on localhost   N/A   N/AY   2754
NFS Server on 10.41.65.22049  0  Y   4757
Self-heal Daemon on 10.41.65.2  N/A   N/AY   4764
NFS Server on 10.41.65.42049  0  Y   6543
Self-heal Daemon on 10.41.65.4  N/A   N/AY   6551

So, is this just an oversight (I hope), or has support for C5 been dropped?
If support for C5 is gone, how do I downgrade my Centos6 machines back to 
3.6.x? (I know how to change the repo, but the actual sequence of yum commands 
and gluster commands is unknown to me).

Ted Miller
Elkhart, IN, USA
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Geo-Replication - Changelog socket is not present - Falling back to xsync

2015-05-25 Thread PEPONNET, Cyril N (Cyril)
Thanks for the clarification.

Regarding our setup, the xsync crawl is done, but I had to force the change detector
to changelog (it did not switch automatically, even after a geo-rep stop / restart).

Now changelog is enabled; let's see how it behaves :)

--
Cyril Peponnet

On May 25, 2015, at 12:43 AM, Kotresh Hiremath Ravishankar 
mailto:khire...@redhat.com>> wrote:

Hi Cyril,

Answers inline

Thanks and Regards,
Kotresh H R

- Original Message -
From: "Cyril N PEPONNET (Cyril)" 
mailto:cyril.pepon...@alcatel-lucent.com>>
To: "Kotresh Hiremath Ravishankar" 
mailto:khire...@redhat.com>>
Cc: "gluster-users" 
mailto:gluster-users@gluster.org>>
Sent: Friday, May 22, 2015 9:34:47 PM
Subject: Re: [Gluster-users] Geo-Replication - Changelog socket is not present 
- Falling back to xsync

One last question, correct me if I’m wrong.

When you start a geo-rep process it starts with xsync, aka hybrid crawling
(sending files every 60 s, with the file window set to 8192 files per batch).

When the crawl is done it should switch to the changelog detector and
propagate changes to the slave dynamically.

1/ During the hybrid crawl, if we delete files from the master (and they were
already transferred to the slave), the xsync process will not delete them from
the slave (and we can’t change this, as the option is hardcoded).
When it switches to changelog, will it remove the folders and
files on the slave that no longer exist on the master?


  You are right: xsync does not sync deletes of files that have already been synced.
  After xsync, when it switches to changelog, it does not delete the entries on the
  slave that no longer exist on the master. Changelog can only replicate deletes that
  happen from the time it switched to changelog.

2/ With changelog, if I add a 10 GB file and then a 1 KB file, will the
changelog process queue them (waiting for the 10 GB file to be sent), or are
the transfers done in parallel threads?
(e.g., if I add a 10 GB file and delete it after 1 min, what will happen?)

   Changelog records the operations that happened on the master, and geo-replication
   replays them onto the slave volume. Geo-replication syncs files in two phases.

   1. Phase-1: Create entries through RPC (0-byte files on the slave, keeping the gfid
      intact as on the master)
   2. Phase-2: Sync data through rsync/tar_over_ssh (multi-threaded)

   Now, keeping that in mind, Phase-1 happens serially and Phase-2 happens in parallel.
   The zero-byte files for the 10 GB and 1 KB files get created on the slave serially,
   and the data for them syncs in parallel. Another thing to remember: geo-rep makes
   sure that syncing data to a file is attempted only after the zero-byte file for it
   has already been created.
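
   (To make the ordering concrete, here is a small illustrative sketch of that two-phase
   pattern - only a model of the behaviour described above, not the actual gsyncd code;
   the function names and worker count are made up:)

    import os
    from concurrent.futures import ThreadPoolExecutor

    def create_entries(changes, slave_dir):
        # Phase-1: create 0-byte placeholder entries, strictly in order (serial)
        for name, _size in changes:
            open(os.path.join(slave_dir, name), "a").close()

    def sync_data(changes, master_dir, slave_dir, workers=4):
        # Phase-2: copy file contents in parallel; every target already exists
        # as a 0-byte file, so a large file does not block a small one
        def copy(name):
            with open(os.path.join(master_dir, name), "rb") as src, \
                 open(os.path.join(slave_dir, name), "wb") as dst:
                dst.write(src.read())
        with ThreadPoolExecutor(max_workers=workers) as pool:
            list(pool.map(copy, (name for name, _ in changes)))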


In the latest release, 3.7, the xsync crawl is minimized by the history crawl
feature introduced in 3.6, so the chances of missing deletes/renames are lower.

Thanks.

--
Cyril Peponnet

On May 21, 2015, at 10:22 PM, Kotresh Hiremath Ravishankar
mailto:khire...@redhat.com>> wrote:

Great, hope that should work. Let's see

Thanks and Regards,
Kotresh H R

- Original Message -
From: "Cyril N PEPONNET (Cyril)" 
mailto:cyril.pepon...@alcatel-lucent.com>>
To: "Kotresh Hiremath Ravishankar" 
mailto:khire...@redhat.com>>
Cc: "gluster-users" 
mailto:gluster-users@gluster.org>>
Sent: Friday, May 22, 2015 5:31:13 AM
Subject: Re: [Gluster-users] Geo-Replication - Changelog socket is not
present - Falling back to xsync

Thanks to JoeJulian / Kaushal I managed to re-enable the changelog option
and
the socket is now present.

For the record, I had some clients running the RHS gluster-fuse client while our
nodes are running the glusterfs release, and the op-versions are not
“compatible”.

Now I have to wait for the init crawl see if it switches to changelog
detector mode.

Thanks Kotresh
--
Cyril Peponnet

On May 21, 2015, at 8:39 AM, Cyril Peponnet
mailto:cyril.pepon...@alcatel-lucent.com>> 
wrote:

Hi,

Unfortunately,

# gluster vol set usr_global changelog.changelog off
volume set: failed: Staging failed on
mvdcgluster01.us.alcatel-lucent.com.
Error: One or more connected clients cannot support the feature being
set.
These clients need to be upgraded or disconnected before running this
command again


I don’t really know why; I have some clients using 3.6 as the fuse client,
while others are running 3.5.2.

Any advice ?

--
Cyril Peponnet

On May 20, 2015, at 5:17 AM, Kotresh Hiremath Ravishankar
mailto:khire...@redhat.com>> wrote:

Hi Cyril,

From the brick logs, it seems the changelog-notifier thread has got
killed
for some reason,
as notify is failing with EPIPE.

Try the following. It should probably help:
1. Stop geo-replication.
2. Disable changelog: gluster vol set <volname> changelog.changelog off
3. Enable changelog: gluster vol set <volname> changelog.changelog on
4. Start geo-replication.

Let me know if it works.

Thanks and Regards,
Kotresh H R

- Original Message -
From: "Cyril N PEPONNET (Cyril)" 
mailto:cyril.pepon...@alcatel-lucent.com>>
To: "gluster-users" 
mailto:gluster-users@gluster.org>>
Sen

[Gluster-users] how about all bricks in one process with multi-thread

2015-05-25 Thread 张兵
Hi all
I am testing GlusterFS performance. In my storage node I have 16 disks, and each disk
is used as one GlusterFS brick.
When the volume is running there are 16 glusterfsd processes, each occupying about 10%
of the CPU, so the 16 bricks together consume all of the CPU.
I want to reduce the system CPU usage. Why not run all bricks in one multi-threaded
process?
Has anyone tried running all bricks in one process?
Best regards,
James

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Centos7x64] Geo-replication problem glusterfs 3.7.0-2

2015-05-25 Thread wodel youchi
Hi, and thanks for your replies.

For Kotresh : No, I am not using tar ssh for my geo-replication.

For Aravinda: I had to recreate my slave volume from scratch and restart the
geo-replication.

If I have thousands of files with this problem, do I have to apply the
fix for all of them? Is there an easier way?
Can checkpoints help me in this situation?
And, more important, what can cause this problem?

I am syncing containers, which contain lots of small files. Would using tar over
ssh be more suitable?


PS: I tried to execute this command on the Master

bash generate-gfid-file.sh localhost:data2   $PWD/get-gfid.sh
/tmp/master_gfid_file.txt

but I got errors with files that have a blank (space) in their names,
for example: Admin Guide.pdf

The script sees two files, "Admin" and "Guide.pdf", and get-gfid.sh then
returns "no such file or directory" errors.

thanks.


2015-05-25 7:00 GMT+01:00 Aravinda :

> Looks like this is a GFID conflict issue, not the tarssh issue.
>
> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid':
> 'e529a399-756d-4cb1-9779-0af2822a0d94', 'gid': 0, 'mode': 33152, 'entry':
> '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/main.mdb', 'op': 'CREATE'}, 2)
>
> Data: {'uid': 0,
>'gfid': 'e529a399-756d-4cb1-9779-0af2822a0d94',
>'gid': 0,
>'mode': 33152,
>'entry': '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/main.mdb',
>'op': 'CREATE'}
>
> and Error: 2
>
> During creation of "main.mdb" the RPC failed with error number 2, i.e., ENOENT.
> This error occurs when the parent directory does not exist, or exists with a
> different GFID.
> In this case the parent GFID "874799ef-df75-437b-bc8f-3fcd58b54789" does not
> exist on the slave.
>
>
> To fix the issue:
> -
> Find the parent directory of "main.mdb".
> Get the GFID of that directory on the master, using getfattr.
> Check the GFID of the same directory on the slave (to confirm the GFIDs are
> different).
> Delete that directory on the slave.
> Set the virtual xattr for that directory and all the files inside it:
> setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <DIR>
> setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <file-path>
>
>
> Geo-rep will recreate the directory with the proper GFID and start syncing.
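
(As a side note, one quick way to compare the parent directory's GFID on the master
brick and the slave brick is to read the trusted.gfid xattr on each side - a minimal
sketch, assuming Python 3 and root access on the brick nodes; the path below is only a
placeholder:)

    import binascii, os

    def brick_gfid(path):
        # trusted.gfid is the 16-byte xattr GlusterFS stores on each file/dir on a brick
        return binascii.hexlify(os.getxattr(path, "trusted.gfid")).decode()

    # run the same call against the directory on a master brick and on the slave brick,
    # then compare the two hex strings
    print(brick_gfid("/mnt/brick2/brick/path/to/parent-dir"))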
>
> Let us know if you need any help.
>
> --
> regards
> Aravinda
>
>
>
>
> On 05/25/2015 10:54 AM, Kotresh Hiremath Ravishankar wrote:
>
>> Hi Wodel,
>>
>> Is the sync mode tar over ssh (i.e., config use_tarssh is true)?
>> If yes, there is a known issue with it and a patch is already up in master.
>>
>> It can be resolved in either of two ways.
>>
>> 1. If the required sync mode is tar over ssh, just disable sync_xattrs, which
>> is true by default:
>>
>>  gluster vol geo-rep <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config
>> sync_xattrs false
>>
>> 2. If it is OK to change the sync mode to rsync, please do:
>>   gluster vol geo-rep <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config
>> use_tarssh false
>>
>> NOTE: rsync supports syncing of ACLs and xattrs, whereas tar over ssh
>> does not.
>>In 3.7.0-2, tar over ssh should be used with sync_xattrs set to false.
>>
>> Hope this helps.
>>
>> Thanks and Regards,
>> Kotresh H R
>>
>> - Original Message -
>>
>>> From: "wodel youchi" 
>>> To: "gluster-users" 
>>> Sent: Sunday, May 24, 2015 3:31:38 AM
>>> Subject: [Gluster-users] [Centos7x64] Geo-replication problem glusterfs
>>> 3.7.0-2
>>>
>>> Hi,
>>>
>>> I have two gluster servers in replicated mode as MASTERS
>>> and one server for replicated geo-replication.
>>>
>>> I've updated my glusterfs installation to 3.7.0-2, all three servers
>>>
>>> I've recreated my slave volumes
>>> I've started the geo-replication, it worked for a while and now I have
>>> some
>>> problmes
>>>
>>> 1- Files/directories are not deleted on slave
>>> 2- New files/rectories are not synced to the slave.
>>>
>>> I have these lines on the active master
>>>
>>> [2015-05-23 06:21:17.156939] W
>>> [master(/mnt/brick2/brick):792:log_failures]
>>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid':
>>> 'e529a399-756d-4cb1-9779-0af2822a0d94', 'gid': 0, 'mode': 33152, 'entry':
>>> '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/main.mdb', 'op': 'CREATE'},
>>> 2)
>>> [2015-05-23 06:21:17.158066] W
>>> [master(/mnt/brick2/brick):792:log_failures]
>>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid':
>>> 'b4bffa4c-2e88-4b60-9f6a-c665c4d9f7ed', 'gid': 0, 'mode': 33152, 'entry':
>>> '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/main.hdb', 'op': 'CREATE'},
>>> 2)
>>> [2015-05-23 06:21:17.159154] W
>>> [master(/mnt/brick2/brick):792:log_failures]
>>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid':
>>> '9920cdee-6b87-4408-834b-4389f5d451fe', 'gid': 0, 'mode': 33152, 'entry':
>>> '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/main.db', 'op': 'CREATE'}, 2)
>>> [2015-05-23 06:21:17.160242] W
>>> [master(/mnt/brick2/brick):792:log_failures]
>>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid':
>>> '307756d2-d924-456f-b090-10d3ff9caccb', 'gid': 0, 'mode': 33152, 'entry':
>>> '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/main.ndb', 'op': 'CREATE'},
>>> 2)
>

Re: [Gluster-users] Fwd:Re: client is terrible with large amount of    small files

2015-05-25 Thread Kamal
Hi Joe,

   17 seconds is for 442 kB. Is that normal?

> Duration: 17 seconds
> Data Read: 0 bytes
> Data Written: 442368 bytes

 and 1349 seconds for 22 MB:

> Duration: 1349 seconds
> Data Read: 624 bytes
> Data Written: 26675732 bytes
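
(For reference, rough arithmetic on those two samples - bytes written divided by the
duration; just a quick sketch:)

    # rough throughput from the two profile samples quoted above
    for written_bytes, seconds in [(442368, 17), (26675732, 1349)]:
        print("%.1f KiB/s" % (written_bytes / 1024.0 / seconds))
    # prints roughly 25.4 KiB/s and 19.3 KiB/s - very low either way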

Regards,
Kamal

 On Fri, 08 May 2015 11:47:13 +0530 Joe Julian 
wrote  

Looks like 17 seconds. That's not 14 minutes. 

On May 7, 2015 10:55:05 PM PDT, gjprabu  wrote: Hi 
Team,
 
  Any options to solve below issues.

Regards
Prabu
 

 On Thu, 07 May 2015 12:23:02 +0530  wrote  

Hi Vijay,

Do we have any other options to increase the performance.

Regards
Prabu





 On Wed, 06 May 2015 15:51:20 +0530 gjprabu  
wrote  

Hi Vijay,

   We tired on physical machines but its doesn't improve speed.

# gluster volume info
 
Volume Name: integvoltest
Type: Replicate
Volume ID: 6c66afb9-d466-428e-b944-e15d7a1be5f2
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: integ-gluster3:/srv/sdb1/brick7
Brick2: integ-gluster4:/srv/sdb1/brick7
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
cluster.ensure-durability: off
cluster.readdir-optimize: on
performance.readdir-ahead: on
server.event-threads: 30
client.event-threads: 30

 

 
gluster volume profile integvoltest info
Brick: integ-gluster3:/srv/sdb1/brick7
--
Cumulative Stats:
   Block Size:  4b+   8b+  16b+ 
 No. of Reads:0 0 1 
No. of Writes:2 4 8 
 
   Block Size: 32b+  64b+ 128b+ 
 No. of Reads:2 2 2 
No. of Writes:8 6 6 
 
   Block Size:256b+ 512b+1024b+ 
 No. of Reads:0 0 0 
No. of Writes:4 2 6 
 
   Block Size:   2048b+4096b+ 
 No. of Reads:0 0 
No. of Writes:2  6507 
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls Fop
 -   ---   ---   ---   
  0.00   0.00 us   0.00 us   0.00 us 37  FORGET
  0.00   0.00 us   0.00 us   0.00 us 61 RELEASE
  0.00   0.00 us   0.00 us   0.00 us 202442  RELEASEDIR
  0.00 142.00 us 142.00 us 142.00 us  1 REMOVEXATTR
  0.00 101.00 us  79.00 us 142.00 us  3STAT
  0.00 313.00 us 313.00 us 313.00 us  1 XATTROP
  0.00 120.00 us  96.00 us 145.00 us  3READ
  0.00 101.75 us  69.00 us 158.00 us  4  STATFS
  0.00 131.25 us 112.00 us 147.00 us  4GETXATTR
  0.00 256.00 us 216.00 us 309.00 us  3  UNLINK
  0.00 820.00 us 820.00 us 820.00 us  1 SYMLINK
  0.00 109.80 us  72.00 us 197.00 us 10 READDIR
  0.00 125.58 us 100.00 us 161.00 us 12 SETATTR
  0.00 138.36 us 102.00 us 196.00 us 11OPEN
  0.00  55.38 us  24.00 us 240.00 us 29   FLUSH
  0.00 445.00 us 125.00 us 937.00 us  4READDIRP
  0.01 306.43 us 165.00 us 394.00 us  7  RENAME
  0.01 199.55 us 153.00 us 294.00 us 11SETXATTR
  0.01  72.64 us  28.00 us 227.00 us 47FINODELK
  0.02  67.69 us  30.00 us 241.00 us 96 ENTRYLK
  0.031038.18 us 943.00 us1252.00 us 11   MKDIR
  0.03 251.49 us 147.00 us 865.00 us 53FXATTROP
  0.061115.60 us 808.00 us1860.00 us 20  CREATE
  0.07 323.83 us  31.00 us   22132.00 us 88 INODELK
  1.41 170.57 us  79.00 us2022.00 us   3262   WRITE
 26.35 103.15 us   4.00 us 260.00 us 100471 OPENDIR
 71.98 139.07 us  47.00 us 471.00 us 203591  LOOKUP
 
Duration: 1349 seconds
   Data Read: 624 bytes
Data Written: 26675732 bytes
 
Interval 25 Stats:
   Block Size:   4096b+ 
 No. of Reads:0 
No. of Writes:  108 
 %-latency   Avg-latency   Min-

Re: [Gluster-users] gluster 3.4.5,gluster client process was core dump

2015-05-25 Thread Dang Zhiqiang
Thank you very much.


A relevant log:
data1.log:20694:[2015-05-21 07:24:32.652102] E [quota.c:318:quota_check_limit] 
(-->/usr/lib64/glusterfs/3.4.5/xlator/cluster/replicate.so(afr_getxattr_cbk+0xf8)
 [0x7f81fccc5168] 
(-->/usr/lib64/glusterfs/3.4.5/xlator/cluster/distribute.so(dht_getxattr_cbk+0x17d)
 [0x7f81fca8736d] 
(-->/usr/lib64/glusterfs/3.4.5/xlator/features/quota.so(quota_validate_cbk+0x1cd)
 [0x7f81fc8578fd]))) 0-dfs-quota: invalid argument: local->stub


local->stub == NULL



(gdb) l *0x7f81fc8578fd

0x7f81fc8578fd is in quota_validate_cbk (quota.c:243).

238  gettimeofday (&ctx->tv, NULL);

239  }

240  UNLOCK (&ctx->lock);

241 

242  quota_check_limit (frame, local->validate_loc.inode, this, NULL, 
NULL);

243  return 0;

244 

245  unwind:

246  LOCK (&local->lock);

247  {

 

quota_check_limit

318 GF_VALIDATE_OR_GOTO (this->name, local->stub, out);


At 2015-05-25 18:17:52, "Susant Palai"  wrote:
>We found a similar crash and the fix for the same is here 
>http://review.gluster.org/#/c/10389/. You can find the RCA in the commit 
>message.
>
>Regards,
>Susant
>
>- Original Message -
>> From: "Dang Zhiqiang" 
>> To: gluster-users@gluster.org
>> Sent: Monday, 25 May, 2015 3:30:16 PM
>> Subject: [Gluster-users]  gluster 3.4.5,gluster client process was core dump
>> 
>> Hi,
>> 
>> Why this is and how to fix it?
>> Thanks.
>> 
>> client log:
>> data1.log:20695:[2015-05-25 03:12:31.084149] W
>> [dht-common.c:2016:dht_getxattr_cbk]
>> (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x346d80d6f5]
>> (-->/usr/lib64/glusterfs/3.4.5/xlator/protocol/client.so(client3_3_getxattr_cbk+0x178)
>> [0x7f81fcf33ad8]
>> (-->/usr/lib64/glusterfs/3.4.5/xlator/cluster/replicate.so(afr_getxattr_cbk+0xf8)
>> [0x7f81fccc5168]))) 0-dfs-dht: invalid argument: frame->local
>> 
>> core dump info:
>> Core was generated by `/usr/sbin/glusterfs --volfile-id=dfs
>> --volfile-server=node1 /dat'.
>> Program terminated with signal 11, Segmentation fault.
>> #0 0x7f81fca87354 in dht_getxattr_cbk (frame=0x7f82009efe34, cookie=<value optimized out>, this=<value optimized out>, op_ret=<value optimized out>, op_errno=0, xattr=<value optimized out>, xdata=0x0) at dht-common.c:2043
>> 2043 DHT_STACK_UNWIND (getxattr, frame, local->op_ret, op_errno,
>> Missing separate debuginfos, use: debuginfo-install
>> glibc-2.12-1.132.el6_5.4.x86_64 keyutils-libs-1.4-4.el6.x86_64
>> krb5-libs-1.10.3-15.el6_5.1.x86_64 libcom_err-1.41.12-18.el6_5.1.x86_64
>> libgcc-4.4.7-4.el6.x86_64 libselinux-2.0.94-5.3.el6_4.1.x86_64
>> openssl-1.0.1e-16.el6_5.7.x86_64 zlib-1.2.3-29.el6.x86_64
>> (gdb) bt
>> #0 0x7f81fca87354 in dht_getxattr_cbk (frame=0x7f82009efe34, cookie=<value optimized out>, this=<value optimized out>, op_ret=<value optimized out>, op_errno=0, xattr=<value optimized out>, xdata=0x0) at dht-common.c:2043
>> #1 0x7f81fccc5168 in afr_getxattr_cbk (frame=0x7f8200a0d32c, cookie=<value optimized out>, this=<value optimized out>, op_ret=0, op_errno=0, dict=0x7f82003a768c, xdata=0x0) at afr-inode-read.c:618
>> #2 0x7f81fcf33ad8 in client3_3_getxattr_cbk (req=<value optimized out>, iov=<value optimized out>, count=<value optimized out>, myframe=0x7f82009a58fc) at client-rpc-fops.c:1115
>> #3 0x00346d80d6f5 in rpc_clnt_handle_reply (clnt=0x232cb40, pollin=0x1173ac10) at rpc-clnt.c:771
>> #4 0x00346d80ec6f in rpc_clnt_notify (trans=<value optimized out>, mydata=0x232cb70, event=<value optimized out>, data=<value optimized out>) at rpc-clnt.c:891
>> #5 0x00346d80a4e8 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:497
>> #6 0x7f81fdf7f216 in socket_event_poll_in (this=0x233c5a0) at socket.c:2118
>> #7 0x7f81fdf80c3d in socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0x233c5a0, poll_in=1, poll_out=0, poll_err=0) at socket.c:2230
>> #8 0x00346d45e907 in event_dispatch_epoll_handler (event_pool=0x228be90) at event-epoll.c:384
>> #9 event_dispatch_epoll (event_pool=0x228be90) at event-epoll.c:445
>> #10 0x00406818 in main (argc=4, argv=0x7fff9e2e4898) at glusterfsd.c:1934
>> (gdb) print ((call_frame_t *)0x7f82009efe34)->local
>> $2 = (void *) 0x0
>> (gdb) l *0x7f81fca87354
>> 0x7f81fca87354 is in dht_getxattr_cbk (dht-common.c:2043).
>> 2038 dht_aggregate_xattr (xattr, local->xattr);
>> 2039 local->xattr = dict_copy (xattr, local->xattr);
>> 2040 }
>> 2041 out:
>> 2042 if (is_last_call (this_call_cnt)) {
>> 2043 DHT_STACK_UNWIND (getxattr, frame, local->op_ret, op_errno,
>> 2044 local->xattr, NULL);
>> 2045 }
>> 2046 return 0;
>> 2047 }
>> 
>> jump code:
>> 2016 VALIDATE_OR_GOTO (frame->local, out);
>> 
>> 
>> volume info:
>> # gluster v info
>> Volume Name: dfs
>> Type: Distributed-Replicate
>> Volume ID: 1848afb0-44ef-418c-a58f-8d7159ec5d1e
>> Status: Started
>> Number of Bricks: 2 x 2 = 4
>> Transport-type: tcp
>> Bricks:
>> Brick1: node1:/data/vol/dfs
>> Brick2: node2:/data/vol/dfs
>> Brick3: node3:/data/vol/dfs
>> Brick4: node4:/data/vol/dfs
>> Options Reconfigured:
>> diagnostics.client-log-level: WARNING
>> diagnostics.brick-log-level: WARNING
>> nfs.disable: on
>> features.quota: on
>> features.limit-usage:
>> /video/CLOUD:200TB,/video/YI

Re: [Gluster-users] gluster 3.4.5,gluster client process was core dump

2015-05-25 Thread Susant Palai
We found a similar crash and the fix for the same is here 
http://review.gluster.org/#/c/10389/. You can find the RCA in the commit 
message.

Regards,
Susant

- Original Message -
> From: "Dang Zhiqiang" 
> To: gluster-users@gluster.org
> Sent: Monday, 25 May, 2015 3:30:16 PM
> Subject: [Gluster-users]  gluster 3.4.5,gluster client process was core dump
> 
> Hi,
> 
> Why this is and how to fix it?
> Thanks.
> 
> client log:
> data1.log:20695:[2015-05-25 03:12:31.084149] W
> [dht-common.c:2016:dht_getxattr_cbk]
> (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x346d80d6f5]
> (-->/usr/lib64/glusterfs/3.4.5/xlator/protocol/client.so(client3_3_getxattr_cbk+0x178)
> [0x7f81fcf33ad8]
> (-->/usr/lib64/glusterfs/3.4.5/xlator/cluster/replicate.so(afr_getxattr_cbk+0xf8)
> [0x7f81fccc5168]))) 0-dfs-dht: invalid argument: frame->local
> 
> core dump info:
> Core was generated by `/usr/sbin/glusterfs --volfile-id=dfs
> --volfile-server=node1 /dat'.
> Program terminated with signal 11, Segmentation fault.
> #0 0x7f81fca87354 in dht_getxattr_cbk (frame=0x7f82009efe34, cookie=<value optimized out>, this=<value optimized out>, op_ret=<value optimized out>, op_errno=0, xattr=<value optimized out>, xdata=0x0) at dht-common.c:2043
> 2043 DHT_STACK_UNWIND (getxattr, frame, local->op_ret, op_errno,
> Missing separate debuginfos, use: debuginfo-install
> glibc-2.12-1.132.el6_5.4.x86_64 keyutils-libs-1.4-4.el6.x86_64
> krb5-libs-1.10.3-15.el6_5.1.x86_64 libcom_err-1.41.12-18.el6_5.1.x86_64
> libgcc-4.4.7-4.el6.x86_64 libselinux-2.0.94-5.3.el6_4.1.x86_64
> openssl-1.0.1e-16.el6_5.7.x86_64 zlib-1.2.3-29.el6.x86_64
> (gdb) bt
> #0 0x7f81fca87354 in dht_getxattr_cbk (frame=0x7f82009efe34, cookie=<value optimized out>, this=<value optimized out>, op_ret=<value optimized out>, op_errno=0, xattr=<value optimized out>, xdata=0x0) at dht-common.c:2043
> #1 0x7f81fccc5168 in afr_getxattr_cbk (frame=0x7f8200a0d32c, cookie=<value optimized out>, this=<value optimized out>, op_ret=0, op_errno=0, dict=0x7f82003a768c, xdata=0x0) at afr-inode-read.c:618
> #2 0x7f81fcf33ad8 in client3_3_getxattr_cbk (req=<value optimized out>, iov=<value optimized out>, count=<value optimized out>, myframe=0x7f82009a58fc) at client-rpc-fops.c:1115
> #3 0x00346d80d6f5 in rpc_clnt_handle_reply (clnt=0x232cb40, pollin=0x1173ac10) at rpc-clnt.c:771
> #4 0x00346d80ec6f in rpc_clnt_notify (trans=<value optimized out>, mydata=0x232cb70, event=<value optimized out>, data=<value optimized out>) at rpc-clnt.c:891
> #5 0x00346d80a4e8 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:497
> #6 0x7f81fdf7f216 in socket_event_poll_in (this=0x233c5a0) at socket.c:2118
> #7 0x7f81fdf80c3d in socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0x233c5a0, poll_in=1, poll_out=0, poll_err=0) at socket.c:2230
> #8 0x00346d45e907 in event_dispatch_epoll_handler (event_pool=0x228be90) at event-epoll.c:384
> #9 event_dispatch_epoll (event_pool=0x228be90) at event-epoll.c:445
> #10 0x00406818 in main (argc=4, argv=0x7fff9e2e4898) at glusterfsd.c:1934
> (gdb) print ((call_frame_t *)0x7f82009efe34)->local
> $2 = (void *) 0x0
> (gdb) l *0x7f81fca87354
> 0x7f81fca87354 is in dht_getxattr_cbk (dht-common.c:2043).
> 2038 dht_aggregate_xattr (xattr, local->xattr);
> 2039 local->xattr = dict_copy (xattr, local->xattr);
> 2040 }
> 2041 out:
> 2042 if (is_last_call (this_call_cnt)) {
> 2043 DHT_STACK_UNWIND (getxattr, frame, local->op_ret, op_errno,
> 2044 local->xattr, NULL);
> 2045 }
> 2046 return 0;
> 2047 }
> 
> jump code:
> 2016 VALIDATE_OR_GOTO (frame->local, out);
> 
> 
> volume info:
> # gluster v info
> Volume Name: dfs
> Type: Distributed-Replicate
> Volume ID: 1848afb0-44ef-418c-a58f-8d7159ec5d1e
> Status: Started
> Number of Bricks: 2 x 2 = 4
> Transport-type: tcp
> Bricks:
> Brick1: node1:/data/vol/dfs
> Brick2: node2:/data/vol/dfs
> Brick3: node3:/data/vol/dfs
> Brick4: node4:/data/vol/dfs
> Options Reconfigured:
> diagnostics.client-log-level: WARNING
> diagnostics.brick-log-level: WARNING
> nfs.disable: on
> features.quota: on
> features.limit-usage:
> /video/CLOUD:200TB,/video/YINGSHIKU:200TB,/video/LIVENEW:200TB,/video/SOCIAL:200TB,/video/mini:200TB,/video/2013:200TB,/video:200TB
> 
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] gluster 3.4.5,gluster client process was core dump

2015-05-25 Thread Dang Zhiqiang
Hi,


Why this is and how to fix it?
Thanks.


client log:
data1.log:20695:[2015-05-25 03:12:31.084149] W 
[dht-common.c:2016:dht_getxattr_cbk] 
(-->/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x346d80d6f5] 
(-->/usr/lib64/glusterfs/3.4.5/xlator/protocol/client.so(client3_3_getxattr_cbk+0x178)
 [0x7f81fcf33ad8] 
(-->/usr/lib64/glusterfs/3.4.5/xlator/cluster/replicate.so(afr_getxattr_cbk+0xf8)
 [0x7f81fccc5168]))) 0-dfs-dht: invalid argument: frame->local


core dump info:
Core was generated by `/usr/sbin/glusterfs --volfile-id=dfs 
--volfile-server=node1 /dat'.
Program terminated with signal 11, Segmentation fault.
#0  0x7f81fca87354 in dht_getxattr_cbk (frame=0x7f82009efe34, cookie=<value optimized out>, this=<value optimized out>, op_ret=<value optimized out>, op_errno=0, xattr=<value optimized out>, xdata=0x0) at dht-common.c:2043
2043        DHT_STACK_UNWIND (getxattr, frame, local->op_ret, op_errno,
Missing separate debuginfos, use: debuginfo-install 
glibc-2.12-1.132.el6_5.4.x86_64 keyutils-libs-1.4-4.el6.x86_64 
krb5-libs-1.10.3-15.el6_5.1.x86_64 libcom_err-1.41.12-18.el6_5.1.x86_64 
libgcc-4.4.7-4.el6.x86_64 libselinux-2.0.94-5.3.el6_4.1.x86_64 
openssl-1.0.1e-16.el6_5.7.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt 
#0  0x7f81fca87354 in dht_getxattr_cbk (frame=0x7f82009efe34, cookie=<value optimized out>, this=<value optimized out>, op_ret=<value optimized out>, op_errno=0, xattr=<value optimized out>, xdata=0x0) at dht-common.c:2043
#1  0x7f81fccc5168 in afr_getxattr_cbk (frame=0x7f8200a0d32c, cookie=<value optimized out>, this=<value optimized out>, op_ret=0, op_errno=0, dict=0x7f82003a768c, xdata=0x0) at afr-inode-read.c:618
#2  0x7f81fcf33ad8 in client3_3_getxattr_cbk (req=<value optimized out>, iov=<value optimized out>, count=<value optimized out>, myframe=0x7f82009a58fc) at client-rpc-fops.c:1115
#3  0x00346d80d6f5 in rpc_clnt_handle_reply (clnt=0x232cb40, pollin=0x1173ac10) at rpc-clnt.c:771
#4  0x00346d80ec6f in rpc_clnt_notify (trans=<value optimized out>, mydata=0x232cb70, event=<value optimized out>, data=<value optimized out>) at rpc-clnt.c:891
#5  0x00346d80a4e8 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:497
#6  0x7f81fdf7f216 in socket_event_poll_in (this=0x233c5a0) at socket.c:2118
#7  0x7f81fdf80c3d in socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0x233c5a0, poll_in=1, poll_out=0, poll_err=0) at socket.c:2230
#8  0x00346d45e907 in event_dispatch_epoll_handler (event_pool=0x228be90) at event-epoll.c:384
#9  event_dispatch_epoll (event_pool=0x228be90) at event-epoll.c:445
#10 0x00406818 in main (argc=4, argv=0x7fff9e2e4898) at glusterfsd.c:1934
(gdb) print ((call_frame_t *)0x7f82009efe34)->local
$2 = (void *) 0x0
(gdb) l *0x7f81fca87354
0x7f81fca87354 is in dht_getxattr_cbk (dht-common.c:2043).
2038                        dht_aggregate_xattr (xattr, local->xattr);
2039                        local->xattr = dict_copy (xattr, local->xattr);
2040                }
2041        out:
2042        if (is_last_call (this_call_cnt)) {
2043                DHT_STACK_UNWIND (getxattr, frame, local->op_ret, op_errno,
2044                                  local->xattr, NULL);
2045        }
2046        return 0;
2047}


jump code:
2016 VALIDATE_OR_GOTO (frame->local, out);




volume info:
# gluster v info
 
Volume Name: dfs
Type: Distributed-Replicate
Volume ID: 1848afb0-44ef-418c-a58f-8d7159ec5d1e
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: node1:/data/vol/dfs
Brick2: node2:/data/vol/dfs
Brick3: node3:/data/vol/dfs
Brick4: node4:/data/vol/dfs
Options Reconfigured:
diagnostics.client-log-level: WARNING
diagnostics.brick-log-level: WARNING
nfs.disable: on
features.quota: on
features.limit-usage: 
/video/CLOUD:200TB,/video/YINGSHIKU:200TB,/video/LIVENEW:200TB,/video/SOCIAL:200TB,/video/mini:200TB,/video/2013:200TB,/video:200TB

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Geo-Replication - Changelog socket is not present - Falling back to xsync

2015-05-25 Thread Kotresh Hiremath Ravishankar
Hi Cyril,

Answers inline

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Cyril N PEPONNET (Cyril)" 
> To: "Kotresh Hiremath Ravishankar" 
> Cc: "gluster-users" 
> Sent: Friday, May 22, 2015 9:34:47 PM
> Subject: Re: [Gluster-users] Geo-Replication - Changelog socket is not 
> present - Falling back to xsync
> 
> One last question, correct me if I’m wrong.
> 
> When you start a geo-rep process it starts with xsync aka hybrid crawling
> (sending files every 60s, with files windows set as 8192 files per sent).
> 
> When the crawl is done it should use changelog detector and dynamically
> change things to slaves.
> 
> 1/ During the hybride crawl, if we delete files from master (and they were
> already transfered to the slave), xsync process will not delete them from
> the slave (and we can’t change as the option as is hardcoded).
> When it will pass to changelog, will it remove the non existent folders and
> files on the slave that are no longer on the master ?
> 
   
  You are right: xsync does not sync deletes of files that have already been synced.
  After xsync, when it switches to changelog, it does not delete the entries on the
  slave that no longer exist on the master. Changelog can only replicate deletes that
  happen from the time it switched to changelog.

> 2/ With changelog, if I add a file of 10GB and after a file of 1KB, will the
> changelog process with queue (waiting for the 10GB file to be sent) or are
> the sent done in thread ?
> (ex I add a 10GB file and I delete it after 1min, what will happen ?)
> 
   Changelog records the operations that happened on the master, and geo-replication
   replays them onto the slave volume. Geo-replication syncs files in two phases.

   1. Phase-1: Create entries through RPC (0-byte files on the slave, keeping the gfid
      intact as on the master)
   2. Phase-2: Sync data through rsync/tar_over_ssh (multi-threaded)

   Now, keeping that in mind, Phase-1 happens serially and Phase-2 happens in parallel.
   The zero-byte files for the 10 GB and 1 KB files get created on the slave serially,
   and the data for them syncs in parallel. Another thing to remember: geo-rep makes
   sure that syncing data to a file is attempted only after the zero-byte file for it
   has already been created.


In the latest release, 3.7, the xsync crawl is minimized by the history crawl
feature introduced in 3.6, so the chances of missing deletes/renames are lower.

> Thanks.
> 
> --
> Cyril Peponnet
> 
> > On May 21, 2015, at 10:22 PM, Kotresh Hiremath Ravishankar
> >  wrote:
> > 
> > Great, hope that should work. Let's see
> > 
> > Thanks and Regards,
> > Kotresh H R
> > 
> > - Original Message -
> >> From: "Cyril N PEPONNET (Cyril)" 
> >> To: "Kotresh Hiremath Ravishankar" 
> >> Cc: "gluster-users" 
> >> Sent: Friday, May 22, 2015 5:31:13 AM
> >> Subject: Re: [Gluster-users] Geo-Replication - Changelog socket is not
> >> present - Falling back to xsync
> >> 
> >> Thanks to JoeJulian / Kaushal I managed to re-enable the changelog option
> >> and
> >> the socket is now present.
> >> 
> >> For the record I had some clients running rhs gluster-fuse and our nodes
> >> are
> >> running glusterfs release and op-version are not “compatible”.
> >> 
> >> Now I have to wait for the init crawl see if it switches to changelog
> >> detector mode.
> >> 
> >> Thanks Kotresh
> >> --
> >> Cyril Peponnet
> >> 
> >>> On May 21, 2015, at 8:39 AM, Cyril Peponnet
> >>>  wrote:
> >>> 
> >>> Hi,
> >>> 
> >>> Unfortunately,
> >>> 
> >>> # gluster vol set usr_global changelog.changelog off
> >>> volume set: failed: Staging failed on
> >>> mvdcgluster01.us.alcatel-lucent.com.
> >>> Error: One or more connected clients cannot support the feature being
> >>> set.
> >>> These clients need to be upgraded or disconnected before running this
> >>> command again
> >>> 
> >>> 
> >>> I don’t know really why, I have some clients using 3.6 as fuse client
> >>> others are running on 3.5.2.
> >>> 
> >>> Any advice ?
> >>> 
> >>> --
> >>> Cyril Peponnet
> >>> 
>  On May 20, 2015, at 5:17 AM, Kotresh Hiremath Ravishankar
>   wrote:
>  
>  Hi Cyril,
>  
>  From the brick logs, it seems the changelog-notifier thread has got
>  killed
>  for some reason,
>  as notify is failing with EPIPE.
>  
>  Try the following. It should probably help:
>  1. Stop geo-replication.
>  2. Disable changelog: gluster vol set <volname> changelog.changelog off
>  3. Enable changelog: gluster vol set <volname> changelog.changelog on
>  4. Start geo-replication.
>  
>  Let me know if it works.
>  
>  Thanks and Regards,
>  Kotresh H R
>  
>  - Original Message -
> > From: "Cyril N PEPONNET (Cyril)" 
> > To: "gluster-users" 
> > Sent: Tuesday, May 19, 2015 3:16:22 AM
> > Subject: [Gluster-users] Geo-Replication - Changelog socket is not
> > present - Falling back to xsync
> > 
> > Hi Gluster Community,
> > 
> > I have a 3 nodes setup at