Re: [Gluster-users] Gluster 2.6 and infiniband

2012-06-08 Thread bxma...@gmail.com
Hello,

after downgrading the kernel to 2.6.28 (glusterd does not work on 3.2.12 -
see my previous email) I am not able to use rdma at all. Mounting without
rdma (the volume uses tcp,rdma) works, but the speed tops out at about 150 MB/s.
When I try to mount with .rdma it fails and the log contains this:

[2012-06-08 03:50:32.442263] I [glusterfsd.c:1493:main]
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version
3.2.6
[2012-06-08 03:50:32.451931] W [write-behind.c:3023:init]
0-atlas1-write-behind: disabling write-behind for first 0 bytes
[2012-06-08 03:50:32.455502] E [rdma.c:3969:rdma_init]
0-rpc-transport/rdma: Failed to get infiniband device context
[2012-06-08 03:50:32.455528] E [rdma.c:4813:init] 0-atlas1-client-0:
Failed to initialize IB Device
[2012-06-08 03:50:32.455541] E
[rpc-transport.c:742:rpc_transport_load] 0-rpc-transport: 'rdma'
initialization failed
[2012-06-08 03:50:32.44] W
[rpc-clnt.c:926:rpc_clnt_connection_init] 0-atlas1-client-0: loading
of new rpc-transport failed
[2012-06-08 03:50:32.456355] E [client.c:2095:client_init_rpc]
0-atlas1-client-0: failed to initialize RPC
[2012-06-08 03:50:32.456378] E [xlator.c:1447:xlator_init]
0-atlas1-client-0: Initialization of volume 'atlas1-client-0' failed,
review your volfile again
[2012-06-08 03:50:32.456391] E [graph.c:348:glusterfs_graph_init]
0-atlas1-client-0: initializing translator failed
[2012-06-08 03:50:32.456403] E [graph.c:526:glusterfs_graph_activate]
0-graph: init failed
[2012-06-08 03:50:32.456680] W [glusterfsd.c:727:cleanup_and_exit]
(--/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)
[0x7f98ecea7175] (--/usr/sbin
/glusterfs(mgmt_getspec_cbk+0xc7) [0x4089d7]
(--/usr/sbin/glusterfs(glusterfs_process_volfp+0x1a0) [0x406410])))
0-: received signum (0), shutting down
[2012-06-08 03:50:32.456720] I [fuse-bridge.c:3727:fini] 0-fuse:
Unmounting 'mount'.

The InfiniBand configuration is the same with both the new and the old kernel.
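
For what it's worth, "Failed to get infiniband device context" most likely means
the glusterfs rdma transport could not open the HCA through libibverbs at all.
A quick way to check the driver/userspace side under the downgraded kernel -
just a sketch, assuming the usual libibverbs-utils / infiniband-diags tools are
installed - is:

ibv_devinfo                              # should list the HCA with its ports ACTIVE
ibstat                                   # same information from infiniband-diags
lsmod | grep -E 'mlx|ib_uverbs|rdma'     # verify the verbs/driver modules are loaded

If ibv_devinfo itself fails on 2.6.28, the problem is below gluster.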

thanks

Matus


2012/6/7 Sabuj Pattanayek sab...@gmail.com:
 To make a long story short, I made rdma client volfiles and mounted with
 them directly:

 #/etc/glusterd/vols/pirdist/pirdist.rdma-fuse.vol      /pirdist    glusterfs  transport=rdma  0 0
 #/etc/glusterd/vols/pirstripe/pirstripe.rdma-fuse.vol  /pirstripe  glusterfs  transport=rdma  0 0

 The transport=rdma option does nothing here since the parameters are read
 from the .vol files. However, you'll see that they're now commented out,
 because RDMA has been very unstable for us: servers lose their connections
 to each other, which somehow causes the GbE clients to lose their
 connections as well. IP over IB, however, is working great - at the expense
 of some performance versus RDMA, but still much better than GbE.

 On Thu, Jun 7, 2012 at 4:25 AM, bxma...@gmail.com bxma...@gmail.com wrote:
 Hello,

 At first it was tcp, then tcp,rdma.

 You are right that .rdma does not work without the tcp transport defined.
 But now I have another problem: whether I mount over tcp or rdma, and even
 when I use the normal 1 Gbit network card (not the InfiniBand IP), I still
 get the same speeds - uploads around 30 MB/s and downloads around 200 MB/s -
 so I'm not sure RDMA is working at all.

 Native InfiniBand gives me about 3500 MB/s in benchmark tests (ib_rdma_bw).

 thanks

 Matus

 2012/6/7 Amar Tumballi ama...@redhat.com:
 On 06/07/2012 02:04 PM, bxma...@gmail.com wrote:

 Hello,

 I have a problem with gluster 3.2.6 and InfiniBand. With gluster 3.3 it
 works fine, but with 3.2.6 I have the following problem:

 When I try to mount the rdma volume with the command 'mount -t glusterfs
 192.168.100.1:/atlas1.rdma mount' I get:

 [2012-06-07 04:30:18.894337] I [glusterfsd.c:1493:main]
 0-/usr/local/sbin/glusterfs: Started running /usr/local/sbin/glusterfs
 version 3.2.6
 [2012-06-07 04:30:18.907499] E
 [glusterfsd-mgmt.c:628:mgmt_getspec_cbk] 0-glusterfs: failed to get
 the 'volume file' from server
 [2012-06-07 04:30:18.907592] E
 [glusterfsd-mgmt.c:695:mgmt_getspec_cbk] 0-mgmt: failed to fetch
 volume file (key:/atlas1.rdma)
 [2012-06-07 04:30:18.907995] W [glusterfsd.c:727:cleanup_and_exit]
 (--/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0xc9)
 [0x7f784e2c8bc9] (--/usr/local/
 lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x7f784e2c8975]
 (--/usr/local/sbin/glusterfs(mgmt_getspec_cbk+0x28b) [0x40861b])))
 0-: received signum (0)
 , shutting down
 [2012-06-07 04:30:18.908049] I [fuse-bridge.c:3727:fini] 0-fuse:
 Unmounting 'mount'.

 Same command without .rdma works ok.


 Is the volume's transport type only 'rdma', or 'tcp,rdma'? If it is only
 'rdma', then appending .rdma to the volume name is not required. Appending
 .rdma is only needed when a volume has both transport types (i.e.
 'tcp,rdma'), so that from the client you can decide which transport you
 want to mount.

 The plain volume name points to the 'tcp' transport, and appending .rdma
 points to the rdma transport.

 Hope that is clear now.
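
 A minimal sketch of the two cases (volume name and brick path here are just
 placeholders):

 gluster volume create atlas1 transport tcp,rdma server1:/export/atlas1
 gluster volume start atlas1
 mount -t glusterfs 192.168.100.1:/atlas1 /mnt/atlas1        # tcp transport
 mount -t glusterfs 192.168.100.1:/atlas1.rdma /mnt/atlas1   # rdma transport

 With 'transport rdma' only, the plain volume name already means rdma and no
 suffix is needed.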

 Regards,
 Amar

[Gluster-users] Gluster 2.6 and infiniband

2012-06-07 Thread bxma...@gmail.com
Hello,

I have a problem with gluster 3.2.6 and InfiniBand. With gluster 3.3 it
works fine, but with 3.2.6 I have the following problem:

When I try to mount the rdma volume with the command 'mount -t glusterfs
192.168.100.1:/atlas1.rdma mount' I get:

[2012-06-07 04:30:18.894337] I [glusterfsd.c:1493:main]
0-/usr/local/sbin/glusterfs: Started running /usr/local/sbin/glusterfs
version 3.2.6
[2012-06-07 04:30:18.907499] E
[glusterfsd-mgmt.c:628:mgmt_getspec_cbk] 0-glusterfs: failed to get
the 'volume file' from server
[2012-06-07 04:30:18.907592] E
[glusterfsd-mgmt.c:695:mgmt_getspec_cbk] 0-mgmt: failed to fetch
volume file (key:/atlas1.rdma)
[2012-06-07 04:30:18.907995] W [glusterfsd.c:727:cleanup_and_exit]
(--/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0xc9)
[0x7f784e2c8bc9] (--/usr/local/
lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x7f784e2c8975]
(--/usr/local/sbin/glusterfs(mgmt_getspec_cbk+0x28b) [0x40861b])))
0-: received signum (0)
, shutting down
[2012-06-07 04:30:18.908049] I [fuse-bridge.c:3727:fini] 0-fuse:
Unmounting 'mount'.

Same command without .rdma works ok.

thanks

Matus


Re: [Gluster-users] Gluster 2.6 and infiniband

2012-06-07 Thread bxma...@gmail.com
Hello,

At first it was tcp, then tcp,rdma.

You are right that .rdma does not work without the tcp transport defined.
But now I have another problem: whether I mount over tcp or rdma, and even
when I use the normal 1 Gbit network card (not the InfiniBand IP), I still
get the same speeds - uploads around 30 MB/s and downloads around 200 MB/s -
so I'm not sure RDMA is working at all.

Native InfiniBand gives me about 3500 MB/s in benchmark tests (ib_rdma_bw).

thanks

Matus

2012/6/7 Amar Tumballi ama...@redhat.com:
 On 06/07/2012 02:04 PM, bxma...@gmail.com wrote:

 Hello,

 I have a problem with gluster 3.2.6 and InfiniBand. With gluster 3.3 it
 works fine, but with 3.2.6 I have the following problem:

 When I try to mount the rdma volume with the command 'mount -t glusterfs
 192.168.100.1:/atlas1.rdma mount' I get:

 [2012-06-07 04:30:18.894337] I [glusterfsd.c:1493:main]
 0-/usr/local/sbin/glusterfs: Started running /usr/local/sbin/glusterfs
 version 3.2.6
 [2012-06-07 04:30:18.907499] E
 [glusterfsd-mgmt.c:628:mgmt_getspec_cbk] 0-glusterfs: failed to get
 the 'volume file' from server
 [2012-06-07 04:30:18.907592] E
 [glusterfsd-mgmt.c:695:mgmt_getspec_cbk] 0-mgmt: failed to fetch
 volume file (key:/atlas1.rdma)
 [2012-06-07 04:30:18.907995] W [glusterfsd.c:727:cleanup_and_exit]
 (--/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0xc9)
 [0x7f784e2c8bc9] (--/usr/local/
 lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x7f784e2c8975]
 (--/usr/local/sbin/glusterfs(mgmt_getspec_cbk+0x28b) [0x40861b])))
 0-: received signum (0)
 , shutting down
 [2012-06-07 04:30:18.908049] I [fuse-bridge.c:3727:fini] 0-fuse:
 Unmounting 'mount'.

 Same command without .rdma works ok.


 Is the volume's transport type only 'rdma', or 'tcp,rdma'? If it is only
 'rdma', then appending .rdma to the volume name is not required. Appending
 .rdma is only needed when a volume has both transport types (i.e.
 'tcp,rdma'), so that from the client you can decide which transport you
 want to mount.

 The plain volume name points to the 'tcp' transport, and appending .rdma
 points to the rdma transport.

 Hope that is clear now.

 Regards,
 Amar


[Gluster-users] Gluster 3.3 and xen 4.1.2 under Kernel 3.2 problem

2012-06-06 Thread bxma...@gmail.com
Hello

I have the following problem: I'm trying to run gluster server 3.3.0 under
Xen 4.1.2 with kernel 3.2.12.
I'm running glusterd in dom0, not inside a virtual environment.

Glusterd dies with the following message:

E [glusterd.c:270:glusterd_check_gsync_present] 0-glusterd:
geo-replication module not working as desired
D [glusterd.c:298:glusterd_check_gsync_present] 0-glusterd: Returning -1
E [xlator.c:385:xlator_init] 0-management: Initialization of volume
'management' failed, review your volfile again

With the hypervisor turned off, the same kernel and exactly the same settings work fine.
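
For what it's worth, that error comes from glusterd's start-up check of the
geo-replication helper. A sketch of how to see what actually fails - the gsyncd
path depends on your install prefix, and I'm assuming the check simply runs it
with --version:

glusterd --debug                          # run glusterd in the foreground with debug logging
/usr/libexec/glusterfs/gsyncd --version   # the helper glusterd tries to execute

If gsyncd cannot be executed in the Xen dom0 environment, that would explain the
failed initialization of the 'management' volume.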

thanks

Matus


[Gluster-users] Gluster 3.2 configurations + translators

2011-09-15 Thread bxma...@gmail.com
Hello,

I'm a little confused about the gluster configuration interface. I started
with gluster 3.2 and did all configuration using the gluster CLI.
Now that I'm looking into how to tune performance, I keep finding pieces of
text configuration files in many places in the documentation, but usually
with a warning that they are old and should not be used.

Right now I'm trying to turn on io-cache, and some documentation says it
needs to be enabled on both the server and the client end.
On the server I used
gluster volume set atlas performance.io-cache on

but on the client the gluster command dies with a timeout, or with an error
that glusterd is not running.

So the question is: how do I correctly configure the client end of gluster?
There is very little about this in the gluster 3.2 documentation, and I
don't know how much of the 3.1 material still applies. And is there any
translator documentation for gluster 3.2?

thanks

Matus


Re: [Gluster-users] Gluster 3.2 configurations + translators

2011-09-15 Thread bxma...@gmail.com
Hmmm, where can I check whether the client is configured to pull its
configuration from the server?
On the server I have /etc/glusterd and /etc/gluster, which look like they
are not used at all.
On the client end there is only /etc/gluster, which is also unused (all defaults).

Matus

2011/9/15  greg_sw...@aotx.uscourts.gov:


 gluster-users-boun...@gluster.org wrote on 09/15/2011 08:40:52 AM:

 I'm a little confused about the gluster configuration interface. I started
 with gluster 3.2 and did all configuration using the gluster CLI.
 Now that I'm looking into how to tune performance, I keep finding pieces of
 text configuration files in many places in the documentation, but usually
 with a warning that they are old and should not be used.

 Right now I'm trying to turn on io-cache, and some documentation says it
 needs to be enabled on both the server and the client end.
 On the server I used
 gluster volume set atlas performance.io-cache on

 but on the client the gluster command dies with a timeout, or with an error
 that glusterd is not running.

 So the question is: how do I correctly configure the client end of gluster?
 There is very little about this in the gluster 3.2 documentation, and I
 don't know how much of the 3.1 material still applies. And is there any
 translator documentation for gluster 3.2?

 With the newer versions they are really pushing away from having to
 manually configure bits. As long as your client is configured to pull its
 configuration file from the server, then when you run the command on the
 server the client should get an updated config file.

 You should be able to look in the client's log file and see that the
 config file was updated (I don't have an example at the moment).

 Another way you can check this is whether the number of connections from the
 client to the server (netstat -pant | grep gluster | wc -l) increases after
 you make the change (it should increase by the number of bricks in the
 volume, I believe).
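
 A concrete way to watch this, as a sketch - assuming the default log directory
 /var/log/glusterfs/ and a mount point of /mnt/atlas, so the client log is named
 after the mount point:

 tail -f /var/log/glusterfs/mnt-atlas.log            # on the client
 gluster volume set atlas performance.io-cache on    # on the server
 netstat -pant | grep gluster | wc -l                # on the client, before/after

 The client log should show a new graph being loaded when the option is applied.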

 -greg




Re: [Gluster-users] Top Reset

2011-07-21 Thread bxma...@gmail.com
Hmm, I just found out that the top stats are not changing at all; it looks
like it collected data for some time and then nothing changes ... does
anyone know how this is supposed to work?
The documentation on this feature is very poor.
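
For reference, the counters come from the 'gluster volume top' family of
commands - a sketch, assuming a volume named atlas; whether a reset exists
depends on the release, and I don't think 3.2 has one:

gluster volume top atlas read list-cnt 10    # show the current read top list
gluster volume top atlas clear               # later releases: reset the counters

On 3.2, restarting the brick processes may be the only way to zero them.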

thanks

Matus

2011/7/20 bxma...@gmail.com bxma...@gmail.com:
 Hello,

 Is there any way to reset the volume TOP statistics?

 thanks

 Matus



[Gluster-users] Top Reset

2011-07-20 Thread bxma...@gmail.com
Hello,

Is there any way to reset the volume TOP statistics?

thanks

Matus


Re: [Gluster-users] Gluster 3.2.0 and ucarp not working

2011-06-08 Thread bxma...@gmail.com
When a client connects to any gluster node, it automatically receives the list
of all the other nodes for that volume.
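
On the mount-time question Joshua asks below: newer mount.glusterfs versions
accept a backup volfile server option, so the fstab line can name a second host
to fetch the volume file from - a sketch, option name and availability depend on
the glusterfs release:

192.168.2.100:/distrep  /mnt/distrep  glusterfs  defaults,_netdev,backupvolfile-server=192.168.2.101  0 0

Once the volume is mounted, failover between the bricks does not depend on that
host at all.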

Matus
On 8.6.2011 at 8:13, Joshua Baker-LePain jl...@duke.edu wrote:
 On Mon, 6 Jun 2011 at 1:30am, Craig Carl wrote

 Matus -
 If you are using the Gluster native client (mount -t glusterfs ...)
 then ucarp/CTDB is NOT required and you should not install it. Always
 use the real IPs when you are mounting with 'mount -t glusterfs...'.

 Hrm. That wasn't my understanding. Say my fstab line looks like this:

 192.168.2.100:/distrep /mnt/distrep glusterfs defaults,_netdev 0 0

 Now, let's say that at mount time 192.168.2.100 is down. How does the
 Gluster native client know which other IP addresses to contact to get the
 volume file? Is there a way to put multiple hosts in the fstab line?

 --
 Joshua Baker-LePain
 QB3 Shared Cluster Sysadmin
 UCSF


Re: [Gluster-users] uninterruptible processes writing to glusterfsshare

2011-06-08 Thread bxma...@gmail.com
I've been using kernel 2.6.34 + fuse 2.5.5 + gluster 3.2 from the beginning,
and it happened again today ...
php-fpm froze and a reboot was the only solution.

Matus


2011/6/7 Markus Fröhlich markus.froehl...@xidras.com:
 hi!

 there is no relevant output from dmesg.
 no entries in the server log - only the one line in the client-server log, I
 already posted.

 the glusterfs version on the server had been updated to gfs 3.2.0 more than
 a month ago.
 because of the troubles on the backup server, I deleted the whole backup
 share and started from scratch.


 I looked for an update of fuse and upgraded from 2.7.2-61.18.1 to
 2.8.5-41.1
 maybe this helps.

 here is the changelog info:

 Authors:
 
    Miklos Szeredi mik...@szeredi.hu
 Distribution: systemsmanagement:baracus / SLE_11_SP1
 * Tue Mar 29 2011 db...@novell.com
 - remove the --no-canonicalize usage for suse_version = 11.3

 * Mon Mar 21 2011 co...@novell.com
 - licenses package is about to die

 * Thu Feb 17 2011 mszer...@suse.cz
 - In case of failure to add to /etc/mtab don't umount. [bnc#668820]
  [CVE-2011-0541]

 * Tue Nov 16 2010 mszer...@suse.cz
 - Fix symlink attack for mount and umount [bnc#651598]

 * Wed Oct 27 2010 mszer...@suse.cz
 - Remove /etc/init.d/boot.fuse [bnc#648843]

 * Tue Sep 28 2010 mszer...@suse.cz
 - update to 2.8.5
  * fix option escaping for fusermount [bnc#641480]

 * Wed Apr 28 2010 mszer...@suse.cz
 - keep examples and internal docs in devel package (from jnweiger)

 * Mon Apr 26 2010 mszer...@suse.cz
 - update to 2.8.4
  * fix checking for symlinks in umount from /tmp
  * fix umounting if /tmp is a symlink


 kind regards
 markus froehlich

 On 06.06.2011 at 21:19, Anthony J. Biacco wrote:

 Could be fuse, check 'dmesg' for kernel module timeouts.

 In a similar vein, has anyone seen significant performance/reliability
 differences with different fuse versions? Say, latest source vs. RHEL distro RPM versions.

 -Tony



 -Original Message-
 From: Mohit Anchlia mohitanch...@gmail.com
 Sent: June 06, 2011 1:14 PM
 To: Markus Fröhlich markus.froehl...@xidras.com
 Cc: gluster-users@gluster.org gluster-users@gluster.org
 Subject: Re: [Gluster-users] uninterruptible processes writing to
 glusterfsshare

 Is there anything in the server logs? Does it follow any particular
 pattern before going in this mode?

 Did you upgrade Gluster or is this new install?

 2011/6/6 Markus Fröhlich markus.froehl...@xidras.com:

 hi!

 sometimes we've on some client-servers hanging uninterruptible processes
 (ps aux stat is on D ) and on one the CPU wait I/O grows within some
 minutes to 100%.
 you are not able to kill such processes - also kill -9 doesnt work -
 when
 you connect via strace to such an process, you wont see anything and
 you
 cannot detach it again.

 there are only two possibilities:
 killing the glusterfs process (umount GFS share) or rebooting the server.

 the only log entry I found, was on one client - just a single line:
 [2011-06-06 10:44:18.593211] I
 [afr-common.c:581:afr_lookup_collect_xattr]
 0-office-data-replicate-0: data self-heal is pending for

 /pc-partnerbet-public/Promotionaktionen/Mailakquise_2009/Webmaster_2010/HTML/bilder/Thumbs.db.

 one of the client-servers is a samba-server, the other one a
 backup-server
 based on rsync with millions of small files.

 gfs-servers + gfs-clients: SLES11 x86_64, glusterfs V 3.2.0

 and here are the configs from server and client:
 server config

 /etc/glusterd/vols/office-data/office-data.gfs-01-01.GFS-office-data02.vol:
 volume office-data-posix
    type storage/posix
    option directory /GFS/office-data02
 end-volume

 volume office-data-access-control
    type features/access-control
    subvolumes office-data-posix
 end-volume

 volume office-data-locks
    type features/locks
    subvolumes office-data-access-control
 end-volume

 volume office-data-io-threads
    type performance/io-threads
    subvolumes office-data-locks
 end-volume

 volume office-data-marker
    type features/marker
    option volume-uuid 3c6e633d-a0bb-4c52-8f05-a2db9bc9c659
    option timestamp-file /etc/glusterd/vols/office-data/marker.tstamp
    option xtime off
    option quota off
    subvolumes office-data-io-threads
 end-volume

 volume /GFS/office-data02
    type debug/io-stats
    option latency-measurement off
    option count-fop-hits off
    subvolumes office-data-marker
 end-volume

 volume office-data-server
    type protocol/server
    option transport-type tcp
    option auth.addr./GFS/office-data02.allow *
    subvolumes /GFS/office-data02
 end-volume


 --
 client config /etc/glusterd/vols/office-data/office-data-fuse.vol:
 volume office-data-client-0
    type protocol/client
    option remote-host gfs-01-01
    option remote-subvolume /GFS/office-data02
    option transport-type tcp
 end-volume

 volume office-data-replicate-0
    type cluster/replicate
    subvolumes office-data-client-0
 end-volume

 volume office-data-write-behind
    type performance/write-behind
    subvolumes 

Re: [Gluster-users] uninterruptible processes writing to glusterfsshare

2011-06-08 Thread bxma...@gmail.com
How do I disable the io-cache routine? I will try it and report back :)
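
For reference, one way to do what Justice describes - a sketch based on the
office-data-fuse.vol client config quoted in these threads: edit the client-side
volfile so the translator chain skips io-cache, then remount. Note that glusterd
may regenerate the file, and the names below are from that example:

volume office-data-quick-read
   type performance/quick-read
   subvolumes office-data-read-ahead    # was office-data-io-cache
end-volume

and delete the office-data-io-cache volume block entirely.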

thanks

Matus

2011/6/8 Mohit Anchlia mohitanch...@gmail.com:
 On Wed, Jun 8, 2011 at 12:29 PM, Justice London jlon...@lawinfo.com wrote:
 Hopefully this will help some people... try disabling the io-cache routine
 in the fuse configurations for your share. Let me know if you need
 instruction on doing this. It solved all of the lockup issues I was
 experiencing. I believe there is some sort of as-yet-undetermined memory
 leak here.

 Was there a bug filed? If you think this is a bug it will help others as well.

 Justice London

 -Original Message-
 From: gluster-users-boun...@gluster.org
 [mailto:gluster-users-boun...@gluster.org] On Behalf Of bxma...@gmail.com
 Sent: Wednesday, June 08, 2011 12:22 PM
 To: gluster-users@gluster.org
 Subject: Re: [Gluster-users] uninterruptible processes writing
 to glusterfsshare

 I've been using kernel 2.6.34 + fuse 2.5.5 + gluster 3.2 from the beginning,
 and it happened again today ...
 php-fpm froze and a reboot was the only solution.

 Matus


 2011/6/7 Markus Fröhlich markus.froehl...@xidras.com:
 hi!

 there is no relevant output from dmesg.
 no entries in the server log - only the one line in the client-server log,
 I
 already posted.

 the glusterfs version on the server had been updated to gfs 3.2.0 more
 than
 a month ago.
 because of the troubles on the backup server, I deleted the whole backup
 share and started from scratch.


 I looked for an update of fuse and upgraded from 2.7.2-61.18.1 to
 2.8.5-41.1
 maybe this helps.

 here is the changelog info:

 Authors:
 
    Miklos Szeredi mik...@szeredi.hu
 Distribution: systemsmanagement:baracus / SLE_11_SP1
 * Tue Mar 29 2011 db...@novell.com
 - remove the --no-canonicalize usage for suse_version = 11.3

 * Mon Mar 21 2011 co...@novell.com
 - licenses package is about to die

 * Thu Feb 17 2011 mszer...@suse.cz
 - In case of failure to add to /etc/mtab don't umount. [bnc#668820]
  [CVE-2011-0541]

 * Tue Nov 16 2010 mszer...@suse.cz
 - Fix symlink attack for mount and umount [bnc#651598]

 * Wed Oct 27 2010 mszer...@suse.cz
 - Remove /etc/init.d/boot.fuse [bnc#648843]

 * Tue Sep 28 2010 mszer...@suse.cz
 - update to 2.8.5
  * fix option escaping for fusermount [bnc#641480]

 * Wed Apr 28 2010 mszer...@suse.cz
 - keep examples and internal docs in devel package (from jnweiger)

 * Mon Apr 26 2010 mszer...@suse.cz
 - update to 2.8.4
  * fix checking for symlinks in umount from /tmp
  * fix umounting if /tmp is a symlink


 kind regards
 markus froehlich

 On 06.06.2011 at 21:19, Anthony J. Biacco wrote:

 Could be fuse, check 'dmesg' for kernel module timeouts.

 In a similar vein, has anyone seen significant performance/reliability
 differences with different fuse versions? Say, latest source vs. RHEL distro RPM versions.

 -Tony



 -Original Message-
 From: Mohit Anchlia mohitanch...@gmail.com
 Sent: June 06, 2011 1:14 PM
 To: Markus Fröhlich markus.froehl...@xidras.com
 Cc: gluster-users@gluster.org gluster-users@gluster.org
 Subject: Re: [Gluster-users] uninterruptible processes writing to
 glusterfsshare

 Is there anything in the server logs? Does it follow any particular
 pattern before going in this mode?

 Did you upgrade Gluster or is this new install?

 2011/6/6 Markus Fröhlich markus.froehl...@xidras.com:

 hi!

 sometimes we've on some client-servers hanging uninterruptible processes
 (ps aux stat is on D ) and on one the CPU wait I/O grows within some
 minutes to 100%.
 you are not able to kill such processes - also kill -9 doesnt work -
 when
 you connect via strace to such an process, you wont see anything and
 you
 cannot detach it again.

 there are only two possibilities:
 killing the glusterfs process (umount GFS share) or rebooting the
 server.

 the only log entry I found, was on one client - just a single line:
 [2011-06-06 10:44:18.593211] I
 [afr-common.c:581:afr_lookup_collect_xattr]
 0-office-data-replicate-0: data self-heal is pending for


 /pc-partnerbet-public/Promotionaktionen/Mailakquise_2009/Webmaster_2010/HTML
 /bilder/Thumbs.db.

 one of the client-servers is a samba-server, the other one a
 backup-server
 based on rsync with millions of small files.

 gfs-servers + gfs-clients: SLES11 x86_64, glusterfs V 3.2.0

 and here are the configs from server and client:
 server config


 /etc/glusterd/vols/office-data/office-data.gfs-01-01.GFS-office-data02.vol:
 volume office-data-posix
    type storage/posix
    option directory /GFS/office-data02
 end-volume

 volume office-data-access-control
    type features/access-control
    subvolumes office-data-posix
 end-volume

 volume office-data-locks
    type features/locks
    subvolumes office-data-access-control
 end-volume

 volume office-data-io-threads
    type performance/io-threads
    subvolumes office-data-locks
 end-volume

 volume office-data-marker
    type features/marker
    option volume-uuid 3c6e633d-a0bb-4c52-8f05-a2db9bc9c659
    option timestamp-file /etc/glusterd

Re: [Gluster-users] uninterruptible processes writing to glusterfsshare

2011-06-07 Thread bxma...@gmail.com
I had a similar problem: php-fpm sometimes (once every 1-2 weeks) hangs, the
process waits on some I/O, gluster itself keeps working, and a server reboot is
the only solution. Nothing in the logs, nothing in dmesg. Gluster version 3.2,
kernel 2.6.34, running under Xen, distribution Gentoo. It started after the
gluster installation; I never had this problem with the previous OpenAFS setup
(which had many other problems :))) ).

Matus

2011/6/6 Anthony J. Biacco abia...@formatdynamics.com:
 Could be fuse, check 'dmesg' for kernel module timeouts.

 In a similar vein, has anyone seen significant performance/reliability
 differences with different fuse versions? Say, latest source vs. RHEL distro RPM versions.

 -Tony



 -Original Message-
 From: Mohit Anchlia mohitanch...@gmail.com
 Sent: June 06, 2011 1:14 PM
 To: Markus Fröhlich markus.froehl...@xidras.com
 Cc: gluster-users@gluster.org gluster-users@gluster.org
 Subject: Re: [Gluster-users] uninterruptible processes writing to 
 glusterfsshare

 Is there anything in the server logs? Does it follow any particular
 pattern before going in this mode?

 Did you upgrade Gluster or is this new install?

 2011/6/6 Markus Fröhlich markus.froehl...@xidras.com:
 hi!

 sometimes we have hanging, uninterruptible processes on some client servers
 (ps aux state is D), and on one of them the CPU I/O wait grows to 100%
 within a few minutes.
 you cannot kill such processes - kill -9 doesn't work either - and when you
 attach strace to such a process, you won't see anything and you cannot
 detach it again.

 there are only two possibilities:
 killing the glusterfs process (umounting the GFS share) or rebooting the server.

 the only log entry I found, was on one client - just a single line:
 [2011-06-06 10:44:18.593211] I [afr-common.c:581:afr_lookup_collect_xattr]
 0-office-data-replicate-0: data self-heal is pending for
 /pc-partnerbet-public/Promotionaktionen/Mailakquise_2009/Webmaster_2010/HTML/bilder/Thumbs.db.

 one of the client-servers is a samba-server, the other one a backup-server
 based on rsync with millions of small files.

 gfs-servers + gfs-clients: SLES11 x86_64, glusterfs V 3.2.0

 and here are the configs from server and client:
 server config
 /etc/glusterd/vols/office-data/office-data.gfs-01-01.GFS-office-data02.vol:
 volume office-data-posix
    type storage/posix
    option directory /GFS/office-data02
 end-volume

 volume office-data-access-control
    type features/access-control
    subvolumes office-data-posix
 end-volume

 volume office-data-locks
    type features/locks
    subvolumes office-data-access-control
 end-volume

 volume office-data-io-threads
    type performance/io-threads
    subvolumes office-data-locks
 end-volume

 volume office-data-marker
    type features/marker
    option volume-uuid 3c6e633d-a0bb-4c52-8f05-a2db9bc9c659
    option timestamp-file /etc/glusterd/vols/office-data/marker.tstamp
    option xtime off
    option quota off
    subvolumes office-data-io-threads
 end-volume

 volume /GFS/office-data02
    type debug/io-stats
    option latency-measurement off
    option count-fop-hits off
    subvolumes office-data-marker
 end-volume

 volume office-data-server
    type protocol/server
    option transport-type tcp
    option auth.addr./GFS/office-data02.allow *
    subvolumes /GFS/office-data02
 end-volume


 --
 client config /etc/glusterd/vols/office-data/office-data-fuse.vol:
 volume office-data-client-0
    type protocol/client
    option remote-host gfs-01-01
    option remote-subvolume /GFS/office-data02
    option transport-type tcp
 end-volume

 volume office-data-replicate-0
    type cluster/replicate
    subvolumes office-data-client-0
 end-volume

 volume office-data-write-behind
    type performance/write-behind
    subvolumes office-data-replicate-0
 end-volume

 volume office-data-read-ahead
    type performance/read-ahead
    subvolumes office-data-write-behind
 end-volume

 volume office-data-io-cache
    type performance/io-cache
    subvolumes office-data-read-ahead
 end-volume

 volume office-data-quick-read
    type performance/quick-read
    subvolumes office-data-io-cache
 end-volume

 volume office-data-stat-prefetch
    type performance/stat-prefetch
    subvolumes office-data-quick-read
 end-volume

 volume office-data
    type debug/io-stats
    option latency-measurement off
    option count-fop-hits off
    subvolumes office-data-stat-prefetch
 end-volume


  -- Kind regards

 Markus Fröhlich
 Technician

 

 Xidras GmbH
 Stockern 47
 3744 Stockern
 Austria

 Tel:     +43 (0) 2983 201 30503
 Fax:     +43 (0) 2983 201 305039
 Email:   markus.froehl...@xidras.com
 Web:    http://www.xidras.com

 FN 317036 f | Landesgericht Krems | ATU64485024

 


[Gluster-users] Gluster 3.2.0 and ucarp not working

2011-06-06 Thread bxma...@gmail.com
Hello everybody.

I have a problem setting up gluster failover functionality. Based on the
manual I set up ucarp, which is working well (tested with ping/ssh etc.).

But when I use the virtual address for the gluster volume mount and turn off
one of the nodes, the machine/gluster freezes until the node is back online.

My virtual IP is 3.200 and the machines' real IPs are 3.233 and 3.5. In the
gluster log I can see:

[2011-06-06 02:33:54.230082] I
[client-handshake.c:913:client_setvolume_cbk] 0-atlas-client-1:
Connected to 192.168.3.233:24009, attached to remote volume '/atlas'.
[2011-06-06 02:33:54.230116] I [afr-common.c:2514:afr_notify]
0-atlas-replicate-0: Subvolume 'atlas-client-1' came back up; going
online.
[2011-06-06 02:33:54.237541] I [fuse-bridge.c:3316:fuse_graph_setup]
0-fuse: switched to graph 0
[2011-06-06 02:33:54.237801] I [fuse-bridge.c:2897:fuse_init]
0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13
kernel 7.13
[2011-06-06 02:33:54.238757] I [afr-common.c:836:afr_fresh_lookup_cbk]
0-atlas-replicate-0: added root inode
[2011-06-06 02:33:54.272650] I
[client-handshake.c:913:client_setvolume_cbk] 0-atlas-client-0:
Connected to 192.168.3.5:24009, attached to remote volume '/atlas'.


Even though the IP I use for the mount is 3.200 ... It looks like in the end
gluster uses the real machine IPs even when I connect to the virtual one.
Is there a way to turn this behaviour off, or is it just broken?

thanks for answer

Matus


Re: [Gluster-users] Gluster 3.2.0 and ucarp not working

2011-06-06 Thread bxma...@gmail.com
Hi Craig,

thanks for the answer. I'm using replication, which works fine, and the
volume is mounted with the -t glusterfs parameter - does that matter?
All nodes use their real IPs, not the virtual one (for probing etc.); I use
the virtual IP only for mounting the volume on the client. I waited over
10 minutes for the volume to wake up, but it never started to work - it
never switched to the other node, even though UCARP was already pointing
there; there were a lot of recovery messages in the log but no attempt to
connect to the second node.

thanks

Matus


2011/6/6 Craig Carl cc...@gluster.com:
 Matus -

 Gluster has automatic, built-in failover if you are using replica nodes.
 ucarp is only required if you want highly available NFS mounts.

   To use ucarp with Gluster you should -
        1. Install Gluster and create a replica volume. [1]
            1. DO NOT use the virtual IPs when you peer probe or create
 the volume, that won't work.
            2. Set the ping-timeout volume option to 25 seconds. [2]
        2. Install and setup ucarp.
        3. Mount your NFS clients using the VIPs.
        4. Mount your glusterfs clients using the real IP addresses.

 We mostly use CTDB because it supports NFS and Samba, I can't attach a
 document here but I'll email you directly with the documentation.

 [1]
 http://gluster.com/community/documentation/index.php/Gluster_3.2:_Configuring_Distributed_Replicated_Volumes
 [2]
 http://gluster.com/community/documentation/index.php/Gluster_3.2:_Setting_Volume_Options

 Thanks,

 Craig

 --
 Craig Carl, Senior Systems Engineer | Gluster
 408.829.9953(PST) | http://gluster.com
 http://www.gluster.com/gluster-for-aws/
 http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE5666F925A557DD8


 On 6/6/11 12:09 AM, bxma...@gmail.com wrote:
 Hello everybody.

 I have a problem setting up gluster failover funcionality. Based on
 manual i setup ucarp which is working well ( tested with ping/ssh etc
 )

 But when i use virtual address for gluster volume mount and i turn off
 one of nodes machine/gluster will freeze until node is back online.

 My virtual ip is 3.200 and machine real ip is 3.233 and 3.5. In
 gluster log i can see:

 [2011-06-06 02:33:54.230082] I
 [client-handshake.c:913:client_setvolume_cbk] 0-atlas-client-1:
 Connected to 192.168.3.233:24009, attached to re
 mote volume '/atlas'.
 [2011-06-06 02:33:54.230116] I [afr-common.c:2514:afr_notify]
 0-atlas-replicate-0: Subvolume 'atlas-client-1' came back up; going
 online.
 [2011-06-06 02:33:54.237541] I [fuse-bridge.c:3316:fuse_graph_setup]
 0-fuse: switched to graph 0
 [2011-06-06 02:33:54.237801] I [fuse-bridge.c:2897:fuse_init]
 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13
 kernel 7.13
 [2011-06-06 02:33:54.238757] I [afr-common.c:836:afr_fresh_lookup_cbk]
 0-atlas-replicate-0: added root inode
 [2011-06-06 02:33:54.272650] I
 [client-handshake.c:913:client_setvolume_cbk] 0-atlas-client-0:
 Connected to 192.168.3.5:24009, attached to remo
 te volume '/atlas'.


 Even when IP i'm using at mount is 3.200 ... Its look like that at the
 end gluster is using real machine IP's even when i'm connecting to
 virtual. Is there a way
 how to turn this functionality off or it is just broken ?

 thanks for answer

 Matus


Re: [Gluster-users] Gluster 3.2.0 and ucarp not working

2011-06-06 Thread bxma...@gmail.com
Great, it is working now ... the strange thing is that before I set the
network.ping-timeout value it never switched to the other node (I waited
5 minutes and nothing), and now after 25 seconds it is up and working -
maybe there is no default value? I don't know :) ... but it is working
for me now, thanks a lot
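
For reference, the commands involved - assuming the volume is called atlas as
in the earlier mails; the default network.ping-timeout is 42 seconds, as Craig
notes below:

gluster volume set atlas network.ping-timeout 25
gluster volume info atlas     # the change shows up under 'Options Reconfigured'

so a dead node should be declared down after 25 seconds instead of 42.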

Matus


2011/6/6 Craig Carl cc...@gluster.com:
 Exactly! The default ping-timeout is 42 seconds.

 Craig

 On 6/6/11 1:50 AM, bxma...@gmail.com wrote:
 I see, I'm starting to understand that ... so in theory it should work
 fine with the normal IP, and after ping-timeout seconds it should switch
 to another node if one is dead - am I right?

 2011/6/6 Craig Carl cc...@gluster.com:
 Matus -
   If you are using the Gluster native client (mount -t glusterfs ...)
 then ucarp/CTDB is NOT required and you should not install it. Always
 use the real IPs when you are mounting with 'mount -t glusterfs...'.

 Craig


 On 6/6/11 1:16 AM, bxma...@gmail.com wrote:
 Hi Craig,

 thanks for answer. I'm using replications - which is working ok,
 volume is mounted using -t glusterfs parameter, does it matter ?
 All nodes are using real IP's not virtual ( for probing etc ) , i'm
 using virtual only for mounting volume on client. I was waiting over
 10 minutes for volume
 to wake up but it never start to work - it never switch to another
 node, even when UCARP was already pointing there, there was lot
 of recovery messages on log but no attemt to connect to second node.

 thanks

 Matus


 2011/6/6 Craig Carl cc...@gluster.com:
 Matus -

 Gluster has automatic, built-in failover if you are using replica nodes.
 ucarp is only required if you want highly available NFS mounts.

   To use ucarp with Gluster you should -
        1. Install Gluster and create a replica volume. [1]
            1. DO NOT use the virtual IPs when you peer probe or create
 the volume, that won't work.
            2. Set the ping-timeout volume option to 25 seconds. [2]
        2. Install and setup ucarp.
        3. Mount your NFS clients using the VIPs.
        4. Mount your glusterfs clients using the real IP addresses.

 We mostly use CTDB because it supports NFS and Samba, I can't attach a
 document here but I'll email you directly with the documentation.

 [1]
 http://gluster.com/community/documentation/index.php/Gluster_3.2:_Configuring_Distributed_Replicated_Volumes
 [2]
 http://gluster.com/community/documentation/index.php/Gluster_3.2:_Setting_Volume_Options

 Thanks,

 Craig

 --
 Craig Carl, Senior Systems Engineer | Gluster
 408.829.9953(PST) | http://gluster.com
 http://www.gluster.com/gluster-for-aws/
 http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE5666F925A557DD8


 On 6/6/11 12:09 AM, bxma...@gmail.com wrote:
 Hello everybody.

 I have a problem setting up gluster failover funcionality. Based on
 manual i setup ucarp which is working well ( tested with ping/ssh etc
 )

 But when i use virtual address for gluster volume mount and i turn off
 one of nodes machine/gluster will freeze until node is back online.

 My virtual ip is 3.200 and machine real ip is 3.233 and 3.5. In
 gluster log i can see:

 [2011-06-06 02:33:54.230082] I
 [client-handshake.c:913:client_setvolume_cbk] 0-atlas-client-1:
 Connected to 192.168.3.233:24009, attached to re
 mote volume '/atlas'.
 [2011-06-06 02:33:54.230116] I [afr-common.c:2514:afr_notify]
 0-atlas-replicate-0: Subvolume 'atlas-client-1' came back up; going
 online.
 [2011-06-06 02:33:54.237541] I [fuse-bridge.c:3316:fuse_graph_setup]
 0-fuse: switched to graph 0
 [2011-06-06 02:33:54.237801] I [fuse-bridge.c:2897:fuse_init]
 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13
 kernel 7.13
 [2011-06-06 02:33:54.238757] I [afr-common.c:836:afr_fresh_lookup_cbk]
 0-atlas-replicate-0: added root inode
 [2011-06-06 02:33:54.272650] I
 [client-handshake.c:913:client_setvolume_cbk] 0-atlas-client-0:
 Connected to 192.168.3.5:24009, attached to remo
 te volume '/atlas'.


 Even when IP i'm using at mount is 3.200 ... Its look like that at the
 end gluster is using real machine IP's even when i'm connecting to
 virtual. Is there a way
 how to turn this functionality off or it is just broken ?

 thanks for answer

 Matus