Re: [Gluster-users] Optimizing Gluster (gfapi) for high IOPS
On Sun, Mar 23, 2014 at 6:10 AM, Josh Boon <glus...@joshboon.com> wrote:
> Thanks for those options. My machines tend to be self-healing rather
> frequently. When I do a "gluster volume heal VMARRAY info", the file list
> cycles through most of my high-IOPS machines. Also, what's the best way to
> apply those options without bricking the running VMs? I just made a rough
> stab and took the cluster down.

The CPU problem sounds a lot like what I ran into with my oVirt on Gluster deployment (same boxes). What I did to solve that was use cgroups to limit the CPU usage glusterd and glusterfsd are allowed to use. [1]

I'm not completely sure if libgfapi uses the glusterd process to access the storage - could someone else comment? However, I know that by limiting glusterfsd we can slow down the replication process by limiting the CPU it sees, and thus avoid bringing the entire system to a halt.

[1] http://www.andrewklau.com/controlling-glusterfsd-cpu-outbreaks-with-cgroups/

- Original Message -
From: Vijay Bellur <vbel...@redhat.com>
To: Josh Boon <glus...@joshboon.com>, Nick Majeran <nmaje...@gmail.com>
Cc: "Gluster-users@gluster.org List" <gluster-users@gluster.org>
Sent: Saturday, March 22, 2014 1:36:09 PM
Subject: Re: [Gluster-users] Optimizing Gluster (gfapi) for high IOPS

On 03/21/2014 09:50 PM, Josh Boon wrote:
> Hardware RAID 5 on SSDs using LVM, formatted with XFS default options and
> mounted with noatime. Also, I don't have a lot of history for this current
> troubled machine, but the sysctl additions don't appear to have made a
> significant difference.

Performance tunables in [1] are normally recommended for qemu - libgfapi. The last two options are related to quorum and the remaining tunables are related to performance. It might be worth a check to see if these options help provide better performance.

Do you happen to know if self-healing was in progress when the machines stall?

-Vijay

[1] https://github.com/gluster/glusterfs/blob/master/extras/group-virt.example

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
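For reference, a rough sketch of the cgroup approach mentioned above, assuming the RHEL 6-style libcgroup tools (cgcreate/cgset/cgclassify) and treating the cpu.shares value as purely illustrative; the linked post has the full, tested details:

# create a cpu cgroup for the brick processes and cap their share of CPU time
cgcreate -g cpu:/glusterfsd
cgset -r cpu.shares=256 glusterfsd        # 256 out of the default 1024; adjust to taste
# move the currently running glusterfsd (brick) processes into that group
for pid in $(pgrep glusterfsd); do cgclassify -g cpu:/glusterfsd $pid; done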
[Gluster-users] cgroup config for glusterfsd in gluster 3.5?
Hi,

I had a recent comment from Giuseppe on my post about gluster cgroups [1], and I also heard some interesting things at a recent meetup about recent progress with systemd and gluster.

Comment: "I've noted that the latest GlusterFS 3.5.0 nightly packages do not include (nor use) the /etc/sysconfig/glusterfsd file anymore. Should we deduce that the glusterd hierarchy/settings now control both?"

I haven't had the time to look into 3.5, so does the glusterd process now control glusterfsd as well? Or would this cgroup method no longer work?

[1] http://www.andrewklau.com/controlling-glusterfsd-cpu-outbreaks-with-cgroups/

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
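If the brick processes are indeed spawned by glusterd (worth confirming with systemd-cgls), one fallback on a systemd-based distro would be to put the CPU limit on glusterd.service itself, since child processes stay in the service's cgroup. A hedged sketch, with the CPUShares value purely illustrative:

# drop-in override for the glusterd unit (assumes systemd manages glusterd)
mkdir -p /etc/systemd/system/glusterd.service.d
cat > /etc/systemd/system/glusterd.service.d/cpu.conf <<'EOF'
[Service]
CPUShares=256
EOF
systemctl daemon-reload
systemctl restart glusterd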
[Gluster-users] Capturing config in a file
Hi,

Can anyone tell me how I can capture the gluster brick config in a file? We're running RHN Satellite and I'd like to be able to push a config file out to any new brick servers, and also store it for existing servers. I'm running 3.4.2 and wondered where and which files would be necessary to capture all of gluster's configuration?

Thanks,
Steve

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
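For what it's worth, on a 3.4 system the daemon's working state (volume definitions, peer info) lives under /var/lib/glusterd and the service's own config under /etc/glusterfs, so a minimal capture sketch could be the following. Note this is an assumption-laden example: /var/lib/glusterd contains node-specific UUIDs, so it is better suited to per-server backup than to pushing onto new servers.

tar czf gluster-config-$(hostname -s).tar.gz /etc/glusterfs /var/lib/glusterd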
[Gluster-users] Fwd: Capturing config in a file
Adding the list, in order to have other people's opinions...

Today, the Monday after the daylight saving time change, I have only half a neuron on duty, so reply-to-all is a task next to impossible.

-- Forwarded message --
From: Carlos Capriotti <capriotti.car...@gmail.com>
Date: Mon, Mar 31, 2014 at 2:50 PM
Subject: Re: [Gluster-users] Capturing config in a file
To: Steve Thomas <stho...@rpstechnologysolutions.co.uk>

Steve:

Capturing the config in a file CAN be done, BUT using that config is another story.

What I did when I needed it:

gluster volume info yourvolumenamehere > glusterconf.txt

Next, edit that file, trimming whatever unnecessary info it has, until it looks like this:

nfs.trusted-sync on
nfs.addr-namelookup off
nfs.nlm off
network.ping-timeout 20
performance.quick-read off
performance.read-ahead off
performance.io-cache off
performance.stat-prefetch off
cluster.eager-lock enable
network.remote-dio on
performance.cache-max-file-size 2MB
performance.cache-refresh-timeout 4
performance.cache-size 1GB
performance.write-behind-window-size 4MB
performance.io-thread-count 32

And then use this very simple script, with your glusterconf.txt file as a parameter, to duplicate your settings on your volume:

#!/bin/bash
while read line; do
echo $line
gluster volume set vmdata $line
done < $1

Of course there is a lot of room for improvement, but that gets the job done.

Cheers,

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
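If that script is saved as, say, apply-settings.sh (the filename is only illustrative; also note the volume name vmdata is hard-coded inside the script, so edit it for your own volume), it would be used like this:

chmod +x apply-settings.sh
./apply-settings.sh glusterconf.txt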
[Gluster-users] nfs acces denied
Hello all,

I've already tried to post this, but I'm unsure it arrived at the mailing list. I have some issues regarding my nfs mounts. My setup is as follows: RHEL 6.4, gluster 3.4.2-1 running on a vm (4 cores, 8GB ram) attached to a san. I have one 25TB disk on which all the bricks are (formatted ext4 in 64-bit mode).

On the gluster side of things, everything works without issues. The trouble starts when I mount a volume as an nfs mount. Lots of volumes work without issues, but others behave strangely. The volumes that act weird generally contain many files (which may be coincidental?). The volumes in question mount without issues, but when I try to go into any subdirectory, sometimes it works and sometimes I get errors.

On windows with the nfs client: access denied

In the nfs log:

[2014-03-31 13:57:58.771241] I [dht-layout.c:638:dht_layout_normalize] 0-caviar_data11-dht: found anomalies in gfid:c8d94120-6851-46ea-9f28-c629a44b1015. holes=1 overlaps=0
[2014-03-31 13:57:58.771348] E [nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup failed: gfid:c8d94120-6851-46ea-9f28-c629a44b1015: Invalid argument
[2014-03-31 13:57:58.771380] E [nfs3.c:1380:nfs3_lookup_resume] 0-nfs-nfsv3: Unable to resolve FH: (192.168.148.46:984) caviar_data11 : c8d94120-6851-46ea-9f28-c629a44b1015
[2014-03-31 13:57:58.771819] W [nfs3-helpers.c:3380:nfs3_log_common_res] 0-nfs-nfsv3: XID: 1ec28530, LOOKUP: NFS: 22(Invalid argument for operation), POSIX: 14(Bad address)
[2014-03-31 13:57:58.798967] I [dht-layout.c:638:dht_layout_normalize] 0-caviar_data11-dht: found anomalies in gfid:14972193-1039-4d7a-aed5-0d7e7eccf57b. holes=1 overlaps=0
[2014-03-31 13:57:58.799039] E [nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup failed: gfid:14972193-1039-4d7a-aed5-0d7e7eccf57b: Invalid argument
[2014-03-31 13:57:58.799056] E [nfs3.c:1380:nfs3_lookup_resume] 0-nfs-nfsv3: Unable to resolve FH: (192.168.148.46:984) caviar_data11 : 14972193-1039-4d7a-aed5-0d7e7eccf57b
[2014-03-31 13:57:58.799088] W [nfs3-helpers.c:3380:nfs3_log_common_res] 0-nfs-nfsv3: XID: 1ec28531, LOOKUP: NFS: 22(Invalid argument for operation), POSIX: 14(Bad address)

On linux:

[root@lpr-nas01 brick-xiv2]# ll /media/2011/201105/20110530/
ls: /media/2011/201105/20110530/37: No such file or directory
total 332
...
drwxrwsr-x 2 nfsnobody 1003 4096 Jun 6 2011 32
drwxrwsr-x 2 nfsnobody 1003 4096 Jun 6 2011 34
drwxrwsr-x 2 nfsnobody 1003 4096 Jun 6 2011 35
drwxrwsr-x 2 nfsnobody 1003 4096 Jun 6 2011 36
drwxrwsr-x 2 nfsnobody 1003 4096 Jun 6 2011 37
...

[root@lpr-nas01 brick-xiv2]# ll /media/2011/201105/20110530/37
ls: /media/2011/201105/20110530/37/NN.073824357.1.tif: No such file or directory
ls: /media/2011/201105/20110530/37/NN.073824357.3.tif: No such file or directory
total 54
-rwxrwxr-x 0 nfsnobody 1003 9340 Jun 6 2011 NN.073824357.1.tif
-rwxrwxr-x 1 nfsnobody 1003 35312 Jun 6 2011 NN.073824357.2.tif
-rwxrwxr-x 0 nfsnobody 1003 9340 Jun 6 2011 NN.073824357.3.tif

I see in the nfs log:

...
[2014-03-31 12:44:18.941083] I [dht-layout.c:638:dht_layout_normalize] 0-caviar_data11-dht: found anomalies in /2011/201107/20110716/55. holes=1 overlaps=0
[2014-03-31 12:44:18.958078] I [dht-layout.c:638:dht_layout_normalize] 0-caviar_data11-dht: found anomalies in /2011/201107/20110716/30. holes=1 overlaps=0
[2014-03-31 12:44:18.959980] I [dht-layout.c:638:dht_layout_normalize] 0-caviar_data11-dht: found anomalies in /2011/201107/20110716/90. holes=1 overlaps=0
[2014-03-31 12:44:18.961094] E [dht-helper.c:429:dht_subvol_get_hashed] (-->/usr/lib64/glusterfs/3.4.2/xlator/debug/io-stats.so(io_stats_lookup+0x157) [0x7fd6a61282e7] (-->/usr/lib64/libglusterfs.so.0(default_lookup+0x6d) [0x3dfe01c03d] (-->/usr/lib64/glusterfs/3.4.2/xlator/cluster/distribute.so(dht_lookup+0xa7e) [0x7fd6a656af2e]))) 0-caviar_data11-dht: invalid argument: loc->parent
[2014-03-31 12:44:18.961283] W [client-rpc-fops.c:2624:client3_3_lookup_cbk] 0-caviar_data11-client-0: remote operation failed: Invalid argument. Path: gfid:---- (----)
[2014-03-31 12:44:18.961319] E [acl3.c:334:acl3_getacl_resume] 0-nfs-ACL: Unable to resolve FH: (192.168.151.21:740) caviar_data11 : ----
[2014-03-31 12:44:18.961338] E [acl3.c:342:acl3_getacl_resume] 0-nfs-ACL: unable to open_and_resume
...

The weirdest thing is that it changes from time to time which files and directories work and which don't.

Any ideas?

Thanks!

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
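As an aside, the "holes=1 overlaps=0" anomalies refer to the DHT layout xattr that each brick stores on its copy of a directory; it can be inspected directly on a brick (a hedged sketch: /bricks/brick1 is an assumed brick path, not one from the logs above):

getfattr -n trusted.glusterfs.dht -e hex /bricks/brick1/2011/201107/20110716/55

A directory that is missing that xattr on one brick typically shows up as a layout hole.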
Re: [Gluster-users] nfs acces denied
Well, saying your client side is linux does not help much. Distro, flavor, etc. help a lot, but I'll take a wild guess here.

First, force your NFS mount (client) to use nfs version 3. The same for Microsoft. (It is fair to say I have no idea if the MS client supports v4 or not.)

Additionally, check that firewalls are disabled on both sides, just for testing. The same goes for SElinux.

Windows ACLs and user mapping are something that might be in your way too. There is a Technet document that describes how to handle this mapping, if I am not wrong.

Just for testing, mount your nfs share on your own server, using localhost:/nfs_share, and see how it goes. It is a good start.

Kr,

Carlos

On Mon, Mar 31, 2014 at 3:58 PM, VAN CAUSBROECK Wannes <wannes.vancausbro...@onprvp.fgov.be> wrote:

[...]
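To make the localhost test suggested above concrete (hedged: the volume name is taken from the log lines, and Gluster's built-in NFS server speaks NFSv3 over TCP, hence the forced options):

mkdir -p /mnt/nfs-test
mount -t nfs -o vers=3,proto=tcp,nolock localhost:/caviar_data11 /mnt/nfs-test
ls -lR /mnt/nfs-test/2011/201105/20110530 | head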
Re: [Gluster-users] nfs acces denied
Well, with 'client' I do actually mean the server itself. I've tried forcing linux and windows to nfs V3 and tcp, and on windows I played around with the uid and gid, but the result is always the same.

On 31 Mar 2014, at 17:22, Carlos Capriotti <capriotti.car...@gmail.com> wrote:

[...]
Re: [Gluster-users] nfs acces denied
Maybe it would be nice to see your volume info for the affected volumes.

Also, on the server side, what happens if you mount the share using glusterfs instead of nfs? Any chance the native nfs server is running on your server? Are there any auto-heal processes running?

There are a few name resolution messages in your logs that seem to refer to the nodes themselves. Any DNS conflicts? Maybe add the names of the servers to the hosts file?

Your MS client seems to be having issues with user/group translation. It seems to create files with gid 1003. (I could be wrong.) Again, is SElinux/ACLs/iptables disabled?

All is very inconclusive so far.

On Mon, Mar 31, 2014 at 5:26 PM, VAN CAUSBROECK Wannes <wannes.vancausbro...@onprvp.fgov.be> wrote:

[...]
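Two of the checks suggested above, spelled out as commands (assumptions: the volume name from the logs, a RHEL 6 host, and an unused /mnt/gluster-test mount point):

mkdir -p /mnt/gluster-test
mount -t glusterfs localhost:/caviar_data11 /mnt/gluster-test   # native client instead of NFS
rpcinfo -p | grep -w nfs                                        # is another NFS server already registered with rpcbind?
service nfs status                                              # the kernel NFS server conflicts with the Gluster NFS server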
[Gluster-users] [Gluster-user] Geo-Replication: (xtime) failed on peer with OSError On Debian 7.2
Need help setting up Geo-Replication.

OS: Debian 7.2
GlusterFS: 3.4.2-2

I keep getting the following message. I checked and it seems this was a bug, but it has been fixed. What am I missing? Any way to circumvent this issue, or a short-term solution?

Thanks in advance.
Cary

---

[2014-03-31 17:45:38.86431] I [monitor(monitor):81:monitor] Monitor: starting gsyncd worker
[2014-03-31 17:45:38.163496] I [gsyncd:404:main_i] <top>: syncing: gluster://localhost:mirror -> ssh://gluster@54.19.181.16:/data/mirror
[2014-03-31 17:45:41.109936] I [master:60:gmaster_builder] <top>: setting up master for normal sync mode
[2014-03-31 17:45:42.164649] I [master:679:crawl] _GMaster: new master is 5ccdcdb3-77b9-4ec2-92ad-7368d8e24b39
[2014-03-31 17:45:42.165154] I [master:683:crawl] _GMaster: primary master with volume id 5ccdcdb3-77b9-4ec2-92ad-7368d8e24b39 ...
[2014-03-31 17:45:42.297504] E [repce:188:__call__] RepceClient: call 13218:140336006149888:1396287942.17 (xtime) failed on peer with OSError
[2014-03-31 17:45:42.297789] E [syncdutils:190:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/gsyncd.py", line 120, in main
    main_i()
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/gsyncd.py", line 415, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/resource.py", line 874, in service_loop
    gmaster_builder()(self, args[0]).crawl_loop()
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/master.py", line 540, in crawl_loop
    self.crawl()
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/master.py", line 704, in crawl
    xtr = self.xtime(path, self.slave)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/master.py", line 376, in xtime
    return self.xtime_low(rsc.server, path, **opts)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/master.py", line 110, in xtime_low
    xt = server.xtime(path, self.uuid)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/repce.py", line 204, in __call__
    return self.ins(self.meth, *a)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/repce.py", line 189, in __call__
    raise res
OSError: [Errno 95] Operation not supported
[2014-03-31 17:45:42.299632] I [syncdutils:148:finalize] <top>: exiting.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
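Errno 95 (Operation not supported) from the xtime call usually points at extended-attribute support on the slave side, since gsyncd keeps its xtime marker as an xattr. A quick, hedged check (assuming the ssh slave path from the log above; the user.test attribute name is just for the test):

ssh gluster@54.19.181.16 'touch /data/mirror/.xattrtest && setfattr -n user.test -v 1 /data/mirror/.xattrtest && getfattr -n user.test /data/mirror/.xattrtest; rm -f /data/mirror/.xattrtest'

If setfattr itself fails with "Operation not supported", the slave filesystem or its mount options do not support xattrs, which would be consistent with the traceback.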
[Gluster-users] 42 node gluster volume create fails silently
Not much output - not sure where to look. This is the output in the cli.log. There are 42 servers (21 brick pairs) - timeout perhaps ??

[2014-03-31 13:44:34.228467] I [cli-cmd-volume.c:1336:cli_check_gsync_present] 0-: geo-replication not installed
[2014-03-31 13:44:34.229619] I [cli-cmd-volume.c:392:cli_cmd_volume_create_cbk] 0-cli: Replicate cluster type found. Checking brick order.
[2014-03-31 13:44:34.230821] I [cli-cmd-volume.c:304:cli_cmd_check_brick_order] 0-cli: Brick order okay
[2014-03-31 13:44:47.758977] W [rpc-transport.c:175:rpc_transport_load] 0-rpc-transport: missing 'option transport-type'. defaulting to socket
[2014-03-31 13:44:47.763286] I [socket.c:3480:socket_init] 0-glusterfs: SSL support is NOT enabled
[2014-03-31 13:44:47.763326] I [socket.c:3495:socket_init] 0-glusterfs: using system polling thread
[2014-03-31 13:44:47.777000] I [cli-cmd-volume.c:1336:cli_check_gsync_present] 0-: geo-replication not installed
[2014-03-31 13:44:47.780574] I [cli-rpc-ops.c:332:gf_cli_list_friends_cbk] 0-cli: Received resp to list: 0
[2014-03-31 13:44:47.782086] I [input.c:36:cli_batch] 0-: Exiting with: 0
[2014-03-31 13:46:34.231761] I [input.c:36:cli_batch] 0-: Exiting with: 110

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] 42 node gluster volume create fails silently
Looks symptomatic of some timeout - a subsequent status command gave:

gluster volume status
Another transaction is in progress. Please try again after sometime.

From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Prasad, Nirmal
Sent: Monday, March 31, 2014 5:53 PM
To: gluster-users@gluster.org
Subject: [Gluster-users] 42 node gluster volume create fails silently

[...]

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] 42 node gluster volume create fails silently
.. and glusterd died. I had success adding nodes individually up to 21 - will go down that path. Anyone interested in log files or core files?

service glusterd status
glusterd dead but pid file exists

From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Prasad, Nirmal
Sent: Monday, March 31, 2014 5:57 PM
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] 42 node gluster volume create fails silently

[...]

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] 42 node gluster volume create fails silently
Hello,

The cli logs do not contain much. If you remount your gluster volume and create the problem again, there may be more to see.

On the client side:

mount -t glusterfs -o log-level=DEBUG,log-file=/tmp/my_client.log 10.16.159.219:/myvol /mnt

On the server side:

gluster volume set myvol diagnostics.brick-sys-log-level WARNING
gluster volume set myvol diagnostics.brick-log-level WARNING

You could then attach the most recent log files to your email, or the parts that seem relevant so the email is not too large:

/tmp/my_client.log
/var/log/glusterfs/etc*.log
/var/log/glusterfs/bricks/*.log

- Original Message -
From: Nirmal Prasad <npra...@idirect.net>
To: gluster-users@gluster.org
Sent: Monday, March 31, 2014 6:04:31 PM
Subject: Re: [Gluster-users] 42 node gluster volume create fails silently

[...]

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] 42 node gluster volume create fails silently
Hi Dan,

Thanks for the quick response. I'm trying to create the volume, so I have not reached this stage - the client exited out when I gave it:

gluster volume create vol-name replica 2 server1:.. server2:.. server41:.. server42:..

If I do:

gluster volume create vol-name replica 2 server1:.. server2:..
gluster volume add-brick vol-name replica 2 server3:.. server4:..

it gets me farther ... looks like there is some timeout for the gluster command - not sure - just an observation.

Thanks
Regards
Nirmal

-Original Message-
From: Dan Lambright [mailto:dlamb...@redhat.com]
Sent: Monday, March 31, 2014 6:16 PM
To: Prasad, Nirmal
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] 42 node gluster volume create fails silently

[...]

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
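For reference, a scripted form of that two-bricks-at-a-time workaround might look like the following (a hedged sketch: the /export/brick path and the contiguous serverN naming are assumptions standing in for the brick paths elided above):

gluster volume create vol-name replica 2 server1:/export/brick server2:/export/brick
for i in $(seq 3 2 41); do
  gluster volume add-brick vol-name replica 2 server$i:/export/brick server$((i+1)):/export/brick
done
gluster volume start vol-name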
Re: [Gluster-users] 42 node gluster volume create fails silently
Ok - for some reason it did not like 6 of my nodes, but I was able to add 34 nodes two at a time. Maybe the client can do a similar split internally based on the replica count. The failure from add-brick is simply:

volume add-brick: failed:

gluster volume info
Volume Name: gl_disk
Type: Distributed-Replicate
Volume ID: c70d525e-a255-41e2-af03-718d6dec0319
Status: Created
Number of Bricks: 17 x 2 = 34
Transport-type: tcp

-Original Message-
From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Prasad, Nirmal
Sent: Monday, March 31, 2014 6:20 PM
To: Dan Lambright
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] 42 node gluster volume create fails silently

[...]

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] 42 node gluster volume create fails silently
3.4.2 - tracing out the problem on the other nodes. I think they had something partially left over. The probe and addition process could definitely use some speed - the plumbing should be quick; the fun is in the data.

Cleared it out with:

setfattr -x trusted.glusterfs.volume-id <mount>
setfattr -x trusted.gfid <mount>

-Original Message-
From: Joe Julian [mailto:j...@julianfamily.org]
Sent: Monday, March 31, 2014 7:06 PM
To: Prasad, Nirmal
Subject: Re: [Gluster-users] 42 node gluster volume create fails silently

What version?

On 03/31/2014 03:59 PM, Prasad, Nirmal wrote:

[...]
[Gluster-users] gfid different on subvolume
I have seen some errors like "gfid different on subvolume" in my deployment, e.g.:

[2014-03-26 07:56:17.224262] W [afr-common.c:1196:afr_detect_self_heal_by_iatt] 0-sh_ugc4_mams-replicate-1: /operation_1/video/2014/03/26/24/19: gfid different on subvolume

My clients (3.3) have already backported the patches mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=907072:

CHANGE: http://review.gluster.org/4459 (cluster/dht: ignore EEXIST error in mkdir to avoid GFID mismatch) merged in master by Anand Avati
CHANGE: http://review.gluster.org/5849 (cluster/dht: assign layout onto missing directories too)

But I still see such errors. I thought these changes applied to the client side - am I right? Do I need to update my servers with the patched release, or am I missing something else?

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
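As a side note, one way to look at such a mismatch by hand is to compare the trusted.gfid xattr of the affected directory on each brick of the replica pair (a hedged sketch: /bricks/brick1 is an assumed brick path, not one taken from the log above):

getfattr -n trusted.gfid -e hex /bricks/brick1/operation_1/video/2014/03/26/24/19

Run the same command on each brick; differing values for the same path confirm the gfid mismatch the afr message is complaining about.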