Re: [Gluster-users] Optimizing Gluster (gfapi) for high IOPS

2014-03-31 Thread Andrew Lau
On Sun, Mar 23, 2014 at 6:10 AM, Josh Boon glus...@joshboon.com wrote:

 Thanks for those options. My machines tend to self-heal rather
 frequently. When I run gluster volume heal VMARRAY info, the file list cycles
 through most of my high-IOPS machines.

 Also, what's the best way to apply those options without bricking the
 running VMs?  I just made a rough stab and took the cluster down.


The CPU problem sounds a lot like what I ran into with my oVirt on
Gluster deployment (same boxes). What I did to solve that was use cgroups
to limit the CPU that glusterd and glusterfsd are allowed to use. [1]

I'm not completely sure whether libgfapi goes through the glusterd process to
access the storage - could someone else comment? However, I do know that by
capping the CPU glusterfsd sees we can slow down the replication process, so
it no longer brings the entire system to a halt.

[1]
http://www.andrewklau.com/controlling-glusterfsd-cpu-outbreaks-with-cgroups/
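
For reference, a minimal sketch of the kind of cgroup setup described in [1]
(the group name and cpu.shares value below are placeholders, not values from
the post, and the PIDs need to be reclassified whenever the daemons restart):

#!/bin/bash
# Create a CPU cgroup and cap the shares available to the gluster daemons.
# Assumes the libcgroup tools (cgcreate/cgset/cgclassify) are installed.
cgcreate -g cpu:/glusterlimit
cgset -r cpu.shares=256 glusterlimit        # roughly 1/4 of the default 1024
# Move the currently running glusterd/glusterfsd processes into the group.
for pid in $(pidof glusterd glusterfsd); do
    cgclassify -g cpu:/glusterlimit "$pid"
done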



 - Original Message -
 From: Vijay Bellur vbel...@redhat.com
 To: Josh Boon glus...@joshboon.com, Nick Majeran nmaje...@gmail.com
 
 Cc: Gluster-users@gluster.org List gluster-users@gluster.org
 Sent: Saturday, March 22, 2014 1:36:09 PM
 Subject: Re: [Gluster-users] Optimizing Gluster (gfapi) for high IOPS

 On 03/21/2014 09:50 PM, Josh Boon wrote:
  Hardware RAID 5 on SSDs using LVM, formatted with XFS default options and
  mounted with noatime.

  Also, I don't have a lot of history for this currently troubled machine, but
  the sysctl additions don't appear to have made a significant difference.

 Performance tunables in [1] are normally recommended for qemu -
 libgfapi. The last two options are related to quorum and the remaining
 tunables are related to performance. It might be worth a check to see if
 these options help provide better performance.

 Do you happen to know if self-healing was in progress when the machines
 stalled?

 -Vijay

 [1]
 https://github.com/gluster/glusterfs/blob/master/extras/group-virt.example
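
(Not part of the quoted message: the options in a file like [1] can be applied
one by one with "gluster volume set". A rough sketch, assuming the file uses
key=value lines and that a local copy of it is available; the file path and
the volume name VMARRAY are placeholders:)

#!/bin/bash
# Apply each key=value line from a local copy of group-virt.example.
VOLUME=VMARRAY                        # volume name is an assumption
while IFS='=' read -r key value; do
    # skip blank lines and comment lines
    case "$key" in ''|'#'*) continue ;; esac
    gluster volume set "$VOLUME" "$key" "$value"
done < ./group-virt.example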


 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] cgroup config for glusterfsd in gluster 3.5?

2014-03-31 Thread Andrew Lau
Hi,

I had a recent comment from Giuseppe on my post about gluster cgroups [1],
and I also heard some interesting things at a recent meetup about recent
progress with systemd and gluster.

Comment:
 I've noted that latest GlusterFS 3.5.0 nightly packages do not include
(nor use) the /etc/sysconfig/glusterfsd file anymore.
Should we deduce that the glusterd hierarchy/settings now controls both? 

I haven't had the time to look into 3.5 yet, so does the glusterd process
now control glusterfsd as well? Or would this cgroup method no longer work?

[1]
http://www.andrewklau.com/controlling-glusterfsd-cpu-outbreaks-with-cgroups/
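
(Not an answer, just a sketch of the systemd angle mentioned above: if the
brick processes are spawned by glusterd under systemd, they would normally
end up in glusterd.service's cgroup, so capping that unit might give a
similar effect. The service name and the CPUShares value are assumptions.)

# Cap CPU for glusterd.service and anything it spawns (values are examples).
systemctl set-property glusterd.service CPUShares=256
systemctl show glusterd.service -p CPUShares    # confirm the new setting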
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Capturing config in a file

2014-03-31 Thread Steve Thomas
Hi,

Can anyone tell me how I can capture the gluster brick config in a file? We're 
running RHN Satellite and I'd like to be able to push a config file out to any 
new brick servers and also store it for existing servers.

I'm running 3.4.2 and wondered where, and which files, would be necessary to 
capture all of Gluster's configuration.
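
(A sketch of one possible approach, not from the thread: glusterd keeps its
peer and volume definitions under its working directory, /var/lib/glusterd by
default on 3.4, so archiving that plus /etc/glusterfs captures most of the
server-side state. Paths should be verified locally.)

# Archive the glusterd working directory and config (default paths assumed).
tar czf gluster-config-$(hostname).tar.gz /var/lib/glusterd /etc/glusterfs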

Thanks,
Steve




___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Fwd: Capturing config in a file

2014-03-31 Thread Carlos Capriotti
Adding the list, in order to get other people's opinions...

Today, the Monday after the daylight saving time change, I have only half a
neuron on duty, so replying to all is a next-to-impossible task.

-- Forwarded message --
From: Carlos Capriotti capriotti.car...@gmail.com
Date: Mon, Mar 31, 2014 at 2:50 PM
Subject: Re: [Gluster-users] Capturing config in a file
To: Steve Thomas stho...@rpstechnologysolutions.co.uk


Steve:

Capturing the config in a file CAN be done, BUT using that config is another
story.

What I did when I needed it:



gluster volume info yourvolumenamehere > glusterconf.txt


Next, edit that file, trimming whatever unnecessary info it has, until it
looks like this:



nfs.trusted-sync on
nfs.addr-namelookup off
nfs.nlm off
network.ping-timeout 20
performance.quick-read off
performance.read-ahead off
performance.io-cache off
performance.stat-prefetch off
cluster.eager-lock enable
network.remote-dio on
performance.cache-max-file-size 2MB
performance.cache-refresh-timeout 4
performance.cache-size 1GB
performance.write-behind-window-size 4MB
performance.io-thread-count 32






Then use this very simple script, with your glusterconf.txt file as a
parameter, to duplicate your settings on your volume:

#!/bin/bash

# Apply each "option value" line from the file given as $1 to the volume.
# Note: the volume name (vmdata here) is hard-coded; change it to match yours.
while read -r line; do
echo "$line"
gluster volume set vmdata $line
done < "$1"

Of course there is a lot of room for improvement, but that gets the job
done.
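
For example, if the script above is saved as apply-settings.sh (the name is
arbitrary):

chmod +x apply-settings.sh
./apply-settings.sh glusterconf.txt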


Cheers,
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] nfs acces denied

2014-03-31 Thread VAN CAUSBROECK Wannes
Hello all,

I've already tried to post this, but I'm unsure whether it arrived on the mailing list.

I have some issues regarding my NFS mounts. My setup is as follows:
RHEL 6.4, gluster 3.4.2-1 running on a VM (4 cores, 8GB RAM) attached to a SAN.
I have one 25TB disk, formatted ext4 in 64-bit mode, which holds all the bricks.
On the gluster side of things, everything works without issues. The trouble
starts when I mount a volume as an NFS mount.
Lots of volumes work without issues, but others behave strangely. The volumes
that act weird generally contain many files (could be coincidental?).
The volumes in question mount without issues, but when I try to go into a
subdirectory it sometimes works and sometimes I get errors.

On Windows with the NFS client: access denied

In nfslog:
[2014-03-31 13:57:58.771241] I [dht-layout.c:638:dht_layout_normalize] 
0-caviar_data11-dht: found anomalies in 
gfid:c8d94120-6851-46ea-9f28-c629a44b1015. holes=1 overlaps=0
[2014-03-31 13:57:58.771348] E 
[nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup 
failed: gfid:c8d94120-6851-46ea-9f28-c629a44b1015: Invalid argument
[2014-03-31 13:57:58.771380] E [nfs3.c:1380:nfs3_lookup_resume] 0-nfs-nfsv3: 
Unable to resolve FH: (192.168.148.46:984) caviar_data11 : 
c8d94120-6851-46ea-9f28-c629a44b1015
[2014-03-31 13:57:58.771819] W [nfs3-helpers.c:3380:nfs3_log_common_res] 
0-nfs-nfsv3: XID: 1ec28530, LOOKUP: NFS: 22(Invalid argument for operation), 
POSIX: 14(Bad address)
[2014-03-31 13:57:58.798967] I [dht-layout.c:638:dht_layout_normalize] 
0-caviar_data11-dht: found anomalies in 
gfid:14972193-1039-4d7a-aed5-0d7e7eccf57b. holes=1 overlaps=0
[2014-03-31 13:57:58.799039] E 
[nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup 
failed: gfid:14972193-1039-4d7a-aed5-0d7e7eccf57b: Invalid argument
[2014-03-31 13:57:58.799056] E [nfs3.c:1380:nfs3_lookup_resume] 0-nfs-nfsv3: 
Unable to resolve FH: (192.168.148.46:984) caviar_data11 : 
14972193-1039-4d7a-aed5-0d7e7eccf57b
[2014-03-31 13:57:58.799088] W [nfs3-helpers.c:3380:nfs3_log_common_res] 
0-nfs-nfsv3: XID: 1ec28531, LOOKUP: NFS: 22(Invalid argument for operation), 
POSIX: 14(Bad address)



On Linux:
[root@lpr-nas01 brick-xiv2]# ll /media/2011/201105/20110530/
ls: /media/2011/201105/20110530/37: No such file or directory
total 332
...
drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 32
drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 34
drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 35
drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 36
drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 37
...

[root@lpr-nas01 brick-xiv2]# ll /media/2011/201105/20110530/37
ls: /media/2011/201105/20110530/37/NN.073824357.1.tif: No such file or 
directory
ls: /media/2011/201105/20110530/37/NN.073824357.3.tif: No such file or 
directory
total 54
-rwxrwxr-x 0 nfsnobody 1003  9340 Jun  6  2011 NN.073824357.1.tif
-rwxrwxr-x 1 nfsnobody 1003 35312 Jun  6  2011 NN.073824357.2.tif
-rwxrwxr-x 0 nfsnobody 1003  9340 Jun  6  2011 NN.073824357.3.tif


I see in the nfslog:
...
[2014-03-31 12:44:18.941083] I [dht-layout.c:638:dht_layout_normalize] 
0-caviar_data11-dht: found anomalies in /2011/201107/20110716/55. holes=1 
overlaps=0
[2014-03-31 12:44:18.958078] I [dht-layout.c:638:dht_layout_normalize] 
0-caviar_data11-dht: found anomalies in /2011/201107/20110716/30. holes=1 
overlaps=0
[2014-03-31 12:44:18.959980] I [dht-layout.c:638:dht_layout_normalize] 
0-caviar_data11-dht: found anomalies in /2011/201107/20110716/90. holes=1 
overlaps=0
[2014-03-31 12:44:18.961094] E [dht-helper.c:429:dht_subvol_get_hashed] 
(--/usr/lib64/glusterfs/3.4.2/xlator/debug/io-stats.so(io_stats_lookup+0x157) 
[0x7fd6a61282e7] (--/usr/lib64/libglusterfs.so.0(default_lookup+0x6d) 
[0x3dfe01c03d] 
(--/usr/lib64/glusterfs/3.4.2/xlator/cluster/distribute.so(dht_lookup+0xa7e) 
[0x7fd6a656af2e]))) 0-caviar_data11-dht: invalid argument: loc-parent
[2014-03-31 12:44:18.961283] W [client-rpc-fops.c:2624:client3_3_lookup_cbk] 
0-caviar_data11-client-0: remote operation failed: Invalid argument. Path: 
gfid:---- 
(----)
[2014-03-31 12:44:18.961319] E [acl3.c:334:acl3_getacl_resume] 0-nfs-ACL: 
Unable to resolve FH: (192.168.151.21:740) caviar_data11 : 
----
[2014-03-31 12:44:18.961338] E [acl3.c:342:acl3_getacl_resume] 0-nfs-ACL: 
unable to open_and_resume
...

The weirdest thing is that it changes from time to time which files and
directories work and which don't.
Any ideas?

Thanks!
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] nfs acces denied

2014-03-31 Thread Carlos Capriotti
Well, saying your client side is Linux does not help much. Distro,
flavor, etc. help a lot, but I'll take a wild guess here.

First, force your NFS mount (client) to use nfs version 3.
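
For example, from a Linux client (a sketch: the server name and mount point
are placeholders; the volume name is taken from the logs above):

# Force NFS version 3 over TCP when mounting the Gluster NFS export.
mount -t nfs -o vers=3,proto=tcp gluster-server:/caviar_data11 /mnt/test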

The same for Microsoft. (It is fair to say I have no idea if the MS client
supports v4 or not).

Additionally, check that firewalls are disabled on both sides, just for
testing. The same goes for SELinux.

Windows ACLs and user mapping are something that might be in your way
too. There is a TechNet document that describes how to handle this mapping,
if I am not mistaken.

Just for testing, mount your NFS share on your own server, using
localhost:/nfs_share, and see how it goes.

It is a good start.

Kr,

Carlos


On Mon, Mar 31, 2014 at 3:58 PM, VAN CAUSBROECK Wannes 
wannes.vancausbro...@onprvp.fgov.be wrote:

  Hello all,



 I've already tried to post this, but i'm unsure it arrived to the mailing
 list.



 I have some issues regarding my nfs mounts. My setup is as follows:

 Rhel 6.4, gluster 3.4.2-1 running on a vm (4 cores, 8GB ram) attached to a
 san. I have one disk on which are all the bricks (formatted ext4 in 64 bit
 mode) of 25TB.

 On the gluster side of things, everything works without issues. The
 trouble starts when I mount a volume as an nfs mount.

 Lots of volumes work without issues, but others behave strangely. The
 volumes that act weird generally contain many files (can be accidental?).

 The volumes in question mount without issues, but when I try to go into
 any subdirectory sometimes it works, sometimes I get errors.



 On windows with nfs client: access denied



 In nfslog:

 [2014-03-31 13:57:58.771241] I [dht-layout.c:638:dht_layout_normalize]
 0-caviar_data11-dht: found anomalies in
 gfid:c8d94120-6851-46ea-9f28-c629a44b1015. holes=1 overlaps=0

 [2014-03-31 13:57:58.771348] E
 [nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup
 failed: gfid:c8d94120-6851-46ea-9f28-c629a44b1015: Invalid argument

 [2014-03-31 13:57:58.771380] E [nfs3.c:1380:nfs3_lookup_resume]
 0-nfs-nfsv3: Unable to resolve FH: (192.168.148.46:984) caviar_data11 :
 c8d94120-6851-46ea-9f28-c629a44b1015

 [2014-03-31 13:57:58.771819] W [nfs3-helpers.c:3380:nfs3_log_common_res]
 0-nfs-nfsv3: XID: 1ec28530, LOOKUP: NFS: 22(Invalid argument for
 operation), POSIX: 14(Bad address)

 [2014-03-31 13:57:58.798967] I [dht-layout.c:638:dht_layout_normalize]
 0-caviar_data11-dht: found anomalies in
 gfid:14972193-1039-4d7a-aed5-0d7e7eccf57b. holes=1 overlaps=0

 [2014-03-31 13:57:58.799039] E
 [nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup
 failed: gfid:14972193-1039-4d7a-aed5-0d7e7eccf57b: Invalid argument

 [2014-03-31 13:57:58.799056] E [nfs3.c:1380:nfs3_lookup_resume]
 0-nfs-nfsv3: Unable to resolve FH: (192.168.148.46:984) caviar_data11 :
 14972193-1039-4d7a-aed5-0d7e7eccf57b

 [2014-03-31 13:57:58.799088] W [nfs3-helpers.c:3380:nfs3_log_common_res]
 0-nfs-nfsv3: XID: 1ec28531, LOOKUP: NFS: 22(Invalid argument for
 operation), POSIX: 14(Bad address)

 





 On linux:

 [root@lpr-nas01 brick-xiv2]# ll /media/2011/201105/20110530/

 ls: /media/2011/201105/20110530/37: No such file or directory

 total 332

 ...

 drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 32

 drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 34

 drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 35

 drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 36

 drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 37

 ...



 [root@lpr-nas01 brick-xiv2]# ll /media/2011/201105/20110530/37

 ls: /media/2011/201105/20110530/37/NN.073824357.1.tif: No such
 file or directory

 ls: /media/2011/201105/20110530/37/NN.073824357.3.tif: No such
 file or directory

 total 54

 -rwxrwxr-x 0 nfsnobody 1003  9340 Jun  6  2011 NN.073824357.1.tif

 -rwxrwxr-x 1 nfsnobody 1003 35312 Jun  6  2011 NN.073824357.2.tif

 -rwxrwxr-x 0 nfsnobody 1003  9340 Jun  6  2011 NN.073824357.3.tif





 I see in the nfslog:

 ...

 [2014-03-31 12:44:18.941083] I [dht-layout.c:638:dht_layout_normalize]
 0-caviar_data11-dht: found anomalies in /2011/201107/20110716/55. holes=1
 overlaps=0

 [2014-03-31 12:44:18.958078] I [dht-layout.c:638:dht_layout_normalize]
 0-caviar_data11-dht: found anomalies in /2011/201107/20110716/30. holes=1
 overlaps=0

 [2014-03-31 12:44:18.959980] I [dht-layout.c:638:dht_layout_normalize]
 0-caviar_data11-dht: found anomalies in /2011/201107/20110716/90. holes=1
 overlaps=0

 [2014-03-31 12:44:18.961094] E [dht-helper.c:429:dht_subvol_get_hashed]
 (--/usr/lib64/glusterfs/3.4.2/xlator/debug/io-stats.so(io_stats_lookup+0x157)
 [0x7fd6a61282e7] (--/usr/lib64/libglusterfs.so.0(default_lookup+0x6d)
 [0x3dfe01c03d]
 (--/usr/lib64/glusterfs/3.4.2/xlator/cluster/distribute.so(dht_lookup+0xa7e)
 [0x7fd6a656af2e]))) 0-caviar_data11-dht: invalid argument: loc-parent

 [2014-03-31 12:44:18.961283] W
 [client-rpc-fops.c:2624:client3_3_lookup_cbk] 0-caviar_data11-client-0:
 remote operation failed: Invalid 

Re: [Gluster-users] nfs acces denied

2014-03-31 Thread VAN CAUSBROECK Wannes
Well, with 'client' I do actually mean the server itself.
I've tried forcing Linux and Windows to NFSv3 and TCP, and on Windows I played
around with the uid and gid, but the result is always the same.



On 31 Mar 2014, at 17:22, Carlos Capriotti 
capriotti.car...@gmail.commailto:capriotti.car...@gmail.com wrote:

Well, saying your client-side is linux does not help much. Distro, flavor, 
etc helps a lot, but I'll take a wild guess here.

First, force your NFS mount (client) to use nfs version 3.

The same for Microsoft. (It is fair to say I have no idea if the MS client 
supports v4 or not).

Additionally, check that firewalls are disabled on both sides, just for 
testing. The same goes for SElinux.

Windows and ACL, and user mapping is something that might be in your way too. 
There is a Technet document that describes how to handle this mapping if I am 
not wrong.

Just for testing, mount your nfs share you your own server, using 
localhost:/nfs_share and see how it goes.

It is a good start.

Kr,

Carlos


On Mon, Mar 31, 2014 at 3:58 PM, VAN CAUSBROECK Wannes 
wannes.vancausbro...@onprvp.fgov.bemailto:wannes.vancausbro...@onprvp.fgov.be
 wrote:
Hello all,

I’ve already tried to post this, but i’m unsure it arrived to the mailing list.

I have some issues regarding my nfs mounts. My setup is as follows:
Rhel 6.4, gluster 3.4.2-1 running on a vm (4 cores, 8GB ram) attached to a san. 
I have one disk on which are all the bricks (formatted ext4 in 64 bit mode) of 
25TB.
On the gluster side of things, everything works without issues. The trouble 
starts when I mount a volume as an nfs mount.
Lots of volumes work without issues, but others behave strangely. The volumes 
that act weird generally contain many files (can be accidental?).
The volumes in question mount without issues, but when I try to go into any 
subdirectory sometimes it works, sometimes I get errors.

On windows with nfs client: access denied

In nfslog:
[2014-03-31 13:57:58.771241] I [dht-layout.c:638:dht_layout_normalize] 
0-caviar_data11-dht: found anomalies in 
gfid:c8d94120-6851-46ea-9f28-c629a44b1015. holes=1 overlaps=0
[2014-03-31 13:57:58.771348] E 
[nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup 
failed: gfid:c8d94120-6851-46ea-9f28-c629a44b1015: Invalid argument
[2014-03-31 13:57:58.771380] E [nfs3.c:1380:nfs3_lookup_resume] 0-nfs-nfsv3: 
Unable to resolve FH: (192.168.148.46:984http://192.168.148.46:984) 
caviar_data11 : c8d94120-6851-46ea-9f28-c629a44b1015
[2014-03-31 13:57:58.771819] W [nfs3-helpers.c:3380:nfs3_log_common_res] 
0-nfs-nfsv3: XID: 1ec28530, LOOKUP: NFS: 22(Invalid argument for operation), 
POSIX: 14(Bad address)
[2014-03-31 13:57:58.798967] I [dht-layout.c:638:dht_layout_normalize] 
0-caviar_data11-dht: found anomalies in 
gfid:14972193-1039-4d7a-aed5-0d7e7eccf57b. holes=1 overlaps=0
[2014-03-31 13:57:58.799039] E 
[nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup 
failed: gfid:14972193-1039-4d7a-aed5-0d7e7eccf57b: Invalid argument
[2014-03-31 13:57:58.799056] E [nfs3.c:1380:nfs3_lookup_resume] 0-nfs-nfsv3: 
Unable to resolve FH: (192.168.148.46:984http://192.168.148.46:984) 
caviar_data11 : 14972193-1039-4d7a-aed5-0d7e7eccf57b
[2014-03-31 13:57:58.799088] W [nfs3-helpers.c:3380:nfs3_log_common_res] 
0-nfs-nfsv3: XID: 1ec28531, LOOKUP: NFS: 22(Invalid argument for operation), 
POSIX: 14(Bad address)
….


On linux:
[root@lpr-nas01 brick-xiv2]# ll /media/2011/201105/20110530/
ls: /media/2011/201105/20110530/37: No such file or directory
total 332
…
drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 32
drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 34
drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 35
drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 36
drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 37
…

[root@lpr-nas01 brick-xiv2]# ll /media/2011/201105/20110530/37
ls: /media/2011/201105/20110530/37/NN.073824357.1.tif: No such file or 
directory
ls: /media/2011/201105/20110530/37/NN.073824357.3.tif: No such file or 
directory
total 54
-rwxrwxr-x 0 nfsnobody 1003  9340 Jun  6  2011 NN.073824357.1.tif
-rwxrwxr-x 1 nfsnobody 1003 35312 Jun  6  2011 NN.073824357.2.tif
-rwxrwxr-x 0 nfsnobody 1003  9340 Jun  6  2011 NN.073824357.3.tif


I see in the nfslog:
…
[2014-03-31 12:44:18.941083] I [dht-layout.c:638:dht_layout_normalize] 
0-caviar_data11-dht: found anomalies in /2011/201107/20110716/55. holes=1 
overlaps=0
[2014-03-31 12:44:18.958078] I [dht-layout.c:638:dht_layout_normalize] 
0-caviar_data11-dht: found anomalies in /2011/201107/20110716/30. holes=1 
overlaps=0
[2014-03-31 12:44:18.959980] I [dht-layout.c:638:dht_layout_normalize] 
0-caviar_data11-dht: found anomalies in /2011/201107/20110716/90. holes=1 
overlaps=0
[2014-03-31 12:44:18.961094] E [dht-helper.c:429:dht_subvol_get_hashed] 
(--/usr/lib64/glusterfs/3.4.2/xlator/debug/io-stats.so(io_stats_lookup+0x157) 
[0x7fd6a61282e7] (--/usr/lib64/libglusterfs.so.0(default_lookup+0x6d) 

Re: [Gluster-users] nfs acces denied

2014-03-31 Thread Carlos Capriotti
Maybe it would be nice to see your volume info for the affected volumes.

Also, on the server side, what happens if you mount the share using
glusterfs instead of NFS?
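
For example (a sketch: the server name and mount point are placeholders; the
volume name comes from the logs):

# Native FUSE mount, to compare behaviour against the NFS mount.
mount -t glusterfs gluster-server:/caviar_data11 /mnt/glustertest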

Any chance the native NFS server is running on your server?

Are there any auto-heal processes running?

There are a few name resolution messages in your logs that seem to refer
to the nodes themselves. Any DNS conflicts? Maybe add the names of the servers
to the hosts file?

Your MS client seems to be having issues with user/group translation. It
seems to create files with gid 1003 (I could be wrong).

Again, are SELinux/ACLs/iptables disabled?

It is all very inconclusive so far.


On Mon, Mar 31, 2014 at 5:26 PM, VAN CAUSBROECK Wannes 
wannes.vancausbro...@onprvp.fgov.be wrote:

  Well, with 'client' i do actually mean the server itself.
 i've tried forcing linux and windows to nfs V3 and tcp, and on windows i
 played around with the uid and gid, but the result is always the same



 On 31 Mar 2014, at 17:22, Carlos Capriotti capriotti.car...@gmail.com
 wrote:

   Well, saying your client-side is linux does not help much. Distro,
 flavor, etc helps a lot, but I'll take a wild guess here.

  First, force your NFS mount (client) to use nfs version 3.

  The same for Microsoft. (It is fair to say I have no idea if the MS
 client supports v4 or not).

  Additionally, check that firewalls are disabled on both sides, just for
 testing. The same goes for SElinux.

  Windows and ACL, and user mapping is something that might be in your way
 too. There is a Technet document that describes how to handle this mapping
 if I am not wrong.

  Just for testing, mount your nfs share you your own server, using
 localhost:/nfs_share and see how it goes.

  It is a good start.

  Kr,

  Carlos


 On Mon, Mar 31, 2014 at 3:58 PM, VAN CAUSBROECK Wannes 
 wannes.vancausbro...@onprvp.fgov.be wrote:

  Hello all,



 I've already tried to post this, but i'm unsure it arrived to the mailing
 list.



 I have some issues regarding my nfs mounts. My setup is as follows:

 Rhel 6.4, gluster 3.4.2-1 running on a vm (4 cores, 8GB ram) attached to
 a san. I have one disk on which are all the bricks (formatted ext4 in 64
 bit mode) of 25TB.

 On the gluster side of things, everything works without issues. The
 trouble starts when I mount a volume as an nfs mount.

 Lots of volumes work without issues, but others behave strangely. The
 volumes that act weird generally contain many files (can be accidental?).

 The volumes in question mount without issues, but when I try to go into
 any subdirectory sometimes it works, sometimes I get errors.



 On windows with nfs client: access denied



 In nfslog:

 [2014-03-31 13:57:58.771241] I [dht-layout.c:638:dht_layout_normalize]
 0-caviar_data11-dht: found anomalies in
 gfid:c8d94120-6851-46ea-9f28-c629a44b1015. holes=1 overlaps=0

 [2014-03-31 13:57:58.771348] E
 [nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup
 failed: gfid:c8d94120-6851-46ea-9f28-c629a44b1015: Invalid argument

 [2014-03-31 13:57:58.771380] E [nfs3.c:1380:nfs3_lookup_resume]
 0-nfs-nfsv3: Unable to resolve FH: (192.168.148.46:984) caviar_data11 :
 c8d94120-6851-46ea-9f28-c629a44b1015

 [2014-03-31 13:57:58.771819] W [nfs3-helpers.c:3380:nfs3_log_common_res]
 0-nfs-nfsv3: XID: 1ec28530, LOOKUP: NFS: 22(Invalid argument for
 operation), POSIX: 14(Bad address)

 [2014-03-31 13:57:58.798967] I [dht-layout.c:638:dht_layout_normalize]
 0-caviar_data11-dht: found anomalies in
 gfid:14972193-1039-4d7a-aed5-0d7e7eccf57b. holes=1 overlaps=0

 [2014-03-31 13:57:58.799039] E
 [nfs3-helpers.c:3595:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup
 failed: gfid:14972193-1039-4d7a-aed5-0d7e7eccf57b: Invalid argument

 [2014-03-31 13:57:58.799056] E [nfs3.c:1380:nfs3_lookup_resume]
 0-nfs-nfsv3: Unable to resolve FH: (192.168.148.46:984) caviar_data11 :
 14972193-1039-4d7a-aed5-0d7e7eccf57b

 [2014-03-31 13:57:58.799088] W [nfs3-helpers.c:3380:nfs3_log_common_res]
 0-nfs-nfsv3: XID: 1ec28531, LOOKUP: NFS: 22(Invalid argument for
 operation), POSIX: 14(Bad address)

 





 On linux:

 [root@lpr-nas01 brick-xiv2]# ll /media/2011/201105/20110530/

 ls: /media/2011/201105/20110530/37: No such file or directory

 total 332

 ...

 drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 32

 drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 34

 drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 35

 drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 36

 drwxrwsr-x 2 nfsnobody 1003 4096 Jun  6  2011 37

 ...



 [root@lpr-nas01 brick-xiv2]# ll /media/2011/201105/20110530/37

 ls: /media/2011/201105/20110530/37/NN.073824357.1.tif: No such
 file or directory

 ls: /media/2011/201105/20110530/37/NN.073824357.3.tif: No such
 file or directory

 total 54

 -rwxrwxr-x 0 nfsnobody 1003  9340 Jun  6  2011 NN.073824357.1.tif

 -rwxrwxr-x 1 nfsnobody 1003 35312 Jun  6  2011 NN.073824357.2.tif

 -rwxrwxr-x 0 nfsnobody 1003  9340 Jun  6  2011 

[Gluster-users] [Gluster-user] Geo-Replication: (xtime) failed on peer with OSError On Debian 7.2

2014-03-31 Thread Cary Tsai
Need help on setting up Geo-Replication.
OS: Debian 7.2
GlusterFS: 3.4.2-2

I keep getting the following message.
I checked, and it seems it was a bug, but it has been fixed.
What am I missing?
Is there any way to circumvent this issue, or a short-term solution?
Thanks in advance.
Cary

---

[2014-03-31 17:45:38.86431] I [monitor(monitor):81:monitor] Monitor:
starting gsyncd worker
[2014-03-31 17:45:38.163496] I [gsyncd:404:main_i] top: syncing:
gluster://localhost:mirror -> ssh://gluster@54.19.181.16:/data/mirror
[2014-03-31 17:45:41.109936] I [master:60:gmaster_builder] top: setting
up master for normal sync mode
[2014-03-31 17:45:42.164649] I [master:679:crawl] _GMaster: new master is
5ccdcdb3-77b9-4ec2-92ad-7368d8e24b39
[2014-03-31 17:45:42.165154] I [master:683:crawl] _GMaster: primary master
with volume id 5ccdcdb3-77b9-4ec2-92ad-7368d8e24b39 ...
[2014-03-31 17:45:42.297504] E [repce:188:__call__] RepceClient: call
13218:140336006149888:1396287942.17 (xtime) failed on peer with OSError
[2014-03-31 17:45:42.297789] E [syncdutils:190:log_raise_exception] top:
FAIL:
Traceback (most recent call last):
  File /usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/gsyncd.py,
line 120, in main
main_i()
  File /usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/gsyncd.py,
line 415, in main_i
local.service_loop(*[r for r in [remote] if r])
  File /usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/resource.py,
line 874, in service_loop
gmaster_builder()(self, args[0]).crawl_loop()
  File /usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/master.py,
line 540, in crawl_loop
self.crawl()
  File /usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/master.py,
line 704, in crawl
xtr = self.xtime(path, self.slave)
  File /usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/master.py,
line 376, in xtime
return self.xtime_low(rsc.server, path, **opts)
  File /usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/master.py,
line 110, in xtime_low
xt = server.xtime(path, self.uuid)
  File /usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/repce.py,
line 204, in __call__
return self.ins(self.meth, *a)
  File /usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/repce.py,
line 189, in __call__
raise res
OSError: [Errno 95] Operation not supported
[2014-03-31 17:45:42.299632] I [syncdutils:148:finalize] top: exiting.
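
(Not from the thread: errno 95 is EOPNOTSUPP, which an xattr call can return
when the underlying filesystem was mounted without extended-attribute
support. A hedged check on the slave side, using the path from the log; the
attribute name is arbitrary:)

# Verify that the slave path accepts extended attributes, then clean up.
setfattr -n user.georep-test -v 1 /data/mirror \
    && getfattr -n user.georep-test /data/mirror \
    && setfattr -x user.georep-test /data/mirror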
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] 42 node gluster volume create fails silently

2014-03-31 Thread Prasad, Nirmal
Not much output - not sure where to look. This is the output in the cli.log -
there are 42 servers (21 brick pairs) - timeout perhaps??

[2014-03-31 13:44:34.228467] I [cli-cmd-volume.c:1336:cli_check_gsync_present] 
0-: geo-replication not installed
[2014-03-31 13:44:34.229619] I [cli-cmd-volume.c:392:cli_cmd_volume_create_cbk] 
0-cli: Replicate cluster type found. Checking brick order.
[2014-03-31 13:44:34.230821] I [cli-cmd-volume.c:304:cli_cmd_check_brick_order] 
0-cli: Brick order okay
[2014-03-31 13:44:47.758977] W [rpc-transport.c:175:rpc_transport_load] 
0-rpc-transport: missing 'option transport-type'. defaulting to socket
[2014-03-31 13:44:47.763286] I [socket.c:3480:socket_init] 0-glusterfs: SSL 
support is NOT enabled
[2014-03-31 13:44:47.763326] I [socket.c:3495:socket_init] 0-glusterfs: using 
system polling thread
[2014-03-31 13:44:47.777000] I [cli-cmd-volume.c:1336:cli_check_gsync_present] 
0-: geo-replication not installed
[2014-03-31 13:44:47.780574] I [cli-rpc-ops.c:332:gf_cli_list_friends_cbk] 
0-cli: Received resp to list: 0
[2014-03-31 13:44:47.782086] I [input.c:36:cli_batch] 0-: Exiting with: 0
[2014-03-31 13:46:34.231761] I [input.c:36:cli_batch] 0-: Exiting with: 110
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 42 node gluster volume create fails silently

2014-03-31 Thread Prasad, Nirmal
Looks symptomatic of some timeout -  subsequent status command gave:

gluster volume status
Another transaction is in progress. Please try again after sometime.

From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Prasad, Nirmal
Sent: Monday, March 31, 2014 5:53 PM
To: gluster-users@gluster.org
Subject: [Gluster-users] 42 node gluster volume create fails silently

Not much of output - not sure where to see. This is the output in the cli.log - 
There are 42 servers and (21 brick pairs) - timeout perhaps ??

[2014-03-31 13:44:34.228467] I [cli-cmd-volume.c:1336:cli_check_gsync_present] 
0-: geo-replication not installed
[2014-03-31 13:44:34.229619] I [cli-cmd-volume.c:392:cli_cmd_volume_create_cbk] 
0-cli: Replicate cluster type found. Checking brick order.
[2014-03-31 13:44:34.230821] I [cli-cmd-volume.c:304:cli_cmd_check_brick_order] 
0-cli: Brick order okay
[2014-03-31 13:44:47.758977] W [rpc-transport.c:175:rpc_transport_load] 
0-rpc-transport: missing 'option transport-type'. defaulting to socket
[2014-03-31 13:44:47.763286] I [socket.c:3480:socket_init] 0-glusterfs: SSL 
support is NOT enabled
[2014-03-31 13:44:47.763326] I [socket.c:3495:socket_init] 0-glusterfs: using 
system polling thread
[2014-03-31 13:44:47.777000] I [cli-cmd-volume.c:1336:cli_check_gsync_present] 
0-: geo-replication not installed
[2014-03-31 13:44:47.780574] I [cli-rpc-ops.c:332:gf_cli_list_friends_cbk] 
0-cli: Received resp to list: 0
[2014-03-31 13:44:47.782086] I [input.c:36:cli_batch] 0-: Exiting with: 0
[2014-03-31 13:46:34.231761] I [input.c:36:cli_batch] 0-: Exiting with: 110
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 42 node gluster volume create fails silently

2014-03-31 Thread Prasad, Nirmal
.. and glusterd died. I had success adding individually up to 21 nodes - will 
go down that path. Anyone interested in log files or core files?

service glusterd status
glusterd dead but pid file exists

From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Prasad, Nirmal
Sent: Monday, March 31, 2014 5:57 PM
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] 42 node gluster volume create fails silently

Looks symptomatic of some timeout -  subsequent status command gave:

gluster volume status
Another transaction is in progress. Please try again after sometime.

From: 
gluster-users-boun...@gluster.orgmailto:gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Prasad, Nirmal
Sent: Monday, March 31, 2014 5:53 PM
To: gluster-users@gluster.orgmailto:gluster-users@gluster.org
Subject: [Gluster-users] 42 node gluster volume create fails silently

Not much of output - not sure where to see. This is the output in the cli.log - 
There are 42 servers and (21 brick pairs) - timeout perhaps ??

[2014-03-31 13:44:34.228467] I [cli-cmd-volume.c:1336:cli_check_gsync_present] 
0-: geo-replication not installed
[2014-03-31 13:44:34.229619] I [cli-cmd-volume.c:392:cli_cmd_volume_create_cbk] 
0-cli: Replicate cluster type found. Checking brick order.
[2014-03-31 13:44:34.230821] I [cli-cmd-volume.c:304:cli_cmd_check_brick_order] 
0-cli: Brick order okay
[2014-03-31 13:44:47.758977] W [rpc-transport.c:175:rpc_transport_load] 
0-rpc-transport: missing 'option transport-type'. defaulting to socket
[2014-03-31 13:44:47.763286] I [socket.c:3480:socket_init] 0-glusterfs: SSL 
support is NOT enabled
[2014-03-31 13:44:47.763326] I [socket.c:3495:socket_init] 0-glusterfs: using 
system polling thread
[2014-03-31 13:44:47.777000] I [cli-cmd-volume.c:1336:cli_check_gsync_present] 
0-: geo-replication not installed
[2014-03-31 13:44:47.780574] I [cli-rpc-ops.c:332:gf_cli_list_friends_cbk] 
0-cli: Received resp to list: 0
[2014-03-31 13:44:47.782086] I [input.c:36:cli_batch] 0-: Exiting with: 0
[2014-03-31 13:46:34.231761] I [input.c:36:cli_batch] 0-: Exiting with: 110
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 42 node gluster volume create fails silently

2014-03-31 Thread Dan Lambright
Hello,

The cli logs do not contain much. If you remount your gluster volume and
recreate the problem, there may be more to see.

On the client side:

mount -t glusterfs  -o log-level=DEBUG,log-file=/tmp/my_client.log 
10.16.159.219:/myvol /mnt

On the server side: 

gluster volume set myvol diagnostics.brick-sys-log-level WARNING
gluster volume set myvol diagnostics.brick-log-level WARNING

You could then attach the most recent log files to your email, or the parts 
that seem relevant so the email is not too large.
 
/tmp/my_client.log
/var/log/glusterfs/etc*.log
/var/log/glusterfs/bricks/*.log

- Original Message -
From: Nirmal Prasad npra...@idirect.net
To: gluster-users@gluster.org
Sent: Monday, March 31, 2014 6:04:31 PM
Subject: Re: [Gluster-users] 42 node gluster volume create fails silently



.. and glusterd died. I had success adding individually up to 21 nodes – will 
go down that path. Anyone interested in log files or core files? 



service glusterd status 

glusterd dead but pid file exists 




From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Prasad, Nirmal 
Sent: Monday, March 31, 2014 5:57 PM 
To: gluster-users@gluster.org 
Subject: Re: [Gluster-users] 42 node gluster volume create fails silently 




Looks symptomatic of some timeout – subsequent status command gave: 



gluster volume status 

Another transaction is in progress. Please try again after sometime. 




From: gluster-users-boun...@gluster.org [ 
mailto:gluster-users-boun...@gluster.org ] On Behalf Of Prasad, Nirmal 
Sent: Monday, March 31, 2014 5:53 PM 
To: gluster-users@gluster.org 
Subject: [Gluster-users] 42 node gluster volume create fails silently 




Not much of output – not sure where to see. This is the output in the cli.log – 
There are 42 servers and (21 brick pairs) – timeout perhaps ?? 



[2014-03-31 13:44:34.228467] I [cli-cmd-volume.c:1336:cli_check_gsync_present] 
0-: geo-replication not installed 

[2014-03-31 13:44:34.229619] I [cli-cmd-volume.c:392:cli_cmd_volume_create_cbk] 
0-cli: Replicate cluster type found. Checking brick order. 

[2014-03-31 13:44:34.230821] I [cli-cmd-volume.c:304:cli_cmd_check_brick_order] 
0-cli: Brick order okay 

[2014-03-31 13:44:47.758977] W [rpc-transport.c:175:rpc_transport_load] 
0-rpc-transport: missing 'option transport-type'. defaulting to socket 

[2014-03-31 13:44:47.763286] I [socket.c:3480:socket_init] 0-glusterfs: SSL 
support is NOT enabled 

[2014-03-31 13:44:47.763326] I [socket.c:3495:socket_init] 0-glusterfs: using 
system polling thread 

[2014-03-31 13:44:47.777000] I [cli-cmd-volume.c:1336:cli_check_gsync_present] 
0-: geo-replication not installed 

[2014-03-31 13:44:47.780574] I [cli-rpc-ops.c:332:gf_cli_list_friends_cbk] 
0-cli: Received resp to list: 0 

[2014-03-31 13:44:47.782086] I [input.c:36:cli_batch] 0-: Exiting with: 0 

[2014-03-31 13:46:34.231761] I [input.c:36:cli_batch] 0-: Exiting with: 110 

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 42 node gluster volume create fails silently

2014-03-31 Thread Prasad, Nirmal
Hi Dan,

Thanks for the quick response. I'm trying to create the volume, so I have not
reached this stage - the client exited when I gave it:

gluster volume create vol-name replica 2 server1:.. server2:..  
server41:.. server42:..

If I do :

gluster volume create vol-name replica 2 server1:.. server2:..
gluster volume add-brick vol-name replica 2 server3:.. server4:..

it gets me farther... it looks like there is some timeout for the gluster
command - not sure, just an observation.
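
A rough sketch of that incremental approach (server names, brick path and
volume name below are placeholders):

#!/bin/bash
# Create the volume from the first replica pair, then add the remaining
# pairs two bricks at a time instead of passing all 42 bricks at once.
VOL=vol-name
BRICK=/export/brick1
gluster volume create "$VOL" replica 2 server1:"$BRICK" server2:"$BRICK"
for i in $(seq 3 2 41); do
    gluster volume add-brick "$VOL" replica 2 \
        server"$i":"$BRICK" server"$((i+1))":"$BRICK"
done
gluster volume start "$VOL"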

Thanks
Regards
Nirmal
-Original Message-
From: Dan Lambright [mailto:dlamb...@redhat.com] 
Sent: Monday, March 31, 2014 6:16 PM
To: Prasad, Nirmal
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] 42 node gluster volume create fails silently

Hello,

the cli logs do not contain much. If you remount your gluster volume, and 
create the problem again, there may be more to see.

On the client side:

mount -t glusterfs  -o log-level=DEBUG,log-file=/tmp/my_client.log 
10.16.159.219:/myvol /mnt

On the server side: 

gluster volume set myvol diagnostics.brick-sys-log-level WARNING gluster volume 
set myvol diagnostics.brick-log-level WARNING

You could then attach the most recent log files to your email, or the parts 
that seem relevant so the email is not too large.
 
/tmp/my_client.log
/var/log/glusterfs/etc*.log
/var/log/glusterfs/bricks/*.log

- Original Message -
From: Nirmal Prasad npra...@idirect.net
To: gluster-users@gluster.org
Sent: Monday, March 31, 2014 6:04:31 PM
Subject: Re: [Gluster-users] 42 node gluster volume create fails silently



.. and glusterd died. I had success adding individually up to 21 nodes – will 
go down that path. Anyone interested in log files or core files? 



service glusterd status 

glusterd dead but pid file exists 




From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Prasad, Nirmal
Sent: Monday, March 31, 2014 5:57 PM
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] 42 node gluster volume create fails silently 




Looks symptomatic of some timeout – subsequent status command gave: 



gluster volume status 

Another transaction is in progress. Please try again after sometime. 




From: gluster-users-boun...@gluster.org [ 
mailto:gluster-users-boun...@gluster.org ] On Behalf Of Prasad, Nirmal
Sent: Monday, March 31, 2014 5:53 PM
To: gluster-users@gluster.org
Subject: [Gluster-users] 42 node gluster volume create fails silently 




Not much of output – not sure where to see. This is the output in the cli.log – 
There are 42 servers and (21 brick pairs) – timeout perhaps ?? 



[2014-03-31 13:44:34.228467] I [cli-cmd-volume.c:1336:cli_check_gsync_present] 
0-: geo-replication not installed 

[2014-03-31 13:44:34.229619] I [cli-cmd-volume.c:392:cli_cmd_volume_create_cbk] 
0-cli: Replicate cluster type found. Checking brick order. 

[2014-03-31 13:44:34.230821] I [cli-cmd-volume.c:304:cli_cmd_check_brick_order] 
0-cli: Brick order okay 

[2014-03-31 13:44:47.758977] W [rpc-transport.c:175:rpc_transport_load] 
0-rpc-transport: missing 'option transport-type'. defaulting to socket 

[2014-03-31 13:44:47.763286] I [socket.c:3480:socket_init] 0-glusterfs: SSL 
support is NOT enabled 

[2014-03-31 13:44:47.763326] I [socket.c:3495:socket_init] 0-glusterfs: using 
system polling thread 

[2014-03-31 13:44:47.777000] I [cli-cmd-volume.c:1336:cli_check_gsync_present] 
0-: geo-replication not installed 

[2014-03-31 13:44:47.780574] I [cli-rpc-ops.c:332:gf_cli_list_friends_cbk] 
0-cli: Received resp to list: 0 

[2014-03-31 13:44:47.782086] I [input.c:36:cli_batch] 0-: Exiting with: 0 

[2014-03-31 13:46:34.231761] I [input.c:36:cli_batch] 0-: Exiting with: 110 

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 42 node gluster volume create fails silently

2014-03-31 Thread Prasad, Nirmal
OK - for some reason it did not like 6 of my nodes, but I was able to add 34
nodes two at a time - maybe the client can do a similar split internally based
on the replica count. The failure from add-brick is simply "volume add-brick:
failed:"

gluster volume info

Volume Name: gl_disk
Type: Distributed-Replicate
Volume ID: c70d525e-a255-41e2-af03-718d6dec0319
Status: Created
Number of Bricks: 17 x 2 = 34
Transport-type: tcp

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Prasad, Nirmal
Sent: Monday, March 31, 2014 6:20 PM
To: Dan Lambright
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] 42 node gluster volume create fails silently

Hi Dan,

Thanks for the quick response. I'm trying to create the volume so have not 
reached this stage - the client exited out when I gave it :

gluster volume create vol-name replica 2 server1:.. server2:..  
server41:.. server42:..

If I do :

gluster volume create vol-name replica 2 server1:.. server2:..
gluster volume add-brick vol-name replica 2 server3:.. server4:..

it gets me farther ... looks like there is some timeout for the gluster command 
- not sure - just an observation.

Thanks
Regards
Nirmal
-Original Message-
From: Dan Lambright [mailto:dlamb...@redhat.com] 
Sent: Monday, March 31, 2014 6:16 PM
To: Prasad, Nirmal
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] 42 node gluster volume create fails silently

Hello,

the cli logs do not contain much. If you remount your gluster volume, and 
create the problem again, there may be more to see.

On the client side:

mount -t glusterfs  -o log-level=DEBUG,log-file=/tmp/my_client.log 
10.16.159.219:/myvol /mnt

On the server side: 

gluster volume set myvol diagnostics.brick-sys-log-level WARNING gluster volume 
set myvol diagnostics.brick-log-level WARNING

You could then attach the most recent log files to your email, or the parts 
that seem relevant so the email is not too large.
 
/tmp/my_client.log
/var/log/glusterfs/etc*.log
/var/log/glusterfs/bricks/*.log

- Original Message -
From: Nirmal Prasad npra...@idirect.net
To: gluster-users@gluster.org
Sent: Monday, March 31, 2014 6:04:31 PM
Subject: Re: [Gluster-users] 42 node gluster volume create fails silently



.. and glusterd died. I had success adding individually up to 21 nodes – will 
go down that path. Anyone interested in log files or core files? 



service glusterd status 

glusterd dead but pid file exists 




From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Prasad, Nirmal
Sent: Monday, March 31, 2014 5:57 PM
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] 42 node gluster volume create fails silently 




Looks symptomatic of some timeout – subsequent status command gave: 



gluster volume status 

Another transaction is in progress. Please try again after sometime. 




From: gluster-users-boun...@gluster.org [ 
mailto:gluster-users-boun...@gluster.org ] On Behalf Of Prasad, Nirmal
Sent: Monday, March 31, 2014 5:53 PM
To: gluster-users@gluster.org
Subject: [Gluster-users] 42 node gluster volume create fails silently 




Not much of output – not sure where to see. This is the output in the cli.log – 
There are 42 servers and (21 brick pairs) – timeout perhaps ?? 



[2014-03-31 13:44:34.228467] I [cli-cmd-volume.c:1336:cli_check_gsync_present] 
0-: geo-replication not installed 

[2014-03-31 13:44:34.229619] I [cli-cmd-volume.c:392:cli_cmd_volume_create_cbk] 
0-cli: Replicate cluster type found. Checking brick order. 

[2014-03-31 13:44:34.230821] I [cli-cmd-volume.c:304:cli_cmd_check_brick_order] 
0-cli: Brick order okay 

[2014-03-31 13:44:47.758977] W [rpc-transport.c:175:rpc_transport_load] 
0-rpc-transport: missing 'option transport-type'. defaulting to socket 

[2014-03-31 13:44:47.763286] I [socket.c:3480:socket_init] 0-glusterfs: SSL 
support is NOT enabled 

[2014-03-31 13:44:47.763326] I [socket.c:3495:socket_init] 0-glusterfs: using 
system polling thread 

[2014-03-31 13:44:47.777000] I [cli-cmd-volume.c:1336:cli_check_gsync_present] 
0-: geo-replication not installed 

[2014-03-31 13:44:47.780574] I [cli-rpc-ops.c:332:gf_cli_list_friends_cbk] 
0-cli: Received resp to list: 0 

[2014-03-31 13:44:47.782086] I [input.c:36:cli_batch] 0-: Exiting with: 0 

[2014-03-31 13:46:34.231761] I [input.c:36:cli_batch] 0-: Exiting with: 110 

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 42 node gluster volume create fails silently

2014-03-31 Thread Prasad, Nirmal
3.4.2 - tracing out the problem on the other nodes. I think they had something
partially left over. The probe and addition process could definitely use some
speed - plumbing should be quick - the fun is in the data.

Cleared it out with:

setfattr -x trusted.glusterfs.volume-id <mount>
setfattr -x trusted.gfid <mount>


-Original Message-
From: Joe Julian [mailto:j...@julianfamily.org] 
Sent: Monday, March 31, 2014 7:06 PM
To: Prasad, Nirmal
Subject: Re: [Gluster-users] 42 node gluster volume create fails silently

What version?

On 03/31/2014 03:59 PM, Prasad, Nirmal wrote:
 Ok - for some reason it did not like 6 of my nodes - but able to add 34 nodes 
 two at a time - may be client can do the similar split internally based on 
 replica count. The failure from add-brick is simple volume add-brick: 
 failed: 

 gluster volume info

 Volume Name: gl_disk
 Type: Distributed-Replicate
 Volume ID: c70d525e-a255-41e2-af03-718d6dec0319
 Status: Created
 Number of Bricks: 17 x 2 = 34
 Transport-type: tcp

 -Original Message-
 From: gluster-users-boun...@gluster.org 
 [mailto:gluster-users-boun...@gluster.org] On Behalf Of Prasad, Nirmal
 Sent: Monday, March 31, 2014 6:20 PM
 To: Dan Lambright
 Cc: gluster-users@gluster.org
 Subject: Re: [Gluster-users] 42 node gluster volume create fails silently

 Hi Dan,

 Thanks for the quick response. I'm trying to create the volume so have not 
 reached this stage - the client exited out when I gave it :

 gluster volume create vol-name replica 2 server1:.. server2:..  
 server41:.. server42:..

 If I do :

 gluster volume create vol-name replica 2 server1:.. server2:..
 gluster volume add-brick vol-name replica 2 server3:.. server4:..

 it gets me farther ... looks like there is some timeout for the gluster 
 command - not sure - just an observation.

 Thanks
 Regards
 Nirmal
 -Original Message-
 From: Dan Lambright [mailto:dlamb...@redhat.com]
 Sent: Monday, March 31, 2014 6:16 PM
 To: Prasad, Nirmal
 Cc: gluster-users@gluster.org
 Subject: Re: [Gluster-users] 42 node gluster volume create fails silently

 Hello,

 the cli logs do not contain much. If you remount your gluster volume, and 
 create the problem again, there may be more to see.

 On the client side:

 mount -t glusterfs  -o log-level=DEBUG,log-file=/tmp/my_client.log 
 10.16.159.219:/myvol /mnt

 On the server side:

 gluster volume set myvol diagnostics.brick-sys-log-level WARNING gluster 
 volume set myvol diagnostics.brick-log-level WARNING

 You could then attach the most recent log files to your email, or the parts 
 that seem relevant so the email is not too large.
   
 /tmp/my_client.log
 /var/log/glusterfs/etc*.log
 /var/log/glusterfs/bricks/*.log

 - Original Message -
 From: Nirmal Prasad npra...@idirect.net
 To: gluster-users@gluster.org
 Sent: Monday, March 31, 2014 6:04:31 PM
 Subject: Re: [Gluster-users] 42 node gluster volume create fails silently



 .. and glusterd died. I had success adding individually up to 21 nodes – will 
 go down that path. Anyone interested in log files or core files?



 service glusterd status

 glusterd dead but pid file exists




 From: gluster-users-boun...@gluster.org 
 [mailto:gluster-users-boun...@gluster.org] On Behalf Of Prasad, Nirmal
 Sent: Monday, March 31, 2014 5:57 PM
 To: gluster-users@gluster.org
 Subject: Re: [Gluster-users] 42 node gluster volume create fails silently




 Looks symptomatic of some timeout – subsequent status command gave:



 gluster volume status

 Another transaction is in progress. Please try again after sometime.




 From: gluster-users-boun...@gluster.org [ 
 mailto:gluster-users-boun...@gluster.org ] On Behalf Of Prasad, Nirmal
 Sent: Monday, March 31, 2014 5:53 PM
 To: gluster-users@gluster.org
 Subject: [Gluster-users] 42 node gluster volume create fails silently




 Not much of output – not sure where to see. This is the output in the cli.log 
 – There are 42 servers and (21 brick pairs) – timeout perhaps ??



 [2014-03-31 13:44:34.228467] I 
 [cli-cmd-volume.c:1336:cli_check_gsync_present] 0-: geo-replication not 
 installed

 [2014-03-31 13:44:34.229619] I 
 [cli-cmd-volume.c:392:cli_cmd_volume_create_cbk] 0-cli: Replicate cluster 
 type found. Checking brick order.

 [2014-03-31 13:44:34.230821] I 
 [cli-cmd-volume.c:304:cli_cmd_check_brick_order] 0-cli: Brick order okay

 [2014-03-31 13:44:47.758977] W [rpc-transport.c:175:rpc_transport_load] 
 0-rpc-transport: missing 'option transport-type'. defaulting to socket

 [2014-03-31 13:44:47.763286] I [socket.c:3480:socket_init] 0-glusterfs: SSL 
 support is NOT enabled

 [2014-03-31 13:44:47.763326] I [socket.c:3495:socket_init] 0-glusterfs: using 
 system polling thread

 [2014-03-31 13:44:47.777000] I 
 [cli-cmd-volume.c:1336:cli_check_gsync_present] 0-: geo-replication not 
 installed

 [2014-03-31 13:44:47.780574] I [cli-rpc-ops.c:332:gf_cli_list_friends_cbk] 
 0-cli: Received resp to list: 0

 [2014-03-31 13:44:47.782086] I 

[Gluster-users] gfid different on subvolume

2014-03-31 Thread Mingfan Lu
I have seen some errors like "gfid different on subvolume" in my deployment.
e.g.
[2014-03-26 07:56:17.224262] W
[afr-common.c:1196:afr_detect_self_heal_by_iatt]
0-sh_ugc4_mams-replicate-1: /operation_1/video/2014/03/26/24/19: gfid
different on subvolume

My clients (3.3) have already backported the patches mentioned in
https://bugzilla.redhat.com/show_bug.cgi?id=907072

CHANGE: http://review.gluster.org/4459 (cluster/dht: ignore EEXIST error in
mkdir to avoid GFID mismatch) merged in master by Anand Avati

CHANGE: http://review.gluster.org/5849 (cluster/dht: assign layout onto
missing directories too)


But I still see such errors.

I thought these changes were relevant to the clients - am I right? Or do I
need to update my servers to the patched release as well?

Or am I missing something else?
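
(Not from the thread: one way to inspect such a mismatch directly is to
compare the gfid xattr of the affected directory on each replica brick.
Brick paths below are placeholders; the directory path comes from the log.)

# Print the trusted.gfid of the directory on each brick of the replica pair.
for brick in /export/brick1 /export/brick2; do
    getfattr -n trusted.gfid -e hex "$brick/operation_1/video/2014/03/26/24/19"
done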
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users