Re: [Gluster-users] Apache hung tasks still occur with glusterfs 3.2.1

2011-06-14 Thread Christopher Anderlik

hello.

do you have any feedback yet?
was it successful? (disabled io-cache, disabled stat-prefetch, increased
io-thread-count to 64)
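
For reference, these options can be toggled at runtime with the gluster CLI; a
minimal sketch, assuming a volume named "myvolume" (substitute your own volume
name, and check the 3.2 admin guide for the exact option names in your release):

  # disable the io-cache and stat-prefetch performance translators
  gluster volume set myvolume performance.io-cache off
  gluster volume set myvolume performance.stat-prefetch off
  # raise the brick io-thread count to 64
  gluster volume set myvolume performance.io-thread-count 64
  # confirm the values under "Options Reconfigured"
  gluster volume info myvolume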

is/was your problem similar to this one?
http://bugs.gluster.com/show_bug.cgi?id=3011


thx
christopher

On 13.06.2011 at 19:14, Jiri Lunacek wrote:

Thanks for the tip. I disabled io-cache and stat-prefetch, increased 
io-thread-count to 64 and
rebooted the server to clean off the hung apache processes. We'll see tomorrow.

On 13.6.2011, at 15:58, Justice London wrote:


Disable io-cache and up the threads to 64 and your problems should disappear. 
They did for me when
I made both of these changes.
Justice London
From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Jiri Lunacek
Sent: Monday, June 13, 2011 1:49 AM
To: gluster-users@gluster.org
Subject: [Gluster-users] Apache hung tasks still occur with glusterfs 3.2.1
Hi all.
We have been having problems with hung tasks of apache reading from glusterfs 
2-replica volume
ever since upgrading to 3.2.0. The problems were identical to those described 
here:
http://gluster.org/pipermail/gluster-users/2011-May/007697.html
Yesterday we updated to 3.2.1.
A good thing is that the hung tasks stopped appearing when gluster is in 
intact operation, i.e.
when there are no modifications to the gluster configs at all.
Today we modified some other volume exported by the same cluster (but not 
sharing anything with
the volume used by the apache process). And, once again, two requests of apache 
reading from
glusterfs volume are stuck.
Any help with this issue would be greatly appreciated as right now we have to 
nightly-reboot the
machine as the processes are stuck in iowait - unkillable.
I really do not want to go through the downgrade to 3.1.4 since it seems from 
the mailing list
that it may not go exactly smoothly. We are exporting millions of files and any 
large operation on
the exported filesystem takes days.
I am attaching tech info on the problem.
client:
Centos 5.6
2.6.18-238.9.1.el5
fuse-2.7.4-8.el5
glusterfs-fuse-3.2.1-1
glusterfs-core-3.2.1-1
servers:
Centos 5.6
2.6.18-194.32.1.el5
fuse-2.7.4-8.el5
glusterfs-fuse-3.2.1-1
glusterfs-core-3.2.1-1
dmesg:
INFO: task httpd:1246 blocked for more than 120 seconds.
echo 0 > /proc/sys/kernel/hung_task_timeout_secs disables this message.
httpd D 81000101d7a0 0 1246 2394 1247 1191 (NOTLB)
81013ee7dc38 0082 0092 81013ee7dcd8
81013ee7dd04 000a 810144d0f7e0 81019fc28100
308f8b444727 14ee 810144d0f9c8 00038006e608
Call Trace:
[8006ec4e] do_gettimeofday+0x40/0x90
[80028c5a] sync_page+0x0/0x43
[800637ca] io_schedule+0x3f/0x67
[80028c98] sync_page+0x3e/0x43
[8006390e] __wait_on_bit_lock+0x36/0x66
[8003ff27] __lock_page+0x5e/0x64
[800a2921] wake_bit_function+0x0/0x23
[8003fd85] pagevec_lookup+0x17/0x1e
[800cc666] invalidate_inode_pages2_range+0x73/0x1bd
[8004fc94] finish_wait+0x32/0x5d
[884b9798] :fuse:wait_answer_interruptible+0xb6/0xbd
[800a28f3] autoremove_wake_function+0x0/0x2e
[8009a485] recalc_sigpending+0xe/0x25
[8001decc] sigprocmask+0xb7/0xdb
[884bd456] :fuse:fuse_finish_open+0x36/0x62
[884bda11] :fuse:fuse_open_common+0x147/0x158
[884bda22] :fuse:fuse_open+0x0/0x7
[8001eb99] __dentry_open+0xd9/0x1dc
[8002766e] do_filp_open+0x2a/0x38
[8001a061] do_sys_open+0x44/0xbe
[8005d28d] tracesys+0xd5/0xe0
INFO: task httpd:1837 blocked for more than 120 seconds.
echo 0 > /proc/sys/kernel/hung_task_timeout_secs disables this message.
httpd D 810001004420 0 1837 2394 1856 1289 (NOTLB)
81013c6f9c38 0086 81013c6f9bf8 fffe
810170ce7000 000a 81019c0ae7a0 80311b60
308c0f83d792 0ec4 81019c0ae988 8006e608
Call Trace:
[8006ec4e] do_gettimeofday+0x40/0x90
[80028c5a] sync_page+0x0/0x43
[800637ca] io_schedule+0x3f/0x67
[80028c98] sync_page+0x3e/0x43
[8006390e] __wait_on_bit_lock+0x36/0x66
[8003ff27] __lock_page+0x5e/0x64
[800a2921] wake_bit_function+0x0/0x23
[8003fd85] pagevec_lookup+0x17/0x1e
[800cc666] invalidate_inode_pages2_range+0x73/0x1bd
[8004fc94] finish_wait+0x32/0x5d
[884b9798] :fuse:wait_answer_interruptible+0xb6/0xbd
[800a28f3] autoremove_wake_function+0x0/0x2e
[8009a485] recalc_sigpending+0xe/0x25
[8001decc] sigprocmask+0xb7/0xdb
[884bd456] :fuse:fuse_finish_open+0x36/0x62
[884bda11] :fuse:fuse_open_common+0x147/0x158
[884bda22] :fuse:fuse_open+0x0/0x7
[8001eb99] __dentry_open+0xd9/0x1dc
[8002766e] do_filp_open+0x2a/0x38
[8001a061] do_sys_open+0x44/0xbe
[8005d28d] 

Re: [Gluster-users] Apache hung tasks still occur with glusterfs 3.2.1

2011-06-14 Thread Anand Avati
Can you get us the process state dump of the glusterfs client where httpd is
hung? kill -USR1 <glusterfs pid> will generate /tmp/glusterdump.<pid>, which is
the dump file.
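
A minimal sketch of collecting that on the client, assuming the mount is served
by a single glusterfs client process (the pid is whatever ps reports on your box):

  # find the pid of the glusterfs client process backing the mount
  ps aux | grep '[g]lusterfs'
  # send SIGUSR1 to that pid to trigger the state dump
  kill -USR1 <pid>
  # the dump should appear as /tmp/glusterdump.<pid>
  ls -l /tmp/glusterdump.*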

Avati

On Mon, Jun 13, 2011 at 2:18 PM, Jiri Lunacek jiri.luna...@hosting90.cz wrote:

 Hi all.

 We have been having problems with hung tasks of apache reading from
 glusterfs 2-replica volume ever since upgrading to 3.2.0. The problems were
 identical to those described here:
 http://gluster.org/pipermail/gluster-users/2011-May/007697.html

 Yesterday we updated to 3.2.1.

 A good thing is that the hung tasks stopped appearing when gluster is in
 intact operation, i.e. when there are no modifications to the gluster
 configs at all.
 Today we modified some other volume exported by the same cluster (but not
 sharing anything with the volume used by the apache process). And, once
 again, two requests of apache reading from glusterfs volume are stuck.

 Any help with this issue would be greatly appreciated as right now we have to
 nightly-reboot the machine as the processes are stuck in iowait -
 unkillable.

 I really do not want to go through the downgrade to 3.1.4 since it seems
 from the mailing list that it may not go exactly smoothly. We are exporting
 millions of files and any large operation on the exported filesystem takes
 days.

 I am attaching tech info on the problem.

 client:
 Centos 5.6
 2.6.18-238.9.1.el5
 fuse-2.7.4-8.el5
 glusterfs-fuse-3.2.1-1
 glusterfs-core-3.2.1-1

 servers:
 Centos 5.6
 2.6.18-194.32.1.el5
 fuse-2.7.4-8.el5
 glusterfs-fuse-3.2.1-1
 glusterfs-core-3.2.1-1

 dmesg:
 INFO: task httpd:1246 blocked for more than 120 seconds.
 echo 0 > /proc/sys/kernel/hung_task_timeout_secs disables this message.
 httpd D 81000101d7a0 0  1246   2394  1247  1191
 (NOTLB)
  81013ee7dc38 0082 0092 81013ee7dcd8
  81013ee7dd04 000a 810144d0f7e0 81019fc28100
  308f8b444727 14ee 810144d0f9c8 00038006e608
 Call Trace:
  [8006ec4e] do_gettimeofday+0x40/0x90
  [80028c5a] sync_page+0x0/0x43
  [800637ca] io_schedule+0x3f/0x67
  [80028c98] sync_page+0x3e/0x43
  [8006390e] __wait_on_bit_lock+0x36/0x66
  [8003ff27] __lock_page+0x5e/0x64
  [800a2921] wake_bit_function+0x0/0x23
  [8003fd85] pagevec_lookup+0x17/0x1e
  [800cc666] invalidate_inode_pages2_range+0x73/0x1bd
  [8004fc94] finish_wait+0x32/0x5d
  [884b9798] :fuse:wait_answer_interruptible+0xb6/0xbd
  [800a28f3] autoremove_wake_function+0x0/0x2e
  [8009a485] recalc_sigpending+0xe/0x25
  [8001decc] sigprocmask+0xb7/0xdb
  [884bd456] :fuse:fuse_finish_open+0x36/0x62
  [884bda11] :fuse:fuse_open_common+0x147/0x158
  [884bda22] :fuse:fuse_open+0x0/0x7
  [8001eb99] __dentry_open+0xd9/0x1dc
  [8002766e] do_filp_open+0x2a/0x38
  [8001a061] do_sys_open+0x44/0xbe
  [8005d28d] tracesys+0xd5/0xe0

 INFO: task httpd:1837 blocked for more than 120 seconds.
 echo 0 > /proc/sys/kernel/hung_task_timeout_secs disables this message.
 httpd D 810001004420 0  1837   2394  1856  1289
 (NOTLB)
  81013c6f9c38 0086 81013c6f9bf8 fffe
  810170ce7000 000a 81019c0ae7a0 80311b60
  308c0f83d792 0ec4 81019c0ae988 8006e608
 Call Trace:
  [8006ec4e] do_gettimeofday+0x40/0x90
  [80028c5a] sync_page+0x0/0x43
  [800637ca] io_schedule+0x3f/0x67
  [80028c98] sync_page+0x3e/0x43
  [8006390e] __wait_on_bit_lock+0x36/0x66
  [8003ff27] __lock_page+0x5e/0x64
  [800a2921] wake_bit_function+0x0/0x23
  [8003fd85] pagevec_lookup+0x17/0x1e
  [800cc666] invalidate_inode_pages2_range+0x73/0x1bd
  [8004fc94] finish_wait+0x32/0x5d
  [884b9798] :fuse:wait_answer_interruptible+0xb6/0xbd
  [800a28f3] autoremove_wake_function+0x0/0x2e
  [8009a485] recalc_sigpending+0xe/0x25
  [8001decc] sigprocmask+0xb7/0xdb
  [884bd456] :fuse:fuse_finish_open+0x36/0x62
  [884bda11] :fuse:fuse_open_common+0x147/0x158
  [884bda22] :fuse:fuse_open+0x0/0x7
  [8001eb99] __dentry_open+0xd9/0x1dc
  [8002766e] do_filp_open+0x2a/0x38
  [8001a061] do_sys_open+0x44/0xbe
  [8005d28d] tracesys+0xd5/0xe0

 INFO: task httpd:383 blocked for more than 120 seconds.
 echo 0 > /proc/sys/kernel/hung_task_timeout_secs disables this message.
 httpd D 81019fa21100 0   383   2394   534
 (NOTLB)
  81013e497c08 0082 810183eb8910 884b9219
  81019e41c600 0009 81019b1e2100 81019fa21100
  308c0e2c2bfb 00016477 81019b1e22e8 00038006e608
 Call Trace:
  [884b9219] :fuse:flush_bg_queue+0x2b/0x48
  [8006ec4e] do_gettimeofday+0x40/0x90
  

Re: [Gluster-users] Crossover cable: single point of failure?

2011-06-14 Thread Daniel Manser

Hi Whit,

Thanks for your reply.

I do know that it's not the Gluster-standard thing to use a crossover link.
(Seems to me it's the obvious best way to do it, but it's not a
configuration they're committed to.) It's possible that if you were doing
your replication over the LAN rather than the crossover that Gluster would
handle a disconnected system better. Might be worth testing.


It is still the same, even if no crossover cable is used and all 
traffic goes through an ethernet switch. The client can't write to the 
gluster volume anymore. I discovered that the NFS volume seems to be 
read-only in this state:


  client01:~# rm debian-6.0.1a-i386-DVD-1.iso
  rm: cannot remove `debian-6.0.1a-i386-DVD-1.iso': Read-only file system


So all traffic goes through one interface (NFS to the client, glusterfs 
replication, corosync).


I can reproduce the issue with the NFS client on VMware ESXi and with 
the NFS client on my Linux desktop.


My config:

  Volume Name: vmware
  Type: Replicate
  Status: Started
  Number of Bricks: 2
  Transport-type: tcp
  Bricks:
  Brick1: gluster1:/mnt/gvolumes/vmware
  Brick2: gluster2:/mnt/gvolumes/vmware

Regards,
Daniel
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Crossover cable: single point of failure?

2011-06-14 Thread Anand Avati
Daniel,
 Can you confirm if your backend filesystem is proper? Can you delete the
file from the backend? Gluster does not return EROFS in any of the cases you
described. Also, try setting a lower ping-timeout and see if it helps in
the case of the crossover-cable failover test.
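
A minimal sketch of lowering it, using the volume name from Daniel's config
("vmware"); the default ping-timeout is 42 seconds:

  # drop the ping timeout to 5 seconds
  gluster volume set vmware network.ping-timeout 5
  # verify it shows up under "Options Reconfigured"
  gluster volume info vmware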

Avati

On Tue, Jun 14, 2011 at 12:58 PM, Daniel Manser dan...@clienta.ch wrote:

 Hi Whit,

 Thanks for your reply.


  I do know that it's not the Gluster-standard thing to use a crossover
 link.
 (Seems to me it's the obvious best way to do it, but it's not a
 configuration they're committed to.) It's possible that if you were doing
 your replication over the LAN rather than the crossover that Gluster would
 handle a disconnected system better. Might be worth testing.


 It is still the same, even if no crossover cable is used and all traffic
 goes through an ethernet switch. The client can't write to the gluster
 volume anymore. I discovered that the NFS volume seems to be read-only in
 this state:

  client01:~# rm debian-6.0.1a-i386-DVD-1.iso
  rm: cannot remove `debian-6.0.1a-i386-DVD-1.iso': Read-only file system

 So all traffic goes through one interface (NFS to the client, glusterfs
 replication, corosync).

 I can reproduce the issue with the NFS client on VMware ESXi and with the
 NFS client on my Linux desktop.

 My config:

  Volume Name: vmware
  Type: Replicate
  Status: Started
  Number of Bricks: 2
  Transport-type: tcp
  Bricks:
  Brick1: gluster1:/mnt/gvolumes/vmware
  Brick2: gluster2:/mnt/gvolumes/vmware

 Regards,
 Daniel

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Crossover cable: single point of failure?

2011-06-14 Thread Daniel Manser

Hi

Thanks for your reply.


 Can you confirm if your backend filesystem is proper? Can you delete
the file from the backend?


I was able to delete files on the server.


Also, try setting a lower ping-timeout and see if
it helps in the case of the crossover-cable failover test.


I set it to 5 seconds, but the result is still the same.

  Volume Name: vmware
  Type: Replicate
  Status: Started
  Number of Bricks: 2
  Transport-type: tcp
  Bricks:
  Brick1: gluster1:/mnt/gvolumes/vmware
  Brick2: gluster2:/mnt/gvolumes/vmware
  Options Reconfigured:
  network.ping-timeout: 5

Daniel
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Strange errors reading/writing/editing/deleting JPGs, PDFs and PNG from PHP Application

2011-06-14 Thread Alan Zapolsky
Here is the log.  Nothing really stands out.  There is one entry from today
and the previous log entry was from 6/3.

[alan@app1:10.71.57.82:glusterfs]$ sudo cat /var/log/glusterfs/drives-d1.log
[2011-06-03 18:17:05.160722] W [io-stats.c:1644:init] d1: dangling volume.
check volfile
[2011-06-03 18:17:05.160865] W [dict.c:1205:data_to_str] dict: @data=(nil)
[2011-06-03 18:17:05.160897] W [dict.c:1205:data_to_str] dict: @data=(nil)
Given volfile:
+--+
  1: volume d1-client-0
  2: type protocol/client
  3: option remote-host 10.198.6.214
  4: option remote-subvolume /data/d1
  5: option transport-type tcp
  6: end-volume
  7:
  8: volume d1-client-1
  9: type protocol/client
 10: option remote-host 10.195.15.38
 11: option remote-subvolume /data/d1
 12: option transport-type tcp
 13: end-volume
 14:
 15: volume d1-replicate-0
 16: type cluster/replicate
 17: subvolumes d1-client-0 d1-client-1
 18: end-volume
 19:
 20: volume d1-write-behind
 21: type performance/write-behind
 22: option cache-size 4MB
 23: subvolumes d1-replicate-0
 24: end-volume
 25:
 26: volume d1-read-ahead
 27: type performance/read-ahead
 28: subvolumes d1-write-behind
 29: end-volume
 30:
 31: volume d1-io-cache
 32: type performance/io-cache
 33: option cache-size 1024MB
 34: subvolumes d1-read-ahead
 35: end-volume
 36:
 37: volume d1-quick-read
 38: type performance/quick-read
 39: option cache-size 1024MB
 40: subvolumes d1-io-cache
 41: end-volume
 42:
 43: volume d1-stat-prefetch
 44: type performance/stat-prefetch
 45: subvolumes d1-quick-read
 46: end-volume
 47:
 48: volume d1
 49: type debug/io-stats
 50: subvolumes d1-stat-prefetch
 51: end-volume

+--+
[2011-06-03 18:17:08.676157] I
[client-handshake.c:1005:select_server_supported_programs] d1-client-0:
Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2011-06-03 18:17:08.684299] I
[client-handshake.c:1005:select_server_supported_programs] d1-client-1:
Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2011-06-03 18:17:08.718624] I [client-handshake.c:841:client_setvolume_cbk]
d1-client-1: Connected to 10.195.15.38:24009, attached to remote volume
'/data/d1'.
[2011-06-03 18:17:08.718687] I [afr-common.c:2572:afr_notify]
d1-replicate-0: Subvolume 'd1-client-1' came back up; going online.
[2011-06-03 18:17:08.732772] I [fuse-bridge.c:2821:fuse_init]
glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel
7.14
[2011-06-03 18:17:08.735602] I [afr-common.c:819:afr_fresh_lookup_cbk]
d1-replicate-0: added root inode
[2011-06-03 18:17:08.748443] I [client-handshake.c:841:client_setvolume_cbk]
d1-client-0: Connected to 10.198.6.214:24009, attached to remote volume
'/data/d1'.
[2011-06-10 06:33:08.255922] W [fuse-bridge.c:2510:fuse_getxattr]
glusterfs-fuse: 3480740: GETXATTR (null)/3039291028 (security.capability)
(fuse_loc_fill() failed)
[alan@app1:10.71.57.82:glusterfs]$


Forgive me, I'm relatively new to GlusterFS.  I'm not sure what level of
logging I have set up.  How can I tell the level of logging I have
configured? Perhaps I could increase it and hopefully capture more detailed
information.
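
One hedged way to do that from the CLI on a 3.1/3.2-style install, assuming the
volume is named d1 as in the volfile above (option names may vary by release):

  # inspect the current settings; reconfigured options are listed at the bottom
  gluster volume info d1
  # raise the client-side (fuse mount) log level to DEBUG
  gluster volume set d1 diagnostics.client-log-level DEBUG
  # optionally raise the brick-side log level as well
  gluster volume set d1 diagnostics.brick-log-level DEBUG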


Thanks again for the help!
 - Alan

Just in case this helps, here are the volume configuration files from the
server:


[alan@file1:10.198.6.214:d1]$ sudo cat d1-fuse.vol
volume d1-client-0
type protocol/client
option remote-host 10.198.6.214
option remote-subvolume /data/d1
option transport-type tcp
end-volume

volume d1-client-1
type protocol/client
option remote-host 10.195.15.38
option remote-subvolume /data/d1
option transport-type tcp
end-volume

volume d1-replicate-0
type cluster/replicate
subvolumes d1-client-0 d1-client-1
end-volume

volume d1-write-behind
type performance/write-behind
option cache-size 4MB
subvolumes d1-replicate-0
end-volume

volume d1-read-ahead
type performance/read-ahead
subvolumes d1-write-behind
end-volume

volume d1-io-cache
type performance/io-cache
option cache-size 1024MB
subvolumes d1-read-ahead
end-volume

volume d1-quick-read
type performance/quick-read
option cache-size 1024MB
subvolumes d1-io-cache
end-volume

volume d1-stat-prefetch
type performance/stat-prefetch
subvolumes d1-quick-read
end-volume

volume d1
type debug/io-stats
subvolumes d1-stat-prefetch
end-volume
[alan@file1:10.198.6.214:d1]$


[alan@file1:10.198.6.214:d1]$ sudo cat d1.10.195.15.38.data-d1.vol
volume d1-posix
type storage/posix
option directory /data/d1
end-volume

volume d1-access-control
type features/access-control
subvolumes d1-posix
end-volume

volume d1-locks
type features/locks
subvolumes d1-access-control
end-volume

volume d1-io-threads
type 

Re: [Gluster-users] [Gluster3.2@Grid5000] 128 nodes failure and rr scheduler question

2011-06-14 Thread François Thiebolt
Hello,

To make things clear, what I've done is:
- deploying GlusterFS on 2, 4, 8, 16, 32, 64, 128 nodes
- running a variant of the MAB benchmark (it's all about compilation of 
openssl-1.0.0) on 2, 4, 8, 16, 32, 64, 128 nodes
- I used 'pdsh -f 512' to start MAB on all nodes at the same time
- in each experiment, on each node, I ran MAB in a dedicated directory within 
the glusterfs global namespace (e.g. nodeA used <gluster global 
namespace>/nodeA/mab files) to avoid a metadata storm on the parent directory 
inode
- between each experiment, I destroy and redeploy a complete new GlusterFS 
setup (and I also destroy everything within each brick i.e the exported storage 
dir)

I then compare the average compilation time vs the number of nodes ... and it 
increases due to the round robin scheduler that dispatches files on all the 
bricks
2 : Phase_V(s)avg   249.9332121175
4 : Phase_V(s)avg   262.808117374
8 : Phase_V(s)avg   293.572061537875
16 : Phase_V(s)avg   351.436554833375
32 : Phase_V(s)avg   546.503069517844
64 : Phase_V(s)avg   1010.61019479478
(phase V is related to the compilation itself, previous phases are about 
metadata ops)
You can also try to compile a linux kernel on your own, this is pretty much the 
same thing.

Now regarding the GlusterFS setup: yes, you're right, there is no replication, 
so this is a simple striping (on a file basis) setup.
Each time, I create a glusterfs volume featuring one brick, then I add bricks 
(one by one) till I reach the number of nodes ... and after that, I start the 
volume.
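
A rough sketch of that sequence with the gluster CLI (hostnames and the brick
path are placeholders, not the actual Grid'5000 names):

  # create the volume with the first brick
  gluster volume create myVolume transport tcp node001:/storage
  # add the remaining bricks one by one
  gluster volume add-brick myVolume node002:/storage
  gluster volume add-brick myVolume node003:/storage
  # ... repeat until the desired brick count is reached ...
  # then start the volume
  gluster volume start myVolume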
Now regarding the 128-brick case: it is when I start the volume that I get a 
random error telling me that brickX does not respond, and the failing brick 
changes every time I retry to start the volume.
So far, I haven't tested with a number of nodes between 64 and 128.

François
 
On Friday, June 10, 2011 16:38 CEST, Pavan T C t...@gluster.com wrote: 
 
 On Wednesday 08 June 2011 06:10 PM, Francois THIEBOLT wrote:
  Hello,
 
  I'm driving some experiments on grid'5000 with GlusterFS 3.2 and, as a
  first point, I've been unable to start a volume featuring 128 bricks (64 ok)
 
  Then, due to the round-robin scheduler, as the number of nodes increase
  (every node is also a brick), the performance of an application on an
  individual node decrease!
 
 I would like to understand what you mean by increase of nodes. You 
 have 64 bricks and each brick also acts as a client. So, where is the 
 increase in the number of nodes? Are you referring to the mounts that 
 you are doing?
 
 What is your gluster configuration - I mean, is it a distribute only, or 
 is it a distributed-replicate setup? [From your command sequence, it 
 should be a pure distribute, but I just want to be sure].
 
 What is your application like? Is it mostly I/O intensive? It will help 
 if you provide a brief description of typical operations done by your 
 application.
 
 How are you measuring the performance? What parameter determines that 
 you are experiencing a decrease in performance with increase in the 
 number of nodes?
 
 Pavan
 
  So my question is : how to STOP the round-robin distribution of files
  over the bricks within a volume ?
 
  *** Setup ***
  - i'm using glusterfs3.2 from source
  - every node is both a client node and a brick (storage)
  Commands :
  - gluster peer probe each of the 128nodes
  - gluster volume create myVolume transport tcp 128 bricks:/storage
  - gluster volume start myVolume (fails with 128 bricks!)
  - mount -t glusterfs .. on all nodes
 
  Feel free to tell me how to improve things
 
  François
 
 
 
 
 
 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Files present on the backend but have become invisible from clients

2011-06-14 Thread Burnash, James
Hi Pranith.

Yes, I do see those messages in my mount logs on the client:

root@jc1lnxsamm100:~# fgrep afr-self-heal /var/log/glusterfs/pfs2.log | tail
[2011-06-14 07:30:56.152066] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes
[2011-06-14 07:35:16.869848] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes
[2011-06-14 07:39:48.500117] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes
[2011-06-14 07:40:19.312364] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes
[2011-06-14 07:44:27.714292] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes
[2011-06-14 07:50:04.691154] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes
[2011-06-14 07:54:17.853591] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes
[2011-06-14 07:55:26.876415] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes
[2011-06-14 07:59:51.702585] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes
[2011-06-14 08:00:08.346056] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes

James Burnash
Unix Engineer
Knight Capital Group


-Original Message-
From: Pranith Kumar. Karampuri [mailto:prani...@gluster.com] 
Sent: Tuesday, June 14, 2011 1:28 AM
To: Burnash, James; Jeff Darcy (jda...@redhat.com); gluster-users@gluster.org
Subject: RE: [Gluster-users] Files present on the backend but have become 
invisible from clients

hi James,
bricks 3-10 don't have problems; I think bricks 01 and 02 went into a split-brain 
situation. Could you confirm if you see the following logs in your mount's log 
file:
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix]0-stress-volume-replicate-0: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes.
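
For what it's worth, a hedged sketch of what "fix the file on all backend
volumes" amounts to for '/': compare the ownership and mode of the brick root
directories on both replicas and make them identical (the paths below are the
ones listed later in this thread; adjust as needed and run as root on each server):

  # on jc1letgfs14 and jc1letgfs15, compare owner, group and mode of the brick roots
  stat -c '%n %U:%G %a' /export/read-only/g01 /export/read-only/g02
  # if they differ between the two servers, align them, for example:
  # chown root:root /export/read-only/g01
  # chmod 755 /export/read-only/g01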

Pranith.

From: Burnash, James [jburn...@knight.com]
Sent: Monday, June 13, 2011 11:56 PM
To: Pranith Kumar. Karampuri; Jeff Darcy(jda...@redhat.com); 
gluster-users@gluster.org
Subject: RE: [Gluster-users] Files present on the backend but have become 
invisible from clients

Hi Pranith.

Here is the revised listing - please notice that bricks g01 and g02 on the two 
servers (jc1letgfs14 and 15) have what appear to be normal trusted.afr 
attributes, but the balance of the bricks (3-10) all have 
=0x.

http://pastebin.com/j0hVFTzd

Is this right, or am I looking at this backwards / sideways?

James Burnash
Unix Engineer
Knight Capital Group

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Burnash, James
Sent: Monday, June 13, 2011 8:28 AM
To: 'Pranith Kumar. Karampuri'; Jeff Darcy (jda...@redhat.com); 
gluster-users@gluster.org
Subject: Re: [Gluster-users] Files present on the backend but have become 
invisible from clients

Hi Pranith.

Sorry - last week was a rough one. Disregard that pastebin - I will put up a 
new one that makes more sense and repost to the list.

James
-Original Message-
From: Pranith Kumar. Karampuri [mailto:prani...@gluster.com]
Sent: Monday, June 13, 2011 1:12 AM
To: Burnash, James; Jeff Darcy (jda...@redhat.com); gluster-users@gluster.org
Subject: RE: [Gluster-users] Files present on the backend but have become 
invisible from clients

hi James,
 I looked at the pastebin sample; I see that all of the attrs are complete 
zeros. Could you let me know what it is that I am missing.

Pranith

From: gluster-users-boun...@gluster.org [gluster-users-boun...@gluster.org] on 
behalf of Burnash, James [jburn...@knight.com]
Sent: 

Re: [Gluster-users] Apache hung tasks still occur with glusterfs 3.2.1

2011-06-14 Thread Jiri Lunacek
Hi.


 hello.
 
 do you have any feedback yet?
 was it successful? (disabled io-cache, disabled stat-prefetch, increased 
 io-thread-count to 64)
 

For now it seems that the workaround has worked. We have not encountered any 
hung processes on the server since the change (io-cache disabled, stat-prefetch 
disabled, io-thread-count=64).

The only negative effect was expected: the pages (mainly listings of several 
hundred images per page) take a little while longer. Of course this is caused 
by the files not being cached.

 is/was your problem similar to this one?
 http://bugs.gluster.com/show_bug.cgi?id=3011

The symptoms were the same. The processes were hung on ioctl. /proc/<pid>/wchan for 
the PIDs showed sync_page.

I'll experiment a bit once again today, set the volume back to the original 
parameters, and wait for a hung process to get you the information 
(/tmp/glusterdump.<pid>).

I'll report back later.

Jiri

 
 On 13.06.2011 at 19:14, Jiri Lunacek wrote:
 Thanks for the tip. I disabled io-cache and stat-prefetch, increased 
 io-thread-count to 64 and
 rebooted the server to clean off the hung apache processes. We'll see 
 tomorrow.
 
 On 13.6.2011, at 15:58, Justice London wrote:
 
 Disable io-cache and up the threads to 64 and your problems should 
 disappear. They did for me when
 I made both of these changes.
 Justice London
 From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Jiri Lunacek
 Sent: Monday, June 13, 2011 1:49 AM
 To: gluster-users@gluster.org
 Subject: [Gluster-users] Apache hung tasks still occur with glusterfs 3.2.1
 Hi all.
 We have been having problems with hung tasks of apache reading from 
 glusterfs 2-replica volume
 ever since upgrading to 3.2.0. The problems were identical to those 
 described here:
 http://gluster.org/pipermail/gluster-users/2011-May/007697.html
 Yesterday we updated to 3.2.1.
 A good thing is that the hung tasks stopped appearing when gluster is in 
 intact operation, i.e.
 when there are no modifications to the gluster configs at all.
 Today we modified some other volume exported by the same cluster (but not 
 sharing anything with
 the volume used by the apache process). And, once again, two requests of 
 apache reading from
 glusterfs volume are stuck.
 Any help with this issue would be greatly appreciated as right now we have to 
 nightly-reboot the
 machine as the processes are stuck in iowait - unkillable.
 I really do not want to go through the downgrade to 3.1.4 since it seems 
 from the mailing list
 that it may not go exactly smoothly. We are exporting millions of files and 
 any large operation on
 the exported filesystem takes days.
 I am attaching tech info on the problem.
 client:
 Centos 5.6
 2.6.18-238.9.1.el5
 fuse-2.7.4-8.el5
 glusterfs-fuse-3.2.1-1
 glusterfs-core-3.2.1-1
 servers:
 Centos 5.6
 2.6.18-194.32.1.el5
 fuse-2.7.4-8.el5
 glusterfs-fuse-3.2.1-1
 glusterfs-core-3.2.1-1
 dmesg:
 INFO: task httpd:1246 blocked for more than 120 seconds.
 echo 0 > /proc/sys/kernel/hung_task_timeout_secs disables this message.
 httpd D 81000101d7a0 0 1246 2394 1247 1191 (NOTLB)
 81013ee7dc38 0082 0092 81013ee7dcd8
 81013ee7dd04 000a 810144d0f7e0 81019fc28100
 308f8b444727 14ee 810144d0f9c8 00038006e608
 Call Trace:
 [8006ec4e] do_gettimeofday+0x40/0x90
 [80028c5a] sync_page+0x0/0x43
 [800637ca] io_schedule+0x3f/0x67
 [80028c98] sync_page+0x3e/0x43
 [8006390e] __wait_on_bit_lock+0x36/0x66
 [8003ff27] __lock_page+0x5e/0x64
 [800a2921] wake_bit_function+0x0/0x23
 [8003fd85] pagevec_lookup+0x17/0x1e
 [800cc666] invalidate_inode_pages2_range+0x73/0x1bd
 [8004fc94] finish_wait+0x32/0x5d
 [884b9798] :fuse:wait_answer_interruptible+0xb6/0xbd
 [800a28f3] autoremove_wake_function+0x0/0x2e
 [8009a485] recalc_sigpending+0xe/0x25
 [8001decc] sigprocmask+0xb7/0xdb
 [884bd456] :fuse:fuse_finish_open+0x36/0x62
 [884bda11] :fuse:fuse_open_common+0x147/0x158
 [884bda22] :fuse:fuse_open+0x0/0x7
 [8001eb99] __dentry_open+0xd9/0x1dc
 [8002766e] do_filp_open+0x2a/0x38
 [8001a061] do_sys_open+0x44/0xbe
 [8005d28d] tracesys+0xd5/0xe0
 INFO: task httpd:1837 blocked for more than 120 seconds.
 echo 0 > /proc/sys/kernel/hung_task_timeout_secs disables this message.
 httpd D 810001004420 0 1837 2394 1856 1289 (NOTLB)
 81013c6f9c38 0086 81013c6f9bf8 fffe
 810170ce7000 000a 81019c0ae7a0 80311b60
 308c0f83d792 0ec4 81019c0ae988 8006e608
 Call Trace:
 [8006ec4e] do_gettimeofday+0x40/0x90
 [80028c5a] sync_page+0x0/0x43
 [800637ca] io_schedule+0x3f/0x67
 [80028c98] sync_page+0x3e/0x43
 

[Gluster-users] read-ahead performance translator tweaking with 3.2.1?

2011-06-14 Thread mki-glusterfs
Hi

Is there a way to tweak the read-ahead settings via the gluster command
line?  For example:

gluster volume set somevolumename performance.read-ahead 2

Or is this no longer feasible?  With read-ahead set to the default of
8, as was the case with standard volgen-generated configs, the amount
of useless reads happening to the bricks is way too high, and on 1 GbE
interconnects causes saturation and performance degradation in no time.
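
For what it's worth, a hedged sketch of what can be checked from the CLI on
3.2; the old volgen page-count knob may not be exposed as a volume option, but
the translator can usually be switched off entirely if it is doing more harm
than good (check the 3.2 admin guide for the exact option list):

  # see which performance options are currently reconfigured
  gluster volume info somevolumename
  # disable the read-ahead translator, if the option is exposed in your release
  gluster volume set somevolumename performance.read-ahead off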

Thanks.

Mohan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Crossover cable: single point of failure?

2011-06-14 Thread Mohit Anchlia
On Tue, Jun 14, 2011 at 2:51 AM, Daniel Manser dan...@clienta.ch wrote:
 Hi

 Thanks for your reply.

  Can you confirm if you backend filesystem is proper? Can you delete
 the file from the backend?

 I was able to delete files on the server.

 Also, try setting a lower ping-timeout and see if
 it helps in case of crosscable failover test.

 I set it to 5 seconds, but the result is still the same.

It will be good to get to the bottom of this. Do you see any errors in
the server logs? Is it possible to do the same test with no VMware in
between, just using bare-metal machines?


  Volume Name: vmware
  Type: Replicate
  Status: Started
  Number of Bricks: 2
  Transport-type: tcp
  Bricks:
  Brick1: gluster1:/mnt/gvolumes/vmware
  Brick2: gluster2:/mnt/gvolumes/vmware
  Options Reconfigured:
  network.ping-timeout: 5

 Daniel
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] Variable sized bricks replication

2011-06-14 Thread Philip Poten
Hello,

we've been using glusterfs 3.0 for a while now, and it appears to be quite
stable and very useful. The next thing we need to do, however, is to migrate to
glusterfs 3.2 to allow for brick additions on the fly without client
restarts.

Now, since we are about to completely re-do the whole thing, we should
really do distributed replicated volumes, and here I was wondering: can I
use different brick sizes for that? For economical reasons, I need to use
the hardware on hand, and there is a lot, but the disks are anything from
500GB to 2TB.

Now, how does glusterfs handle the replication here? Will gluster just use
another node if one is full?

Any experience with that?

cheers,
Philip
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Files present on the backend but have become invisible from clients

2011-06-14 Thread Pranith Kumar. Karampuri
hi James,
   Could you please check if any of the file permissions of files in the 
directory are mismatched. I also need the output of getfattr -d -m . 
<filename> for all the files in the following bricks, in that order:

jc1letgfs14:export/read-only/g01
jc1letgfs15:export/read-only/g01

jc1letgfs14:export/read-only/g02
jc1letgfs15:export/read-only/g02

Please give the ls command output on the mount point so that we can check what 
files are missing.
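
A minimal sketch of collecting that, run as root on each of the two servers
against the local brick directories (the hex output form is usually easiest to
compare; the leading slash on the brick paths is assumed):

  # dump all extended attributes, including trusted.afr.*, for the files in a brick
  getfattr -d -m . -e hex /export/read-only/g01/*
  # and for the brick directory itself
  getfattr -d -m . -e hex /export/read-only/g01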

Thanks
Pranith

From: Burnash, James [jburn...@knight.com]
Sent: Tuesday, June 14, 2011 5:37 PM
To: Pranith Kumar. Karampuri; Jeff Darcy(jda...@redhat.com); 
gluster-users@gluster.org
Subject: RE: [Gluster-users] Files present on the backend but have become 
invisible from clients

Hi Pranith.

Yes, I do see those messages in my mount logs on the client:

root@jc1lnxsamm100:~# fgrep afr-self-heal /var/log/glusterfs/pfs2.log | tail
[2011-06-14 07:30:56.152066] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes
[2011-06-14 07:35:16.869848] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes
[2011-06-14 07:39:48.500117] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes
[2011-06-14 07:40:19.312364] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes
[2011-06-14 07:44:27.714292] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes
[2011-06-14 07:50:04.691154] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes
[2011-06-14 07:54:17.853591] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes
[2011-06-14 07:55:26.876415] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes
[2011-06-14 07:59:51.702585] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes
[2011-06-14 08:00:08.346056] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes

James Burnash
Unix Engineer
Knight Capital Group


-Original Message-
From: Pranith Kumar. Karampuri [mailto:prani...@gluster.com]
Sent: Tuesday, June 14, 2011 1:28 AM
To: Burnash, James; Jeff Darcy (jda...@redhat.com); gluster-users@gluster.org
Subject: RE: [Gluster-users] Files present on the backend but have become 
invisible from clients

hi James,
bricks 3-10 don't have problems; I think bricks 01 and 02 went into a split-brain 
situation. Could you confirm if you see the following logs in your mount's log 
file:
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix]0-stress-volume-replicate-0: 
Unable to self-heal permissions/ownership of '/' (possible split-brain). Please 
fix the file on all backend volumes.

Pranith.

From: Burnash, James [jburn...@knight.com]
Sent: Monday, June 13, 2011 11:56 PM
To: Pranith Kumar. Karampuri; Jeff Darcy(jda...@redhat.com); 
gluster-users@gluster.org
Subject: RE: [Gluster-users] Files present on the backend but have become 
invisible from clients

Hi Pranith.

Here is the revised listing - please notice that bricks g01 and g02 on the two 
servers (jc1letgfs14 and 15) have what appear to be normal trusted.afr 
attributes, but the balance of the bricks (3-10) all have 
=0x.

http://pastebin.com/j0hVFTzd

Is this right, or am I looking at this backwards / sideways?

James Burnash
Unix Engineer
Knight Capital Group

-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Burnash, James
Sent: Monday, June 13, 2011 8:28 AM
To: 'Pranith Kumar. Karampuri'; Jeff Darcy (jda...@redhat.com); 
gluster-users@gluster.org
Subject: Re: [Gluster-users] Files present on the backend but have become 
invisible from