Hello,

We are using GFS2 on a 3-node cluster, kernel 2.6.18-164.6.1.el5,
RHEL/CentOS 5, x86_64, with 8-12GB of memory in each node. The underlying
storage is an HP 2312fc smart array with 12 15K rpm SAS disks, configured as
RAID10 using 10 HDDs + 2 spares. The array has about 4GB of cache.
Communication is 4Gbps FC through an HP StorageWorks 8/8 Base e-port SAN
Switch.

Our application is Apache 1.3.41, mostly serving static HTML files plus a
few PHP scripts. Note that we had to downgrade to 1.3.41 due to an
application requirement. Apache is configured with MaxClients 500. Each HTML
file is placed in a different directory. The PHP scripts take a lock and
then modify the HTML files. We use round-robin DNS to load balance between
the web servers.

The GFS2 storage was formatted with 4 journals and runs on top of an LVM
volume. We have configured CMAN, QDiskd, and fencing as appropriate, and
everything worked just fine. We used QDiskd because the cluster initially
had only 2 nodes. We used manual fencing temporarily since no fencing
hardware was configured yet. GFS2 is mounted with the noatime,nodiratime
options.
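For reference, a setup like the one described above would be created roughly
as follows. The device path and mount point are placeholders, not our exact
commands; the cluster and filesystem names match the fsid=pantip:data1 in
the attached log:

```shell
# Format with 4 journals (one per node plus a spare), DLM locking:
mkfs.gfs2 -p lock_dlm -t pantip:data1 -j 4 /dev/vg0/data1

# Mount on each node without access-time updates:
mount -t gfs2 -o noatime,nodiratime /dev/vg0/data1 /data1
```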

Initially, the application ran fine. The problem we encountered is that,
over time, the load average on some nodes would gradually climb to around
300-500, whereas under a normal workload it should be about 10. Once the
load piled up, HTML modifications would mostly fail.

We suspected this might be a plock rate-limiting issue, so we modified the
cluster.conf configuration and added some more mount options, such as
num_glockd=16 and data=writeback, to increase performance. After we
successfully rebooted the system and mounted the volume, we ran the
ping_pong test (http://wiki.samba.org/index.php/Ping_pong) to see how fast
locking could go. The lock rate increased greatly, from about 100 to
3,000-5,000 locks/sec. However, after running ping_pong on all 3 nodes
simultaneously, the ping_pong processes hung in D state and we could not
kill them even with SIGKILL.
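For reference, the plock tuning in question is gfs_controld's
plock_rate_limit setting in cluster.conf (setting it to 0 removes the
limit), and the test was run along these lines on each node; the file path
and lock count here are illustrative:

```shell
# Run concurrently on every node against the same file on the GFS2
# mount; the lock count should exceed the number of nodes:
ping_pong /data1/ping_pong.dat 4
```

When more than one node runs the test, the reported lock rate is expected to
drop due to contention, but the processes should never wedge in D state.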

Due to time constraints, we decided to leave the system as is, with
ping_pong stuck on all nodes while still serving web requests. After running
for hours, the httpd processes also got stuck in D state and could not be
killed, and web serving was no longer possible at all. We had to reset all
machines (unmounting was not possible). After the reset, the machines came
back up and the GFS2 volume returned to normal.


Since we had to reset all machines anyway, I decided to run gfs2_fsck on the
volume. I unmounted GFS2 on all nodes, ran gfs2_fsck, answered "y" to many
questions about freeing blocks, and got the volume back. However, the stuck
processes reappeared very quickly. More seriously, trying to kill a running
process on GFS2, or to unmount it, yielded a kernel panic and suspended the
volume.
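For reference, the check was done roughly like this; the device path is a
placeholder, and the filesystem must be unmounted on every node first:

```shell
# On every node:
umount /data1

# Then, on one node only, check and repair, answering yes to all
# questions:
gfs2_fsck -y /dev/vg0/data1
```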

After this, the volume never returned to normal. It now crashes (kernel
panic) almost immediately whenever we try to write anything to it. This
happens even after I removed the extra mount options and left just noatime
and nodiratime. I have not run gfs2_fsck again yet, since we decided to
leave the volume as is and try to back up as much data as possible.

Sorry for such a long story. In summary, my questions are:


   - What could be the cause of the load average pile-up? Note that it
   sometimes happened only on some nodes, although DNS round robin should
   distribute the workload fairly evenly across all nodes; at the least, the
   load difference shouldn't be that large.
   - Should we run gfs2_fsck again? Why did the lockup occur?


I have attached our cluster.conf as well as the kernel panic log to this
e-mail.


Thank you very much in advance.

Best Regards,

===========================================
Somsak Sriprayoonsakul

INOX

Attachment: cluster.conf
Description: Binary data

Apr 21 23:06:43 cafe2 kernel: dlm: data1: group leave failed -512 0
Apr 21 23:06:43 cafe2 dlm_controld[8075]: open "/sys/kernel/dlm/data1/event_done" error -1 2
Apr 21 23:06:43 cafe2 kernel: GFS2: fsid=pantip:data1.2: withdrawn
Apr 21 23:06:43 cafe2 kernel: 
Apr 21 23:06:43 cafe2 kernel: Call Trace:
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff8854c3ce>] :gfs2:gfs2_lm_withdraw+0xc1/0xd0
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff80017a2d>] cache_grow+0x35a/0x3c1
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff8005c2b4>] cache_alloc_refill+0x106/0x186
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff8854e242>] :gfs2:__glock_lo_add+0x62/0x89
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff8855f58f>] :gfs2:gfs2_consist_rgrpd_i+0x34/0x39
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff8855c08c>] :gfs2:rgblk_free+0x13a/0x15c
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff8855cd83>] :gfs2:gfs2_free_data+0x27/0x9a
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff88541985>] :gfs2:do_strip+0x2c9/0x349
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff885407e2>] :gfs2:recursive_scan+0xf2/0x175
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff885408fe>] :gfs2:trunc_dealloc+0x99/0xe7
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff885416bc>] :gfs2:do_strip+0x0/0x349
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff80090000>] sched_exit+0xb4/0xb5
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff88557dda>] :gfs2:gfs2_delete_inode+0xdd/0x191
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff88557d43>] :gfs2:gfs2_delete_inode+0x46/0x191
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff88547e77>] :gfs2:gfs2_glock_schedule_for_reclaim+0x5d/0x9a
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff88557cfd>] :gfs2:gfs2_delete_inode+0x0/0x191
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff8002f48f>] generic_delete_inode+0xc6/0x143
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff8855c9a4>] :gfs2:gfs2_inplace_reserve_i+0x63b/0x691
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff88547dd8>] :gfs2:do_promote+0xf5/0x137
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff8855124a>] :gfs2:gfs2_write_begin+0x16c/0x339
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff88552a83>] :gfs2:gfs2_file_buffered_write+0xf3/0x26c
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff88552e54>] :gfs2:__gfs2_file_aio_write_nolock+0x258/0x28f
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff88552ff6>] :gfs2:gfs2_file_write_nolock+0xaa/0x10f
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff8009fc08>] autoremove_wake_function+0x0/0x2e
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff8003f118>] vma_prio_tree_insert+0x20/0x38
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff8001cbcb>] vma_link+0xd0/0xfd
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff88553146>] :gfs2:gfs2_file_write+0x49/0xa7
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff8001691b>] vfs_write+0xce/0x174
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff800171d3>] sys_write+0x45/0x6e
Apr 21 23:06:43 cafe2 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Apr 21 23:06:43 cafe2 kernel: 
Apr 21 23:06:43 cafe2 kernel: GFS2: fsid=pantip:data1.2: gfs2_delete_inode: -5
Apr 21 23:06:43 cafe2 kernel: VFS:Filesystem freeze failed
Apr 21 23:07:33 cafe2 shutdown[8848]: shutting down for system reboot

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
