Thank you for the reply! As you can see from the config, ping-timeout is
not set, so the default is assumed. I have now started glusterfs with 8
threads on both server and client (autoscaling switched off).
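For reference, a fixed thread count in the io-threads volumes would look roughly like the fragment below. This is a sketch, not a copy of the running config: it assumes the 2.0.x io-threads option name 'thread-count', takes the value 8 from the paragraph above, and reuses the volume names from the volfiles quoted further down. The commented-out 'ping-timeout' line uses an illustrative value and only shows where that option would go in a protocol/client volume if it were set explicitly.

```
# Server side (glusterfs-server.vol): fixed pool instead of autoscaling.
# Assumption: 'thread-count' is the io-threads option name in 2.0.x.
volume brick
  type performance/io-threads
  option autoscaling off
  option thread-count 8
  subvolumes locks
end-volume

# Client side (glusterfs-client.vol): same change in the io-threads volume.
volume iothreads
  type performance/io-threads
  option autoscaling off
  option thread-count 8
  subvolumes writebehind
end-volume

# Where ping-timeout would go if it were not left at its default.
# The value 10 is illustrative only, not taken from the running config.
# volume weeber
#   type protocol/client
#   option transport-type tcp
#   option remote-host 192.168.1.252
#   option remote-subvolume brick
#   option ping-timeout 10
# end-volume
```

Whether leaving out 'option autoscaling' is equivalent to setting it to off depends on the version's default, so stating it explicitly avoids ambiguity when comparing test runs.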
Hardware:
*server1:*
lspci
00:00.0 Host bridge: Intel Corporation E7505 Memory Controller Hub (rev 03)
00:00.1 Class ff00: Intel Corporation E7505/E7205 Series RAS Controller
(rev 03)
00:01.0 PCI bridge: Intel Corporation E7505/E7205 PCI-to-AGP Bridge (rev 03)
00:02.0 PCI bridge: Intel Corporation E7505 Hub Interface B PCI-to-PCI
Bridge (rev 03)
00:02.1 Class ff00: Intel Corporation E7505 Hub Interface B PCI-to-PCI
Bridge RAS Controller (rev 03)
00:1d.0 USB Controller: Intel Corporation 82801DB/DBL/DBM
(ICH4/ICH4-L/ICH4-M) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801DB/DBL/DBM
(ICH4/ICH4-L/ICH4-M) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801DB/DBL/DBM
(ICH4/ICH4-L/ICH4-M) USB UHCI Controller #3 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 82)
00:1f.0 ISA bridge: Intel Corporation 82801DB/DBL (ICH4/ICH4-L) LPC
Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801DB (ICH4) IDE Controller
(rev 02)
00:1f.3 SMBus: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
SMBus Controller (rev 02)
02:1c.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04)
02:1d.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04)
02:1e.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04)
02:1f.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04)
03:01.0 RAID bus controller: Intel Corporation RAID Controller
04:02.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet
Controller (rev 02)
05:02.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
05:03.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet
Pro 100 (rev 0d)
cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Xeon(TM) CPU 2.40GHz
stepping : 5
cpu MHz : 2392.024
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pebs bts cid xtpr
bogomips : 4784.04
clflush size : 64
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Xeon(TM) CPU 2.40GHz
stepping : 5
cpu MHz : 2392.024
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
apicid : 1
initial apicid : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pebs bts cid xtpr
bogomips : 4784.16
clflush size : 64
power management:
*server2:*
lspci
00:00.0 Host bridge: Intel Corporation E7520 Memory Controller Hub (rev 0c)
00:00.1 Class ff00: Intel Corporation E7525/E7520 Error Reporting
Registers (rev 0c)
00:01.0 System peripheral: Intel Corporation E7520 DMA Controller (rev 0c)
00:02.0 PCI bridge: Intel Corporation E7525/E7520/E7320 PCI Express Port
A (rev 0c)
00:04.0 PCI bridge: Intel Corporation E7525/E7520 PCI Express Port B
(rev 0c)
00:05.0 PCI bridge: Intel Corporation E7520 PCI Express Port B1 (rev 0c)
00:06.0 PCI bridge: Intel Corporation E7520 PCI Express Port C (rev 0c)
00:1d.0 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB
UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB
UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB
UHCI Controller #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2
EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c2)
00:1f.0 ISA bridge: Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC
Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE
Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801EB/ER (ICH5/ICH5R) SMBus
Controller (rev 02)
01:00.0 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge
A (rev 09)
01:00.1 PIC: Intel Corporation 6700/6702PXH I/OxAPIC Interrupt
Controller A (rev 09)
01:00.2 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge
B (rev 09)
01:00.3 PIC: Intel Corporation 6700PXH I/OxAPIC Interrupt Controller B
(rev 09)
02:03.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X
Fusion-MPT SAS (rev 01)
02:05.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X
Fusion-MPT Dual Ultra320 SCSI (rev 08)
02:05.1 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X
Fusion-MPT Dual Ultra320 SCSI (rev 08)
03:01.0 I2O: LSI Logic / Symbios Logic MegaRAID (rev 01)
05:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8050 PCI-E
ASF Gigabit Ethernet Controller (rev 18)
07:04.0 Ethernet controller: Intel Corporation 82541GI Gigabit Ethernet
Controller (rev 05)
07:0c.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 2.80GHz
stepping : 1
cpu MHz : 2792.955
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall
nx lm constant_tsc pebs bts pni monitor ds_cpl cid cx16 xtpr
bogomips : 5590.46
clflush size : 64
cache_alignment : 128
address sizes : 36 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 2.80GHz
stepping : 1
cpu MHz : 2792.955
cache size : 1024 KB
physical id : 3
siblings : 2
core id : 0
cpu cores : 1
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall
nx lm constant_tsc pebs bts pni monitor ds_cpl cid cx16 xtpr
bogomips : 5586.06
clflush size : 64
cache_alignment : 128
address sizes : 36 bits physical, 48 bits virtual
power management:
processor : 2
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 2.80GHz
stepping : 1
cpu MHz : 2792.955
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall
nx lm constant_tsc pebs bts pni monitor ds_cpl cid cx16 xtpr
bogomips : 5586.02
clflush size : 64
cache_alignment : 128
address sizes : 36 bits physical, 48 bits virtual
power management:
processor : 3
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 2.80GHz
stepping : 1
cpu MHz : 2792.955
cache size : 1024 KB
physical id : 3
siblings : 2
core id : 0
cpu cores : 1
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall
nx lm constant_tsc pebs bts pni monitor ds_cpl cid cx16 xtpr
bogomips : 5586.05
clflush size : 64
cache_alignment : 128
address sizes : 36 bits physical, 48 bits virtual
power management:
jvanwanr...@chatventure.nl wrote:
Hi Maris,
Can you tell me something more about the hardware you use? In our
tests yesterday we had some trouble with very high load in conjunction
with autoscaling. You can try a fixed limit of threads. What are the
ping-timeout settings, by the way?
Best Regards Jasper
Jasper van Wanrooy - Chatventure BV
Technical Manager
T: +31 (0) 6 47 248 722
E: jvanwanr...@chatventure.nl
W: www.chatventure.nl
----- Original Message -----
From: "Maris Ruskulis" <ma...@chown.lv>
To: gluster-users@gluster.org
Sent: Friday, 29 May, 2009 10:11:45 GMT +01:00 Amsterdam / Berlin /
Bern / Rome / Stockholm / Vienna
Subject: Re: [Gluster-users] Glusterfs 2.0 hangs on high load
Is there a way to solve this issue?
Maris Ruskulis wrote:
I have the same issue with the same config when both nodes are x64. The
difference is that there are no bailout messages in the logs.
Jasper van Wanrooy - Chatventure wrote:
Hi Maris,
I regret to hear that. I was also having problems with the
stability on 32bit platforms. Possibly you should try it on a
64bit platform. Is that an option?
Best Regards Jasper
On 28 May 2009, at 09:36, Maris Ruskulis wrote:
Hello!
After upgrading to version 2.0 (now using 2.0.1), I'm
experiencing problems with glusterfs stability.
I'm running a 2-node setup with client-side afr, and
glusterfsd is also running on the same servers. From time to time
glusterfs just hangs; I can reproduce this by running the iozone
benchmarking tool. I'm using patched Fuse, but the result is
the same with unpatched.
================================================================================
Version : glusterfs 2.0.1 built on May 27 2009 16:04:01
TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
Starting Time: 2009-05-27 16:38:20
Command line : /usr/sbin/glusterfsd
--volfile=/etc/glusterfs/glusterfs-server.vol
--pid-file=/var/run/glusterfsd.pid
--log-file=/var/log/glusterfsd.log
PID : 31971
System name : Linux
Nodename : weeber.st-inst.lv
Kernel Release : 2.6.28-hardened-r7
Hardware Identifier: i686
Given volfile:
+------------------------------------------------------------------------------+
1: # file: /etc/glusterfs/glusterfs-server.vol
2: volume posix
3: type storage/posix
4: option directory /home/export
5: end-volume
6:
7: volume locks
8: type features/locks
9: option mandatory-locks on
10: subvolumes posix
11: end-volume
12:
13: volume brick
14: type performance/io-threads
15: option autoscaling on
16: subvolumes locks
17: end-volume
18:
19: volume server
20: type protocol/server
21: option transport-type tcp
22: option auth.addr.brick.allow 127.0.0.1,192.168.1.*
23: subvolumes brick
24: end-volume
+------------------------------------------------------------------------------+
[2009-05-27 16:38:20] N [glusterfsd.c:1152:main]
glusterfs: Successfully started
[2009-05-27 16:38:33] N
[server-protocol.c:7035:mop_setvolume] server: accepted
client from 192.168.1.233:1021
[2009-05-27 16:38:33] N
[server-protocol.c:7035:mop_setvolume] server: accepted
client from 192.168.1.233:1020
[2009-05-27 16:38:46] N
[server-protocol.c:7035:mop_setvolume] server: accepted
client from 192.168.1.252:1021
[2009-05-27 16:38:46] N
[server-protocol.c:7035:mop_setvolume] server: accepted
client from 192.168.1.252:1020
================================================================================
Version : glusterfs 2.0.1 built on May 27 2009 16:04:01
TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
Starting Time: 2009-05-27 16:38:46
Command line : /usr/sbin/glusterfs -N -f
/etc/glusterfs/glusterfs-client.vol /mnt/gluster
PID : 32161
System name : Linux
Nodename : weeber.st-inst.lv
Kernel Release : 2.6.28-hardened-r7
Hardware Identifier: i686
Given volfile:
+------------------------------------------------------------------------------+
1: volume xeon
2: type protocol/client
3: option transport-type tcp
4: option remote-host 192.168.1.233
5: option remote-subvolume brick
6: end-volume
7:
8: volume weeber
9: type protocol/client
10: option transport-type tcp
11: option remote-host 192.168.1.252
12: option remote-subvolume brick
13: end-volume
14:
15: volume replicate
16: type cluster/replicate
17: subvolumes xeon weeber
18: end-volume
20: volume readahead
21: type performance/read-ahead
22: option page-size 128kB
23: option page-count 16
24: option force-atime-update off
25: subvolumes replicate
26: end-volume
27:
28: volume writebehind
29: type performance/write-behind
30: option aggregate-size 1MB
31: option window-size 3MB
32: option flush-behind on
33: option enable-O_SYNC on
34: subvolumes readahead
35: end-volume
36:
37: volume iothreads
38: type performance/io-threads
39: option autoscaling on
40: subvolumes writebehind
41: end-volume
42:
43:
44:
45: #volume bricks
46: #type cluster/distribute
47: #option lookup-unhashed yes
48: #option min-free-disk 20%
49: # subvolumes weeber xeon
50: #end-volume
+------------------------------------------------------------------------------+
[2009-05-27 16:38:46] W
[xlator.c:555:validate_xlator_volume_options] writebehind:
option 'window-size' is deprecated, preferred is
'cache-size', continuing with correction
[2009-05-27 16:38:46] W
[glusterfsd.c:455:_log_if_option_is_invalid] writebehind:
option 'aggregate-size' is not recognized
[2009-05-27 16:38:46] W
[glusterfsd.c:455:_log_if_option_is_invalid] readahead:
option 'page-size' is not recognized
[2009-05-27 16:38:46] N [glusterfsd.c:1152:main]
glusterfs: Successfully started
[2009-05-27 16:38:46] N
[client-protocol.c:5557:client_setvolume_cbk] xeon:
Connected to 192.168.1.233:6996, attached to remote volume
'brick'.
[2009-05-27 16:38:46] N [afr.c:2190:notify] replicate:
Subvolume 'xeon' came back up; going online.
[2009-05-27 16:38:46] N
[client-protocol.c:5557:client_setvolume_cbk] xeon:
Connected to 192.168.1.233:6996, attached to remote volume
'brick'.
[2009-05-27 16:38:46] N [afr.c:2190:notify] replicate:
Subvolume 'xeon' came back up; going online.
[2009-05-27 16:38:46] N
[client-protocol.c:5557:client_setvolume_cbk] weeber:
Connected to 192.168.1.252:6996, attached to remote volume
'brick'.
[2009-05-27 18:46:02] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-27 18:16:01. frame-timeout = 1800
[2009-05-27 19:16:09] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-27 18:46:02. frame-timeout = 1800
[2009-05-27 19:46:18] E [client-protocol.c:292:call_bail]
weeber: bailing out frame OPEN(12) frame sent = 2009-05-27
19:16:09. frame-timeout = 1800
[2009-05-27 20:16:25] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-27 19:46:18. frame-timeout = 1800
[2009-05-27 20:46:34] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-27 20:16:25. frame-timeout = 1800
[2009-05-27 21:16:41] E [client-protocol.c:292:call_bail]
weeber: bailing out frame OPEN(12) frame sent = 2009-05-27
20:46:34. frame-timeout = 1800
[2009-05-27 21:47:00] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-27 21:16:53. frame-timeout = 1800
[2009-05-27 22:17:07] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-27 21:47:00. frame-timeout = 1800
[2009-05-27 22:47:15] E [client-protocol.c:292:call_bail]
weeber: bailing out frame OPENDIR(21) frame sent =
2009-05-27 22:17:07. frame-timeout = 1800
[2009-05-27 23:17:23] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-27 22:47:15. frame-timeout = 1800
[2009-05-27 23:47:31] E [client-protocol.c:292:call_bail]
weeber: bailing out frame OPEN(12) frame sent = 2009-05-27
23:17:23. frame-timeout = 1800
[2009-05-28 00:17:39] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-27 23:47:32. frame-timeout = 1800
[2009-05-28 00:47:47] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-28 00:17:39. frame-timeout = 1800
[2009-05-28 01:17:55] E [client-protocol.c:292:call_bail]
weeber: bailing out frame OPENDIR(21) frame sent =
2009-05-28 00:47:47. frame-timeout = 1800
[2009-05-28 01:48:03] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-28 01:17:55. frame-timeout = 1800
[2009-05-28 02:18:11] E [client-protocol.c:292:call_bail]
weeber: bailing out frame OPEN(12) frame sent = 2009-05-28
01:48:03. frame-timeout = 1800
[2009-05-28 02:48:29] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-28 02:18:24. frame-timeout = 1800
[2009-05-28 03:18:37] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-28 02:48:29. frame-timeout = 1800
[2009-05-28 03:48:45] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-28 03:18:37. frame-timeout = 1800
[2009-05-28 04:18:53] E [client-protocol.c:292:call_bail]
weeber: bailing out frame XATTROP(40) frame sent =
2009-05-28 03:48:45. frame-timeout = 1800
[2009-05-28 04:49:01] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-28 04:18:53. frame-timeout = 1800
[2009-05-28 05:19:09] E [client-protocol.c:292:call_bail]
weeber: bailing out frame OPENDIR(21) frame sent =
2009-05-28 04:49:01. frame-timeout = 1800
[2009-05-28 05:49:17] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-28 05:19:09. frame-timeout = 1800
[2009-05-28 06:19:25] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-28 05:49:17. frame-timeout = 1800
[2009-05-28 06:49:33] E [client-protocol.c:292:call_bail]
weeber: bailing out frame XATTROP(40) frame sent =
2009-05-28 06:19:25. frame-timeout = 1800
[2009-05-28 07:19:40] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-28 06:49:33. frame-timeout = 1800
[2009-05-28 07:49:48] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-28 07:19:40. frame-timeout = 1800
[2009-05-28 08:19:56] E [client-protocol.c:292:call_bail]
weeber: bailing out frame LOOKUP(32) frame sent =
2009-05-28 07:49:48. frame-timeout = 1800
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users