Re: [Linux-cluster] mixing OS versions?

2014-03-28 Thread Alan Brown
On 28/03/14 19:31, Fabio M. Di Nitto wrote: Are there any known issues, guidelines, or recommendations for having a single RHCS cluster with different OS releases on the nodes? Only one answer: don't do it. It's not supported and it's only asking for trouble. Seconded. There are _substanti

Re: [Linux-cluster] unformat gfs2

2014-03-19 Thread Alan Brown
On 18/03/14 13:38, Mr.Pine wrote: I have accidentally reformatted a GFS cluster. We need to unformat it... is there any way to recover the disk? Backups?

Re: [Linux-cluster] gfs2 and quotas - system crash

2014-03-10 Thread Alan Brown
On 10/03/14 18:15, stephen.ran...@stfc.ac.uk wrote: Hello, When using gfs2 with quotas on a SAN that is providing storage to two clustered systems running CentOS6.5, As a matter of interest: how are you exporting the storage, or is this integral to the cluster itself?

Re: [Linux-cluster] gfs2 and quotas - system crash

2014-03-10 Thread Alan Brown
On 10/03/14 18:15, stephen.ran...@stfc.ac.uk wrote: Hello, When using gfs2 with quotas on a SAN that is providing storage to two clustered systems running CentOS6.5, one of the systems can crash. This crash appears to be caused when a user tries to add something to a SAN disk when they have exc

[Linux-cluster] NFS-ganesha - worthwhile replacement for kernel NFS?

2013-10-21 Thread Alan Brown
As anyone who's tried to use kernel NFS in a clustered environment knows, it's fraught with issues which risk severe data corruption. Has anyone tried using the userspace nfs-ganesha server? I'd be interested to hear how you got on.

[Linux-cluster] Qlogic caching SAN adaptor

2013-03-22 Thread Alan Brown
Qlogic have announced some new adaptors which look promising. The $64 million question: Will GFS play nice with these? http://www.theregister.co.uk/2013/03/21/fabriccache/

Re: [Linux-cluster] RHEL/CentOS-6 HA NFS Configuration Question

2012-09-05 Thread Alan Brown
On 05/09/12 15:59, Randy Zagar wrote: What I don't understand is what changed between RHEL-5 and RHEL-6 that has made HA NFS failover so difficult? HA NFS failover has always been difficult for a number of reasons mostly related to how abysmal the Linux NFS implementation is. I have been r

Re: [Linux-cluster] GFS2 and fragmentation

2012-08-08 Thread Alan Brown
On 08/08/12 13:50, Bob Peterson wrote: We currently don't have any plans for defrag tool for GFS2. In theory, you can always copy the data from an old file system to a new one using this new kernel code, and it should be less fragmented. Defragfs works as well as any other userland method: h

Re: [Linux-cluster] Invitation to connect on LinkedIn

2012-04-12 Thread Alan Brown
On 12/04/12 14:04, AK wrote: > Ah, the evils of mass invite. And the evils of LinkedIn in particular. The only way to stop getting invites is to set up a LinkedIn account yourself, and from that point you _cannot_ opt out of receiving mail from them from time to time. I regard them as spammers

Re: [Linux-cluster] caching of san devices....

2012-04-03 Thread Alan Brown
On 03/04/12 14:28, Steven Whitehouse wrote: Spinning disks are slow to seek, large arrays even more so. Large arrays should be much faster, provided the data is in cache. Or not, when there's a lot of random IO involved and it's not in cache. I'm talking about arrays such as the Nexsan ATAbeast

[Linux-cluster] caching of san devices....

2012-04-03 Thread Alan Brown
Real Dumb Question[tm] time: Has anyone tried putting bcache/flashcache in front of shared storage in a GFS2 cluster (on each node, of course)? Did it work? Should it work? Is it safe? Are there ways of making it safe? Am I mad for thinking about it? Rationale: Spinning disks are slow

Re: [Linux-cluster] Clustered LVM for storage

2012-03-09 Thread Alan Brown
On 08/03/12 22:59, Jeff Sturm wrote: The downside of partitions is they aren't easy to change. You can add them safely while the storage array is in use, but each host needs to reload the partition table when you're done with changes before the new storage can be used, and that may not happe
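
For reference, the reload on each node usually looks something like this (a sketch only; the multipath device name is a placeholder, and which command applies depends on whether the hosts use kpartx-generated partition maps):

   partprobe /dev/mapper/mpath0      # ask the kernel to re-read the partition table
   kpartx -u /dev/mapper/mpath0      # or: refresh the device-mapper partition maps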

Re: [Linux-cluster] Converting from dlm to lock_nolock?

2012-02-09 Thread Alan Brown
On 09/02/12 15:14, Ray Van Dolson wrote: I'm exploring some options for speeding that up -- the main one being dropping my cluster to only one node. Is this doable for a file system that was created with the dlm lock manager instead of lock_nolock? Yes. You can force the use of lock_nolock in
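
For reference, the usual ways to do this (a sketch only; the device and mount point are placeholders, not taken from Ray's setup):

   # one-off: override the on-disk lock protocol at mount time
   mount -o lockproto=lock_nolock /dev/myvg/mygfs2 /mnt/gfs2

   # permanent: rewrite the superblock (filesystem must be unmounted everywhere first)
   gfs2_tool sb /dev/myvg/mygfs2 proto lock_nolock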

Re: [Linux-cluster] Preventing clvmd timeouts

2012-01-26 Thread Alan Brown
On 26/01/12 16:05, Digimer wrote: Is anyone actually using DRBD for serious cluster implementations (ie, production systems) or is it just being used for hobby/test rigs? I use it rather extensively in production. I use it to back clustered LVM-backed virtual machines and GFS2 partitions. I st

Re: [Linux-cluster] Preventing clvmd timeouts

2012-01-26 Thread Alan Brown
On 26/01/12 14:41, Digimer wrote: As for qdisk, you can't use it on DRBD, only on a SAN (as it is possible to have a split-brain condition where both nodes go StandAlone and Primary, thus allowing both nodes to think they have the qdisk vote). Is anyone actually using DRBD for serious cluster

Re: [Linux-cluster] centos5 to RHEL6 migration

2012-01-09 Thread Alan Brown
On 09/01/12 13:34, Fabio M. Di Nitto wrote: Something i forgot to mention in the other email, is that for example, you can just move the LUNs from your SAN from one cluster to another assuming you are running GFS2 and that will work. And assuming that you have 2 clusters. This might be a possi

Re: [Linux-cluster] rhel 6.2 network bonding interface in cluster environment

2012-01-09 Thread Alan Brown
On 09/01/12 13:33, Rajagopal Swaminathan wrote: Switches used for this purpose are best completely isolated from the rest of the network and multicast traffic control should be DISABLED. I distinctly remember asking the network guys for multicast mode to be on for the heartbeat network (for the c

Re: [Linux-cluster] rhel 6.2 network bonding interface in cluster environment

2012-01-09 Thread Alan Brown
On 09/01/12 13:23, SATHYA - IT wrote: Alan, Corosync (heartbeat) network is not connected to switch. The network is connected between server to server directly. See my comment about direct hookups. My experience is that they are prone to playing up for no apparent reason (NICs simply aren't d

Re: [Linux-cluster] centos5 to RHEL6 migration

2012-01-09 Thread Alan Brown
On 09/01/12 09:36, Fabio M. Di Nitto wrote: RH's advice to us is to "Big Bang" it. It's not really advice, as RH does not officially support this upgrade method. Indeed, but scheduling downtime in a 24*7*365.25 operation like space science FTP servers is tricky. (1: You can't please e

Re: [Linux-cluster] rhel 6.2 network bonding interface in cluster environment

2012-01-09 Thread Alan Brown
On 09/01/12 05:24, Digimer wrote: With both of the bond's NICs down, the bond itself is going to drop. Odds are, both NICs are plugged into the same switch. (assuming the OP isn't running things plugged NIC-to-NIC - which I have found in the past tends to be flaky when N-way negotiation becom

Re: [Linux-cluster] centos5 to RHEL6 migration

2012-01-09 Thread Alan Brown
On 09/01/12 04:51, Digimer wrote: > Alternatively, use some spare machines to mock-up the current cluster > and then test-upgrade. It might work flawlessly, I genuinely don't know. Test setups aren't always a good metric. Everything worked fine on our last changeover until we put real-world load

Re: [Linux-cluster] centos5 to RHEL6 migration

2012-01-09 Thread Alan Brown
On 09/01/12 02:38, Digimer wrote: > Technically yes, practically no. Or rather, not without a lot of > testing first. This is "rather a shame". I have a similar requirement (EL5 -> EL6 with GFS) > There may be some other things you need to do as well. Please be sure > to do proper testing an

Re: [Linux-cluster] error messages while use fsck.gfs2

2011-11-16 Thread Alan Brown
Steven Whitehouse wrote: Well, can't we (the Red Hat/CentOS fanboys) expect a critical clustered filesystem like GFS2 (which supports over 16TB on 64-bit systems at least) to take a leaf or two from ZFS on this issue? I'm not quite sure which feature you are suggesting that we take, but I

Re: [Linux-cluster] error messages while use fsck.gfs2

2011-11-16 Thread Alan Brown
On Wed, 16 Nov 2011, Steven Whitehouse wrote: > The problem is the blocks following that, such as the master directory > which contains all the system files. If enough of that has been > destroyed, it would make it very tricky to reconstruct. Even so it might > be possible depending on exactly whi

Re: [Linux-cluster] error messages while use fsck.gfs2

2011-11-16 Thread Alan Brown
Bob Peterson wrote: I've taken a close look at the image file you created. This appears to be a normal, everyday GFS2 file system except there is a section of 16 blocks (or 0x10 in hex) that are completely destroyed near the beginning of the file system, right after the root directory. Unfortuna

Re: [Linux-cluster] Ext3/ext4 in a clustered environement

2011-11-09 Thread Alan Brown
Steven Whitehouse wrote: We see appreciable knee points in GFS directory performance at 512, 4096 and 16384 files/directory, with progressively worse performance deterioration between each knee pair. (It's a 2^n type problem) That is a bit strange. The GFS2 directory entries are sized accord

Re: [Linux-cluster] Ext3/ext4 in a clustered environement

2011-11-09 Thread Alan Brown
Nicolas Ross wrote: Get me right, there are millions of files, but no more than a few hundred per directory. They are spread out, split on the database id, 2 characters at a time. So a file named 1234567.jpg would end up in a directory 12/34/5/, or something similar. OK, the way you wrote it
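
A minimal sketch of that layout in shell, assuming a numeric database id split two characters at a time (the id and filename here are made up for illustration):

   id=1234567
   dir="${id:0:2}/${id:2:2}/${id:4:1}"      # 1234567 -> 12/34/5
   mkdir -p "$dir" && mv "${id}.jpg" "$dir/"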

Re: [Linux-cluster] Ext3/ext4 in a clustered environement

2011-11-07 Thread Alan Brown
Nicolas Ross wrote: On some services, there are document directories that are huge, not that much in size (about 35 gigs), but in number of files, around one million. One service even has 3 data directories with that many files each. You are utterly mad. Apart from the human readability aspe

Re: [Linux-cluster] Lost connection to storage - what happens?

2011-09-26 Thread Alan Brown
Laszlo Beres wrote: Hi, just a theoretical question: let's assume we have a cluster with GFS2 filesystem (not as a managed resource). What happens exactly if all paths to backend device get lost? GFS2 withdraws that filesystem and you'll have to reboot all the withdrawn machines to get it ba

Re: [Linux-cluster] Options other than reboot to stop DP processes thatcan't be killed -9

2011-08-22 Thread Alan Brown
Colin Simpson wrote: Probably not a cluster issue just pure kernel question. Sounds like the driver or device is locked up and the driver or device is confused, so the processes attached to it will be hung. A common problem in a fabric environment is that there are 2+ paths to the tapes (ie,

Re: [Linux-cluster] NFS Serving Issues

2011-08-17 Thread Alan Brown
Colin Simpson wrote: When the service is stopped I get a "Stale NFS file handle" from mounted filesystems accessing the NFS mount point at those times. I.e. if I have a copy going I get, on the service being disabled: That's normal if an NFS server mount is unexported or nfsd shuts down. It _s

Re: [Linux-cluster] EFI in CLVM

2011-08-12 Thread Alan Brown
On 12/08/2011 17:24, Paras pradhan wrote: Does it mean that I don't need mpath0p1 ? If its the case i don't need to run kpartx on mpath0? You still need kpartx, but that's a bit clunky anyway. Let dm-multipath take care of all that for you. (The last time I used kpartx and friends was 2003.

Re: [Linux-cluster] EFI in CLVM

2011-08-12 Thread Alan Brown
On 12/08/2011 16:14, Paras pradhan wrote: If the entire LUN is a PV then you don't need to partition it. You mean don't use parted (or anything similar) and directly proceed to pvcreate? Correct.
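
I.e., something along these lines, assuming the LUN shows up as a multipath device (the device and VG names are placeholders):

   pvcreate /dev/mapper/mpath0                  # the whole LUN becomes the PV, no partition table
   vgcreate --clustered y myclustervg /dev/mapper/mpath0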

Re: [Linux-cluster] EFI in CLVM

2011-08-12 Thread Alan Brown
Paras pradhan wrote: Hi, I have a 2199GB LUN assigned to my 3-node cluster. Since it's >2TB, I used parted to create the EFI GPT partition. After that pvcreate and vgcreate were successful but I get the following error when doing lvcreate. If the entire LUN is a PV then you don't need to part

Re: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernelNULL pointer deference

2011-07-14 Thread Alan Brown
> Maybe I should try that again but the only way I know to get a kdump is to set a large fence delay. This is what I'd expect. We also found the fence delay has to be long enough to allow the crashdump to be written out. The only alternatives to speed this up are to use _very_ fast disk for
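
One way to buy that time is to delay fencing after a failure via the fence daemon settings in cluster.conf; a sketch only - the 300-second value is arbitrary and has to be tuned to how long the dump actually takes to write:

   <fence_daemon post_fail_delay="300" post_join_delay="30"/>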

Re: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference

2011-07-11 Thread Alan Brown
On 08/07/11 22:09, J. Bruce Fields wrote: With default mount options, the Linux NFS client (like most NFS clients) assumes that a file has at most one writer at a time. (Applications that need to do write-sharing over NFS need to use file locking.) The problem is that file locking on V3 isn't

Re: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference

2011-07-08 Thread Alan Brown
Colin Simpson wrote: But I guess you are also telling me that file locking between the two wouldn't be helping here either? Correct. NFSd (v2/3) doesn't pass client locks to the filesystem, nor does it respect locks set by other processes. It has a number of other foibles - try setting up

Re: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference

2011-07-08 Thread Alan Brown
On Fri, 8 Jul 2011, Colin Simpson wrote: > That's not ideal either when Samba isn't too happy working over NFS, and > that is not recommended by the Samba people as being a sensible config. I know but there's a real (and demonstrable) risk of data corruption for NFS vs _anything_ if NFS clients a

Re: [Linux-cluster] GFS2 + NFS crash BUG: Unable to handle kernel NULL pointer deference

2011-07-08 Thread Alan Brown
On Fri, 8 Jul 2011, Steven Whitehouse wrote: > Currently we don't recommend using NFS on a GFS2 filesystem which is > also being used locally. After much dealing with NFS internals, I would recommend NOT using it on any filesystem where the files are accessed locally. NFSv2/3 doesn't play nice w

Re: [Linux-cluster] gfs mount at boot

2011-06-09 Thread Alan Brown
On 09/06/11 15:46, Budai Laszlo wrote: Hi, What should be done in order to mount a gfs file system at boot? I've created the following line in /etc/fstab: /dev/clvg/gfsvol /mnt/testgfs gfs defaults 0 0 but it is not mounting the fs at boot. If I run "mount -a" then
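
On RHEL/CentOS 5 the fstab entry alone isn't enough: clustered GFS/GFS2 mounts are handled by their init scripts once the cluster stack is up, not by the normal boot-time mount pass. A sketch, assuming stock EL5 service names:

   chkconfig cman on
   chkconfig clvmd on
   chkconfig gfs on       # use the gfs2 init script instead for GFS2 filesystems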

Re: [Linux-cluster] defragmentation.....

2011-06-02 Thread Alan Brown
Alan Brown wrote: This is interesting too. note the variation in extents (the file is a piece of marketing fluff, name is unimportant) I'm getting the same thing in sarch01 and that's mounted read-only by the clients - there's zero write activity going on.

Re: [Linux-cluster] defragmentation.....

2011-06-02 Thread Alan Brown
This is interesting too. Note the variation in extents (the file is a piece of marketing fluff, the name is unimportant).
$ df -h .
Filesystem                               Size  Used Avail Use% Mounted on
/dev/mapper/VolGroupBeast03-LogVolUser1  250G  113G  138G  45% /stage/user1
$ ls -l SUMO-SA

Re: [Linux-cluster] defragmentation.....

2011-06-02 Thread Alan Brown
Steven Whitehouse wrote: The thing to check is what size the extents are... filefrag doesn't show this. the on-disk layout is designed so that you should have a metadata block separating each data extent at exactly the place where we would need to read a new metadata block in order to contin

[Linux-cluster] defragmentation.....

2011-06-02 Thread Alan Brown
GFS2 seems horribly prone to fragmentation. I have a filesystem which has been written to once (data archive, migrated from a GFS1 filesystem to a clean GFS2 fs) and a lot of the files are composed of hundreds of extents - most of these are only 1-2Mb so this is a bit over the top and it badl

Re: [Linux-cluster] quorum dissolved but resources are still alive

2011-05-31 Thread Alan Brown
Digimer wrote: With a two-node, quorum is effectively useless, as a single node is allowed to continue. That's what qdiskd is for. It's also useful in larger clusters. Also, without proper fencing, things will not fail properly. This means that you are in somewhat of an undefined area. Un

Re: [Linux-cluster] |Optimizing DLM Speed

2011-05-18 Thread Alan Brown
Steven Whitehouse wrote: Hi, On Wed, 2011-05-18 at 16:14 +0100, Alan Brown wrote: Bob, Steve, Dave, Is there any progress on tuning the size of the tables (RHEL5) to allow larger values and see if they help things as far as caching goes? There is a bz open, I thought so, but I can't

[Linux-cluster] |Optimizing DLM Speed

2011-05-18 Thread Alan Brown
Bob, Steve, Dave, Is there any progress on tuning the size of the tables (RHEL5) to allow larger values and see if they help things as far as caching goes? It would be advantageous to tweak the dentry limits too - the kernel limits this to 10% and attempts to increase are throttled back. Th

Re: [Linux-cluster] Write Performance Issues with GFS2

2011-05-14 Thread Alan Brown
On 13/05/11 23:21, Bob Peterson wrote: | Steve/Bob, how about opening this one up for public view? Sounds okay to me. Not sure how that's done, and not sure if I have the right authority in bugzilla to do it. I'm not entirely sure either but as the creator I think all you have to do is unch

Re: [Linux-cluster] Write Performance Issues with GFS2

2011-05-13 Thread Alan Brown
On 12/05/11 00:32, Ramiro Blanco wrote: https://bugzilla.redhat.com/show_bug.cgi?id=683155 Can't access that one: "You are not authorized to access bug #683155" There's no reason this bug should be private, however it's addressed in test kernel kernel-2.6.18-248.el5 Steve/Bob, how about op

Re: [Linux-cluster] Solution for HPC

2011-04-20 Thread Alan Brown
Gordan Bobic wrote: There is no such thing - period. On any OS. If your application is single-process/single-thread, it will only scale vertically. _If_ the problem is pleasantly or embarrassingly parallel, a shell script can be used to run many parallel invocations. How to do that is offtop

Re: [Linux-cluster] >1500 MTU on EL5 causes things to go sideways

2011-04-12 Thread Alan Brown
Digimer wrote: As soon as I define MTU=2000 (for example), then cman on one node will start but not stop (the other node stops fine). Also, 'ccs_tool update /etc/cluster/cluster.conf' fails with: Have you configured the interfaces themselves to use jumbo frames? Does the switch support jumb
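
For reference, a hedged sketch of what jumbo frames usually need on EL5/6 (interface names and the peer address are placeholders; every device in the path, including the switch, must accept the larger frames):

   # /etc/sysconfig/network-scripts/ifcfg-eth2  (and any bond slaves)
   MTU=9000

   # verify the path really passes jumbo frames without fragmentation
   ping -M do -s 8972 192.168.10.2     # 9000 bytes minus 28 bytes of IP+ICMP header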

Re: [Linux-cluster] GFS2 cluster node is running very slow

2011-03-31 Thread Alan Brown
David Hill wrote: These directories are all on the same mount ... with a total size of 1.2TB! I _strongly_ suggest you set up one filesystem per directory. All files accessed by the application are within its own folder/subdirectory. No file is ever accessed by more than one node. That wil

[Linux-cluster] Current stable(ish) EL test kernel?

2011-03-31 Thread Alan Brown
Bob, Steve et al, Which EL test kernel post 2.6.18-247 is stable enough for use in a production system for a few days? I'm seeing massive slowdowns on lots of 2-100Mb writes (someone's mirroring an FTP archive) and want to see if the .247 write speedups Bob mentioned 3 weeks back will help.

Re: [Linux-cluster] GFS2 cluster node is running very slow

2011-03-31 Thread Alan Brown
David Hill wrote: Hi Steve, We seem to be experiencing some new issues now... With 4 nodes, only one is slow but with 3 nodes, 2 of them are now slow. 2 nodes are doing 20k/s and one is doing 2mb/s ... Seems like all nodes will end up with poor performance. All nodes are locking fil

Re: [Linux-cluster] Attaching a service to a specific interface

2011-03-31 Thread Alan Brown
carlopmart wrote: Hi all, I have two rhel6.0 cluster nodes with five nic interfaces in each one. Actually, I have one free interface without an IP in each one. Can I assign a cluster service to this interface (service consists in one IP and one script)?? Yes, but you will need to c

Re: [Linux-cluster] gfs2_quotad:2498 blocked

2011-03-25 Thread Alan Brown
Nicolas Ross wrote: It was a large, very large directory, with somewhere near one million small files, so the rsync took something like 3 to 4 hours. At some point, all nodes' consoles displayed this: gfs2_quotad:2498 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hang_task_ti
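
The knob that message refers to is the kernel's hung-task watchdog; it can be silenced (this only hides the warning, it does not fix the underlying stall) with:

   echo 0 > /proc/sys/kernel/hung_task_timeout_secs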

Re: [Linux-cluster] GFS volume locks during cluster node join/leave

2011-03-23 Thread Alan Brown
to reduce data corruption risk. Martijn On Fri, Mar 18, 2011 at 2:18 PM, Alan Brown wrote: Martijn Storck wrote: Is this expected behaviour? Yes. Is there anything we can do to reduce these delays? Unmount all clustered filesystems on the host before rebooting. AB

Re: [Linux-cluster] GFS volume locks during cluster node join/leave

2011-03-18 Thread Alan Brown
Martijn Storck wrote: Is this expected behaviour? Yes. Is there anything we can do to reduce these delays? Unmount all clustered filesystems on the host before rebooting. AB

Re: [Linux-cluster] GFS2 file system maintenance question.

2011-03-15 Thread Alan Brown
Jack Duston wrote: > Thanks Yue, but your information would seem dated if this site is correct: > > http://www.redhat.com/rhel/compare > > Even if 100TB is what's officially supported in RHEL6, it doesn't mean > that larger file systems won't work. Anyone considering such large filesystems shoul

[Linux-cluster] Resource groups

2011-03-14 Thread Alan Brown
Bob: You say this in your best practice document: "Our performance testing lab has experimented with various resource group sizes and found a performance problem with anything bigger than 768MB. Until this is properly diagnosed, we recommend staying below 768MB." What are the details? Nearly
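
The resource group size is fixed when the filesystem is created, so this is a mkfs-time decision; a sketch only, with placeholder cluster/filesystem names and a value chosen to stay under the 768MB figure quoted above:

   mkfs.gfs2 -p lock_dlm -t mycluster:archive01 -j 3 -r 512 /dev/clvg/archive01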

Re: [Linux-cluster] which is better gfs2 and ocfs2?

2011-03-12 Thread Alan Brown
On 12/03/11 23:13, Bob Peterson wrote: Agreed. We're abundantly aware of the performance problems, and we're not ignoring them. I know Bob, thanks. (1) We recently found and fixed a problem that caused the dlm to pass locking traffic much slower than possible. Is this rolled into 2.6.

Re: [Linux-cluster] which is better gfs2 and ocfs2?

2011-03-12 Thread Alan Brown
I missed something: On 12/03/11 17:46, Jeff Sturm wrote: As an example, while running a "du" command on my GFS mount point, I observed the Ethernet traffic peak:
12:20:33 PM   IFACE   rxpck/s   txpck/s   rxbyt/s   txbyt/s   rxcmp/s   txcmp/s   rxmcst/s
12:20:38 PM    eth0   3517

Re: [Linux-cluster] which is better gfs2 and ocfs2?

2011-03-12 Thread Alan Brown
On 12/03/11 17:46, Jeff Sturm wrote:
[root@cluster1 76]# ls | wc -l
1970
The key is that only a few locks are needed to list the directory: You assume NFS clients are simply using "ls". Running "ls -l" on the same directory takes a bit longer (by a factor of about 20): Or

Re: [Linux-cluster] What is the proper procedure to reboot a node in a cluster?

2011-03-11 Thread Alan Brown
The only reliable way I have found (rhel4 and 5) is this:
1: Migrate all services off the node.
2: Unmount as many GFS disks as possible.
3: Power cycle the node.
The other nodes will recover quickly. "cman leave (remove) (force)" sometimes works but often doesn't.
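
A hedged sketch of the same sequence using the usual EL5 commands (service, mount point and node names are placeholders):

   clusvcadm -r <service> -m <other-node>    # 1: relocate each service off this node
   umount /gfs/archive01                     # 2: unmount whatever GFS mounts you can
   fence_node <this-node>                    # 3: power-cycle it from another node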

Re: [Linux-cluster] which is better gfs2 and ocfs2?

2011-03-11 Thread Alan Brown
On 09/03/11 14:13, yue wrote: which is better, gfs2 or ocfs2? i want to share a fc-san, do you know which is better? "That depends" - it is highly dependent on the type of disk activity you are performing. There are various reviews of both FSes circulating. Personal observation: GFS and GFS2

Re: [Linux-cluster] clvmd hangs on startup

2011-03-11 Thread Alan Brown
On 08/03/11 17:11, Valeriu Mutu wrote: Hi, I think the problem is solved. I was using a 9000-byte MTU on the Xen virtual machines' iSCSI interface. Switching back to a 1500-byte MTU caused the clvmd to start working. As long as everything on the network is set to 9000 bytes then you should be ok. RH'

Re: [Linux-cluster] How fast can rsync be on GFS2?

2011-03-01 Thread Alan Brown
Nikola Savic wrote: Rsync is very slow in creating the file list, a little faster than 100 files/s. That's about what I see too. Ditto on reading.

Re: [Linux-cluster] Fsck on GFS2

2011-02-27 Thread Alan Brown
On 25/02/11 13:43, Bob Peterson wrote: All of the fixes going into RHEL5.7 are in that version, and it is faster and more accurate than the version shipped with RHEL5.6. Will it be backported to 5.6?

Re: [Linux-cluster] optimising DLM speed?

2011-02-27 Thread Alan Brown
On 24/02/11 17:50, Steven Whitehouse wrote: Depending on the exact mix of I/O, that is expected behaviour. That is why it is so important to look at what can be done at the application layer to mitigate such problems. This is an academic environment. Telling users to adjust the way they do t

Re: [Linux-cluster] High DLM CPU usage - low GFS/iSCSI performance

2011-02-25 Thread Alan Brown
On 25/02/11 07:21, Martijn Storck wrote: Thanks for your message. Somehow the issue has not returned since yesterday when I applied some tuning to our GFS, specifically:
glock_purge 50
demote_secs 100
scand_secs 5
statfs_fast 1
It's most likely the biggest contributor is the statfs_fast se
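
For reference, those are GFS1 per-mountpoint tunables normally applied with gfs_tool settune; a sketch with a placeholder mount point (they reset on remount, so they usually end up in an init or rc.local script):

   gfs_tool settune /gfs/myfs glock_purge 50
   gfs_tool settune /gfs/myfs demote_secs 100
   gfs_tool settune /gfs/myfs scand_secs 5
   gfs_tool settune /gfs/myfs statfs_fast 1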

Re: [Linux-cluster] optimising DLM speed?

2011-02-24 Thread Alan Brown
On 24/02/11 22:40, Scooter Morris wrote: Hi all. After two tries, we've modified our cluster so that all nodes have increased their dlm hash table sizes to 1024. Initially, I put the echos in /etc/init.d/gfs2, but it turns out that /etc/init.d/gfs2 is sort of a no-op: /etc/init.d/netfs mounts

Re: [Linux-cluster] optimising DLM speed?

2011-02-24 Thread Alan Brown
Steven Whitehouse wrote: As soon as you mix creation/deletion on one node with accesses (of whatever kind) from other nodes, you run this risk. _ALL_ the GFS2 filesystems (bar one 5Gb one for common config files, etc) are mounted one-node-only. _ALL_ the GFS2 filesystems (with the same exc

[Linux-cluster] bug: nfsclient.sh

2011-02-24 Thread Alan Brown
If multiple NFS services are defined, a race condition exists with parallel invocations of /usr/share/cluster/nfsclient.sh: exportfs in add/remove mode reads the existing exports in from the kernel (or etab/xtab), applies the command and then writes a _complete_ export list back to the kernel, not
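
One possible mitigation (not the stock resource agent's behaviour, just a sketch) is to serialize every exportfs call through a single lock file so that parallel agents cannot interleave their read-modify-write of the export list:

   # wrap each exportfs invocation in an exclusive lock (paths and options are illustrative)
   flock /var/lock/cluster-exportfs.lock \
       exportfs -o rw,sync client.example.com:/gfs/export01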

Re: [Linux-cluster] optimising DLM speed?

2011-02-24 Thread Alan Brown
Steven Whitehouse wrote: That doesn't sound like it is related to a DLM issue. 150 entries is not a lot. It isn't, but when the machine's being hammered by requests in other filesystems, things can get very slow, very quickly. What do you mean by "access" in this case? Just looking up a si

Re: [Linux-cluster] optimising DLM speed?

2011-02-23 Thread Alan Brown
After running several days with the larger table sizes I don't think it's made any difference to individual thread performance or overall throughput. Likewise, the following changes have had no effect on access time for large directories (but they have improved caching and improved high load

Re: [Linux-cluster] optimising DLM speed?

2011-02-17 Thread Alan Brown
David Teigland wrote: Don't change the buffer size, but I'd increase all the hash table sizes to 4096 and see if anything changes. echo "4096" > /sys/kernel/config/dlm/cluster/rsbtbl_size echo "4096" > /sys/kernel/config/dlm/cluster/lkbtbl_size echo "4096" > /sys/kernel/config/dlm/cluster/dirtb
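
The full set of DLM hash-table knobs lives under the same configfs directory and, as far as I know, is read when a lockspace is created, so the values need to be written before the GFS2 filesystems are mounted:

   echo "4096" > /sys/kernel/config/dlm/cluster/rsbtbl_size
   echo "4096" > /sys/kernel/config/dlm/cluster/lkbtbl_size
   echo "4096" > /sys/kernel/config/dlm/cluster/dirtbl_size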

Re: [Linux-cluster] optimising DLM speed?

2011-02-16 Thread Alan Brown
> Yes, ls -l will always take longer because it is not just accessing the directory, but also every inode in the directory. As a result the I/O pattern will generally be poor. I know and accept that. It's common to most filesystems but the access time is particularly pronounced with GFS2 (pres

Re: [Linux-cluster] optimising DLM speed?

2011-02-16 Thread Alan Brown
> For the GFS2 glocks, that doesn't matter - all of the glocks are held in a single hash table no matter how many filesystems there are. Given nearly 4 million glocks currently on one of the boxes in a quiet state (and nearly 6 million if everything was on one node), is the existing hash table

Re: [Linux-cluster] optimising DLM speed?

2011-02-16 Thread Alan Brown
> Directories of the size (number of entries) which you have indicated should not be causing a problem as lookup should still be quite fast at that scale. Perhaps, but even so 4000-file directories usually take over a minute to "ls -l", while 85k-file directories take 5 mins (20-40 mins on a ba

Re: [Linux-cluster] optimising DLM speed?

2011-02-16 Thread Alan Brown
> A faster way to just grab lock numbers is to grep for gfs2 in /proc/slabinfo as that will show how many are allocated at any one time. True, but it doesn't show how many are used per fs. FWIW, here are current stats on each cluster node (it's evening and lightly loaded) gfs2_quotad

Re: [Linux-cluster] optimising DLM speed?

2011-02-16 Thread Alan Brown
Steve: To add some interest (and give you numbers to work with as far as dlm config tuning goes), here is a selection of real-world lock figures from our file cluster (cat $d | wc -l):
/sys/kernel/debug/dlm/WwwHome-gfs2_locks 162299 (webserver exports)
/sys/kernel/debug/dlm/soft2-gfs2_locks

Re: [Linux-cluster] optimising DLM speed?

2011-02-16 Thread Alan Brown
> You can set it via the configfs interface: Given 24GB of RAM, 100 filesystems, several hundred million files and the usual user habit of trying to put 100k files in a directory: Is 24GB enough or should I add more memory? (96GB is easy, beyond that is harder) What would you consider safe

Re: [Linux-cluster] optimising DLM speed?

2011-02-16 Thread Alan Brown
> There is a config option to increase the resource table size though, so perhaps you could try that? ..details?

Re: [Linux-cluster] Linux-cluster Digest, Vol 82, Issue 20

2011-02-15 Thread Alan Brown
I'm seeing heartbeat/lock LAN traffic peak out at about 120kb/s and 4000pps per node at the moment. Clearly the switch isn't the problem - and using hardware accelerated igb devices I'm pretty sure the networking's fine too. During the actual workload, or just during the ping pong test? Duri

[Linux-cluster] QLA2xxx tagged queue bug.

2011-02-15 Thread Alan Brown
I'm documenting this in case anyone else gets bitten. (This is supposed to have been fixed since October, but we encountered it in the last few days on RHEL5.6 - either it's not fully fixed or the patch has fallen out of the production kernel.) We kept getting GFS and GFS2 filesystems mysteriou

Re: [Linux-cluster] optimising DLM speed?

2011-02-15 Thread Alan Brown
> It would be really interesting how long the described backup takes when the gfs2 filesystem is mounted exclusively on one node without locking. The 2-million-inode system backs up in about 30 minutes when mounted lock_nolock (a zero-file incremental backup using Bacula). > For me it looks like

Re: [Linux-cluster] optimising DLM speed?

2011-02-15 Thread Alan Brown
The setup described is all on RHEL5.6. Fileserver filesystems are each mounted on one cluster node only (scattered across nodes) and then NFS exported as individual services for portability. (That exposed a major race condition with exportfs as it's not parallel aware in any way, shape or fo

[Linux-cluster] optimising DLM speed?

2011-02-15 Thread Alan Brown
After lots of headbanging, I'm slowly realising that limits on GFS2 lock rates and totem message passing appear to be the main inhibitor of cluster performance. Even on disks which are only mounted on one node (using lock_dlm), the ping_pong rate is - quite frankly - appalling, at about 500