Re: [Linux-cluster] NTP time steps causes cluster reconfiguration

2010-07-16 Thread Steven Dake
On 07/16/2010 06:18 AM, Martin Waite wrote: Hi, During testing, I noticed that a time step caused by ntpd caused the cluster to drop into GATHER state:
Jun 16 12:13:16 cp1edidbm001 ntpd[30917]: time reset -16.332117 s
Jun 16 12:13:26 cp1edidbm001 openais[15929]: [TOTEM] entering GATHER state f

Re: [Linux-cluster] Stateful Samba\CTDB Failover

2010-07-16 Thread Christopher R. Hertel
Are you running Samba with CTDB on a cluster? This question really belongs on the samba-technical mailing list since it probably has more to do with Samba's state management than the underlying cluster file system. Chris -)- Justin Shafer wrote: > I have read this on the mailing list.. > htt

[Linux-cluster] How do i stop VM's on a failed node?

2010-07-16 Thread Nathan Lager
I have a 4-node cluster, running KVM, and a host of VMs. One of the nodes failed unexpectedly. I'm having two issues. First, the VMs which were on that node are still reported as up and running on that node in clustat, and second, I'm unable to in

Re: [Linux-cluster] NTP time steps causes cluster reconfiguration

2010-07-16 Thread Kaloyan Kovachev
On Fri, 16 Jul 2010 16:11:38 +0100, "Martin Waite" wrote: > Hi, > > NTP has a step-threshold - if the time difference is greater than the > threshold, it will step the time rather than speeding it up or down. So > even using ntpd can cause clock steps (especially in our test > environment where

Re: [Linux-cluster] NTP time steps causes cluster reconfiguration

2010-07-16 Thread Martin Waite
Hi, NTP has a step-threshold - if the time difference is greater than the threshold, it will step the time rather than speeding it up or down. So even using ntpd can cause clock steps (especially in our test environment where our crappy overloaded NTP servers sometimes lose 30 seconds). On some

Re: [Linux-cluster] NTP time steps causes cluster reconfiguration

2010-07-16 Thread Kaloyan Kovachev
Hi, I can confirm that time steps do cause reconfiguration. I'm not sure if this was the reason, but one of my nodes was previously fenced from time to time after several reconfigurations, and it also caused some problems with GFS being withdrawn. ntpdate running as a cron job does step changes, but

Re: [Linux-cluster] Stateful Samba\CTDB Failover

2010-07-16 Thread Justin Shafer
Also, I'm not using the round-robin method for failover with DNS records; I'm using the LVS method. I don't think that matters. And I don't know if Corosync expects to use LVS or go off the public_addresses file, as Corosync is supposed to take your ctdb file and reconfigure it. Like should I have c

Re: [Linux-cluster] Stateful Samba\CTDB Failover

2010-07-16 Thread Justin Shafer
Yep, I agree. The failover itself is almost instant. I too am running Samba CTDB in active/active with corosync/drbd/ocfs2, though I haven't added CTDB as a resource to Corosync; I have just been starting it manually. Corosync wants to know where the SMB private directory is and I was told by some

[Linux-cluster] NTP time steps causes cluster reconfiguration

2010-07-16 Thread Martin Waite
Hi, During testing, I noticed that a time step caused by ntpd caused the cluster to drop into GATHER state:
Jun 16 12:13:16 cp1edidbm001 ntpd[30917]: time reset -16.332117 s
Jun 16 12:13:26 cp1edidbm001 openais[15929]: [TOTEM] entering GATHER state from 12.
Jun 16 12:13:26 cp1edidbm001 op

Re: [Linux-cluster] Stateful Samba\CTDB Failover

2010-07-16 Thread Jason Fitzpatrick
Hi Justin. My understanding of all this is that SMB2 was only introduced with Vista (http://blogs.technet.com/b/josebda/archive/2008/12/05/smb2-a-complete-redesign-of-the-main-remote-file-protocol-for-windows.aspx) and as a result your client has to be using SMB1. I was looking into SMBv4 for RHE

[Linux-cluster] unable to start luci service

2010-07-16 Thread Mahantesh Chiniwar
Hi, I am trying to set up a Conga cluster on Red Hat 5.4 and I am unable to start the luci service. Details:
[r...@rhel-n1 ~]# rpm -qa | grep luci
luci-0.12.2-6.el5
[r...@rhel-n1 ~]# rpm -qa | grep ricci
ricci-0.12.2-6.el5
When I start the ricci service I get the following message in /var/log/messa