Re: [ClusterLabs] newbie questions

2016-05-31 Thread Digimer
On 31/05/16 10:41 PM, Jay Scott wrote:
> hooray for me, but, how?
> 
> I got about 3/4 of Digimer's list done and got stuck.
> I did a pcs cluster status, and, behold, the cluster was up.
> I pinged the ClusterIP and it answered.  I didn't know what
> to do with the 'delay="x"' part, that's the thing I couldn't figure
> out.  (I've been assuming the delay part is a big deal.)

Delay works like this:

Both nodes are up, but comms break (switch loop/broadcast storm,
STP/stack renegotiation, iptables oops, whatever)... Both nodes declare
their peer lost.

Node 1's stonith config (the fence method used to kill node 1) includes
'delay="15"'.

Node 1 looks up how to fence node 2, calls the fence.

Node 2 looks up how to fence node 1, calls fence (passing to the agent
the delay).

The fence agent running on node 1 executes without delay.

The fence agent running on node 2 sees a delay of 15 seconds, and sleeps.

Node 1 kills node 2 before the sleep exits, thus ensuring that node 1
lived and node 2 died. Assuming you have your services on node 1, then
that means no recovery is needed.

Now assume that node 1 truly died. Node 2's fence agent would exit the
sleep after 15 seconds and proceed to shoot node 1 and then recover any
resources that had been on node 1.
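
The sequence above can be sketched as a small simulation. This is only an illustration of the ordering, not how the fence agents are implemented; the function name, the node names and the idea of sorting attempts by sleep time are all assumptions made for the example.

```python
def resolve_fence_race(nodes, alive, delay_on_target):
    """Simulate one round of mutual fencing after a comms break.

    nodes:           all cluster node names
    alive:           nodes that are actually still running
    delay_on_target: seconds a fence agent sleeps before killing that
                     target (the delay="15" set for node 1 in the text)
    Returns (survivors, fenced).
    """
    running = set(alive)
    fenced = []
    # Every running node tries to fence every peer it can no longer see.
    attempts = [
        (delay_on_target.get(target, 0), attacker, target)
        for attacker in nodes if attacker in running
        for target in nodes if target != attacker
    ]
    # Agents fire in order of their sleep time. Equal delays are a real
    # race on hardware; here the sort simply breaks the tie by name.
    for _delay, attacker, target in sorted(attempts):
        if attacker in running:  # a node that was already shot cannot fire
            running.discard(target)
            fenced.append(target)
    return sorted(running), fenced


nodes = ["node1", "node2"]
# Comms break, both nodes healthy: node1 fires first, node2 dies mid-sleep.
print(resolve_fence_race(nodes, {"node1", "node2"}, {"node1": 15}))
# node1 really died: node2 waits out the 15 seconds, then fences node1.
print(resolve_fence_race(nodes, {"node2"}, {"node1": 15}))
```

With no delay configured on either side, both attempts would carry the same sleep time, which is the case where both nodes can end up shooting each other.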

digimer

> However, there are more things for me to read and more experiments
> for me to try so I'm good for now.
> 
> Thanks to everyone for the prompt help.
> 
> j.
> 

Re: [ClusterLabs] newbie questions

2016-05-31 Thread Jay Scott
hooray for me, but, how?

I got about 3/4 of Digimer's list done and got stuck.
I did a pcs cluster status, and, behold, the cluster was up.
I pinged the ClusterIP and it answered.  I didn't know what
to do with the 'delay="x"' part, that's the thing I couldn't figure
out.  (I've been assuming the delay part is a big deal.)

However, there are more things for me to read and more experiments
for me to try so I'm good for now.

Thanks to everyone for the prompt help.

j.

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] newbie questions

2016-05-31 Thread Ken Gaillot
On 05/31/2016 03:59 PM, Jay Scott wrote:
> Greetings,
> 
> Cluster newbie
> Centos 7
> trying to follow the "Clusters from Scratch" intro.
> 2 nodes (yeah, I know, but I'm just learning)
> 
> [root@smoking ~]# pcs status
> Cluster name:
> Last updated: Tue May 31 15:32:18 2016
> Last change: Tue May 31 15:02:21 2016 by root via cibadmin on smoking
> Stack: unknown

"Stack: unknown" is a big problem. The cluster isn't aware of the
corosync configuration. Did you do the "pcs cluster setup" step?

> Current DC: NONE
> 2 nodes and 1 resource configured
> 
> OFFLINE: [ mars smoking ]
> 
> Full list of resources:
> 
>  ClusterIP (ocf::heartbeat:IPaddr2): Stopped
> 
> PCSD Status:
>   smoking: Online
>   mars: Online
> 
> Daemon Status:
>   corosync: active/enabled
>   pacemaker: active/enabled
>   pcsd: active/enabled
> 
> 
> What concerns me at the moment:
> I did
> pcs resource enable ClusterIP
> while simultaneously doing
> tail -f /var/log/cluster/corosync.log
> (the only log in there)

The system log (/var/log/messages or whatever your system has
configured) is usually the best place to start. The cluster software
sends messages of interest to end users there, and it includes messages
from all components (corosync, pacemaker, resource agents, etc.).

/var/log/cluster/corosync.log (and in some configurations,
/var/log/pacemaker.log) have more detailed log information for debugging.
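
As a rough illustration of pulling only the cluster-related entries out of the system log, here is a minimal sketch. It assumes the traditional syslog line layout and an incomplete list of daemon tags; the sample lines are synthetic placeholders, not real daemon output.

```python
import re

# Pacemaker 1.1 daemon names plus corosync and one resource agent;
# an assumed, non-exhaustive list for this example.
CLUSTER_TAGS = {"corosync", "pacemakerd", "crmd", "pengine", "cib",
                "stonith-ng", "lrmd", "attrd", "IPaddr2"}

# Assumed syslog layout: "Mon DD HH:MM:SS host tag[pid]: message".
TAG = re.compile(r"^\S+\s+\d+\s+[\d:]+\s+\S+\s+([^\s\[:(]+)")


def cluster_lines(lines, tags=CLUSTER_TAGS):
    """Keep only the lines whose syslog tag is a known cluster component."""
    kept = []
    for line in lines:
        match = TAG.match(line)
        if match and match.group(1) in tags:
            kept.append(line)
    return kept


# Synthetic lines, made up for the example (not real log output).
sample = [
    "May 31 15:02:21 smoking crmd[2113]: (synthetic message)",
    "May 31 15:02:22 smoking sshd[999]: (synthetic message)",
    "May 31 15:02:23 smoking IPaddr2(ClusterIP)[2345]: (synthetic message)",
]
for line in cluster_lines(sample):
    print(line)
```

On a real node the same function could be fed an open file, for example the /var/log/messages path mentioned above, if that is where the system log lives.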

> and nothing happens in the log, but the ClusterIP
> stays "Stopped".  Should I be able to ping that addr?
> I can't.
> It also says OFFLINE:  and both of my machines are offline,
> though the PCSD says they're online.  Which do I trust?

The first online/offline output is most important, and refers to the
node's status in the actual cluster; the "PCSD" online/offline output
simply tells whether the pcs daemon is running. Typically, the pcs
daemon is enabled at boot and is always running. The pcs daemon is not
part of the clustering itself; it's a front end to configuring and
administering the cluster.
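
To make the distinction concrete, here is a small sketch that reads the two sections of the status text separately. The function names are hypothetical, and the parsing assumes the layout shown in the quoted output above; other pcs versions may print it differently.

```python
import re

# Trimmed copy of the status output quoted in this thread.
SAMPLE = """\
Stack: unknown
Current DC: NONE
2 nodes and 1 resource configured

OFFLINE: [ mars smoking ]

PCSD Status:
  smoking: Online
  mars: Online
"""


def summarize_status(text):
    """Return (stack, cluster_membership, pcsd_state) from status text."""
    stack = None
    membership = {}  # node -> state as seen by the cluster itself
    pcsd = {}        # node -> state of the pcs daemon only
    in_pcsd = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Stack:"):
            stack = stripped.split(":", 1)[1].strip()
        found = re.match(r"^(Online|OFFLINE): \[(.*)\]$", stripped)
        if found:
            for node in found.group(2).split():
                membership[node] = found.group(1).lower()
        if stripped == "PCSD Status:":
            in_pcsd = True
            continue
        if in_pcsd:
            if not stripped:
                in_pcsd = False  # a blank line ends the PCSD section
            elif ":" in stripped:
                node, state = stripped.split(":", 1)
                pcsd[node.strip()] = state.strip().lower()
    return stack, membership, pcsd


def pcsd_only_nodes(membership, pcsd):
    """Nodes whose pcs daemon answers but which are not cluster members."""
    return sorted(n for n, s in pcsd.items()
                  if s == "online" and membership.get(n) != "online")


stack, membership, pcsd = summarize_status(SAMPLE)
print(stack, membership, pcsd_only_nodes(membership, pcsd))
```

For the output in this thread, both nodes land in the "pcsd only" list, which matches the situation described: the front end is reachable, but neither node has joined the cluster.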

> [root@smoking ~]# pcs property show stonith-enabled
> Cluster Properties:
>  stonith-enabled: false
> 
> yet I see entries in the corosync.log referring to stonith.
> I'm guessing that's normal.

Yes, you can enable stonith at any time, so the stonith daemon will
still run, to stay aware of the cluster status.

> My corosync.conf file says the quorum is off.
> 
> I also don't know what to include in this for any of you to
> help me debug.
> 
> Ahh, also, is this considered "long", and if so, where would I post
> to the web?
> 
> thx.
> 
> j.



[ClusterLabs] newbie questions

2016-05-31 Thread Jay Scott
Greetings,

Cluster newbie
Centos 7
trying to follow the "Clusters from Scratch" intro.
2 nodes (yeah, I know, but I'm just learning)

[root@smoking ~]# pcs status
Cluster name:
Last updated: Tue May 31 15:32:18 2016
Last change: Tue May 31 15:02:21 2016 by root via cibadmin on smoking
Stack: unknown
Current DC: NONE
2 nodes and 1 resource configured

OFFLINE: [ mars smoking ]

Full list of resources:

 ClusterIP (ocf::heartbeat:IPaddr2): Stopped

PCSD Status:
  smoking: Online
  mars: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled


What concerns me at the moment:
I did
pcs resource enable ClusterIP
while simultaneously doing
tail -f /var/log/cluster/corosync.log
(the only log in there)
and nothing happens in the log, but the ClusterIP
stays "Stopped".  Should I be able to ping that addr?
I can't.
It also says OFFLINE:  and both of my machines are offline,
though the PCSD says they're online.  Which do I trust?

[root@smoking ~]# pcs property show stonith-enabled
Cluster Properties:
 stonith-enabled: false

yet I see entries in the corosync.log referring to stonith.
I'm guessing that's normal.

My corosync.conf file says the quorum is off.

I also don't know what to include in this for any of you to
help me debug.

Ahh, also, is this considered "long", and if so, where would I post
to the web?

thx.

j.


[ClusterLabs] (no subject)

2016-05-31 Thread DacioMF
Hi,

I had 4 nodes running Ubuntu 14.04 LTS in my cluster, and all of them worked
well. I need to upgrade all my cluster nodes to Ubuntu 16.04 LTS without
stopping my resources. Two nodes have been upgraded to 16.04 and the other two
remain on 14.04. The problem is that my cluster has split: the nodes on Ubuntu
14.04 only work with each other, and the same is true for the nodes on Ubuntu
16.04. The Pacemaker feature set is v3.0.7 on Ubuntu 14.04 and v3.0.10 on
16.04.

The following commands show what's happening:

root@xenserver50:/var/log/corosync# crm status
Last updated: Thu May 19 17:19:06 2016
Last change: Thu May 19 09:00:48 2016 via cibadmin on xenserver50
Stack: corosync
Current DC: xenserver51 (51) - partition with quorum
Version: 1.1.10-42f2063
4 Nodes configured
4 Resources configured

Online: [ xenserver50 xenserver51 ]
OFFLINE: [ xenserver52 xenserver54 ]

-

root@xenserver52:/var/log/corosync# crm status
Last updated: Thu May 19 17:20:04 2016
Last change: Thu May 19 08:54:57 2016 by hacluster via crmd on xenserver54
Stack: corosync
Current DC: xenserver52 (version 1.1.14-70404b0) - partition with quorum
4 nodes and 4 resources configured

Online: [ xenserver52 xenserver54 ]
OFFLINE: [ xenserver50 xenserver51 ]

xenserver52 and xenserver54 are Ubuntu 16.04 the others are Ubuntu 14.04.
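
The two status outputs can be compared mechanically to confirm that the split follows the OS release. This is only a sketch: the function names are hypothetical, and the input dictionaries are typed in by hand from the outputs and the sentence above.

```python
def partitions(online_view):
    """Group nodes by the 'Online' list each reporting node sees."""
    groups = {frozenset(members) for members in online_view.values()}
    return sorted(sorted(group) for group in groups)


def split_follows_release(parts, release_of):
    """True if there are several partitions and each holds one release."""
    releases = [{release_of[node] for node in part} for part in parts]
    return len(parts) > 1 and all(len(r) == 1 for r in releases)


# "Online" lists as reported by xenserver50 and xenserver52 above.
online_view = {
    "xenserver50": {"xenserver50", "xenserver51"},
    "xenserver52": {"xenserver52", "xenserver54"},
}
# Ubuntu release per node, as stated in this message.
release_of = {
    "xenserver50": "14.04", "xenserver51": "14.04",
    "xenserver52": "16.04", "xenserver54": "16.04",
}

parts = partitions(online_view)
print(parts, split_follows_release(parts, release_of))
```

Each partition reports "partition with quorum" on its own, so the grouping here reflects two independent views of membership rather than one degraded cluster.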

Does anyone know what the problem is?

Sorry for my poor English.

Best regards,
 DacioMF
 Network and Infrastructure Analyst



Re: [ClusterLabs] How to use totem.interface.dynamic for corosync ?

2016-05-31 Thread wd
On Tue, May 31, 2016 at 3:08 PM, Jan Friesse wrote:

>> I'll try corosync 2.x.
>> So do I need to edit the configuration files on all nodes and then call
>> corosync-cfgtool -R, or is doing it on one node OK?
>
> You have to edit file on all nodes. Then call corosync-cfgtool -R on one
> of the nodes.

I see, thanks for  your help.
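
For reference, a minimal sketch of generating the nodelist section that would need to be identical in corosync.conf on every node before the reload. The helper name is hypothetical and the addresses are documentation-range placeholders, not values from this thread.

```python
def render_nodelist(addresses):
    """Render a corosync 2.x style nodelist block for udpu members."""
    lines = ["nodelist {"]
    for nodeid, addr in enumerate(addresses, start=1):
        lines += [
            "    node {",
            f"        ring0_addr: {addr}",
            f"        nodeid: {nodeid}",
            "    }",
        ]
    lines.append("}")
    return "\n".join(lines) + "\n"


# Placeholder addresses (192.0.2.0/24 is reserved for documentation).
existing = ["192.0.2.11", "192.0.2.12"]
print(render_nodelist(existing + ["192.0.2.13"]))
```

Generating the block from one list is just a way to keep the copies consistent; the edited file still has to be placed on all nodes, followed by corosync-cfgtool -R on one of them, as described above.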


>> On Fri, May 27, 2016 at 6:28 PM, Jan Friesse wrote:
>>
>>>> hi,
>>>>
>>>> I use corosync with udpu (since multi broadcast does not work here),
>>>> and want the ability to add nodes dynamically.
>>>>
>>>> I found this post http://www.spinics.net/lists/corosync/msg00141.html
>>>> and tried it, but nothing happened. How can I use this feature?
>>>
>>> Yes, with corosync 2.x (it was never backported into 1.x). Also it's
>>> probably easier to just edit configuration file and call corosync-cfgtool
>>> -R to reload configuration.
>>>
>>> Regards,
>>>   Honza
