Re: [ClusterLabs] newbie questions

2016-05-31 Thread Digimer
On 31/05/16 10:41 PM, Jay Scott wrote:
> hooray for me, but, how?
> 
> I got about 3/4 of Digimer's list done and got stuck.
> I did a pcs cluster status, and, behold, the cluster was up.
> I pinged the ClusterIP and it answered.  I didn't know what
> to do with the 'delay="x"' part, that's the thing I couldn't figure
> out.  (I've been assuming the delay part is a big deal.)

Delay works like this:

Both nodes are up, but comms break (switch loop/broadcast storm,
STP/stack renegotiation, iptables oops, whatever)... Both nodes declare
their peer lost.

Node 1's stonith config includes 'delay="15"'.

Node 1 looks up how to fence node 2, calls the fence.

Node 2 looks up how to fence node 1, calls the fence (passing the delay
to the agent).

The fence agent running on node 1 executes without delay.

The fence agent running on node 2 sees a delay of 15 seconds, and sleeps.

Node 1 kills node 2 before the sleep exits, thus ensuring that node 1
lived and node 2 died. Assuming you have your services on node 1, then
that means no recovery is needed.

Now assume that node 1 truly died. Node 2's fence agent would exit the
sleep after 15 seconds and proceed to shoot node 1 and then recover any
resources that had been on node 1.
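
To make the timing concrete, here is a minimal sketch in Python that models the race described above. It is only a model, not how fence agents are implemented: the function name fence_race, the fence_time parameter (how long a fence call takes once it starts) and the node labels are all assumptions made for illustration.

```python
def fence_race(delay_before_fencing, alive, fence_time=2.0):
    """Model a two-node fence race after a comms break.

    delay_before_fencing[n] is the delay configured on the stonith device
    that fences node n, i.e. how long the *peer's* agent sleeps before
    shooting n. alive[n] says whether node n is really still running.
    Returns (survivor, seconds until the winning fence completes).
    """
    shots = []
    for attacker, target in (("node1", "node2"), ("node2", "node1")):
        if alive[attacker]:
            # The attacker sleeps for the delay set on its target, then fences.
            lands_at = delay_before_fencing[target] + fence_time
            shots.append((lands_at, attacker))
    # The earliest shot wins; the slower agent is killed while still asleep.
    lands_at, survivor = min(shots)
    return survivor, lands_at


delays = {"node1": 15.0, "node2": 0.0}   # delay="15" protects node 1

# Comms break, both nodes alive: node 1 shoots first and survives.
print(fence_race(delays, {"node1": True, "node2": True}))
# Node 1 really died: node 2 waits out the delay, then fences node 1.
print(fence_race(delays, {"node1": False, "node2": True}))
```

With equal delays this model simply picks one node by name; it says nothing reliable about what real hardware would do in a dead heat, which is the situation the delay is meant to avoid.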

digimer

> However, there are more things for me to read and more experiments
> for me to try so I'm good for now.
> 
> Thanks to everyone for the prompt help.
> 
> j.
> 
> On Tue, May 31, 2016 at 5:22 PM, Ken Gaillot wrote:
> 
> On 05/31/2016 03:59 PM, Jay Scott wrote:
> > Greetings,
> >
> > Cluster newbie
> > Centos 7
> > trying to follow the "Clusters from Scratch" intro.
> > 2 nodes (yeah, I know, but I'm just learning)
> > 
> > [root@smoking ~]# pcs status
> > Cluster name:
> > Last updated: Tue May 31 15:32:18 2016
> > Last change: Tue May 31 15:02:21 2016 by root via cibadmin on smoking
> > Stack: unknown
> 
> "Stack: unknown" is a big problem. The cluster isn't aware of the
> corosync configuration. Did you do the "pcs cluster setup" step?
> 
> > Current DC: NONE
> > 2 nodes and 1 resource configured
> >
> > OFFLINE: [ mars smoking ]
> >
> > Full list of resources:
> >
> >  ClusterIP (ocf::heartbeat:IPaddr2): Stopped
> >
> > PCSD Status:
> >   smoking: Online
> >   mars: Online
> >
> > Daemon Status:
> >   corosync: active/enabled
> >   pacemaker: active/enabled
> >   pcsd: active/enabled
> > 
> >
> > What concerns me at the moment:
> > I did
> > pcs resource enable ClusterIP
> > while simultaneously doing
> > tail -f /var/log/cluster/corosync.log
> > (the only log in there)
> 
> The system log (/var/log/messages or whatever your system has
> configured) is usually the best place to start. The cluster software
> sends messages of interest to end users there, and it includes messages
> from all components (corosync, pacemaker, resource agents, etc.).
> 
> /var/log/cluster/corosync.log (and in some configurations,
> /var/log/pacemaker.log) have more detailed log information for
> debugging.
> 
> > and nothing happens in the log, but the ClusterIP
> > stays "Stopped".  Should I be able to ping that addr?
> > I can't.
> > It also says OFFLINE:  and both of my machines are offline,
> > though the PCSD says they're online.  Which do I trust?
> 
> The first online/offline output is most important, and refers to the
> node's status in the actual cluster; the "PCSD" online/offline output
> simply tells whether the pcs daemon is running. Typically, the pcs
> daemon is enabled at boot and is always running. The pcs daemon is not
> part of the clustering itself; it's a front end to configuring and
> administering the cluster.
> 
> > [root@smoking ~]# pcs property show stonith-enabled
> > Cluster Properties:
> >  stonith-enabled: false
> >
> > yet I see entries in the corosync.log referring to stonith.
> > I'm guessing that's normal.
> 
> Yes, you can enable stonith at any time, so the stonith daemon will
> still run, to stay aware of the cluster status.
> 
> > My corosync.conf file says the quorum is off.
> >
> > I also don't know what to include in this for any of you to
> > help me debug.
> >
> > Ahh, also, is this considered "long", and if so, where would I post
> > to the web?
> >
> > thx.
> >
> > j.
> 
> ___
> Users mailing list: Users@clusterlabs.org 
> http://clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

Re: [ClusterLabs] newbie questions

2016-05-31 Thread Jay Scott
hooray for me, but, how?

I got about 3/4 of Digimer's list done and got stuck.
I did a pcs cluster status, and, behold, the cluster was up.
I pinged the ClusterIP and it answered.  I didn't know what
to do with the 'delay="x"' part, that's the thing I couldn't figure
out.  (I've been assuming the delay part is a big deal.)

However, there are more things for me to read and more experiments
for me to try so I'm good for now.

Thanks to everyone for the prompt help.

j.

On Tue, May 31, 2016 at 5:22 PM, Ken Gaillot wrote:

> On 05/31/2016 03:59 PM, Jay Scott wrote:
> > Greetings,
> >
> > Cluster newbie
> > Centos 7
> > trying to follow the "Clusters from Scratch" intro.
> > 2 nodes (yeah, I know, but I'm just learning)
> > 
> > [root@smoking ~]# pcs status
> > Cluster name:
> > Last updated: Tue May 31 15:32:18 2016
> > Last change: Tue May 31 15:02:21 2016 by root via cibadmin on smoking
> > Stack: unknown
>
> "Stack: unknown" is a big problem. The cluster isn't aware of the
> corosync configuration. Did you do the "pcs cluster setup" step?
>
> > Current DC: NONE
> > 2 nodes and 1 resource configured
> >
> > OFFLINE: [ mars smoking ]
> >
> > Full list of resources:
> >
> >  ClusterIP (ocf::heartbeat:IPaddr2): Stopped
> >
> > PCSD Status:
> >   smoking: Online
> >   mars: Online
> >
> > Daemon Status:
> >   corosync: active/enabled
> >   pacemaker: active/enabled
> >   pcsd: active/enabled
> > 
> >
> > What concerns me at the moment:
> > I did
> > pcs resource enable ClusterIP
> > while simultaneously doing
> > tail -f /var/log/cluster/corosync.log
> > (the only log in there)
>
> The system log (/var/log/messages or whatever your system has
> configured) is usually the best place to start. The cluster software
> sends messages of interest to end users there, and it includes messages
> from all components (corosync, pacemaker, resource agents, etc.).
>
> /var/log/cluster/corosync.log (and in some configurations,
> /var/log/pacemaker.log) have more detailed log information for debugging.
>
> > and nothing happens in the log, but the ClusterIP
> > stays "Stopped".  Should I be able to ping that addr?
> > I can't.
> > It also says OFFLINE:  and both of my machines are offline,
> > though the PCSD says they're online.  Which do I trust?
>
> The first online/offline output is most important, and refers to the
> node's status in the actual cluster; the "PCSD" online/offline output
> simply tells whether the pcs daemon is running. Typically, the pcs
> daemon is enabled at boot and is always running. The pcs daemon is not
> part of the clustering itself; it's a front end to configuring and
> administering the cluster.
>
> > [root@smoking ~]# pcs property show stonith-enabled
> > Cluster Properties:
> >  stonith-enabled: false
> >
> > yet I see entries in the corosync.log referring to stonith.
> > I'm guessing that's normal.
>
> Yes, you can enable stonith at any time, so the stonith daemon will
> still run, to stay aware of the cluster status.
>
> > My corosync.conf file says the quorum is off.
> >
> > I also don't know what to include in this for any of you to
> > help me debug.
> >
> > Ahh, also, is this considered "long", and if so, where would I post
> > to the web?
> >
> > thx.
> >
> > j.


Re: [ClusterLabs] newbie questions

2016-05-31 Thread Ken Gaillot
On 05/31/2016 03:59 PM, Jay Scott wrote:
> Greetings,
> 
> Cluster newbie
> Centos 7
> trying to follow the "Clusters from Scratch" intro.
> 2 nodes (yeah, I know, but I'm just learning)
> 
> [root@smoking ~]# pcs status
> Cluster name:
> Last updated: Tue May 31 15:32:18 2016
> Last change: Tue May 31 15:02:21 2016 by root via cibadmin on smoking
> Stack: unknown

"Stack: unknown" is a big problem. The cluster isn't aware of the
corosync configuration. Did you do the "pcs cluster setup" step?
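
As a rough illustration of what to look for, the sketch below scans the header of saved "pcs status" output for the two warning signs visible in the quoted output: an unknown stack and no elected DC. The function name cluster_layer_problems and the warning strings are made up for this example, and a missing line is treated the same as a bad value, which is an assumption.

```python
def cluster_layer_problems(status_text):
    """Return a list of warnings found in the header of `pcs status` output."""
    fields = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            # Keep the first occurrence of each "Key: value" header line.
            fields.setdefault(key.strip(), value.strip())
    problems = []
    if fields.get("Stack", "unknown") == "unknown":
        problems.append("stack unknown")
    if fields.get("Current DC", "NONE") == "NONE":
        problems.append("no DC elected")
    return problems


sample = """Stack: unknown
Current DC: NONE
2 nodes and 1 resource configured
"""
print(cluster_layer_problems(sample))  # ['stack unknown', 'no DC elected']
```

An empty list only means these two header lines look sane; it does not prove the cluster is healthy.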

> Current DC: NONE
> 2 nodes and 1 resource configured
> 
> OFFLINE: [ mars smoking ]
> 
> Full list of resources:
> 
>  ClusterIP (ocf::heartbeat:IPaddr2): Stopped
> 
> PCSD Status:
>   smoking: Online
>   mars: Online
> 
> Daemon Status:
>   corosync: active/enabled
>   pacemaker: active/enabled
>   pcsd: active/enabled
> 
> 
> What concerns me at the moment:
> I did
> pcs resource enable ClusterIP
> while simultaneously doing
> tail -f /var/log/cluster/corosync.log
> (the only log in there)

The system log (/var/log/messages or whatever your system has
configured) is usually the best place to start. The cluster software
sends messages of interest to end users there, and it includes messages
from all components (corosync, pacemaker, resource agents, etc.).

/var/log/cluster/corosync.log (and in some configurations,
/var/log/pacemaker.log) have more detailed log information for debugging.
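
If the system log is noisy, one way to narrow it down is to keep only the lines that mention a cluster daemon. The sketch below does that in Python. The list of daemon names is an assumption based on a typical Pacemaker 1.1 / corosync 2 install, and the helper names cluster_lines and show_cluster_messages are hypothetical.

```python
import re

# Assumed list of daemon names that show up in syslog on a Pacemaker 1.1 /
# corosync 2 system; adjust it to match what your own log actually contains.
CLUSTER_TAGS = re.compile(
    r"\b(corosync|pacemakerd|crmd|pengine|lrmd|attrd|cib|stonith-ng)\b"
)


def cluster_lines(lines):
    """Keep only the lines that mention one of the cluster daemons."""
    return [line for line in lines if CLUSTER_TAGS.search(line)]


def show_cluster_messages(path="/var/log/messages"):
    """Print cluster-related lines from the system log (usually needs root)."""
    with open(path, errors="replace") as log:
        for line in cluster_lines(log):
            print(line, end="")
```

Calling show_cluster_messages() right after running a pcs command should show whether the cluster reacted at all; the default path is the one mentioned above and may differ on your system.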

> and nothing happens in the log, but the ClusterIP
> stays "Stopped".  Should I be able to ping that addr?
> I can't.
> It also says OFFLINE:  and both of my machines are offline,
> though the PCSD says they're online.  Which do I trust?

The first online/offline output is most important, and refers to the
node's status in the actual cluster; the "PCSD" online/offline output
simply tells whether the pcs daemon is running. Typically, the pcs
daemon is enabled at boot and is always running. The pcs daemon is not
part of the clustering itself; it's a front end to configuring and
administering the cluster.
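
To keep the two apart when reading the output, the sketch below splits saved "pcs status" text into cluster membership and pcsd reachability. It assumes the layout shown in the quoted output (an "Online:" or "OFFLINE:" line with a bracketed node list, and a "PCSD Status:" section ended by a blank line); the function name membership_vs_pcsd is made up for this example.

```python
import re


def membership_vs_pcsd(status_text):
    """Split `pcs status` output into cluster membership and pcsd status."""
    cluster = {}   # node -> "Online"/"OFFLINE" as seen by the cluster itself
    pcsd = {}      # node -> state of the pcs daemon only
    in_pcsd = False
    for line in status_text.splitlines():
        stripped = line.strip()
        m = re.match(r"(Online|OFFLINE): \[ (.*) \]$", stripped)
        if m:
            for node in m.group(2).split():
                cluster[node] = m.group(1)
        elif stripped == "PCSD Status:":
            in_pcsd = True
        elif in_pcsd:
            if not stripped:
                in_pcsd = False      # a blank line ends the PCSD section
            else:
                node, _, state = stripped.partition(":")
                pcsd[node.strip()] = state.strip()
    return cluster, pcsd


sample = """OFFLINE: [ mars smoking ]

PCSD Status:
  smoking: Online
  mars: Online

Daemon Status:
  corosync: active/enabled
"""
cluster, pcsd = membership_vs_pcsd(sample)
# Nodes whose pcs daemon answers but which are not cluster members:
print([n for n in pcsd if pcsd[n] == "Online" and cluster.get(n) != "Online"])
```

For the output quoted in this thread, both nodes end up in that list: pcsd is reachable on each, but neither has joined the cluster.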

> [root@smoking ~]# pcs property show stonith-enabled
> Cluster Properties:
>  stonith-enabled: false
> 
> yet I see entries in the corosync.log referring to stonith.
> I'm guessing that's normal.

Yes, you can enable stonith at any time, so the stonith daemon will
still run, to stay aware of the cluster status.

> My corosync.conf file says the quorum is off.
> 
> I also don't know what to include in this for any of you to
> help me debug.
> 
> Ahh, also, is this considered "long", and if so, where would I post
> to the web?
> 
> thx.
> 
> j.



[ClusterLabs] newbie questions

2016-05-31 Thread Jay Scott
Greetings,

Cluster newbie
Centos 7
trying to follow the "Clusters from Scratch" intro.
2 nodes (yeah, I know, but I'm just learning)

[root@smoking ~]# pcs status
Cluster name:
Last updated: Tue May 31 15:32:18 2016
Last change: Tue May 31 15:02:21 2016 by root via cibadmin on smoking
Stack: unknown
Current DC: NONE
2 nodes and 1 resource configured

OFFLINE: [ mars smoking ]

Full list of resources:

 ClusterIP (ocf::heartbeat:IPaddr2): Stopped

PCSD Status:
  smoking: Online
  mars: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled


What concerns me at the moment:
I did
pcs resource enable ClusterIP
while simultaneously doing
tail -f /var/log/cluster/corosync.log
(the only log in there)
and nothing happens in the log, but the ClusterIP
stays "Stopped".  Should I be able to ping that addr?
I can't.
It also says OFFLINE:  and both of my machines are offline,
though the PCSD says they're online.  Which do I trust?

[root@smoking ~]# pcs property show stonith-enabled
Cluster Properties:
 stonith-enabled: false

yet I see entries in the corosync.log referring to stonith.
I'm guessing that's normal.

My corosync.conf file says the quorum is off.

I also don't know what to include in this for any of you to
help me debug.

Ahh, also, is this considered "long", and if so, where would I post
to the web?

thx.

j.