Thank you, Rorthais. I see now.

-----Original Message-----
From: Jehan-Guillaume de Rorthais [mailto:j...@dalibo.com]
Sent: April 13, 2018 17:17
To: 范国腾 <fanguot...@highgo.com>
Cc: Cluster Labs - All topics related to open-source clustering welcomed
<users@clusterlabs.org>
Subject: Re: [ClusterLabs] No slave is promoted to be master

OK, I know what happened.

It seems your standbys were not replicating when the master "crashed". You
can find tons of messages like this in the log files:

  WARNING: No secondary connected to the master
  WARNING: "db2" is not connected to the primary
  WARNING: "db3" is not connected to the primary

When a standby is not replicating, the master sets a negative master score on
it to forbid its promotion, as it is probably lagging by some undefined amount
of time.

The following command shows the scores just before the simulated master crash:

  $ crm_simulate -x pe-input-2039.bz2 -s|grep -E 'date|promotion'
  Using the original execution date of: 2018-04-11 16:23:07Z
  pgsqld:0 promotion score on db1: 1001
  pgsqld:1 promotion score on db2: -1000
  pgsqld:2 promotion score on db3: -1000

The score "1001" designates the master. Streaming standbys always have a
positive master score between 1000 and 1000-N*10, where N is the number of
connected standbys.
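
As a rough illustration of how to read these scores, here is a minimal Python
sketch that parses the "promotion score" lines shown above and lists the nodes
that look promotable. The function names parse_promotion_scores and
promotable_standbys are hypothetical, and the rule "a positive score below
1001 means a streaming standby" is only an assumption drawn from the
explanation above, not the resource agent's actual logic. It only handles
plain integer scores.

```python
import re

# Sample text copied from the crm_simulate output quoted above.
SAMPLE = """\
  Using the original execution date of: 2018-04-11 16:23:07Z
  pgsqld:0 promotion score on db1: 1001
  pgsqld:1 promotion score on db2: -1000
  pgsqld:2 promotion score on db3: -1000
"""

# Matches e.g. "pgsqld:1 promotion score on db2: -1000".
SCORE_RE = re.compile(r"^\s*(\S+) promotion score on (\S+): (-?\d+)\s*$")


def parse_promotion_scores(text):
    """Return a {node_name: score} dict from crm_simulate -s style output."""
    scores = {}
    for line in text.splitlines():
        match = SCORE_RE.match(line)
        if match:
            scores[match.group(2)] = int(match.group(3))
    return scores


def promotable_standbys(scores, master_score=1001):
    """Assumption: a standby is promotable if 0 < score < master_score."""
    return sorted(node for node, score in scores.items()
                  if 0 < score < master_score)


if __name__ == "__main__":
    scores = parse_promotion_scores(SAMPLE)
    print("scores:", scores)
    print("promotable standbys:", promotable_standbys(scores))
```

With the scores quoted above, no standby has a positive score, which matches
the fact that none of them was promoted.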



On Fri, 13 Apr 2018 01:37:54 +0000
范国腾 <fanguot...@highgo.com> wrote:

> The log is in the attachment.
> 
> We deliberately introduced a bug into the PG code on the master node so
> that it can no longer be restarted, in order to test the following
> scenario: a slave should be promoted when the master crashes.
> 
> -----Original Message-----
> From: Jehan-Guillaume de Rorthais [mailto:j...@dalibo.com]
> Sent: April 12, 2018 17:39
> To: 范国腾 <fanguot...@highgo.com>
> Cc: Cluster Labs - All topics related to open-source clustering
> welcomed <users@clusterlabs.org>
> Subject: Re: [ClusterLabs] No slave is promoted to be master
> 
> Hi,
> On Thu, 12 Apr 2018 08:31:39 +0000
> 范国腾 <fanguot...@highgo.com> wrote:
> 
> > Thank you very much for helping to check this issue. The information is
> > in the attachment.
> > 
> > I restarted the cluster after I sent my first email. I am not sure
> > whether that affects the result of "crm_simulate -sL".
> 
> It does...
> 
> Could you please provide the files
> from /var/lib/pacemaker/pengine/pe-input-2039.bz2 to pe-input-2065.bz2?
> 
> [...]
> > Then the master was restarted and it could not start (that is OK and
> > we know the reason).
> 
> Why couldn't it start?



--
Jehan-Guillaume de Rorthais
Dalibo