[Ubuntu-ha] [Bug 1586876] [NEW] Corosync report "Started" itself too early

guessi Sun, 29 May 2016 19:06:08 -0700

Public bug reported:

Problem description:
Currently, do_start() performs no service state check after calling
start-stop-daemon, so the init script can report corosync as started before
it is actually ready to provide service. If pacemaker starts in that window,
it may assume a 'heartbeat' based cluster, which is not what we want.
The init script should check that corosync has really started before
reporting its state.


syslog with wrong state:
May 24 19:53:50 myhost corosync[1018]:   [MAIN  ] Corosync Cluster Engine ('1.4.2'): started and ready to provide service.
May 24 19:53:50 myhost corosync[1018]:   [MAIN  ] Corosync built-in features: nss
May 24 19:53:50 myhost corosync[1018]:   [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
May 24 19:53:50 myhost corosync[1018]:   [TOTEM ] Initializing transport (UDP/IP Unicast).
May 24 19:53:50 myhost corosync[1018]:   [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
May 24 19:53:50 myhost pacemakerd: [1094]: info: Invoked: pacemakerd
May 24 19:53:50 myhost pacemakerd: [1094]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root
May 24 19:53:50 myhost pacemakerd: [1094]: info: get_cluster_type: Assuming a 'heartbeat' based cluster
May 24 19:53:50 myhost pacemakerd: [1094]: info: read_config: Reading configure for stack: heartbeat

expected result:
May 24 21:45:02 myhost corosync[1021]:   [MAIN  ] Completed service synchronization, ready to provide service.
May 24 21:45:02 myhost pacemakerd: [1106]: info: Invoked: pacemakerd
May 24 21:45:02 myhost pacemakerd: [1106]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root
May 24 21:45:02 myhost pacemakerd: [1106]: info: config_find_next: Processing additional service options...
May 24 21:45:02 myhost pacemakerd: [1106]: info: get_config_opt: Found 'pacemaker' for option: name
May 24 21:45:02 myhost pacemakerd: [1106]: info: get_config_opt: Found '1' for option: ver
May 24 21:45:02 myhost pacemakerd: [1106]: info: get_cluster_type: Detected an active 'classic openais (with plugin)' cluster

Please note the order of the following two lines in the expected result:
* corosync: [MAIN  ] Completed service synchronization, ready to provide service.
* pacemakerd: info: get_cluster_type: ...

affected versions:
ALL (precise, trusty, vivid, wily, xenial, yakkety)

upstream solution: wait_for_ipc()
https://github.com/corosync/corosync/blob/master/init/corosync.in#L84-L99
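
For illustration, a readiness check of this kind could look roughly like
the sketch below. This is not the upstream code and not a tested patch for
the Ubuntu init script: the function name wait_for_ready, the one-second
poll interval, and the 60-second timeout in the usage comment are all
assumptions. It also assumes that `corosync-cfgtool -s` exits non-zero
until corosync accepts IPC connections.

```shell
#!/bin/sh
# Poll a readiness probe until it succeeds or the timeout expires.
# $1: timeout in seconds; remaining arguments: the probe command to run.
# Returns 0 as soon as the probe succeeds, 1 if the timeout is reached.
# (wait_for_ready is a hypothetical helper name, not part of corosync.)
wait_for_ready() {
    timeout=$1
    shift
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # The probe's output is discarded; only its exit status matters.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

# Hypothetical use in do_start(), after start-stop-daemon has returned:
#   wait_for_ready 60 corosync-cfgtool -s || return 2
```

Taking the probe as an argument keeps the loop independent of the exact
command used to detect readiness, so it can be swapped if
`corosync-cfgtool -s` turns out not to be a reliable indicator on a given
release.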

** Affects: corosync (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: corosync precise trusty vivid wily xenial yakkety

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to corosync in Ubuntu.
https://bugs.launchpad.net/bugs/1586876

Title:
  Corosync report "Started" itself too early

Status in corosync package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/corosync/+bug/1586876/+subscriptions

_______________________________________________
Mailing list: https://launchpad.net/~ubuntu-ha
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp
