[Ubuntu-ha] [Bug 1881762] [NEW] resource timeout not respecting units

Jason Grammenos Tue, 02 Jun 2020 07:06:08 -0700

Public bug reported:

While working on Pacemaker, I discovered an issue with timeouts:


haproxy_stop_0 on primary 'OCF_TIMEOUT' (198): call=583, status='Timed
Out', exitreason='', last-rc-change='1970-01-04 17:21:18 -05:00',
queued=44ms,      exec=176272ms

This led me to find that adding a unit to a timeout value had no
effect:

primitive haproxy systemd:haproxy \
        op monitor interval=2s \
        op start interval=0s timeout=500s \
        op stop interval=0s timeout=500s \
        meta migration-threshold=2

primitive haproxy systemd:haproxy \
        op monitor interval=2s \
        op start interval=0s timeout=500 \
        op stop interval=0s timeout=500 \
        meta migration-threshold=2

The two configs above result in the same behaviour; pacemaker/crm seems
to be ignoring the "s".

I filed a bug with Pacemaker itself:
https://bugs.clusterlabs.org/show_bug.cgi?id=5429

This led to the following response, copied from the ticket:

<<Looking back on your irc chat, I see you have a version of Pacemaker
<<with a known bug:

<<haproxy_stop_0 on primary 'OCF_TIMEOUT' (198): call=583, status='Timed
<<Out', exitreason='', last-rc-change='1970-01-04 17:21:18 -05:00',
<<queued=44ms,      exec=176272ms

<<The incorrect date is a result of bugs that occur in systemd resources
<<when Pacemaker 2.0.3 is built with the -UPCMK_TIME_EMERGENCY_CGT C
<<flag (which is not the default). I was only aware of that being the
<<case in one Fedora release. If those are stock Ubuntu packages, please
<<file an Ubuntu bug to make sure they are aware of it.

<<The underlying bugs are fixed as of the Pacemaker 2.0.4 release. If
<<anyone wants to backport specific commits instead, the github pull
<<requests #1992 and #1997 should take care of it.
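
To make the "incorrect date" concrete, here is a small Python sketch that
converts the last-rc-change value from the log line into seconds since the
Unix epoch. The function name is made up for illustration, and the sketch
only decodes the timestamp; it does not show how Pacemaker produced it.

```python
from datetime import datetime


def seconds_since_epoch(stamp: str) -> int:
    """Convert a 'YYYY-MM-DD HH:MM:SS +HH:MM' timestamp to epoch seconds."""
    # %z accepts a UTC offset written with a colon (Python 3.7 and later).
    parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S %z")
    return int(parsed.timestamp())


# Value copied from the failed-action line in this report.
offset = seconds_since_epoch("1970-01-04 17:21:18 -05:00")
print(offset)                     # 339678
print(round(offset / 86400, 2))   # 3.93 (days after 1970-01-01 00:00 UTC)
```

The value is under four days after the epoch, so it cannot be the wall-clock
time of the failure, which agrees with the maintainer calling the date
incorrect.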

It appears that the root cause of my issue with setting timeout values
with units ("600s") is a bug in the build process of the Ubuntu
pacemaker package.

1) lsb_release -d
   Description:    Ubuntu 20.04 LTS
2) ii  pacemaker  2.0.3-3ubuntu3  amd64  cluster resource manager
3) Setting "100s" as the timeout of a resource should result in a
   100-second timeout, not a 100-millisecond timeout.
4) The unit suffix "s" is being ignored, which forces me to set the
   timeout to 10000 to get a 10-second timeout.
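
The expected and observed behaviour in points 3 and 4 can be sketched in
Python. This is only an illustration of the arithmetic, not Pacemaker's
parser: the function name, the set of accepted suffixes, and the
default_unit parameter are all assumptions made for this example.

```python
import re

# Multipliers to milliseconds. The accepted suffixes are an assumption
# for this sketch, not a list taken from Pacemaker.
UNIT_MS = {"ms": 1, "s": 1000, "m": 60_000, "h": 3_600_000}

_DURATION = re.compile(r"^\s*(\d+)\s*([a-z]*)\s*$")


def to_milliseconds(value: str, default_unit: str = "s") -> int:
    """Convert a timeout such as '500s' or '500' to milliseconds.

    A bare number is interpreted using default_unit (hypothetical knob).
    """
    match = _DURATION.match(value.lower())
    if match is None:
        raise ValueError(f"not a duration: {value!r}")
    number, unit = match.groups()
    unit = unit or default_unit
    if unit not in UNIT_MS:
        raise ValueError(f"unknown unit: {unit!r}")
    return int(number) * UNIT_MS[unit]


# Expected (point 3): "100s" means 100 seconds.
print(to_milliseconds("100s"))                      # 100000
# Observed (point 4): the number is taken as milliseconds, so 10000 is
# needed to get 10 seconds.
print(to_milliseconds("10000", default_unit="ms"))  # 10000
```

With the suffix honoured, "500s" comes out as 500000 ms; the behaviour
described in this report corresponds to dropping the suffix and reading
the remaining number as milliseconds.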

** Affects: pacemaker (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to pacemaker in Ubuntu.
https://bugs.launchpad.net/bugs/1881762

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1881762/+subscriptions

