Re: [ClusterLabs] pacemaker systemd resource

2020-07-23 Thread Хиль Эдуард


Thanks, Andrei, and thanks to all of you for your time, I appreciate it!

Yeah, it’s very sad to see that. It looks like the bug described here:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1869751
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1881762
Well, I see no other way for me but to change the OS from Ubuntu to
something else, because I am very disappointed that there are such critical
bugs :(
  
>Wednesday, 22 July 2020, 22:57 +05:00 from Andrei Borzenkov:
> 
>22.07.2020 12:46, Хиль Эдуард wrote:
>>
>> Hey, Andrei! Thanx for ur time!
>> A-a-and there is no chance to do something? :( 
>> The pacemaker’s log below.
>>  
>
>Resource was started:
>
>...
>> Jul 22 12:38:36 node2.local pacemaker-execd     [1721] (log_execute)     
>> info: executing - rsc:dummy.service action:start call_id:76
>> Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
>> info: Diff: --- 0.131.4 2
>> Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
>> info: Diff: +++ 0.131.5 (null)
>> Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
>> info: +  /cib:  @num_updates=5
>> Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
>> info: +  
>> /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='dummy.service']/lrm_rsc_op[@id='dummy.service_last_0']:
>>  @operation_key=dummy.service_start_0, @operation=start, 
>> @transition-key=164:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a, 
>> @transition-magic=-1:193;164:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a, 
>> @call-id=-1, @rc-code=193, @op-status=-1, @last-rc-change=1595410716, 
>> @last-run=1595410716, @e
>> Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_process_request) 
>>     info: Completed cib_modify operation for section status: OK (rc=0, 
>> origin=node2.local/crmd/62, version=0.131.5)
>> Jul 22 12:38:36 node2.local pacemaker-execd     [1721] (systemd_exec_result) 
>>     info: Call to start passed: /org/freedesktop/systemd1/job/703
>> Jul 22 12:38:38 node2.local pacemaker-controld  [1724] (process_lrm_event)   
>>   notice: Result of start operation for dummy.service on node2.local: 0 (ok) 
>> | call=76 key=dummy.service_start_0 confirmed=true cib-update=63
>
>So start operation at least was successfully completed.
>
>> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_process_request) 
>>     info: Forwarding cib_modify operation for section status to all 
>> (origin=local/crmd/63)
>> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_perform_op)     
>> info: Diff: --- 0.131.5 2
>> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_perform_op)     
>> info: Diff: +++ 0.131.6 (null)
>> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_perform_op)     
>> info: +  /cib:  @num_updates=6
>> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_perform_op)     
>> info: +  
>> /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='dummy.service']/lrm_rsc_op[@id='dummy.service_last_0']:
>>   @transition-magic=0:0;164:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a, 
>> @call-id=76, @rc-code=0, @op-status=0, @last-rc-change=1986, @last-run=1986, 
>> @exec-time=-587720, @queue-time=59
>> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_process_request) 
>>     info: Completed cib_modify operation for section status: OK (rc=0, 
>> origin=node2.local/crmd/63, version=0.131.6)
>> Jul 22 12:38:38 node2.local pacemaker-controld  [1724] (do_lrm_rsc_op)     
>> info: Performing key=165:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a 
>> op=dummy.service_monitor_6
>> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_process_request) 
>>     info: Forwarding cib_modify operation for section status to all 
>> (origin=local/crmd/64)
>> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_perform_op)     
>> info: Diff: --- 0.131.6 2
>> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_perform_op)     
>> info: Diff: +++ 0.131.7 (null)
>> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_perform_op)     
>> info: +  /cib:  @num_updates=7
>> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_perform_op)     
>> info: ++ 
>> /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='dummy.service']:
>>   > operation_key="dummy.service_monitor_6" operation="monitor" 
>> crm-debug-origin="do_update_resource" crm_feature_set="3.2.0" 
>> transition-key="165:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a" 
>> transition-magic="-1:193;165:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a" 
>> exit-reason="" on_
>> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_process_request) 
>>     info: Completed cib_modify operation for section status: OK (rc=0, 
>> origin=node2.local/crmd/64, version=0.131.7)
>> Jul 22 12:38:38 node2.local pacemaker-controld  [1724] (process_lrm_event)   
>>   notice: Result of monitor operation for dummy.service on 

Re: [ClusterLabs] pacemaker systemd resource

2020-07-22 Thread Ken Gaillot
On Wed, 2020-07-22 at 17:04 +0300, Andrei Borzenkov wrote:
> 
> 
> On Wed, Jul 22, 2020 at 4:58 PM Ken Gaillot 
> wrote:
> > On Wed, 2020-07-22 at 10:59 +0300, Хиль  Эдуард wrote:
> > > Hi there! I have 2 nodes with Pacemaker 2.0.3, corosync 3.0.3 on
> > > ubuntu 20 + 1 qdevice. I want to define new resource as systemd
> > > unit dummy.service :
> > >  
> > > [Unit]
> > > Description=Dummy
> > > [Service]
> > > Restart=on-failure
> > > StartLimitInterval=20
> > > StartLimitBurst=5
> > > TimeoutStartSec=0
> > > RestartSec=5
> > > Environment="HOME=/root"
> > > SyslogIdentifier=dummy
> > > ExecStart=/usr/local/sbin/dummy.sh
> > > [Install]
> > > WantedBy=multi-user.target
> > >  
> > > and /usr/local/sbin/dummy.sh :
> > >  
> > > #!/bin/bash
> > > CNT=0
> > > while true; do
> > >   let CNT++
> > >   echo "hello world $CNT"
> > >   sleep 5
> > > done
> > >  
> > > and then i try to define it with: pcs resource create
> > dummy.service
> > > systemd:dummy op monitor interval="10s" timeout="15s"
> > > after 2 seconds node2 reboot. In logs i see pacemaker in 2
> > seconds
> > > tried to start this unit, and it started, but pacemaker somehow
> > think
> > > he is «Timed Out» . What i am doing wrong? Logs below.
> > 
> > The start is timing out because the ExecStart script never returns.
> > 
> 
> Type=simple does not expect script to go into background. Quite the
> contrary - systemd expects ExecStart command to remain, going into
> background would be interpreted as "service terminated".
> 
> To quote systemd: "the service manager will consider the unit started
> immediately after the main service process has been forked off. It is
> expected that the process configured with ExecStart= is the main
> process of the service".
> 
>  
> > systemd starts processes but it doesn't daemonize them -- the
> > script is
> > responsible for doing that itself. 
> 
> Only for Type=forking

Ah, my bad, sorry for the noise :)
 
> > You can search online for more
> > details about daemonization, but most importantly you want to run
> > your
> > daemon as a subprocess in the background and have your main process
> > return as soon as the daemon is ready for service.
> > 
> > 
> > > Jul 21 15:53:41 node2.local pacemaker-controld[1813]:  notice:
> > Result
> > > of probe operation for dummy.service on node2.local: 7 (not
> > running) 
> > > Jul 21 15:53:41 node2.local systemd[1]: Reloading.
> > > Jul 21 15:53:42 node2.local systemd[1]:
> > > /lib/systemd/system/dbus.socket:5: ListenStream= references a
> > path
> > > below legacy directory /var/run/, updating
> > > /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket;
> > please
> > > update the unit file accordingly.
> > > Jul 21 15:53:42 node2.local systemd[1]:
> > > /lib/systemd/system/docker.socket:6: ListenStream= references a
> > path
> > > below legacy directory /var/run/, updating /var/run/docker.sock →
> > > /run/docker.sock; please update the unit file accordingly.
> > > Jul 21 15:53:42 node2.local pacemaker-execd[1808]:  notice:
> > Giving up
> > > on dummy.service start (rc=0): timeout (elapsed=259719ms,
> > remaining=-
> > > 159719ms)
> > > Jul 21 15:53:42 node2.local pacemaker-controld[1813]:  error:
> > Result
> > > of start operation for dummy.service on node2.local: Timed Out 
> > > Jul 21 15:53:42 node2.local systemd[1]: Started Cluster
> > Controlled
> > > dummy.
> > > Jul 21 15:53:42 node2.local dummy[9330]: hello world 1
> > > Jul 21 15:53:42 node2.local systemd-udevd[922]: Network interface
> > > NamePolicy= disabled on kernel command line, ignoring.
> > > Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice:
> > Setting
> > > fail-count-dummy.service#start_0[node2.local]: (unset) ->
> > INFINITY 
> > > Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice:
> > Setting
> > > last-failure-dummy.service#start_0[node2.local]: (unset) ->
> > > 1595336022 
> > > Jul 21 15:53:42 node2.local systemd[1]: Reloading.
> > > Jul 21 15:53:42 node2.local systemd[1]:
> > > /lib/systemd/system/dbus.socket:5: ListenStream= references a
> > path
> > > below legacy directory /var/run/, updating
> > > /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket;
> > please
> > > update the unit file accordingly.
> > > Jul 21 15:53:42 node2.local systemd[1]:
> > > /lib/systemd/system/docker.socket:6: ListenStream= references a
> > path
> > > below legacy directory /var/run/, updating /var/run/docker.sock →
> > > /run/docker.sock; please update the unit file accordingly.
> > > Jul 21 15:53:42 node2.local pacemaker-execd[1808]:  notice:
> > Giving up
> > > on dummy.service stop (rc=0): timeout (elapsed=317181ms,
> > remaining=-
> > > 217181ms)
> > > Jul 21 15:53:42 node2.local pacemaker-controld[1813]:  error:
> > Result
> > > of stop operation for dummy.service on node2.local: Timed Out 
> > > Jul 21 15:53:42 node2.local systemd[1]: Stopping Daemon for
> > dummy...
> > > Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice:
> > Setting
> > > 

Re: [ClusterLabs] pacemaker systemd resource

2020-07-22 Thread Reid Wahl
On Wed, Jul 22, 2020 at 10:57 AM Andrei Borzenkov 
wrote:

> 22.07.2020 12:46, Хиль Эдуард wrote:
> >
> > Hey, Andrei! Thanx for ur time!
> > A-a-and there is no chance to do something? :(
> > The pacemaker’s log below.
> >
>
> Resource was started:
>
> ...
> > Jul 22 12:38:36 node2.local pacemaker-execd [1721] (log_execute)
>  info: executing - rsc:dummy.service action:start call_id:76
> > Jul 22 12:38:36 node2.local pacemaker-based [1719] (cib_perform_op)
> info: Diff: --- 0.131.4 2
> > Jul 22 12:38:36 node2.local pacemaker-based [1719] (cib_perform_op)
> info: Diff: +++ 0.131.5 (null)
> > Jul 22 12:38:36 node2.local pacemaker-based [1719] (cib_perform_op)
> info: +  /cib:  @num_updates=5
> > Jul 22 12:38:36 node2.local pacemaker-based [1719] (cib_perform_op)
> info: +
>  
> /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='dummy.service']/lrm_rsc_op[@id='dummy.service_last_0']:
>  @operation_key=dummy.service_start_0, @operation=start,
> @transition-key=164:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a,
> @transition-magic=-1:193;164:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a,
> @call-id=-1, @rc-code=193, @op-status=-1, @last-rc-change=1595410716,
> @last-run=1595410716, @e
> > Jul 22 12:38:36 node2.local pacemaker-based [1719]
> (cib_process_request) info: Completed cib_modify operation for section
> status: OK (rc=0, origin=node2.local/crmd/62, version=0.131.5)
> > Jul 22 12:38:36 node2.local pacemaker-execd [1721]
> (systemd_exec_result) info: Call to start passed:
> /org/freedesktop/systemd1/job/703
> > Jul 22 12:38:38 node2.local pacemaker-controld  [1724]
> (process_lrm_event) notice: Result of start operation for dummy.service
> on node2.local: 0 (ok) | call=76 key=dummy.service_start_0 confirmed=true
> cib-update=63
>
> So start operation at least was successfully completed.
>
> > Jul 22 12:38:38 node2.local pacemaker-based [1719]
> (cib_process_request) info: Forwarding cib_modify operation for section
> status to all (origin=local/crmd/63)
> > Jul 22 12:38:38 node2.local pacemaker-based [1719] (cib_perform_op)
> info: Diff: --- 0.131.5 2
> > Jul 22 12:38:38 node2.local pacemaker-based [1719] (cib_perform_op)
> info: Diff: +++ 0.131.6 (null)
> > Jul 22 12:38:38 node2.local pacemaker-based [1719] (cib_perform_op)
> info: +  /cib:  @num_updates=6
> > Jul 22 12:38:38 node2.local pacemaker-based [1719] (cib_perform_op)
> info: +
>  
> /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='dummy.service']/lrm_rsc_op[@id='dummy.service_last_0']:
>  @transition-magic=0:0;164:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a,
> @call-id=76, @rc-code=0, @op-status=0, @last-rc-change=1986,
> @last-run=1986, @exec-time=-587720, @queue-time=59
> > Jul 22 12:38:38 node2.local pacemaker-based [1719]
> (cib_process_request) info: Completed cib_modify operation for section
> status: OK (rc=0, origin=node2.local/crmd/63, version=0.131.6)
> > Jul 22 12:38:38 node2.local pacemaker-controld  [1724] (do_lrm_rsc_op)
> info: Performing key=165:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a
> op=dummy.service_monitor_6
> > Jul 22 12:38:38 node2.local pacemaker-based [1719]
> (cib_process_request) info: Forwarding cib_modify operation for section
> status to all (origin=local/crmd/64)
> > Jul 22 12:38:38 node2.local pacemaker-based [1719] (cib_perform_op)
> info: Diff: --- 0.131.6 2
> > Jul 22 12:38:38 node2.local pacemaker-based [1719] (cib_perform_op)
> info: Diff: +++ 0.131.7 (null)
> > Jul 22 12:38:38 node2.local pacemaker-based [1719] (cib_perform_op)
> info: +  /cib:  @num_updates=7
> > Jul 22 12:38:38 node2.local pacemaker-based [1719] (cib_perform_op)
> info: ++
> /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='dummy.service']:
>   operation_key="dummy.service_monitor_6" operation="monitor"
> crm-debug-origin="do_update_resource" crm_feature_set="3.2.0"
> transition-key="165:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a"
> transition-magic="-1:193;165:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a"
> exit-reason="" on_
> > Jul 22 12:38:38 node2.local pacemaker-based [1719]
> (cib_process_request) info: Completed cib_modify operation for section
> status: OK (rc=0, origin=node2.local/crmd/64, version=0.131.7)
> > Jul 22 12:38:38 node2.local pacemaker-controld  [1724]
> (process_lrm_event) notice: Result of monitor operation for
> dummy.service on node2.local: 0 (ok) | call=77
> key=dummy.service_monitor_6 confirmed=false cib-update=65
>
> And monitor confirmed that it was started
>
> > Jul 22 12:38:38 node2.local pacemaker-based [1719]
> (cib_process_request) info: Forwarding cib_modify operation for section
> status to all (origin=local/crmd/65)
> > Jul 22 12:38:38 node2.local pacemaker-based [1719] (cib_perform_op)
> info: Diff: --- 0.131.7 2
> > Jul 22 12:38:38 node2.local 

Re: [ClusterLabs] pacemaker systemd resource

2020-07-22 Thread Andrei Borzenkov
22.07.2020 12:46, Хиль Эдуард wrote:
> 
> Hey, Andrei! Thanx for ur time!
> A-a-and there is no chance to do something? :( 
> The pacemaker’s log below.
>  

The resource was started:

...
> Jul 22 12:38:36 node2.local pacemaker-execd     [1721] (log_execute)     
> info: executing - rsc:dummy.service action:start call_id:76
> Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
> info: Diff: --- 0.131.4 2
> Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
> info: Diff: +++ 0.131.5 (null)
> Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
> info: +  /cib:  @num_updates=5
> Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
> info: +  
> /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='dummy.service']/lrm_rsc_op[@id='dummy.service_last_0']:
>   @operation_key=dummy.service_start_0, @operation=start, 
> @transition-key=164:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a, 
> @transition-magic=-1:193;164:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a, 
> @call-id=-1, @rc-code=193, @op-status=-1, @last-rc-change=1595410716, 
> @last-run=1595410716, @e
> Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_process_request)  
>    info: Completed cib_modify operation for section status: OK (rc=0, 
> origin=node2.local/crmd/62, version=0.131.5)
> Jul 22 12:38:36 node2.local pacemaker-execd     [1721] (systemd_exec_result)  
>    info: Call to start passed: /org/freedesktop/systemd1/job/703
> Jul 22 12:38:38 node2.local pacemaker-controld  [1724] (process_lrm_event)    
>  notice: Result of start operation for dummy.service on node2.local: 0 (ok) | 
> call=76 key=dummy.service_start_0 confirmed=true cib-update=63

So the start operation, at least, completed successfully.

> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_process_request)  
>    info: Forwarding cib_modify operation for section status to all 
> (origin=local/crmd/63)
> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_perform_op)     
> info: Diff: --- 0.131.5 2
> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_perform_op)     
> info: Diff: +++ 0.131.6 (null)
> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_perform_op)     
> info: +  /cib:  @num_updates=6
> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_perform_op)     
> info: +  
> /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='dummy.service']/lrm_rsc_op[@id='dummy.service_last_0']:
>   @transition-magic=0:0;164:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a, 
> @call-id=76, @rc-code=0, @op-status=0, @last-rc-change=1986, @last-run=1986, 
> @exec-time=-587720, @queue-time=59
> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_process_request)  
>    info: Completed cib_modify operation for section status: OK (rc=0, 
> origin=node2.local/crmd/63, version=0.131.6)
> Jul 22 12:38:38 node2.local pacemaker-controld  [1724] (do_lrm_rsc_op)     
> info: Performing key=165:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a 
> op=dummy.service_monitor_6
> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_process_request)  
>    info: Forwarding cib_modify operation for section status to all 
> (origin=local/crmd/64)
> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_perform_op)     
> info: Diff: --- 0.131.6 2
> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_perform_op)     
> info: Diff: +++ 0.131.7 (null)
> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_perform_op)     
> info: +  /cib:  @num_updates=7
> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_perform_op)     
> info: ++ 
> /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='dummy.service']:
>    operation_key="dummy.service_monitor_6" operation="monitor" 
> crm-debug-origin="do_update_resource" crm_feature_set="3.2.0" 
> transition-key="165:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a" 
> transition-magic="-1:193;165:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a" 
> exit-reason="" on_
> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_process_request)  
>    info: Completed cib_modify operation for section status: OK (rc=0, 
> origin=node2.local/crmd/64, version=0.131.7)
> Jul 22 12:38:38 node2.local pacemaker-controld  [1724] (process_lrm_event)    
>  notice: Result of monitor operation for dummy.service on node2.local: 0 (ok) 
> | call=77 key=dummy.service_monitor_6 confirmed=false cib-update=65

And the monitor confirmed that it was started.
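
The transition-magic strings in the quoted CIB diffs can be split mechanically. What follows is only a minimal sketch: it assumes the layout is op-status:rc-code;transition-key and that the key itself is action:transition:target-rc:uuid, which is how the values line up with @op-status and @rc-code in the log above. The function name decode_magic is made up for this example and is not a Pacemaker tool.

```shell
# Split a transition-magic string into its parts using only parameter expansion.
decode_magic() {
  magic=$1
  status_rc=${magic%%;*}        # part before ';', e.g. "-1:193"
  key=${magic#*;}               # part after ';', the transition key
  op_status=${status_rc%%:*}
  rc_code=${status_rc#*:}
  action_id=${key%%:*}
  rest=${key#*:}
  transition_id=${rest%%:*}
  rest=${rest#*:}
  target_rc=${rest%%:*}
  echo "op-status=$op_status rc-code=$rc_code action=$action_id transition=$transition_id target-rc=$target_rc"
}

# The two values recorded for dummy.service_start_0 in the log.
pending=$(decode_magic '-1:193;164:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a')
finished=$(decode_magic '0:0;164:23:0:76f4932e-716b-45b8-8fed-a20c3806df8a')
echo "$pending"
echo "$finished"
```

Read this way, -1:193 appears to be the placeholder written when the operation was dispatched, and 0:0 the final result two seconds later.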

> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_process_request)  
>    info: Forwarding cib_modify operation for section status to all 
> (origin=local/crmd/65)
> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_perform_op)     
> info: Diff: --- 0.131.7 2
> Jul 22 12:38:38 node2.local pacemaker-based     [1719] (cib_perform_op)     
> info: Diff: +++ 0.131.8 (null)
> Jul 22 

Re: [ClusterLabs] pacemaker systemd resource

2020-07-22 Thread Andrei Borzenkov
On Wed, Jul 22, 2020 at 4:58 PM Ken Gaillot  wrote:

> On Wed, 2020-07-22 at 10:59 +0300, Хиль  Эдуард wrote:
> > Hi there! I have 2 nodes with Pacemaker 2.0.3, corosync 3.0.3 on
> > ubuntu 20 + 1 qdevice. I want to define new resource as systemd
> > unit dummy.service :
> >
> > [Unit]
> > Description=Dummy
> > [Service]
> > Restart=on-failure
> > StartLimitInterval=20
> > StartLimitBurst=5
> > TimeoutStartSec=0
> > RestartSec=5
> > Environment="HOME=/root"
> > SyslogIdentifier=dummy
> > ExecStart=/usr/local/sbin/dummy.sh
> > [Install]
> > WantedBy=multi-user.target
> >
> > and /usr/local/sbin/dummy.sh :
> >
> > #!/bin/bash
> > CNT=0
> > while true; do
> >   let CNT++
> >   echo "hello world $CNT"
> >   sleep 5
> > done
> >
> > and then i try to define it with: pcs resource create dummy.service
> > systemd:dummy op monitor interval="10s" timeout="15s"
> > after 2 seconds node2 reboot. In logs i see pacemaker in 2 seconds
> > tried to start this unit, and it started, but pacemaker somehow think
> > he is «Timed Out» . What i am doing wrong? Logs below.
>
> The start is timing out because the ExecStart script never returns.
>
>
The unit does not set Type=, so it defaults to Type=simple, and Type=simple
does not expect the script to go into the background. Quite the contrary:
systemd expects the ExecStart command to keep running, and going into the
background would be interpreted as "service terminated".

To quote the systemd documentation: "the service manager will consider the
unit started immediately after the main service process has been forked off.
It is expected that the process configured with ExecStart= is the main
process of the service".
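
To make the point concrete, here is a minimal sketch of the unit from the original post with the type spelled out. It writes the file to a temporary directory instead of /etc/systemd/system, so nothing on the host is touched. The explicit Type=simple line is added for illustration and is not in the original unit; for this unit it should be equivalent to leaving Type= unset.

```shell
# Write an illustrative copy of the unit to a scratch directory (assumption:
# a real deployment would install it under /etc/systemd/system instead).
unit_dir=$(mktemp -d)
cat > "$unit_dir/dummy.service" <<'EOF'
[Unit]
Description=Dummy

[Service]
# Added for illustration: the foreground loop in dummy.sh is the main process.
Type=simple
Restart=on-failure
RestartSec=5
SyslogIdentifier=dummy
ExecStart=/usr/local/sbin/dummy.sh

[Install]
WantedBy=multi-user.target
EOF

# Read the type back from the file we just wrote.
unit_type=$(sed -n 's/^Type=//p' "$unit_dir/dummy.service")
echo "Type=$unit_type"
```

On a real node, `systemctl show -p Type dummy.service` should report the same value once the unit is installed and `systemctl daemon-reload` has been run.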



> systemd starts processes but it doesn't daemonize them -- the script is
> responsible for doing that itself.


That is only the case for Type=forking.
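
For contrast, this is roughly what a launcher for a Type=forking unit would have to do. It is a sketch under assumptions: the pidfile location and the demo check at the end are made up for illustration, and the dummy.service in this thread does not need any of it.

```shell
# Hypothetical launcher for a Type=forking unit (not needed for Type=simple).
pidfile=$(mktemp)

# Run the long-lived loop in a background subshell, detached from our stdio.
(
  CNT=0
  while true; do
    CNT=$((CNT + 1))
    echo "hello world $CNT"
    sleep 1
  done
) >/dev/null 2>&1 </dev/null &

# Record the daemon PID so a service manager could track it (PIDFile=).
daemon_pid=$!
echo "$daemon_pid" > "$pidfile"

# A real launcher would exit 0 here; for Type=forking the parent's exit is
# what signals "started". For this demo we check the daemon and stop it again.
if kill -0 "$daemon_pid" 2>/dev/null; then
  daemon_state=running
else
  daemon_state=gone
fi
echo "daemon $daemon_pid is $daemon_state"
kill "$daemon_pid"
```

With Type=simple the background subshell and the pidfile go away entirely: the loop itself is the main process, which is exactly what the original dummy.sh already does.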



> You can search online for more
> details about daemonization, but most importantly you want to run your
> daemon as a subprocess in the background and have your main process
> return as soon as the daemon is ready for service.
>
>
> > Jul 21 15:53:41 node2.local pacemaker-controld[1813]:  notice: Result
> > of probe operation for dummy.service on node2.local: 7 (not running)
> > Jul 21 15:53:41 node2.local systemd[1]: Reloading.
> > Jul 21 15:53:42 node2.local systemd[1]:
> > /lib/systemd/system/dbus.socket:5: ListenStream= references a path
> > below legacy directory /var/run/, updating
> > /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please
> > update the unit file accordingly.
> > Jul 21 15:53:42 node2.local systemd[1]:
> > /lib/systemd/system/docker.socket:6: ListenStream= references a path
> > below legacy directory /var/run/, updating /var/run/docker.sock →
> > /run/docker.sock; please update the unit file accordingly.
> > Jul 21 15:53:42 node2.local pacemaker-execd[1808]:  notice: Giving up
> > on dummy.service start (rc=0): timeout (elapsed=259719ms, remaining=-
> > 159719ms)
> > Jul 21 15:53:42 node2.local pacemaker-controld[1813]:  error: Result
> > of start operation for dummy.service on node2.local: Timed Out
> > Jul 21 15:53:42 node2.local systemd[1]: Started Cluster Controlled
> > dummy.
> > Jul 21 15:53:42 node2.local dummy[9330]: hello world 1
> > Jul 21 15:53:42 node2.local systemd-udevd[922]: Network interface
> > NamePolicy= disabled on kernel command line, ignoring.
> > Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting
> > fail-count-dummy.service#start_0[node2.local]: (unset) -> INFINITY
> > Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting
> > last-failure-dummy.service#start_0[node2.local]: (unset) ->
> > 1595336022
> > Jul 21 15:53:42 node2.local systemd[1]: Reloading.
> > Jul 21 15:53:42 node2.local systemd[1]:
> > /lib/systemd/system/dbus.socket:5: ListenStream= references a path
> > below legacy directory /var/run/, updating
> > /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please
> > update the unit file accordingly.
> > Jul 21 15:53:42 node2.local systemd[1]:
> > /lib/systemd/system/docker.socket:6: ListenStream= references a path
> > below legacy directory /var/run/, updating /var/run/docker.sock →
> > /run/docker.sock; please update the unit file accordingly.
> > Jul 21 15:53:42 node2.local pacemaker-execd[1808]:  notice: Giving up
> > on dummy.service stop (rc=0): timeout (elapsed=317181ms, remaining=-
> > 217181ms)
> > Jul 21 15:53:42 node2.local pacemaker-controld[1813]:  error: Result
> > of stop operation for dummy.service on node2.local: Timed Out
> > Jul 21 15:53:42 node2.local systemd[1]: Stopping Daemon for dummy...
> > Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting
> > fail-count-dummy.service#stop_0[node2.local]: (unset) -> INFINITY
> > Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting
> > last-failure-dummy.service#stop_0[node2.local]: (unset) ->
> > 1595336022
> > Jul 21 15:53:42 node2.local systemd[1]: dummy.service: Succeeded.
> > Jul 21 15:53:42 node2.local systemd[1]: Stopped Daemon for dummy.
> > ... lost connection (node rebooting)
> >
> >
> > 

Re: [ClusterLabs] pacemaker systemd resource

2020-07-22 Thread Ken Gaillot
On Wed, 2020-07-22 at 10:59 +0300, Хиль  Эдуард wrote:
> Hi there! I have 2 nodes with Pacemaker 2.0.3, corosync 3.0.3 on
> ubuntu 20 + 1 qdevice. I want to define new resource as systemd
> unit dummy.service :
>  
> [Unit]
> Description=Dummy
> [Service]
> Restart=on-failure
> StartLimitInterval=20
> StartLimitBurst=5
> TimeoutStartSec=0
> RestartSec=5
> Environment="HOME=/root"
> SyslogIdentifier=dummy
> ExecStart=/usr/local/sbin/dummy.sh
> [Install]
> WantedBy=multi-user.target
>  
> and /usr/local/sbin/dummy.sh :
>  
> #!/bin/bash
> CNT=0
> while true; do
>   let CNT++
>   echo "hello world $CNT"
>   sleep 5
> done
>  
> and then i try to define it with: pcs resource create dummy.service
> systemd:dummy op monitor interval="10s" timeout="15s"
> after 2 seconds node2 reboot. In logs i see pacemaker in 2 seconds
> tried to start this unit, and it started, but pacemaker somehow think
> he is «Timed Out» . What i am doing wrong? Logs below.

The start is timing out because the ExecStart script never returns.

systemd starts processes but it doesn't daemonize them -- the script is
responsible for doing that itself. You can search online for more
details about daemonization, but most importantly you want to run your
daemon as a subprocess in the background and have your main process
return as soon as the daemon is ready for service.


> Jul 21 15:53:41 node2.local pacemaker-controld[1813]:  notice: Result
> of probe operation for dummy.service on node2.local: 7 (not running) 
> Jul 21 15:53:41 node2.local systemd[1]: Reloading.
> Jul 21 15:53:42 node2.local systemd[1]:
> /lib/systemd/system/dbus.socket:5: ListenStream= references a path
> below legacy directory /var/run/, updating
> /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please
> update the unit file accordingly.
> Jul 21 15:53:42 node2.local systemd[1]:
> /lib/systemd/system/docker.socket:6: ListenStream= references a path
> below legacy directory /var/run/, updating /var/run/docker.sock →
> /run/docker.sock; please update the unit file accordingly.
> Jul 21 15:53:42 node2.local pacemaker-execd[1808]:  notice: Giving up
> on dummy.service start (rc=0): timeout (elapsed=259719ms, remaining=-
> 159719ms)
> Jul 21 15:53:42 node2.local pacemaker-controld[1813]:  error: Result
> of start operation for dummy.service on node2.local: Timed Out 
> Jul 21 15:53:42 node2.local systemd[1]: Started Cluster Controlled
> dummy.
> Jul 21 15:53:42 node2.local dummy[9330]: hello world 1
> Jul 21 15:53:42 node2.local systemd-udevd[922]: Network interface
> NamePolicy= disabled on kernel command line, ignoring.
> Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting
> fail-count-dummy.service#start_0[node2.local]: (unset) -> INFINITY 
> Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting
> last-failure-dummy.service#start_0[node2.local]: (unset) ->
> 1595336022 
> Jul 21 15:53:42 node2.local systemd[1]: Reloading.
> Jul 21 15:53:42 node2.local systemd[1]:
> /lib/systemd/system/dbus.socket:5: ListenStream= references a path
> below legacy directory /var/run/, updating
> /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please
> update the unit file accordingly.
> Jul 21 15:53:42 node2.local systemd[1]:
> /lib/systemd/system/docker.socket:6: ListenStream= references a path
> below legacy directory /var/run/, updating /var/run/docker.sock →
> /run/docker.sock; please update the unit file accordingly.
> Jul 21 15:53:42 node2.local pacemaker-execd[1808]:  notice: Giving up
> on dummy.service stop (rc=0): timeout (elapsed=317181ms, remaining=-
> 217181ms)
> Jul 21 15:53:42 node2.local pacemaker-controld[1813]:  error: Result
> of stop operation for dummy.service on node2.local: Timed Out 
> Jul 21 15:53:42 node2.local systemd[1]: Stopping Daemon for dummy...
> Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting
> fail-count-dummy.service#stop_0[node2.local]: (unset) -> INFINITY 
> Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting
> last-failure-dummy.service#stop_0[node2.local]: (unset) ->
> 1595336022 
> Jul 21 15:53:42 node2.local systemd[1]: dummy.service: Succeeded.
> Jul 21 15:53:42 node2.local systemd[1]: Stopped Daemon for dummy.
> ... lost connection (node rebooting)
>  
>  
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 


Re: [ClusterLabs] pacemaker systemd resource

2020-07-22 Thread Хиль Эдуард

Hey, Andrei! Thanks for your time!
So is there no chance to do anything about it? :( 
The Pacemaker log is below.
 
 
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_process_request)    
 info: Forwarding cib_apply_diff operation for section 'all' to all 
(origin=local/cibadmin/2)
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: Diff: --- 0.130.94 2
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: Diff: +++ 0.131.0 29b403fcf3c8d30705dceac1ba701963
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: +  /cib:  @epoch=131, @num_updates=0
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: ++ /cib/configuration/resources:  
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: ++                                  
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: ++                                    
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: ++                                    
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: ++                                    
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: ++                                  
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: ++                                
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_process_request)    
 info: Completed cib_apply_diff operation for section 'all': OK (rc=0, 
origin=node2.local/cibadmin/2, version=0.131.0)
Jul 22 12:38:36 node2.local pacemaker-fenced    [1720] 
(update_cib_stonith_devices_v2)     info: Updating device list from the cib: 
create resources
Jul 22 12:38:36 node2.local pacemaker-fenced    [1720] (cib_devices_update)     
info: Updating devices to version 0.131.0
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_file_backup)     
info: Archived previous version as /var/lib/pacemaker/cib/cib-20.raw
Jul 22 12:38:36 node2.local pacemaker-fenced    [1720] (cib_device_update)     
info: Device mfs4.stonith has been disabled on node2.local: score=-INFINITY
Jul 22 12:38:36 node2.local pacemaker-based     [1719] 
(cib_file_write_with_digest)     info: Wrote version 0.131.0 of the CIB to disk 
(digest: 8a11f99f10fb5b69aee4da9460d9134b)
Jul 22 12:38:36 node2.local pacemaker-based     [1719] 
(cib_file_write_with_digest)     info: Reading cluster configuration file 
/var/lib/pacemaker/cib/cib.Ap1Vqv (digest: /var/lib/pacemaker/cib/cib.aiva1s)
Jul 22 12:38:36 node2.local pacemaker-execd     [1721] 
(process_lrmd_get_rsc_info)     info: Agent information for 'dummy.service' not 
in cache
Jul 22 12:38:36 node2.local pacemaker-execd     [1721] 
(process_lrmd_rsc_register)     info: Cached agent information for 
'dummy.service'
Jul 22 12:38:36 node2.local pacemaker-controld  [1724] (do_lrm_rsc_op)     
info: Performing key=13:23:7:76f4932e-716b-45b8-8fed-a20c3806df8a 
op=dummy.service_monitor_0
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_process_request)    
 info: Forwarding cib_modify operation for section status to all 
(origin=local/crmd/60)
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: Diff: --- 0.131.0 2
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: Diff: +++ 0.131.1 (null)
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: +  /cib:  @num_updates=1
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: ++ /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources:  

Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: ++                                                                

Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_process_request)    
 info: Completed cib_modify operation for section status: OK (rc=0, 
origin=node1.local/crmd/240, version=0.131.1)
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: Diff: --- 0.131.1 2
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: Diff: +++ 0.131.2 (null)
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: +  /cib:  @num_updates=2
Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: ++ /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources:  

Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_perform_op)     
info: ++                                                                

Jul 22 12:38:36 node2.local pacemaker-based     [1719] (cib_process_request)    
 info: Completed cib_modify operation for section status: OK (rc=0, 
origin=node2.local/crmd/60, version=0.131.2)
Jul 22 12:38:36 node2.local pacemaker-controld  [1724] (process_lrm_event)     
notice: 

Re: [ClusterLabs] pacemaker systemd resource

2020-07-22 Thread Хиль Эдуард

Hi Klaus! Thank you for your attention, but it doesn't work. I have added 
Type=simple and nothing changes. I think the problem is not in the service. As we can 
see from the logs, the service does start (Jul 21 15:53:42 node2.local 
dummy[9330]: hello world 1), but for some reason pacemaker doesn't see it (Jul 
21 15:53:42 node2.local pacemaker-controld[1813]:  error: Result of stop 
operation for dummy.service on node2.local: Timed Out) and it reaches that 
conclusion within 2 seconds (from 15:53:41 to 15:53:42). I have no idea what 
to do :(

  
>Wednesday, 22 July 2020, 13:15 +05:00, from Klaus Wenninger:
> 
>On 7/22/20 9:59 AM, Хиль Эдуард wrote:
>>Hi there! I have 2 nodes with Pacemaker 2.0.3 and corosync 3.0.3 on Ubuntu 20, plus 
>>1 qdevice. I want to define a new resource as the systemd unit dummy.service:
>> 
>>[Unit]
>>Description=Dummy
>>[Service] 
>Type=simple
>
>That could do the trick. Actually I thought simple would be
>the default but ...
>
>Klaus
>>Restart=on-failure
>>StartLimitInterval=20
>>StartLimitBurst=5
>>TimeoutStartSec=0
>>RestartSec=5
>>Environment="HOME=/root"
>>SyslogIdentifier=dummy
>>ExecStart=/usr/local/sbin/dummy.sh
>>[Install]
>>WantedBy=multi-user.target
>> 
>>and /usr/local/sbin/dummy.sh :
>> 
>>#!/bin/bash
>>CNT=0
>>while true; do
>>  let CNT++
>>  echo "hello world $CNT"
>>  sleep 5
>>done
>> 
>>and then I try to define it with: pcs resource create dummy.service 
>>systemd:dummy op monitor interval="10s" timeout="15s"
>>After 2 seconds node2 reboots. In the logs I see that pacemaker tried to 
>>start this unit within 2 seconds, and it started, but pacemaker somehow thinks it 
>>«Timed Out». What am I doing wrong? Logs below.
>> 
>> 
>>Jul 21 15:53:41 node2.local pacemaker-controld[1813]:  notice: Result of 
>>probe operation for dummy.service on node2.local: 7 (not running) 
>>Jul 21 15:53:41 node2.local systemd[1]: Reloading.
>>Jul 21 15:53:42 node2.local systemd[1]: /lib/systemd/system/dbus.socket:5: 
>>ListenStream= references a path below legacy directory /var/run/, updating 
>>/var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update 
>>the unit file accordingly.
>>Jul 21 15:53:42 node2.local systemd[1]: /lib/systemd/system/docker.socket:6: 
>>ListenStream= references a path below legacy directory /var/run/, updating 
>>/var/run/docker.sock → /run/docker.sock; please update the unit file 
>>accordingly.
>>Jul 21 15:53:42 node2.local pacemaker-execd[1808]:  notice: Giving up on 
>>dummy.service start (rc=0): timeout (elapsed=259719ms, remaining=-159719ms)
>>Jul 21 15:53:42 node2.local pacemaker-controld[1813]:  error: Result of start 
>>operation for dummy.service on node2.local: Timed Out 
>>Jul 21 15:53:42 node2.local systemd[1]: Started Cluster Controlled dummy.
>>Jul 21 15:53:42 node2.local dummy[9330]: hello world 1
>>Jul 21 15:53:42 node2.local systemd-udevd[922]: Network interface NamePolicy= 
>>disabled on kernel command line, ignoring.
>>Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting 
>>fail-count-dummy.service#start_0[node2.local]: (unset) -> INFINITY 
>>Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting 
>>last-failure-dummy.service#start_0[node2.local]: (unset) -> 1595336022 
>>Jul 21 15:53:42 node2.local systemd[1]: Reloading.
>>Jul 21 15:53:42 node2.local systemd[1]: /lib/systemd/system/dbus.socket:5: 
>>ListenStream= references a path below legacy directory /var/run/, updating 
>>/var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update 
>>the unit file accordingly.
>>Jul 21 15:53:42 node2.local systemd[1]: /lib/systemd/system/docker.socket:6: 
>>ListenStream= references a path below legacy directory /var/run/, updating 
>>/var/run/docker.sock → /run/docker.sock; please update the unit file 
>>accordingly.
>>Jul 21 15:53:42 node2.local pacemaker-execd[1808]:  notice: Giving up on 
>>dummy.service stop (rc=0): timeout (elapsed=317181ms, remaining=-217181ms)
>>Jul 21 15:53:42 node2.local pacemaker-controld[1813]:  error: Result of stop 
>>operation for dummy.service on node2.local: Timed Out 
>>Jul 21 15:53:42 node2.local systemd[1]: Stopping Daemon for dummy...
>>Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting 
>>fail-count-dummy.service#stop_0[node2.local]: (unset) -> INFINITY 
>>Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting 
>>last-failure-dummy.service#stop_0[node2.local]: (unset) -> 1595336022 
>>Jul 21 15:53:42 node2.local systemd[1]: dummy.service: Succeeded.
>>Jul 21 15:53:42 node2.local systemd[1]: Stopped Daemon for dummy.
>>... lost connection (node rebooting)
>> 
>>   
>> 
>>___
>>Manage your subscription:
>>https://lists.clusterlabs.org/mailman/listinfo/users
>>
>>ClusterLabs home: https://www.clusterlabs.org/
>> 
 
   


Re: [ClusterLabs] pacemaker systemd resource

2020-07-22 Thread Andrei Borzenkov
On Wed, Jul 22, 2020 at 10:59 AM Хиль Эдуард  wrote:

> Hi there! I have 2 nodes with Pacemaker 2.0.3 and corosync 3.0.3 on Ubuntu 20,
> plus 1 qdevice. I want to define a new resource as the systemd unit
> *dummy.service*:
>
> [Unit]
> Description=Dummy
> [Service]
> Restart=on-failure
> StartLimitInterval=20
> StartLimitBurst=5
> TimeoutStartSec=0
> RestartSec=5
> Environment="HOME=/root"
> SyslogIdentifier=dummy
> ExecStart=/usr/local/sbin/dummy.sh
> [Install]
> WantedBy=multi-user.target
>
> and /usr/local/sbin/dummy.sh :
>
> #!/bin/bash
> CNT=0
> while true; do
>   let CNT++
>   echo "hello world $CNT"
>   sleep 5
> done
>
> and then I try to define it with: pcs resource create dummy.service
> systemd:dummy op monitor interval="10s" timeout="15s"
> After 2 seconds node2 reboots.
>

The node reboots because the stop operation failed, not the start.
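
To make the start/stop distinction easier to see, here is a minimal sketch that pulls the failed operations out of pacemaker-controld journal lines. The helper name failed_ops is hypothetical (it is not a Pacemaker tool), and it assumes each log entry is on a single unwrapped line. The two sample lines are copied from the log quoted in this thread.

```shell
#!/bin/sh
# Sketch: list failed operations reported by pacemaker-controld.
# failed_ops is a hypothetical helper. It reads unwrapped journal lines
# on stdin and prints: <operation> <resource> <node> <result>
failed_ops() {
  sed -n -E 's/.*pacemaker-controld\[[0-9]+\]: +error: Result of ([a-z]+) operation for ([^ ]+) on ([^ ]+): (.*)$/\1 \2 \3 \4/p'
}

# Two entries taken from the quoted log, joined back onto single lines.
sample='Jul 21 15:53:42 node2.local pacemaker-controld[1813]:  error: Result of start operation for dummy.service on node2.local: Timed Out
Jul 21 15:53:42 node2.local pacemaker-controld[1813]:  error: Result of stop operation for dummy.service on node2.local: Timed Out'

# All failed operations:
printf '%s\n' "$sample" | failed_ops

# Only the stop failure, which is the one that precedes the reboot:
printf '%s\n' "$sample" | failed_ops | grep '^stop '
```

With fencing configured, a failed stop is normally escalated to fencing the node, which would match the reboot seen right after the stop timeout.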



> In the logs I see that pacemaker tried to start this unit within 2 seconds, and it
> started, but pacemaker somehow thinks it «Timed Out». What am I doing
> wrong? Logs below.
>
>
> Jul 21 15:53:41 node2.local pacemaker-controld[1813]:  notice: Result of
> probe operation for dummy.service on node2.local: 7 (not running)
> Jul 21 15:53:41 node2.local systemd[1]: Reloading.
> Jul 21 15:53:42 node2.local systemd[1]: /lib/systemd/system/dbus.socket:5:
> ListenStream= references a path below legacy directory /var/run/, updating
> /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please
> update the unit file accordingly.
> Jul 21 15:53:42 node2.local systemd[1]:
> /lib/systemd/system/docker.socket:6: ListenStream= references a path below
> legacy directory /var/run/, updating /var/run/docker.sock →
> /run/docker.sock; please update the unit file accordingly.
> Jul 21 15:53:42 node2.local pacemaker-execd[1808]:  notice: Giving up on
> dummy.service start (rc=0): timeout (elapsed=259719ms, remaining=-159719ms)
> Jul 21 15:53:42 node2.local pacemaker-controld[1813]:  error: Result of
> start operation for dummy.service on node2.local: Timed Out
> Jul 21 15:53:42 node2.local systemd[1]: Started Cluster Controlled dummy.
> Jul 21 15:53:42 node2.local dummy[9330]: hello world 1
> Jul 21 15:53:42 node2.local systemd-udevd[922]: Network interface
> NamePolicy= disabled on kernel command line, ignoring.
> Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting
> fail-count-dummy.service#start_0[node2.local]: (unset) -> INFINITY
> Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting
> last-failure-dummy.service#start_0[node2.local]: (unset) -> 1595336022
> Jul 21 15:53:42 node2.local systemd[1]: Reloading.
> Jul 21 15:53:42 node2.local systemd[1]: /lib/systemd/system/dbus.socket:5:
> ListenStream= references a path below legacy directory /var/run/, updating
> /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please
> update the unit file accordingly.
> Jul 21 15:53:42 node2.local systemd[1]:
> /lib/systemd/system/docker.socket:6: ListenStream= references a path below
> legacy directory /var/run/, updating /var/run/docker.sock →
> /run/docker.sock; please update the unit file accordingly.
> Jul 21 15:53:42 node2.local pacemaker-execd[1808]:  notice: Giving up on
> dummy.service stop (rc=0): timeout (elapsed=317181ms, remaining=-217181ms)
>

317181ms is a little over 5 minutes. Barring a pacemaker bug, you need to show the
pacemaker log from the very first start operation so we can see the proper timing.
Seeing that systemd was reloaded in between, it is quite possible that
systemd lost track of the pending job, so any client waiting for confirmation
hangs forever. Such problems were known; I am not sure what the current status is
(or whether it was ever fixed).
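
The timing arithmetic can be checked with a small sketch. It assumes that elapsed plus remaining in a "Giving up" line equals the timeout that was in effect; that reading is an assumption, not something the log states. The helper name giving_up_times is hypothetical.

```shell
#!/bin/sh
# Sketch: recover timing figures from a pacemaker-execd "Giving up" message.
# Assumption: elapsed + remaining = the operation timeout in effect.
# giving_up_times is a hypothetical helper; it prints
#   <elapsed_ms> <remaining_ms> <timeout_ms>
giving_up_times() {
  times=$(printf '%s\n' "$1" |
    sed -n -E 's/.*elapsed=([0-9]+)ms, remaining=(-?[0-9]+)ms.*/\1 \2/p')
  [ -n "$times" ] || return 1        # line did not match
  elapsed=${times% *}                # text before the space
  remaining=${times#* }              # text after the space
  echo "$elapsed $remaining $((elapsed + remaining))"
}

# Messages copied from the quoted log (unwrapped).
start_line='Giving up on dummy.service start (rc=0): timeout (elapsed=259719ms, remaining=-159719ms)'
stop_line='Giving up on dummy.service stop (rc=0): timeout (elapsed=317181ms, remaining=-217181ms)'

giving_up_times "$start_line"   # 259719 -159719 100000
giving_up_times "$stop_line"    # 317181 -217181 100000
```

Under that assumption both operations had a 100000ms (100s) timeout, yet the reported elapsed times are about 260s and 317s, far longer than the 2 seconds visible in the journal. That is why the log from the very first start operation is needed.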



> Jul 21 15:53:42 node2.local pacemaker-controld[1813]:  error: Result of
> stop operation for dummy.service on node2.local: Timed Out
> Jul 21 15:53:42 node2.local systemd[1]: Stopping Daemon for dummy...
> Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting
> fail-count-dummy.service#stop_0[node2.local]: (unset) -> INFINITY
> Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting
> last-failure-dummy.service#stop_0[node2.local]: (unset) -> 1595336022
> Jul 21 15:53:42 node2.local systemd[1]: dummy.service: Succeeded.
> Jul 21 15:53:42 node2.local systemd[1]: Stopped Daemon for dummy.
> ... lost connection (node rebooting)
>
>


Re: [ClusterLabs] pacemaker systemd resource

2020-07-22 Thread Klaus Wenninger
On 7/22/20 9:59 AM, Хиль Эдуард wrote:
> Hi there! I have 2 nodes with Pacemaker 2.0.3 and corosync 3.0.3 on
> Ubuntu 20, plus 1 qdevice. I want to define a new resource as the systemd
> unit *dummy.service*:
>  
> [Unit]
> Description=Dummy
> [Service]
Type=simple

That could do the trick. Actually I thought simple would be
the default but ...

Klaus
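
On the question of the default: as far as the systemd.service documentation goes, Type=simple is the default when ExecStart= is set and neither Type= nor BusName= is given. A minimal sketch of that rule, applied to the unit text, follows. The helper name effective_type is hypothetical, and the sketch only covers this one rule (it prints "unknown" for anything else).

```shell
#!/bin/sh
# Sketch: guess the effective Type= of a service unit read from stdin.
# effective_type is a hypothetical helper that only models one rule:
# an explicit Type= wins; otherwise ExecStart= without BusName= means simple.
effective_type() {
  awk -F= '
    /^\[/                           { in_service = ($0 == "[Service]"); next }
    in_service && $1 == "Type"      { type = $2 }
    in_service && $1 == "BusName"   { busname = 1 }
    in_service && $1 == "ExecStart" { execstart = 1 }
    END {
      if (type != "")                 print type
      else if (execstart && !busname) print "simple"
      else                            print "unknown"
    }'
}

# The unit from this thread, without an explicit Type= line.
effective_type <<'EOF'
[Unit]
Description=Dummy
[Service]
Restart=on-failure
ExecStart=/usr/local/sbin/dummy.sh
[Install]
WantedBy=multi-user.target
EOF

# On a live system the value systemd actually uses can be read with:
#   systemctl show -p Type dummy.service
```

If that rule applies here, adding Type=simple would not change how the unit behaves.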
> Restart=on-failure
> StartLimitInterval=20
> StartLimitBurst=5
> TimeoutStartSec=0
> RestartSec=5
> Environment="HOME=/root"
> SyslogIdentifier=dummy
> ExecStart=/usr/local/sbin/dummy.sh
> [Install]
> WantedBy=multi-user.target
>  
> and /usr/local/sbin/dummy.sh :
>  
> #!/bin/bash
> CNT=0
> while true; do
>   let CNT++
>   echo "hello world $CNT"
>   sleep 5
> done
>  
> and then I try to define it with: pcs resource create dummy.service
> systemd:dummy op monitor interval="10s" timeout="15s"
> After 2 seconds node2 reboots. In the logs I see that pacemaker tried
> to start this unit within 2 seconds, and it started, but pacemaker somehow
> thinks it «Timed Out». What am I doing wrong? Logs below.
>  
>  
> Jul 21 15:53:41 node2.local pacemaker-controld[1813]:  notice: Result
> of probe operation for dummy.service on node2.local: 7 (not running) 
> Jul 21 15:53:41 node2.local systemd[1]: Reloading.
> Jul 21 15:53:42 node2.local systemd[1]:
> /lib/systemd/system/dbus.socket:5: ListenStream= references a path
> below legacy directory /var/run/, updating
> /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please
> update the unit file accordingly.
> Jul 21 15:53:42 node2.local systemd[1]:
> /lib/systemd/system/docker.socket:6: ListenStream= references a path
> below legacy directory /var/run/, updating /var/run/docker.sock →
> /run/docker.sock; please update the unit file accordingly.
> Jul 21 15:53:42 node2.local pacemaker-execd[1808]:  notice: Giving up
> on dummy.service start (rc=0): timeout (elapsed=259719ms,
> remaining=-159719ms)
> Jul 21 15:53:42 node2.local pacemaker-controld[1813]:  error: Result
> of start operation for dummy.service on node2.local: Timed Out 
> Jul 21 15:53:42 node2.local systemd[1]: Started Cluster Controlled dummy.
> Jul 21 15:53:42 node2.local dummy[9330]: hello world 1
> Jul 21 15:53:42 node2.local systemd-udevd[922]: Network interface
> NamePolicy= disabled on kernel command line, ignoring.
> Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting
> fail-count-dummy.service#start_0[node2.local]: (unset) -> INFINITY 
> Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting
> last-failure-dummy.service#start_0[node2.local]: (unset) -> 1595336022 
> Jul 21 15:53:42 node2.local systemd[1]: Reloading.
> Jul 21 15:53:42 node2.local systemd[1]:
> /lib/systemd/system/dbus.socket:5: ListenStream= references a path
> below legacy directory /var/run/, updating
> /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please
> update the unit file accordingly.
> Jul 21 15:53:42 node2.local systemd[1]:
> /lib/systemd/system/docker.socket:6: ListenStream= references a path
> below legacy directory /var/run/, updating /var/run/docker.sock →
> /run/docker.sock; please update the unit file accordingly.
> Jul 21 15:53:42 node2.local pacemaker-execd[1808]:  notice: Giving up
> on dummy.service stop (rc=0): timeout (elapsed=317181ms,
> remaining=-217181ms)
> Jul 21 15:53:42 node2.local pacemaker-controld[1813]:  error: Result
> of stop operation for dummy.service on node2.local: Timed Out 
> Jul 21 15:53:42 node2.local systemd[1]: Stopping Daemon for dummy...
> Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting
> fail-count-dummy.service#stop_0[node2.local]: (unset) -> INFINITY 
> Jul 21 15:53:42 node2.local pacemaker-attrd[1809]:  notice: Setting
> last-failure-dummy.service#stop_0[node2.local]: (unset) -> 1595336022 
> Jul 21 15:53:42 node2.local systemd[1]: dummy.service: Succeeded.
> Jul 21 15:53:42 node2.local systemd[1]: Stopped Daemon for dummy.
> ... lost connection (node rebooting)
>  
>  
>