Hi,

On Wed, Nov 03, 2010 at 05:08:50PM +0100, Eberhard Kuemmerle wrote:
> On 3 Nov 2010 11:06, Dejan Muhamedagic wrote:
> > On Tue, Nov 02, 2010 at 06:45:08PM +0100, Dejan Muhamedagic wrote:
> >
> >> On Tue, Nov 02, 2010 at 04:26:40PM +0100, Eberhard Kuemmerle wrote:
> >>
> >>> On 2 Nov 2010 16:15 02.11.2010 16:18, Eberhard Kuemmerle wrote:
> >>>
> >>>> Hi,
> >>>> here is what you requested:
> >>>>
> >>>> TEST 1:
> >>>> stonith -t rcd_serial -p "test /dev/ttyS0 rts 2000" test
> >>>> ** (process:2928): DEBUG: rcd_serial_set_config:called
> >>>> Alarm clock
> >>>> # echo $?
> >>>> 142
> >>>>
> >>>> TEST 2:
> >>>> stonith -t rcd_serial hostlist="node2" ttydev="/dev/ttyS0" dtr_rts="rts"
> >>>> msduration="2000" -S
> >>>> ** (process:6851): DEBUG: rcd_serial_set_config:called
> >>>> stonith: rcd_serial device OK.
> >>>> # echo $?
> >>>> 0
> >>>>
> >>>> TEST 3:
> >>>> stonith -t rcd_serial hostlist="node2" ttydev="/dev/ttyS0" dtr_rts="rts"
> >>>> msduration="2000" -T reset node2
> >>>> ** (process:8142): DEBUG: rcd_serial_set_config:called
> >>>> Alarm clock
> >>>> # echo $?
> >>>> 142
> >>>>
> >>>> TEST 1 as well as TEST 2 caused a reboot of node2!
> >>>>
> >>>>
> >>> SORRY, that's wrong!
> >>> I wanted to say:
> >>> TEST 1 as well as TEST 3 caused a reboot of node2!
> >>>
> >> Well, then there seems to be a problem with rcd_serial.
> >> According to the exit code (142 = 128 + 14), it seems like the
> >> plugin instance gets killed by the ALRM signal. The signal
> >> should've been caught, but there is something wrong with the
> >> registration of the signal handler.
> >>
> >> Looks like this fails unexpectedly:
> >>
> >> #if !defined(HAVE_POSIX_SIGNALS)
> >>
> >> because our autoconf doesn't do tests for signal implementation.
> >>
> >> Can you please try the attached patch? You'll have to rebuild
> >> the package for that.
> >>
> > If you've wondered which patch, here's finally one.
> >
> > Thanks,
> >
> > Dejan
> >
> > -------------- next part --------------
> > A non-text attachment was scrubbed...
> > Name: have-posix-signals.patch
> > Type: text/x-diff
> > Size: 1032 bytes
> > Desc: not available
> > URL:
> > <http://oss.clusterlabs.org/pipermail/pacemaker/attachments/20101103/a5cd5005/attachment-0001.bin>
> >
> Wow, success!
>
> With your patch and additionally replacing 'dtr|rts' by 'dtr_rts' in
> rcd_serial.c, everything works fine!!!

Great.

> There are still some strange entries in /var/log/messages, but the
> STONITH action is performed correctly!
>
> Just for your information, here are the messages:
>
> Nov 3 16:41:50 node2 pengine: [5327]: WARN: stage6: Scheduling Node
> node1 for STONITH
> Nov 3 16:41:50 node2 stonith-ng: [5323]: WARN: parse_host_line: Could
> not parse (0 2): ** (process:8669): DEBUG: rcd_serial_set_config:called
> Nov 3 16:41:50 node2 stonith-ng: [5323]: WARN: parse_host_line: Could
> not parse (3 18): (process:8669): DEBUG: rcd_serial_set_config:called
> Nov 3 16:41:50 node2 stonith-ng: [5323]: WARN: parse_host_line: Could
> not parse (0 0):
> Nov 3 16:41:50 node2 pengine: [5327]: WARN: process_pe_message:
> Transition 102: WARNINGs found during PE processing. PEngine Input
> stored in: /var/lib/pengine/pe-warn-0.bz2
> Nov 3 16:41:52 node2 crmd: [5328]: notice: crmd_peer_update: Status
> update: Client node1/crmd now has status [offline] (DC=true)
> Nov 3 16:41:52 node2 crmd: [5328]: notice: run_graph: Transition 102
> (Complete=11, Pending=0, Fired=0, Skipped=23, Incomplete=11,
> Source=/var/lib/pengine/pe-warn-0.bz2): Stopped
> Nov 3 16:41:52 node2 lrmd: [5325]: ERROR: crm_abort: crm_strdup_fn:
> Triggered assert at utils.c:964 : src != NULL
> Nov 3 16:41:52 node2 lrmd: [5325]: ERROR: crm_strdup_fn: Could not
> perform copy at st_client.c:514 (stonith_api_device_metadata)

I guess that these two were fixed in the meantime. Can you post
the output of "crmd version"?

Thanks,

Dejan

> Nov 3 16:41:52 node2 lrmd: [5325]: WARN: stonith_api_device_metadata:
> no short description in rcd_serial's metadata.
>
> Thank you very much!
> Eberhard
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs:
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker