On 04/27/2016 04:19 PM, renayama19661...@ybb.ne.jp wrote:
> Hi All,
>
> We have a request for a new SNMP function.
>
>
> The order of traps is not right.
>
> The turn of the trap is not sometimes followed.
> This is because the handling of notice carries out "path" in async.
> I think that it is necessary to wait for completion of the practice at "path"
> unit of "alerts".
>
> The turn of the trap is different from the real stop order of the resource.

Writing the alerts to a local list and having the alert scripts called in a
serialized manner would lead to the snmptrap tool creating timestamps in the
order of occurrence of the alerts. If the SNMP manager then orders the traps
by timestamp, this would indeed lead to seeing them in the order in which
they occurred.
But this approach has a number of drawbacks:

- it works only when the traps are coming from one node, as there is no way
  to serialize across nodes - at least none that would work under all
  circumstances in which we want alerts to be delivered
- it distorts the timestamps even further from the points in time when the
  alerts were triggered, making the result in a multi-node scenario even
  worse and making it hard to correlate with other sources of information
  like logfiles
- if you imagine a scenario with multiple mechanisms of delivering an alert
  plus multiple recipients, we couldn't use a single list; we would need
  something more complicated to prevent unneeded delays, delays coming from
  one of the delivery methods not working properly due to e.g. a recipient
  that is not reachable, ...
  (all solvable of course, but if it doesn't solve your problem in the first
  place, why the effort)

The alternative approach taken doesn't create the timestamps in the scripts
but provides timestamps to the scripts already.
This way it doesn't matter if the execution of the script is delayed.

A short example of how this approach could be used with snmp-traps:

edit pcmk_snmp_helper.sh:

...
starttickfile="/var/run/starttick"

# hack to have a reference
# can have it e.g. in an attribute to be visible throughout the cluster
if [ ! -f ${starttickfile} ] ; then
    echo ${CRM_alert_timestamp} > ${starttickfile}
fi

starttick=`cat ${starttickfile}`
# ticks elapsed between the reference timestamp and this alert
ticks=$(( ${CRM_alert_timestamp} - ${starttick} ))

if [[ ${CRM_alert_rc} != 0 && ${CRM_alert_task} == "monitor" ]] || [[ ${CRM_alert_task} != "monitor" ]] ; then
    # This trap is compliant with PACEMAKER MIB
    # https://github.com/ClusterLabs/pacemaker/blob/master/extra/PCMK-MIB.txt
    /usr/bin/snmptrap -v 2c -c public ${CRM_alert_recipient} ${ticks} PACEMAKER-MIB::pacemakerNotificationTrap \
        PACEMAKER-MIB::pacemakerNotificationNode s "${CRM_alert_node}" \
        PACEMAKER-MIB::pacemakerNotificationResource s "${CRM_alert_rsc}" \
        PACEMAKER-MIB::pacemakerNotificationOperation s "${CRM_alert_task}" \
        PACEMAKER-MIB::pacemakerNotificationDescription s "${CRM_alert_desc}" \
        PACEMAKER-MIB::pacemakerNotificationStatus i "${CRM_alert_status}" \
        PACEMAKER-MIB::pacemakerNotificationReturnCode i ${CRM_alert_rc} \
        PACEMAKER-MIB::pacemakerNotificationTargetReturnCode i ${CRM_alert_target_rc} && exit 0 || exit 1
fi

exit 0
...

add a section to the cib:

cibadmin --create --xml-text '<configuration>
  <alerts>
    <alert id="snmp_traps" path="/usr/share/pacemaker/tests/pcmk_snmp_helper.sh">
      <meta_attributes id="meta_snmp_traps">
        <nvpair id="snmp_timestamp" name="tstamp_format" value="%s%02N"/>
      </meta_attributes>
      <recipient id="trap_destination" value="192.168.123.3"/>
    </alert>
  </alerts>
</configuration>'

Once the traps are sorted by timestamp, this should give the correct order
without the ugly side-effects described above.

I hope I understood your scenario correctly and that this small example
shows roughly how I would suggest coping with the issue.
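
To illustrate the sorting step on the receiving side, here is a minimal
sketch. It is not part of pacemaker or of the helper script; the timestamp
values and the reference value are invented for illustration, and only the
resource names are taken from the log quoted below. It converts each
timestamp to ticks relative to a reference and sorts numerically, so the
result reflects the order of occurrence rather than the order of arrival:

```shell
#!/bin/sh
# Hypothetical sample data: "<timestamp in 1/100 s> <resource>", listed in
# the order the traps arrived at the manager (values are made up).
arrivals='146158132812 prmDummy3
146158132815 prmDummy4
146158132805 prmDummy1
146158132820 prmDummy2
146158132825 prmDummy5'

# Hypothetical reference point, playing the role of the starttick file.
starttick=146158132800

# Turn each timestamp into ticks relative to the reference, then sort
# numerically by ticks.
ordered=$(printf '%s\n' "$arrivals" | while read -r ts rsc; do
    echo "$(( $ts - $starttick )) $rsc"
done | sort -n)

printf '%s\n' "$ordered"
```

With these sample values the sorted list starts with prmDummy1 and ends with
prmDummy5, independent of the arrival order.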
Regards,
Klaus

>
> ----
> [root@rh72-01 ~]# grep Operation /var/log/ha-log | grep stop
> Apr 25 18:48:48 rh72-01 crmd[28897]: notice: Operation prmDummy1_stop_0: ok
> (node=rh72-01, call=33, rc=0, cib-update=56, confirmed=true)
> Apr 25 18:48:48 rh72-01 crmd[28897]: notice: Operation prmDummy3_stop_0: ok
> (node=rh72-01, call=37, rc=0, cib-update=57, confirmed=true)
> Apr 25 18:48:48 rh72-01 crmd[28897]: notice: Operation prmDummy4_stop_0: ok
> (node=rh72-01, call=39, rc=0, cib-update=58, confirmed=true)
> Apr 25 18:48:48 rh72-01 crmd[28897]: notice: Operation prmDummy2_stop_0: ok
> (node=rh72-01, call=35, rc=0, cib-update=59, confirmed=true)
> Apr 25 18:48:48 rh72-01 crmd[28897]: notice: Operation prmDummy5_stop_0: ok
> (node=rh72-01, call=41, rc=0, cib-update=60, confirmed=true)
>
> Apr 25 18:48:50 snmp-manager snmptrapd[6865]: 2016-04-25 18:48:50 <UNKNOWN>
> [UDP:
> [192.168.28.170]:40613->[192.168.28.189]:162]:#012DISMAN-EVENT-MIB::sysUpTimeInstance
> = Timeticks: (25512486) 2 days, 22:52:04.86#011SNMPv2-MIB::snmpTrapOID.0 =
> OID:
> PACEMAKER-MIB::pacemakerNotificationTrap#011PACEMAKER-MIB::pacemakerNotificationNode
> = STRING: "rh72-01"#011PACEMAKER-MIB::pacemakerNotificationResource =
> STRING: "prmDummy3"#011PACEMAKER-MIB::pacemakerNotificationOperation =
> STRING: "stop"#011PACEMAKER-MIB::pacemakerNotificationDescription = STRING:
> "ok"#011PACEMAKER-MIB::pacemakerNotificationStatus = INTEGER:
> 0#011PACEMAKER-MIB::pacemakerNotificationReturnCode = INTEGER:
> 0#011PACEMAKER-MIB::pacemakerNotificationTargetReturnCode = INTEGER: 0
> Apr 25 18:48:50 snmp-manager snmptrapd[6865]: 2016-04-25 18:48:50 <UNKNOWN>
> [UDP:
> [192.168.28.170]:39581->[192.168.28.189]:162]:#012DISMAN-EVENT-MIB::sysUpTimeInstance
> = Timeticks: (25512489) 2 days, 22:52:04.89#011SNMPv2-MIB::snmpTrapOID.0 =
> OID:
> PACEMAKER-MIB::pacemakerNotificationTrap#011PACEMAKER-MIB::pacemakerNotificationNode
> = STRING: "rh72-01"#011PACEMAKER-MIB::pacemakerNotificationResource =
> STRING: "prmDummy4"#011PACEMAKER-MIB::pacemakerNotificationOperation =
> STRING: "stop"#011PACEMAKER-MIB::pacemakerNotificationDescription = STRING:
> "ok"#011PACEMAKER-MIB::pacemakerNotificationStatus = INTEGER:
> 0#011PACEMAKER-MIB::pacemakerNotificationReturnCode = INTEGER:
> 0#011PACEMAKER-MIB::pacemakerNotificationTargetReturnCode = INTEGER: 0
> Apr 25 18:48:50 snmp-manager snmptrapd[6865]: 2016-04-25 18:48:50 <UNKNOWN>
> [UDP:
> [192.168.28.170]:37166->[192.168.28.189]:162]:#012DISMAN-EVENT-MIB::sysUpTimeInstance
> = Timeticks: (25512490) 2 days, 22:52:04.90#011SNMPv2-MIB::snmpTrapOID.0 =
> OID:
> PACEMAKER-MIB::pacemakerNotificationTrap#011PACEMAKER-MIB::pacemakerNotificationNode
> = STRING: "rh72-01"#011PACEMAKER-MIB::pacemakerNotificationResource =
> STRING: "prmDummy1"#011PACEMAKER-MIB::pacemakerNotificationOperation =
> STRING: "stop"#011PACEMAKER-MIB::pacemakerNotificationDescription = STRING:
> "ok"#011PACEMAKER-MIB::pacemakerNotificationStatus = INTEGER:
> 0#011PACEMAKER-MIB::pacemakerNotificationReturnCode = INTEGER:
> 0#011PACEMAKER-MIB::pacemakerNotificationTargetReturnCode = INTEGER: 0
> Apr 25 18:48:50 snmp-manager snmptrapd[6865]: 2016-04-25 18:48:50 <UNKNOWN>
> [UDP:
> [192.168.28.170]:53502->[192.168.28.189]:162]:#012DISMAN-EVENT-MIB::sysUpTimeInstance
> = Timeticks: (25512494) 2 days, 22:52:04.94#011SNMPv2-MIB::snmpTrapOID.0 =
> OID:
> PACEMAKER-MIB::pacemakerNotificationTrap#011PACEMAKER-MIB::pacemakerNotificationNode
> = STRING: "rh72-01"#011PACEMAKER-MIB::pacemakerNotificationResource =
> STRING: "prmDummy2"#011PACEMAKER-MIB::pacemakerNotificationOperation =
> STRING: "stop"#011PACEMAKER-MIB::pacemakerNotificationDescription = STRING:
> "ok"#011PACEMAKER-MIB::pacemakerNotificationStatus = INTEGER:
> 0#011PACEMAKER-MIB::pacemakerNotificationReturnCode = INTEGER:
> 0#011PACEMAKER-MIB::pacemakerNotificationTargetReturnCode = INTEGER: 0
> Apr 25 18:48:50 snmp-manager snmptrapd[6865]: 2016-04-25 18:48:50 <UNKNOWN>
> [UDP:
> [192.168.28.170]:45956->[192.168.28.189]:162]:#012DISMAN-EVENT-MIB::sysUpTimeInstance
> = Timeticks: (25512497) 2 days, 22:52:04.97#011SNMPv2-MIB::snmpTrapOID.0 =
> OID:
> PACEMAKER-MIB::pacemakerNotificationTrap#011PACEMAKER-MIB::pacemakerNotificationNode
> = STRING: "rh72-01"#011PACEMAKER-MIB::pacemakerNotificationResource =
> STRING: "prmDummy5"#011PACEMAKER-MIB::pacemakerNotificationOperation =
> STRING: "stop"#011PACEMAKER-MIB::pacemakerNotificationDescription = STRING:
> "ok"#011PACEMAKER-MIB::pacemakerNotificationStatus = INTEGER:
> 0#011PACEMAKER-MIB::pacemakerNotificationReturnCode = INTEGER:
> 0#011PACEMAKER-MIB::pacemakerNotificationTargetReturnCode = INTEGER: 0
>
> ----
>
> I think that there is "timestamp" attribute for async by this change.
>
> The order of traps may be important to a user.
> I suggest addition to "alert" element with "orderd" attribute.
>
> * orderd
>   false : The present processing.
>   true : Control the transmission order of the trap.
>
> ----
> <configuration>
>   <alerts>
>     <alert id="notify_9"
>       path="/usr/share/pacemaker/tests/pcmk_alert_sample1.sh" ordered="true">
>       (snip)
>     </alert>
>     <alert id="notify_9"
>       path="/usr/share/pacemaker/tests/pcmk_alert_sample2.sh" ordered="false">
>       (snip)
>     </alert>
>   </alerts>
> </configuration>
>
> ----
>
> I send a patch to cope with this problem before.
> The former patch may be useful for the correction.
> * https://github.com/ClusterLabs/pacemaker/pull/847
>
> I intend to write the patch if everybody agrees to "ordered" attribute.
>
> Best Regards,
> Hideo Yamauchi.
>
> _______________________________________________
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org