15. Same configuration as Test Case #12, but with SI lock. Keep gdb in both SUs
in the csi remove callback and keep the timeout as 100 sec. Lock the SI and stop
the controller. Start the controller and allow csi remove to time out.
Two things:
SU2 has a Standby assignment (which is wrong), SU1 has no assignment.
Error at PL-4: SU-SI record addition failed
PM_SC-1:/home/nagu/views/staging # amf-state siass
safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SU2\,safSg=AmfDemo\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
saAmfSISUHAState=STANDBY(2)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
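To spot this kind of state quickly, the siass output above can be checked with a small script. This is only a rough sketch: it assumes the output has exactly the layout pasted above (a safSISU= line followed by a saAmfSISUHAState= line), and the names parse_siass and standby_without_active are made up for illustration, they are not part of OpenSAF.

```python
import re
from collections import defaultdict

# The SU part of the DN has its commas escaped as "\,"; the SI part does not.
SISU_RE = re.compile(r"^safSISU=((?:\\,|[^,])+),(safSi=.+)$")
HA_RE = re.compile(r"^saAmfSISUHAState=(\w+)\(\d+\)$")


def parse_siass(text):
    """Return {si_dn: [(su_dn, ha_state), ...]} from pasted siass output."""
    assignments = defaultdict(list)
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = SISU_RE.match(line)
        if m:
            # Remember the SU/SI pair until its HA state line shows up.
            current = (m.group(1).replace("\\,", ","), m.group(2))
            continue
        m = HA_RE.match(line)
        if m and current:
            su, si = current
            assignments[si].append((su, m.group(1)))
            current = None
    return dict(assignments)


def standby_without_active(assignments):
    """List SIs that have a STANDBY assignment but no ACTIVE one."""
    return sorted(
        si
        for si, pairs in assignments.items()
        if any(state == "STANDBY" for _, state in pairs)
        and not any(state == "ACTIVE" for _, state in pairs)
    )


# Sample copied from the output above.
SAMPLE = r"""
safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SU2\,safSg=AmfDemo\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
saAmfSISUHAState=STANDBY(2)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
"""

if __name__ == "__main__":
    for si in standby_without_active(parse_siass(SAMPLE)):
        print("Standby without Active:", si)
```

Run on the sample, it should flag only the AmfDemo SI, which matches the wrong Standby assignment on SU2.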
Syslog of PL-4:
Feb 9 21:24:50 PM_PL-4 osafamfnd[7998]: NO
'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' component restart probation timer
started (timeout: 60000000000 ns)
Feb 9 21:24:50 PM_PL-4 osafamfnd[7998]: NO Restarting a component of
'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' (comp restart count: 1)
Feb 9 21:24:50 PM_PL-4 osafamfnd[7998]: NO
'safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' faulted due to
'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
Feb 9 21:24:55 PM_PL-4 amf_demo_script: killproc /opt/amf_demo/amf_demo failed
Feb 9 21:24:55 PM_PL-4 amf_demo[8200]:
'safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' started
Feb 9 21:24:55 PM_PL-4 osafamfnd[7998]: NO Removed
'safSi=AmfDemo1,safApp=AmfDemo1' from 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1'
Feb 9 21:24:55 PM_PL-4 amf_demo[8200]: HC started with AMF
Feb 9 21:24:55 PM_PL-4 amf_demo[8200]: Registered with AMF
Feb 9 21:24:55 PM_PL-4 amf_demo[8200]: CSI Set - add
'safCsi=AmfDemo,safSi=AmfDemo,safApp=AmfDemo1' HAState Standby
Feb 9 21:24:55 PM_PL-4 amf_demo[8200]: name: abcdef, value: val1
Feb 9 21:24:55 PM_PL-4 amf_demo[8200]: name: abcdef, value: val2
Feb 9 21:24:55 PM_PL-4 osafamfnd[7998]: CR SU-SI record addition failed, SU=
safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1 : SI=safSi=AmfDemo,safApp=AmfDemo1
Feb 9 21:24:55 PM_PL-4 amf_demo[8200]: Health check 1
Feb 9 21:25:50 PM_PL-4 osafamfnd[7998]: NO
'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' Component or SU restart probation
timer expired
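As a cross-check of the probation timer lines in the syslog above, the timeout printed in ns can be compared with the gap between the "started" and "expired" timestamps. A minimal sketch, assuming the syslog timestamp format shown above, an English locale for the month name, and the year 2016 taken from the mail date; the helper names timeout_seconds and stamp are hypothetical.

```python
import re
from datetime import datetime

TIMEOUT_RE = re.compile(r"timeout: (\d+) ns")
STAMP_RE = re.compile(r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) ")


def timeout_seconds(line):
    """Convert the 'timeout: N ns' value in a syslog line to seconds."""
    m = TIMEOUT_RE.search(line)
    return int(m.group(1)) / 1e9 if m else None


def stamp(line, year=2016):
    """Parse the leading syslog timestamp; syslog omits the year, so it is assumed."""
    m = STAMP_RE.match(line)
    return datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")


# Entries copied from the PL-4 syslog above, joined onto one line each.
STARTED = ("Feb 9 21:24:50 PM_PL-4 osafamfnd[7998]: NO "
           "'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' component restart "
           "probation timer started (timeout: 60000000000 ns)")
EXPIRED = ("Feb 9 21:25:50 PM_PL-4 osafamfnd[7998]: NO "
           "'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' Component or SU restart "
           "probation timer expired")

elapsed = (stamp(EXPIRED) - stamp(STARTED)).total_seconds()

if __name__ == "__main__":
    print("configured:", timeout_seconds(STARTED), "s; observed:", elapsed, "s")
```

For the two entries above, both values come out as 60 seconds, so the probation timer itself behaved as configured.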
Thanks
-Nagu
> -----Original Message-----
> From: Nagendra Kumar
> Sent: 09 February 2016 20:44
> To: minh chau; [email protected]; [email protected];
> Praveen Malviya
> Cc: [email protected]
> Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf: Add support
> for cloud resilience [#1620] V2
>
> >> SI Swap again and the commands come out with success, but swap
> doesn't happen and syslog prints:
>
> Modification to #13: SU1 gets Act, but SU2 gets its assignment removed as an
> outcome of the SI swap.
>
> The next SI swap failed because there is only one assignment.
>
> > -----Original Message-----
> > From: Nagendra Kumar
> > Sent: 09 February 2016 20:41
> > To: minh chau; [email protected]; [email protected];
> > Praveen Malviya
> > Cc: [email protected]
> > Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf: Add
> > support for cloud resilience [#1620] V2
> >
> > 12. Issue shutdown on the SI and keep a sleep in the csi set callback, stop
> > the controller and let the csi set callback time out. Start SC-1 and immlist
> > the SI; it is in the shutting down state:
> > saAmfSIAdminState SA_UINT32_T 4 (0x4)
> > 13. Issue SI Swap of the appl SI (SU1 Act, SU2 Std): keep gdb in the Quiesced
> > csi callback, allow it to time out and stop the controller.
> > At one time: Start the controller, SU1 gets Standby and SU2 gets Act.
> > Now issue, SI Swap again and the commands come out with success, but
> > swap doesn't happen and syslog prints:
> > Feb 9 20:33:51 PM_SC-1 osafamfd[9497]: NO
> > safSi=AmfDemo,safApp=AmfDemo1 Swap initiated
> >
> > Please find the amfd trace attached.
> >
> > 14. Test Case #13, at another time: Amfnd crashes. Bt and syslog are below,
> > and Amfnd traces (osafamfnd-PL-3) are attached.
> >
> > Program terminated with signal 11, Segmentation fault.
> > #0 0x000000000041deaa in avnd_err_process(avnd_cb_tag*,
> > avnd_comp_tag*, avnd_err_tag*)
> > ()
> > (gdb) bt
> > #0 0x000000000041deaa in avnd_err_process(avnd_cb_tag*,
> > avnd_comp_tag*, avnd_err_tag*)
> > ()
> > #1 0x0000000000407559 in avnd_evt_tmr_cbk_resp_evh(avnd_cb_tag*,
> > avnd_evt_tag*) ()
> > #2 0x000000000042133f in avnd_main_process() () at main.cc:667
> > #3 0x0000000000405517 in main () at main.cc:186
> > (gdb) thread apply bt all
> > (gdb) thread apply all bt
> >
> > Thread 4 (Thread 0x7fe84b5b3b00 (LWP 7892)):
> > #0 0x00007fe84a4d976d in read () from /lib64/libpthread.so.0
> > #1 0x00007fe84b19af17 in ncs_exec_mod_hdlr () from
> > /usr/local/lib/libopensaf_core.so.0
> > #2 0x00007fe84a4d27b6 in start_thread () from /lib64/libpthread.so.0
> > #3 0x00007fe849a889cd in clone () from /lib64/libc.so.6
> > #4 0x0000000000000000 in ?? ()
> >
> > Thread 3 (Thread 0x7fe84b5d3b00 (LWP 7890)):
> > #0 0x00007fe849a7f4f6 in poll () from /lib64/libc.so.6
> > #1 0x00007fe84b1c5623 in mdtm_process_recv_events ()
> > from /usr/local/lib/libopensaf_core.so.0
> > #2 0x00007fe84a4d27b6 in start_thread () from /lib64/libpthread.so.0
> > #3 0x00007fe849a889cd in clone () from /lib64/libc.so.6
> > #4 0x0000000000000000 in ?? ()
> >
> > Thread 2 (Thread 0x7fe84b606b00 (LWP 7889)):
> > #0 0x00007fe849a7f4f6 in poll () from /lib64/libc.so.6
> > #1 0x00007fe84b18922f in osaf_ppoll () from
> > /usr/local/lib/libopensaf_core.so.0
> > #2 0x00007fe84b190acf in ncs_tmr_wait () from
> > /usr/local/lib/libopensaf_core.so.0
> > #3 0x00007fe84a4d27b6 in start_thread () from /lib64/libpthread.so.0
> > #4 0x00007fe849a889cd in clone () from /lib64/libc.so.6
> > #5 0x0000000000000000 in ?? ()
> >
> > Thread 1 (Thread 0x7fe84b5d6720 (LWP 7888)):
> > #0 0x000000000041deaa in avnd_err_process(avnd_cb_tag*,
> > avnd_comp_tag*, avnd_err_tag*)
> > ()
> > #1 0x0000000000407559 in avnd_evt_tmr_cbk_resp_evh(avnd_cb_tag*,
> > avnd_evt_tag*) ()
> > #2 0x000000000042133f in avnd_main_process() () at main.cc:667
> > #3 0x0000000000405517 in main () at main.cc:186
> >
> > Syslog:
> > Feb 9 20:05:44 PM_PL-3 osafimmnd[7869]: NO Re-introduce-me
> > highestProcessed:1514 highestReceived:1514
> > Feb 9 20:05:46 PM_PL-3 kernel: [117927.208595] TIPC: Resetting link
> > <1.1.3:eth0-1.1.1:eth0>, peer not responding
> > Feb 9 20:05:46 PM_PL-3 kernel: [117927.208604] TIPC: Lost link
> > <1.1.3:eth0-1.1.1:eth0> on network plane A
> > Feb 9 20:05:46 PM_PL-3 kernel: [117927.208610] TIPC: Lost contact with <1.1.1>
> > Feb 9 20:05:49 PM_PL-3 osafimmnd[7869]: WA MDS Send Failed to service:IMMD rc:2
> > Feb 9 20:05:49 PM_PL-3 osafamfnd[7888]: NO component with QUIESCED/QUIESCING
> > assignment failed
> > Feb 9 20:05:49 PM_PL-3 osafamfnd[7888]: NO recovery action 'comp restart'
> > escalated to 'comp failover'
> > Feb 9 20:05:49 PM_PL-3 osafamfnd[7888]: NO SU failover probation timer started
> > (timeout: 1200000000000 ns)
> > Feb 9 20:05:49 PM_PL-3 osafamfnd[7888]: NO Performing failover of
> > 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' (SU failover count: 1)
> > Feb 9 20:05:49 PM_PL-3 osafamfnd[7888]: NO
> > 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> > recovery action escalated from 'componentRestart' to 'componentFailover'
> > Feb 9 20:05:49 PM_PL-3 osafamfnd[7888]: NO
> > 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> > faulted due to 'csiSetcallbackTimeout' : Recovery is 'componentFailover'
> > Feb 9 20:05:49 PM_PL-3 osafamfnd[7888]: NO
> > 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State INSTANTIATED
> > => TERMINATING
> > Feb 9 20:05:49 PM_PL-3 osafamfnd[7888]: NO Removed
> > 'safSi=AmfDemo,safApp=AmfDemo1' from
> > 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> > Feb 9 20:05:49 PM_PL-3 osafamfnd[7888]: NO Assigned
> > 'safSi=AmfDemo1,safApp=AmfDemo1' QUIESCED to
> > 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
> > Feb 9 20:05:49 PM_PL-3 osafclmna[7879]: AL AMF Node Director is down,
> > terminate this process
> > Feb 9 20:05:49 PM_PL-3 osafamfwd[7947]: Rebooting OpenSAF NodeId = 0 EE Name =
> > No EE Mapped, Reason: AMF unexpectedly crashed, OwnNodeId = 131855,
> > SupervisionTime = 60
> > Feb 9 20:05:49 PM_PL-3 osafckptnd[7937]: AL AMF Node Director is down,
> > terminate this process
> > Feb 9 20:05:49 PM_PL-3 osaflcknd[7927]: AL AMF Node Director is down,
> > terminate this process
> > Feb 9 20:05:49 PM_PL-3 osafimmnd[7869]: AL AMF Node Director is down,
> > terminate this process
> > Feb 9 20:05:49 PM_PL-3 osafmsgnd[7908]: AL AMF Node Director is down,
> > terminate this process
> > Feb 9 20:05:49 PM_PL-3 osafsmfnd[7898]: AL AMF Node Director is down,
> > terminate this process
> > Feb 9 20:05:49 PM_PL-3 opensaf_reboot: Rebooting local node; timeout=60
> >
> >
> > > -----Original Message-----
> > > From: Nagendra Kumar
> > > Sent: 09 February 2016 19:40
> > > To: minh chau; [email protected]; [email protected];
> > > Praveen Malviya
> > > Cc: [email protected]
> > > Subject: Re: [devel] FW: [PATCH 0 of 5] Review Request for amf: Add
> > > support for cloud resilience [#1620] V2
> > >
> > > Testing continued....
> > >
> > > 11. Lock the SI, then unlock the SI, keep a sleep in the csi set callback
> > > and then reboot SC-1. Allow csi set to time out. When SC-1 is coming up,
> > > Amfd crashes. Complete Amfd logs are attached; Amfnd logs of SC-1 and
> > > PL-3 are coming in the next email.
> > >
> > > Thanks
> > > -Nagu
> > >
> > > > -----Original Message-----
> > > > From: Nagendra Kumar
> > > > Sent: 09 February 2016 15:57
> > > > To: minh chau; [email protected]; [email protected];
> > > > Praveen Malviya
> > > > Cc: [email protected]
> > > > Subject: RE: [devel] FW: [PATCH 0 of 5] Review Request for amf:
> > > > Add support for cloud resilience [#1620] V2
> > > >
> > > > Continued....
> > > >
> > > > > -----Original Message-----
> > > > > From: Nagendra Kumar [mailto:[email protected]]
> > > > > Sent: 09 February 2016 15:56
> > > > > To: 'minh chau'; '[email protected]';
> > > > > '[email protected]'; Praveen Malviya
> > > > > Cc: '[email protected]'
> > > > > Subject: RE: [devel] FW: [PATCH 0 of 5] Review Request for amf:
> > > > > Add support for cloud resilience [#1620] V2
> > > > >
> > > > > Hi Hans N,
> > > > > > Please find the amfd and amfnd traces of SC-1 and the amfnd traces
> > > > > > of PL-3 attached in the 3 emails that follow (because of the size
> > > > > > limit of the devel list, I am not able to send them in one go). It
> > > > > > took a second reboot to reproduce it for TC #6, but it is coming at
> > > > > > the same location.
> > > > >
> > > > > > Feb 9 15:32:28 PM_SC-1 osafamfd[3962]: NO Received node_up from
> > > > > > 2010f: msg_id 1
> > > > > > Feb 9 15:32:28 PM_SC-1 osafamfd[3962]: siass.cc:842:
> > > > > > avd_susi_recreate: Assertion 'su' failed.
> > > > > > Feb 9 15:32:28 PM_SC-1 osafamfnd[3972]: WA AMF director
> > > > > > unexpectedly crashed
> > > > > > Feb 9 15:32:28 PM_SC-1 osafamfnd[3972]: WA AMF director
> > > > > > unexpectedly crashed
> > > > >
> > > > > Thanks
> > > > > -Nagu
>
> _______________________________________________
> Opensaf-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/opensaf-devel