[tickets] [opensaf:tickets] #3350 ntf: crash when remove overdue notification
- **status**: review --> fixed

---

**[tickets:#3350] ntf: crash when remove overdue notification**

**Status:** fixed
**Milestone:** 5.24.09
**Created:** Thu Apr 11, 2024 04:15 AM UTC by PhanTranQuocDat
**Last Updated:** Wed Apr 17, 2024 08:12 AM UTC
**Owner:** PhanTranQuocDat

An ntfd coredump occurs after [ticket #3349](https://sourceforge.net/p/opensaf/tickets/3350/).

Reproduce:
1/ Send a STOP signal to logd to simulate a busy log service.
2/ Run ntfsend to send a single notification.
3/ Wait for the notification to become overdue and observe the ACTIVE node going down.

~~~
2024-04-11 11:09:55.848 SC-1 osafntfd[232]: NO Notification overdue, remove notification Id: 86
2024-04-11 11:09:55.901 SC-1 osafamfnd[281]: ER safComp=NTF,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery is:nodeFailfast
2024-04-11 11:09:55.901 SC-1 osafamfnd[281]: Rebooting OpenSAF NodeId = 2010f EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 2010f, SupervisionTime = 60
2024-04-11 11:09:55.901 SC-1 opensaf_reboot: Rebooting local node; timeout=60
~~~

---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/opensaf/admin/tickets/options. Or, if this is a mailing list, you can unsubscribe from the mailing list.

___
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets
[tickets] [opensaf:tickets] #3350 ntf: crash when remove overdue notification
commit 7b80b26c5ac19135820854a6edc6d50a5dea82dc (HEAD -> develop, origin/develop, ticket-3350)
Author: dat.tq.phan
Date: Thu Apr 11 11:22:32 2024 +0700

ntf: check existed notification before retrieving [#3350]

Related to ticket #3349: when ntf removes an overdue notification from the logger buffer, it tries to retrieve the next notification in the queue. If the queue holds only a single notification and it is overdue, then after removing it ntf tries to retrieve a non-existent notification, which crashes ntf.

The solution is to check that a notification exists in the queue before retrieving it. The check for an existing notification when logd calls saLogWriteCallback is also improved.

---

**[tickets:#3350] ntf: crash when remove overdue notification**

**Status:** review
**Milestone:** 5.24.09
**Created:** Thu Apr 11, 2024 04:15 AM UTC by PhanTranQuocDat
**Last Updated:** Wed Apr 17, 2024 06:19 AM UTC
**Owner:** PhanTranQuocDat

An ntfd coredump occurs after [ticket #3349](https://sourceforge.net/p/opensaf/tickets/3350/).

Reproduce:
1/ Send a STOP signal to logd to simulate a busy log service.
2/ Run ntfsend to send a single notification.
3/ Wait for the notification to become overdue and observe the ACTIVE node going down.

~~~
2024-04-11 11:09:55.848 SC-1 osafntfd[232]: NO Notification overdue, remove notification Id: 86
2024-04-11 11:09:55.901 SC-1 osafamfnd[281]: ER safComp=NTF,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery is:nodeFailfast
2024-04-11 11:09:55.901 SC-1 osafamfnd[281]: Rebooting OpenSAF NodeId = 2010f EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 2010f, SupervisionTime = 60
2024-04-11 11:09:55.901 SC-1 opensaf_reboot: Rebooting local node; timeout=60
~~~
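The guard described in the commit message above can be illustrated with a minimal C++ sketch. It is not the actual ntfd code: `PendingNotification`, `LoggerBuffer`, `RemoveOverdue` and `Next` are hypothetical names chosen for illustration, and the deadline-based overdue test is an assumption. The only point shown is that the queue is checked for emptiness before the next entry is read.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <optional>

// Hypothetical stand-in for a notification waiting to be written to the log.
struct PendingNotification {
  std::uint64_t id;
  std::int64_t deadline_ms;  // assumed: entry is overdue once now >= deadline
};

// Hypothetical stand-in for the logger buffer mentioned in the commit message.
class LoggerBuffer {
 public:
  void Push(PendingNotification n) { queue_.push_back(n); }

  // Drops overdue entries from the head of the queue; returns how many were removed.
  std::size_t RemoveOverdue(std::int64_t now_ms) {
    std::size_t removed = 0;
    while (!queue_.empty() && queue_.front().deadline_ms <= now_ms) {
      queue_.pop_front();
      ++removed;
    }
    return removed;
  }

  // Returns the next notification, or std::nullopt if none exists.
  // Without the empty() check, front() on an empty deque is undefined
  // behaviour, which is the kind of failure the ticket describes.
  std::optional<PendingNotification> Next() const {
    if (queue_.empty()) return std::nullopt;
    return queue_.front();
  }

 private:
  std::deque<PendingNotification> queue_;
};
```

With a single overdue entry in the buffer, `RemoveOverdue` empties the queue and a following `Next()` returns `std::nullopt` instead of dereferencing a missing element, so the caller can simply skip the retrieval step.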
[tickets] [opensaf:tickets] #3350 ntf: crash when remove overdue notification
- Description has changed:

Diff:

    --- old
    +++ new
    @@ -1,4 +1,4 @@
    -When ntf removes an overdue notification from logger buffer, it will try to retrieve next notification in queue to enhance notification processing speed. In case there's only a single notification in queue and it's overdue, after removing it, ntf will try to retrieve non-existed notification, causing ntf crash.
    +There is a ntfd coredump happens after[ ticket #3349](https://sourceforge.net/p/opensaf/tickets/3350/) .

     Reproduce:
     1/ Send STOP signal to logd stimulating log service is busy.

---

**[tickets:#3350] ntf: crash when remove overdue notification**

**Status:** assigned
**Milestone:** 5.24.09
**Created:** Thu Apr 11, 2024 04:15 AM UTC by PhanTranQuocDat
**Last Updated:** Thu Apr 11, 2024 04:40 AM UTC
**Owner:** PhanTranQuocDat

An ntfd coredump occurs after [ticket #3349](https://sourceforge.net/p/opensaf/tickets/3350/).

Reproduce:
1/ Send a STOP signal to logd to simulate a busy log service.
2/ Run ntfsend to send a single notification.
3/ Wait for the notification to become overdue and observe the ACTIVE node going down.

~~~
2024-04-11 11:09:55.848 SC-1 osafntfd[232]: NO Notification overdue, remove notification Id: 86
2024-04-11 11:09:55.901 SC-1 osafamfnd[281]: ER safComp=NTF,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery is:nodeFailfast
2024-04-11 11:09:55.901 SC-1 osafamfnd[281]: Rebooting OpenSAF NodeId = 2010f EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 2010f, SupervisionTime = 60
2024-04-11 11:09:55.901 SC-1 opensaf_reboot: Rebooting local node; timeout=60
~~~
[tickets] [opensaf:tickets] #3350 ntf: crash when remove overdue notification
- Description has changed:

Diff:

    --- old
    +++ new
    @@ -6,6 +6,7 @@
     3/ Wait for notification to be overdue and observe ACTIVE node down

     ~~~
    +2024-04-11 11:09:55.848 SC-1 osafntfd[232]: NO Notification overdue, remove notification Id: 86
     2024-04-11 11:09:55.901 SC-1 osafamfnd[281]: ER safComp=NTF,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery is:nodeFailfast
     2024-04-11 11:09:55.901 SC-1 osafamfnd[281]: Rebooting OpenSAF NodeId = 2010f EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 2010f, SupervisionTime = 60
     2024-04-11 11:09:55.901 SC-1 opensaf_reboot: Rebooting local node; timeout=60

---

**[tickets:#3350] ntf: crash when remove overdue notification**

**Status:** assigned
**Milestone:** 5.24.09
**Created:** Thu Apr 11, 2024 04:15 AM UTC by PhanTranQuocDat
**Last Updated:** Thu Apr 11, 2024 04:15 AM UTC
**Owner:** PhanTranQuocDat

When ntf removes an overdue notification from the logger buffer, it tries to retrieve the next notification in the queue to speed up notification processing. If the queue holds only a single notification and it is overdue, then after removing it ntf tries to retrieve a non-existent notification, which crashes ntf.

Reproduce:
1/ Send a STOP signal to logd to simulate a busy log service.
2/ Run ntfsend to send a single notification.
3/ Wait for the notification to become overdue and observe the ACTIVE node going down.

~~~
2024-04-11 11:09:55.848 SC-1 osafntfd[232]: NO Notification overdue, remove notification Id: 86
2024-04-11 11:09:55.901 SC-1 osafamfnd[281]: ER safComp=NTF,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery is:nodeFailfast
2024-04-11 11:09:55.901 SC-1 osafamfnd[281]: Rebooting OpenSAF NodeId = 2010f EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 2010f, SupervisionTime = 60
2024-04-11 11:09:55.901 SC-1 opensaf_reboot: Rebooting local node; timeout=60
~~~
[tickets] [opensaf:tickets] #3350 ntf: crash when remove overdue notification
---

**[tickets:#3350] ntf: crash when remove overdue notification**

**Status:** assigned
**Milestone:** 5.24.09
**Created:** Thu Apr 11, 2024 04:15 AM UTC by PhanTranQuocDat
**Last Updated:** Thu Apr 11, 2024 04:15 AM UTC
**Owner:** PhanTranQuocDat

When ntf removes an overdue notification from the logger buffer, it tries to retrieve the next notification in the queue to speed up notification processing. If the queue holds only a single notification and it is overdue, then after removing it ntf tries to retrieve a non-existent notification, which crashes ntf.

Reproduce:
1/ Send a STOP signal to logd to simulate a busy log service.
2/ Run ntfsend to send a single notification.
3/ Wait for the notification to become overdue and observe the ACTIVE node going down.

~~~
2024-04-11 11:09:55.901 SC-1 osafamfnd[281]: ER safComp=NTF,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery is:nodeFailfast
2024-04-11 11:09:55.901 SC-1 osafamfnd[281]: Rebooting OpenSAF NodeId = 2010f EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 2010f, SupervisionTime = 60
2024-04-11 11:09:55.901 SC-1 opensaf_reboot: Rebooting local node; timeout=60
~~~