[tickets] [opensaf:tickets] Re: #2648 smf: smfd crashes after cluster reboot when campaign is in ExecutionCompleted

2017-10-25 Thread Rafael Odzakow via Opensaf-tickets
That would work. As long as it is possible to roll back the campaign, it 
is fine.
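
For reference, the check proposed in the quoted message below could look roughly like this. It is only a minimal sketch: `CampaignSketch`, `CampaignState`, and `initCount()` are hypothetical stand-ins, not the real `SmfCampaign` interface, and the actual initExecution() in smfd does considerably more.

```cpp
// Hypothetical campaign states; the real SMF state set is larger.
enum class CampaignState { Initial, Executing, ExecutionCompleted, Committed };

// Hypothetical stand-in for SmfCampaign, reduced to the one check discussed.
class CampaignSketch {
 public:
  explicit CampaignSketch(CampaignState state) : state_(state) {}

  // Returns true if execution was initialized, false if it was skipped.
  bool initExecution() {
    if (state_ == CampaignState::ExecutionCompleted) {
      // The campaign already ran to completion: do nothing and leave the
      // state untouched, so it can still be committed or rolled back.
      return false;
    }
    ++init_count_;  // placeholder for the real initialization work
    return true;
  }

  CampaignState state() const { return state_; }
  int initCount() const { return init_count_; }

 private:
  CampaignState state_;
  int init_count_ = 0;
};
```

With this shape, calling initExecution() on a campaign in ExecutionCompleted returns false and performs no initialization work, while a campaign in any other state is initialized as before.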


On 10/20/2017 03:18 PM, Alex Jones wrote:
>
> I understand the intention. It makes sense.
>
> One of the other solutions I had considered is to put a check at the 
> beginning of SmfCampaign::initExecution(). If the campaign state is 
> EXECUTION_COMPLETED, then just return. What is the point of 
> re-executing a campaign that has already completed?
>
> Are you OK with that?
>
> 
>
> *[tickets:#2648]  
> smf: smfd crashes after cluster reboot when campaign is in 
> ExecutionCompleted*
>
> *Status:* review
> *Milestone:* 5.17.10
> *Created:* Thu Oct 19, 2017 06:45 PM UTC by Alex Jones
> *Last Updated:* Fri Oct 20, 2017 10:04 AM UTC
> *Owner:* Alex Jones
>
> smfd crashes in updateImmAttr because it returns NO_RESOURCES. Here is 
> how to reproduce:
>
>  1. enable PBE, and make sure the "disable" flag is set in
> OpenSafSmfConfig
>  2. execute an upgrade campaign, and let it go to "execution
> completed", but don't commit it
>  3. reboot the entire cluster
>  4. only allow 1 system controller to come up
>  5. smfd will attempt to re-execute the campaign
>  6. any writes to IMM (like setting an error because the campaign file
> can't be found) will fail with NO_RESOURCES and smfd will assert
> and crash
>
> smfd asserts and crashes because it has not turned off PBE before the 
> campaign is initialized.
>
> 
>
> Sent from sourceforge.net because you indicated interest in 
> https://sourceforge.net/p/opensaf/tickets/2648/
>
> To unsubscribe from further messages, please visit 
> https://sourceforge.net/auth/subscriptions/
>




---

** [tickets:#2648] smf: smfd crashes after cluster reboot when campaign is in 
ExecutionCompleted**

**Status:** review
**Milestone:** 5.17.10
**Created:** Thu Oct 19, 2017 06:45 PM UTC by Alex Jones
**Last Updated:** Fri Oct 20, 2017 01:18 PM UTC
**Owner:** Alex Jones


smfd crashes in updateImmAttr because it returns NO_RESOURCES. Here is how to 
reproduce:

1. enable PBE, and make sure the "disable" flag is set in OpenSafSmfConfig
2. execute an upgrade campaign, and let it go to "execution completed", but 
don't commit it
3. reboot the entire cluster
4. only allow 1 system controller to come up
5. smfd will attempt to re-execute the campaign
6. any writes to IMM (like setting an error because the campaign file can't be 
found) will fail with NO_RESOURCES and smfd will assert and crash

smfd asserts and crashes because it has not turned off PBE before the 
campaign is initialized.
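
The ordering problem described above can be illustrated with a toy model. This is a sketch under stated assumptions, not smfd or IMM code: `ImmStub`, `Rc`, and both functions are hypothetical, and the rule that writes are rejected while PBE is enabled but not available is an assumption made to mirror the reported NO_RESOURCES behaviour.

```cpp
// Hypothetical result codes; only the two relevant to the ticket are modelled.
enum class Rc { Ok, NoResources };

// Hypothetical toy model of IMM. Assumption: while PBE is enabled but not
// available (for example, only one system controller is up), writes fail.
struct ImmStub {
  bool pbe_enabled = true;
  bool pbe_available = false;

  Rc write() const {
    return (pbe_enabled && !pbe_available) ? Rc::NoResources : Rc::Ok;
  }
};

// Order reported in the ticket: the campaign is initialized (and writes to
// IMM) before PBE has been turned off.
Rc initThenDisablePbe(ImmStub& imm) {
  Rc rc = imm.write();      // fails in the toy model
  imm.pbe_enabled = false;  // too late for the write above
  return rc;
}

// Alternative order: PBE is turned off first, then the write is attempted.
Rc disablePbeThenInit(ImmStub& imm) {
  imm.pbe_enabled = false;
  return imm.write();       // succeeds in the toy model
}
```

In this model the first ordering yields `Rc::NoResources`, which is the result smfd asserts on, while the second yields `Rc::Ok`.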


---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.

Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets


[tickets] [opensaf:tickets] Re: #2648 smf: smfd crashes after cluster reboot when campaign is in ExecutionCompleted

2017-10-19 Thread Rafael Odzakow via Opensaf-tickets
I cannot comment on moving restorePbe to commit for now. But a SMF rollback is 
what is used to undo the campaign operations. A reboot would only clear changed 
IMM data if PBE was off. That leaves software and CLI operations, which would 
cause incompatibilities.

A rollback has to be planned for in a campaign and does not handle errors. So 
by default SMF takes a backup before starting a campaign to be able to recover.



---

** [tickets:#2648] smf: smfd crashes after cluster reboot when campaign is in 
ExecutionCompleted**

**Status:** review
**Milestone:** 5.17.10
**Created:** Thu Oct 19, 2017 06:45 PM UTC by Alex Jones
**Last Updated:** Thu Oct 19, 2017 07:21 PM UTC
**Owner:** Alex Jones


smfd crashes in updateImmAttr because it returns NO_RESOURCES. Here is how to 
reproduce:

1. enable PBE, and make sure the "disable" flag is set in OpenSafSmfConfig
2. execute an upgrade campaign, and let it go to "execution completed", but 
don't commit it
3. reboot the entire cluster
4. only allow 1 system controller to come up
5. smfd will attempt to re-execute the campaign
6. any writes to IMM (like setting an error because the campaign file can't be 
found) will fail with NO_RESOURCES and smfd will assert and crash

smfd asserts and crashes because it has not turned off PBE before the 
campaign is initialized.

