[tickets] [opensaf:tickets] #2499 SMF: 20 seconds timeout in getting node destination is not enough

2017-06-28 Thread Rafael Odzakow via Opensaf-tickets
As far as I can see, this issue is a bug. In other campaign sequences SMF 
waits for rebootTimeout before doing any operation after a reboot. In this 
campaign sequence the first operation after the reboot was to run a CLI 
command on a payload node. This timed out because the CLI command is not 
wrapped in a retry that uses SMF's rebootTimeout.

SMF does not keep track of all nodes after a cluster reboot. The mechanism 
for handling a cluster reboot is therefore to wrap any operations that are 
done after a reboot in a retry loop.
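
The retry-loop idea described above can be sketched roughly as follows. This is an illustrative sketch only, not SMF code: `retry_until_timeout`, its parameters, and the injectable `clock`/`sleep` hooks are hypothetical names, and the timeout passed in would play the role of rebootTimeout.

```python
import time


def retry_until_timeout(operation, timeout_s, interval_s=1.0,
                        clock=time.monotonic, sleep=time.sleep):
    """Call operation() until it returns True or timeout_s has elapsed.

    operation  -- hypothetical callable returning True on success
    timeout_s  -- overall budget, e.g. a reboot timeout (assumption)
    interval_s -- pause between attempts
    """
    deadline = clock() + timeout_s
    while True:
        if operation():
            return True
        if clock() >= deadline:
            # Budget used up: report failure to the caller.
            return False
        # Never sleep past the deadline.
        sleep(min(interval_s, max(0.0, deadline - clock())))
```

With a wrapper like this, an operation that fails only because a payload node has not yet rejoined the cluster is retried instead of failing the campaign on the first attempt.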



---

** [tickets:#2499] SMF: 20 seconds timeout in getting node destination is not 
enough**

**Status:** unassigned
**Milestone:** 5.17.08
**Created:** Fri Jun 16, 2017 08:04 AM UTC by Tai Dinh
**Last Updated:** Wed Jun 21, 2017 03:58 AM UTC
**Owner:** nobody


We're now using a hard-coded timeout value (20 seconds) when getting the node 
destination.
This is sometimes not enough, especially in the cluster reboot procedure. The 
controller may come up first and continue the campaign without waiting for the 
rest to be up.
This can make getNodeDestination() fail sometimes, especially for a large 
cluster.
In our case, it needs 3 more seconds.

I guess this timeout needs to be increased or should be configurable.
Reusing some existing attribute for this purpose is also fine, e.g. 
smfRebootTimeout.
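
One way to read the suggestion above is to fall back to the current 20 seconds only when no larger configured value is available. The sketch below is an assumption-laden illustration, not the actual fix: the `config` dictionary, the `reboot_timeout_s` key (standing in for an attribute such as smfRebootTimeout, expressed here in seconds), and the rule of never going below 20 seconds are all hypothetical.

```python
# The hard-coded value mentioned in the ticket, in seconds.
DEFAULT_NODE_DEST_TIMEOUT_S = 20.0


def node_destination_timeout(config):
    """Pick the timeout to use when getting a node destination.

    config is a hypothetical dict; "reboot_timeout_s" is an assumed key
    standing in for a configurable attribute.
    """
    configured = config.get("reboot_timeout_s")
    if configured is None or configured <= 0:
        # Nothing usable configured: keep today's behaviour.
        return DEFAULT_NODE_DEST_TIMEOUT_S
    # Assumption: a configured value may extend the wait but not shorten it.
    return max(DEFAULT_NODE_DEST_TIMEOUT_S, float(configured))
```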

/Tai


---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets


[tickets] [opensaf:tickets] #2513 rde: allow to early change peer role only when active or standby nodes exist

2017-06-28 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

release(5.17.06):

commit aba7bc5460136cca39b6b067d934973b3099fe0f
Author: Zoran Milinkovic 
Date:   Wed Jun 28 13:25:29 2017 +0200

rde: allow early role change when active or standby nodes are introduced 
[#2513]

When active or standby nodes are introduced with request message, there is 
no need to wait more for requesting the active role.
When standby node is introduced, then we are sure that there is an active 
node somewhere in the cluster. So, changing the peer state is safe.

-

develop(5.17.08):

commit f089f030a322a43c79f3f259f07a4c42bb4d0da1
Author: Zoran Milinkovic 
Date:   Wed Jun 28 13:25:29 2017 +0200

rde: allow early role change when active or standby nodes are introduced 
[#2513]

When active or standby nodes are introduced with request message, there is 
no need to wait more for requesting the active role.
When standby node is introduced, then we are sure that there is an active 
node somewhere in the cluster. So, changing the peer state is safe.



---

** [tickets:#2513] rde: allow to early change peer role only when active or 
standby nodes exist**

**Status:** fixed
**Milestone:** 5.17.06
**Created:** Wed Jun 28, 2017 09:14 AM UTC by Zoran Milinkovic
**Last Updated:** Wed Jun 28, 2017 01:01 PM UTC
**Owner:** Zoran Milinkovic


The pushed patch from ticket #2423 triggers some side effects, such as 
multiple active nodes and lost MDS logs.
The first proposed patch, which allows a node's role to be changed only when 
it is known that an active or standby node exists, seems to work well.




[tickets] [opensaf:tickets] #2513 rde: allow to early change peer role only when active or standby nodes exist

2017-06-28 Thread Zoran Milinkovic via Opensaf-tickets
- **status**: accepted --> review
- **Comment**:

https://sourceforge.net/p/opensaf/mailman/message/35916824/



---

** [tickets:#2513] rde: allow to early change peer role only when active or 
standby nodes exist**

**Status:** review
**Milestone:** 5.17.06
**Created:** Wed Jun 28, 2017 09:14 AM UTC by Zoran Milinkovic
**Last Updated:** Wed Jun 28, 2017 09:14 AM UTC
**Owner:** Zoran Milinkovic


The pushed patch from ticket #2423 triggers some side effects, such as 
multiple active nodes and lost MDS logs.
The first proposed patch, which allows a node's role to be changed only when 
it is known that an active or standby node exists, seems to work well.




[tickets] [opensaf:tickets] #2459 try-again for opensafd stop

2017-06-28 Thread Rafael Odzakow via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

commit a051496719a3c862594af17d88b082031dd53b33 (ticket-2459)

base: Try again for opensafd stop [#2459]

Internally opensafd creates a mutex during start/stop to avoid parallel
execution. Makes mutex more robust and add a short retry if mutex is
taken.

---

** [tickets:#2459] try-again for opensafd stop**

**Status:** fixed
**Milestone:** 5.17.08
**Created:** Thu May 11, 2017 12:42 PM UTC by Rafael Odzakow
**Last Updated:** Tue Jun 13, 2017 08:01 AM UTC
**Owner:** Rafael Odzakow


Today there is no way for SMF (or others) to know when opensafd start has 
completed. Calling stop while a start is ongoing will not stop opensafd, so 
the reboot will not shut down opensafd, resulting in error reports from 
several components.

Internally opensafd creates a lock file during start/stop. Implement an 
internal try-again loop that uses the opensafd lock file.
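
A try-again loop around a lock file could look roughly like the sketch below. It only illustrates the general technique (exclusive file creation plus a bounded retry) and is not opensafd's implementation: the function names, the lock file path passed by the caller, and the attempt count and delay are all hypothetical.

```python
import os
import time


def acquire_lockfile(path, attempts=5, delay_s=0.5, sleep=time.sleep):
    """Try to create `path` exclusively; retry while someone else holds it.

    Returns True once the lock file was created by this call, or False
    if it still existed after `attempts` tries.
    """
    for attempt in range(attempts):
        try:
            # O_EXCL makes creation fail if the file already exists.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            # A start or stop is already in progress: wait and try again.
            if attempt + 1 < attempts:
                sleep(delay_s)
            continue
        # Record the owner's PID to help with debugging stale locks.
        os.write(fd, str(os.getpid()).encode())
        os.close(fd)
        return True
    return False


def release_lockfile(path):
    """Remove the lock file so that a waiting caller can proceed."""
    os.remove(path)
```

A stop request would call `acquire_lockfile` first and proceed only once it returns True, so that a stop issued during an ongoing start waits briefly instead of being silently ignored.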




[tickets] [opensaf:tickets] #2513 rde: allow to early change peer role only when active or standby nodes exist

2017-06-28 Thread Zoran Milinkovic via Opensaf-tickets



---

** [tickets:#2513] rde: allow to early change peer role only when active or 
standby nodes exist**

**Status:** accepted
**Milestone:** 5.17.06
**Created:** Wed Jun 28, 2017 09:14 AM UTC by Zoran Milinkovic
**Last Updated:** Wed Jun 28, 2017 09:14 AM UTC
**Owner:** Zoran Milinkovic


The pushed patch from ticket #2423 triggers some side effects, such as 
multiple active nodes and lost MDS logs.
The first proposed patch, which allows a node's role to be changed only when 
it is known that an active or standby node exists, seems to work well.

