Summary: imm: fix amfd stuck when multi partitioned clusters rejoin [#3237]
Review request for Ticket(s): 3237
Peer Reviewer(s): *** LIST THE TECH REVIEWER(S) / MAINTAINER(S) HERE ***
Pull request to: *** LIST THE PERSON WITH PUSH ACCESS HERE ***
Affected branch(es): develop
Development branch: ticket-3237
Base revision: c4091499e28980c732c8ac4136e10243617ac81d
Personal repository: git://git.code.sf.net/u/thuantr/review

--------------------------------
Impacted area       Impact y/n
--------------------------------
 Docs                    n
 Build system            n
 RPM/packaging           n
 Configuration files     n
 Startup scripts         n
 SAF services            y
 OpenSAF services        n
 Core libraries          n
 Samples                 n
 Tests                   n
 Other                   n

NOTE: Patch(es) contain lines longer than 80 characters

Comments (indicate scope for each "y" above):
---------------------------------------------
N/A

revision dda452c5486137f6bc9653e219d5ea16d6323def
Author: thuan.tran <thuan.t...@dektech.com.au>
Date:   Wed, 18 Nov 2020 10:24:49 +0700

imm: fix amfd stuck when multi partitioned clusters rejoin [#3237]

- The IMMND coordinator took longer to sync because it incorrectly
postponed the sync while waiting for a wrong number of down nodes.
- An IMMND whose re-intro has been accepted, and which is not the new
coordinator, should restart so that it syncs again with the new
coordinator.
- The active IMMD only updates ex-IMMD from the coordinator if the info
exists. It updates ex-IMMD to its own node id when the new coordinator
announces a sync.
- The IMMND on the active IMMD node starts a split-brain detected timer
to reboot the node if it sees another active IMMD, instead of rebooting
immediately, to avoid interfering with the RDE split-brain detection
mechanism.
- A quick reboot is sometimes not quick enough, so the active IMMD on
the rebooting node may affect the newly promoted active node. Stop
AMFND and kill AMFD/IMMD before rebooting to avoid any impact.
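
The deferred reboot described in the fourth item can be sketched roughly
as below. This is only an illustration, not the code in the patch: every
name (split_brain_state_t, sb_on_other_active_seen, sb_poll) and the
3000 ms grace period are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch; names and values are NOT taken from the patch. */
#define SPLIT_BRAIN_GRACE_MS 3000u /* assumed grace period */

typedef struct {
  bool timer_armed;     /* set once another active IMMD has been seen */
  uint64_t deadline_ms; /* time at which the node reboots itself */
} split_brain_state_t;

typedef enum { SB_NONE, SB_WAITING, SB_REBOOT } sb_action_t;

/* Called when the IMMND on the active IMMD node sees another active IMMD.
 * Arms the timer instead of rebooting at once, so that another detection
 * mechanism (RDE in the description above) gets a chance to act first.
 * Repeated sightings do not push the deadline further out. */
void sb_on_other_active_seen(split_brain_state_t *s, uint64_t now_ms) {
  assert(s != NULL);
  if (!s->timer_armed) {
    s->timer_armed = true;
    s->deadline_ms = now_ms + SPLIT_BRAIN_GRACE_MS;
  }
}

/* Called periodically; reports whether the node should reboot now. */
sb_action_t sb_poll(const split_brain_state_t *s, uint64_t now_ms) {
  assert(s != NULL);
  if (!s->timer_armed) return SB_NONE;
  return now_ms >= s->deadline_ms ? SB_REBOOT : SB_WAITING;
}
```

A caller would invoke sb_on_other_active_seen() from its event handler
and trigger the reboot only when sb_poll() returns SB_REBOOT.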



Complete diffstat:
------------------
 scripts/opensaf_reboot     |  5 +++--
 src/imm/immd/immd_evt.c    | 16 +++++++++++++---
 src/imm/immnd/immnd.h      |  1 +
 src/imm/immnd/immnd_cb.h   |  2 ++
 src/imm/immnd/immnd_evt.c  | 37 +++++++++++++++++++++++++++++--------
 src/imm/immnd/immnd_main.c |  2 ++
 6 files changed, 50 insertions(+), 13 deletions(-)


Testing Commands:
-----------------
N/A

Testing, Expected Results:
--------------------------
N/A

Conditions of Submission:
-------------------------
ACK by reviewers

Arch      Built     Started    Linux distro
-------------------------------------------
mips        n          n
mips64      n          n
x86         n          n
x86_64      y          y
powerpc     n          n
powerpc64   n          n


Reviewer Checklist:
-------------------
[Submitters: make sure that your review doesn't trigger any checkmarks!]


Your checkin has not passed review because (see checked entries):

___ Your RR template is generally incomplete; it has too many blank entries
    that need proper data filled in.

___ You have failed to nominate the proper persons for review and push.

___ Your patches do not have proper short+long header

___ You have grammar/spelling in your header that is unacceptable.

___ You have exceeded a sensible line length in your headers/comments/text.

___ You have failed to put in a proper Trac Ticket # into your commits.

___ You have incorrectly put/left internal data in your comments/files
    (i.e. internal bug tracking tool IDs, product names etc)

___ You have not given any evidence of testing beyond basic build tests.
    Demonstrate some level of runtime or other sanity testing.

___ You have ^M present in some of your files. These have to be removed.

___ You have needlessly changed whitespace or added whitespace crimes
    like trailing spaces, or spaces before tabs.

___ You have mixed real technical changes with whitespace and other
    cosmetic code cleanup changes. These have to be separate commits.

___ You need to refactor your submission into logical chunks; there is
    too much content in a single commit.

___ You have extraneous garbage in your review (merge commits etc)

___ You have giant attachments which should never have been sent;
    Instead you should place your content in a public tree to be pulled.

___ You have too many commits attached to an e-mail; resend as threaded
    commits, or place in a public tree for a pull.

___ You have resent this content multiple times without a clear indication
    of what has changed between each re-send.

___ You have failed to adequately and individually address all of the
    comments and change requests that were proposed in the initial review.

___ You have a misconfigured ~/.gitconfig file (i.e. user.name, user.email etc)

___ Your computer has a badly configured date and time, confusing the
    threaded patch review.

___ Your changes affect the IPC mechanism, and you don't present any
    results for an in-service upgradability test.

___ Your changes affect the user manual and documentation, but your patch
    series does not contain the patch that updates the Doxygen manual.


