Hi Praveen

I normally don't get involved in AMF patch reviews, but this ticket and the fix
caught my attention.
There is a general issue that bothers me about the approach, if I have not
misunderstood it.

I understand this is a node failover of the active controller.
That is inherently an event that is not fully under control.
It is also an event that really is time critical.
A failover may occur in several ways.

Here it seems that one kind of failover is "semi-controllable" and the old
active is, in essence, trying to "clean up" its backlog in a job queue before
it triggers the failover.

There will be other failover cases, such as a crash of the IMMD, where it will
not be able to do this. So any cleanup (if necessary) must be covered by the
new active anyway.

In addition, updating cached runtime data is a secondary duty of the AMF.
Cached runtime data is CACHED and not absolutely obligated to reflect the
original state (which is in the AMF) in real time. So updates of cached runtime
data should not really be a reason for delaying a failover.
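
To illustrate the point about the new active covering the cleanup, here is a
minimal sketch. It is not AMF code: AttrMap, in_sync and reconcile_cache are
hypothetical names, and a plain map stands in for both the AMF-internal state
and the cached runtime attributes in IMM. It only shows the idea that the new
active can rewrite whatever the old active left stale, so the old active does
not need to finish its backlog first.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model: attribute name -> value. One map holds the
// authoritative state, the other the cached runtime attributes.
typedef std::map<std::string, std::string> AttrMap;

// True if every authoritative attribute is present in the cache
// with the same value.
bool in_sync(const AttrMap& authoritative, const AttrMap& cache) {
  for (AttrMap::const_iterator it = authoritative.begin();
       it != authoritative.end(); ++it) {
    AttrMap::const_iterator c = cache.find(it->first);
    if (c == cache.end() || c->second != it->second) return false;
  }
  return true;
}

// Run by the new active after takeover: rewrite every cached attribute
// that is missing or stale. Returns the number of attributes rewritten.
int reconcile_cache(const AttrMap& authoritative, AttrMap& cache) {
  int rewritten = 0;
  for (AttrMap::const_iterator it = authoritative.begin();
       it != authoritative.end(); ++it) {
    AttrMap::iterator c = cache.find(it->first);
    if (c == cache.end()) {
      cache.insert(*it);
      ++rewritten;
    } else if (c->second != it->second) {
      c->second = it->second;
      ++rewritten;
    }
  }
  assert(in_sync(authoritative, cache));  // postcondition
  return rewritten;
}
```

With a scheme like this, a second call to reconcile_cache on an already
consistent cache rewrites nothing, so it is safe to run after every takeover,
whether or not the old active managed to flush its queue.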

/AndersBj


-----Original Message-----
From: praveen.malv...@oracle.com [mailto:praveen.malv...@oracle.com] 
Sent: den 7 maj 2014 10:26
To: Hans Feldt; nagendr...@oracle.com
Cc: opensaf-devel@lists.sourceforge.net
Subject: [devel] [PATCH 0 of 1] Review Request for amfd: update RT objects 
before node-failover of active controller [#494].

Summary: amfd: update RT objects before node-failover of active controller [#494].
Review request for Trac Ticket(s): #494 (its duplicates #853 and #858)
Peer Reviewer(s): Hans F., Nagendra.
Pull request to: <<LIST THE PERSON WITH PUSH ACCESS HERE>>
Affected branch(es): All
Development branch: <<IF ANY GIVE THE REPO URL>>

--------------------------------
Impacted area       Impact y/n
--------------------------------
 Docs                    n
 Build system            n
 RPM/packaging           n
 Configuration files     n
 Startup scripts         n
 SAF services            n
 OpenSAF services        y
 Core libraries          n
 Samples                 n
 Tests                   n
 Other                   n


Comments (indicate scope for each "y" above):
---------------------------------------------
Please see the analysis of the tickets and the commit log below.

changeset bcf6eda79102f83c6940d75dd13073a9130026d0
Author: praveen.malv...@oracle.com
Date:   Wed, 07 May 2014 13:43:33 +0530

        amfd: update RT objects before node-failover of active controller [#494].

        Problem: Runtime objects and attributes are not updated when
        node-failover gets escalated for the active controller and the standby
        controller takes the active role.

        Reason: Activities related to updating runtime objects and certain
        attributes in IMM are given low priority and are pushed onto the job
        queue by AMF. These jobs are completed when AMF is not busy with any
        other high-priority activity. When node-failover is escalated, AMFD
        sends a reboot message to AMFND to reboot the node. If node-failover is
        escalated for the active controller, AMFD sends the reboot message to
        AMFND, which reboots the controller. In that case, some IMM-related
        activities in the job queue remain uncompleted. All such activities
        should be completed before rebooting the active controller when
        node-failover is escalated for it.

        Fix: Finish all IMM-related jobs before sending the reboot message to
        AMFND when node-failover is escalated for the active controller.
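
The ordering described in the fix can be sketched as follows. This is a
minimal illustration under stated assumptions, not the code in sgproc.cc:
JobQueue, drain, failover_active and the send_reboot callback are hypothetical
names. The time budget is also an assumption that is not part of the patch; it
is included only to show how a flush could be bounded so that it cannot hold
up the reboot indefinitely.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for a queue of pending low-priority updates.
class JobQueue {
 public:
  void push(std::function<void()> job) { jobs_.push_back(std::move(job)); }
  std::size_t size() const { return jobs_.size(); }

  // Run queued jobs in FIFO order until the queue is empty or the time
  // budget is spent. Returns the number of jobs executed.
  std::size_t drain(std::chrono::milliseconds budget) {
    const std::chrono::steady_clock::time_point deadline =
        std::chrono::steady_clock::now() + budget;
    std::size_t done = 0;
    while (!jobs_.empty() && std::chrono::steady_clock::now() < deadline) {
      std::function<void()> job = std::move(jobs_.front());
      jobs_.pop_front();
      job();
      ++done;
    }
    return done;
  }

 private:
  std::deque<std::function<void()>> jobs_;
};

// Flush pending updates first, then trigger the reboot. The reboot is
// sent even if the budget ran out before the queue was empty.
void failover_active(JobQueue& queue, std::chrono::milliseconds budget,
                     const std::function<void()>& send_reboot) {
  assert(send_reboot && "reboot callback must be set");
  queue.drain(budget);
  send_reboot();
}
```

Any jobs left in the queue when the budget expires are simply lost with the
rebooting node, so a bounded flush like this is a best-effort step rather than
a guarantee.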


Complete diffstat:
------------------
 osaf/services/saf/amf/amfd/sgproc.cc |  6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)


Testing Commands:
-----------------
Tested the duplicate bug #858.
This is easy to reproduce.
After reproducing it, the following states were observed:
safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=UNINSTANTIATED(1)
        saAmfSUReadinessState=IN-SERVICE(2)


Testing, Expected Results:
--------------------------
Pass. Observed the states:
safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=DISABLED(2)
        saAmfSUPresenceState=UNINSTANTIATED(1)
        saAmfSUReadinessState=OUT-OF-SERVICE(1)
AMFD logs:
May  7 12:05:47.624746 osafamfd [26472:imm.cc:0143] >> exec: Update 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' saAmfSUReadinessState
May  7 12:05:47.624799 osafamfd [26472:imma_oi_api.c:2270] >> saImmOiRtObjectUpdate_2
May  7 12:05:47.626863 osafamfd [26472:mds_dt_trans.c:0671] >> mdtm_process_poll_recv_data_tcp
May  7 12:05:47.627392 osafamfd [26472:imma_oi_api.c:2554] << saImmOiRtObjectUpdate_2
May  7 12:05:47.627419 osafamfd [26472:imm.cc:0172] << exec

May  7 12:05:47.634134 osafamfd [26472:util.cc:1681] TR Sending REBOOT MSG to 2010f
May  7 12:05:47.634372 osafamfd [26472:sgproc.cc:0715] << avd_su_oper_state_evh



Conditions of Submission:
-------------------------
Ack from one of the reviewers.

Arch      Built     Started    Linux distro
-------------------------------------------
mips        n          n
mips64      n          n
x86         n          n
x86_64      y          y
powerpc     n          n
powerpc64   n          n


Reviewer Checklist:
-------------------
[Submitters: make sure that your review doesn't trigger any checkmarks!]


Your checkin has not passed review because (see checked entries):

___ Your RR template is generally incomplete; it has too many blank entries
    that need proper data filled in.

___ You have failed to nominate the proper persons for review and push.

___ Your patches do not have proper short+long header

___ You have grammar/spelling in your header that is unacceptable.

___ You have exceeded a sensible line length in your headers/comments/text.

___ You have failed to put in a proper Trac Ticket # into your commits.

___ You have incorrectly put/left internal data in your comments/files
    (i.e. internal bug tracking tool IDs, product names etc)

___ You have not given any evidence of testing beyond basic build tests.
    Demonstrate some level of runtime or other sanity testing.

___ You have ^M present in some of your files. These have to be removed.

___ You have needlessly changed whitespace or added whitespace crimes
    like trailing spaces, or spaces before tabs.

___ You have mixed real technical changes with whitespace and other
    cosmetic code cleanup changes. These have to be separate commits.

___ You need to refactor your submission into logical chunks; there is
    too much content into a single commit.

___ You have extraneous garbage in your review (merge commits etc)

___ You have giant attachments which should never have been sent;
    Instead you should place your content in a public tree to be pulled.

___ You have too many commits attached to an e-mail; resend as threaded
    commits, or place in a public tree for a pull.

___ You have resent this content multiple times without a clear indication
    of what has changed between each re-send.

___ You have failed to adequately and individually address all of the
    comments and change requests that were proposed in the initial review.

___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)

___ Your computer has a badly configured date and time, confusing
    the threaded patch review.

___ Your changes affect IPC mechanism, and you don't present any results
    for in-service upgradability test.

___ Your changes affect user manual and documentation, your patch series
    do not contain the patch that updates the Doxygen manual.


------------------------------------------------------------------------------
Is your legacy SCM system holding you back? Join Perforce May 7 to find out:
* 3 signs your SCM is hindering your productivity
* Requirements for releasing software faster
* Expert tips and advice for migrating your SCM now
http://p.sf.net/sfu/perforce
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
