Summary: Pre-review of 2PBE complete dataflow.
Review request for Trac Ticket(s): (#21)
Peer Reviewer(s): Neel
Pull request to: 
Affected branch(es): devel(4.4)
Development branch: 

--------------------------------
Impacted area       Impact y/n
--------------------------------
 Docs                    n
 Build system            n
 RPM/packaging           n
 Configuration files     n
 Startup scripts         n
 SAF services            n
 OpenSAF services        n
 Core libraries          n
 Samples                 n
 Tests                   n
 Other                   n


Comments (indicate scope for each "y" above):
---------------------------------------------

This is a pre-review of the basic data flow solution that I propose for 2PBE.
By pre-review I mean:
These specific patches will not be pushed.
Thus I am not waiting for an "ack" on these patches.
The intention is to communicate the development of 2PBE so far and to allow
anyone to test and experiment with it, or just inspect the code to get
an understanding of how it will work.

This is not a complete solution for 2PBE.
There is still work to be done to get tighter synchronisation in the commit
of an imm-transaction between the two PBEs. With regular single PBE, the commit
of one imm-transaction was atomic with the commit to the sqlite file.
In fact the commit of the sqlite transaction *was* the commit of the
imm-transaction.

With 2PBE, the commit of the transaction to PBE is still the commit of the
imm-transaction.
But now there is also the issue of if and how to get the two sqlite files to
commit atomically.
My current stance on this is that I will not attempt to make the commit to the
two sqlite files 100% atomic. This would either require at least two sqlite
transaction commits/writes for every one imm-transaction, or require the
utilization of some open 2pc interface in sqlite, if that is available.
I believe there is some kind of callback hook available for sqlite transaction
prepare. But I will not attempt to use that for now. The basic idea is instead
to keep one primary PBE and add a slave PBE, and to make each imm-transaction
commit with synchronization between the two PBEs before the sqlite commit,
minimizing the probability that one sqlite instance succeeds in committing
while the other sqlite instance fails.
Still they may of course diverge, due to file system problems or crashes.
The first thing to note here is that a cluster start, after such a broken PBE
commit, will be handled by the loading arbitration, and that arbitration will
choose to load from the sqlite file that succeeded in the commit (the latest
file) as long as it is available. And even if the cluster restart should come
up with one SC holding the file that failed to commit the last ccb, time out
waiting for the other SC and thus load from the older file, there is still no
problem, because this only means that the transaction will have been aborted.
Note that the user/client of the transaction will not have obtained any answer
on the outcome unless both PBEs succeeded in their commit.
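The arbitration rule described above can be illustrated with a small Python
sketch. This is only a model, not the loader code in imm_loader.cc. The names
PbeFile, last_ccb_id and choose_file_to_load are hypothetical, and using the
highest committed ccb id as the measure of "latest" is an assumption made for
the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PbeFile:
    """Hypothetical summary of the sqlite file found on one SC."""
    sc_name: str      # controller that holds the file
    last_ccb_id: int  # assumed measure of how recent the file is


def choose_file_to_load(available: Sequence[PbeFile]) -> Optional[PbeFile]:
    """Pick the latest file among those that reported in before the timeout."""
    if not available:
        return None
    return max(available, key=lambda f: f.last_ccb_id)


# SC-1 failed to commit ccb 42, SC-2 succeeded.
sc1 = PbeFile("SC-1", last_ccb_id=41)
sc2 = PbeFile("SC-2", last_ccb_id=42)

# Both files available: the file that committed the last ccb wins.
loaded_normal = choose_file_to_load([sc1, sc2])

# Only SC-1 came up before the timeout: the older file is loaded,
# which amounts to ccb 42 having been aborted.
loaded_degraded = choose_file_to_load([sc1])
```

In the degraded case the loaded state simply lacks the last ccb, which matches
the argument above that the outcome is equivalent to an abort.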

For regular CCBs, the primary PBE will prepare by executing all the sqlite
calls needed to build the transaction, but before committing it sends a
message to the slave PBE asking if it has received all requests to be part of
the ccb and, if so, to also execute all the sqlite calls to prepare the ccb.
When/if the slave replies with ok, the primary PBE will commit its sqlite
transaction and reply to immsv. The immsv will then commit the ccb in
imm-cluster-ram. Finally, as part of sending messages about the commit of the
ccb to all appliers, the slave PBE will get both completed and apply and will
(hopefully) commit its sqlite transaction.
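As a rough illustration of this ordering, here is a minimal Python model using
two in-memory sqlite databases. It is a sketch under stated assumptions, not
the immpbed implementation: the Pbe class, the objects table, the commit_ccb
function and the use of an operation count as the "has the slave received
everything" check are all hypothetical.

```python
import sqlite3


class Pbe:
    """Toy stand-in for one PBE process with its own sqlite file."""

    def __init__(self, name):
        self.name = name
        # isolation_level=None: BEGIN/COMMIT/ROLLBACK are issued explicitly.
        self.db = sqlite3.connect(":memory:", isolation_level=None)
        self.db.execute("CREATE TABLE objects (dn TEXT PRIMARY KEY, val TEXT)")
        self.received = {}  # ccb id -> list of (dn, val) operations

    def receive(self, ccb_id, dn, val):
        """Record one ccb operation delivered by immsv."""
        self.received.setdefault(ccb_id, []).append((dn, val))

    def prepare(self, ccb_id, expected_ops):
        """Build the sqlite transaction without committing it."""
        ops = self.received.get(ccb_id, [])
        if len(ops) != expected_ops:
            return False  # did not receive the whole ccb
        self.db.execute("BEGIN")
        try:
            self.db.executemany("INSERT INTO objects VALUES (?, ?)", ops)
        except sqlite3.Error:
            self.db.execute("ROLLBACK")
            return False
        return True

    def commit(self):
        self.db.execute("COMMIT")

    def abort(self):
        if self.db.in_transaction:
            self.db.execute("ROLLBACK")

    def dns(self):
        rows = self.db.execute("SELECT dn FROM objects ORDER BY dn")
        return [row[0] for row in rows]


def commit_ccb(primary, slave, ccb_id, n_ops):
    """Primary prepares, asks the slave to prepare, then both commit."""
    if not primary.prepare(ccb_id, n_ops):
        return False
    if not slave.prepare(ccb_id, n_ops):
        primary.abort()
        slave.abort()
        return False
    primary.commit()  # this is the commit of the imm-transaction
    # immsv would now commit in imm-ram and send completed/apply to appliers.
    slave.commit()
    return True
```

The window between primary.commit() and slave.commit() is where the two files
can still diverge after a crash, which is the case left to the loading
arbitration.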

In general, the slave PBE is tightly controlled by the primary PBE (an
asymmetric solution).
For class management and PRTO/PRTA handling the solution is slightly different.
In general I have also tried to leverage existing mechanisms, such as making
the slave PBE an applier of all config classes, thus avoiding additional and
unnecessary distributed messaging for the payload of a ccb.

changeset 131d635a526a0e894245ecb08b630268f9035976
Author: Anders Bjornerstedt <[email protected]>
Date:   Fri, 19 Jul 2013 08:39:25 +0200

        IMM: 2PBE loading (test patch-1) [#21]

        This is a testpatch containing a test version of the 2PBE loading
        mechanism. This patch will not be pushed. The intent with this patch is
        to allow testing and obtaining feedback on the 2PBE loading.

This first patch has already been sent out once earlier for pre-review.
Keep in mind that it is for cluster restarts that all this is intended.
So in one sense this is the patch for which it is most important to be as
reliable as possible.
--------------------------------------------------------------

changeset f240bedd6aa2b92bdea38d1c6532ac8097ac9444
Author: Anders Bjornerstedt <[email protected]>
Date:   Fri, 19 Jul 2013 08:45:07 +0200

        IMM: 2PBE ccb-handling (test patch-2) [#21]

        This is a testpatch containing a test version of the 2PBE ccb-handling.
        This patch will not be pushed. The intent with this patch is to allow
        testing and obtaining feedback on the 2PBE ccb-handling. This test-patch
        goes on top of the "2PBE loading" testpatch.

Contains the process management changes to start and restart 2 PBEs, including
failover handling. Plus the data flow solution for CCBs. The detailed
synchronisation of ccb commit between the two PBEs is still missing.
-------------------------------------------------------------

changeset 421acf69d0765d0860bfe726b8f1ba1ccb519e6b
Author: Anders Bjornerstedt <[email protected]>
Date:   Fri, 19 Jul 2013 08:55:23 +0200

        IMM: 2PBE class-create/delete/schema change handling (test patch-3) [#21]

        This is a testpatch containing a test version of the 2PBE
        class-handling. This patch will not be pushed. The intent with this
        patch is to allow testing and obtaining feedback on the 2PBE handling of
        imm classes. This test-patch goes on top of the "2PBE ccb-handling"
        testpatch.

Provides 2PBE persistification of class create/delete/schema-change.
Extends the existing PBE solution that uses an admin-op from immsv to PBE
so that the primary PBE invokes the same admin-op towards the slave.

---------------------------------------------------------------

changeset d6f91100b51e2b7215bf5fab95e10d0c069d14db
Author: Anders Bjornerstedt <[email protected]>
Date:   Fri, 19 Jul 2013 09:10:43 +0200

        IMM: 2PBE PRTO-create handling (test patch-4) [#21]

        This is a testpatch containing a test version of the 2PBE handling of
        creates of persistent runtime objects (PRTOs). This patch will not be
        pushed. The intent with this patch is to allow testing and obtaining
        feedback on the 2PBE handling of PRTO creates. This test-patch goes on
        top of the "2PBE class-handling" testpatch.

Provides 2PBE persistification of PRTO create.
In this case the immsv directly sends the same payload callbacks to both the
primary and slave PBE. The primary will invoke an admin-op towards the slave to
synchronize (not implemented).

changeset 416b82d2d116c2a9ede63b507c72c10ee2126f42
Author: Anders Bjornerstedt <[email protected]>
Date:   Fri, 19 Jul 2013 09:16:17 +0200

        IMM: 2PBE PRTO-delete handling (test patch-5) [#21]

        This is a testpatch containing a test version of the 2PBE handling of
        deletes of persistent runtime objects (PRTOs). This patch will not be
        pushed. The intent with this patch is to allow testing and obtaining
        feedback on the 2PBE handling of PRTO deletes. This test-patch goes on
        top of the "2PBE PRTO-create" testpatch.

Provides 2PBE persistification of PRTO delete. This operation is quite different
from PRTO create because PRTO delete can in general be the delete of a subtree,
i.e. several PRTOs that have to be deleted as one transaction. Again the immsv 
sends the same messages that it sent to the PBE, now also to the slave PBE. 
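To illustrate why a subtree delete needs to be a single sqlite transaction, here
is a minimal Python sketch. The objects table and the delete_subtree function
are hypothetical and do not reflect the actual PBE schema. The sketch only
assumes that a child DN ends with "," followed by the DN of its parent.

```python
import sqlite3


def delete_subtree(db, root_dn):
    """Delete root_dn and every object below it as one transaction."""
    suffix = "," + root_dn
    # The connection context manager commits on success and rolls back
    # if any statement raises, so the subtree is removed all or nothing.
    with db:
        victims = [
            dn
            for (dn,) in db.execute("SELECT dn FROM objects")
            if dn == root_dn or dn.endswith(suffix)
        ]
        db.executemany(
            "DELETE FROM objects WHERE dn = ?", [(dn,) for dn in victims]
        )
    return sorted(victims)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE objects (dn TEXT PRIMARY KEY)")
db.executemany(
    "INSERT INTO objects VALUES (?)",
    [("rt=1",), ("rt=2,rt=1",), ("rt=3,rt=2,rt=1",), ("rt=9",)],
)
db.commit()

deleted = delete_subtree(db, "rt=1")
remaining = [row[0] for row in db.execute("SELECT dn FROM objects")]
```

With 2PBE the same all-or-nothing delete has to happen in both sqlite files,
which is why the slave PBE needs to receive the same messages as the primary.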

changeset 894df553ef7e1b829db43329a5bafe4ce7fb2cbd
Author: Anders Bjornerstedt <[email protected]>
Date:   Fri, 19 Jul 2013 09:20:45 +0200

        IMM: 2PBE PRTA-update handling (test patch-6) [#21]

        This is a testpatch containing a test version of the 2PBE handling of
        updates of persistent runtime attributes (PRTAs). PRTAs can exist in
        either PRTOs or config objects. This patch will not be pushed. The
        intent with this patch is to allow testing and obtaining feedback on
        the 2PBE handling of PRTA updates. This test-patch goes on top of the
        "2PBE PRTO-delete"

Provides 2PBE persistification of PRTA updates. This is more similar to PRTO
create than it is to PRTO delete, but PRTA updates can be an update in a config
object.



Complete diffstat:
------------------
 osaf/libs/agents/saf/imma/imma_oi_api.c          |    4 +-
 osaf/libs/agents/saf/imma/imma_proc.c            |   34 +++-
 osaf/libs/common/immsv/immpbe_dump.cc            |   62 ++++++-
 osaf/libs/common/immsv/immsv_evt.c               |   55 ++++++-
 osaf/libs/common/immsv/include/immpbe_dump.hh    |    6 +-
 osaf/libs/common/immsv/include/immsv_api.h       |   17 +-
 osaf/libs/common/immsv/include/immsv_evt.h       |   18 +-
 osaf/libs/common/immsv/include/immsv_evt_model.h |    4 +
 osaf/services/saf/immsv/immd/immd_amf.c          |    5 +-
 osaf/services/saf/immsv/immd/immd_cb.h           |    6 +-
 osaf/services/saf/immsv/immd/immd_db.c           |    2 +
 osaf/services/saf/immsv/immd/immd_evt.c          |   82 ++++++++++-
 osaf/services/saf/immsv/immd/immd_main.c         |   40 +++++-
 osaf/services/saf/immsv/immd/immd_proc.c         |  229 +++++++++++++++++++++++++++++-
 osaf/services/saf/immsv/immd/immd_proc.h         |    3 +-
 osaf/services/saf/immsv/immd/immd_sbevt.c        |   23 ++-
 osaf/services/saf/immsv/immloadd/imm_loader.cc   |  315 ++++++++++++++++++++++++++++++++++++------
 osaf/services/saf/immsv/immloadd/imm_loader.hh   |    8 +-
 osaf/services/saf/immsv/immloadd/imm_pbe_load.cc |  196 ++++++++++++++++++++++++-
 osaf/services/saf/immsv/immnd/ImmModel.cc        |  333 ++++++++++++++++++++++++++++++++++---------
 osaf/services/saf/immsv/immnd/ImmModel.hh        |   17 +-
 osaf/services/saf/immsv/immnd/ImmSearchOp.cc     |    5 -
 osaf/services/saf/immsv/immnd/immnd_cb.h         |    6 +-
 osaf/services/saf/immsv/immnd/immnd_evt.c        |  345 +++++++++++++++++++++++++++++++++++++++++++---
 osaf/services/saf/immsv/immnd/immnd_init.h       |    8 +-
 osaf/services/saf/immsv/immnd/immnd_main.c       |    3 +-
 osaf/services/saf/immsv/immnd/immnd_proc.c       |  246 +++++++++++++++++++++++++++------
 osaf/services/saf/immsv/immpbed/immpbe.cc        |   77 +++++----
 osaf/services/saf/immsv/immpbed/immpbe.hh        |    4 +
 osaf/services/saf/immsv/immpbed/immpbe_daemon.cc |  677 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------
 tests/immsv/implementer/applier.c                |  121 ++++++++++++---
 31 files changed, 2559 insertions(+), 392 deletions(-)


Testing Commands:
-----------------


Testing, Expected Results:
--------------------------
The basic normal positive use cases should work, 
including the persistification of data to *both* pbe files.

Most negative test cases (failover, killing of processes) should also work.

The main risk to watch for is when/if a PBE restarts with '--recover',
which means it simply re-attaches to the sqlite file without regenerating
a fresh file dumped from imm-ram. There is then a risk that a discrepancy
has been introduced between the files. In particular, the last transaction
being processed at the time of the crash could have committed at the other
PBE, yet have been rolled back at the restarted PBE.

Persistent writes are blocked unless both PBEs are available.

Conditions of Submission:
-------------------------
These patches will not be pushed.


Arch      Built     Started    Linux distro
-------------------------------------------
mips        n          n
mips64      n          n
x86         n          n
x86_64      n          n
powerpc     n          n
powerpc64   n          n


Reviewer Checklist:
-------------------
[Submitters: make sure that your review doesn't trigger any checkmarks!]


Your checkin has not passed review because (see checked entries):

___ Your RR template is generally incomplete; it has too many blank entries
    that need proper data filled in.

___ You have failed to nominate the proper persons for review and push.

___ Your patches do not have proper short+long header

___ You have grammar/spelling in your header that is unacceptable.

___ You have exceeded a sensible line length in your headers/comments/text.

___ You have failed to put in a proper Trac Ticket # into your commits.

___ You have incorrectly put/left internal data in your comments/files
    (i.e. internal bug tracking tool IDs, product names etc)

___ You have not given any evidence of testing beyond basic build tests.
    Demonstrate some level of runtime or other sanity testing.

___ You have ^M present in some of your files. These have to be removed.

___ You have needlessly changed whitespace or added whitespace crimes
    like trailing spaces, or spaces before tabs.

___ You have mixed real technical changes with whitespace and other
    cosmetic code cleanup changes. These have to be separate commits.

___ You need to refactor your submission into logical chunks; there is
    too much content in a single commit.

___ You have extraneous garbage in your review (merge commits etc)

___ You have giant attachments which should never have been sent;
    Instead you should place your content in a public tree to be pulled.

___ You have too many commits attached to an e-mail; resend as threaded
    commits, or place in a public tree for a pull.

___ You have resent this content multiple times without a clear indication
    of what has changed between each re-send.

___ You have failed to adequately and individually address all of the
    comments and change requests that were proposed in the initial review.

___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)

___ Your computer has a badly configured date and time, confusing the
    threaded patch review.

___ Your changes affect IPC mechanism, and you don't present any results
    for in-service upgradability test.

___ Your changes affect user manual and documentation, your patch series
    do not contain the patch that updates the Doxygen manual.

