Re: [Linux-ha-dev] New master/slave resource agent for DB2 databases in HADR (High Availability Disaster Recovery) mode

Dejan Muhamedagic Thu, 10 Feb 2011 04:08:10 -0800

On Wed, Feb 09, 2011 at 07:38:49PM +0100, Holger Teutsch wrote:
> Hi,
> please find enclosed the revised version.
> 
> -holger
> 
> On Wed, 2011-02-09 at 14:03 +0100, Dejan Muhamedagic wrote:
> > Hi,
> > 
> > Please find below some comments.
[...]
> > > #
> > > # maintain the fal (first active log) attribute
> > > # db2_fal_attrib DB {set val|get|delete}
> > > #
> > > db2_fal_attrib() {
> > >     local db=$1
> > >     local attr
> > > 
> > >     attr=db2hadr_${instance}_${db}_fal
> > > 
> > >     case "$2" in
> > >         set)
> > >         crm_attribute -t crm_config -n $attr -v "$3" 
> > 
> > Hmm, this is an attribute which should be on the master
> > right? Using crm_config for that looks wrong. I think
> > that it should go into the node status.
> 
> No, it is an attribute that communicates the log position from the
> Master node to a restarting slave node.


crm_config is for static configuration and, more precisely, for
cluster-wide properties which are interpreted by the CRM. Now, I
cannot say how db2 promote/demote works, but isn't it possible to
keep that as a dynamic node attribute? I think that you should get
information through the environment about which node is currently
the master. Then the slave can read whichever attribute it needs
from that node's attributes. Can somebody with more experience in
ms resources confirm?
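
To make the suggestion concrete, here is a rough sketch of what a
per-node variant could look like. It is untested against a real
cluster; the function name, the argument order and the default
instance name are made up for illustration, and only the
crm_attribute options (-N, -l reboot, -n, -v, -G, -q, -D) are
taken from the tool itself.

```shell
# Hypothetical default; in the agent, $instance is already set elsewhere.
: "${instance:=db2inst1}"

# Hypothetical helper: keep the first active log as a transient
# (status section) attribute of one node rather than in crm_config.
# usage: db2_fal_node_attrib NODE DB {set VAL|get|delete}
db2_fal_node_attrib() {
    local node="$1"
    local db="$2"
    local action="$3"
    local val="$4"
    local attr="db2hadr_${instance}_${db}_fal"

    case "$action" in
        set)    crm_attribute -N "$node" -l reboot -n "$attr" -v "$val" ;;
        get)    crm_attribute -N "$node" -l reboot -n "$attr" -G -q ;;
        delete) crm_attribute -N "$node" -l reboot -n "$attr" -D ;;
        *)      echo "usage: db2_fal_node_attrib NODE DB {set VAL|get|delete}" >&2
                return 1 ;;
    esac
}
```

The master would call this with its own node name and "set"; a
restarting slave would call it with the master's node name and
"get". How the slave learns the master's name is left open here.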

> > > # unfortunately a first connect after a crash may need several minutes
> > > # for some internal cleanup stuff in DB2.
> > > # We run a connect in background so other connects (i.e. monitoring!) may
> > > # proceed.
> > > #
> > > db2_run_connect() {
> > >     local db=$1
> > > 
> > >     logasdb2 "db2 connect to $db; db2 terminate"
> > 
> > What does this do? It is run in the background from the
> > start action, but it is not waited on (no exit code check)
> > 
> 
> From many years of experience with DB2 I know that after a DB crash with
> crash recovery completed, a *first* connect to the DB may need minutes
> (for a TB database) for some internal cleanup.  All other connects
> issued in parallel go through immediately. So this is for getting a
> 'blocking' connect fired immediately. The status doesn't matter, as
> monitor has its own connect.

OK. That's some deep db2 knowledge :)
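
For the archives, the fire-and-forget pattern described here boils
down to something like the sketch below. slow_first_connect and
start_db are stand-in names, and the sleep only simulates the slow
first connect; in the agent the background job would be the
logasdb2 call quoted above.

```shell
# Stand-in for the slow first connect after crash recovery.
slow_first_connect() {
    sleep 3
}

start_db() {
    # Fire and forget: output is discarded, the exit status is never
    # collected, and nothing waits for the job to finish.
    slow_first_connect >/dev/null 2>&1 &

    # The start action carries on immediately; monitor does its own
    # connect and does not depend on this one.
    echo "start finished"
}
```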

> > Also, since it goes to eval below, better add single quotes around
> > /HADR_PEER_WINDOW/ {printf "HADR_PEER_WINDOW='%s'\n" ...
> 
> I've put it in, but the output is *very* predictable (hopefully the DB is
> really DB2 8-).

Yes, we can hope, but no telling when another idle engineer is
going to change the output :)
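
To illustrate why the quotes matter once the result is passed to
eval, here is a small self-contained sketch. The sample line only
imitates the style of the db2 configuration output and
parse_peer_window is a made-up name; \047 is the octal escape awk
uses for a single quote.

```shell
# Illustrative input only, not captured from a real DB2 instance.
sample='HADR peer window duration (seconds)  (HADR_PEER_WINDOW) = 300'

# Emit a shell assignment with the value wrapped in single quotes,
# so eval treats it as a literal string and expands nothing in it.
parse_peer_window() {
    awk '/HADR_PEER_WINDOW/ { printf "HADR_PEER_WINDOW=\047%s\047\n", $NF }'
}

assignment=$(printf '%s\n' "$sample" | parse_peer_window)
eval "$assignment"
echo "$HADR_PEER_WINDOW"
```

A value that itself contains a single quote would still break out
of the quoting, so this narrows the risk rather than removing it.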

IIRC, you still haven't replied about adding the hadr parameter.
This is the last thing I wrote on the matter elsewhere in this
jumbo-thread:

        HADR is a very different beast from non-HADR db, right? Why not
        then add the "hadr" boolean parameter and use that instead of
        checking if the resource has been configured as multi-state?
        Then the RA can complain if the resource is not ms. And in the
        configuration it is going to be obvious that this is a HADR
        instance.

In short, the idea is to have the user state their intentions
clearly. What's your opinion?
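
Roughly what I have in mind, as a sketch only: db2_validate_hadr
is a made-up name, "hadr" is the proposed (not yet existing)
parameter, and I am assuming that a multi-state resource can be
recognized by a positive OCF_RESKEY_CRM_meta_master_max. The
accepted spellings of "true" are spelled out by hand here instead
of relying on the shared helper functions.

```shell
# Exit codes as defined by the OCF resource agent API.
OCF_SUCCESS=0
OCF_ERR_CONFIGURED=6

# Hypothetical check: complain if hadr is requested but the resource
# has not been configured as master/slave.
db2_validate_hadr() {
    local hadr="${OCF_RESKEY_hadr:-false}"
    local master_max="${OCF_RESKEY_CRM_meta_master_max:-0}"

    case "$hadr" in
        true|yes|on|1) ;;                  # HADR requested, check below
        *) return $OCF_SUCCESS ;;          # plain instance, nothing to check
    esac

    # Assumption: master_max > 0 means the resource is multi-state.
    if [ "$master_max" -gt 0 ] 2>/dev/null; then
        return $OCF_SUCCESS
    fi

    echo "hadr=true requires a master/slave (multi-state) resource" >&2
    return $OCF_ERR_CONFIGURED
}
```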

Cheers,

Dejan



> _______________________________________________________
> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/