Dear Mahesh,

Based on what I saw, in this case the retention timer cannot detect that CPND
is only temporarily down, because its pid has changed.
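
To illustrate the pid issue, here is a rough toy model in C. It is only a
sketch and not the actual cpd code: the toy_* types and functions are
hypothetical, and it assumes a CPND is identified by node id plus pid and that
the timer is stopped only when the returning CPND matches that identity
exactly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a CPND address: node id plus process id. */
typedef struct {
    uint32_t node_id;
    uint32_t pid;
} toy_dest_t;

/* Hypothetical per-CPND record: one retention timer and one replica. */
typedef struct {
    toy_dest_t dest;
    bool timer_running;
    bool replica_present;
} toy_cpnd_t;

static bool toy_same_dest(toy_dest_t a, toy_dest_t b)
{
    return a.node_id == b.node_id && a.pid == b.pid;
}

/* CPND went down: the retention timer is started. */
static void toy_cpnd_down(toy_cpnd_t *n)
{
    n->timer_running = true;
}

/* CPND came up: the timer is stopped only if the identity matches exactly. */
static bool toy_cpnd_up(toy_cpnd_t *n, toy_dest_t new_dest)
{
    if (!toy_same_dest(n->dest, new_dest))
        return false; /* looks like a different CPND, timer keeps running */
    n->timer_running = false;
    return true;
}

/* Timer expiry: if the timer was never stopped, the replica is removed. */
static void toy_timer_expiry(toy_cpnd_t *n)
{
    if (n->timer_running) {
        n->replica_present = false;
        n->timer_running = false;
    }
}

/* Runs one down/up/expiry cycle and reports whether the replica survived. */
bool toy_replica_survives_restart(bool pid_changes)
{
    toy_cpnd_t n = { .dest = { 1u, 100u },
                     .timer_running = false,
                     .replica_present = true };
    toy_dest_t back = { 1u, pid_changes ? 200u : 100u };

    toy_cpnd_down(&n);
    assert(n.timer_running);
    toy_cpnd_up(&n, back);
    toy_timer_expiry(&n);
    return n.replica_present;
}
```

In this model, toy_replica_survives_restart(false) keeps the replica, while
toy_replica_survives_restart(true) loses it even though the node came back,
which is the situation described in the patch.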

If cpnd is only temporarily down, we don't need to clean up anything.
If cpnd is permanently down, the drawback of this proposal is that the replica
is not cleaned up. But if cpnd is permanently down, we have to reboot the node
to recover, so I think this cleanup is not really necessary.

I also checked this implementation against the possible test cases and have
not seen any side effects.
Please consider it.

Thank you and best regards,
Hoang

-----Original Message-----
From: A V Mahesh [mailto:mahesh.va...@oracle.com] 
Sent: Friday, February 10, 2017 10:40 AM
To: Hoang Vo <hoang.m...@dektech.com.au>; zoran.milinko...@ericsson.com
Cc: opensaf-devel@lists.sourceforge.net
Subject: Re: [PATCH 1 of 1] cpd: to correct failover behavior of cpsv
[#1765] V5

Hi Hoang,

The CPD_CPND_DOWN_RETENTION timer is there to recognize whether CPND is
temporarily down or permanently down. It is started when a CPND goes down.
If it expires, then based on cpd_evt_proc_timer_expiry() cpd recognizes that
the CPND is completely down and does the cleanup. Otherwise, if the cpnd
rejoins within CPD_CPND_DOWN_RETENTION_TIME, the CPD_CPND_DOWN_RETENTION
timer is stopped.

If we stop the CPD_CPND_DOWN_RETENTION timer in cpd_process_cpnd_down(), how
does cpd recognize that the CPND is permanently down? cpd_process_cpnd_down()
is called in multiple flows. Can you please check all the flows to see whether
stopping the CPD_CPND_DOWN_RETENTION timer has any impact?

-AVM

On 2/9/2017 1:35 PM, Hoang Vo wrote:
>   src/ckpt/ckptd/cpd_proc.c |  11 ++++++++++-
>   1 files changed, 10 insertions(+), 1 deletions(-)
>
>
> problem:
> When failover happens multiple times, the cpnd is down for a moment, so
> there is no cpnd that has the specific checkpoint open. This leads to the
> retention timer being triggered.
> When the cpnd is up again it has a different pid, so the retention timer is
> not stopped.
> The replica is deleted at retention expiry while its information is still
> in the ckpt database.
> That causes the problem.
>
> Fix:
> - Stop timer of removed node.
> - Update data in patricia trees (for retention value consistency).
>
> diff --git a/src/ckpt/ckptd/cpd_proc.c b/src/ckpt/ckptd/cpd_proc.c
> --- a/src/ckpt/ckptd/cpd_proc.c
> +++ b/src/ckpt/ckptd/cpd_proc.c
> @@ -679,7 +679,8 @@ uint32_t cpd_process_cpnd_down(CPD_CB *c
>       cpd_cpnd_info_node_find_add(&cb->cpnd_tree, cpnd_dest, &cpnd_info, &add_flag);
>       if (!cpnd_info)
>               return NCSCC_RC_SUCCESS;
> -
> +     /* Stop timer before processing down */
> +     cpd_tmr_stop(&cpnd_info->cpnd_ret_timer);
>       cref_info = cpnd_info->ckpt_ref_list;
>   
>       while (cref_info) {
> @@ -989,6 +990,14 @@ uint32_t cpd_proc_retention_set(CPD_CB *
>   
>       /* Update the retention Time */
>       (*ckpt_node)->ret_time = reten_time;
> +     (*ckpt_node)->attributes.retentionDuration = reten_time;
> +
> +     /* Update the related patricia tree */
> +     CPD_CKPT_MAP_INFO *map_info = NULL;
> +     cpd_ckpt_map_node_get(&cb->ckpt_map_tree, (*ckpt_node)->ckpt_name, &map_info);
> +     if (map_info) {
> +             map_info->attributes.retentionDuration = reten_time;
> +     }
>       return rc;
>   }
>   


