Hi Honza,
Yes, my patch will break cfg API behavior which is not good. I've also
considered trying to resolve it in a more general way instead of simply cut
off all API during SYNC. The root of this issue is that SYNC mistakenly
believed that his last barrier message was the last message generated
during SYNC, or say, the right message to finish using
assembly_list_*_trans. So if we try to solve the problem dundamentally, we
may need  to find a way to find the right message. For example, SYNC
firstly tell Totem about it has done, then Totem waits until
assembly_list_*_trans back to initial state to finally call
sync_synchronization_complete(). But in very extreme cases, Totem may be
not able to wait such a message coming. Besides, this way causes SYNC and
Totem(which are allready coupled) to be more tightly coupled. So I
currently find it not easy to have a general solution. What do you think?

Besides, the upper layer such as pacemaker do not use CFG shutdown API. And
I personally always using "service corosync stop" and know little about CFG
shutdown API. If with my patch, someone calls corosync-cfgtool -H (locally)
and barred by SYNC and timed out, how about switch to "service corosync
stop" instead?
 On Jul 1, 2015 11:06 PM, "Jan Friesse" <[email protected]> wrote:

> Jason,
> thanks for the patch. I was thinking a little deeper about this problem,
> and I think there are few problems for practical use point.
>
> Reload config is no brainier, no need to do it during sync, but kill node
> (local one), and try shutdown (again only for local host) may be wanted
> action even during sync. For example lets say there is problem with
> configuration and corosync is constantly in sync state, then there is no
> possibility how to turn it off via corosync-cfgtool.
>
> What is your opinion?
>
> Regards,
>   Honza
>
> Jason napsal(a):
>
>> From: Jason HU <[email protected]>
>>
>> During SYNC, corosync-cfgtool -R/-H commands can pass through IPC then
>> send totem messages. This may corrupts
>> assembly_list_inuse/assembly_list_free if those messages are recedived
>> after SYNC is done.
>>
>> The solution is marking related CFG APIs as CS_LIB_FLOW_CONTROL_REQUIRED.
>>
>> Signed-off-by: Jason HU <[email protected]>
>> ---
>>   exec/cfg.c | 8 ++++----
>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/exec/cfg.c b/exec/cfg.c
>> index b061bd4..b209331 100644
>> --- a/exec/cfg.c
>> +++ b/exec/cfg.c
>> @@ -177,15 +177,15 @@ static struct corosync_lib_handler cfg_lib_engine[]
>> =
>>         },
>>         { /* 2 */
>>                 .lib_handler_fn         =
>> message_handler_req_lib_cfg_killnode,
>> -               .flow_control           = CS_LIB_FLOW_CONTROL_NOT_REQUIRED
>> +               .flow_control           = CS_LIB_FLOW_CONTROL_REQUIRED
>>         },
>>         { /* 3 */
>>                 .lib_handler_fn         =
>> message_handler_req_lib_cfg_tryshutdown,
>> -               .flow_control           = CS_LIB_FLOW_CONTROL_NOT_REQUIRED
>> +               .flow_control           = CS_LIB_FLOW_CONTROL_REQUIRED
>>         },
>>         { /* 4 */
>>                 .lib_handler_fn         =
>> message_handler_req_lib_cfg_replytoshutdown,
>> -               .flow_control           = CS_LIB_FLOW_CONTROL_NOT_REQUIRED
>> +               .flow_control           = CS_LIB_FLOW_CONTROL_REQUIRED
>>         },
>>         { /* 5 */
>>                 .lib_handler_fn         =
>> message_handler_req_lib_cfg_get_node_addrs,
>> @@ -197,7 +197,7 @@ static struct corosync_lib_handler cfg_lib_engine[] =
>>         },
>>         { /* 7 */
>>                 .lib_handler_fn         =
>> message_handler_req_lib_cfg_reload_config,
>> -               .flow_control           = CS_LIB_FLOW_CONTROL_NOT_REQUIRED
>> +               .flow_control           = CS_LIB_FLOW_CONTROL_REQUIRED
>>         }
>>   };
>>
>>
>>
>
_______________________________________________
discuss mailing list
[email protected]
http://lists.corosync.org/mailman/listinfo/discuss

Reply via email to