Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 06:06:13PM -0700, Steven Dake wrote:
> I'd like to clear up that when Andrew talks about the membership not
> generating a leave event for totem processes in this scenario (which he
> integrates directly with), this is true. But cpg should generate a
> leave event.

Even if the pid is the same? That is, if my node reboots very fast,
and my daemon comes back. What happens in cpg if a) my daemon has a
different pid, b) my daemon has the same pid?
I'd like to see a) a leave event for the old nodeid+pid and a join
event for the new nodeid+pid, b) a leave and a join event for the
nodeid+pid.

Joel

-- 
Life's Little Instruction Book #306

	"Take a nap on Sunday afternoons."

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127
___
Openais mailing list
Openais@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/openais
[Openais] [PATCH]: openais/trunk: Fix confchg_fn for message service
This patch fixes the issues that Steve mentioned yesterday regarding the message service confchg_fn. Most importantly, it renames the global member_list such that there is not collision with the member_list passed as an argument to confchg_fn. Also, the entire confchg_fn code was reworked to behave like the checkpoint service code. This is based on the code in whitetank, known to be stable. Index: services/msg.c === --- services/msg.c (revision 1789) +++ services/msg.c (working copy) @@ -101,6 +101,8 @@ }; enum msg_sync_state { + MSG_SYNC_STATE_NOT_STARTED, + MSG_SYNC_STATE_STARTED, MSG_SYNC_STATE_QUEUE, MSG_SYNC_STATE_GROUP, }; @@ -432,7 +434,7 @@ void *conn, void *msg); -static enum msg_sync_state msg_sync_state; +static enum msg_sync_state msg_sync_state = MSG_SYNC_STATE_NOT_STARTED; static enum msg_sync_iteration_state msg_sync_iteration_state; static struct list_head *msg_sync_iteration_queue; @@ -452,8 +454,8 @@ static void msg_sync_refcount_calculate ( struct queue_entry *queue); -static unsigned int member_list[PROCESSOR_COUNT_MAX]; -static unsigned int member_list_entries = 0; +static unsigned int msg_member_list[PROCESSOR_COUNT_MAX]; +static unsigned int msg_member_list_entries = 0; static unsigned int lowest_nodeid = 0; static struct memb_ring_id saved_ring_id; @@ -1016,8 +1018,8 @@ { unsigned int i; - for (i = 0; i < member_list_entries; i++) { - if (nodeid == member_list[i]) { + for (i = 0; i < msg_member_list_entries; i++) { + if (nodeid == msg_member_list[i]) { return (1); } } @@ -1086,34 +1088,34 @@ log_printf (LOG_LEVEL_NOTICE, "[DEBUG]: msg_confchg_fn\n"); -#ifdef TODO - if (configuration_type == TOTEM_CONFIGURATION_TRANSITIONAL) { - for (i = 0; i < left_list_entries; i++) { - for (j = 0; j < member_list_entries; j++) { - if (left_list[i] == member_list[j]) { - member_list[j] = 0; -/* The above line should not assign to member_list since it is internal to totem. 
*/ + memcpy (&saved_ring_id, ring_id, + sizeof (struct memb_ring_id)); + + if (configuration_type != TOTEM_CONFIGURATION_REGULAR) { + return; + } + if (msg_sync_state != MSG_SYNC_STATE_NOT_STARTED) { + return; + } + + msg_sync_state = MSG_SYNC_STATE_STARTED; + + for (i = 0; i < msg_member_list_entries; i++) { + for (j = 0; j < member_list_entries; j++) { + if (msg_member_list[i] == member_list[j]) { + if (lowest_nodeid > member_list[j]) { + lowest_nodeid = member_list[j]; } } } } -#endif - lowest_nodeid = 0x; + memcpy (msg_member_list, member_list, + sizeof (unsigned int) * member_list_entries); - if (configuration_type == TOTEM_CONFIGURATION_REGULAR) { - memcpy (member_list, member_list, - sizeof (unsigned int) * member_list_entries); - member_list_entries = member_list_entries; - memcpy (&saved_ring_id, ring_id, - sizeof (struct memb_ring_id)); - for (i = 0; i < member_list_entries; i++) { - if ((member_list[i] != 0) && - (member_list[i] < lowest_nodeid)) { - lowest_nodeid = member_list[i]; - } - } - } + msg_member_list_entries = member_list_entries; + + return; } static int msg_name_match (const SaNameT *name_a, const SaNameT *name_b) @@ -2156,6 +2158,8 @@ /* msg_print_queue_list (&queue_list_head); msg_print_group_list (&group_list_head); */ + msg_sync_state = MSG_SYNC_STATE_NOT_STARTED; + return; } ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] detecting cpg joiners
> guarantees you seek, and if it doesn't, it is defective. The only
> exception might be if the new process reuses the same PID since the
> pid/nodeid/group are the uniqifiers and if pid is the same, there is no
> way to detect the new process (and remove the old one).

PID reuse happens more often than you may think. We finally started
using a PID/starttime tuple to get unique process identifiers.

- Dietmar
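A minimal sketch of the PID/starttime idea, assuming the start time is available as a single integer (for example, clock ticks since boot). The struct and function names are hypothetical and are not part of the cpg API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical identity record; field names are illustrative only. */
struct proc_identity {
	uint32_t nodeid;
	uint32_t pid;
	uint64_t start_time;	/* assumed: process start time in clock ticks */
};

/*
 * Two records name the same process instance only if nodeid, pid and
 * start time all match. A restarted daemon that happens to reuse its
 * old pid still differs in start_time.
 */
static bool proc_identity_equal (
	const struct proc_identity *a,
	const struct proc_identity *b)
{
	assert (a != NULL && b != NULL);

	return (a->nodeid == b->nodeid &&
		a->pid == b->pid &&
		a->start_time == b->start_time);
}
```

With such a tuple, a daemon that comes back on the same node with the same pid would compare as a different process, so the old instance could be reported as having left.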
Re: [Openais] [PATCH]: openais/trunk: Create timer_hdb with DECLARE_HDB_DATABASE
good for merge

regards
-steve

On Thu, 2009-04-09 at 23:10 -0500, Ryan O'Hara wrote:
> This patch uses new DECLARE_HDB_DATABASE macro create
> timer_hdb. Replaces old method of declaring and initializing the
> handle database.
[Openais] [PATCH]: openais/trunk: Create timer_hdb with DECLARE_HDB_DATABASE
This patch uses the new DECLARE_HDB_DATABASE macro to create
timer_hdb. It replaces the old method of declaring and initializing the
handle database.

Index: services/tmr.c
===================================================================
--- services/tmr.c	(revision 1789)
+++ services/tmr.c	(working copy)
@@ -76,12 +76,7 @@
 	struct list_head cleanup_list;
 };
 
-static struct hdb_handle_database timer_hdb = {
-	.handle_count	= 0,
-	.handles	= 0,
-	.iterator	= 0,
-	.mutex		= PTHREAD_MUTEX_INITIALIZER
-};
+DECLARE_HDB_DATABASE (timer_hdb);
 
 static struct corosync_api_v1 *api;
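As a rough illustration of the pattern behind a declare-and-initialize macro like this one, the sketch below defines a static object together with a constructor function that runs before main(). It assumes GCC or Clang (`__attribute__((constructor))`); `example_db`, `DECLARE_EXAMPLE_DB`, `example_db_get` and `example_timer_db` are hypothetical names, not the hdb API:

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

/* Hypothetical stand-in for a handle database. */
struct example_db {
	unsigned int handle_count;
	pthread_mutex_t lock;
	int initialized;
};

/*
 * Declare a static database plus a constructor that zeroes it and
 * initializes its lock before main() runs (GCC/Clang extension).
 */
#define DECLARE_EXAMPLE_DB(name)					\
static struct example_db name;						\
static void name##_init (void) __attribute__((constructor));		\
static void name##_init (void)						\
{									\
	memset (&(name), 0, sizeof (struct example_db));		\
	pthread_mutex_init (&(name).lock, NULL);			\
	(name).initialized = 1;						\
}

DECLARE_EXAMPLE_DB (example_timer_db);

/* Accessor that catches use of a database whose constructor never ran. */
static struct example_db *example_db_get (struct example_db *db)
{
	assert (db->initialized == 1);
	return (db);
}
```

The point of the pattern is that the lock is initialized by a function call at load time, which also works for lock types that have no static initializer.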
[Openais] [corosync trunk] do proper check in main.c
The service_available function was returning the ais_service entry
itself, converted to int. Instead, the function should check that the
ais_service entry is not NULL.

Regards
-steve

Index: exec/main.c
===================================================================
--- exec/main.c	(revision 2050)
+++ exec/main.c	(working copy)
@@ -538,7 +538,7 @@
 
 static int corosync_service_available (unsigned int service)
 {
-	return (ais_service[service]);
+	return (ais_service[service] != NULL);
 }
 
 static int corosync_response_size_get (unsigned int service, unsigned int id)
[Openais] [corosync trunk] cast coroipcs iovec entry from const to non const
see subject

Index: exec/coroipcs.c
===================================================================
--- exec/coroipcs.c	(revision 2050)
+++ exec/coroipcs.c	(working copy)
@@ -869,7 +869,7 @@
 {
 	struct iovec iov;
 
-	iov.iov_base = msg;
+	iov.iov_base = (void *)msg;
 	iov.iov_len = mlen;
 
 	msg_send_or_queue (conn, &iov, 1);
[Openais] remove warning from keygen and report error condition properly
Regards
-steve

Index: tools/corosync-keygen.c
===================================================================
--- tools/corosync-keygen.c	(revision 2050)
+++ tools/corosync-keygen.c	(working copy)
@@ -83,9 +83,13 @@
 		exit (1);
 	}
 	/*
-	 * Set security of authorization key to uid = 0 uid = 0 mode = 0400
+	 * Set security of authorization key to uid = 0 gid = 0 mode = 0400
 	 */
-	fchown (authkey_fd, 0, 0);
+	res = fchown (authkey_fd, 0, 0);
+	if (res == -1) {
+		perror ("Could not fchown key to uid 0 and gid 0\n");
+		exit (1);
+	}
 	fchmod (authkey_fd, 0400);
 
 	printf ("Writing corosync key to " KEYFILE ".\n");
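The patch checks fchown but still ignores the result of fchmod. A hedged sketch of checking both calls is below; `key_fd_secure` is a hypothetical helper, not code from corosync-keygen, and it takes the owner, group and mode as parameters only so that it can be exercised without root:

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: restrict a key file to the given owner, group
 * and mode, reporting the first failure.
 * Returns 0 on success and -1 on error.
 */
static int key_fd_secure (int fd, uid_t uid, gid_t gid, mode_t mode)
{
	assert (mode <= 07777);

	if (fchown (fd, uid, gid) == -1) {
		/* perror appends ": <error text>\n", so no trailing newline */
		perror ("Could not fchown key");
		return (-1);
	}
	if (fchmod (fd, mode) == -1) {
		perror ("Could not fchmod key");
		return (-1);
	}
	return (0);
}
```

In the keygen tool this would correspond to something like `key_fd_secure (authkey_fd, 0, 0, 0400)` followed by `exit (1)` on failure.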
[Openais] [corosync trunk] remove warnings in wthread.c
see subject

Index: exec/wthread.c
===================================================================
--- exec/wthread.c	(revision 2050)
+++ exec/wthread.c	(working copy)
@@ -65,8 +65,7 @@
 	struct thread_data thread_data;
 };
 
-void *worker_thread (void *thread_data_in) __attribute__((__noreturn__));
-void *worker_thread (void *thread_data_in) {
+static void *worker_thread (void *thread_data_in) {
 	struct thread_data *thread_data = (struct thread_data *)thread_data_in;
 	struct worker_thread *worker_thread =
 		(struct worker_thread *)thread_data->data;
@@ -96,6 +95,7 @@
 		}
 		pthread_mutex_unlock (&worker_thread->done_work_mutex);
 	}
+	return (NULL);
 }
 
 int worker_thread_group_init (
[Openais] [corosync trunk] remove -Wcast-qual from warnings list in configure.ac
This warning now appears only for iovecs whose const-qualified buffers
are cast to non-const to match the struct iovec definition. Remove this
class of warnings from the tree since they are invalid.

Regards
-steve

Index: configure.ac
===================================================================
--- configure.ac	(revision 2050)
+++ configure.ac	(working copy)
@@ -244,7 +244,6 @@
 	declaration-after-statement
 	pointer-arith
 	write-strings
-	cast-qual
 	cast-align
 	bad-function-cast
 	missing-format-attribute
[Openais] [corosync trunk] remove admin_state_set and admin_state_get from cfg apis.
remove admin state set and get from api since they are unused and only necessary by amf in openais. Index: include/corosync/cfg.h === --- include/corosync/cfg.h (revision 2050) +++ include/corosync/cfg.h (working copy) @@ -188,18 +188,6 @@ unsigned int service_ver); cs_error_t -corosync_cfg_administrative_state_get ( - corosync_cfg_handle_t cfg_handle, - corosync_cfg_administrative_target_t administrative_target, - corosync_cfg_administrative_state_t *administrative_state); - -cs_error_t -corosync_cfg_administrative_state_set ( - corosync_cfg_handle_t cfg_handle, - corosync_cfg_administrative_target_t administrative_target, - corosync_cfg_administrative_state_t administrative_state); - -cs_error_t corosync_cfg_kill_node ( corosync_cfg_handle_t cfg_handle, unsigned int nodeid, Index: lib/cfg.c === --- lib/cfg.c (revision 2050) +++ lib/cfg.c (working copy) @@ -586,95 +586,6 @@ } cs_error_t -corosync_cfg_admin_state_get ( - corosync_cfg_handle_t cfg_handle, - corosync_cfg_administrative_target_t administrative_target, - corosync_cfg_administrative_state_t *administrative_state) -{ - struct cfg_instance *cfg_instance; - struct req_lib_cfg_administrativestateget req_lib_cfg_administrativestateget; - struct res_lib_cfg_administrativestateget res_lib_cfg_administrativestateget; - cs_error_t error; - struct iovec iov; - - error = saHandleInstanceGet (&cfg_hdb, cfg_handle, - (void *)&cfg_instance); - if (error != CS_OK) { - return (error); - } - - req_lib_cfg_administrativestateget.header.id = MESSAGE_REQ_CFG_ADMINISTRATIVESTATEGET; - req_lib_cfg_administrativestateget.header.size = sizeof (struct req_lib_cfg_administrativestateget); - req_lib_cfg_administrativestateget.administrative_target = administrative_target; - - pthread_mutex_lock (&cfg_instance->response_mutex); - - iov.iov_base = &req_lib_cfg_administrativestateget, - iov.iov_len = sizeof (struct req_lib_cfg_administrativestateget), - - pthread_mutex_lock (&cfg_instance->response_mutex); - - error = 
coroipcc_msg_send_reply_receive (cfg_instance->ipc_ctx, - &iov, - 1, - &res_lib_cfg_administrativestateget, - sizeof (struct res_lib_cfg_administrativestateget)); - - error = res_lib_cfg_administrativestateget.header.error; - - pthread_mutex_unlock (&cfg_instance->response_mutex); - - (void)saHandleInstancePut (&cfg_hdb, cfg_handle); - -return (error == CS_OK ? res_lib_cfg_administrativestateget.header.error : error); -} - -cs_error_t -corosync_cfg_admin_state_set ( - corosync_cfg_handle_t cfg_handle, - corosync_cfg_administrative_target_t administrative_target, - corosync_cfg_administrative_state_t administrative_state) -{ - struct cfg_instance *cfg_instance; - struct req_lib_cfg_administrativestateset req_lib_cfg_administrativestateset; - struct res_lib_cfg_administrativestateset res_lib_cfg_administrativestateset; - cs_error_t error; - struct iovec iov; - - error = saHandleInstanceGet (&cfg_hdb, cfg_handle, - (void *)&cfg_instance); - if (error != CS_OK) { - return (error); - } - - req_lib_cfg_administrativestateset.header.id = MESSAGE_REQ_CFG_ADMINISTRATIVESTATEGET; - req_lib_cfg_administrativestateset.header.size = sizeof (struct req_lib_cfg_administrativestateset); - req_lib_cfg_administrativestateset.administrative_target = administrative_target; - req_lib_cfg_administrativestateset.administrative_state = administrative_state; - - pthread_mutex_lock (&cfg_instance->response_mutex); - - iov.iov_base = &req_lib_cfg_administrativestateset, - iov.iov_len = sizeof (struct req_lib_cfg_administrativestateset), - - pthread_mutex_lock (&cfg_instance->response_mutex); - - error = coroipcc_msg_send_reply_receive (cfg_instance->ipc_ctx, - &iov, - 1, - &res_lib_cfg_administrativestateset, - sizeof (struct res_lib_cfg_administrativestateset)); - - error = res_lib_cfg_administrativestateset.header.error; - - pthread_mutex_unlock (&cfg_instance->response_mutex); - - (void)saHandleInstancePut (&cfg_hdb, cfg_handle); - -return (error == CS_OK ? 
res_lib_cfg_administrativestateset.header.error : error); -} - -cs_error_t corosync_cfg_kill_node ( corosync_cfg_handle_t cfg_handle, unsigned int nodeid, @@ -846,10 +757,11 @@ addrlen = sizeof(struct sockaddr_in6); for (i=0; inum_addrs; i++) { - addrs[i].address_length = addrlen; struct sockaddr_in *in; struct sockaddr_in6 *in6; + addrs[i].address_length = addrlen; + if (res_lib_cfg_get_node_addrs->family == AF_INET) { in = (struct sockaddr_in *)addrs[i].address; in->sin_family = AF_INET; ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
[Openais] [corosync trunk] remove warning in corosync-objctl
see subject

Index: tools/corosync-objctl.c
===================================================================
--- tools/corosync-objctl.c	(revision 2050)
+++ tools/corosync-objctl.c	(working copy)
@@ -249,7 +249,7 @@
 	hdb_handle_t obj_handle;
 	confdb_handle_t parent_object_handle = OBJECT_PARENT_HANDLE;
 	char tmp_name[OBJ_NAME_SIZE];
-	cs_error_t res;
+	cs_error_t res = CS_OK;
 
 	strncpy (tmp_name, name_pt, OBJ_NAME_SIZE);
 	obj_name_pt = strtok_r(tmp_name, SEPERATOR_STR, &save_pt);
[Openais] [corosync trunk] use spin locks in hdb api
The hdb api is a perfect candidate for spinlocks to protect the critical sections in the database. This patch adds them (if available on the os) to the hdb API. It also adds functions to declare different kinds of hdb databases. performance increase = 25% msgs/sec for evsbench with small message sizes. regards -steve Index: lcr/lcr_ifact.c === --- lcr/lcr_ifact.c (revision 2045) +++ lcr/lcr_ifact.c (working copy) @@ -61,6 +61,11 @@ void (*destructor) (void *context); }; +DECLARE_HDB_DATABASE_FIRSTRUN (lcr_component_instance_database); + +DECLARE_HDB_DATABASE_FIRSTRUN (lcr_iface_instance_database); + +/* static struct hdb_handle_database lcr_component_instance_database = { .handle_count = 0, .handles = 0, @@ -72,6 +77,7 @@ .handles = 0, .iterator = 0 }; +*/ static hdb_handle_t g_component_handle = 0x; Index: include/corosync/hdb.h === --- include/corosync/hdb.h (revision 2045) +++ include/corosync/hdb.h (working copy) @@ -60,14 +60,80 @@ unsigned int handle_count; struct hdb_handle *handles; unsigned int iterator; - pthread_mutex_t mutex; +#if defined(HAVE_PTHREAD_SPIN_LOCK) + pthread_spinlock_t lock; +#else + pthread_mutex_t lock; +#endif + unsigned int first_run; }; +#if defined(HAVE_PTHREAD_SPIN_LOCK) +static inline void hdb_database_lock (pthread_spinlock_t *spinlock) +{ + pthread_spin_lock (spinlock); +} + +static inline void hdb_database_unlock (pthread_spinlock_t *spinlock) +{ + pthread_spin_unlock (spinlock); +} +static inline void hdb_database_lock_init (pthread_spinlock_t *spinlock) +{ + pthread_spin_init (spinlock, 0); +} + +static inline void hdb_database_lock_destroy (pthread_spinlock_t *spinlock) +{ + pthread_spin_destroy (spinlock); +} + +#else +static inline void hdb_database_lock (pthread_mutex_t *mutex) +{ + pthread_mutex_lock (mutex); +} + +static inline void hdb_database_unlock (pthread_mutex_t *mutex) +{ + pthread_mutex_unlock (mutex); +} +static inline void hdb_database_lock_init (pthread_mutex_t *mutex) +{ + pthread_mutex_init (mutex, NULL); 
+} + +static inline void hdb_database_lock_destroy (pthread_mutex_t *mutex) +{ + pthread_mutex_destroy (mutex); +} +#endif + +#define DECLARE_HDB_DATABASE(database_name)\ +static struct hdb_handle_database (database_name); \ +static void database_name##_init(void)__attribute__((constructor)); \ +static void database_name##_init(void) \ +{ \ + memset (&(database_name), 0, sizeof (struct hdb_handle_database));\ + hdb_database_lock_init (&(database_name).lock); \ +} + +#define DECLARE_HDB_DATABASE_FIRSTRUN(database_name) \ +static struct hdb_handle_database (database_name) = { \ + .first_run = 1, \ +}; \ +static void database_name##_init(void)__attribute__((constructor)); \ +static void database_name##_init(void) \ +{ \ + memset (&(database_name), 0, sizeof (struct hdb_handle_database));\ + hdb_database_lock_init (&(database_name).lock); \ +} + static inline void hdb_create ( struct hdb_handle_database *handle_database) { memset (handle_database, 0, sizeof (struct hdb_handle_database)); - pthread_mutex_init (&handle_database->mutex, NULL); + hdb_database_lock_init (&handle_database->lock); } static inline void hdb_destroy ( @@ -76,7 +142,7 @@ if (handle_database->handles) { free (handle_database->handles); } - pthread_mutex_destroy (&handle_database->mutex); + hdb_database_lock_destroy (&handle_database->lock); memset (handle_database, 0, sizeof (struct hdb_handle_database)); } @@ -93,7 +159,11 @@ void *instance; int i; - pthread_mutex_lock (&handle_database->mutex); + if (handle_database->first_run == 1) { + memset (handle_database, 0, sizeof (struct hdb_handle_database)); + hdb_database_lock_init (&handle_database->lock); + } + hdb_database_lock (&handle_database->lock); for (handle = 0; handle < handle_database->handle_count; handle++) { if (handle_database->handles[handle].state == HDB_HANDLE_STATE_EMPTY) { @@ -107,7 +177,7 @@ new_handles = (struct hdb_handle *)realloc (handle_database->handles, sizeof (struct hdb_handle) * handle_database->handle_count); if 
(new_handles == NULL) { - pthread_mutex_unlock (&handle_database->mutex); + hdb_database_unlock (&handle_database->lock); return (-1); } handle_database->handles = new_handles; @@ -143,7 +213,7 @@ *handle_id_out = (((unsigned long long)(check)) << 32) | handle; - pthread_mutex_unlock (&handle_database->mutex); + hdb_database_unlock (&handle_database->lock); return (0); } @@ -156,23 +226,23 @@ unsigned int check = ((unsigned int)(((unsigned long long)handle_in) >> 32)); unsigned int handle = handle_in & 0x; - pthread_mutex_lock (&handle_database->mutex); + hdb_database_l
Re: [Openais] detecting cpg joiners
On Thu, 2009-04-09 at 17:17 -0700, Joel Becker wrote:
> On Thu, Apr 09, 2009 at 04:09:18PM -0500, David Teigland wrote:
> > On Thu, Apr 09, 2009 at 03:50:08PM -0500, David Teigland wrote:
> > > On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote:
> > > > On Thu, Apr 9, 2009 at 20:49, Joel Becker wrote:
> > > > > On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote:
> > > > >> For added fun, a node that restarts quickly enough (think a VM) won't
> > > > >> even appear to have left (or rejoined) the cluster.
> > > > >> At the next totem confchg event, It will simply just be there again
> > > > >> with no indication that anything happened.
> > > > >
> > > > > This had BETTER not happen.
> > > >
> > > > It does, I've seen it enough times that Pacemaker has code to deal
> > > > with it.
> > >
> > > I'd call that a serious flaw we need to get fixed. I'll see if I can
> > > make it happen here.
> >
> > That was pretty simple.
> > - set token to 5 minutes
> > - nodes 1,2,3,4 are cluster members and members of a cpg
> > - on node4: ifdown eth0, kill corosync, ifup eth0, start corosync
> > - nodes 1,2,3 seem completely unaware that 4 ever went away
> >
> > When node 4 joins the cpg after coming back, the cpg on nodes 1,2,3
> > think that a new fifth process/node is joining the cpg. The cpg on
> > node 4 shows itself being added as a new fourth cpg member.
>
> Steve,
> If node 4's old process went away, shouldn't we be getting a
> 'leave' for that, rather than it persisting in the member list?
>
> Joel
>

I'd like to clear up that when Andrew talks about the membership not
generating a leave event for totem processes in this scenario (which he
integrates directly with), this is true. But cpg should generate a
leave event.
Re: [Openais] detecting cpg joiners
On Thu, 2009-04-09 at 17:17 -0700, Joel Becker wrote:
> On Thu, Apr 09, 2009 at 04:09:18PM -0500, David Teigland wrote:
> > On Thu, Apr 09, 2009 at 03:50:08PM -0500, David Teigland wrote:
> > > On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote:
> > > > On Thu, Apr 9, 2009 at 20:49, Joel Becker wrote:
> > > > > On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote:
> > > > >> For added fun, a node that restarts quickly enough (think a VM) won't
> > > > >> even appear to have left (or rejoined) the cluster.
> > > > >> At the next totem confchg event, It will simply just be there again
> > > > >> with no indication that anything happened.
> > > > >
> > > > > This had BETTER not happen.
> > > >
> > > > It does, I've seen it enough times that Pacemaker has code to deal
> > > > with it.
> > >
> > > I'd call that a serious flaw we need to get fixed. I'll see if I can
> > > make it happen here.
> >
> > That was pretty simple.
> > - set token to 5 minutes
> > - nodes 1,2,3,4 are cluster members and members of a cpg
> > - on node4: ifdown eth0, kill corosync, ifup eth0, start corosync
> > - nodes 1,2,3 seem completely unaware that 4 ever went away
> >
> > When node 4 joins the cpg after coming back, the cpg on nodes 1,2,3
> > think that a new fifth process/node is joining the cpg. The cpg on
> > node 4 shows itself being added as a new fourth cpg member.
>
> Steve,
> If node 4's old process went away, shouldn't we be getting a
> 'leave' for that, rather than it persisting in the member list?
>

I believe the issue that Dave is talking about is described in the
following bugzilla:

https://bugzilla.redhat.com/show_bug.cgi?id=489451

The bugzilla is a little misleading. I think sync prior to this bug fix
didn't work at all.

IMO you should get a leave event for any process that leaves the process
group independent of how totem works underneath. CPG should provide the
guarantees you seek, and if it doesn't, it is defective. The only
exception might be if the new process reuses the same PID since the
pid/nodeid/group are the uniqifiers and if pid is the same, there is no
way to detect the new process (and remove the old one).

How it works in reality, I am not sure. Have you tried Dave's test case
with a recent whitetank?

Honza and I are working on a rework of the cpg service engine which
should have correct behavior in whitetank and corosync when it is
finished, as well as fix race condition crashes.

Regards
-steve

> Joel
>
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 04:09:18PM -0500, David Teigland wrote:
> On Thu, Apr 09, 2009 at 03:50:08PM -0500, David Teigland wrote:
> > On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote:
> > > On Thu, Apr 9, 2009 at 20:49, Joel Becker wrote:
> > > > On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote:
> > > >> For added fun, a node that restarts quickly enough (think a VM) won't
> > > >> even appear to have left (or rejoined) the cluster.
> > > >> At the next totem confchg event, It will simply just be there again
> > > >> with no indication that anything happened.
> > > >
> > > > This had BETTER not happen.
> > >
> > > It does, I've seen it enough times that Pacemaker has code to deal
> > > with it.
> >
> > I'd call that a serious flaw we need to get fixed. I'll see if I can
> > make it happen here.
>
> That was pretty simple.
> - set token to 5 minutes
> - nodes 1,2,3,4 are cluster members and members of a cpg
> - on node4: ifdown eth0, kill corosync, ifup eth0, start corosync
> - nodes 1,2,3 seem completely unaware that 4 ever went away
>
> When node 4 joins the cpg after coming back, the cpg on nodes 1,2,3 think that
> a new fifth process/node is joining the cpg. The cpg on node 4 shows itself
> being added as a new fourth cpg member.

Steve,
	If node 4's old process went away, shouldn't we be getting a
'leave' for that, rather than it persisting in the member list?

Joel

-- 
"I don't want to achieve immortality through my work; I want to
 achieve immortality through not dying."
	- Woody Allen

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote:
> On Thu, Apr 9, 2009 at 20:49, Joel Becker wrote:
> > On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote:
> >> For added fun, a node that restarts quickly enough (think a VM) won't
> >> even appear to have left (or rejoined) the cluster.
> >> At the next totem confchg event, It will simply just be there again
> >> with no indication that anything happened.
> >
> > This had BETTER not happen.
>
> It does, I've seen it enough times that Pacemaker has code to deal with it.

Andrew, I'm mad at you. This is death for filesystems. Next
time, please let us know when the system is this bad :-)

Joel

-- 
"Ninety feet between bases is perhaps as close as man has ever come
 to perfection."
	- Red Smith

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 03:17:47PM -0700, Steven Dake wrote:
> A proper system using this model doesn't care - it synchronizes every
> time regardless of who left or joined based upon whether it has state to
> sync that is unique.

Dave,
	If we're going to use cpg for our membership, we need to come up
with a scheme to detect these node downs. We probably should do this
together, so we don't reinvent it.

Joel

-- 
"If you are ever in doubt as to whether or not to kiss a pretty girl,
 give her the benefit of the doubt"
	-Thomas Carlyle

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 03:17:47PM -0700, Steven Dake wrote:
> You want a guarantee that virtual synchrony doesn't provide. Virtual
> synchrony doesn't provide indications of join or left, but only the
> current membership. It has no way of knowing who joined, or left other
> than to take the previous membership list and compare it to the current.
> Keep that in mind when looking at the joined and left list in your
> callbacks.
>
> A proper system using this model doesn't care - it synchronizes every
> time regardless of who left or joined based upon whether it has state to
> sync that is unique.
>
> I was tempted long ago to remove the join and left lists from the
> callbacks, since they don't really make any sense, but the community
> said they could deal with this quirk.

Hmm, I don't think any of us in the world of dlms realized this.
You're providing the level-triggered case, and we mostly only care about
the edges.
	ocfs2, for example, doesn't really care who the members are. It
just needs to know when one died. And if we can't reliably detect that,
we're dead in the water.

Joel

-- 
Life's Little Instruction Book #497

	"Go down swinging."

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127
Re: [Openais] detecting cpg joiners
On Thu, 2009-04-09 at 16:09 -0500, David Teigland wrote:
> On Thu, Apr 09, 2009 at 03:50:08PM -0500, David Teigland wrote:
> > On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote:
> > > On Thu, Apr 9, 2009 at 20:49, Joel Becker wrote:
> > > > On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote:
> > > >> For added fun, a node that restarts quickly enough (think a VM) won't
> > > >> even appear to have left (or rejoined) the cluster.
> > > >> At the next totem confchg event, It will simply just be there again
> > > >> with no indication that anything happened.
> > > >
> > > > This had BETTER not happen.
> > >
> > > It does, I've seen it enough times that Pacemaker has code to deal
> > > with it.
> >
> > I'd call that a serious flaw we need to get fixed. I'll see if I can
> > make it happen here.
>
> That was pretty simple.
> - set token to 5 minutes
> - nodes 1,2,3,4 are cluster members and members of a cpg
> - on node4: ifdown eth0, kill corosync, ifup eth0, start corosync
> - nodes 1,2,3 seem completely unaware that 4 ever went away
>
> When node 4 joins the cpg after coming back, the cpg on nodes 1,2,3 think that
> a new fifth process/node is joining the cpg. The cpg on node 4 shows itself
> being added as a new fourth cpg member.
>
> Dave
>

You want a guarantee that virtual synchrony doesn't provide. Virtual
synchrony doesn't provide indications of join or left, but only the
current membership. It has no way of knowing who joined, or left other
than to take the previous membership list and compare it to the current.
Keep that in mind when looking at the joined and left list in your
callbacks.

A proper system using this model doesn't care - it synchronizes every
time regardless of who left or joined based upon whether it has state to
sync that is unique.

I was tempted long ago to remove the join and left lists from the
callbacks, since they don't really make any sense, but the community
said they could deal with this quirk.

regards
-steve
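The comparison of the previous membership list against the current one can be sketched roughly as follows. This is only an illustration under the assumption that a membership list is a plain array of node ids; `memb_list_subtract` is a hypothetical name, not a totem or cpg function:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Collect every entry of `from` that is absent from `other` into `out`.
 * Returns the number of entries written. `out` must have room for
 * `from_entries` elements.
 */
static size_t memb_list_subtract (
	const unsigned int *from, size_t from_entries,
	const unsigned int *other, size_t other_entries,
	unsigned int *out)
{
	size_t i;
	size_t j;
	size_t n = 0;

	assert (out != NULL);

	for (i = 0; i < from_entries; i++) {
		int found = 0;

		for (j = 0; j < other_entries; j++) {
			if (from[i] == other[j]) {
				found = 1;
				break;
			}
		}
		if (!found) {
			out[n++] = from[i];
		}
	}
	return (n);
}
```

Under this scheme the left list is previous minus current, and the joined list is current minus previous. A node that disappears and returns between two membership snapshots ends up in neither list, which matches the fast-restart scenario discussed in this thread.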
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 03:50:08PM -0500, David Teigland wrote:
> On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote:
> > On Thu, Apr 9, 2009 at 20:49, Joel Becker wrote:
> > > On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote:
> > >> For added fun, a node that restarts quickly enough (think a VM) won't
> > >> even appear to have left (or rejoined) the cluster.
> > >> At the next totem confchg event, It will simply just be there again
> > >> with no indication that anything happened.
> > >
> > > This had BETTER not happen.
> >
> > It does, I've seen it enough times that Pacemaker has code to deal with it.
>
> I'd call that a serious flaw we need to get fixed. I'll see if I can make it
> happen here.

Yeah, if this is the way it works, ocfs2's going to have to go
drop openais, and I don't want to do that.

Joel

-- 
"All alone at the end of the evening
 When the bright lights have faded to blue.
 I was thinking about a woman who had loved me
 And I never knew"

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 03:50:08PM -0500, David Teigland wrote: > On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote: > > On Thu, Apr 9, 2009 at 20:49, Joel Becker wrote: > > > On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote: > > >> For added fun, a node that restarts quickly enough (think a VM) won't > > >> even appear to have left (or rejoined) the cluster. > > >> At the next totem confchg event, It will simply just be there again > > >> with no indication that anything happened. > > > > > > ? ? ? ?This had BETTER not happen. > > > > It does, I've seen it enough times that Pacemaker has code to deal with it. > > I'd call that a serious flaw we need to get fixed. I'll see if I can make it > happen here. That was pretty simple. - set token to 5 minutes - nodes 1,2,3,4 are cluster members and members of a cpg - on node4: ifdown eth0, kill corosync, ifup eth0, start corosync - nodes 1,2,3 seem completely unaware that 4 ever went away When node 4 joins the cpg after coming back, the cpg on nodes 1,2,3 think that a new fifth process/node is joining the cpg. The cpg on node 4 shows itself being added as a new fourth cpg member. Dave ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
[Openais] [PATCH]: openais/trunk: Fix saLck dispatch code
Same as other patches for service dispatch code. Hold lock on dispatch_mutex only during coroipcc_dispatch_recv. Handle finalize/dispatch_avail correctly. Ryan Index: lib/lck.c === --- lib/lck.c (revision 1787) +++ lib/lck.c (working copy) @@ -363,16 +363,13 @@ } do { + pthread_mutex_lock(&lckInstance->dispatch_mutex); + dispatch_avail = coroipcc_dispatch_recv (lckInstance->ipc_ctx, (void *)&dispatch_data, sizeof (dispatch_data), timeout); - pthread_mutex_lock(&lckInstance->dispatch_mutex); + pthread_mutex_unlock(&lckInstance->dispatch_mutex); - if (lckInstance->finalize == 1) { - error = SA_AIS_OK; - goto error_unlock; - } - if (dispatch_avail == 0 && dispatchFlags == SA_DISPATCH_ALL) { pthread_mutex_unlock(&lckInstance->dispatch_mutex); break; /* exit do while cont is 1 loop */ @@ -381,7 +378,15 @@ pthread_mutex_unlock(&lckInstance->dispatch_mutex); continue; } - + if (dispatch_avail == -1) { + if (lckInstance->finalize == 1) { + error = SA_AIS_OK; + } else { + error = SA_AIS_ERR_LIBRARY; + } + goto error_put; + } + /* * Make copy of callbacks, message data, unlock instance, * and call callback. A risk of this dispatch method is that @@ -389,7 +394,7 @@ * LckFinalize has been called in another thread. */ memcpy(&callbacks,&lckInstance->callbacks, sizeof(lckInstance->callbacks)); - pthread_mutex_unlock(&lckInstance->dispatch_mutex); + /* * Dispatch incoming response */ @@ -527,7 +532,7 @@ */ switch (dispatchFlags) { case SA_DISPATCH_ONE: - cont = 0; + cont = 0; break; case SA_DISPATCH_ALL: break; @@ -535,8 +540,8 @@ break; } } while (cont); -error_unlock: - pthread_mutex_unlock(&lckInstance->dispatch_mutex); + +error_put: saHandleInstancePut(&lckHandleDatabase, lckHandle); error_exit: return (error); ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
[Openais] [PATCH]: openais/trunk: Fix saTmr dispatch code
Same as other patches for service dispatch code. Hold lock on dispatch_mutex only during coroipcc_dispatch_recv. Handle finalize/dispatch_avail correctly. Ryan Index: lib/tmr.c === --- lib/tmr.c (revision 1787) +++ lib/tmr.c (working copy) @@ -226,32 +226,33 @@ } do { + pthread_mutex_lock (&tmrInstance->dispatch_mutex); + dispatch_avail = coroipcc_dispatch_recv (tmrInstance->ipc_ctx, (void *)&dispatch_data, sizeof (dispatch_data), timeout); - pthread_mutex_lock (&tmrInstance->dispatch_mutex); + pthread_mutex_unlock (&tmrInstance->dispatch_mutex); - if (tmrInstance->finalize == 1) { - error = SA_AIS_OK; - goto error_unlock; - } - if (dispatch_avail == 0 && dispatchFlags == SA_DISPATCH_ALL) { - pthread_mutex_unlock (&tmrInstance->dispatch_mutex); break; } else if (dispatch_avail == 0) { - pthread_mutex_unlock (&tmrInstance->dispatch_mutex); continue; } + if (dispatch_avail == -1) { + if (tmrInstance->finalize == 1) { + error = SA_AIS_OK; + } else { + error = SA_AIS_ERR_LIBRARY; + } + goto error_put; + } memset (&dispatch_data, 0, sizeof (struct message_overlay)); memcpy (&callbacks, &tmrInstance->callbacks, sizeof (tmrInstance->callbacks)); - pthread_mutex_unlock (&tmrInstance->dispatch_mutex); - /* DEBUG */ printf ("[DEBUG]: saTmrDispatch { id = %d }\n", dispatch_data.header.id); @@ -288,8 +289,7 @@ } } while (cont); -error_unlock: - pthread_mutex_unlock (&tmrInstance->dispatch_mutex); +error_put: saHandleInstancePut (&tmrHandleDatabase, tmrHandle); error_exit: return (error); ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
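The locking pattern shared by these dispatch patches can be sketched on its own. The code below is not the openais source: `struct sketch_instance`, the `recv` function pointer (standing in for coroipcc_dispatch_recv), the scripted receiver, and the enum values are all assumptions made so the example compiles by itself. It only shows the shape of the loop: hold dispatch_mutex just around the receive call, and consult the finalize flag only when the receive reports -1.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the SA Forum flags and error codes. */
enum sketch_flags { SKETCH_ONE = 1, SKETCH_ALL = 2, SKETCH_BLOCKING = 3 };
enum sketch_error { SKETCH_OK = 0, SKETCH_ERR_LIBRARY = 1 };

struct sketch_instance {
	pthread_mutex_t dispatch_mutex;
	int finalize;	/* set to 1 when another thread finalized the handle */
	/* stand-in receive: 1 = message, 0 = nothing queued, -1 = failure */
	int (*recv)(void *ctx, int timeout);
	void *ctx;
	int handled;	/* counts messages; real code would run callbacks */
};

/* Hypothetical scripted receiver so the loop can be exercised offline. */
struct sketch_script { const int *results; int len; int pos; };

static int sketch_scripted_recv(void *ctx, int timeout)
{
	struct sketch_script *s = ctx;

	(void)timeout;
	if (s->pos >= s->len) {
		return -1;
	}
	return s->results[s->pos++];
}

static enum sketch_error sketch_dispatch(struct sketch_instance *inst,
	enum sketch_flags flags)
{
	enum sketch_error error = SKETCH_OK;
	int timeout = (flags == SKETCH_ALL) ? 0 : -1;
	int cont = 1;
	int avail;

	do {
		/* The mutex covers the receive call and nothing else. */
		pthread_mutex_lock(&inst->dispatch_mutex);
		avail = inst->recv(inst->ctx, timeout);
		pthread_mutex_unlock(&inst->dispatch_mutex);

		if (avail == 0 && flags == SKETCH_ALL) {
			break;		/* queue drained */
		} else if (avail == 0) {
			continue;	/* next poll */
		}
		if (avail == -1) {
			/* A failure after finalize is a normal shutdown. */
			error = (inst->finalize == 1) ?
				SKETCH_OK : SKETCH_ERR_LIBRARY;
			break;	/* the patches jump to error_put here */
		}

		inst->handled++;
		if (flags == SKETCH_ONE) {
			cont = 0;
		}
	} while (cont);

	return error;
}
```

Because no exit path leaves the loop with the mutex held, the error_unlock label in the original code has nothing left to do, which matches its removal in the patches.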
Re: [Openais] [PATCH]: openais/trunk: Fix saEvt dispatch code
Also, this patch changes how/when timeout is set to 0. Existing code
would set timeout=0 when dispatchFlags was set to SA_DISPATCH_ALL -or-
SA_DISPATCH_ONE. This is not correct. We should only set timeout=0 when
dispatchFlags == SA_DISPATCH_ALL.

This is worth noting, and I failed to mention it in my original email.

Ryan

On Thu, Apr 09, 2009 at 03:47:03PM -0500, Ryan O'Hara wrote:
>
> This patch fixes a few issues with the event service dispatch code. It
> is similar to the patch I posted earlier for the saAmf service.
>
> Fixes include:
>
> * Change checking of dispatchFlags to be same as other services.
> * Wrap coroipcc_dispatch_recv with pthread_mutex_lock/unlock. As a
>   result, remove other unlocks. Note that it appears that existing
>   code never called lock on dispatch_mutex.
> * Change finalize check to only occur when dispatch_avail == -1.
> * Remove ignore_dispatch variable, which wasn't used.
>
> Index: lib/evt.c
> ===
> --- lib/evt.c	(revision 1787)
> +++ lib/evt.c	(working copy)
> @@ -597,48 +597,53 @@
>  	struct event_instance *evti;
>  	SaEvtEventHandleT event_handle;
>  	SaEvtCallbacksT callbacks;
> -	int ignore_dispatch = 0;
>  	int cont = 1; /* always continue do loop except when set to 0 */
>  	struct lib_event_data *evt = 0;
>
> -	if (dispatchFlags < SA_DISPATCH_ONE ||
> -			dispatchFlags > SA_DISPATCH_BLOCKING) {
> +	if (dispatchFlags != SA_DISPATCH_ONE &&
> +		dispatchFlags != SA_DISPATCH_ALL &&
> +		dispatchFlags != SA_DISPATCH_BLOCKING) {
> +
>  		return SA_AIS_ERR_INVALID_PARAM;
>  	}
>
>  	error = saHandleInstanceGet(&evt_instance_handle_db, evtHandle,
>  		(void *)&evti);
>  	if (error != SA_AIS_OK) {
> -		return error;
> +		goto dispatch_exit;
>  	}
>
>  	/*
> -	 * Timeout instantly for SA_DISPATCH_ALL
> +	 * Timeout instantly for SA_DISPATCH_ALL, otherwise don't timeout
> +	 * for SA_DISPATCH_BLOCKING or SA_DISPATCH_ONE
>  	 */
> -	if (dispatchFlags == SA_DISPATCH_ALL || dispatchFlags == SA_DISPATCH_ONE) {
> +	if (dispatchFlags == SA_DISPATCH_ALL) {
>  		timeout = 0;
>  	}
>
>  	do {
> +		pthread_mutex_lock (&evti->ei_dispatch_mutex);
> +
>  		dispatch_avail = coroipcc_dispatch_recv (evti->ipc_ctx,
>  			(void *)&evti->ei_dispatch_data,
>  			sizeof (evti->ei_dispatch_data), timeout);
>
> -		/*
> -		 * Handle has been finalized in another thread
> -		 */
> -		if (evti->ei_finalize == 1) {
> -			error = SA_AIS_OK;
> -			goto dispatch_unlock;
> -		}
> +		pthread_mutex_unlock (&evti->ei_dispatch_mutex);
>
>  		if (dispatch_avail == 0 && dispatchFlags == SA_DISPATCH_ALL) {
> -			pthread_mutex_unlock (&evti->ei_dispatch_mutex);
>  			break; /* exit do while cont is 1 loop */
>  		} else
>  		if (dispatch_avail == 0) {
> -			pthread_mutex_unlock (&evti->ei_dispatch_mutex);
> -			continue; /* next poll */
> +			continue;
>  		}
> +		if (dispatch_avail == -1) {
> +			if (evti->ei_finalize == 1) {
> +				error = SA_AIS_OK;
> +			} else {
> +				error = SA_AIS_ERR_LIBRARY;
> +			}
> +			goto dispatch_put;
> +		}
> +
>
>  		/*
>  		 * Make copy of callbacks, message data, unlock instance,
> @@ -732,26 +737,16 @@
>  		default:
>  			printf ("Dispatch: Bad message type 0x%x\n",
>  				evti->ei_dispatch_data.header.id);
>  			error = SA_AIS_ERR_LIBRARY;
> -			goto dispatch_unlock;
>  		}
>
> -		pthread_mutex_unlock(&evti->ei_dispatch_mutex);
> -
>  		/*
>  		 * Determine if more messages should be processed
>  		 */
>  		switch (dispatchFlags) {
>  		case SA_DISPATCH_ONE:
> -			if (ignore_dispatch) {
> -				ignore_dispatch = 0;
> -			} else {
> -				cont = 0;
> -			}
> +			cont = 0;
>  			break;
>  		case SA_DISPATCH_ALL:
> -			if (ignore_dispatch) {
> -				ignore_dispatch = 0;
> -			}
>  			break;
>  		case SA_DISPATCH_BLOCKING:
>  			break;
> @@ -760,10 +755,9 @@
>
>  	goto dispatch_put;
>
> -dispatch_unlock:
> -	pthread_mutex_unlock(&evti->ei_dispatch_mutex);
>  dispatch_put:
Re: [Openais] API change vestiges
Jim Meyering wrote: > Here are a few more. > >>From 052f43f2c3ec1a1a7d6a2e9038ee1fb0e7d222e9 Mon Sep 17 00:00:00 2001 > From: Jim Meyering > Date: Thu, 2 Apr 2009 22:18:51 +0200 > Subject: [PATCH 1/2] cpg.h, objdb.h, coroaph.h: more const/size_t > > * include/corosync/cpg.h (cpg_callbacks_t): > * include/corosync/mar_cpg.h (marshall_to_mar_cpg_name_t): > * lib/cpg.c (cpg_join, cpg_leave): > * lib/cpg.c (cpg_mcast_joined): make iovec const. > * include/corosync/cpg.h (cpg_mcast_joined): update prototype ... >>From 3b8203561bb4709e4151b6bc456d27ec9d4e5f95 Mon Sep 17 00:00:00 2001 > From: Jim Meyering > Date: Thu, 9 Apr 2009 19:51:30 +0200 > Subject: [PATCH 2/2] coroapi.h: Make totem_mcast's *iovec param const. > > * include/corosync/engine/coroapi.h (struct corosync_api_v1): > [totem_mcast]: Make *iovec param const. > --- > exec/main.c |2 +- > exec/main.h |4 ++-- > include/corosync/engine/coroapi.h |4 ++-- Committed after ACK on IRC. ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote: > On Thu, Apr 9, 2009 at 20:49, Joel Becker wrote: > > On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote: > >> For added fun, a node that restarts quickly enough (think a VM) won't > >> even appear to have left (or rejoined) the cluster. > >> At the next totem confchg event, It will simply just be there again > >> with no indication that anything happened. > > > > ? ? ? ?This had BETTER not happen. > > It does, I've seen it enough times that Pacemaker has code to deal with it. I'd call that a serious flaw we need to get fixed. I'll see if I can make it happen here. Dave ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
[Openais] [PATCH]: openais/trunk: Fix saEvt dispatch code
This patch fixes a few issues with the event service dispatch code. It
is similar to the patch I posted earlier for the saAmf service.

Fixes include:

* Change checking of dispatchFlags to be same as other services.
* Wrap coroipcc_dispatch_recv with pthread_mutex_lock/unlock. As a
  result, remove other unlocks. Note that it appears that existing
  code never called lock on dispatch_mutex.
* Change finalize check to only occur when dispatch_avail == -1.
* Remove ignore_dispatch variable, which wasn't used.

Index: lib/evt.c
===
--- lib/evt.c	(revision 1787)
+++ lib/evt.c	(working copy)
@@ -597,48 +597,53 @@
 	struct event_instance *evti;
 	SaEvtEventHandleT event_handle;
 	SaEvtCallbacksT callbacks;
-	int ignore_dispatch = 0;
 	int cont = 1; /* always continue do loop except when set to 0 */
 	struct lib_event_data *evt = 0;

-	if (dispatchFlags < SA_DISPATCH_ONE ||
-			dispatchFlags > SA_DISPATCH_BLOCKING) {
+	if (dispatchFlags != SA_DISPATCH_ONE &&
+		dispatchFlags != SA_DISPATCH_ALL &&
+		dispatchFlags != SA_DISPATCH_BLOCKING) {
+
 		return SA_AIS_ERR_INVALID_PARAM;
 	}

 	error = saHandleInstanceGet(&evt_instance_handle_db, evtHandle,
 		(void *)&evti);
 	if (error != SA_AIS_OK) {
-		return error;
+		goto dispatch_exit;
 	}

 	/*
-	 * Timeout instantly for SA_DISPATCH_ALL
+	 * Timeout instantly for SA_DISPATCH_ALL, otherwise don't timeout
+	 * for SA_DISPATCH_BLOCKING or SA_DISPATCH_ONE
 	 */
-	if (dispatchFlags == SA_DISPATCH_ALL || dispatchFlags == SA_DISPATCH_ONE) {
+	if (dispatchFlags == SA_DISPATCH_ALL) {
 		timeout = 0;
 	}

 	do {
+		pthread_mutex_lock (&evti->ei_dispatch_mutex);
+
 		dispatch_avail = coroipcc_dispatch_recv (evti->ipc_ctx,
 			(void *)&evti->ei_dispatch_data,
 			sizeof (evti->ei_dispatch_data), timeout);

-		/*
-		 * Handle has been finalized in another thread
-		 */
-		if (evti->ei_finalize == 1) {
-			error = SA_AIS_OK;
-			goto dispatch_unlock;
-		}
+		pthread_mutex_unlock (&evti->ei_dispatch_mutex);

 		if (dispatch_avail == 0 && dispatchFlags == SA_DISPATCH_ALL) {
-			pthread_mutex_unlock (&evti->ei_dispatch_mutex);
 			break; /* exit do while cont is 1 loop */
 		} else
 		if (dispatch_avail == 0) {
-			pthread_mutex_unlock (&evti->ei_dispatch_mutex);
-			continue; /* next poll */
+			continue;
 		}
+		if (dispatch_avail == -1) {
+			if (evti->ei_finalize == 1) {
+				error = SA_AIS_OK;
+			} else {
+				error = SA_AIS_ERR_LIBRARY;
+			}
+			goto dispatch_put;
+		}
+

 		/*
 		 * Make copy of callbacks, message data, unlock instance,
@@ -732,26 +737,16 @@
 		default:
 			printf ("Dispatch: Bad message type 0x%x\n",
 				evti->ei_dispatch_data.header.id);
 			error = SA_AIS_ERR_LIBRARY;
-			goto dispatch_unlock;
 		}

-		pthread_mutex_unlock(&evti->ei_dispatch_mutex);
-
 		/*
 		 * Determine if more messages should be processed
 		 */
 		switch (dispatchFlags) {
 		case SA_DISPATCH_ONE:
-			if (ignore_dispatch) {
-				ignore_dispatch = 0;
-			} else {
-				cont = 0;
-			}
+			cont = 0;
 			break;
 		case SA_DISPATCH_ALL:
-			if (ignore_dispatch) {
-				ignore_dispatch = 0;
-			}
 			break;
 		case SA_DISPATCH_BLOCKING:
 			break;
@@ -760,10 +755,9 @@

 	goto dispatch_put;

-dispatch_unlock:
-	pthread_mutex_unlock(&evti->ei_dispatch_mutex);
 dispatch_put:
 	saHandleInstancePut(&evt_instance_handle_db, evtHandle);
+dispatch_exit:
 	return error;
 }
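Two of the checks this patch changes, the dispatchFlags validation and the choice of timeout, are small enough to isolate. The sketch below is an illustration only: the enum is a hypothetical stand-in for SaDispatchFlagsT with assumed numeric values, and -1 as "wait without a timeout" is assumed to follow the poll(2) convention rather than taken from the library source.

```c
/* Hypothetical stand-ins for the three SA Forum dispatch flags. */
enum flagsketch {
	FLAGSKETCH_ONE = 1,
	FLAGSKETCH_ALL = 2,
	FLAGSKETCH_BLOCKING = 3
};

/*
 * Accept only the three named values, compared one by one as in the
 * patch, instead of relying on them forming a contiguous range.
 */
static int flagsketch_valid(int flags)
{
	return flags == FLAGSKETCH_ONE ||
		flags == FLAGSKETCH_ALL ||
		flags == FLAGSKETCH_BLOCKING;
}

/*
 * Only the "drain everything" mode returns at once when nothing is
 * queued (timeout 0). ONE and BLOCKING both wait for a message
 * (timeout -1, assumed poll(2)-style meaning).
 */
static int flagsketch_timeout(int flags)
{
	return (flags == FLAGSKETCH_ALL) ? 0 : -1;
}
```

Under the old condition a DISPATCH_ONE call also got timeout 0, so it would poll once and then spin through the loop instead of waiting for a message to arrive.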
Re: [Openais] detecting cpg joiners
On Thu, Apr 9, 2009 at 19:15, David Teigland wrote: > On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote: >> For added fun, a node that restarts quickly enough (think a VM) won't >> even appear to have left (or rejoined) the cluster. >> At the next totem confchg event, It will simply just be there again >> with no indication that anything happened. >> >> At least this is true for the raw corosync/openais membership data, >> perhaps CPG can infer this some other way. > > Cpg should not let a node go away and come back without notice. In practice > I'd expect back to back confchg's: one showing it leave and another showing it > join. If you mean the raw confchg's that lcrsos see, then nope. Try this, set token: to longer than your node takes to reboot and reboot a node. For physical nodes this isn't a realistic scenario, but VMs can easily boot in 10 seconds or so. > As Chrissie mentioned earlier, cpg shouldn't show the same node both > leaving and joining in a single confchg. In theory I think it would be > legitimate. > > Consider a couple examples. > m: member list, j: joined list, l: left list > > 1. nodes A and B join at once > A gets confchg: m=A,B j=A,B l= > B gets confchg: m=A,B j=A,B l= > > 2. node C joins > A gets confchg: m=A,B,C j=C l= > B gets confchg: m=A,B,C j=C l= > C gets confchg: m=A,B,C j=C l= > > 3. node C leaves and quickly rejoins in a single confchg > A gets confchg: m=A,B,C j=C l=C > B gets confchg: m=A,B,C j=C l=C > C gets confchg: m=A,B,C j=C l=C > > 4. node D joins and quickly leaves (or fails) in a single confchg > A gets confchg: m=A,B,C j=D l=D > B gets confchg: m=A,B,C j=D l=D > C gets confchg: m=A,B,C j=D l=D > D gets confchg: m=A,B,C j=D l=D ?* > > * if D does a quick join+leave it may expect to see this confchg showing it in > the joined list, the left list, and not in the member list. > > Again, the examples in 3 and 4 are, I think, legitimate in theory. In > practice it sounds like they won't occur. 
> > If a quick leave+join is guaranteed to be visible through cpg, then it must be > possible to observe at the lower level from raw corosync data. > > Dave > > ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] detecting cpg joiners
On Thu, Apr 9, 2009 at 20:49, Joel Becker wrote: > On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote: >> For added fun, a node that restarts quickly enough (think a VM) won't >> even appear to have left (or rejoined) the cluster. >> At the next totem confchg event, It will simply just be there again >> with no indication that anything happened. > > This had BETTER not happen. It does, I've seen it enough times that Pacemaker has code to deal with it. > If it does, we can't recover the > dead+restarted node, and our filesystems are going to corrupt all the > time. > > Joel > > -- > > "If you are ever in doubt as to whether or not to kiss a pretty girl, > give her the benefit of the doubt" > -Thomas Carlyle > > Joel Becker > Principal Software Developer > Oracle > E-mail: joel.bec...@oracle.com > Phone: (650) 506-8127 > ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] howto distribute data accross all nodes?
> > Ah, that probably works. But can lead to very high memory usage if
> > traffic is high.
>
> If that's a problem you could block normal activity during the sync
> period.

Wow. That 'virtual synchrony' sounds nice at first, but gets incredibly
complex quickly ;-)

> > Is somebody really using that? If so, is there some code available
> > (for safe/replay)?
>
> There is no general purpose code. dlm_controld is an example of a
> program doing something like this, http://git.fedorahosted.org/git/dlm.git
>
> It uses cpg to replicate state of posix locks, uses checkpoints to sync
> existing lock state to new nodes, and saves messages on a new node
> until it has completed syncing (i.e. reading pre-existing state from the
> checkpoint.)

Thanks for that link,

Dietmar
Re: [Openais] [PATCH]: openais/trunk: Fix handle leak in saMsg service
good for merge regards -steve On Thu, 2009-04-09 at 10:17 -0500, Ryan O'Hara wrote: > This is exactly the same fix as I found in the checkpoint service and > Steve fixed. When dispatch_avail == -1, we must call > saHandleInstancePut. > > ___ > Openais mailing list > Openais@lists.linux-foundation.org > https://lists.linux-foundation.org/mailman/listinfo/openais ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] [PATCH]: openais/trunk: Fix saAmf dispatch code
good for merge On Thu, 2009-04-09 at 11:23 -0500, Ryan O'Hara wrote: > This patch fixed what I believe to be a few problems with the saAmf > service's dispatch code. The few things I changed in this patch seem > to be fallout from the ipc changes that went in a while back. > > First, we only need to hold a lock on the dispatch_mutex while calling > coroipcc_dispatch_recv. This removes the need to call > pthread_mutex_unlock in a few other places, and allows us to remove > error_unlock. > > Also put finalize == 1 check under dispatch_avail == -1. > > ___ > Openais mailing list > Openais@lists.linux-foundation.org > https://lists.linux-foundation.org/mailman/listinfo/openais ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] howto distribute data accross all nodes?
On Thu, Apr 09, 2009 at 09:00:08PM +0200, Dietmar Maurer wrote: > > If new, normal read/write messages to the replicated state continue while > > the new node is syncing the pre-existing state, the new node needs to save > > those operations to apply after it's synced. > > Ah, that probably works. But can lead to very high memory usage if traffic > is high. If that's a problem you could block normal activity during the sync period. > Is somebody really using that? If so, is there some code available > (for safe/replay)? There is no general purpose code. dlm_controld is an example of a program doing something like this, http://git.fedorahosted.org/git/dlm.git It uses cpg to replicate state of posix locks, uses checkpoints to sync existing lock state to new nodes, and saves messages on a new node until it has completed syncing (i.e. reading pre-existing state from the checkpoint.) Dave ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] howto distribute data accross all nodes?
> 1. Have an old cpg member (e.g. the one with the lowest nodeid) send
> messages containing the state to the new node after it's joined. These
> "sync messages" are separate from the messages used to read/write the
> replicated state during normal operation.

This is not bulletproof. State can change while the joining node is not yet
100% synced, resulting in endless "sync messages"?

> 2. Have an old cpg member write all the state to a checkpoint (see
> saCkpt) when a node joins, it sends a message to the new node when it's
> done writing indicating that the checkpoint is ready, and the new node
> then reads the state from the checkpoint.

Same problem as above.

> There are probably other ways of doing this as well.
>
> If new, normal read/write messages to the replicated state continue
> while the new node is syncing the pre-existing state, the new node needs
> to save those operations to apply after it's synced.

Ah, that probably works. But it can lead to very high memory usage if
traffic is high.

Is somebody really using that? If so, is there some code available
(for save/replay)?

- Dietmar
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 08:37:00AM +0100, Chrissie Caulfield wrote: > 1) If member_count == join count, then it's a safe bet that they are all > new nodes, and yes , it is true that all nodes should see the same > confchg messages > > 2) if join_count > 0 then leave_count will always be zero. That's a > consequence of how CPG sends its messages really, join and leave > messages are always separate. Don't rely on this behaviour though! > Although I can't see any reason to change it, I'd rather not have it > burned into the defacto specification. I agree we shouldn't rely on this. I'm just more concerned that if there is member_count==join_count and leave_count>0, we can rely on members == joiners, and thus treat it as a newly created group (all members are in the "just joined" state). Joel -- "War doesn't determine who's right; war determines who's left." Joel Becker Principal Software Developer Oracle E-mail: joel.bec...@oracle.com Phone: (650) 506-8127 ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
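The "members == joiners" test Joel describes can be written so that it compares the lists themselves rather than trusting the counts alone. This is a sketch under assumptions: `struct sketch_addr` is a hypothetical stand-in for a cpg member address (nodeid plus pid), and `sketch_group_is_new` is not a cpg library function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpg member address. */
struct sketch_addr {
	uint32_t nodeid;
	uint32_t pid;
};

/* Returns 1 if a matches an entry of list[0..n) on nodeid and pid. */
static int sketch_addr_in(const struct sketch_addr *list, size_t n,
	const struct sketch_addr *a)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (list[i].nodeid == a->nodeid && list[i].pid == a->pid) {
			return 1;
		}
	}
	return 0;
}

/*
 * Returns 1 when the group looks newly created: it is non-empty, the
 * counts agree, and every current member also appears in the joined
 * list. The left list is deliberately ignored, as in Joel's message.
 */
static int sketch_group_is_new(
	const struct sketch_addr *members, size_t member_count,
	const struct sketch_addr *joined, size_t join_count)
{
	size_t i;

	if (member_count == 0 || member_count != join_count) {
		return 0;
	}
	for (i = 0; i < member_count; i++) {
		if (!sketch_addr_in(joined, join_count, &members[i])) {
			return 0;
		}
	}
	return 1;
}
```

A confchg callback could call this with its member and joined lists and treat a result of 1 as "everyone is in the just-joined state". Whether that inference is safe still depends on the guarantees discussed in this thread.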
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote: > For added fun, a node that restarts quickly enough (think a VM) won't > even appear to have left (or rejoined) the cluster. > At the next totem confchg event, It will simply just be there again > with no indication that anything happened. This had BETTER not happen. If it does, we can't recover the dead+restarted node, and our filesystems are going to corrupt all the time. Joel -- "If you are ever in doubt as to whether or not to kiss a pretty girl, give her the benefit of the doubt" -Thomas Carlyle Joel Becker Principal Software Developer Oracle E-mail: joel.bec...@oracle.com Phone: (650) 506-8127 ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
[Openais] API change vestiges
Here are a few more. >From 052f43f2c3ec1a1a7d6a2e9038ee1fb0e7d222e9 Mon Sep 17 00:00:00 2001 From: Jim Meyering Date: Thu, 2 Apr 2009 22:18:51 +0200 Subject: [PATCH 1/2] cpg.h, objdb.h, coroaph.h: more const/size_t * include/corosync/cpg.h (cpg_callbacks_t): * include/corosync/mar_cpg.h (marshall_to_mar_cpg_name_t): * lib/cpg.c (cpg_join, cpg_leave): * lib/cpg.c (cpg_mcast_joined): make iovec const. * include/corosync/cpg.h (cpg_mcast_joined): update prototype ... --- exec/objdb.c |4 ++-- include/corosync/cpg.h|8 include/corosync/engine/coroapi.h |5 +++-- include/corosync/engine/objdb.h |4 ++-- lib/cpg.c |8 5 files changed, 15 insertions(+), 14 deletions(-) diff --git a/exec/objdb.c b/exec/objdb.c index 609fe5d..d04f959 100644 --- a/exec/objdb.c +++ b/exec/objdb.c @@ -673,7 +673,7 @@ static int object_destroy ( static int object_valid_set ( hdb_handle_t object_handle, struct object_valid *object_valid_list, - unsigned int object_valid_list_entries) + size_t object_valid_list_entries) { struct object_instance *instance; unsigned int res; @@ -701,7 +701,7 @@ error_exit: static int object_key_valid_set ( hdb_handle_t object_handle, struct object_key_valid *object_key_valid_list, - unsigned int object_key_valid_list_entries) + size_t object_key_valid_list_entries) { struct object_instance *instance; unsigned int res; diff --git a/include/corosync/cpg.h b/include/corosync/cpg.h index 6e849d3..8093dfb 100644 --- a/include/corosync/cpg.h +++ b/include/corosync/cpg.h @@ -130,7 +130,7 @@ cs_error_t cpg_fd_get ( cpg_handle_t handle, int *fd); -/* +/* * Get and set contexts for a CPG handle */ cs_error_t cpg_context_get ( @@ -157,14 +157,14 @@ cs_error_t cpg_dispatch ( */ cs_error_t cpg_join ( cpg_handle_t handle, - struct cpg_name *group); + const struct cpg_name *group); /* * Leave one or more groups */ cs_error_t cpg_leave ( cpg_handle_t handle, - struct cpg_name *group); + const struct cpg_name *group); /* * Multicast to groups joined with cpg_join. 
@@ -174,7 +174,7 @@ cs_error_t cpg_leave ( cs_error_t cpg_mcast_joined ( cpg_handle_t handle, cpg_guarantee_t guarantee, - struct iovec *iovec, + const struct iovec *iovec, unsigned int iov_len); /* diff --git a/include/corosync/engine/coroapi.h b/include/corosync/engine/coroapi.h index 6f6c69d..2537557 100644 --- a/include/corosync/engine/coroapi.h +++ b/include/corosync/engine/coroapi.h @@ -244,12 +244,12 @@ struct corosync_api_v1 { int (*object_valid_set) ( hdb_handle_t object_handle, struct object_valid *object_valid_list, - unsigned int object_valid_list_entries); + size_t object_valid_list_entries); int (*object_key_valid_set) ( hdb_handle_t object_handle, struct object_key_valid *object_key_valid_list, - unsigned int object_key_valid_list_entries); + size_t object_key_valid_list_entries); int (*object_find_create) ( hdb_handle_t parent_object_handle, @@ -415,6 +415,7 @@ struct corosync_api_v1 { int (*totem_ring_reenable) (void); + /* FIXME: const iovec? */ int (*totem_mcast) (struct iovec *iovec, unsigned int iov_len, unsigned int guarantee); int (*totem_ifaces_get) ( diff --git a/include/corosync/engine/objdb.h b/include/corosync/engine/objdb.h index 56cd8a7..ce401a9 100644 --- a/include/corosync/engine/objdb.h +++ b/include/corosync/engine/objdb.h @@ -119,12 +119,12 @@ struct objdb_iface_ver0 { int (*object_valid_set) ( hdb_handle_t object_handle, struct object_valid *object_valid_list, - unsigned int object_valid_list_entries); + size_t object_valid_list_entries); int (*object_key_valid_set) ( hdb_handle_t object_handle, struct object_key_valid *object_key_valid_list, - unsigned int object_key_valid_list_entries); + size_t object_key_valid_list_entries); int (*object_find_create) ( hdb_handle_t parent_object_handle, diff --git a/lib/cpg.c b/lib/cpg.c index 74b0e2f..5ac3555 100644 --- a/lib/cpg.c +++ b/lib/cpg.c @@ -1,7 +1,7 @@ /* * vi: set autoindent tabstop=4 shiftwidth=4 : * - * Copyright (c) 2006-2008 Red Hat, Inc. 
+ * Copyright (c) 2006-2009 Red Hat, Inc. * * All rights reserved. * @@ -386,7 +386,7 @@ error_put: cs_error_t cpg_join ( cpg_handle_t handle, -struct cpg_name *group) +const struct cpg_name *group) { cs_error_t error; struct cpg_inst *cpg_inst; @@ -449,7 +449,7 @@ error_exit: cs_error_t cpg_leave ( cpg
Re: [Openais] howto distribute data accross all nodes?
On Thu, Apr 09, 2009 at 05:49:49PM +0200, Dietmar Maurer wrote: > > > > need for locks. An example of why not is creation of a resource > > > called > > > > "datasetA". > > > > > > > > 3 nodes: > > > > node A sends "create datasetA" > > > > node B sends "create datasetA" > > > > node C sends "create datasetA" > > > > > > > > Only one of those nodes create dataset will arrive first. The > > > > remainder > > > > will arrive second and third. Also, vs requires that each node > > sends > > > > in > > > > the same order so it may be something like on all nodes: > > > > B received, C received, A received. > > > > > > > > In this case, B creates the dataset, C says "dataset exists" A > says > > > > "dataset exists". All nodes see this same ordering > > But how does a node gets its initial state?? When a node joins it does > not know the > state of the other nodes, but it receives state change messages from > other nodes. A distributed state machine only works if a node knows the > state before joining the group. > > Is there a 'standard' solution to that problem? There is no standard solution AFAIK. I've done it two different ways: 1. Have an old cpg member (e.g. the one with the lowest nodeid) send messages containing the state to the new node after it's joined. These "sync messages" are separate from the messages used to read/write the replicated state during normal operation. 2. Have an old cpg member write all the state to a checkpoint (see saCkpt) when a node joins, it sends a message to the new node when it's done writing indicating that the checkpoint is ready, and the new node then reads the state from the checkpoint. There are probably other ways of doing this as well. If new, normal read/write messages to the replicated state continue while the new node is syncing the pre-existing state, the new node needs to save those operations to apply after it's synced. 
Dave ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
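The last point in Dave's message, saving normal operations on the new node until the pre-existing state has been read, can be sketched as follows. Everything here is hypothetical: the replicated state is reduced to a running sum, an operation is a number to add, and the snapshot is assumed to reflect exactly the operations delivered before the new node began buffering. The bounded buffer is one way to cap the memory use Dietmar asks about.

```c
#include <stddef.h>

#define SKETCH_PENDING_MAX 64

/* Hypothetical replica of a toy state: a running sum of operations. */
struct sketch_replica {
	int synced;			/* 0 until the snapshot is applied */
	long state;
	long pending[SKETCH_PENDING_MAX];	/* ops seen while syncing */
	size_t pending_count;
};

/*
 * Called for every normal operation, in delivery order. Returns 0 on
 * success, -1 if the node is still syncing and the buffer is full.
 */
static int sketch_deliver(struct sketch_replica *r, long op)
{
	if (r->synced) {
		r->state += op;
		return 0;
	}
	if (r->pending_count == SKETCH_PENDING_MAX) {
		return -1;
	}
	r->pending[r->pending_count++] = op;
	return 0;
}

/*
 * Called once the pre-existing state has been read (from sync messages
 * or a checkpoint). Installs the snapshot, then replays the saved
 * operations in the order they arrived.
 */
static void sketch_sync_complete(struct sketch_replica *r, long snapshot)
{
	size_t i;

	r->state = snapshot;
	for (i = 0; i < r->pending_count; i++) {
		r->state += r->pending[i];
	}
	r->pending_count = 0;
	r->synced = 1;
}
```

When `sketch_deliver` returns -1 the application has to pick a policy. Blocking normal activity during the sync period is one option that would keep the buffer from filling.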
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote:
> For added fun, a node that restarts quickly enough (think a VM) won't
> even appear to have left (or rejoined) the cluster.
> At the next totem confchg event, It will simply just be there again
> with no indication that anything happened.
>
> At least this is true for the raw corosync/openais membership data,
> perhaps CPG can infer this some other way.

Cpg should not let a node go away and come back without notice. In practice
I'd expect back to back confchg's: one showing it leave and another showing it
join. As Chrissie mentioned earlier, cpg shouldn't show the same node both
leaving and joining in a single confchg. In theory I think it would be
legitimate.

Consider a couple examples.
m: member list, j: joined list, l: left list

1. nodes A and B join at once
   A gets confchg: m=A,B j=A,B l=
   B gets confchg: m=A,B j=A,B l=

2. node C joins
   A gets confchg: m=A,B,C j=C l=
   B gets confchg: m=A,B,C j=C l=
   C gets confchg: m=A,B,C j=C l=

3. node C leaves and quickly rejoins in a single confchg
   A gets confchg: m=A,B,C j=C l=C
   B gets confchg: m=A,B,C j=C l=C
   C gets confchg: m=A,B,C j=C l=C

4. node D joins and quickly leaves (or fails) in a single confchg
   A gets confchg: m=A,B,C j=D l=D
   B gets confchg: m=A,B,C j=D l=D
   C gets confchg: m=A,B,C j=D l=D
   D gets confchg: m=A,B,C j=D l=D ?*

* if D does a quick join+leave it may expect to see this confchg showing it in
the joined list, the left list, and not in the member list.

Again, the examples in 3 and 4 are, I think, legitimate in theory. In
practice it sounds like they won't occur.

If a quick leave+join is guaranteed to be visible through cpg, then it must be
possible to observe at the lower level from raw corosync data.

Dave
[Openais] [PATCH]: openais/trunk: Fix saAmf dispatch code
This patch fixes what I believe to be a few problems with the saAmf
service's dispatch code. The few things I changed in this patch seem to
be fallout from the ipc changes that went in a while back.

First, we only need to hold a lock on the dispatch_mutex while calling
coroipcc_dispatch_recv. This removes the need to call
pthread_mutex_unlock in a few other places, and allows us to remove
error_unlock. Also, put the finalize == 1 check under
dispatch_avail == -1.

Index: lib/amf.c
===
--- lib/amf.c (revision 1787)
+++ lib/amf.c (working copy)
@@ -202,38 +202,39 @@
 	error = saHandleInstanceGet (&amfHandleDatabase, amfHandle, (void *)&amfInstance);
 	if (error != SA_AIS_OK) {
-		return (error);
+		goto error_exit;
 	}
 
 	/*
-	 * Timeout instantly for SA_DISPATCH_ALL
+	 * Timeout instantly for SA_DISPATCH_ALL, otherwise don't timeout
+	 * for SA_DISPATCH_BLOCKING or SA_DISPATCH_ONE
 	 */
 	if (dispatchFlags == SA_DISPATCH_ALL) {
 		timeout = 0;
 	}
 
 	do {
+		pthread_mutex_lock (&amfInstance->dispatch_mutex);
+
 		dispatch_avail = coroipcc_dispatch_recv (amfInstance->ipc_ctx, (void *)&dispatch_data, sizeof (dispatch_data), timeout);
 
-		pthread_mutex_lock (&amfInstance->dispatch_mutex);
+		pthread_mutex_unlock (&amfInstance->dispatch_mutex);
 
-		/*
-		 * Handle has been finalized in another thread
-		 */
-		if (amfInstance->finalize == 1) {
-			error = SA_AIS_OK;
-			goto error_unlock;
-		}
-
 		if (dispatch_avail == 0 && dispatchFlags == SA_DISPATCH_ALL) {
-			pthread_mutex_unlock (&amfInstance->dispatch_mutex);
 			break; /* exit do while cont is 1 loop */
 		} else if (dispatch_avail == 0) {
-			pthread_mutex_unlock (&amfInstance->dispatch_mutex);
-			continue; /* next poll */
+			continue;
 		}
+		if (dispatch_avail == -1) {
+			if (amfInstance->finalize == 1) {
+				error = SA_AIS_OK;
+			} else {
+				error = SA_AIS_ERR_LIBRARY;
+			}
+			goto error_put;
+		}
 
 		/*
 		 * Make copy of callbacks, message data, unlock instance, and call callback
@@ -242,7 +243,6 @@
 		 */
 		memcpy (&callbacks, &amfInstance->callbacks, sizeof (SaAmfCallbacksT));
 
-		pthread_mutex_unlock (&amfInstance->dispatch_mutex);
 		/*
 		 * Dispatch incoming response
@@ -355,11 +355,9 @@
 		}
 	} while (cont);
 
-error_unlock:
-	pthread_mutex_unlock (&amfInstance->dispatch_mutex);
 error_put:
 	saHandleInstancePut (&amfHandleDatabase, amfHandle);
-
+error_exit:
 	return (error);
 }
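A minimal sketch of the control flow the loop follows after this patch, written as a pure decision function so it can be read without the locking. All names here (amf_sketch_step, the SKETCH_ and STEP_ constants) are hypothetical stand-ins; the real code uses the SA_DISPATCH_* and SA_AIS_* constants and acts inline rather than returning a step.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for SA_AIS_OK / SA_AIS_ERR_LIBRARY (values are arbitrary). */
enum sketch_error { SKETCH_OK, SKETCH_ERR_LIBRARY };

/* Stand-ins for the SA_DISPATCH_* flags. */
enum sketch_flags { SKETCH_DISPATCH_ONE, SKETCH_DISPATCH_ALL, SKETCH_DISPATCH_BLOCKING };

/* What the dispatch loop does next. */
enum sketch_step {
    STEP_BREAK,     /* nothing pending under DISPATCH_ALL: leave the loop */
    STEP_CONTINUE,  /* nothing pending otherwise: poll again */
    STEP_FAIL,      /* recv failed: set *error and goto error_put */
    STEP_DELIVER    /* a message arrived: copy callbacks and dispatch it */
};

/*
 * Decide the next step from the result of the receive call.  The mutex is
 * assumed to have been released already, as in the patched loop.
 */
enum sketch_step amf_sketch_step(int dispatch_avail, enum sketch_flags flags,
                                 int finalize, enum sketch_error *error)
{
    assert(error != NULL);

    if (dispatch_avail == 0) {
        return (flags == SKETCH_DISPATCH_ALL) ? STEP_BREAK : STEP_CONTINUE;
    }
    if (dispatch_avail == -1) {
        /* A finalized handle is a clean shutdown, anything else an error. */
        *error = (finalize == 1) ? SKETCH_OK : SKETCH_ERR_LIBRARY;
        return STEP_FAIL;
    }
    return STEP_DELIVER;
}
```

The point of the restructuring is that every branch runs with the mutex already released, so no branch needs its own unlock call.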
Re: [Openais] howto distribute data accross all nodes?
> > > need for locks. An example of why not is creation of a resource
> > > called "datasetA".
> > >
> > > 3 nodes:
> > > node A sends "create datasetA"
> > > node B sends "create datasetA"
> > > node C sends "create datasetA"
> > >
> > > Only one of those nodes create dataset will arrive first. The
> > > remainder will arrive second and third. Also, vs requires that each
> > > node sends in the same order so it may be something like on all nodes:
> > > B received, C received, A received.
> > >
> > > In this case, B creates the dataset, C says "dataset exists" A says
> > > "dataset exists". All nodes see this same ordering

But how does a node get its initial state?

When a node joins, it does not know the state of the other nodes, but it
receives state change messages from other nodes. A distributed state
machine only works if a node knows the state before joining the group.

Is there a 'standard' solution to that problem?

- Dietmar
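A minimal sketch of the quoted "create datasetA" argument, assuming each node keeps a local table of dataset names and applies create messages in the agreed delivery order. The types and functions (sketch_replica, sketch_apply_create) are hypothetical; the sketch only illustrates why no lock is needed once ordering is agreed, and it does not address how a late joiner obtains its initial state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_DATASETS 8
#define SKETCH_NAME_LEN 32

/* One node's local copy of the replicated state (hypothetical layout). */
struct sketch_replica {
    char names[SKETCH_MAX_DATASETS][SKETCH_NAME_LEN];
    size_t count;
};

enum sketch_result { SKETCH_CREATED, SKETCH_EXISTS, SKETCH_NO_SPACE };

void sketch_replica_init(struct sketch_replica *r)
{
    assert(r != NULL);
    memset(r, 0, sizeof(*r));
}

/*
 * Apply one delivered "create <name>" message.  Because every node
 * delivers messages in the same order, every node returns the same
 * result for the same message.
 */
enum sketch_result sketch_apply_create(struct sketch_replica *r, const char *name)
{
    size_t i;

    assert(r != NULL && name != NULL);

    for (i = 0; i < r->count; i++) {
        if (strcmp(r->names[i], name) == 0) {
            return SKETCH_EXISTS;
        }
    }
    if (r->count == SKETCH_MAX_DATASETS || strlen(name) >= SKETCH_NAME_LEN) {
        return SKETCH_NO_SPACE;
    }
    strcpy(r->names[r->count], name);
    r->count++;
    return SKETCH_CREATED;
}
```

Delivering the three messages in the order B, C, A gives "created" for B's message and "exists" for the other two on every replica that started from the same empty table.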
[Openais] [PATCH]: openais/trunk: Fix handle leak in saMsg service
This is exactly the same fix as for the handle leak I found in the
checkpoint service, which Steve fixed. When dispatch_avail == -1, we must
call saHandleInstancePut.

Index: lib/msg.c
===
--- lib/msg.c (revision 1786)
+++ lib/msg.c (working copy)
@@ -275,11 +275,9 @@
 			} else {
 				error = SA_AIS_ERR_LIBRARY;
 			}
-			goto error_exit;
+			goto error_put;
 		}
 
-		/* memset (&dispatch_data, 0, sizeof (struct message_overlay)); */
-
 		memcpy (&callbacks, &msgInstance->callbacks, sizeof (msgInstance->callbacks));
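A minimal sketch of why the error path has to go through error_put, assuming a simple reference count stands in for the saHandleInstanceGet / saHandleInstancePut pair. The names sketch_handle, sketch_get, sketch_put, and sketch_dispatch are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a handle database entry. */
struct sketch_handle {
    int refcount;
};

static void sketch_get(struct sketch_handle *h)
{
    h->refcount++;
}

static void sketch_put(struct sketch_handle *h)
{
    assert(h->refcount > 0);
    h->refcount--;
}

/* Returns 0 on success, -1 when the receive call reported an error. */
int sketch_dispatch(struct sketch_handle *h, int dispatch_avail)
{
    int error = 0;

    assert(h != NULL);
    sketch_get(h);

    if (dispatch_avail == -1) {
        error = -1;
        /* Jumping past the put below would leak one reference per failure. */
        goto error_put;
    }

    /* ... deliver the message to the callback here ... */

error_put:
    sketch_put(h);
    return error;
}
```

With the jump target placed before the put, the reference count is back to its starting value after both the success and the failure path.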
Re: [Openais] another big batch of API changes
Jim Meyering wrote:
> Here is a tiny API change, along with the many changes it induces.
>
> 0001 is just something I saw along the way.
> 0002 is the tiny change that induces the adjustments in all of the following.
>
> Here's that tiny diff:
>
> diff --git a/include/corosync/engine/coroapi.h ...
> ...
>  struct corosync_lib_handler {
> -	void (*lib_handler_fn) (void *conn, void *msg);
> +	void (*lib_handler_fn) (void *conn, const void *msg);
>
> The following changes pick up the pieces.
>
> $ wc -l 00* :
>     6 -api
>    38 0001-testevsth.c-const-size_t-evs_deliver_fn-evs_confc.patch
>    29 0002-coroapi.h-change-lib_handler_fn-s-msg-to-be-const.patch
>   585 0003-Propagate-the-above-into-cfg.c-and-votequorum.c.patch
>    65 0004-Propagate-the-above-into-vsf_quorum.c.patch
>   157 0005-propagate-to-evc.c.patch
>    37 0006--services-pload.c-Likewise.patch
>    27 0007-coroipcs.h-update-signature-of-coroipcs_handler_fn_.patch
>   167 0008--services-cpg.c-Likewise.patch
>   304 0009--services-confdb.c-Likewise.patch
>  1415 total

I've committed that as two change sets:
- 0001, since it is independent
- 0002-9, since they're tied together
Re: [Openais] detecting cpg joiners
Robert Wipfel wrote:
>>>> On 4/9/2009 at 5:50 AM, in message
> <26ef5e70904090450s40e92dcfgea0fc34826360...@mail.gmail.com>, Andrew Beekhof
> wrote:
>> On Thu, Apr 9, 2009 at 09:37, Chrissie Caulfield wrote:
>>> Joel Becker wrote:
>>>> Steve, Dave, etc,
>>>> Someone told me a while back that a node joining a cpg group
>>>> would be by its lonesome in the join message. That is, when the node
>>>> gets its first confchg, it will be the only node in the list of joins.
>>>> I've been using this to detect the first joiner of the group ("I joined,
>>>> and the member count is 1").
>>>> Dave's since told me that this assumption is not valid (if it
>>>> ever was). So two or more nodes can join in parallel, and each can see
>>>> more than one node in the list of joins for its first confchg. I'm now
>>>> trying to figure out an algorithm for "first joiner". I have a couple
>>>> of questions:
>>>>
>>>> 1) If I see member_count == join_count, does that mean every member has
>>>> just joined, and all the members are receiving the same join message?
>>>>
>>>> 2) If member_count == join_count, can leave_count be non-zero? If it
>>>> is, am I guaranteed that we're looking at "all old members left, all new
>>>> members joined"?
>>>>
>>>> If these both are true, I can simply isolate a "first joiner" by
>>>> checking member_count == join_count and selecting the lowest node
>>>> number.
>>>
>>> I don't think you can detect a first-joiner using CPG. cman does it by
>>> reading the totem confchg messages. It is quite possible for two nodes
>>> to join at the same time ... during the same SYNC phase so you certainly
>>> can't rely on that.
>>>
>>> 1) If member_count == join_count, then it's a safe bet that they are all
>>> new nodes, and yes, it is true that all nodes should see the same
>>> confchg messages
>>>
>>> 2) if join_count > 0 then leave_count will always be zero. That's a
>>> consequence of how CPG sends its messages really, join and leave
>>> messages are always separate. Don't rely on this behaviour though!
>>> Although I can't see any reason to change it, I'd rather not have it
>>> burned into the de facto specification.
>> For added fun, a node that restarts quickly enough (think a VM) won't
>> even appear to have left (or rejoined) the cluster.
>> At the next totem confchg event, it will simply just be there again
>> with no indication that anything happened.
>>
>> At least this is true for the raw corosync/openais membership data,
>> perhaps CPG can infer this some other way.
>
> When a new node joins the group does it also create the group?
> e.g. http://www.opengroup.org/RI/technologies/cords/gipc.pdf
> has an epoch number with each join/leave message, the group is
> created by whoever joined in epoch 0.

That would work but it would also break the wire-protocol AND the API!

--
Chrissie
Re: [Openais] detecting cpg joiners
>>> On 4/9/2009 at 5:50 AM, in message
<26ef5e70904090450s40e92dcfgea0fc34826360...@mail.gmail.com>, Andrew Beekhof wrote:
> On Thu, Apr 9, 2009 at 09:37, Chrissie Caulfield wrote:
>> Joel Becker wrote:
>>> Steve, Dave, etc,
>>> Someone told me a while back that a node joining a cpg group
>>> would be by its lonesome in the join message. That is, when the node
>>> gets its first confchg, it will be the only node in the list of joins.
>>> I've been using this to detect the first joiner of the group ("I joined,
>>> and the member count is 1").
>>> Dave's since told me that this assumption is not valid (if it
>>> ever was). So two or more nodes can join in parallel, and each can see
>>> more than one node in the list of joins for its first confchg. I'm now
>>> trying to figure out an algorithm for "first joiner". I have a couple
>>> of questions:
>>>
>>> 1) If I see member_count == join_count, does that mean every member has
>>> just joined, and all the members are receiving the same join message?
>>>
>>> 2) If member_count == join_count, can leave_count be non-zero? If it
>>> is, am I guaranteed that we're looking at "all old members left, all new
>>> members joined"?
>>>
>>> If these both are true, I can simply isolate a "first joiner" by
>>> checking member_count == join_count and selecting the lowest node
>>> number.
>>
>>
>> I don't think you can detect a first-joiner using CPG. cman does it by
>> reading the totem confchg messages. It is quite possible for two nodes
>> to join at the same time ... during the same SYNC phase so you certainly
>> can't rely on that.
>>
>> 1) If member_count == join_count, then it's a safe bet that they are all
>> new nodes, and yes, it is true that all nodes should see the same
>> confchg messages
>>
>> 2) if join_count > 0 then leave_count will always be zero. That's a
>> consequence of how CPG sends its messages really, join and leave
>> messages are always separate. Don't rely on this behaviour though!
>> Although I can't see any reason to change it, I'd rather not have it
>> burned into the de facto specification.
>
> For added fun, a node that restarts quickly enough (think a VM) won't
> even appear to have left (or rejoined) the cluster.
> At the next totem confchg event, it will simply just be there again
> with no indication that anything happened.
>
> At least this is true for the raw corosync/openais membership data,
> perhaps CPG can infer this some other way.

When a new node joins the group does it also create the group?
e.g. http://www.opengroup.org/RI/technologies/cords/gipc.pdf
has an epoch number with each join/leave message, the group is
created by whoever joined in epoch 0.

Hth,
Robert
Re: [Openais] detecting cpg joiners
On Thu, Apr 9, 2009 at 09:37, Chrissie Caulfield wrote:
> Joel Becker wrote:
>> Steve, Dave, etc,
>> Someone told me a while back that a node joining a cpg group
>> would be by its lonesome in the join message. That is, when the node
>> gets its first confchg, it will be the only node in the list of joins.
>> I've been using this to detect the first joiner of the group ("I joined,
>> and the member count is 1").
>> Dave's since told me that this assumption is not valid (if it
>> ever was). So two or more nodes can join in parallel, and each can see
>> more than one node in the list of joins for its first confchg. I'm now
>> trying to figure out an algorithm for "first joiner". I have a couple
>> of questions:
>>
>> 1) If I see member_count == join_count, does that mean every member has
>> just joined, and all the members are receiving the same join message?
>>
>> 2) If member_count == join_count, can leave_count be non-zero? If it
>> is, am I guaranteed that we're looking at "all old members left, all new
>> members joined"?
>>
>> If these both are true, I can simply isolate a "first joiner" by
>> checking member_count == join_count and selecting the lowest node
>> number.
>
>
> I don't think you can detect a first-joiner using CPG. cman does it by
> reading the totem confchg messages. It is quite possible for two nodes
> to join at the same time ... during the same SYNC phase so you certainly
> can't rely on that.
>
> 1) If member_count == join_count, then it's a safe bet that they are all
> new nodes, and yes, it is true that all nodes should see the same
> confchg messages
>
> 2) if join_count > 0 then leave_count will always be zero. That's a
> consequence of how CPG sends its messages really, join and leave
> messages are always separate. Don't rely on this behaviour though!
> Although I can't see any reason to change it, I'd rather not have it
> burned into the de facto specification.

For added fun, a node that restarts quickly enough (think a VM) won't
even appear to have left (or rejoined) the cluster.
At the next totem confchg event, it will simply just be there again
with no indication that anything happened.

At least this is true for the raw corosync/openais membership data,
perhaps CPG can infer this some other way.
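A minimal sketch of the heuristic Joel proposes (member_count == join_count, then pick the lowest node number), assuming node ids are plain integers. The function name looks_like_first_joiner is hypothetical and not part of cpg. Given the caveats in this thread (simultaneous joins, a fast restart that is never reported), treat the result as a hint, not a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return 1 if this confchg looks like the creation of the group and
 * self is the lowest node id among the members, otherwise 0.
 */
int looks_like_first_joiner(uint32_t self,
                            const uint32_t *members, size_t member_count,
                            size_t join_count)
{
    size_t i;
    int self_present = 0;

    assert(members != NULL || member_count == 0);

    /* Only a confchg in which every member is also a joiner qualifies. */
    if (member_count == 0 || member_count != join_count) {
        return 0;
    }
    for (i = 0; i < member_count; i++) {
        if (members[i] == self) {
            self_present = 1;
        }
        if (members[i] < self) {
            return 0;   /* some other member has a lower node id */
        }
    }
    return self_present;
}
```

Because all nodes are expected to see the same confchg, every node evaluates the same lists and at most one of them gets 1.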
[Openais] [PATCH 3/9] Propagate the above into cfg.c and votequorum.c.
From: Jim Meyering * services/cfg.c (message_handler_req_lib_cfg_get_node_addrs): Constification exposed a bug in this function whereby it mistakenly modified storage through its now-const *msg parameter. Since it did that solely to store a temporary result, we've changed it to use a local variable instead. * services/votequorum.c (message_handler_req_lib_votequorum_setvotes): Likewise. --- services/cfg.c| 83 ++- services/votequorum.c | 116 +--- 2 files changed, 112 insertions(+), 87 deletions(-) diff --git a/services/cfg.c b/services/cfg.c index d93324a..fdb9046 100644 --- a/services/cfg.c +++ b/services/cfg.c @@ -123,55 +123,55 @@ static void exec_cfg_killnode_endian_convert (void *msg); static void message_handler_req_lib_cfg_ringstatusget ( void *conn, - void *msg); + const void *msg); static void message_handler_req_lib_cfg_ringreenable ( void *conn, - void *msg); + const void *msg); static void message_handler_req_lib_cfg_statetrack ( void *conn, - void *msg); + const void *msg); static void message_handler_req_lib_cfg_statetrackstop ( void *conn, - void *msg); + const void *msg); static void message_handler_req_lib_cfg_administrativestateset ( void *conn, - void *msg); + const void *msg); static void message_handler_req_lib_cfg_administrativestateget ( void *conn, - void *msg); + const void *msg); static void message_handler_req_lib_cfg_serviceload ( void *conn, - void *msg); + const void *msg); static void message_handler_req_lib_cfg_serviceunload ( void *conn, - void *msg); + const void *msg); static void message_handler_req_lib_cfg_killnode ( void *conn, - void *msg); + const void *msg); static void message_handler_req_lib_cfg_tryshutdown ( void *conn, - void *msg); + const void *msg); static void message_handler_req_lib_cfg_replytoshutdown ( void *conn, - void *msg); + const void *msg); static void message_handler_req_lib_cfg_get_node_addrs ( void *conn, - void *msg); + const void *msg); static void message_handler_req_lib_cfg_local_get ( void *conn, - 
void *msg); + const void *msg); /* * Service Handler Definition @@ -635,13 +635,13 @@ static void message_handler_req_exec_cfg_shutdown ( */ static void message_handler_req_lib_cfg_ringstatusget ( void *conn, - void *msg) + const void *msg) { struct res_lib_cfg_ringstatusget res_lib_cfg_ringstatusget; struct totem_ip_address interfaces[INTERFACE_MAX]; unsigned int iface_count; char **status; - char *totem_ip_string; + const char *totem_ip_string; unsigned int i; ENTER(); @@ -659,7 +659,8 @@ static void message_handler_req_lib_cfg_ringstatusget ( res_lib_cfg_ringstatusget.interface_count = iface_count; for (i = 0; i < iface_count; i++) { - totem_ip_string = (char *)api->totem_ip_print (&interfaces[i]); + totem_ip_string + = (const char *)api->totem_ip_print (&interfaces[i]); strcpy ((char *)&res_lib_cfg_ringstatusget.interface_status[i], status[i]); strcpy ((char *)&res_lib_cfg_ringstatusget.interface_name[i], @@ -675,7 +676,7 @@ static void message_handler_req_lib_cfg_ringstatusget ( static void message_handler_req_lib_cfg_ringreenable ( void *conn, - void *msg) + const void *msg) { struct req_exec_cfg_ringreenable req_exec_cfg_ringreenable; struct iovec iovec; @@ -697,7 +698,7 @@ static void message_handler_req_lib_cfg_ringreenable ( static void message_handler_req_lib_cfg_statetrack ( void *conn, - void *msg) + const void *msg) { struct cfg_info *ci = (struct cfg_info *)api->ipc_private_data_get (conn); struct res_lib_cfg_statetrack res_lib_cfg_statetrack; @@ -733,7 +734,7 @@ static void message_handler_req_lib_cfg_statetrack ( static void message_handler_req_lib_cfg_statetrackstop ( void *conn, - void *msg) + const void *msg) { struct cfg_info *ci = (struct cfg_info *)api->ipc_private_data_get (conn); // struct req_lib_cfg_statetrackstop *req_lib_cfg_statetrackstop = (struct req_lib_cfg_statetrackstop *)message; @@ -745,7 +746,7 @@ static void message_handler_req_lib_cfg_statetrackstop ( static void message_handler_req_lib_cfg_administrativestateset ( void 
*conn, - void *msg) + const void *msg) { // struct req_lib_cfg_administrativestateset *req_lib_cfg_administrativestateset = (struct req_lib_cfg_administrativestateset *)message; @@ -754,7 +755,7 @@ static void message_handler_req_lib_cfg_administrativestateset ( } stati
[Openais] [PATCH 5/9] propagate to evc.c
From: Jim Meyering * services/evs.c: add const to msg param --- services/evs.c | 46 +++--- 1 files changed, 23 insertions(+), 23 deletions(-) diff --git a/services/evs.c b/services/evs.c index 389af98..24fff4d 100644 --- a/services/evs.c +++ b/services/evs.c @@ -7,7 +7,7 @@ * Author: Steven Dake (sd...@redhat.com) * * This software licensed under BSD license, the text of which follows: - * + * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: * @@ -83,11 +83,11 @@ static void message_handler_req_exec_mcast (const void *msg, unsigned int nodeid static void req_exec_mcast_endian_convert (void *msg); -static void message_handler_req_evs_join (void *conn, void *msg); -static void message_handler_req_evs_leave (void *conn, void *msg); -static void message_handler_req_evs_mcast_joined (void *conn, void *msg); -static void message_handler_req_evs_mcast_groups (void *conn, void *msg); -static void message_handler_req_evs_membership_get (void *conn, void *msg); +static void message_handler_req_evs_join (void *conn, const void *msg); +static void message_handler_req_evs_leave (void *conn, const void *msg); +static void message_handler_req_evs_mcast_joined (void *conn, const void *msg); +static void message_handler_req_evs_mcast_groups (void *conn, const void *msg); +static void message_handler_req_evs_membership_get (void *conn, const void *msg); static int evs_lib_init_fn (void *conn); static int evs_lib_exit_fn (void *conn); @@ -98,7 +98,7 @@ struct evs_pd { struct list_head list; void *conn; }; - + static struct corosync_api_v1 *api; static struct corosync_lib_handler evs_lib_engine[] = @@ -147,7 +147,7 @@ struct corosync_service_engine evs_service_engine = { .name = "corosync extended virtual synchrony service", .id = EVS_SERVICE, .private_data_size = sizeof (struct evs_pd), - .flow_control = CS_LIB_FLOW_CONTROL_REQUIRED, + .flow_control = CS_LIB_FLOW_CONTROL_REQUIRED, 
.lib_init_fn= evs_lib_init_fn, .lib_exit_fn= evs_lib_exit_fn, .lib_engine = evs_lib_engine, @@ -277,10 +277,10 @@ static int evs_lib_exit_fn (void *conn) return (0); } -static void message_handler_req_evs_join (void *conn, void *msg) +static void message_handler_req_evs_join (void *conn, const void *msg) { cs_error_t error = CS_OK; - struct req_lib_evs_join *req_lib_evs_join = (struct req_lib_evs_join *)msg; + const struct req_lib_evs_join *req_lib_evs_join = msg; struct res_lib_evs_join res_lib_evs_join; void *addr; struct evs_pd *evs_pd = (struct evs_pd *)api->ipc_private_data_get (conn); @@ -290,7 +290,7 @@ static void message_handler_req_evs_join (void *conn, void *msg) goto exit_error; } - addr = realloc (evs_pd->groups, sizeof (struct evs_group) * + addr = realloc (evs_pd->groups, sizeof (struct evs_group) * (evs_pd->group_entries + req_lib_evs_join->group_entries)); if (addr == NULL) { error = CS_ERR_NO_MEMORY; @@ -313,9 +313,9 @@ exit_error: sizeof (struct res_lib_evs_join)); } -static void message_handler_req_evs_leave (void *conn, void *msg) +static void message_handler_req_evs_leave (void *conn, const void *msg) { - struct req_lib_evs_leave *req_lib_evs_leave = (struct req_lib_evs_leave *)msg; + const struct req_lib_evs_leave *req_lib_evs_leave = msg; struct res_lib_evs_leave res_lib_evs_leave; cs_error_t error = CS_OK; int error_index; @@ -359,10 +359,10 @@ static void message_handler_req_evs_leave (void *conn, void *msg) sizeof (struct res_lib_evs_leave)); } -static void message_handler_req_evs_mcast_joined (void *conn, void *msg) +static void message_handler_req_evs_mcast_joined (void *conn, const void *msg) { cs_error_t error = CS_ERR_TRY_AGAIN; - struct req_lib_evs_mcast_joined *req_lib_evs_mcast_joined = (struct req_lib_evs_mcast_joined *)msg; + const struct req_lib_evs_mcast_joined *req_lib_evs_mcast_joined = msg; struct res_lib_evs_mcast_joined res_lib_evs_mcast_joined; struct iovec req_exec_evs_mcast_iovec[3]; struct req_exec_evs_mcast 
req_exec_evs_mcast; @@ -399,14 +399,14 @@ static void message_handler_req_evs_mcast_joined (void *conn, void *msg) sizeof (struct res_lib_evs_mcast_joined)); } -static void message_handler_req_evs_mcast_groups (void *conn, void *msg) +static void message_handler_req_evs_mcast_groups (void *conn, const void *msg) { cs_error_t error = CS_ERR_TRY_AGAIN; - struct req_lib_evs_mcast_groups *req_lib_evs_mcast_groups = (struct req_lib_evs_mcast_groups *)msg; + const
[Openais] [PATCH 8/9] * services/cpg.c: Likewise.
From: Jim Meyering --- services/cpg.c | 56 +--- 1 files changed, 33 insertions(+), 23 deletions(-) diff --git a/services/cpg.c b/services/cpg.c index 7bae0fd..f60427a 100644 --- a/services/cpg.c +++ b/services/cpg.c @@ -158,21 +158,26 @@ static void exec_cpg_mcast_endian_convert (void *msg); static void exec_cpg_downlist_endian_convert (void *msg); -static void message_handler_req_lib_cpg_join (void *conn, void *message); +static void message_handler_req_lib_cpg_join (void *conn, const void *message); -static void message_handler_req_lib_cpg_leave (void *conn, void *message); +static void message_handler_req_lib_cpg_leave (void *conn, const void *message); -static void message_handler_req_lib_cpg_mcast (void *conn, void *message); +static void message_handler_req_lib_cpg_mcast (void *conn, const void *message); -static void message_handler_req_lib_cpg_membership (void *conn, void *message); +static void message_handler_req_lib_cpg_membership (void *conn, + const void *message); -static void message_handler_req_lib_cpg_trackstart (void *conn, void *message); +static void message_handler_req_lib_cpg_trackstart (void *conn, + const void *message); -static void message_handler_req_lib_cpg_trackstop (void *conn, void *message); +static void message_handler_req_lib_cpg_trackstop (void *conn, + const void *message); -static void message_handler_req_lib_cpg_local_get (void *conn, void *message); +static void message_handler_req_lib_cpg_local_get (void *conn, + const void *message); -static void message_handler_req_lib_cpg_groups_get (void *conn, void *message); +static void message_handler_req_lib_cpg_groups_get (void *conn, + const void *message); static int cpg_node_joinleave_send (struct group_info *gi, struct process_info *pi, int fn, int reason); @@ -681,7 +686,7 @@ static void cpg_confchg_fn ( /* Can byteswap join & leave messages */ static void exec_cpg_procjoin_endian_convert (void *msg) { - struct req_exec_cpg_procjoin *req_exec_cpg_procjoin = (struct 
req_exec_cpg_procjoin *)msg; + struct req_exec_cpg_procjoin *req_exec_cpg_procjoin = msg; req_exec_cpg_procjoin->pid = swab32(req_exec_cpg_procjoin->pid); swab_mar_cpg_name_t (&req_exec_cpg_procjoin->group_name); @@ -705,7 +710,7 @@ static void exec_cpg_joinlist_endian_convert (void *msg_v) static void exec_cpg_downlist_endian_convert (void *msg) { - struct req_exec_cpg_downlist *req_exec_cpg_downlist = (struct req_exec_cpg_downlist *)msg; + struct req_exec_cpg_downlist *req_exec_cpg_downlist = msg; unsigned int i; req_exec_cpg_downlist->left_nodes = swab32(req_exec_cpg_downlist->left_nodes); @@ -718,7 +723,7 @@ static void exec_cpg_downlist_endian_convert (void *msg) static void exec_cpg_mcast_endian_convert (void *msg) { - struct req_exec_cpg_mcast *req_exec_cpg_mcast = (struct req_exec_cpg_mcast *)msg; + struct req_exec_cpg_mcast *req_exec_cpg_mcast = msg; swab_mar_req_header_t (&req_exec_cpg_mcast->header); swab_mar_cpg_name_t (&req_exec_cpg_mcast->group_name); @@ -1016,9 +1021,9 @@ static int cpg_lib_init_fn (void *conn) } /* Join message from the library */ -static void message_handler_req_lib_cpg_join (void *conn, void *message) +static void message_handler_req_lib_cpg_join (void *conn, const void *message) { - struct req_lib_cpg_join *req_lib_cpg_join = (struct req_lib_cpg_join *)message; + const struct req_lib_cpg_join *req_lib_cpg_join = message; struct process_info *pi = (struct process_info *)api->ipc_private_data_get (conn); struct res_lib_cpg_join res_lib_cpg_join; struct group_info *gi; @@ -1055,7 +1060,7 @@ join_err: } /* Leave message from the library */ -static void message_handler_req_lib_cpg_leave (void *conn, void *message) +static void message_handler_req_lib_cpg_leave (void *conn, const void *message) { struct process_info *pi = (struct process_info *)api->ipc_private_data_get (conn); struct res_lib_cpg_leave res_lib_cpg_leave; @@ -1084,9 +1089,9 @@ leave_ret: } /* Mcast message from the library */ -static void 
message_handler_req_lib_cpg_mcast (void *conn, void *message) +static void message_handler_req_lib_cpg_mcast (void *conn, const void *message) { - struct req_lib_cpg_mcast *req_lib_cpg_mcast = (struct req_lib_cpg_mcast *)message; + const struct req_lib_cpg_mcast *req_lib_cpg_mcast = message; struct process_info *pi = (struct process_info *)api->ipc_private_data_get (conn); struct group_info *gi = pi->group; struct iovec req_exec_cpg_iovec[2]; @@ -1132,7 +1137,8 @@ static void message_handler_req_lib_cpg_mcast (
[Openais] [PATCH 7/9] coroipcs.h: update signature of coroipcs_handler_fn_lvalue to match
From: Jim Meyering

* exec/coroipcs.h: signature of coroipcs_handler_fn_lvalue must match
that of lib_handler_fn; noted via main.c.
---
 exec/coroipcs.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/exec/coroipcs.h b/exec/coroipcs.h
index 60b63d2..283b2c1 100644
--- a/exec/coroipcs.h
+++ b/exec/coroipcs.h
@@ -43,7 +43,7 @@ struct iovec;
 
 typedef int (*coroipcs_init_fn_lvalue) (void *conn);
 typedef int (*coroipcs_exit_fn_lvalue) (void *conn);
-typedef void (*coroipcs_handler_fn_lvalue) (void *conn, void *msg);
+typedef void (*coroipcs_handler_fn_lvalue) (void *conn, const void *msg);
 
 struct coroipcs_init_state {
 	const char *socket_name;
-- 
1.6.2.rc1.285.gc5f54
[Openais] [PATCH 1/9] testevsth.c: const+size_t: evs_deliver_fn, evs_confchg_fn
From: Jim Meyering

---
 test/testevsth.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/test/testevsth.c b/test/testevsth.c
index 1d74f7d..9a1635d 100644
--- a/test/testevsth.c
+++ b/test/testevsth.c
@@ -47,7 +47,7 @@ char *delivery_string;
 #define CALLBACKS 20
 int callback_count = 0;
 
-void evs_deliver_fn (struct in_addr source_addr, void *msg, int msg_len)
+void evs_deliver_fn (struct in_addr source_addr, const void *msg, size_t msg_len)
 {
 #ifdef PRINT_OUTPUT
 	char *buf;
@@ -62,9 +62,9 @@ void evs_deliver_fn (struct in_addr source_addr, void *msg, int msg_len)
 }
 
 void evs_confchg_fn (
-	struct in_addr *member_list, int member_list_entries,
-	struct in_addr *left_list, int left_list_entries,
-	struct in_addr *joined_list, int joined_list_entries)
+	const struct in_addr *member_list, size_t member_list_entries,
+	const struct in_addr *left_list, size_t left_list_entries,
+	const struct in_addr *joined_list, size_t joined_list_entries)
 {
 	int i;
 
-- 
1.6.2.rc1.285.gc5f54
[Openais] [PATCH 9/9] * services/confdb.c: Likewise.
From: Jim Meyering --- services/confdb.c | 148 ++--- 1 files changed, 95 insertions(+), 53 deletions(-) diff --git a/services/confdb.c b/services/confdb.c index ad19a20..6ae5fdb 100644 --- a/services/confdb.c +++ b/services/confdb.c @@ -62,28 +62,45 @@ static int confdb_exec_init_fn ( static int confdb_lib_init_fn (void *conn); static int confdb_lib_exit_fn (void *conn); -static void message_handler_req_lib_confdb_object_create (void *conn, void *message); -static void message_handler_req_lib_confdb_object_destroy (void *conn, void *message); -static void message_handler_req_lib_confdb_object_find_destroy (void *conn, void *message); - -static void message_handler_req_lib_confdb_key_create (void *conn, void *message); -static void message_handler_req_lib_confdb_key_get (void *conn, void *message); -static void message_handler_req_lib_confdb_key_replace (void *conn, void *message); -static void message_handler_req_lib_confdb_key_delete (void *conn, void *message); -static void message_handler_req_lib_confdb_key_iter (void *conn, void *message); - -static void message_handler_req_lib_confdb_key_increment (void *conn, void *message); -static void message_handler_req_lib_confdb_key_decrement (void *conn, void *message); - -static void message_handler_req_lib_confdb_object_iter (void *conn, void *message); -static void message_handler_req_lib_confdb_object_find (void *conn, void *message); - -static void message_handler_req_lib_confdb_object_parent_get (void *conn, void *message); -static void message_handler_req_lib_confdb_write (void *conn, void *message); -static void message_handler_req_lib_confdb_reload (void *conn, void *message); - -static void message_handler_req_lib_confdb_track_start (void *conn, void *message); -static void message_handler_req_lib_confdb_track_stop (void *conn, void *message); +static void message_handler_req_lib_confdb_object_create (void *conn, + const void *message); +static void message_handler_req_lib_confdb_object_destroy (void *conn, + 
const void *message); +static void message_handler_req_lib_confdb_object_find_destroy (void *conn, + const void *message); + +static void message_handler_req_lib_confdb_key_create (void *conn, + const void *message); +static void message_handler_req_lib_confdb_key_get (void *conn, + const void *message); +static void message_handler_req_lib_confdb_key_replace (void *conn, + const void *message); +static void message_handler_req_lib_confdb_key_delete (void *conn, + const void *message); +static void message_handler_req_lib_confdb_key_iter (void *conn, +const void *message); + +static void message_handler_req_lib_confdb_key_increment (void *conn, + const void *message); +static void message_handler_req_lib_confdb_key_decrement (void *conn, + const void *message); + +static void message_handler_req_lib_confdb_object_iter (void *conn, + const void *message); +static void message_handler_req_lib_confdb_object_find (void *conn, + const void *message); + +static void message_handler_req_lib_confdb_object_parent_get (void *conn, + const void *message); +static void message_handler_req_lib_confdb_write (void *conn, + const void *message); +static void message_handler_req_lib_confdb_reload (void *conn, + const void *message); + +static void message_handler_req_lib_confdb_track_start (void *conn, + const void *message); +static void message_handler_req_lib_confdb_track_stop (void *conn, + const void *message); static void confdb_notify_lib_of_key_change( object_change_type_t change_type, @@ -293,9 +310,11 @@ static int confdb_lib_exit_fn (void *conn) return (0); } -static void message_handler_req_lib_confdb_object_create (void *conn, void *message) +static void message_handler_req_lib_confdb_object_create (void *conn, + const void *message) { - struct req_lib_confdb_object_create *req_lib_confdb_object_create = (struct req_lib_confdb_object_create *)message; + const struct req_lib_confdb_obje
[Openais] [PATCH 6/9] * services/pload.c: Likewise
From: Jim Meyering

---
 services/pload.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/services/pload.c b/services/pload.c
index 2dbe974..424abe6 100644
--- a/services/pload.c
+++ b/services/pload.c
@@ -91,7 +91,7 @@ static void req_exec_pload_start_endian_convert (void *msg);
 
 static void req_exec_pload_mcast_endian_convert (void *msg);
 
-static void message_handler_req_pload_start (void *conn, void *msg);
+static void message_handler_req_pload_start (void *conn, const void *msg);
 
 static int pload_lib_init_fn (void *conn);
 
@@ -232,9 +232,9 @@ static int pload_lib_exit_fn (void *conn)
 	return (0);
 }
 
-static void message_handler_req_pload_start (void *conn, void *msg)
+static void message_handler_req_pload_start (void *conn, const void *msg)
 {
-	struct req_lib_pload_start *req_lib_pload_start = (struct req_lib_pload_start *)msg;
+	const struct req_lib_pload_start *req_lib_pload_start = msg;
 	struct req_exec_pload_start req_exec_pload_start;
 	struct iovec iov;
 
-- 
1.6.2.rc1.285.gc5f54
[Openais] [PATCH 2/9] coroapi.h: change lib_handler_fn's *msg to be const
From: Jim Meyering

Make a tiny type change and watch it propagate.

* include/corosync/engine/coroapi.h (struct corosync_lib_handler)
[lib_handler_fn]: Change type of 2nd parameter:
s/void *msg/const void *msg/.
---
 include/corosync/engine/coroapi.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/corosync/engine/coroapi.h b/include/corosync/engine/coroapi.h
index d143ee1..6f6c69d 100644
--- a/include/corosync/engine/coroapi.h
+++ b/include/corosync/engine/coroapi.h
@@ -557,7 +557,7 @@ struct corosync_api_v1 {
 #define SERVICE_HANDLER_MAXIMUM_COUNT 64
 
 struct corosync_lib_handler {
-	void (*lib_handler_fn) (void *conn, void *msg);
+	void (*lib_handler_fn) (void *conn, const void *msg);
 	int response_size;
 	int response_id;
 	enum cs_lib_flow_control flow_control;
-- 
1.6.2.rc1.285.gc5f54
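For readers following the series, here is a minimal sketch of what the type change buys a handler. It is not code from the patch set: struct req_example, struct example_lib_handler, message_handler_req_example and dispatch_example are hypothetical names made up for illustration. Only the shape of the lib_handler_fn member is taken from the diff above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request layout, standing in for a real req_lib_* struct. */
struct req_example {
	int id;
	int value;
};

/* Hypothetical table entry; the member has the patched signature. */
struct example_lib_handler {
	void (*lib_handler_fn) (void *conn, const void *msg);
};

/* Records what the handler read, so the effect is observable. */
static int last_value_seen;

static void message_handler_req_example (void *conn, const void *msg)
{
	/* const void * converts implicitly to a const struct pointer,
	 * so the old explicit cast is no longer needed. */
	const struct req_example *req = msg;

	(void)conn;
	assert (req != NULL);
	last_value_seen = req->value;
	/* A write such as req->value = 0; would now fail to compile. */
}

static const struct example_lib_handler example_handlers[] = {
	{ .lib_handler_fn = message_handler_req_example },
};

/* Dispatch through the table and report what the handler saw. */
static int dispatch_example (const void *msg)
{
	example_handlers[0].lib_handler_fn (NULL, msg);
	return last_value_seen;
}
```

The handler can still read every field of the request, but an accidental write through the pointer is caught at compile time, which is why each service only needs the mechanical "add const, drop the cast" edit shown in the patches.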
[Openais] another big batch of API changes
Here is a tiny API change, along with the many changes it induces.
0001 is just something I saw along the way.
0002 is the tiny change that induces the adjustments in all of the following.
Here's that tiny diff:

diff --git a/include/corosync/engine/coroapi.h ...
...
 struct corosync_lib_handler {
-	void (*lib_handler_fn) (void *conn, void *msg);
+	void (*lib_handler_fn) (void *conn, const void *msg);

The following changes pick up the pieces.

$ wc -l 00*
:
    6 -api
   38 0001-testevsth.c-const-size_t-evs_deliver_fn-evs_confc.patch
   29 0002-coroapi.h-change-lib_handler_fn-s-msg-to-be-const.patch
  585 0003-Propagate-the-above-into-cfg.c-and-votequorum.c.patch
   65 0004-Propagate-the-above-into-vsf_quorum.c.patch
  157 0005-propagate-to-evc.c.patch
   37 0006--services-pload.c-Likewise.patch
   27 0007-coroipcs.h-update-signature-of-coroipcs_handler_fn_.patch
  167 0008--services-cpg.c-Likewise.patch
  304 0009--services-confdb.c-Likewise.patch
 1415 total

$ cat 000*|diffstat
:
 exec/coroipcs.h                   |    2
 exec/vsf_quorum.c                 |   19 +++-
 include/corosync/engine/coroapi.h |    2
 services/cfg.c                    |   83 ++---
 services/confdb.c                 |  148 --
 services/cpg.c                    |   56 --
 services/evs.c                    |   46 +--
 services/pload.c                  |    6 -
 services/votequorum.c             |  116 +
 test/testevsth.c                  |    8 +-
 10 files changed, 284 insertions(+), 202 deletions(-)
[Openais] [PATCH 4/9] Propagate the above into vsf_quorum.c.
From: Jim Meyering

* exec/vsf_quorum.c: add const to msg param.
---
 exec/vsf_quorum.c |   19 ---
 1 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/exec/vsf_quorum.c b/exec/vsf_quorum.c
index dc05458..45f537e 100644
--- a/exec/vsf_quorum.c
+++ b/exec/vsf_quorum.c
@@ -81,9 +81,12 @@ struct internal_callback_pd {
 	void *context;
 };
 
-static void message_handler_req_lib_quorum_getquorate (void *conn, void *msg);
-static void message_handler_req_lib_quorum_trackstart (void *conn, void *msg);
-static void message_handler_req_lib_quorum_trackstop (void *conn, void *msg);
+static void message_handler_req_lib_quorum_getquorate (void *conn,
+	const void *msg);
+static void message_handler_req_lib_quorum_trackstart (void *conn,
+	const void *msg);
+static void message_handler_req_lib_quorum_trackstop (void *conn,
+	const void *msg);
 static void send_library_notification(void *conn);
 static void send_internal_notification(void);
 static int quorum_exec_init_fn (struct corosync_api_v1 *api);
@@ -394,7 +397,8 @@ static void send_library_notification(void *conn)
 	return;
 }
 
-static void message_handler_req_lib_quorum_getquorate (void *conn, void *msg)
+static void message_handler_req_lib_quorum_getquorate (void *conn,
+	const void *msg)
 {
 	struct res_lib_quorum_getquorate res_lib_quorum_getquorate;
 
@@ -409,9 +413,10 @@ static void message_handler_req_lib_quorum_getquorate (void *conn, void *msg)
 
 }
 
-static void message_handler_req_lib_quorum_trackstart (void *conn, void *msg)
+static void message_handler_req_lib_quorum_trackstart (void *conn,
+	const void *msg)
 {
-	struct req_lib_quorum_trackstart *req_lib_quorum_trackstart = (struct req_lib_quorum_trackstart *)msg;
+	const struct req_lib_quorum_trackstart *req_lib_quorum_trackstart = msg;
 	mar_res_header_t res;
 	struct quorum_pd *quorum_pd = (struct quorum_pd *)corosync_api->ipc_private_data_get (conn);
 
@@ -446,7 +451,7 @@ static void message_handler_req_lib_quorum_trackstart (void *conn, void *msg)
 	corosync_api->ipc_response_send(conn, &res, sizeof(mar_res_header_t));
 }
 
-static void message_handler_req_lib_quorum_trackstop (void *conn, void *msg)
+static void message_handler_req_lib_quorum_trackstop (void *conn, const void *msg)
 {
 	mar_res_header_t res;
 	struct quorum_pd *quorum_pd = (struct quorum_pd *)corosync_api->ipc_private_data_get (conn);
-- 
1.6.2.rc1.285.gc5f54
Re: [Openais] howto distribute data accross all nodes?
On Thu, 2009-04-09 at 10:19 +0200, Dietmar Maurer wrote:
> > What I recommend here is to place your local node id in the message
> > contents (retrieved via cpg_local_get) and then compare that nodeid to
> > incoming messages
>
> Why do you include the local node id in the message? I can compare the
> local node id with the sending node id without that, for example:
>
> ...
> cpg_local_get(cpg_handle, &local_nodeid);
> ...
>
> static void cpg_deliver_callback (cpg_handle_t handle,
>                                   struct cpg_name *groupName,
>                                   uint32_t nodeid,
>                                   uint32_t pid,
>                                   void *msg,
>                                   int msg_len)
> {
>     ...
>     if (nodeid == local_nodeid)
>         ...
> }
>
> Or am I missing something?
>

Sorry, I forgot that was in the callback parameters. You should be good
to go with the method you propose in your code. Lots of code, hard to
remember all the interfaces :)

Regards
-steve

> - Dietmar
Re: [Openais] howto distribute data accross all nodes?
> What I recommend here is to place your local node id in the message
> contents (retrieved via cpg_local_get) and then compare that nodeid to
> incoming messages

Why do you include the local node id in the message? I can compare the
local node id with the sending node id without that, for example:

...
cpg_local_get(cpg_handle, &local_nodeid);
...

static void cpg_deliver_callback (cpg_handle_t handle,
                                  struct cpg_name *groupName,
                                  uint32_t nodeid,
                                  uint32_t pid,
                                  void *msg,
                                  int msg_len)
{
    ...
    if (nodeid == local_nodeid)
        ...
}

Or am I missing something?

- Dietmar
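The check discussed in this message can be sketched as a small helper that the deliver callback would call with its nodeid and pid parameters. This is only an illustration: struct local_identity and is_own_message are hypothetical names, and comparing the pid as well is an added assumption (it only matters if more than one process on the same node joins the group). In a real program the nodeid would come from cpg_local_get() as shown above, and the pid presumably from getpid().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of who we are; filled in once at startup. */
struct local_identity {
	uint32_t nodeid;	/* e.g. obtained via cpg_local_get() */
	uint32_t pid;		/* assumed to match the pid CPG reports */
};

/*
 * Return 1 if a delivered message was sent by this process.
 * nodeid and pid are the values handed to the deliver callback,
 * so nothing has to be embedded in the message itself.
 */
static int is_own_message (const struct local_identity *self,
			   uint32_t nodeid, uint32_t pid)
{
	assert (self != NULL);
	return nodeid == self->nodeid && pid == self->pid;
}
```

Inside cpg_deliver_callback this would replace the bare "nodeid == local_nodeid" comparison; dropping the pid test gives exactly the check quoted above.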
Re: [Openais] detecting cpg joiners
Joel Becker wrote:
> Steve, Dave, etc,
> Someone told me a while back that a node joining a cpg group
> would be by its lonesome in the join message. That is, when the node
> gets its first confchg, it will be the only node in the list of joins.
> I've been using this to detect the first joiner of the group ("I joined,
> and the member count is 1").
> Dave's since told me that this assumption is not valid (if it
> ever was). So two or more nodes can join in parallel, and each can see
> more than one node in the list of joins for its first confchg. I'm now
> trying to figure out an algorithm for "first joiner". I have a couple
> of questions:
>
> 1) If I see member_count == join_count, does that mean every member has
> just joined, and all the members are receiving the same join message?
>
> 2) If member_count == join_count, can leave_count be non-zero? If it
> is, am I guaranteed that we're looking at "all old members left, all new
> members joined"?
>
> If these both are true, I can simply isolate a "first joiner" by
> checking member_count == join_count and selecting the lowest node
> number.

I don't think you can detect a first-joiner using CPG. cman does it by
reading the totem confchg messages. It is quite possible for two nodes
to join at the same time ... during the same SYNC phase so you certainly
can't rely on that.

1) If member_count == join_count, then it's a safe bet that they are all
new nodes, and yes, it is true that all nodes should see the same
confchg messages.

2) If join_count > 0 then leave_count will always be zero. That's a
consequence of how CPG sends its messages really; join and leave
messages are always separate. Don't rely on this behaviour though!
Although I can't see any reason to change it, I'd rather not have it
burned into the de facto specification.

-- 
Chrissie
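For illustration only, the heuristic Joel describes (member_count == join_count, then pick the lowest node number) could be sketched as below. Note the caveat in the reply above: this is described as "a safe bet", not a guarantee, so treat the result as a candidate rather than proof of being first. struct member_addr and is_first_joiner_candidate are hypothetical names; the struct only mirrors the nodeid/pid pair that a confchg callback reports for each member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of the confchg member list. */
struct member_addr {
	uint32_t nodeid;
	uint32_t pid;
};

/*
 * Return 1 if, in this configuration change, every member has just
 * joined (member count equals join count) and the local node has the
 * lowest node id among them. Return 0 otherwise.
 */
static int is_first_joiner_candidate (const struct member_addr *members,
				      size_t member_count,
				      size_t join_count,
				      uint32_t local_nodeid)
{
	uint32_t lowest;
	size_t i;

	if (member_count == 0 || member_count != join_count) {
		return 0;
	}
	assert (members != NULL);

	lowest = members[0].nodeid;
	for (i = 1; i < member_count; i++) {
		if (members[i].nodeid < lowest) {
			lowest = members[i].nodeid;
		}
	}
	return lowest == local_nodeid;
}
```

Because all nodes should see the same confchg messages, every node evaluating this on the same callback would pick the same candidate. The sketch deliberately ignores leave_count, since the reply asks that the "joins and leaves are separate" behaviour not be relied upon.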
Re: [Openais] howto distribute data accross all nodes?
> > cpg_mcast_joined sends messages asynchronously. Is there a way to wait
> > until the message is delivered to the local callback? Or do I need to
> > implement my own message numbering scheme to detect when a message
> > arrives?
> >
>
> What I recommend here is to place your local node id in the message
> contents (retrieved via cpg_local_get) and then compare that nodeid to
> incoming messages. If it is from yourself, you may decide to take some
> special actions in that case.
>
> > And, if I send a message with 'cpg_mcast_joined', is there any guarantee
> > that this message arrives at the local node callback? In other words,
> > can I trust that I receive messages sent by myself? If so, is there a
> > timing constraint (something like, messages sent will arrive within 10
> > seconds at the local node)?
> >
>
> If a node sends a message, it will always self-deliver (or fail) in a
> timely fashion. The only way the message won't be delivered is if the
> local node fails.
>
> You are also guaranteed it will be delivered to all nodes that were a)
> part of previous regular configuration _and_ b) members of new regular
> configuration.

OK, many thanks.

- Dietmar
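One way the "message numbering scheme" asked about in this thread might look is sketched below. It is not an API provided by CPG: struct seq_header, struct self_tracker and the tracker_* functions are hypothetical, and the sketch assumes a single sending process per node and that a node's own messages are self-delivered in the order they were sent. The sender stamps each outgoing message with a sequence number, and the deliver callback feeds every delivery into tracker_on_deliver().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header placed at the start of every message payload. */
struct seq_header {
	uint32_t seq;
};

/* Per-process bookkeeping for self-delivery. */
struct self_tracker {
	uint32_t local_nodeid;		/* e.g. from cpg_local_get() */
	uint32_t next_seq;		/* next number to hand out, starts at 1 */
	uint32_t last_delivered;	/* highest own seq seen so far, 0 = none */
};

static void tracker_init (struct self_tracker *t, uint32_t local_nodeid)
{
	assert (t != NULL);
	t->local_nodeid = local_nodeid;
	t->next_seq = 1;
	t->last_delivered = 0;
}

/* Stamp an outgoing header; returns the number to wait for later. */
static uint32_t tracker_stamp (struct self_tracker *t, struct seq_header *hdr)
{
	hdr->seq = t->next_seq++;
	return hdr->seq;
}

/* Call from the deliver callback with its nodeid, msg and msg_len. */
static void tracker_on_deliver (struct self_tracker *t, uint32_t nodeid,
				const void *msg, size_t msg_len)
{
	struct seq_header hdr;

	if (nodeid != t->local_nodeid || msg_len < sizeof (hdr)) {
		return;	/* not ours, or too short to carry a header */
	}
	memcpy (&hdr, msg, sizeof (hdr));	/* avoids alignment issues */
	if (hdr.seq > t->last_delivered) {
		t->last_delivered = hdr.seq;
	}
}

/* Has the message stamped with seq come back to us yet? */
static int tracker_is_delivered (const struct self_tracker *t, uint32_t seq)
{
	return seq != 0 && seq <= t->last_delivered;
}
```

A sender that needs to block would keep running the library's dispatch loop until tracker_is_delivered() returns 1 for the number it got from tracker_stamp(). Sequence wrap-around is not handled in this sketch.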