Ruediger Pluem wrote:
....
+typedef struct hb_ctx_t
+{
+    int active;
+    apr_sockaddr_t *mcast_addr;
+    int server_limit;
+    int thread_limit;
+    int status;
+    int keep_running;

Shouldn't this be volatile?

Changed, r723660.

....
+            if (res == SERVER_READY && ps->generation == ap_my_generation) {
+                ready++;
+            }
+            else if (res != SERVER_DEAD &&
+                     res != SERVER_STARTING && res != SERVER_IDLE_KILL) {
+                busy++;

Is this correct even if ps->generation != ap_my_generation?

Nope, this would over-report 'busy' servers.  Fixed in r723661.
....
+
+        apr_pool_t *tpool;
+        apr_pool_create(&tpool, pool);
+        apr_pool_tag(tpool, "heartbeat_worker_temp");
+        hb_monitor(ctx, tpool);
+        apr_pool_destroy(tpool);

Why create / destroy and not simply create once and call apr_pool_clear
in the loop?

As the parent pool is around forever but this only runs once a second, I don't think there is much advantage to clearing; keeping one long-lived pool also carries the small risk that it keeps growing.

....
+static void start_hb_worker(apr_pool_t *p, hb_ctx_t *ctx)
+{
+    apr_status_t rv;
+
+    rv = apr_thread_mutex_create(&ctx->start_mtx, APR_THREAD_MUTEX_UNNESTED,
+                                 p);
+
+    if (rv) {
+        ap_log_error(APLOG_MARK, APLOG_CRIT, rv, NULL,
+                     "Heartbeat: apr_thread_cond_create failed");

You create a thread mutex above, not a thread cond.

Yeah, some old code used a thread cond for this, fixed in r723663.

....
+    rv = apr_thread_create(&ctx->thread, NULL, hb_worker, ctx, p);
+    if (rv) {
+        apr_pool_cleanup_kill(p, ctx, hb_pool_cleanup);
+        ap_log_error(APLOG_MARK, APLOG_CRIT, rv, NULL,
+                     "Heartbeat: apr_thread_create failed");
+        ctx->status = rv;
+    }
+
+    apr_thread_mutex_lock(ctx->start_mtx);
+    apr_thread_mutex_unlock(ctx->start_mtx);

This may deserve some comment. As far as I understand, the desire is to wait until the hb_worker thread is up.
But to be honest I do not understand the need for the start_mutex at all.

Added a comment in r723665.

+    rv = apr_proc_mutex_create(&ctx->mutex, ctx->mutex_path,
+#if APR_HAS_FCNTL_SERIALIZE
+                               APR_LOCK_FCNTL,
+#else
+#if APR_HAS_FLOCK_SERIALIZE
+                               APR_LOCK_FLOCK,
+#else
+#error port me to a non crap platform.
+#endif
+#endif
+                               p);

Is there any reason why we must use either APR_LOCK_FCNTL or APR_LOCK_FLOCK,
wouldn't the default mutex work?

The default lock mechanism on OS X is sysvsem, and I couldn't get it to work properly after forking at all.

Maybe I was doing something wrong, but switching it to flock/fcntl works pretty much everywhere, and pretty consistently works even if a child crashes.

+
+    if (rv) {
+        ap_log_error(APLOG_MARK, APLOG_CRIT, rv, s,
+                     "Heartbeat: mutex failed creation at %s (type=%s)",
+                     ctx->mutex_path, apr_proc_mutex_defname());

And how do you know that apr_proc_mutex_defname is either APR_LOCK_FCNTL
or APR_LOCK_FLOCK? Maybe the default mutex on this platform is something
different.

Fixed with r723666.

Added: httpd/httpd/trunk/modules/cluster/mod_heartmonitor.c
.....
+typedef struct hm_ctx_t
+{
+    int active;
+    const char *storage_path;
+    apr_proc_mutex_t *mutex;
+    const char *mutex_path;
+    apr_sockaddr_t *mcast_addr;
+    int status;
+    int keep_running;

Shouldn't this be volatile?

Fixed in r723669.


+    apr_time_t last = apr_time_now();
+    while (ctx->keep_running) {
+        int n;
+        apr_pool_t *p;
+        apr_pollfd_t pfd;
+        apr_interval_time_t timeout;
+        apr_pool_create(&p, ctx->p);
+
+        apr_time_t now = apr_time_now();
+
+        if (apr_time_sec((now - last)) > 5) {

Hardcoded 5 seconds? Bah!!

Moved to a compile time define in r723672.

....
+
+        apr_pool_destroy(p);

Why not just clearing the pool?

Because I don't trust pools? :P

....
+    rv = apr_thread_mutex_create(&ctx->start_mtx, APR_THREAD_MUTEX_UNNESTED,
+                                 p);
+
+    if (rv) {
+        ap_log_error(APLOG_MARK, APLOG_CRIT, rv, NULL,
+                     "Heartmonitor: apr_thread_cond_create failed");

You create a thread mutex above, not a thread cond.

Fixed in r723674.

....
+
+    apr_thread_mutex_lock(ctx->start_mtx);
+    apr_thread_mutex_unlock(ctx->start_mtx);

This may deserve some comment. As far as I understand, the desire is to wait until the hb_worker thread is up.
But to be honest I do not understand the need for the start_mutex at all.

Commented in r723675.

...
+#if APR_HAS_FCNTL_SERIALIZE
+
+                                            APR_LOCK_FCNTL,
+#else
+#if APR_HAS_FLOCK_SERIALIZE
+                                            APR_LOCK_FLOCK,
+#else
+#error port me to a non crap platform.
+#endif
+#endif
+                                            p);

Is there any reason why we must use either APR_LOCK_FCNTL or APR_LOCK_FLOCK,
wouldn't the default mutex work?

See comments above from mod_heartbeat.

+
+    if (rv) {
+        ap_log_error(APLOG_MARK, APLOG_CRIT, rv, s,
+                     "Heartmonitor: Failed to create listener "
+                     "mutex at %s (type=%s)", ctx->mutex_path,
+                     apr_proc_mutex_defname());

And how do you know that apr_proc_mutex_defname is either APR_LOCK_FCNTL
or APR_LOCK_FLOCK? Maybe the default mutex on this platform is something
different.


Fixed in r723677.

...
+static void *hm_create_config(apr_pool_t *p, server_rec *s)
+{
+    hm_ctx_t *ctx = (hm_ctx_t *) apr_palloc(p, sizeof(hm_ctx_t));
+
+    ctx->active = 0;
+    ctx->storage_path = ap_server_root_relative(p, "logs/hb.dat");

Why doesn't ctx->mutex_path get initialized here?

Fixed in r723679.

...


Regards

THANK YOU very much for reviewing the modules, it is very appreciated!

-Paul
