On 06.03.2009 09:22, Mladen Turk wrote:
> BZ 46808 is valid, but it brings us back to where we were before.
> It solves (well, doesn't actually) one thing,
> but breaks sticky sessions.
> Unless the patch reliably detects the cause of failure,
> I'm -1 for committing it. The problem is that the return
> codes from socket operations are platform specific, and
> then what if someone pulls the cable in the middle of an
> AJP request? The possibilities are endless, and trying to
> solve each and every one is over-engineering, though.
> We should keep things simple:
>
> "If there are opened connections to the backend, do not
> mark that worker in error."
> We simply do not know the status of the network beyond
> the current connection. It might be Tomcat rejecting
> new connections, a firewall reaching its limit, ulimit,
> an ssh tunnel restart, etc, etc ...
> So although our connection fails, others might still
> work. If we can detect something as an absolute
> failure for sure, then we could save some concurrent client
> cycles, but then again we have a watchdog that can
> check all the client connections in the background.

I don't want to simply commit the patch from BZ 46808; I first want to fully understand the construction and implications of the local error states. I did understand the concept in general (even before 1.2.27 was released), but nevertheless I have some questions:

1) New busy introduced in r647085
=================================

We have three busy counters:

a) one for the lb in total
b) one for each lb sub
c) one for each ajp worker

In the status worker we use only a) and c). In lb we use a) and b). Your comment on BZ 46808 seems to indicate that using c) instead of b) in lb would be better. We could then remove b) again.

Right?

So we could drop the rec->s->busy++ and --, because that's already done for the ajp busyness in jk_ajp_common.c, and we would test against aw->s->busy instead of rec->s->busy.
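Untested sketch of what I mean; aw would be fetched via worker_private the same way we already do it elsewhere in jk_lb_worker.c:

    /* Sketch only: test the busyness of the underlying ajp worker
     * instead of maintaining a separate counter on the lb sub worker. */
    ajp_worker_t *aw = (ajp_worker_t *)rec->worker->worker_private;
    if (aw->s->busy) {
        /* this member is currently handling requests */
        ...
    }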

OK?

2) Local states vs. global states
=================================

I went through all the places where we use and set states in lb. I have some comments:

a) Setting local states and global states to different values
-------------------------------------------------------------

Of course the whole idea of the local state is to allow situations where we behave differently for a single request and for the rest of the system. On the other hand, if we have too many such differences, the system gets very complex, because, as you can see further down, we need to carefully decide whether to test against the global or against the local state.

First the easier cases (I think):

(i) JK_CLIENT_ERROR

                else if (service_stat == JK_CLIENT_ERROR) {
                    /*
                     * Client error !!!
                     * Since this is a bad request do not fail over.
                     */
                    rec->s->state  = JK_LB_STATE_OK;
                    p->states[rec->i] = JK_LB_STATE_ERROR;
                    rec->s->error_time = 0;
                    rc = JK_CLIENT_ERROR;
                    recoverable = JK_FALSE;
                }

recoverable=JK_FALSE means we no longer loop in service(), and thus the states are no longer used. I would therefore suggest keeping the two states equal, both "OK", here. It doesn't matter for the logic, but it makes keeping track of the differences easier.
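In diff form, the suggestion is just this one line (untested sketch):

                     rec->s->state  = JK_LB_STATE_OK;
-                    p->states[rec->i] = JK_LB_STATE_ERROR;
+                    p->states[rec->i] = JK_LB_STATE_OK;
                     rec->s->error_time = 0;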

(ii) JK_STATUS_FATAL_ERROR

                else if (service_stat == JK_STATUS_FATAL_ERROR) {
                    /*
                     * Status code configured as service is down.
                     * Mark the node as bad.
                     * Failing over to another node could help.
                     */
                    rec->s->errors++;
                    if (rec->s->busy) {
                        rec->s->state = JK_LB_STATE_OK;
                    }
                    else {
                        rec->s->state = JK_LB_STATE_ERROR;
                    }
                    p->states[rec->i] = JK_LB_STATE_ERROR;
                    rec->s->error_time = time(NULL);
                    rc = JK_FALSE;
                }

This is the situation where fail_on_status was triggered with a positive status code, i.e. the admin configured this status code to be treated as an ERROR. And despite busyness, we actually got a response back. So I propose not to make a distinction based on busyness here, and to set the global state to ERROR too.
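Again as an untested sketch:

                     rec->s->errors++;
-                    if (rec->s->busy) {
-                        rec->s->state = JK_LB_STATE_OK;
-                    }
-                    else {
-                        rec->s->state = JK_LB_STATE_ERROR;
-                    }
+                    rec->s->state = JK_LB_STATE_ERROR;
                     p->states[rec->i] = JK_LB_STATE_ERROR;
                     rec->s->error_time = time(NULL);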

OK?

(iii) JK_REPLY_TIMEOUT

                else if (service_stat == JK_REPLY_TIMEOUT) {
                    if (aw->s->reply_timeouts > (unsigned)p->worker->max_reply_timeouts) {
                        /*
                         * Service failed - too many reply timeouts
                         * Take this node out of service.
                         */
                        rec->s->errors++;
                        if (rec->s->busy) {
                            rec->s->state = JK_LB_STATE_OK;
                        }
                        else {
                            rec->s->state = JK_LB_STATE_ERROR;
                        }
                        p->states[rec->i] = JK_LB_STATE_ERROR;
                        rec->s->error_time = time(NULL);

Here we are already over max_reply_timeouts, so this should trigger a global ERROR, even when busy. The whole point of max_reply_timeouts is to prevent isolated reply timeouts from triggering ERROR. If we really go over that limit, we should go into ERROR without further conditions. So I propose a global ERROR here, without checking busy.
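After that change the branch would simply read (sketch only):

                        rec->s->errors++;
                        rec->s->state = JK_LB_STATE_ERROR;
                        p->states[rec->i] = JK_LB_STATE_ERROR;
                        rec->s->error_time = time(NULL);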

Now the more difficult cases:

(iv) JK_SERVER_ERROR

We only get this if a memory allocation fails.

                else if (service_stat == JK_SERVER_ERROR) {
                    /*
                     * Internal JK server error.
                     * Don't mark the node as bad.
                     * Failing over to another node could help.
                     */
                    rec->s->state  = JK_LB_STATE_OK;
                    p->states[rec->i] = JK_LB_STATE_ERROR;
                    rec->s->error_time = 0;
                    rc = JK_FALSE;
                }

I'm fine with what you decided, although I actually see no reason why allocation should work better or worse for one lb member than for another.

We can leave it as is, but to keep track of the differences it would be easier to set the local state to OK here too (the same one-line change as sketched for (i) above).

(v) the "else"

                else {
                    /*
                     * Service failed !!!
                     * Time for fault tolerance (if possible)...
                     */
                    rec->s->errors++;
                    if (rec->s->busy) {
                        rec->s->state = JK_LB_STATE_OK;
                    }
                    else {
                        rec->s->state = JK_LB_STATE_ERROR;
                    }
                    p->states[rec->i] = JK_LB_STATE_ERROR;
                    rec->s->error_time = time(NULL);
                    rc = JK_FALSE;
                }

This is the hard case. It handles connect problems, protocol failures, and a variety of other problems.

I agree that it is hard to decide whether an error here should be handled as a global failover decision. On the other hand, in production it is likely that busy > 0, so we would never set a global error in these "else" cases.

Of course we are now in good shape with cping/cpong (if a connection exists) and the new socket connect timeout (if not) to detect problems on the network layer really fast, before the request is wasted and unrecoverable.
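For reference, a worker configuration that enables those checks could look like this (property names as in the mod_jk docs, values are only examples):

worker.node1.type=ajp13
worker.node1.host=localhost
worker.node1.port=8009
# fail fast if the TCP connect itself hangs (milliseconds)
worker.node1.socket_connect_timeout=5000
# cping/cpong after connect and before each request (milliseconds)
worker.node1.connect_timeout=10000
worker.node1.prepost_timeout=10000
# per-request reply timeout and the lb escalation limit
worker.node1.reply_timeout=120000
worker.node1.max_reply_timeouts=3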

My idea would be: leave this as is, but add a second check based on error_time. We still set it, even when the global state stays OK. In the other cases, where you use the local state simply to prevent using the same member twice in the service loop, we don't set it. So it's a good indicator for this "else" branch.

Something along the lines of:

Index: jk_lb_worker.h
===================================================================
--- jk_lb_worker.h      (revision 750052)
+++ jk_lb_worker.h      (working copy)
@@ -123,6 +123,9 @@

 /* Time to wait before retry. */
 #define WAIT_BEFORE_RECOVER   (60)
+/* Time after which a local error switches to a global error (seconds) */
+#define MAX_LOCAL_ERROR_TIME 10
+
 /* We accept doing global maintenance if we are */
 /* JK_LB_MAINTAIN_TOLERANCE seconds early. */
 #define JK_LB_MAINTAIN_TOLERANCE (2)
Index: jk_lb_worker.c
===================================================================
--- jk_lb_worker.c      (revision 750069)
+++ jk_lb_worker.c      (working copy)
@@ -1338,15 +1338,17 @@
                      * Service failed !!!
                      * Time for fault tolerance (if possible)...
                      */
+                    time_t now = time(NULL);
                     rec->s->errors++;
-                    if (rec->s->busy) {
-                        rec->s->state = JK_LB_STATE_OK;
-                    }
-                    else {
+                    if (rec->s->busy == 0 ||
+                        (rec->s->error_time > 0 &&
+                         (int)difftime(now, rec->s->error_time) > MAX_LOCAL_ERROR_TIME)) {
                         rec->s->state = JK_LB_STATE_ERROR;
                     }
                     p->states[rec->i] = JK_LB_STATE_ERROR;
-                    rec->s->error_time = time(NULL);
+                    if (rec->s->error_time == 0) {
+                        rec->s->error_time = now;
+                    }
                     rc = JK_FALSE;
                 }
                 if (p->worker->lblock == JK_LB_LOCK_PESSIMISTIC)

This also has another important side effect: it prevents an established global error state from being switched back to OK when we land in this "else" case with busy > 0.

What do you think?

b) Local states and forced recovery
-----------------------------------

I think we should test against the local state in forced recovery (whenever the global state is ERROR, the local one is as well, but the local state can be ERROR without the global one being so). Since we use forced recovery both during maintenance (where we have no local states) and in normal service, I suggest:

Index: jk_lb_worker.c
===================================================================
--- jk_lb_worker.c      (revision 750069)
+++ jk_lb_worker.c      (working copy)
@@ -570,17 +570,23 @@
 }

 static int force_recovery(lb_worker_t *p,
+                          int *states,
                           jk_logger_t *l)
 {
     unsigned int i;
     int forced = 0;
+    int state;
     lb_sub_worker_t *w = NULL;
     ajp_worker_t *aw = NULL;
     JK_TRACE_ENTER(l);

     for (i = 0; i < p->num_of_workers; i++) {
         w = &p->lb_workers[i];
-        if (w->s->state == JK_LB_STATE_ERROR) {
+        if (states == NULL)
+            state = w->s->state;
+        else
+            state = states[i];
+        if (state == JK_LB_STATE_ERROR) {
             if (JK_IS_DEBUG_LEVEL(l))
                 jk_log(l, JK_LOG_INFO,
                        "worker %s is marked for forced recovery",
@@ -588,6 +594,8 @@
             aw = (ajp_worker_t *)w->worker->worker_private;
             aw->s->reply_timeouts = 0;
             w->s->state = JK_LB_STATE_FORCE;
+            if (states != NULL)
+                states[i] = JK_LB_STATE_FORCE;
             forced++;
         }
     }
@@ -659,7 +667,7 @@
                        JK_LB_DECAY_MULT * delta / lb->maintain_time);
             curmax = decay_load(lb, JK_LB_DECAY_MULT * delta / lb->maintain_time, l);
             if (!recover_workers(lb, curmax, now, l)) {
-                force_recovery(lb, l);
+                force_recovery(lb, NULL, l);
             }
         }

@@ -1396,7 +1406,7 @@
                  * If it still fails, Tomcat is still disconnected.
                  */
                 jk_shm_lock();
-                nf = force_recovery(p->worker, l);
+                nf = force_recovery(p->worker, states, l);
                 jk_shm_unlock();
                 was_forced = 1;
                 if (nf) {


OK?

Regards,

Rainer
