Hi Thierry,

On 3-5-2018 at 8:59, Thierry Fournier wrote:
> ... the bug. I even installed a FreeBSD :-)  I added Willy in
> copy, maybe he will reproduce it.
>
> Thierry

The 'trick' is probably sending as few requests as possible through a 'high latency' VPN (17 ms for a ping from the client to the haproxy machine..).

Haproxy startup:
    Line 17:  00000000:TestSite.clireq[0007:ffffffff]: GET /haproxy?stats HTTP/1.1
    Line 34:  00000002:TestSite.clireq[0007:ffffffff]: GET /favicon.ico HTTP/1.1
    Line 44:  00000001:TestSite.clireq[0008:ffffffff]: GET /webrequest/mailstat HTTP/1.1
    Line 133: 00000003:TestSite.clireq[0008:ffffffff]: GET /webrequest/mailstat HTTP/1.1
    Line 220: 00000004:TestSite.clireq[0008:ffffffff]: GET /favicon.ico HTTP/1.1
    Line 233: 00000005:TestSite.clireq[0008:ffffffff]: GET /haproxy?stats HTTP/1.1
    Line 251: 00000006:TestSite.clireq[0008:ffffffff]: GET /webrequest/mailstat HTTP/1.1
Crash..

Sometimes it takes a few more requests, but it's not really consistent.. It's rather timing sensitive, I guess..


But besides the reproduction, what is the theory behind the tasks and their cleanup; how 'should' it work? The Chrome browser makes a few requests to haproxy, some for the stats page and others for the Lua service (and a favicon in between)..

At one point in time the TCP connection for the Lua service gets closed and process_stream starts to call si_shutw().. a few calls deeper, hlua_applet_http_release() removes the HTTP task from the list:

static void hlua_applet_http_release(struct appctx *ctx)
{
    /* the task is deleted and freed immediately, while the scheduler
     * may still be referencing it */
    task_delete(ctx->ctx.hlua_apphttp.task);
    task_free(ctx->ctx.hlua_apphttp.task);
    ...

Then when the current task is 'done', the loop in process_runnable_tasks() moves on to the next one via rq_next.. which however may be pointing to that just deleted/freed hlua_apphttp.task..?.. So getting the next task from that already destroyed element will fail...
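
To make that concrete, here is a minimal standalone sketch of the pattern (not HAProxy's actual code: a plain linked list stands in for the eb32sc run queue, and all names are hypothetical). The walk caches a pointer to the next element before running the current one, and the current one's handler destroys exactly that element:

#include <stdlib.h>

/* Toy model of the run-queue walk; hypothetical names, a plain linked
 * list instead of HAProxy's eb32sc tree. */
struct task {
    struct task *next;
    void (*process)(struct task *t);
};

/* Plays the role of the shutdown path: it destroys the *next* task,
 * the way si_shutw() ends up in hlua_applet_http_release() while the
 * scheduler is mid-walk. */
static void free_neighbour(struct task *t)
{
    free(t->next);       /* the walk below has already cached this pointer */
    t->next = NULL;
}

static void noop(struct task *t)
{
    (void)t;
}

int main(void)
{
    struct task *b = calloc(1, sizeof(*b));
    struct task *a = calloc(1, sizeof(*a));
    b->process = noop;
    a->next = b;
    a->process = free_neighbour;

    for (struct task *t = a, *next; t; t = next) {
        next = t->next;  /* cached up front, like rq_next */
        t->process(t);   /* may free the task 'next' points to... */
    }                    /* ...then 't = next' walks into freed memory */

    free(a);
    return 0;
}

Built with -fsanitize=address this reports a heap-use-after-free on the second loop iteration, which looks like the same shape as the crash here.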

Perhaps something like the patch below could work?
Does it make sense? (The same should then be done for the TCP and CLI tasks, I guess..) For my testcase it doesn't crash anymore with that change. But I'm not sure if it's now leaking memory instead in some cases.. Is there an easy way to check?

Regards,
PiBa-NL (Pieter)


diff --git a/src/hlua.c b/src/hlua.c
index 4c56409..6515f52 100644
--- a/src/hlua.c
+++ b/src/hlua.c
@@ -6635,8 +6635,7 @@ error:

 static void hlua_applet_http_release(struct appctx *ctx)
 {
-    task_delete(ctx->ctx.hlua_apphttp.task);
-    task_free(ctx->ctx.hlua_apphttp.task);
+    ctx->ctx.hlua_apphttp.task->process = NULL;
     ctx->ctx.hlua_apphttp.task = NULL;
     hlua_ctx_destroy(ctx->ctx.hlua_apphttp.hlua);
     ctx->ctx.hlua_apphttp.hlua = NULL;
diff --git a/src/task.c b/src/task.c
index fd9acf6..d6ab0b9 100644
--- a/src/task.c
+++ b/src/task.c
@@ -217,6 +217,13 @@ void process_runnable_tasks()
             t = eb32sc_entry(rq_next, struct task, rq);
             rq_next = eb32sc_next(rq_next, tid_bit);
             __task_unlink_rq(t);
+            if (!t->process) {
+            /* task was 'scheduled' to be destroyed (for example a hlua_apphttp.task) */
+                task_delete(t);
+                task_free(t);
+                continue;
+            }
+
             t->state |= TASK_RUNNING;
             t->pending_state = 0;
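
For what it's worth, the idea in the patch, making the scheduler walk the only place where tasks actually die, can be sketched against the same toy model as above (again hypothetical names, not HAProxy's API):

/* Release paths no longer free the task; they only mark it. */
static void release_neighbour(struct task *t)
{
    t->next->process = NULL;   /* 'scheduled' for destruction */
}

/* The walk itself reaps marked tasks, so its cached 'next' pointer can
 * never dangle.  A real implementation must also unlink the task from
 * its queues, as the patch does with task_delete(); this one-shot walk
 * just frees it. */
static void run_tasks(struct task *head)
{
    for (struct task *t = head, *next; t; t = next) {
        next = t->next;
        if (!t->process) {
            free(t);
            continue;
        }
        t->process(t);
    }
}

In the patch itself the reap happens right after __task_unlink_rq(t), so by the time task_delete() and task_free() run, the run-queue walk no longer references the task.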


