I modified h2s struct in 2.2 branch with HEAD set to
f96508aae6b49277dcf142caa35042678cf8e2ca "MEDIUM: mux-h2: merge recv_wait
and send_wait event notifications" like below (subs is in exact place of
removed wait_event):

struct h2s {
        [...]
        struct tasklet *dummy0;
        struct wait_event *dummy1;
        struct wait_event *subs;      /* recv wait_event the conn_stream
associated is waiting on (via h2_subscribe) */
        struct list list; /* To be used when adding in h2c->send_list or
h2c->fctl_lsit */
        struct tasklet *shut_tl;  /* deferred shutdown tasklet, to retry to
send an RST after we failed to,
                                   * in case there's no other subscription
to do it */
}

it crashed like before with subs = 0xffffffff:

(gdb) p *(struct h2s*)(0x7fde7459e9b0)
$1 = {cs = 0x7fde5c02d260, sess = 0x5628283bc740 <pool_cache+64>, h2c =
0x5628295cbb80, h1m = {state = H1_MSG_RPBEFORE, flags = 12, curr_len = 0,
    body_len = 0, next = 0, err_pos = -1, err_state = 0}, by_id = {node =
{branches = {b = {0x0, 0x7fde3c2c6c60}}, node_p = 0x0,
      leaf_p = 0x5628295cc018, bit = -5624, pfx = 29785}, key = 11}, id =
11, flags = 28673, sws = -4060, errcode = H2_ERR_NO_ERROR, st = H2_SS_HREM,
  status = 200, body_len = 0, rxbuf = {size = 0, area = 0x0, data = 0, head
= 0}, dummy0 = 0x0, dummy1 = 0x0, subs = 0xffffffff, list = {
    n = 0x7fde7459ea68, p = 0x7fde7459ea68}, shut_tl = 0x5628297eeaf0}

it crashes like above until commit:
http://git.haproxy.org/?p=haproxy-2.2.git;a=commit;h=5c8be272c732e4f42ccd6b3d65f25aa7425a2aba
which alters tasks processing.


pon., 2 lis 2020 o 15:46 Maciej Zdeb <mac...@zdeb.pl> napisał(a):

> I'm wondering, the corrupted address was always at "wait_event" in h2s
> struct, after its removal in:
> http://git.haproxy.org/?p=haproxy-2.2.git;a=commitdiff;h=5723f295d85febf5505f8aef6afabb6b23d6fdec;hp=f11be0ea1e8e571234cb41a2fcdde2cf2161df37
> crashes went away.
>
> But with the above patch and after altering h2s struct into:
> struct h2s {
>         [...]
>         struct tasklet *shut_tl;
>         struct wait_event *recv_wait; /* recv wait_event the conn_stream
> associated is waiting on (via h2_subscribe) */
>         struct wait_event *send_wait; /* send wait_event the conn_stream
> associated is waiting on (via h2_subscribe) */
>         struct list list; /* To be used when adding in h2c->send_list or
> h2c->fctl_lsit */
> };
>
> the crash returned.
>
> However after recv_wait and send_wait were merged in:
> http://git.haproxy.org/?p=haproxy-2.2.git;a=commit;h=f96508aae6b49277dcf142caa35042678cf8e2ca
> crashes went away again.
>
> In my opinion shut_tl should be corrupted again, but it is not. Maybe the
> last patch fixed it?
>
> pon., 2 lis 2020 o 15:37 Kirill A. Korinsky <kir...@korins.ky> napisał(a):
>
>> Maciej,
>>
>> Looks like memory corruption is still here but it corrupt just some
>> another place.
>>
>> Willy do you agree?
>>
>> --
>> wbr, Kirill
>>
>> On 2. Nov 2020, at 15:34, Maciej Zdeb <mac...@zdeb.pl> wrote:
>>
>> So after Kirill suggestion to modify h2s struct in a way that tasklet
>> "shut_tl" is before recv_wait I verified if in 2.2.4 the same crash will
>> occur aaaand it did not!
>>
>> After the patch that merges recv_wait and send_wait:
>> http://git.haproxy.org/?p=haproxy-2.2.git;a=commit;h=f96508aae6b49277dcf142caa35042678cf8e2ca
>> and witch such h2s (tasklet shut_tl before wait_event subs) the crashes
>> are gone:
>>
>> struct h2s {
>>         [...]
>>         struct buffer rxbuf; /* receive buffer, always valid (buf_empty
>> or real buffer) */
>>         struct tasklet *shut_tl;  /* deferred shutdown tasklet, to retry
>> to send an RST after we failed to,
>>                                    * in case there's no other
>> subscription to do it */
>>         struct wait_event *subs;      /* recv wait_event the conn_stream
>> associated is waiting on (via h2_subscribe) */
>>         struct list list; /* To be used when adding in h2c->send_list or
>> h2c->fctl_lsit */
>> };
>>
>>
>>
>> pon., 2 lis 2020 o 12:42 Maciej Zdeb <mac...@zdeb.pl> napisał(a):
>>
>>> Great idea Kirill,
>>>
>>> With such modification:
>>>
>>> struct h2s {
>>>         [...]
>>>         struct tasklet *shut_tl;
>>>         struct wait_event *recv_wait; /* recv wait_event the conn_stream
>>> associated is waiting on (via h2_subscribe) */
>>>         struct wait_event *send_wait; /* send wait_event the conn_stream
>>> associated is waiting on (via h2_subscribe) */
>>>         struct list list; /* To be used when adding in h2c->send_list or
>>> h2c->fctl_lsit */
>>> };
>>>
>>> it crashed just like before.
>>>
>>> pon., 2 lis 2020 o 11:12 Kirill A. Korinsky <kir...@korins.ky>
>>> napisał(a):
>>>
>>>> Hi,
>>>>
>>>> Thanks for update.
>>>>
>>>> After read Wully's recommendation and provided commit that fixed an
>>>> issue I'm curious can you "edit" a bit this commit and move `shut_tl`
>>>> before `recv_wait` instead of removed `wait_event`?
>>>>
>>>> It is a quiet dummy way to confirm that memory corruption had gone, and
>>>> not just moved to somewhere else.
>>>>
>>>> --
>>>> wbr, Kirill
>>>>
>>>> On 2. Nov 2020, at 10:58, Maciej Zdeb <mac...@zdeb.pl> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Update for people on the list that might be interested in the issue,
>>>> because part of discussion was private.
>>>>
>>>> I wanted to check Willy suggestion and modified h2s struct (added dummy
>>>> fields):
>>>>
>>>> struct h2s {
>>>>         [...]
>>>>         uint16_t status;     /* HTTP response status */
>>>>         unsigned long long body_len; /* remaining body length according
>>>> to content-length if H2_SF_DATA_CLEN */
>>>>         struct buffer rxbuf; /* receive buffer, always valid (buf_empty
>>>> or real buffer) */
>>>>         int dummy0;
>>>>         struct wait_event wait_event; /* Wait list, when we're
>>>> attempting to send a RST but we can't send */
>>>>         int dummy1;
>>>>         struct wait_event *recv_wait; /* recv wait_event the
>>>> conn_stream associated is waiting on (via h2_subscribe) */
>>>>         int dummy2;
>>>>         struct wait_event *send_wait; /* send wait_event the
>>>> conn_stream associated is waiting on (via h2_subscribe) */
>>>>         int dummy3;
>>>>         struct list list; /* To be used when adding in h2c->send_list
>>>> or h2c->fctl_lsit */
>>>>         struct list sending_list; /* To be used when adding in
>>>> h2c->sending_list */
>>>> };
>>>>
>>>> With such modified h2s struct, the crash did not occur.
>>>>
>>>> I've checked HAProxy 2.1, it crashes like 2.0.
>>>>
>>>> I've also checked 2.2, bisection showed that this commit:
>>>> http://git.haproxy.org/?p=haproxy-2.2.git;a=commitdiff;h=5723f295d85febf5505f8aef6afabb6b23d6fdec;hp=f11be0ea1e8e571234cb41a2fcdde2cf2161df37
>>>> fixed the crashes we experienced. I'm not sure how/if it fixed the memory
>>>> corruption, it is possible that memory is still corrupted but not causing
>>>> the crash.
>>>>
>>>>
>>>>
>>>> pt., 25 wrz 2020 o 16:25 Kirill A. Korinsky <kir...@korins.ky>
>>>> napisał(a):
>>>>
>>>>> Very interesting.
>>>>>
>>>>> Anyway, I can see that this pice of code was refactored some time ago:
>>>>> https://github.com/haproxy/haproxy/commit/f96508aae6b49277dcf142caa35042678cf8e2ca
>>>>>
>>>>> Maybe it is worth to try 2.2 or 2.3 branch?
>>>>>
>>>>> Yes, it is a blind shot and just a guess.
>>>>>
>>>>> --
>>>>> wbr, Kirill
>>>>>
>>>>> On 25. Sep 2020, at 16:01, Maciej Zdeb <mac...@zdeb.pl> wrote:
>>>>>
>>>>> Yes at the same place with same value:
>>>>>
>>>>> (gdb) bt full
>>>>> #0  0x0000559ce98b964b in h2s_notify_recv (h2s=0x559cef94e7e0) at
>>>>> src/mux_h2.c:783
>>>>>         sw = 0xffffffff
>>>>>
>>>>>
>>>>>
>>>>> pt., 25 wrz 2020 o 15:42 Kirill A. Korinsky <kir...@korins.ky>
>>>>> napisał(a):
>>>>>
>>>>>> > On 25. Sep 2020, at 15:26, Maciej Zdeb <mac...@zdeb.pl> wrote:
>>>>>> >
>>>>>> > I was mailing outside the list with Willy and Christopher but it's
>>>>>> worth sharing that the problem occurs even with nbthread = 1. I've 
>>>>>> managed
>>>>>> to confirm it today.
>>>>>>
>>>>>>
>>>>>> I'm curious is it crashed at the same place with the same value?
>>>>>>
>>>>>> --
>>>>>> wbr, Kirill
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>

Reply via email to