Maciej,

Looks like memory corruption is still here but it corrupt just some another 
place.

Willy do you agree?

--
wbr, Kirill

> On 2. Nov 2020, at 15:34, Maciej Zdeb <mac...@zdeb.pl> wrote:
> 
> So after Kirill suggestion to modify h2s struct in a way that tasklet 
> "shut_tl" is before recv_wait I verified if in 2.2.4 the same crash will 
> occur aaaand it did not!
> 
> After the patch that merges recv_wait and send_wait: 
> http://git.haproxy.org/?p=haproxy-2.2.git;a=commit;h=f96508aae6b49277dcf142caa35042678cf8e2ca
>  
> <http://git.haproxy.org/?p=haproxy-2.2.git;a=commit;h=f96508aae6b49277dcf142caa35042678cf8e2ca>
> and witch such h2s (tasklet shut_tl before wait_event subs) the crashes are 
> gone:
> 
> struct h2s {
>         [...]
>         struct buffer rxbuf; /* receive buffer, always valid (buf_empty or 
> real buffer) */
>         struct tasklet *shut_tl;  /* deferred shutdown tasklet, to retry to 
> send an RST after we failed to,
>                                    * in case there's no other subscription to 
> do it */
>         struct wait_event *subs;      /* recv wait_event the conn_stream 
> associated is waiting on (via h2_subscribe) */
>         struct list list; /* To be used when adding in h2c->send_list or 
> h2c->fctl_lsit */
> };
> 
> 
> 
> pon., 2 lis 2020 o 12:42 Maciej Zdeb <mac...@zdeb.pl <mailto:mac...@zdeb.pl>> 
> napisał(a):
> Great idea Kirill,
> 
> With such modification:
> 
> struct h2s {
>         [...]
>         struct tasklet *shut_tl;
>         struct wait_event *recv_wait; /* recv wait_event the conn_stream 
> associated is waiting on (via h2_subscribe) */
>         struct wait_event *send_wait; /* send wait_event the conn_stream 
> associated is waiting on (via h2_subscribe) */
>         struct list list; /* To be used when adding in h2c->send_list or 
> h2c->fctl_lsit */
> };
> 
> it crashed just like before.
> 
> pon., 2 lis 2020 o 11:12 Kirill A. Korinsky <kir...@korins.ky 
> <mailto:kir...@korins.ky>> napisał(a):
> Hi,
> 
> Thanks for update.
> 
> After read Wully's recommendation and provided commit that fixed an issue I'm 
> curious can you "edit" a bit this commit and move `shut_tl` before 
> `recv_wait` instead of removed `wait_event`?
> 
> It is a quiet dummy way to confirm that memory corruption had gone, and not 
> just moved to somewhere else.
> 
> --
> wbr, Kirill
> 
>> On 2. Nov 2020, at 10:58, Maciej Zdeb <mac...@zdeb.pl 
>> <mailto:mac...@zdeb.pl>> wrote:
>> 
>> Hi,
>> 
>> Update for people on the list that might be interested in the issue, because 
>> part of discussion was private.
>> 
>> I wanted to check Willy suggestion and modified h2s struct (added dummy 
>> fields):
>> 
>> struct h2s {
>>         [...]
>>         uint16_t status;     /* HTTP response status */
>>         unsigned long long body_len; /* remaining body length according to 
>> content-length if H2_SF_DATA_CLEN */
>>         struct buffer rxbuf; /* receive buffer, always valid (buf_empty or 
>> real buffer) */
>>         int dummy0;
>>         struct wait_event wait_event; /* Wait list, when we're attempting to 
>> send a RST but we can't send */
>>         int dummy1;
>>         struct wait_event *recv_wait; /* recv wait_event the conn_stream 
>> associated is waiting on (via h2_subscribe) */
>>         int dummy2;
>>         struct wait_event *send_wait; /* send wait_event the conn_stream 
>> associated is waiting on (via h2_subscribe) */
>>         int dummy3;
>>         struct list list; /* To be used when adding in h2c->send_list or 
>> h2c->fctl_lsit */
>>         struct list sending_list; /* To be used when adding in 
>> h2c->sending_list */
>> };
>> 
>> With such modified h2s struct, the crash did not occur.
>> 
>> I've checked HAProxy 2.1, it crashes like 2.0.
>> 
>> I've also checked 2.2, bisection showed that this commit: 
>> http://git.haproxy.org/?p=haproxy-2.2.git;a=commitdiff;h=5723f295d85febf5505f8aef6afabb6b23d6fdec;hp=f11be0ea1e8e571234cb41a2fcdde2cf2161df37
>>  
>> <http://git.haproxy.org/?p=haproxy-2.2.git;a=commitdiff;h=5723f295d85febf5505f8aef6afabb6b23d6fdec;hp=f11be0ea1e8e571234cb41a2fcdde2cf2161df37>
>>  fixed the crashes we experienced. I'm not sure how/if it fixed the memory 
>> corruption, it is possible that memory is still corrupted but not causing 
>> the crash.
>> 
>> 
>> 
>> pt., 25 wrz 2020 o 16:25 Kirill A. Korinsky <kir...@korins.ky 
>> <mailto:kir...@korins.ky>> napisał(a):
>> Very interesting.
>> 
>> Anyway, I can see that this pice of code was refactored some time ago: 
>> https://github.com/haproxy/haproxy/commit/f96508aae6b49277dcf142caa35042678cf8e2ca
>>  
>> <https://github.com/haproxy/haproxy/commit/f96508aae6b49277dcf142caa35042678cf8e2ca>
>> 
>> Maybe it is worth to try 2.2 or 2.3 branch?
>> 
>> Yes, it is a blind shot and just a guess.
>> 
>> --
>> wbr, Kirill
>> 
>>> On 25. Sep 2020, at 16:01, Maciej Zdeb <mac...@zdeb.pl 
>>> <mailto:mac...@zdeb.pl>> wrote:
>>> 
>>> Yes at the same place with same value:
>>> 
>>> (gdb) bt full
>>> #0  0x0000559ce98b964b in h2s_notify_recv (h2s=0x559cef94e7e0) at 
>>> src/mux_h2.c:783
>>>         sw = 0xffffffff
>>> 
>>> 
>>> 
>>> pt., 25 wrz 2020 o 15:42 Kirill A. Korinsky <kir...@korins.ky 
>>> <mailto:kir...@korins.ky>> napisał(a):
>>> > On 25. Sep 2020, at 15:26, Maciej Zdeb <mac...@zdeb.pl 
>>> > <mailto:mac...@zdeb.pl>> wrote:
>>> >
>>> > I was mailing outside the list with Willy and Christopher but it's worth 
>>> > sharing that the problem occurs even with nbthread = 1. I've managed to 
>>> > confirm it today.
>>> 
>>> 
>>> I'm curious is it crashed at the same place with the same value?
>>> 
>>> --
>>> wbr, Kirill
>>> 
>>> 
>> 
> 

Attachment: signature.asc
Description: Message signed with OpenPGP

Reply via email to