Hi again :-)

On Tue, Oct 13, 2015 at 06:10:33PM +0200, Willy Tarreau wrote:
> I can't reproduce either unfortunately. I'm seeing some other minor
> issues related to how the closed input is handled and showing that
> pipelining doesn't work (only the first request is handled) but that's
> all I'm seeing I'm sorry.
> 
> I've tried injecting on stats in parallel to the other frontend, I've
> tried with close and keep-alive etc... I tried to change the poller
> just in case you would be facing a race condition, no way :-(
> 
> In general it's good to keep in mind that buffer_slow_realign() is
> called to realign wrapped requests, so that normally means that
> pipelining is needed. But even then for now I can't succeed.

As usual, sending an e-mail scares the bug and it starts to shake the
white flag :-)

So by configuring the buffer size to 10000 and sending large 8kB requests,
I'm seeing random behaviour. First, most of the time I end up with a
stuck session which never ends (no expiration timer set). And from time
to time it may crash. This time it was not in buffer_slow_realign() but
in buffer_insert_line2(), though the problem is the same :

(gdb) up
#2  0x000000000046e094 in http_header_add_tail2 (msg=0x7ce628, hdr_idx=0x7ce5c8, text=0x53b339 "Connection: close", len=17) at src/proto_http.c:595
595             bytes = buffer_insert_line2(msg->chn->buf, msg->chn->buf->p + msg->eoh, text, len);

(gdb) p msg->eoh
$6 = 8057
(gdb) p *msg->chn->buf
$7 = {p = 0x7f8e7b44bf9e "3456789.123456789\n", 'P' <repeats 182 times>..., size = 10008, i = 0, o = 8058, data = 0x7f8e7b44a024 "GET /1234567"}

(gdb) p msg->chn->buf->p - msg->chn->buf->data
$8 = 8058

As one may notice, since p is already 8kB from the beginning of the buffer
(hence 2kB from the end), writing at p + eoh is definitely wrong. So either
msg->eoh or buf->p is wrong here.
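
To make the arithmetic explicit, here is a minimal sketch of the bounds check
using the values from the gdb session above. The struct and both helper names
are hypothetical and only model the two quantities that matter here (the
offset of p from data, and size); this is not HAProxy's real buffer structure,
and it deliberately ignores wrapping.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of the values printed by gdb.
 * This is NOT HAProxy's real struct buffer. */
struct buf_model {
    size_t p_ofs; /* offset of p from the start of the storage (p - data) */
    size_t size;  /* total size of the storage area */
};

/* Returns true if writing <len> bytes at p + <rel> stays inside the
 * storage area without wrapping past its end. */
static bool linear_write_fits(const struct buf_model *b, size_t rel, size_t len)
{
    size_t room;

    if (b->p_ofs > b->size)
        return false;
    room = b->size - b->p_ofs;   /* bytes left between p and the end */
    if (rel > room)
        return false;
    return len <= room - rel;
}

/* Returns how many bytes past the end of the storage p + <rel> points,
 * or 0 if it is still inside. */
static size_t overshoot(const struct buf_model *b, size_t rel)
{
    size_t pos = b->p_ofs + rel;

    return pos > b->size ? pos - b->size : 0;
}
```

With p_ofs = 8058, size = 10008 and rel = msg->eoh = 8057, only 1950 bytes
remain after p, while p + eoh lands at offset 16115, that is 6107 bytes past
the end of the storage. So the 17-byte "Connection: close" insertion cannot be
valid with these two values together.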

My opinion here is that buf->p is the wrong one, since we're dealing with an
8kB request, so it should definitely have been realigned. Or maybe the request
was stripped and removed from the request buffer with HTTP processing still
enabled.

All this part is still totally unclear to me, I'm afraid. I suggest that we
don't rush on Lua services and try to fix that during the stable
cycle. I don't want to postpone the release any further for something that
was added very recently and that is not causing any regression to existing
configs.

Best regards,
willy

