>
> 20180629.1347 mpeh2 fd25 h2c_error - st04 fl00000002 err05
> Just hit h2c_error - H2_ERR_STREAM_CLOSED
>

After adding more debug I found following pattern around h2c_error in
hanging connections:

... everything OK until now
20180629.1826 0000e901:backend.srvrep[000e:001a]: HTTP/1.1 200 OK
20180629.1826 0000e901:backend.srvcls[000e:adfd]
20180629.1826 mpeh2 fd14 h2s_close/h2c  - id0000006d h2c_st04
h2c_fl00000000 streams:12
20180629.1826 mpeh2 fd14 h2s_close/real - id0000006d st04 fl00003101
streams:12 -> 11

h2c_error
20180629.1826 mpeh2 fd14 h2_process_demux/11 - st04 fl00000000
20180629.1826 mpeh2 fd14 h2c_error - st04 fl00000000 err05
20180629.1826 0000e8e7:backend.srvcls[000e:adfd]
20180629.1826 mpeh2 fd14 h2_process_mux/01a - st06 fl00000000
20180629.1826 mpeh2 fd14 h2_process_mux/01b - st07 fl00000100
20180629.1826 0000e8dd:backend.srvcls[000e:adfd]

few more streams closed before before log for file descriptor 14 ends and
connection hangs in CLOSE_WAIT
20180629.1827 x01.clicls[000e:adfd]
20180629.1827 0000e8dd:backend.closed[000e:adfd]
20180629.1827 0000e8df:backend.clicls[000e:adfd]
20180629.1827 0000e8df:backend.closed[000e:adfd]
20180629.1827 mpeh2 fd14 h2s_destroy - id00000045 st06 fl00003081 streams:11
20180629.1827 mpeh2 fd14 h2s_close/h2c  - id00000045 h2c_st07
h2c_fl00000100 streams:11
20180629.1827 mpeh2 fd14 h2s_close/real - id00000045 st06 fl00003081
streams:11 -> 10

I have not seen this pattern (h2_process_demux/11 followed by h2c_error and
h2_process_mux/01a + h2_process_mux/01b)
in other conncetions. Only in those in CLOSE_WAIT state.

Here is the piece of code with added debug h2_process_demux/11 from sources
of haproxy 1.8.12

/* RFC7540#5.1:closed: if this state is reached as a
* result of sending a RST_STREAM frame, the peer that
* receives the RST_STREAM might have already sent
* frames on the stream that cannot be withdrawn. An
* endpoint MUST ignore frames that it receives on
* closed streams after it has sent a RST_STREAM
* frame. An endpoint MAY choose to limit the period
* over which it ignores frames and treat frames that
* arrive after this time as being in error.
*/
if (!(h2s->flags & H2_SF_RST_SENT)) {
/* RFC7540#5.1:closed: any frame other than
* PRIO/WU/RST in this state MUST be treated as
* a connection error
*/
if (h2c->dft != H2_FT_RST_STREAM &&
    h2c->dft != H2_FT_PRIORITY &&
    h2c->dft != H2_FT_WINDOW_UPDATE) {
send_log(NULL, LOG_NOTICE, "mpeh2 fd%d h2_process_demux/11 - st%02x
fl%08x\n", mpeh2_h2c_fd(h2c), mpeh2_h2c_st0(h2c), mpeh2_h2c_flags(h2c));
h2c_error(h2c, H2_ERR_STREAM_CLOSED);
goto strm_err;
}
}

Here is the piece of code with added debug h2_process_mux/01a and 01b from
sources of haproxy 1.8.12

fail:
if (unlikely(h2c->st0 >= H2_CS_ERROR)) {
send_log(NULL, LOG_NOTICE, "mpeh2 fd%d h2_process_mux/01a - st%02x
fl%08x\n", mpeh2_h2c_fd(h2c), mpeh2_h2c_st0(h2c), mpeh2_h2c_flags(h2c));
if (h2c->st0 == H2_CS_ERROR) {
if (h2c->max_id >= 0) {
h2c_send_goaway_error(h2c, NULL);
if (h2c->flags & H2_CF_MUX_BLOCK_ANY)
return 0;
}

h2c->st0 = H2_CS_ERROR2; // sent (or failed hard) !
}
send_log(NULL, LOG_NOTICE, "mpeh2 fd%d h2_process_mux/01b - st%02x
fl%08x\n", mpeh2_h2c_fd(h2c), mpeh2_h2c_st0(h2c), mpeh2_h2c_flags(h2c));
return 1;
}

I hope this helps in isolating the problem. If it's not enough, I can add
more debug to h2_mux if someone with better knowledge of source code and h2
protocol suggests where.

Milan

Reply via email to