Re: Issues with redirects for POST requests with payload

2017-05-10 Thread Jarno Huuskonen
Hi,

On Tue, May 09, Ciprian Dorin Craciun wrote:
> On Tue, May 9, 2017 at 9:47 PM, Willy Tarreau  wrote:
> > On Tue, May 09, 2017 at 02:54:45PM +0300, Jarno Huuskonen wrote:
> >> My firefox(52.1 on linux) was able to send 128k file,
> >> but 800k file results in connection reset. My chrome sent 16k file, but
> >> fails (ERR_CONNECTION_RESET) on 17k file (few times even the 17k file
> >> worked).
> 
> My particular request is 17575 octets in total (headers + body), and
> as Jarno observed, it sometimes works in Firefox / Chromium, but most
> of the time Chromium trips over it.
> 
> The "non-determinism" of the issue comes down to whether the
> browser manages to push all its payload over the network before the
> response from HAProxy is received.  (If the network latency were
> zero, and the receive window on the HAProxy side were extremely small,
> this issue would happen every time.)

POSTing (sending) a big enough file makes firefox/chrome consistently
fail with a connection reset (a few MB seems to be enough to reliably
trigger the reset).

> Based on a few `tcpdump` captures the issue can be described in terms
> of TCP as this:
> * the client sends its headers, and starts sending the request body;
> * HAProxy receives the headers, determines it's a redirect;
> * HAProxy writes the response status line and headers, and closes the
> connection via a reset packet;
> * meanwhile, while the client is still writing its payload to the
> socket, the reset packet is received, which puts the client into an
> error state (because it tries to write to a closed socket);
> * thus the client just errors out without reading the response;
> 
> Sometimes the reset packet is received after the client has finished
> writing, thus the condition is not encountered any more.
> 
> (On explicit request over private email, I can provide a small tcpdump
> capture displaying the behaviour described herein.)
> 
> 
> 
> 
> > Hmmm that sounds bad, it looks like we've broken something again.
> 
> 
> I don't think HAProxy was "broken" this time, as we have been using
> HAProxy 1.6.11 since last November, and only recently (two or three
> weeks ago) did we start encountering this issue, without any major
> changes to either the HAProxy configuration or the web application
> where we encountered it.
> 
> In fact I think the browsers "broke" something, as we encounter this
> only with recent versions of Chromium and Firefox.

I tested with firefox esr(linux) 24.5.0 -> same connection reset
behaviour.

-Jarno

> However, since there are far fewer HAProxy deployments than
> browsers (say 50.000 to 1 in our deployment), I think the easiest
> component to fix is HAProxy, by making it "drain" the entire
> connection before closing it.  (Although I have no idea how
> complicated it is to do this.)
> 
> 
> 
> 
> > What status code are you facing there ? I remember we've had such
> > an issue in the past where the server timeout could expire during
> > a long upload because it was not being refreshed during the upload
> > (it would thus result in a 504).
> 
> 
> This is the point, there is no "error" situation, as HAProxy just
> issues the redirect status line and headers and then resets the
> connection.
> 
> Thanks,
> Ciprian.

-- 
Jarno Huuskonen - System Administrator |  jarno.huuskonen atsign uef.fi



Re: Issues with redirects for POST requests with payload

2017-05-10 Thread Jarno Huuskonen
Hi,

On Tue, May 09, Willy Tarreau wrote:
> On Tue, May 09, 2017 at 02:54:45PM +0300, Jarno Huuskonen wrote:
> > My firefox(52.1 on linux) was able to send 128k file,
> > but 800k file results in connection reset. My chrome sent 16k file, but
> > fails (ERR_CONNECTION_RESET) on 17k file (few times even the 17k file
> > worked).
> 
> Hmmm that sounds bad, it looks like we've broken something again.

If it's a haproxy error, then it's probably a pretty old one. I'm getting
chrome (ERR_CONNECTION_RESET) errors with 1.5dev18, 1.5.2, 1.5.18 ...

> What status code are you facing there ? I remember we've had such

chrome ERR_CONNECTION_RESET (sending 2M file):
haproxy[22170]: 127.0.0.1:53466 [10/May/2017:10:54:15.427] test test/<NOSRV> 0/-1/-1/-1/1 302 122 - - LR-- 0/0/0/0/0 0/0 "POST /invalid HTTP/1.1"

chrome ok (sending 15k file):
haproxy[22170]: 127.0.0.1:53526 [10/May/2017:10:58:19.415] test test/<NOSRV> 0/-1/-1/-1/0 302 122 - - LR-- 0/0/0/0/0 0/0 "POST /invalid HTTP/1.1"
haproxy[22170]: 127.0.0.1:33674 [10/May/2017:10:58:19.423] test~ test/<NOSRV> -1/-1/-1/-1/4 400 187 - - CR-- 1/1/0/0/0 0/0 "<BADREQ>"
haproxy[22170]: 127.0.0.1:33676 [10/May/2017:10:58:19.424] test~ test/<NOSRV> -1/-1/-1/-1/4 400 187 - - CR-- 0/0/0/0/0 0/0 "<BADREQ>"
haproxy[22170]: 127.0.0.1:33678 [10/May/2017:10:58:19.430] test~ test/<NOSRV> -1/-1/-1/-1/2 400 187 - - CR-- 0/0/0/0/0 0/0 "<BADREQ>"
haproxy[22170]: 127.0.0.1:33680 [10/May/2017:10:58:19.434] test~ test_be/wp1 0/0/3/2/5 404 429 - - ---- 1/1/0/0/0 0/0 "GET /invalid HTTP/1.1"
(The backend doesn't have /invalid, hence the 404.)
(The 400 BADREQ entries are chrome pre-connects, I think; they disappear with
option http-ignore-probes.)

chrome ok (sending 17k file) (and option http-ignore-probes). Most of
the time sending 17k fails:
haproxy[22541]: 127.0.0.1:53630 [10/May/2017:11:04:03.022] test test/<NOSRV> 0/-1/-1/-1/0 302 122 - - LR-- 0/0/0/0/0 0/0 "POST /invalid HTTP/1.1"
haproxy[22541]: 127.0.0.1:33784 [10/May/2017:11:04:03.041] test~ test_be/wp1 0/0/3/1/5 404 429 - - ---- 1/1/0/0/0 0/0 "GET /invalid HTTP/1.1"
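
For anyone reading the log lines above, here is a small Python sketch that
splits a line in the "option httplog" layout into named fields and spells out
the two termination flags that matter in this thread. The regex, the function
name parse_httplog and the field names are my own for illustration, not
something haproxy ships. The sample line is adapted from the 2M upload above,
with the server written as <NOSRV>, which is an assumption about what the
archive dropped. Only the flag letters seen in this thread are decoded.

```python
import re

# Layout of "option httplog" lines as documented for haproxy; names are illustrative.
HTTPLOG = re.compile(
    r'(?P<client>\S+) \[(?P<date>[^\]]+)\] (?P<frontend>\S+) '
    r'(?P<backend>[^/\s]+)/(?P<server>\S*) '
    r'(?P<timers>[-\d/]+) (?P<status>\d{3}) (?P<bytes>\d+) '
    r'(?P<req_cookie>\S+) (?P<res_cookie>\S+) (?P<term_state>\S{4}) '
    r'(?P<conns>[\d/]+) (?P<queues>[\d/]+) "(?P<request>[^"]*)"'
)

# Only the termination flags that appear in this thread are listed here.
FIRST = {"-": "normal", "C": "aborted by the client",
         "L": "handled locally by haproxy, not passed to a server"}
SECOND = {"-": "normal", "R": "while waiting for a complete request"}


def parse_httplog(line):
    """Return a dict of fields for one httplog line, or None if it does not match."""
    match = HTTPLOG.search(line)
    if match is None:
        return None
    rec = match.groupdict()
    rec["status"] = int(rec["status"])
    rec["timers"] = [int(t) for t in rec["timers"].split("/")]  # -1 means "not reached"
    state = rec["term_state"]
    rec["explained"] = (FIRST.get(state[0], "other (see the haproxy docs)"),
                        SECOND.get(state[1], "other (see the haproxy docs)"))
    return rec


# Assumed sample: the 2M upload line, with <NOSRV> as the server name.
SAMPLE = ('haproxy[22170]: 127.0.0.1:53466 [10/May/2017:10:54:15.427] test '
          'test/<NOSRV> 0/-1/-1/-1/1 302 122 - - LR-- 0/0/0/0/0 0/0 '
          '"POST /invalid HTTP/1.1"')

if __name__ == "__main__":
    rec = parse_httplog(SAMPLE)
    print(rec["status"], rec["term_state"], rec["explained"])
```

Note that the failing and the working uploads log the same 302 / LR-- line, so
the reset is not visible in the haproxy log at all.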

> an issue in the past where the server timeout could expire during
> a long upload because it was not being refreshed during the upload
> (it would thus result in a 504).

My test config:
global
    log /dev/log local2 info
    stats socket /tmp/stats level admin

defaults
    mode http
    log global
    option httplog
    retries 2
    timeout queue   2s
    timeout connect 1500ms
    timeout client  30s
    timeout server  30s
    timeout http-keep-alive 4s
    timeout http-request 3500ms
    timeout check   1700ms
    timeout tarpit 3s

frontend test
    option http-ignore-probes
    bind ipv4@127.0.0.1:80
    bind ipv4@127.0.0.1:443 ssl crt ./crt.pem
    acl is_plain dst_port 80
    redirect scheme https code 302 if is_plain #!METH_POST

    default_backend test_be

backend test_be
    acl is_plain dst_port 80
    server dummy1 some.apache.server.ip:443 ssl verify none

This is the html form I'm testing with:



<form action="http://127.0.0.1/invalid" method="post"
enctype="multipart/form-data">
Select image to upload:






-Jarno

-- 
Jarno Huuskonen



Re: Issues with redirects for POST requests with payload

2017-05-09 Thread Ciprian Dorin Craciun
On Tue, May 9, 2017 at 9:47 PM, Willy Tarreau  wrote:
> On Tue, May 09, 2017 at 02:54:45PM +0300, Jarno Huuskonen wrote:
>> My firefox(52.1 on linux) was able to send 128k file,
>> but 800k file results in connection reset. My chrome sent 16k file, but
>> fails (ERR_CONNECTION_RESET) on 17k file (few times even the 17k file
>> worked).


My particular request is 17575 octets in total (headers + body), and
as Jarno observed, it sometimes works in Firefox / Chromium, but most
of the time Chromium trips over it.

The "non-determinism" of the issue comes down to whether the
browser manages to push all its payload over the network before the
response from HAProxy is received.  (If the network latency were
zero, and the receive window on the HAProxy side were extremely small,
this issue would happen every time.)

Based on a few `tcpdump` captures the issue can be described in terms
of TCP as this:
* the client sends its headers, and starts sending the request body;
* HAProxy receives the headers, determines it's a redirect;
* HAProxy writes the response status line and headers, and closes the
connection via a reset packet;
* meanwhile, while the client is still writing its payload to the
socket, the reset packet is received, which puts the client into an
error state (because it tries to write to a closed socket);
* thus the client just errors out without reading the response;

Sometimes the reset packet is received after the client has finished
writing, thus the condition is not encountered any more.
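
The sequence above can be sketched with plain sockets. This is only a toy,
not HAProxy: the "server" reads the start of the request, writes a redirect
and closes straight away while the client is still pushing a large body. The
names redirect_and_close, post_large_body and demo, the 16 MB body size and
the canned response are all assumptions made for the illustration. Whether
the client still manages to read the response after the reset depends on
timing and on the OS, so the sketch only reports what it saw.

```python
import socket
import threading

RESPONSE = (b"HTTP/1.1 302 Found\r\n"
            b"Location: https://127.0.0.1/invalid\r\n"
            b"Content-Length: 0\r\nConnection: close\r\n\r\n")


def redirect_and_close(listener):
    """Toy server: read the start of the request, answer, close at once."""
    conn, _ = listener.accept()
    conn.recv(4096)         # the headers (and maybe a bit of body); the rest stays unread
    conn.sendall(RESPONSE)
    conn.close()            # body bytes still arriving are answered with a reset


def post_large_body(port, body_size):
    """Return (write_error, response_bytes) as seen by a simple blocking client."""
    head = (b"POST /invalid HTTP/1.1\r\nHost: invalid.example.com\r\n"
            b"Content-Length: %d\r\n\r\n" % body_size)
    write_error, response = None, b""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
        try:
            sock.sendall(head + b"x" * body_size)
        except OSError as exc:      # typically ECONNRESET or EPIPE
            write_error = exc
        try:
            response = sock.recv(4096)
        except OSError:
            pass                    # the response may be unreadable after a reset
    return write_error, response


def demo(body_size=16 * 1024 * 1024):
    """Run the toy server and client once on a free loopback port."""
    listener = socket.create_server(("127.0.0.1", 0))
    worker = threading.Thread(target=redirect_and_close, args=(listener,))
    worker.start()
    try:
        return post_large_body(listener.getsockname()[1], body_size)
    finally:
        worker.join()
        listener.close()


if __name__ == "__main__":
    err, resp = demo()
    print("write error:", err)
    print("response bytes read:", len(resp))
```

The body is deliberately much larger than the socket buffers, so the write
cannot complete before the close; with a body of a few kB the same client
usually gets through, which matches the "sometimes it works" observations.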

(On explicit request over private email, I can provide a small tcpdump
capture displaying the behaviour described herein.)




> Hmmm that sounds bad, it looks like we've broken something again.


I don't think HAProxy was "broken" this time, as we have been using
HAProxy 1.6.11 since last November, and only recently (two or three
weeks ago) did we start encountering this issue, without any major
changes to either the HAProxy configuration or the web application
where we encountered it.

In fact I think the browsers "broke" something, as we encounter this
only with recent versions of Chromium and Firefox.


However, since there are far fewer HAProxy deployments than
browsers (say 50.000 to 1 in our deployment), I think the easiest
component to fix is HAProxy, by making it "drain" the entire
connection before closing it.  (Although I have no idea how
complicated it is to do this.)
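
To illustrate what "draining" could look like at the socket level, here is a
minimal Python sketch of a staged close: send the response, half-close the
write side, then keep reading and discarding until the client is done or a
timeout expires. This is not how HAProxy is implemented and not a patch; the
names redirect_then_drain, post_body, demo and linger_timeout are made up for
the example, and a real proxy would also want to cap how much it is willing
to drain.

```python
import socket
import threading

RESPONSE = (b"HTTP/1.1 302 Found\r\n"
            b"Location: https://127.0.0.1/invalid\r\n"
            b"Content-Length: 0\r\nConnection: close\r\n\r\n")


def redirect_then_drain(listener, linger_timeout=5.0):
    """Toy server: answer early, then discard the rest of the request."""
    conn, _ = listener.accept()
    with conn:
        conn.settimeout(linger_timeout)
        conn.recv(4096)                  # the headers (and maybe some body)
        conn.sendall(RESPONSE)
        conn.shutdown(socket.SHUT_WR)    # FIN: nothing more to send
        try:
            while conn.recv(65536):      # throw away whatever the client still sends
                pass
        except OSError:
            pass                         # timeout or client reset: stop draining


def post_body(port, body_size):
    """Return (write_error, response_bytes) as seen by a blocking client."""
    request = (b"POST /invalid HTTP/1.1\r\nHost: invalid.example.com\r\n"
               b"Content-Length: %d\r\n\r\n" % body_size) + b"x" * body_size
    write_error, response = None, b""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
        try:
            sock.sendall(request)
        except OSError as exc:
            write_error = exc
        try:
            while True:                  # read until the server's FIN
                chunk = sock.recv(4096)
                if not chunk:
                    break
                response += chunk
        except OSError:
            pass
    return write_error, response


def demo(body_size=4 * 1024 * 1024):
    """Run the toy server and client once on a free loopback port."""
    listener = socket.create_server(("127.0.0.1", 0))
    worker = threading.Thread(target=redirect_then_drain, args=(listener,))
    worker.start()
    try:
        return post_body(listener.getsockname()[1], body_size)
    finally:
        worker.join()
        listener.close()


if __name__ == "__main__":
    err, resp = demo()
    print("write error:", err)
    print(resp.split(b"\r\n", 1)[0].decode())
```

With the drain in place the client finishes its write and then reads the 302,
at the cost of the server receiving a body it never uses.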




> What status code are you facing there ? I remember we've had such
> an issue in the past where the server timeout could expire during
> a long upload because it was not being refreshed during the upload
> (it would thus result in a 504).


This is the point, there is no "error" situation, as HAProxy just
issues the redirect status line and headers and then resets the
connection.

Thanks,
Ciprian.



Re: Issues with redirects for POST requests with payload

2017-05-09 Thread Willy Tarreau
On Tue, May 09, 2017 at 02:54:45PM +0300, Jarno Huuskonen wrote:
> My firefox(52.1 on linux) was able to send 128k file,
> but 800k file results in connection reset. My chrome sent 16k file, but
> fails (ERR_CONNECTION_RESET) on 17k file (few times even the 17k file
> worked).

Hmmm that sounds bad, it looks like we've broken something again.
What status code are you facing there ? I remember we've had such
an issue in the past where the server timeout could expire during
a long upload because it was not being refreshed during the upload
(it would thus result in a 504).

Willy



Re: Issues with redirects for POST requests with payload

2017-05-09 Thread Jarno Huuskonen
Hi,

On Sat, May 06, Ciprian Dorin Craciun wrote:
> Hello all!
> 
> In the last few weeks I've started encountering a problem that, for our
> particular use-case, is seriously breaking some of our sites, namely:
> 
> * a client makes a POST request which has a "largish" payload, one
> that it does not manage to "push" through before HAProxy has a chance
> to respond,
> * if HAProxy is configured to redirect such a request (like for
> example upgrading HTTP to HTTPS),
> * then HAProxy responds with the redirect, and closes the connection;
> but the client has not yet been able to push its POST body and
> receives a write error, and thus it aborts without trying to read the
> response from HAProxy;
> 
> 
> One can easily reproduce this with:
> (
>   printf -- 'POST /invalid HTTP/1.1\r\nHost: invalid.example.com\r\n\r\n'
>   dd if=/dev/urandom bs=1024 count=4 | base64
> ) \
> | socat -d -d -v tcp:127.0.0.1:80,sndbuf=16 stdio
> 
> , which results in a connection reset, as `socat` is trying to push
> data to a closed socket.
> 
> (Via private email I can give an actual `tcpdump` capture with production 
> data.)
> 
> 
> 
> 
> Unfortunately this issue doesn't impact a "random" client but recent
> versions of Firefox and Chrome, which just display a "connection
> reset" kind of message to the users.

My firefox(52.1 on linux) was able to send 128k file,
but 800k file results in connection reset. My chrome sent 16k file, but
fails (ERR_CONNECTION_RESET) on 17k file (few times even the 17k file
worked).
(tested with haproxy 1.8dev version).
(curl seems to handle 800k file upload ok, wget seems to work up to 64k
(very limited testing)).

This http-response set-status (ugly hack/workaround) might keep chrome/firefox
happy:
frontend test
    bind ipv4@127.0.0.1:80
    bind ipv4@127.0.0.1:443 ssl crt ./crt.pem
    acl is_plain dst_port 80
    redirect scheme https code 302 if is_plain !METH_POST
    default_backend test_be

backend test_be
    acl is_plain dst_port 80
    http-response set-status 302 reason "found\r\nLocation: https://127.0.0.1/invalid" if METH_POST is_plain

It might be possible to use vars to have dynamic Location:
Frontend:
http-request set-header X-DUMMY "https://"%[req.hdr(Host),regsub(:.+$,,g)]":443"%[path]
http-request set-var(sess.loc) hdr(X-DUMMY)

Backend:
http-response redirect code 302 location %[var(sess.loc)] if METH_POST is_plain
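
To make it concrete what the Location expression above evaluates to, here is
the same string manipulation as a tiny Python sketch. The function name
redirect_location is made up for illustration; the regex mirrors
regsub(:.+$,,g) and the path argument stands in for %[path] (the path only,
without the query string).

```python
import re


def redirect_location(host_header, path):
    """Mimic "https://" + Host without its port + ":443" + path."""
    host = re.sub(r":.+$", "", host_header)   # same pattern as regsub(:.+$,,g)
    return "https://" + host + ":443" + path


if __name__ == "__main__":
    print(redirect_location("127.0.0.1:80", "/invalid"))
    print(redirect_location("example.com", "/invalid"))
```

One caveat of the pattern: it cuts at the first colon, so a bracketed IPv6
Host such as [::1]:80 would be reduced to "[" and needs different handling.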

-Jarno

-- 
Jarno Huuskonen



Re: Issues with redirects for POST requests with payload

2017-05-08 Thread Ciprian Dorin Craciun
On Sat, May 6, 2017 at 11:13 AM, Ciprian Dorin Craciun
 wrote:
> Hello all!
>
> In the last few weeks I've started encountering a problem that, for our
> particular use-case, is seriously breaking some of our sites, namely:
>
> * a client makes a POST request which has a "largish" payload, one
> that it does not manage to "push" through before HAProxy has a chance
> to respond,
> * if HAProxy is configured to redirect such a request (like for
> example upgrading HTTP to HTTPS),
> * then HAProxy responds with the redirect, and closes the connection;
> but the client has not yet been able to push its POST body and
> receives a write error, and thus it aborts without trying to read the
> response from HAProxy;
>
>
> One can easily reproduce this with:
> (
>   printf -- 'POST /invalid HTTP/1.1\r\nHost: invalid.example.com\r\n\r\n'
>   dd if=/dev/urandom bs=1024 count=4 | base64
> ) \
> | socat -d -d -v tcp:127.0.0.1:80,sndbuf=16 stdio
>
> , which results in a connection reset, as `socat` is trying to push
> data to a closed socket.
>
> (Via private email I can give an actual `tcpdump` capture with production 
> data.)
>
>
>
>
> Unfortunately this issue doesn't impact a "random" client but recent
> versions of Firefox and Chrome, which just display a "connection
> reset" kind of message to the users.
>
>
> I've tried searching for a similar problem, and found these:
>
>   http://haproxy.formilux.narkive.com/9xhXJk4f/redirecting-on-a-large-post-without-reading-it-entirely
>   http://haproxy.formilux.narkive.com/gYztlqms/fwd-302-to-502-error
>
>
> But it's not clear to me whether these issues have been fixed in the
> almost 8 years since, or how I should proceed in solving this issue
> myself.  (I'm open to applying patches and re-compiling HAProxy.)
>
>
> Increasing `tune.bufsize` to 128k doesn't seem to help either.
>
> (I am using HAProxy 1.6.11.)




Just wanted to "ping" this thread, as perhaps sending the original
email on a weekend got it "forgotten".  :)

Or perhaps nobody has hit this issue in production (yet)?

Thanks,
Ciprian.



Re: Issues with redirects for POST requests with payload

2017-05-06 Thread Ciprian Dorin Craciun
Forgot to mention that it involves HAProxy 1.6.11.

Ciprian.


On Sat, May 6, 2017 at 11:13 AM, Ciprian Dorin Craciun
 wrote:
> Hello all!
>
> In the last few weeks I've started encountering a problem that, for our
> particular use-case, is seriously breaking some of our sites, namely:
>
> * a client makes a POST request which has a "largish" payload, one
> that it does not manage to "push" through before HAProxy has a chance
> to respond,
> * if HAProxy is configured to redirect such a request (like for
> example upgrading HTTP to HTTPS),
> * then HAProxy responds with the redirect, and closes the connection;
> but the client has not yet been able to push its POST body and
> receives a write error, and thus it aborts without trying to read the
> response from HAProxy;
>
>
> One can easily reproduce this with:
> (
>   printf -- 'POST /invalid HTTP/1.1\r\nHost: invalid.example.com\r\n\r\n'
>   dd if=/dev/urandom bs=1024 count=4 | base64
> ) \
> | socat -d -d -v tcp:127.0.0.1:80,sndbuf=16 stdio
>
> , which results in a connection reset, as `socat` is trying to push
> data to a closed socket.
>
> (Via private email I can give an actual `tcpdump` capture with production 
> data.)
>
>
>
>
> Unfortunately this issue doesn't impact a "random" client but recent
> versions of Firefox and Chrome, which just display a "connection
> reset" kind of message to the users.
>
>
> I've tried searching for a similar problem, and found these:
>
>   http://haproxy.formilux.narkive.com/9xhXJk4f/redirecting-on-a-large-post-without-reading-it-entirely
>   http://haproxy.formilux.narkive.com/gYztlqms/fwd-302-to-502-error
>
>
> But it's not clear to me whether these issues have been fixed in the
> almost 8 years since, or how I should proceed in solving this issue
> myself.  (I'm open to applying patches and re-compiling HAProxy.)
>
>
> Increasing `tune.bufsize` to 128k doesn't seem to help either.
>
>
> Thanks,
> Ciprian.



Issues with redirects for POST requests with payload

2017-05-06 Thread Ciprian Dorin Craciun
Hello all!

In the last few weeks I've started encountering a problem that, for our
particular use-case, is seriously breaking some of our sites, namely:

* a client makes a POST request which has a "largish" payload, one
that it does not manage to "push" through before HAProxy has a chance
to respond,
* if HAProxy is configured to redirect such a request (like for
example upgrading HTTP to HTTPS),
* then HAProxy responds with the redirect, and closes the connection;
but the client has not yet been able to push its POST body and
receives a write error, and thus it aborts without trying to read the
response from HAProxy;


One can easily reproduce this with:
(
  printf -- 'POST /invalid HTTP/1.1\r\nHost: invalid.example.com\r\n\r\n'
  dd if=/dev/urandom bs=1024 count=4 | base64
) \
| socat -d -d -v tcp:127.0.0.1:80,sndbuf=16 stdio

, which results in a connection reset, as `socat` is trying to push
data to a closed socket.

(Via private email I can give an actual `tcpdump` capture with production data.)




Unfortunately this issue doesn't impact a "random" client but recent
versions of Firefox and Chrome, which just display a "connection
reset" kind of message to the users.


I've tried searching for a similar problem, and found these:

  http://haproxy.formilux.narkive.com/9xhXJk4f/redirecting-on-a-large-post-without-reading-it-entirely
  http://haproxy.formilux.narkive.com/gYztlqms/fwd-302-to-502-error


But it's not clear to me whether these issues have been fixed in the
almost 8 years since, or how I should proceed in solving this issue
myself.  (I'm open to applying patches and re-compiling HAProxy.)


Increasing `tune.bufsize` to 128k doesn't seem to help either.


Thanks,
Ciprian.