Re: Fix triggering of runtime DNS resolution?

2015-09-02 Thread Baptiste
On Thu, Sep 3, 2015 at 1:11 AM, Baptiste  wrote:
> On Thu, Sep 3, 2015 at 12:56 AM, Conrad Hoffmann  
> wrote:
>> Hello,
>>
>> It's kind of late and I am not 100% sure I'm getting this right, so it would
>> be great if someone could double-check this:
>>
>> Essentially, the runtime DNS resolution was never triggered for me. I
>> tracked this down to a signed/unsigned problem in the usage of
>> tick_is_expired() from checks.c:2158.
>>
>> curr_resolution->last_resolution is being initialized to zero
>> (server.c:981), which in turn makes it, say, a few thousand after the value
>> of hold.valid is added (also checks.c:2158). It is then compared to now_ms,
>> which is an unsigned integer so large that it is out of the signed integer
>> range. Thus, the comparison will not get the expected result, as it is done
>> on integer values (now_ms cast to integer gave e.g. -1875721083 a few
>> minutes ago, which is undeniably smaller than 3000).
>>
>> One way to fix this is to initialize curr_resolution->last_resolution to
>> now_ms instead of zero (attached "patch"), but then it only works because
>> both values are converted to negative integers. While I think that this
>> will reasonably hide the problem for the time being, I do think there is a
>> deeper problem here, which is the frequent passing of an unsigned integer
>> into a function that takes signed int as argument.
>>
>> I see that tick_* is used all over the place, so I thought I would rather
>> consult someone before spending lots of time creating a patch that would
>> not be used. Also, I would need some more time to actually figure out what
>> the best solution would be.
>>
>> Does anyone have any thoughts on this? Is someone maybe already aware of 
>> this?
>>
>> Thanks a lot,
>> Conrad
>> --
>> Conrad Hoffmann
>> Traffic Engineer
>>
>> SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany
>>
>> Managing Director: Alexander Ljung | Incorporated in England & Wales
>> with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
>> HRB 110657B
>
>
> Hi Conrad,
>
> I noticed this as well.
> Please apply the attached patch and confirm that it fixes this issue.
>
> I introduced this bug while trying to fix another one: DNS resolution
> was supposed to start with first health check.
> Unfortunately, it started a hold.valid period after HAProxy's start time.
>
> Please confirm that the attached patch fixes this and that DNS queries
> are properly sent at startup (and later).
>
> Baptiste


Hi Conrad,

Please note that the patch in my previous mail is not the definitive one.
I started a private thread with Willy right before your mail to discuss
this point, and I'll send the definitive patch today.

Baptiste



Using getaddrinfo_a on configuration load

2015-09-02 Thread Cyrus Hall

Hi!

I've searched the list and not found much on lengthy HAProxy 
start/config load times.  We tend to run HAProxy with a large number of 
backends (100+), and recently noticed that we are seeing lengthy reload 
times (20+ seconds).  This is most noticeable in locations that are far 
away from our DNS server.  We've traced this back to HAProxy doing
sequential DNS lookups, one address at a time, combined with the long RTT
to our DNS servers (~200ms).  As such, load times tend to be N*M ms, where
N is the number of backends and M is the RTT to the DNS server in ms.


While searching the mailing list, I found a little discussion about
using getaddrinfo_a to make DNS queries asynchronous.  I cannot find
any sign of this work in the latest development branch release.  Is
anyone currently working on it, and if not, is it something that the
project would be interested in seeing?


Cheers,
Cyrus

--

Cyrus Hall | Lead Software Engineer | Twitch | 720-327-0344 | 
cy...@twitch.tv
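
For reference, the glibc API mentioned above can batch lookups roughly as in
the following minimal, standalone sketch (not HAProxy code; the hostnames are
placeholders, and the program links against -lanl):

/* Minimal sketch of glibc's getaddrinfo_a(): submit all lookups at once and
 * then wait for them, so N resolutions cost roughly one round trip instead
 * of N.  Hostnames are placeholders.  Build with: cc demo.c -lanl
 */
#define _GNU_SOURCE
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const char *hosts[] = { "backend1.example.com", "backend2.example.com" };
    int n = sizeof(hosts) / sizeof(hosts[0]);
    struct gaicb *reqs[2];
    int i, ret;

    for (i = 0; i < n; i++) {
        reqs[i] = calloc(1, sizeof(*reqs[i]));
        reqs[i]->ar_name = hosts[i];
    }

    /* submit the whole batch without blocking */
    ret = getaddrinfo_a(GAI_NOWAIT, reqs, n, NULL);
    if (ret) {
        fprintf(stderr, "getaddrinfo_a: %s\n", gai_strerror(ret));
        return 1;
    }

    /* wait for each request; they are resolved concurrently in the
     * background, so total wall time is roughly one lookup's RTT */
    for (i = 0; i < n; i++) {
        while (gai_error(reqs[i]) == EAI_INPROGRESS) {
            const struct gaicb *const wait[1] = { reqs[i] };
            gai_suspend(wait, 1, NULL);
        }
        ret = gai_error(reqs[i]);
        printf("%s: %s\n", hosts[i], ret ? gai_strerror(ret) : "resolved");
        if (!ret)
            freeaddrinfo(reqs[i]->ar_result);
        free(reqs[i]);
    }
    return 0;
}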





Re: Rate Limiting - Stick-Table Entry Expiration

2015-09-02 Thread Hugues Alary
Hi Willy,

Thank you for your detailed and clear answer. I somehow missed it when you
sent it 8 days ago.

> It's the sc0_get_gpc0() which refreshes the entry.
I did not realize that. This is very good to know.

I will definitely try your suggestion and report back here.

-Hugues

On Tue, Aug 25, 2015 at 9:43 AM, Willy Tarreau  wrote:

> Hi Hugues,
>
> On Wed, Aug 19, 2015 at 01:34:46PM -0700, Hugues Alary wrote:
> > Hi there,
> >
> > I've been trying to implement rate limiting for some HTTP POST requests on
> > my website. It works great, except for one detail: the expiration of my
> > entry in my stick-table is always reset to 30 seconds, which means that if
> > the client mistakenly makes a request 29 seconds after being blocked, it
> > will be blocked, again, for 30 seconds.
>
> Note that in general that's what is desired but in your case it could be
> different.
>
> > Here's my config, stripped down to the bare minimum for ease of reading:
> >
> > frontend http-in
> > modehttp
> > option  httplog
> >
> > bind *:80
> >
> > ### Request limiting
> > # Declare stick table
> > stick-table type string size 100k expire 30s store gpc0
> >
> > # Inspect layer 7
> > tcp-request inspect-delay 15s
> >
> > # Declare ACLs
> > acl source_is_abuser sc0_get_gpc0 gt 0
> >
> > tcp-request content track-sc0 req.cook(frontend) if !source_is_abuser
> > ### End Request limiting
> >
> > use_backend rate-limit if source_is_abuser
> >
> > default_backend mybackend
> >
> > backend mybackend
> > mode   http
> > option httplog
> > option forwardfor
> >
> > stick-table type string size 100k expire 30s store http_req_rate(30s)
> > tcp-request content track-sc1 req.cook(frontend) if METH_POST
> >
> > acl post_req_rate_abuse sc1_http_req_rate gt 30
> > acl mark_as_abuser sc0_inc_gpc0 gt 0
> >
> > tcp-request content accept if post_req_rate_abuse mark_as_abuser
> >
> > server myLocalhost 127.0.0.1:8081
> >
> > backend rate-limit
> > mode http
> > errorfile 503 /usr/local/etc/haproxy/rate-limit.http
> >
> >
> > With this config, as soon as a client makes more than 1 request per second
> > over 30 seconds, this client is marked as an abuser by "mybackend". The
> > following requests are then, as expected, blocked by the "http-in" frontend.
> >
> > However, every time the currently marked "source_is_abuser" client sends a
> > request, the expiration counter of "http-in"'s stick-table is reset to 30
> > seconds. I would expect the expiration counter to keep going down, since
> > the connection is supposedly only tracked when `!source_is_abuser`.
> >
> > Any insight into what I am doing wrong?
>
> It's the sc0_get_gpc0() which refreshes the entry. Please keep in mind that
> stick-tables were originally designed to maintain stickiness information and
> to ensure that entries which are still in use are kept fresh.
>
> In your case you *really* want to monitor the abuse rate by watching
> gpc0_rate. If you measure it over 30 seconds you'll get the average number
> of attempts over the last 30-second period, and it would only increase when
> you detect an access while still being blocked. You can then decide on the
> threshold to block on.
>
> But that makes me think that what you're trying to achieve is in fact a
> hysteresis: you want to trigger only once the request rate reaches 30 per
> 30s, and then you want to block until it goes down to 0 per 30 seconds.
>
> So probably something like this would work:
>
>  acl post_req_rate_abuse sc1_http_req_rate gt 30
>  acl post_req_recent sc1_http_req_rate gt 0
>
>  tcp-request content track-sc0 req.cook(frontend) if !source_is_abuser
>  ...
>  use_backend rate-limit if source_is_abuser post_req_recent
>
> It blocks only if there was still some activity over the last period.
>
> Please share your results :-)
> Willy
>
>
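
To make the gpc0_rate() suggestion above concrete, here is a hypothetical,
untested sketch of the frontend, reusing the names from Hugues' configuration
(the backend would stay as originally posted). Treat it as a starting point
under those assumptions, not a verified setup:

# Hypothetical, untested sketch of the gpc0_rate() approach described above.
# gpc0 is bumped once by the backend when abuse is detected, and again here
# for every request made while still blocked, so gpc0_rate(30s) only falls
# back to zero once the client has been quiet for a full period.
frontend http-in
    mode http
    bind *:80

    stick-table type string size 100k expire 30s store gpc0,gpc0_rate(30s)
    tcp-request inspect-delay 15s
    tcp-request content track-sc0 req.cook(frontend)

    acl source_is_abuser  sc0_gpc0_rate gt 0
    acl count_blocked_hit sc0_inc_gpc0 gt 0

    # count_blocked_hit is only evaluated (and gpc0 only incremented)
    # when source_is_abuser already matched
    use_backend rate-limit if source_is_abuser count_blocked_hit
    default_backend mybackend

Like the original mark_as_abuser rule, this relies on the second ACL in the
condition only being evaluated when the first one matches.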


Re: Fix triggering of runtime DNS resolution?

2015-09-02 Thread Baptiste
On Thu, Sep 3, 2015 at 12:56 AM, Conrad Hoffmann  wrote:
> Hello,
>
> It's kind of late and I am not 100% sure I'm getting this right, so it would
> be great if someone could double-check this:
>
> Essentially, the runtime DNS resolution was never triggered for me. I
> tracked this down to a signed/unsigned problem in the usage of
> tick_is_expired() from checks.c:2158.
>
> curr_resolution->last_resolution is being initialized to zero
> (server.c:981), which in turn makes it, say, a few thousand after the value
> of hold.valid is added (also checks.c:2158). It is then compared to now_ms,
> which is an unsigned integer so large that it is out of the signed integer
> range. Thus, the comparison will not get the expected result, as it is done
> on integer values (now_ms cast to integer gave e.g. -1875721083 a few
> minutes ago, which is undeniably smaller than 3000).
>
> One way to fix this is to initialize curr_resolution->last_resolution to
> now_ms instead of zero (attached "patch"), but then it only works because
> both values are converted to negative integers. While I think that this
> will reasonably hide the problem for the time being, I do think there is a
> deeper problem here, which is the frequent passing of an unsigned integer
> into a function that takes signed int as argument.
>
> I see that tick_* is used all over the place, so I thought I would rather
> consult someone before spending lots of time creating a patch that would
> not be used. Also, I would need some more time to actually figure out what
> the best solution would be.
>
> Does anyone have any thoughts on this? Is someone maybe already aware of this?
>
> Thanks a lot,
> Conrad
> --
> Conrad Hoffmann
> Traffic Engineer
>
> SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany
>
> Managing Director: Alexander Ljung | Incorporated in England & Wales
> with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
> HRB 110657B


Hi Conrad,

I noticed this as well.
Please apply the attached patch and confirm that it fixes this issue.

I introduced this bug while trying to fix another one: DNS resolution
was supposed to start with first health check.
Unfortunately, it started a hold.valid period after HAProxy's start time.

Please confirm that the attached patch fixes this and that DNS queries
are properly sent at startup (and later).

Baptiste
From 06ec4730a0ed3fd5e7395d2bac907a60b62f2557 Mon Sep 17 00:00:00 2001
From: Baptiste Assmann 
Date: Wed, 2 Sep 2015 22:25:50 +0200
Subject: [PATCH] MINOR: FIX: DNS resolution doesn't start

Patch f046f1156149d3d8563cc45d7608f2c42ef5b596 introduced a regression:
DNS resolution doesn't start anymore, while it was supposed to make it
start with the first health check.

The current patch fixes this issue with another method: last_resolution
is set to now_ms - hold.valid - 1 when parsing HAProxy's configuration
file.
So at the first check, last_resolution is old enough to trigger a new
resolution.
---
 src/cfgparse.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/cfgparse.c b/src/cfgparse.c
index 6e2bcd7..fc7f0eb 100644
--- a/src/cfgparse.c
+++ b/src/cfgparse.c
@@ -8052,8 +8052,10 @@ out_uri_auth_compat:
 } else {
 	free(newsrv->resolvers_id);
 	newsrv->resolvers_id = NULL;
-	if (newsrv->resolution)
+	if (newsrv->resolution) {
 		newsrv->resolution->resolvers = curr_resolvers;
+		newsrv->resolution->last_resolution = tick_add(now_ms, -1 - newsrv->resolution->resolvers->hold.valid);
+	}
 }
 			}
 			else {
-- 
2.5.0



Fix triggering of runtime DNS resolution?

2015-09-02 Thread Conrad Hoffmann
Hello,

It's kind of late and I am not 100% sure I'm getting this right, so it would
be great if someone could double-check this:

Essentially, the runtime DNS resolution was never triggered for me. I
tracked this down to a signed/unsigned problem in the usage of
tick_is_expired() from checks.c:2158.

curr_resolution->last_resolution is being initialized to zero
(server.c:981), which in turn makes it, say, a few thousand after the value
of hold.valid is added (also checks.c:2158). It is then compared to now_ms,
which is an unsigned integer so large that it is out of the signed integer
range. Thus, the comparison will not get the expected result, as it is done
on integer values (now_ms cast to integer gave e.g. -1875721083 a few
minutes ago, which is undeniably smaller than 3000).

One way to fix this is to initialize curr_resolution->last_resolution to
now_ms instead of zero (attached "patch"), but then it only works because
both values are converted to negative integers. While I think that this
will reasonably hide the problem for the time being, I do think there is a
deeper problem here, which is the frequent passing of an unsigned integer
into a function that takes signed int as argument.

I see that tick_* is used all over the place, so I thought I would rather
consult someone before spending lots of time creating a patch that would
not be used. Also, I would need some more time to actually figure out what
the best solution would be.

Does anyone have any thoughts on this? Is someone maybe already aware of this?

Thanks a lot,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B
diff --git a/src/server.c b/src/server.c
index f3b0f16..e88302b 100644
--- a/src/server.c
+++ b/src/server.c
@@ -978,7 +978,7 @@ int parse_server(const char *file, int linenum, char **args, struct proxy *curpr
 			curr_resolution->status = RSLV_STATUS_NONE;
 			curr_resolution->step = RSLV_STEP_NONE;
 			/* a first resolution has been done by the configuration parser */
-			curr_resolution->last_resolution = 0;
+			curr_resolution->last_resolution = now_ms;
 			newsrv->resolution = curr_resolution;
 
  skip_name_resolution:
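
As a side note, the arithmetic Conrad describes can be reproduced with a
small standalone program (not HAProxy source; the comparison mimics the
signed-subtraction style discussed in the thread, and the constants reproduce
the numbers above):

/* Standalone demo (not HAProxy source) of the arithmetic described above.
 * Conversion of the out-of-range unsigned value is implementation-defined,
 * but matches what is observed on the usual two's-complement targets.
 */
#include <stdio.h>

/* "is tick t expired at date now?" -- wrap-safe as long as both values
 * are derived from the same millisecond clock */
static int is_expired(int t, int now)
{
    return (t - now) <= 0;
}

int main(void)
{
    unsigned int now_ms = 2419246213u;  /* casts to -1875721083 as an int */
    unsigned int hold_valid = 3000;

    /* last_resolution initialized to 0: the deadline never looks expired */
    int deadline_abs = 0 + (int)hold_valid;
    printf("init to zero: expired=%d\n",
           is_expired(deadline_abs, (int)now_ms));        /* prints 0 */

    /* initialized relative to now_ms (as in Baptiste's patch): the deadline
     * is already in the past, so the first check triggers a resolution */
    int deadline_rel = (int)(now_ms - hold_valid - 1) + (int)hold_valid;
    printf("init to now_ms - hold.valid - 1: expired=%d\n",
           is_expired(deadline_rel, (int)now_ms));        /* prints 1 */
    return 0;
}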


Hashing fetched samples / strings

2015-09-02 Thread Paul Pfaff
I was wondering if there is a way to hash fetched sample strings. For
example, I have API servers and HAProxy servers, and I want HAProxy to set a
header (e.g. 'X-Secure') containing a hash of the concatenation of the request
id and a 'shared secret' that I can have on both the API and HAProxy hosts
through configuration management software (Chef).

My intention is to have a header that proves the request came through haproxy
and was not sent directly to the API by a third party. I intend to have
the API perform the same hash and compare it.

If there is a better way to handle this issue, please let me know.

Thank you in advance,
Paul
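
For illustration only, here is a hypothetical sketch of the verification
described above as it might look on the API side, assuming the header carries
hex(SHA-256(request_id + secret)). It uses OpenSSL's SHA256(), and every name
and value in it is invented; a real deployment would normally prefer an HMAC:

/* Hypothetical sketch (names and values invented): the check an API server
 * could perform if the proxy set  X-Secure: hex(sha256(request_id + secret)).
 * Build with -lcrypto.
 */
#include <openssl/sha.h>
#include <stdio.h>

/* compute hex(sha256(req_id || secret)) into out_hex (65 bytes incl. NUL) */
static int sign(const char *req_id, const char *secret, char out_hex[65])
{
    unsigned char buf[512], md[SHA256_DIGEST_LENGTH];
    int i, len = snprintf((char *)buf, sizeof(buf), "%s%s", req_id, secret);

    if (len < 0 || len >= (int)sizeof(buf))
        return -1; /* inputs too long for this toy buffer */
    SHA256(buf, (size_t)len, md);
    for (i = 0; i < SHA256_DIGEST_LENGTH; i++)
        sprintf(out_hex + 2 * i, "%02x", md[i]);
    return 0;
}

int main(void)
{
    char expected[65];

    if (sign("req-12345", "shared-secret-from-chef", expected))
        return 1;
    /* the API compares the received X-Secure header against 'expected' */
    printf("expected X-Secure: %s\n", expected);
    return 0;
}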


Re: [PATCH] DOC: mention support for RFC 5077 TLS Ticket extension in starter guide

2015-09-02 Thread Cyril Bonté

Hi all,

Le 31/08/2015 11:59, Pavlos Parissis a écrit :

Maybe reStructuredText as a format and the Sphinx tool could help here, but
it would require quite a bit of work to migrate to.


It was brought up; I'm not opposed to it, I just want to ensure first that
people don't have to *learn* the doc language to contribute documentation.
I.e. if the format is broken in a patch, it should not result in utter crap
in the output nor in errors during conversion. That's why we have the current
format in the first place: instead of having people learn a language, we
have Cyril's tool, which learns people's language.


That is so true, as it took me 2 days to get used to it and get a clean build.


In the end it makes doc contributions extremely smooth.



That would be a very strong argument against any plans to migrate to
something else, as supplying patches for the doc with the current format
is so easy.

Just erase my e-mail :-)


Don't erase it too quickly, because that has always been the idea since
the beginning. Currently, it's only at the proof-of-concept stage, which
provides usable documentation with links, and that's where I stopped. One
day, I hope to find some time to turn it into a kind of preprocessor that
converts the plain-text documentation to something like reStructuredText
(or another format) and uses a "more" standard tool for the final rendering.


--
Cyril Bonté



Re: Can HAProxy load balance multiple requests sent through a single TCP connection

2015-09-02 Thread Bryan Talbot
TCP really has no notion of "messages", it's all just bytes. So no, this
would not be possible with plain TCP.

-Bryan


On Wed, Sep 2, 2015 at 12:05 PM, Prabu rajan  wrote:

> Hi Team,
>
> Our client establishes a single TCP connection to HAProxy and continues to
> send messages over it. We would like to know whether there is a way to load
> balance those messages across the services sitting behind HAProxy. Please advise.
>
> Regards,
> Prabu
>


Can HAProxy load balance multiple requests sent through a single TCP connection

2015-09-02 Thread Prabu rajan
Hi Team,

Our client establishes a single TCP connection to HAProxy and continues to
send messages over it. We would like to know whether there is a way to load
balance those messages across the services sitting behind HAProxy. Please advise.

Regards,
Prabu


Re: Haproxy and postfix SMTPS - can't get haproxy and postfix talking to each other

2015-09-02 Thread Thomas Heil
Hi,

On 31.08.2015 13:44, Lukas Erlacher wrote:
> Hi,
> 
>>
>> Could you send your complete config, with private information removed? Could
>> you also please give us the output of haproxy -vv?
>>
> 
> Full config: http://ix.io/ky6

thanks.
> 
> haproxy -vv:
> 
> HA-Proxy version 1.5.3 2014/07/25
> Copyright 2000-2014 Willy Tarreau 
> 
> Build options :
>   TARGET  = linux2628
>   CPU = generic
>   CC  = gcc
>   CFLAGS  = -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat
> -Werror=format-security -D_FORTIFY_SOURCE=2
>   OPTIONS = USE_ZLIB=1 USE_OPENSSL=1 USE_PCRE=1
> 
> Default settings :
>   maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200
> 
> Encrypted password support via crypt(3): yes
> Built with zlib version : 1.2.8
> Compression algorithms supported : identity, deflate, gzip
> Built with OpenSSL version : OpenSSL 1.0.1f 6 Jan 2014
> Running on OpenSSL version : OpenSSL 1.0.1f 6 Jan 2014
> OpenSSL library supports TLS extensions : yes
> OpenSSL library supports SNI : yes
> OpenSSL library supports prefer-server-ciphers : yes
> Built with PCRE version : 8.31 2012-07-06
> PCRE library supports JIT : no (USE_PCRE_JIT not set)
> Built with transparent proxy support using: IP_TRANSPARENT
> IPV6_TRANSPARENT IP_FREEBIND
> 
> Available polling systems :
>   epoll : pref=300,  test result OK
>poll : pref=200,  test result OK
>  select : pref=150,  test result OK
> Total: 3 (3 usable), will use epoll.
> 
> 
looks good to me

> Best,
> Luke
> 

Well I created a very simple config.

/etc/haproxy.cfg
global
maxconn 65000
ulimit-n 85535
uid 0
gid 0
daemon
stats socket /var/run/haproxy.stat level admin

nbproc 1

cpu-map all 1 2
ssl-server-verify none

tune.ssl.default-dh-param 2048

defaults
modetcp
no option http-server-close
timeout connect 5000
timeout client  5
timeout server  5

listen app1
bind :8080
mode http
stats enable
stats uri /
maxconn 200


frontend ft_smtps
bind :465
timeout client 1m
default_backend bk_postfix_smtps

backend bk_postfix_smtps
option tcp-check
timeout server 1m
timeout connect 5s
server mail-1 172.1.1.21:10464 send-proxy check

/etc/postfix/master.cf on 172.1.1.21
10464 inet  n   -   n   -   -   smtpd
  -o smtpd_tls_wrappermode=yes
  -o smtpd_sasl_auth_enable=yes
  -o smtpd_client_restrictions=permit_sasl_authenticated,reject
  -o smtpd_upstream_proxy_protocol=haproxy


Would you mind trying:

10464 inet n - n - - smtpd

instead of

10464 inet n - - - - smtpd

For haproxy...

The only difference is that you use chroot and the haproxy user. Could you
please try with the defaults and global sections from the minimal example?


cheers
thomas




Re: [PATCH] Support statistics in multi-process mode

2015-09-02 Thread Philipp Kolmann

Hi Willy,

I once saw a message saying that you had forgotten about this patch, but never
saw any further comment on it:


On 04/24/15 12:34, root wrote:

From: HiepNV 

Signed-off-by: root 
---
  Makefile  |   4 +-
  include/proto/shm_proxy.h |  28 +++
  src/dumpstats.c   |  59 ++-
  src/haproxy.c |  48 -
  src/shm_proxy.c   | 439 ++
  5 files changed, 571 insertions(+), 7 deletions(-)
  create mode 100644 include/proto/shm_proxy.h
  create mode 100644 src/shm_proxy.c




http://comments.gmane.org/gmane.comp.web.haproxy/21470


Could you please recheck whether this would be a possible feature?

thanks
Philipp


--
---
DI Mag. Philipp Kolmann  mail: kolm...@zid.tuwien.ac.at
Technische Universitaet Wien  web: www.zid.tuwien.ac.at
Zentraler Informatikdienst (ZID) tel: +43(1)58801-42011
Wiedner Hauptstr. 8-10, A-1040 Wien      DVR: 0005886
---




Re: Lua outbound Sockets in 1.6-dev4

2015-09-02 Thread Michael Ezzell
You are NOT able to reproduce?  I misunderstood your previous comment.

Further testing suggests (to me) that this is a timing issue, where HAProxy
does not discover that the connection is established if establishment doesn't
happen within a very, very short window after the connection is attempted.

Previously, I only tested "client talks first" (http) using a different
machine as the server.

Consider the following new results:

server talks first (ssh) - connection to local machine - *works*
server talks first (ssh) - connection to a different machine on same LAN - *works*
server talks first (ssh) - connection to a different machine across Internet - *works*
client talks first (http) - connection to local machine - *works*
client talks first (http) - connection to a different machine on same LAN - *does not work*
client talks first (http) - connection to a different machine across Internet - *does not work*

The difference here seems to be the timing of the connection establishment,
and the presence or absence of additional events.  (Note that when I say
"local machine" I do not mean 127.0.0.1; I am still using the local
machine's Ethernet IP when talking to services on the local machine.)

When you are testing, are you using a remote machine, so that there is a
brief delay in connection establishment?  If not, this may explain why you
do not see the same behavior, since local connections do not appear to have
the same problem.

Most interestingly, based on my "timing" theory, I found a workaround that
seems very wrong in principle; so wrong, in fact, that I can't believe I
tried it. However, using the following tactic, I am able to make an
outgoing socket connection to a different machine when the client talks first.

local sock = core.tcp();
sock:settimeout(3);
local written = sock:send("GET
/latest/meta-data/placement/availability-zone HTTP/1.0\r\nHost:
169.254.169.254\r\n\r\n");
local connected, con_err = sock:connect("169.254.169.254",80);
...

This strange code works.  I hope you will agree that writing to the socket
before connecting seems very wrong, and I was surprised to find that this
code works successfully when connecting to a different machine -- presumably
because I'm pre-loading the outbound buffer, so the server's response to my
request actually triggers an event that does not occur when the client talks
first and there is a delay in connection establishment, even a very brief delay.


Correlate requests on multiple frontends based on src

2015-09-02 Thread Mihai Vintila

Hi All,
Is it possible to correlate requests across multiple frontends, so that I can
direct requests arriving on one frontend to specific backend servers based on
which backend servers the same IP address accessed on another frontend?

To understand the scenario:
MySQL Master 1  <->  MySQL Master 2
      |                    |
   MySQL S1             MySQL S2
      |                    |
   MySQL S3             MySQL S4

In haproxy:
frontend1 points to MySQL Master 1 + MySQL Master 2
frontend2 points to MySQL S1-4

If possible, I want clients that access MySQL Master 1 on frontend1 to be
directed to MySQL S1 + S3 when they access frontend2.



--
Best regards,
Vintila Mihai Alexandru




Re: segfault with 1.6 10ec214f41385b231a0c4c529b7b555caf5280bb

2015-09-02 Thread Jesse Hathaway
Cyril Bonté  writes:

> In some conditions, srv_conn is set to NULL but is then used later.

Awesome, I will roll out a new version based on master today, thanks Cyril!




Re: Lua outbound Sockets in 1.6-dev4

2015-09-02 Thread Thierry FOURNIER
Thank you,

Now I'm sure that the connection is established. I also see that HAProxy
closes the connection 3 seconds later, in accordance with the timeout.

Now the hard work begins :) I can't reproduce the bug.

I agree with your conclusion: the error message "Can't connect" is found
only once in the HAProxy Lua code. It is in the yield function, so I
guess that the yield function is woken up after the 3-second timeout.

I don't know why.

Can you send your complete configuration file, or a configuration
which reproduces the problem?

Thanks
Thierry



On Tue, 1 Sep 2015 12:49:22 -0400
Michael Ezzell  wrote:

> You *can* reproduce the error?  I feel better already.
> 
> 
> > Can you run a tcpdump for validating the TCP connection establishment ?
> >
> 
> It looks pretty much as expected. Is this what you wanted?
> 
>  73  69.516276 10.10.10.10 -> 10.20.20.20 TCP 74 44748 > http [SYN] Seq=0
> Win=26883 Len=0 MSS=8961 SACK_PERM=1 TSval=833894013 TSecr=0 WS=128
>  74  69.516893 10.20.20.20 -> 10.10.10.10 TCP 74 http > 44748 [SYN, ACK]
> Seq=0 Ack=1 Win=26847 Len=0 MSS=8961 SACK_PERM=1 TSval=20615574
> TSecr=833894013 WS=128
>  75  69.516909 10.10.10.10 -> 10.20.20.20 TCP 66 44748 > http [ACK] Seq=1
> Ack=1 Win=27008 Len=0 TSval=833894013 TSecr=20615574
>  93  72.517981 10.10.10.10 -> 10.20.20.20 TCP 66 44748 > http [FIN, ACK]
> Seq=1 Ack=1 Win=27008 Len=0 TSval=833894764 TSecr=20615574
>  94  72.518672 10.20.20.20 -> 10.10.10.10 TCP 254 [TCP segment of a
> reassembled PDU]
>  95  72.518689 10.10.10.10 -> 10.20.20.20 TCP 66 44748 > http [ACK] Seq=2
> Ack=190 Win=28032 Len=0 TSval=833894764 TSecr=20616324
> 
> 
> Also a quick hack of src/hlua.c to discover which of the three
> possibilities is causing the error reveals that in
> hlua_socket_connect_yield()...
> 
> if (!hlua || !socket->s || channel_output_closed(&socket->s->req)) {
> ...the condition being matched and prompting the "Can't connect" error
> appears to be !socket->s.



Re: segfault with 1.6 @10ec214f41385b231a0c4c529b7b555caf5280bb

2015-09-02 Thread Willy Tarreau
Hi guys,

On Wed, Sep 02, 2015 at 08:44:21AM +0200, Cyril Bonté wrote:
> I haven't run any tests yet, but I guess the issue is with srv_conn, due
> to this part of the code in proto_http.c:
>   if (((s->txn->flags & TX_CON_WANT_MSK) != TX_CON_WANT_KAL) ||
>   !si_conn_ready(&s->si[1])) {
>   si_release_endpoint(&s->si[1]);
>   srv_conn = NULL;
>   }
> 
>   [...]
> 
>   if (prev_status == 401 || prev_status == 407) {
>   [...]
>   s->txn->flags |= TX_PREFER_LAST;
>   srv_conn->flags |= CO_FL_PRIVATE;
>   }
> 
> In some conditions, srv_conn is set to NULL but is then used later.

Good catch, thanks for this! I've merged the fix. Fortunately it
only impacts 1.6, since it came with the connection reuse code.

Thanks,
Willy