Re: [PATCH] [MEDIUM] Improve "no free ports" error case

2017-03-15 Thread Krishna Kumar (Engineering)
Hi Willy,

I am facing a problem with the system port range.

Distro: Ubuntu 16.04.1, kernel: 4.4.0-53-generic

When I set the range to 50000 to 50999, the kernel allocates ports only in
the range 50000 to 50499; the remaining 500 ports never seem to get
allocated, despite running a few thousand connections in parallel. A simple
test program that I wrote, which binds to an IP and then connects, uses all
1000 ports. Quickly checking the TCP code, I noticed that the kernel tries
to allocate an odd port for bind(), leaving the even ports for connect().
Any idea why I don't get the full port range in bind? I am using something
like the following when specifying the server:
 server abcd google.com:80 source e1.e2.e3.e4
and with the following sysctl:
 sysctl -w net.ipv4.ip_local_port_range="50000 50999"
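The kind of test program described above can be sketched as follows (a hedged Python illustration on the loopback, not the original program; the listener simply stands in for the remote server):

```python
# Sketch of a bind-then-connect test like the one described above
# (illustrative only; the original program and its port range are not shown).
# Binding the client socket explicitly before connect() exercises the kernel's
# bind() port-allocation path rather than the connect() one.
import socket

def bound_connect(dst, src_ip="127.0.0.1", src_port=0):
    """Open a client socket bound to an explicit source before connecting."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((src_ip, src_port))   # port 0: kernel picks from ip_local_port_range
    s.connect(dst)
    return s

# Local listener standing in for the remote server.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(16)
dst = srv.getsockname()

conns = [bound_connect(dst) for _ in range(8)]
ports = sorted(c.getsockname()[1] for c in conns)
```

Checking which source ports actually get used (odd vs. even) for a large number of such connections is what reveals the allocation split mentioned above.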

I hope it is OK to add an unrelated feature question to this thread:

Is it possible to tell haproxy to use one backend for a request (GET) and,
if the response was 404 (Not Found), to retry on another backend? The
resource may be present in the second backend; is there any way to try it
upon getting a 404 from the first?

Thanks,
- Krishna


On Thu, Mar 9, 2017 at 2:22 PM, Krishna Kumar (Engineering) <
krishna...@flipkart.com> wrote:

> Hi Willy,
>
> Excellent, I will try this idea, it should definitely help!
> Thanks for the explanations.
>
> Regards,
> - Krishna
>
>
> On Thu, Mar 9, 2017 at 1:37 PM, Willy Tarreau  wrote:
>
>> On Thu, Mar 09, 2017 at 12:50:16PM +0530, Krishna Kumar (Engineering)
>> wrote:
>> > 1. About 'retries', I am not sure if it works for connect() failing
>> > synchronously on the local system (as opposed to getting a
>> > timeout/refused via callback).
>>
>> Yes it normally does. I've been using it for the same purpose in certain
>> situations (eg: binding to a source port range while some daemons are
>> later bound into that range).
>>
>> > The document on retries says:
>> >
>> > " <value> is the number of times a connection attempt should be retried
>> >   on a server when a connection either is refused or times out. The
>> >   default value is 3. "
>> >
>> > The two conditions above don't fall in our use case.
>>
>> It's still a refused connection :-)
>>
>> > The way I understood it was that retries happen during the callback
>> > handler. Also I am not sure if there is any way to circumvent the
>> > "1 second" gap for a retry.
>>
>> Hmmm I have to check. In fact when the LB algorithm is not deterministic
>> we immediately retry on another server. If we're supposed to end up only
>> on the same server we indeed apply the delay. But if it's a synchronous
>> error, I don't know. And I think it's one case (especially -EADDRNOTAVAIL)
>> where we should immediately retry.
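As an aside, the synchronous failure mode under discussion is easy to reproduce outside haproxy. A hedged Python sketch (addresses and structure are illustrative): reusing a fully occupied source ip:port against the same destination makes connect(), or the preceding bind(), fail immediately in the syscall, rather than via a later timeout or refusal callback.

```python
# Two sockets bound (with SO_REUSEADDR) to the same source ip:port, pointed at
# the same destination. The second attempt fails synchronously: either bind()
# is refused (EADDRINUSE) or connect() cannot build a unique 4-tuple
# (typically EADDRNOTAVAIL on Linux). Illustrative sketch only.
import errno
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(2)
dst = srv.getsockname()

def bound_client(port):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("127.0.0.1", port))
    return s

c1 = bound_client(0)                 # kernel picks a free source port
src_port = c1.getsockname()[1]
c1.connect(dst)                      # first 4-tuple: fine

err = None
try:
    c2 = bound_client(src_port)      # same source ip:port as c1
    c2.connect(dst)                  # same 4-tuple: rejected synchronously
except OSError as e:
    err = e
```

No timer is involved in this failure, which is why the question of whether "retries" covers it (and whether the 1-second redispatch delay applies) matters here.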
>>
>> > 2. For nolinger, it was not recommended in the document,
>>
>> It's indeed strongly recommended against, mainly because we've started
>> to see it in configs copy-pasted from blogs without understanding the
>> impacts.
>>
>> > and also I wonder if any data loss can happen if the socket is not
>> > lingered for some time beyond the FIN packet that the remote server
>> > sent for doing the close(), delayed data packets, etc.
>>
>> The data loss happens only with outgoing data, so for HTTP it's data
>> sent to the client which are at risk. Data coming from the server are
>> properly consumed. In fact, when you configure "http-server-close",
>> the nolinger is automatically enabled in your back so that haproxy
>> can close the server connection without accumulating time-waits.
>>
>> > 3. Ports: Actually each HAProxy process has a 400-port limitation to a
>> > single backend, and there are many haproxy processes on this and other
>> > servers. The ports are split per process and per system. E.g. system1
>> > has 'n' processes, each with a separate port range from the others;
>> > system2 has 'n' processes and a completely different port range. For
>> > infra reasons, we are restricting the total port range. The unique
>> > ports for different haproxy processes running on the same system are to
>> > avoid two processes attempting to use the same port (the first port# in
>> > the range) and failing in connect() when connecting to the same remote
>> > server. Hope I explained that clearly.
>>
>> Yep I clearly see the use case. That's one of the rare cases where it's
>> interesting to use SNAT between your haproxy nodes and the internet. This
>> way you'll use a unified ports pool for all your nodes and will not have
>> to reserve port ranges per system and per process. Each process will then
>> share the system's local source ports, and each system will have a
>> different address. Then the SNAT will convert these IP1..N:port1..N to the
>> public IP address and an available port. This will offer you more
>> flexibility to add or remove nodes/processes etc. Maybe your total traffic
>> cannot pass

Stick table sync problems

2017-03-15 Thread Aaron van Meerten
Hi HAProxy List,

I’ve run into an issue with stick tables/peering that may be of interest to 
some of you.

I’ve got a fleet of 10 proxy servers peering with each other, fronting several 
backend servers.  I have a very simple stick table setup which I’ve pasted 
examples of below.  Basically I use a URL parameter to control server 
stickiness.

This works great, and is an amazing solution to a sticky problem for our 
BOSH-based XMPP messaging, as long as the stick table entries stay in sync.  
However, sometimes one HAProxy instance will lose one or more entries which are 
still present on the others.

This state persists for anywhere from minutes to hours, during which the 
out-of-sync server continues to receive updates on some entries but is missing 
others.

A restart of the server can resolve the issue by causing the table to refresh, 
but this is less than ideal.

When it occurs, it appears that all the other servers continue to update the 
“TTL” on the entry, but the errant server slowly allows the entry to expire and 
be removed.
I have developed a tool which pulls the stick table from each proxy and 
compares the entries.  There’s obviously some room for expiry times to be 
different on each proxy, but I’d expect that entries which are regularly 
refreshed on all other peers should be propagated everywhere.
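A comparison tool like the one described can be sketched roughly as below. This is a hedged illustration, not the actual tool: it assumes `show table <name>` output from the admin socket, where each entry line looks something like `0x7f..: key=room42 use=0 exp=299042 server_id=10` (format recalled from 1.6/1.7, so treat it as an assumption).

```python
# Extract stick-table keys per peer and report which keys each peer is
# missing relative to the union of all peers. Illustrative sketch only.
import re

KEY_RE = re.compile(r"\bkey=(\S+)")

def table_keys(show_table_output):
    """Set of stick-table keys found in 'show table <name>' output."""
    return {m.group(1)
            for line in show_table_output.splitlines()
            for m in [KEY_RE.search(line)] if m}

def missing_entries(tables_by_peer):
    """Map peer -> keys present on at least one other peer but absent locally."""
    all_keys = set().union(*tables_by_peer.values())
    return {peer: all_keys - keys for peer, keys in tables_by_peer.items()}
```

Each peer's dump can be fetched with something like `echo "show table nodes" | socat stdio /var/run/haproxy.sock` (socket path is an assumption) before being fed in.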

I suspect either ephemeral network connectivity problems between the peers or 
some other error, but I haven’t seen anything relevant in the logs.

lsof analysis of open TCP sockets shows all peers connected on 1024 as expected.

I wondered if this list would have any ideas on further avenues for analysis on 
this particular problem.  I’ve seen this happen consistently on HAProxy 1.6 and 
1.7 through several point releases of each.  If anything it seems more frequent 
in 1.7.

Please let me know if you have any good ideas or if anyone has seen behavior 
like this before. 

Thanks,

-Aaron van Meerten

Below is the example of my peer and stick table configuration, extracted from a 
larger haproxy.cfg
If there’s more info that’d help track this down, I’m happy to provide it.


peers mypeers
 peer hcv-chaos-haproxy-13056 XX.XX.130.56:1024
 peer hcv-chaos-haproxy-230228 XX.XX.230.228:1024
 peer hcv-chaos-haproxy-35147 XX.XX.35.147:1024
 peer hcv-chaos-haproxy-10660 10.186.3.137:1024
 peer hcv-chaos-haproxy-9682 XX.XX.96.82:1024
 peer hcv-chaos-haproxy-239179 XX.XX.239.179:1024
 peer hcv-chaos-haproxy-246171 XX.XX.246.171:1024
 peer hcv-chaos-haproxy-68128 XX.XX.68.128:1024
 peer hcv-chaos-haproxy-151101 XX.XX.151.101:1024
 peer hcv-chaos-haproxy-207217 XX.XX.207.217:1024


backend nodes
  redirect scheme https if !{ ssl_fc }

# make sure we send the client's ip
 option forwardfor

  balance url_param room
  hash-type consistent
  stick-table type string len 128 size 20k peers mypeers expire 5m
  stick on url_param(room) table nodes

  #example server
   server chaos-us-east-1a-s0 XX.XX.XX.XXX:443 id 10 ssl verify none check port 
 inter 5s fastinter 1s fall 2 rise 30


Re: [PATCH][RFC] MEDIUM: global: add a 'grace' option to cap the soft-stop time

2017-03-15 Thread Willy Tarreau
Hi guys,

On Thu, Mar 16, 2017 at 01:03:24AM +0100, Cyril Bonté wrote:
> Hi Bryan,
> 
> Le 16/03/2017 à 00:52, Bryan Talbot a écrit :
> > 
> > > On Mar 15, 2017, at Mar 15, 4:44 PM, Cyril Bonté  
> > > wrote:
> > > 
> > > Several use cases may accept to abruptly close the connections when the
> > > instance is stopping instead of waiting for timeouts to happen.
> > > This option allows specifying a grace period which defines the maximum
> > > time to spend performing a soft-stop (occurring when SIGUSR1 is
> > > received).
> > > 
> > > With this global option defined in the configuration, once all 
> > > connections are
> > > closed or the grace time is reached, the instance will quit.
> > 
> > 
> > Most of the other settings for time-limits include the word "timeout". 
> > Maybe "timeout grace", "timeout shutdown", "timeout exit" or something is 
> > more consistent with other configuration options?
> 
> Thanks for raising that point. The choice was intended and may be subject to
> discussion.
> 
> timeout keywords are (most of them, except maybe "timeout mail") defined in
> defaults/frontend/backend/listen sections, whereas this one is in the global
> one. I wanted to clearly differentiate that timeout to prevent some
> misconfigurations.
> 
> But I'm definitely not opposed to adding a "timeout" prefix. Also, I was not
> very set on the "grace" name, but I didn't spend a lot of time writing the
> patch (hence the [RFC] tag ;-)). You suggested "timeout shutdown"; this one
> may be a better choice, indeed.

In my opinion it's not about the section but about the role. It's not
a timeout but a grace time. We do already have "grace" in frontends for
a similar purpose. In fact in my opinion a timeout is very different in
that it's a maximum time spent waiting for an event to happen. Here we're not
waiting for an event; we offer a grace time after the event (the reload
signal). So in my opinion your choice of "grace" is perfectly suited here.
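For illustration, if the patch is merged with the "grace" name, usage in the global section would presumably look like this (a hypothetical sketch based on the RFC patch in this thread; the 10s value is made up):

```
global
    # cap the soft-stop: quit at most 10s after SIGUSR1,
    # even if some connections are still open
    grace 10s
```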

Willy



Re: [PATCH][RFC] MEDIUM: global: add a 'grace' option to cap the soft-stop time

2017-03-15 Thread Cyril Bonté

Hi Bryan,

Le 16/03/2017 à 00:52, Bryan Talbot a écrit :



On Mar 15, 2017, at Mar 15, 4:44 PM, Cyril Bonté  wrote:

Several use cases may accept to abruptly close the connections when the
instance is stopping instead of waiting for timeouts to happen.
This option allows specifying a grace period which defines the maximum
time to spend performing a soft-stop (occurring when SIGUSR1 is
received).

With this global option defined in the configuration, once all connections are
closed or the grace time is reached, the instance will quit.



Most of the other settings for time-limits include the word “timeout”. Maybe 
“timeout grace”, “timeout shutdown”, “timeout exit” or something is more 
consistent with other configuration options?


Thanks for raising that point. The choice was intended and may be 
subject to discussion.


timeout keywords are (most of them, except maybe "timeout mail") defined 
in defaults/frontend/backend/listen sections, whereas this one is in the 
global one. I wanted to clearly differentiate that timeout to prevent 
some misconfigurations.


But I'm definitely not opposed to adding a "timeout" prefix. Also, I was not 
very set on the "grace" name, but I didn't spend a lot of time writing the 
patch (hence the [RFC] tag ;-)). You suggested "timeout shutdown"; this one 
may be a better choice, indeed.



--
Cyril Bonté



Re: [PATCH][RFC] MEDIUM: global: add a 'grace' option to cap the soft-stop time

2017-03-15 Thread Bryan Talbot

> On Mar 15, 2017, at Mar 15, 4:44 PM, Cyril Bonté  wrote:
> 
> Several use cases may accept to abruptly close the connections when the
> instance is stopping instead of waiting for timeouts to happen.
> This option allows specifying a grace period which defines the maximum
> time to spend performing a soft-stop (occurring when SIGUSR1 is
> received).
> 
> With this global option defined in the configuration, once all connections are
> closed or the grace time is reached, the instance will quit.


Most of the other settings for time-limits include the word “timeout”. Maybe 
“timeout grace”, “timeout shutdown”, “timeout exit” or something is more 
consistent with other configuration options?

-Bryan




[PATCH][RFC] MEDIUM: global: add a 'grace' option to cap the soft-stop time

2017-03-15 Thread Cyril Bonté
Several use cases may accept to abruptly close the connections when the
instance is stopping instead of waiting for timeouts to happen.
This option allows specifying a grace period which defines the maximum
time to spend performing a soft-stop (occurring when SIGUSR1 is
received).

With this global option defined in the configuration, once all connections are
closed or the grace time is reached, the instance will quit.
---
 doc/configuration.txt  | 12 
 include/types/global.h |  2 ++
 src/cfgparse.c | 17 +
 src/haproxy.c  |  8 
 src/proxy.c|  2 ++
 5 files changed, 41 insertions(+)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 228e8132..b1109906 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -535,6 +535,7 @@ The following keywords are supported in the "global" 
section :
- deviceatlas-properties-cookie
- external-check
- gid
+   - grace
- group
- log
- log-tag
@@ -703,6 +704,17 @@ gid 
   will only be able to drop these groups if started with superuser privileges.
   See also "group" and "uid".
 
+grace <time>
+  Defines the grace time allowed to perform a clean soft-stop.
+
+  Arguments :
+    <time> is the maximum time (by default in milliseconds) for which the
+           instance will remain alive when a soft-stop is received via the
+           SIGUSR1 signal.
+
+  This may be used to ensure that the instance will quit even if connections
+  remain opened (for example with long timeouts for a proxy in tcp mode).
+
 group 
   Similar to "gid" but uses the GID of group name  from /etc/group.
   See also "gid" and "user".
diff --git a/include/types/global.h b/include/types/global.h
index e14a2add..1d273bf4 100644
--- a/include/types/global.h
+++ b/include/types/global.h
@@ -80,6 +80,7 @@ struct global {
int gid;
int external_check;
int nbproc;
+   unsigned int grace; /* grace period to process soft stop */
int maxconn, hardmaxconn;
int maxsslconn;
int ssl_session_max_cost;   /* how many bytes an SSL session may cost */
@@ -170,6 +171,7 @@ extern const int zero;
 extern const int one;
 extern const struct linger nolinger;
 extern int stopping;   /* non zero means stopping in progress */
+extern int grace;
 extern char hostname[MAX_HOSTNAME_LEN];
 extern char localpeer[MAX_HOSTNAME_LEN];
 extern struct list global_listener_queue; /* list of the temporarily limited 
listeners */
diff --git a/src/cfgparse.c b/src/cfgparse.c
index 2eb25edb..8b6855f7 100644
--- a/src/cfgparse.c
+++ b/src/cfgparse.c
@@ -1011,6 +1011,23 @@ int cfg_parse_global(const char *file, int linenum, char 
**args, int kwm)
}
}
/* end of user/group name handling*/
+   else if (strcmp(args[0], "grace") == 0) {
+   const char *res;
+
+   if (!*args[1]) {
+   Alert("parsing [%s:%d] : '%s' expects <time> as argument.\n",
+   file, linenum, args[0]);
+   err_code |= ERR_ALERT | ERR_FATAL;
+   goto out;
+   }
+   res = parse_time_err(args[1], , TIME_UNIT_MS);
+   if (res) {
+   Alert("parsing [%s:%d]: unexpected character '%c' in 
argument to <%s>.\n",
+   file, linenum, *res, args[0]);
+   err_code |= ERR_ALERT | ERR_FATAL;
+   goto out;
+   }
+   }
else if (!strcmp(args[0], "nbproc")) {
if (alertif_too_many_args(1, file, linenum, args, _code))
goto out;
diff --git a/src/haproxy.c b/src/haproxy.c
index 559b4811..400fb7e7 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -118,6 +118,7 @@ int  relative_pid = 1;  /* process id starting 
at 1 */
 /* global options */
 struct global global = {
.nbproc = 1,
+   .grace = TICK_ETERNITY,
.req_count = 0,
.logsrvs = LIST_HEAD_INIT(global.logsrvs),
.maxzlibmem = 0,
@@ -158,6 +159,7 @@ struct global global = {
 
 int stopping;  /* non zero means stopping in progress */
 int jobs = 0;   /* number of active jobs (conns, listeners, active tasks, ...) 
*/
+int grace = TICK_ETERNITY;
 
 /* Here we store informations about the pids of the processes we may pause
  * or kill. We will send them a signal every 10 ms until we can bind to all
@@ -1585,6 +1587,12 @@ static void run_poll_loop()
 
/* Check if we can expire some tasks */
next = wake_expired_tasks();
+   if (stopping && (grace != TICK_ETERNITY)) {
+   if (tick_is_expired(grace, now_ms))
+   break;
+   if (next == TICK_ETERNITY || tick_is_expired(next, 
grace))
+   next = grace;
+   }
 
/* stop when there's nothing left to do */
if 

Re: Propagating agent-check weight change to tracking servers

2017-03-15 Thread Michał
Hello!
Any news on this topic? Is there anything wrong with my patch?

Michał

2017-02-04 9:38 GMT+01:00 Michał :

> Hi,
> I checked it, and during synthetic tests it worked. I use the same
> mechanism as the original agent-check, so it's ready to merge.
>
> It doesn't need to be backported.
>
> 2017-01-27 15:38 GMT+01:00 Michał :
>
>> Hello,
>>
>> So here's patch, which includes all functionalities I think about.
>> It propagates the response for every tracking server without changing it
>> and without intercepting it. In my opinion we should propagate relative
>> and absolute weights, because if you use weight=0 server's to offload
>> checks then you can send relative weight change and 0 will stay where it
>> is.
>>
>> Regards,
>> Michał
>>
>>
>> 2017-01-20 10:54 GMT+01:00 Willy Tarreau :
>>
>>> Hi Michal,
>>>
>>> On Thu, Jan 19, 2017 at 11:45:57PM +0100, Michał wrote:
>>> > Hello,
>>> >
>>> > We use track's in haproxy to minimize check traffic in some situations
>>> and
>>> > after my last patch we are probably going to switch to agent-checks for
>>> > live management of weights and statuses. One problem I see now - track
>>> > don't propagate weight setting to trackers, so if we set agent-check on
>>> > track we can manage status only.
>>> >
>>> > My first PoC solution works good, so I thought about introducing
>>> something
>>> > like agent-track or track-agent directive set on backends (or maybe it
>>> > should be default, non-configurable behaviour) to propagate agent-check
>>> > responses from main check to all tracking backends. Both default
>>> behaviour
>>> > and directive approach are small changes in code, but a little bigger
>>> in
>>> > functionality.
>>> >
>>> > In my opinion if we set agent-check on backend which is tracked by
>>> others -
>>> > it should propagate agent-check weight response to those tracking
>>> backend
>>> > and set weight on them too. Configurable or not - it will be good
>>> feature.
>>>
>>> I think we at least propagate the DRAIN state which is equivalent to
>>> weight == 0. If so I too think we should propagate *relative* weights.
>>> Agent-checks can return a relative weight (eg: 50%, 100%, 150%) or an
>>> absolute weight (eg: 10, 20). If you have two farms configured like this
>>> :
>>>
>>>backend farm1
>>>  server new1 1.1.1.1:8000 weight 10 agent-check
>>>  server new2 1.1.1.2:8000 weight 10 agent-check
>>>
>>>backend farm2
>>>  server new1 1.1.1.1:8000 weight 20 track farm1/new1
>>>  server new2 1.1.1.2:8000 weight 20 track farm1/new2
>>>  server old1 1.1.1.3:8000 weight 10 check
>>>  server old2 1.1.1.4:8000 weight 10 check
>>>
>>> Then you want the weight changes on farm1 to be applied proportionally
>>> to farm2 (ie: a ratio of the configured absolute weight, which is iweight
>>> IIRC).
>>>
>>> Otherwise that sounds quite reasonable to me given that the agent-check's
>>> purpose is to provide a more accurate vision of the server's health, and
>>> that tracking is made to share the same vision across multiple farms.
>>>
>>> Regards,
>>> Willy
>>>
>>
>>
>>
>
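The proportional propagation Willy describes in the quoted thread can be sketched numerically as below. This is a hedged illustration; the function names are mine, not haproxy internals ("iweight" stands for the configured initial weight, as in the quote).

```python
# Hedged sketch of proportional agent-check weight propagation to tracking
# servers. An agent response is either relative ("50%") or absolute ("5");
# an absolute response is turned into a ratio of the tracked server's
# configured weight before being applied to the tracking server.

def agent_ratio(response, tracked_iweight):
    """Turn an agent-check weight response ('50%' or '5') into a ratio."""
    if response.endswith("%"):
        return int(response[:-1]) / 100.0
    return int(response) / tracked_iweight

def propagate(response, tracked_iweight, tracking_iweight):
    """Effective weight of a tracking server after proportional propagation."""
    return round(tracking_iweight * agent_ratio(response, tracked_iweight))
```

With the farm1/farm2 example from the quote (configured weights 10 and 20), an agent response of "50%" or "5" on farm1/new1 would both yield an effective weight of 10 on farm2/new1, and a weight-0 (DRAIN) response propagates as 0 everywhere.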


Re: [PATCHES] Add support for LibreSSL 2.5.1

2017-03-15 Thread Emmanuel Hocdet

> Le 14 mars 2017 à 16:28, Emmanuel Hocdet  a écrit :
> 
> Hi Piotr
> 
>> Le 14 mars 2017 à 16:04, Piotr Kubaj  a écrit :
>> 
>> And it seems like the previously attached patches do compile, but the 
>> warning is there again so now I'm finally including patches that make 
>> Haproxy both compile and not throw additional warnings.
>> 
> 
> first patch:
> 
> -#if defined(OPENSSL_IS_BORINGSSL) || (OPENSSL_VERSION_NUMBER >= 0x101fL)
> +#if (LIBRESSL_VERSION_NUMBER < 0x2050100fL) && 
> (defined(OPENSSL_IS_BORINGSSL) || (OPENSSL_VERSION_NUMBER >= 0x101fL))
> should be :
> +#if (LIBRESSL_VERSION_NUMBER < 0x2050100fL) || 
> (defined(OPENSSL_IS_BORINGSSL) || (OPENSSL_VERSION_NUMBER >= 0x101fL))
> 
> I suspect that test on LIBRESSL_VERSION_NUMBER will not work.
> 
> Manu
> 

This patch should be SSL version agnostic:



0001-BUILD-ssl-simplify-SSL_CTX_set_ecdh_auto-compatibili.patch
Description: Binary data




Re: HTTP Basic Authorisation requests failing with HAProxy 1.7.2

2017-03-15 Thread Willy Tarreau
Hi Jon,

On Wed, Mar 15, 2017 at 12:38:38PM -0500, Jon Simpson wrote:
> Hi Christopher,
> 
> The patch does seem to fix the bug in my testing - I can't reproduce the
> 503 errors with your patch on 1.8. Sorry for taking a few days to get
> around to looking at this & thanks for the fix!

Much appreciated, thanks for the feedback. I'll queue it tomorrow.
Don't worry about the delay, a late confirmation is always better
than none.

Thanks!
Willy



Re: OpenSSL engine and async support

2017-03-15 Thread Willy Tarreau
Hi Grant,

On Wed, Mar 15, 2017 at 10:20:01AM -0700, Grant Zhang wrote:
> Maybe you run into the openssl 1.1 SNI issue. Does your test branch have the 
> following patch:
> http://git.haproxy.org/?p=haproxy.git;a=commit;h=d3850603933c9319528375088a9b28b9b345246b
>  

I think not because Emeric had issues applying your patch on top of recent
SSL changes so he had to roll back to the date of your submission, which
predates this fix. That makes sense indeed.

Willy



Re: Some compilation SSL errors/warnings on debian testing

2017-03-15 Thread Willy Tarreau
Hi Manu,

On Wed, Mar 15, 2017 at 07:00:28PM +0100, Emmanuel Hocdet wrote:
> > ssl_options seems still valid, all directives can be mapped to it and keep 
> > compatibility.
> > 
> 
> Patch proposal:
 
Maybe it could work, let's wait for Emeric's feedback. I remember there
was a subtle difference between no- and force- but I
don't remember which one.

Thanks,
Willy



Re: Some compilation SSL errors/warnings on debian testing

2017-03-15 Thread Emmanuel Hocdet
Hi Willy,

> Le 15 mars 2017 à 12:41, Emmanuel Hocdet  a écrit :
>
>> Le 14 mars 2017 à 19:11, Willy Tarreau  a écrit :
>>>
>>> For the little story: openssl-1.1.0 and boringssl have
>>> SSL_CTX_set_min_proto_version/SSL_CTX_set_max_proto_version
>>> and other methods to set protocol version are deprecated (or not
>>> implemented).
>>> It will be boring to keep compat with haproxy ssl directive no- and
>>> force-. And perhaps the add of some min- and max-.
>>>
>>> Willy, what do you think?
>>
>> Well, that means we'll definitely break all setups and piss-off users :-(
>>
>> What we can possibly do is to determine the appropriate min/max based on
>> the existing no- and force- and complain if there are holes.
>> Ie, if a user has "no-sslv3 no-tls10 no-tls11" then the min version is
>> TLS 1.2 and that could work. If a user has "no-sslv3 no-tls11 no-tls12"
>> then the min and max versions are TLS 1.0. But if a user has
>> "no-sslv3 no-tls11" then we don't know and we have to complain. Hopefully
>> it would affect very few users (those with strange setups or not aware of
>> their holes).
>>
>> What bothers me with this API change is that it doesn't provide the same
>> flexibility. If you have a vulnerability coming with an implementation or
>> simply a known incompatibility and want to disable it and only this one,
>> you're screwed. With the previous API it was possible. That's why I'm
>> still not 100% sold on the API change :-/
>
> ssl_options seems still valid, all directives can be mapped to it and keep
> compatibility.

Patch proposal:

0001-MEDIUM-ssl-rework-of-ssl_method-calculation-to-match.patch
Description: Binary data


Re: HTTP Basic Authorisation requests failing with HAProxy 1.7.2

2017-03-15 Thread Jon Simpson
On 10 March 2017 at 20:40:11, Christopher Faulet (cfau...@haproxy.com)
wrote:

Hi Jon,

Here is a patch that should fix your bug. It was trickier than expected.
Could you confirm that it fix your bug ?

-- 
Christopher Faulet

Hi Christopher,

The patch does seem to fix the bug in my testing - I can’t reproduce the
503 errors with your patch on 1.8. Sorry for taking a few days to get
around to looking at this & thanks for the fix!

Jon


Re: OpenSSL engine and async support

2017-03-15 Thread Grant Zhang
Hi Emeric
> On Mar 15, 2017, at 10:05, Emeric Brun  wrote:
> 
> Hi John,
> 
>>> 
>>> There is some inconsistencies between the engine and the used client:
>>> 
>>> here the conf:
>>> global
>>>   tune.ssl.default-dh-param 2048
>>>   ssl-engine qat
>>>   ssl-async
>>> 
>>> listen gg
>>>   mode http
>>>   bind 0.0.0.0:8443 ssl crt /root/2048.pem
>>>   redirect location /
>>> 
>>> openssl s_client -connect performs well but curl failed:
>>> emeric@ebr-laptop:~/inject$ curl -k  https://10.0.0.109:8443/
>>> curl: (35) gnutls_handshake() failed: Bad record MAC
>>> 
>>> 
>>> If I comment the ssl-engine line, no more issue.
>>> 
>>> R,
>>> Emeric
>>> 
>>> the conf:
>>> 
>>> 
>>> 
>>> 
> 
> I'm not sure that the issue is related to your patch; I may be hitting an
> issue in the QAT engine.
> 
> I've made some tests using openssl s_server.
> 
> Doing a curl request shows this error:
> [root@centos bin]# ./openssl s_server -accept 9443 -engine qat -cert 
> /root/2048.pem 
> ERROR
> 140267076605760:error:1408F119:SSL routines:ssl3_get_record:decryption failed 
> or bad record mac:ssl/record/ssl3_record.c:602:
> shutting down SSL
> CONNECTION CLOSED
> 
> And using the haproxy as client also fails with this error:
> 140267076605760:error:800910C8:lib(128):qat_rsa_priv_enc:rsa from to 
> null:qat_rsa.c:917:
> 140267076605760:error:141EC044:SSL 
> routines:tls_construct_server_key_exchange:internal 
> error:ssl/statem/statem_srvr.c:2453:
> shutting down SSL
> CONNECTION CLOSED
> 
> R,
> Emeric

Maybe you run into the openssl 1.1 SNI issue. Does your test branch have the 
following patch:
http://git.haproxy.org/?p=haproxy.git;a=commit;h=d3850603933c9319528375088a9b28b9b345246b
 

If not, could you please give a try?

Thanks,

Grant




Re: OpenSSL engine and async support

2017-03-15 Thread Emeric Brun
Hi John,

>>
>> There is some inconsistencies between the engine and the used client:
>>
>> here the conf:
>> global
>>tune.ssl.default-dh-param 2048
>>ssl-engine qat
>>ssl-async
>>
>> listen gg
>>mode http
>>bind 0.0.0.0:8443 ssl crt /root/2048.pem
>>redirect location /
>>
>> openssl s_client -connect performs well but curl failed:
>> emeric@ebr-laptop:~/inject$ curl -k  https://10.0.0.109:8443/
>> curl: (35) gnutls_handshake() failed: Bad record MAC
>>
>>
>> If I comment the ssl-engine line, no more issue.
>>
>> R,
>> Emeric
>>
>> the conf:
>>
>>
>>
>>

I'm not sure that the issue is related to your patch; I may be hitting an
issue in the QAT engine.

I've made some tests using openssl s_server.

Doing a curl request shows this error:
[root@centos bin]# ./openssl s_server -accept 9443 -engine qat -cert 
/root/2048.pem 
ERROR
140267076605760:error:1408F119:SSL routines:ssl3_get_record:decryption failed 
or bad record mac:ssl/record/ssl3_record.c:602:
shutting down SSL
CONNECTION CLOSED

And using the haproxy as client also fails with this error:
140267076605760:error:800910C8:lib(128):qat_rsa_priv_enc:rsa from to 
null:qat_rsa.c:917:
140267076605760:error:141EC044:SSL 
routines:tls_construct_server_key_exchange:internal 
error:ssl/statem/statem_srvr.c:2453:
shutting down SSL
CONNECTION CLOSED

R,
Emeric


> 
> 




Re: [RFC PATCH] MEDIUM: persistent connections for SSL checks

2017-03-15 Thread Willy Tarreau
On Wed, Mar 15, 2017 at 09:53:11AM -0700, Steven Davidovitz wrote:
> Thank you! I may one day follow up on persistent connections, but this does
> the trick for now.

No problem, I think persistent connections must be dealt with more globally
and not just for SSL though.

Cheers,
Willy



Re: [RFC PATCH] MEDIUM: persistent connections for SSL checks

2017-03-15 Thread Steven Davidovitz
Thank you! I may one day follow up on persistent connections, but this does
the trick for now.

On Wed, Mar 15, 2017 at 3:44 AM, Willy Tarreau  wrote:

> On Mon, Mar 13, 2017 at 06:10:23PM +0100, Willy Tarreau wrote:
> > > Just wanted to follow up. I've been running this patch for a couple
> days on
> > > an idle system and haven't noticed any problems.
> > > Could this be merged? Is there anything else I can test?
> >
> > I'm personally fine with it but I'd rather have Emeric approve it, as
> > he knows better than me the possible impacts of shutting down cleanly
> > or not on SSL.
> >
> > Emeric, I've re-attached the patch. Using conn_data_shutw() instead of
> > conn_data_shutw_hard() causes the "clean" flag to be set when calling
> > ssl_sock_shutw() and SSL_set_quiet_shutdown() not to be called so that
> > we perform a clean shutdown. The purpose is to give a chance to the
> > server to store the context and avoid renegotiating.
>
> Now applied with Emeric's blessing.
>
> Thanks,
> Willy
>


Re: OpenSSL engine and async support

2017-03-15 Thread Grant Zhang
Hi Emeric,

Thanks for testing. I will try to reproduce the issues locally and report back.

Regards,

Grant

> On Mar 15, 2017, at 07:41, Emeric Brun  wrote:
> 
> Hi Grant,
> 
> On 03/15/2017 12:46 PM, Emeric Brun wrote:
>> Hi Grant,
>> 
>> On 03/15/2017 12:05 PM, Emeric Brun wrote:
>>> Hi Grant,
>>> 
>>> On 02/04/2017 12:55 AM, Grant Zhang wrote:
 This patch set adds the basic support for OpenSSL crypto engine and 
 async mode.
 
 Changes since V2:
 - support keyword "algo"
 - ensure SSL engines are initialized before loading certs.
 - limit one async fd per SSL connection
 - better integrate with event cache
 
 Changes since V1:
 - add multiple engine support
 - allow default algorithms to be specified for an engine
 - remove the support for engine identifier "all" since (a) it is not 
 possible
  to specify default algorithms for all engine and (b) "all" makes it hard 
 to
  figure out what engine does what crypto algorithms.
 - address Willy's other comments.
 
>>> 
>> 
>> An other issue:
>> 
>> i'm using that configuration:
>> 
>> global
>>ssl-engine qat algo RSA
>>ssl-async
>>tune.ssl.default-dh-param 2048
>> 
>> listen ss
>>mode tcp 
>>bind 0.0.0.0:8080
>>server ssl 127.0.0.1:8443 ssl no-ssl-reuse verify none
>> 
>> listen gg
>>mode http
>>bind 0.0.0.0:8443 ssl crt /root/2048.pem
>>redirect location /
>> 
>> Unable to perform a clear request through 8080. There is no issue if I
>> disable the engine or if I request directly over SSL on 8443.
>> 
>> R,
>> Emeric
>> 
> 
> There is some inconsistencies between the engine and the used client:
> 
> here the conf:
> global
>tune.ssl.default-dh-param 2048
>ssl-engine qat
>ssl-async
> 
> listen gg
>mode http
>bind 0.0.0.0:8443 ssl crt /root/2048.pem
>redirect location /
> 
> openssl s_client -connect performs well but curl failed:
> emeric@ebr-laptop:~/inject$ curl -k  https://10.0.0.109:8443/
> curl: (35) gnutls_handshake() failed: Bad record MAC
> 
> 
> If I comment the ssl-engine line, no more issue.
> 
> R,
> Emeric
> 
> the conf:
> 
> 
> 
> 




Re: Some compilation SSL errors/warnings on debian testing

2017-03-15 Thread Emmanuel Hocdet

> Le 15 mars 2017 à 12:41, Emmanuel Hocdet  a écrit :
> 
> 
>> Le 14 mars 2017 à 19:11, Willy Tarreau > a 
>> écrit :
>>> 
>>> For the little story: openssl-1.1.0 and boringssl have 
>>> SSL_CTX_set_min_proto_version/SSL_CTX_set_max_proto_version
>>> and other methods to set protocol version are deprecated (or not 
>>> implemented).
>>> It will be boring to keep compat with haproxy ssl directive no- and 
>>> force-.
>>> And perhaps the add of some min- and max-.
>>> 
>>> Willy, what do you think?
>> 
>> Well, that means we'll definitely break all setups and piss-off users :-(
>> 
>> What we can possibly do is to determine the appropriate min/max based on
>> the existing no- and force- and complain if there are holes.
>> Ie, if a user has "no-sslv3 no-tls10 no-tls11" then the min version is
>> TLS 1.2 and that could work. If a user has "no-sslv3 no-tls11 no-tls12"
>> then the min and max versions are TLS 1.0. But if a user has
>> "no-sslv3 no-tls11" then we don't know and we have to complain. Hopefully
>> it would affect very few users (those with strange setups or not aware of
>> their holes).
>> What bothers me with this API change is that it doesn't provide the same
>> flexibility. If you have a vulnerability coming with an implementation or
>> simply a known incompatibility and want to disable it and only this one,
>> you're screwed. With the previous API it was possible. That's why I'm
>> still not 100% sold on the API change :-/
>> 
> 
> 
This usage is not recommended:

"The list of protocols available can also be limited using the SSL_OP_NO_SSLv3, 
SSL_OP_NO_TLSv1, SSL_OP_NO_TLSv1_1 and SSL_OP_NO_TLSv1_2 options of the 
SSL_CTX_set_options or SSL_set_options functions, but this approach is not 
recommended. Clients should avoid creating "holes" in the set of protocols they 
support. When disabling a protocol, make sure that you also disable either all 
previous or all subsequent protocol versions. In clients, when a protocol 
version is disabled without disabling all previous protocol versions, the 
effect is to also disable all subsequent protocol versions."



Re: [PATCH] CLEANUP: pattern: Move pattern_finalize_config to post checks initialization

2017-03-15 Thread Thierry FOURNIER
On Mon, 13 Mar 2017 18:54:52 +0100
Nenad Merdanovic  wrote:

> Hey Willy,
> 
> On 3/13/2017 6:32 PM, Willy Tarreau wrote:
> > Hi Nenad,
> > 
> > [ccing Thierry]
> > 
> > On Sun, Mar 12, 2017 at 10:00:51PM +0100, Nenad Merdanovic wrote:
> >> Signed-off-by: Nenad Merdanovic 
> >> ---
> >>  include/proto/pattern.h | 2 --
> >>  src/haproxy.c   | 2 --
> >>  src/pattern.c   | 9 -
> >>  3 files changed, 8 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/include/proto/pattern.h b/include/proto/pattern.h
> >> index 9c93db9..1a17445 100644
> >> --- a/include/proto/pattern.h
> >> +++ b/include/proto/pattern.h
> >> @@ -37,8 +37,6 @@ extern void (*pat_prune_fcts[PAT_MATCH_NUM])(struct 
> >> pattern_expr *);
> >>  extern struct pattern *(*pat_match_fcts[PAT_MATCH_NUM])(struct sample *, 
> >> struct pattern_expr *, int);
> >>  extern int pat_match_types[PAT_MATCH_NUM];
> >>  
> >> -void pattern_finalize_config(void);
> >> -
> >>  /* return the PAT_MATCH_* index for match name "name", or < 0 if not 
> >> found */
> >>  static inline int pat_find_match_name(const char *name)
> >>  {
> >> diff --git a/src/haproxy.c b/src/haproxy.c
> >> index 559b481..bd7a5ca 100644
> >> --- a/src/haproxy.c
> >> +++ b/src/haproxy.c
> >> @@ -809,8 +809,6 @@ static void init(int argc, char **argv)
> >>exit(1);
> >>}
> >>  
> >> -  pattern_finalize_config();
> >> -
> >>err_code |= check_config_validity();
> >>if (err_code & (ERR_ABORT|ERR_FATAL)) {
> >>Alert("Fatal errors found in configuration.\n");
> >> diff --git a/src/pattern.c b/src/pattern.c
> >> index 60fe462..224e326 100644
> >> --- a/src/pattern.c
> >> +++ b/src/pattern.c
> >> @@ -2463,7 +2463,7 @@ int pattern_delete(struct pattern_expr *expr, struct 
> >> pat_ref_elt *ref)
> >>  /* This function finalize the configuration parsing. Its set all the
> >>   * automatic ids
> >>   */
> >> -void pattern_finalize_config(void)
> >> +static int pattern_finalize_config(void)
> >>  {
> >>int i = 0;
> >>struct pat_ref *ref, *ref2, *ref3;
> >> @@ -2509,4 +2509,11 @@ void pattern_finalize_config(void)
> >>/* swap root */
> >>LIST_ADD(, _reference);
> >>LIST_DEL();
> >> +  return 0;
> >> +}
> >> +
> >> +__attribute__((constructor))
> >> +static void __pattern_init(void)
> >> +{
> >> +hap_register_post_check(pattern_finalize_config);
> >>  }
> > 
> > Are you sure about this one ? The post_check will cause the initialization
> > to be called *after* check_config_validity(). I'm not saying it's not
> > compatible, I just don't know if what it does is needed during the checks.
> > If you have already checked, I'm fine with it (but then please mention it
> > in the commit message).
> 
> I checked and didn't see any impact to check_config_validity(). This
> function assigns unique IDs to the patterns for CLI purposes and
> initializes the pattern LRU cache. It is pretty much the same as
> tlskeys_finalize_config() which was already moved to post checks init.
> 
> I'll resend the patch with the new commit message.


Hi, sorry for my slow response. I read the code and I don't see any
suspicious point. In addition, we can take into account these points:

 - The LRU cache is useless during the configuration check.

 - The pattern list IDs are not used by the configuration, only by the CLI,
   so the configuration check cannot fail if the IDs are wrongly positioned.

So, I think that this patch will not cause any problem, and I think that
this way of initializing is cleaner than the old function call in the
haproxy main code.

br,
Thierry.


> Regards,
> Nenad
> 
> > 
> > Thanks,
> > Willy
> > 



Re: Dynamic cookies support

2017-03-15 Thread Willy Tarreau
On Wed, Mar 15, 2017 at 03:19:15PM +0100, Olivier Houchard wrote:
> Oops my bad, I'm an idiot.

Great, that's exactly what we were missing here, I needed some help.

> Willy, can you commit the attached patch ?

Both patches merged, thanks guys.

Willy



Re: Force connection close after a haproxy reload

2017-03-15 Thread Dave Cottlehuber
On Wed, 15 Mar 2017, at 12:02, Willy Tarreau wrote:
> Hi Cyril!
> 
> On Wed, Mar 15, 2017 at 11:48:01AM +0100, Cyril Bonté wrote:
> > As a reminder (to me), I sent a patch in december (just before the 1.7.0
> > release), which immediately closes the HTTP keep-alived connections.
> > Currently, during the soft stop, HTTP connections are only closed when a
> > request is processed, it doesn't do anything on connections already in an
> > idle state.
> 
> Ah yes I vaguely remember about this discussion now.
> 
> > I didn't spend more time on it but having a quick look at it, it may be 
> > ready
> > to merge soon.
> 
> Cool!
> 
> > About TCP connections, while I wrote the patch, I was thinking about a 
> > global
> > "grace timeout", which will enforce haproxy exit if the soft stop takes too
> > long (for example when tcp connections don't expire). Something like :
> > 
> > global
> >   grace 30s

Yes please.  I have a reasonable number of websocket connections that
run for hours or days. I'd much prefer an operational guarantee that a
restart/reload will take no longer than 5 minutes, by which time all of
the transactional HTTP-only (non-upgraded) connections will have been
long gone.

A+
Dave 



Re: Dynamic cookies support

2017-03-15 Thread Olivier Houchard
On Wed, Mar 15, 2017 at 03:52:04PM +0200, Jarno Huuskonen wrote:
> Hi Olivier,
> 
> On Tue, Mar 14, Olivier Houchard wrote:
> > Hi guys,
> > 
> > You'll find attached patches to add support for dynamically-generated 
> > session
> > cookies for each server, the said cookies will be a hash of the IP, the
> > TCP port, and a secret key provided.
> > This adds 2 keywords to the config file, a "dynamic" keyword in the cookie
> > line, which just enables dynamic cookie, and a "dynamic-cookie-key" keyword,
> > for backends, which specifies the secret key to use.
> > This is a first step to be able to add and remove servers on the fly, 
> > without modifying the configuration. That way, we can have cookies that are
> > automatically usable for all the load-balancers.
> > 
> > Any comment would be welcome.
> 
> If I omit dynamic-cookie-key then I'll get a crash:
> (gdb) bt
> #0  0x76a72e71 in __strlen_sse2_pminub () from /lib64/libc.so.6
> #1  0x0044bbe4 in srv_set_dyncookie (s=s@entry=0x720050)
> at src/server.c:88
> (I think p->dyncookie_key is NULL).
> 
> #2  0x00429720 in check_config_validity () at src/cfgparse.c:8480
> #3  0x00487b25 in init (argc=, argc@entry=4, 
> argv=, argv@entry=0x7fffdcb8) at src/haproxy.c:814
> #4  0x0040820b in main (argc=4, argv=0x7fffdcb8)
> at src/haproxy.c:1643
> 
> with this config(defaults/global omitted):
> frontend test
>   bind ipv4@127.0.0.1:8080
>   default_backend test_be
> 
> backend test_be
>   balance roundrobin
>   #dynamic-cookie-key "dymmykey"
>   cookie WEBSRV insert dynamic
>   server wp1 127.0.0.1:8084 id 1
>   server wp2 127.0.0.1:8085 id 2
> 


Oops my bad, I'm an idiot.
Willy, can you commit the attached patch ?

> Is the hash the same with little/big endian processors ?
> memcpy(&(tmpbuf[key_len]),
> s->addr.ss_family == AF_INET ?
> (void *)&((struct sockaddr_in *)&s->addr)->sin_addr.s_addr :
> (void *)&(((struct sockaddr_in6 *)&s->addr)->sin6_addr.s6_addr),
> addr_len);
> 

I expect those to be stored in network byte order, so it should be.

> -Jarno
> 
> (Also there's a minor typo in comments:
> > +/* Calculates the dynamic persitent cookie for a server, if a secret key 
> > has
> > + * been provided.
> > + */
> ...
> > +   else if (!strcmp(args[cur_arg], "dynamic")) { /* 
> > Dynamic persitent cookies secret key */
> 
> s/persitent/persistent/)
> 

Oops, patch #2 :)

Thanks a lot !

Olivier
From f9a92fd9a624962860af9c61208d60e304e60a52 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Wed, 15 Mar 2017 15:11:06 +0100
Subject: [PATCH 1/2] BUG/MEDIUM server: Fix crash when dynamic is defined, but
 no key is provided.

Wait until we're sure we have a key before trying to calculate its length.
---
 src/server.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/server.c b/src/server.c
index 4e03e50..5589723 100644
--- a/src/server.c
+++ b/src/server.c
@@ -85,7 +85,7 @@ void srv_set_dyncookie(struct server *s)
struct server *tmpserv;
char *tmpbuf;
unsigned long long hash_value;
-   size_t key_len = strlen(p->dyncookie_key);
+   size_t key_len;
size_t buffer_len;
int addr_len;
int port;
@@ -94,6 +94,7 @@ void srv_set_dyncookie(struct server *s)
!(s->proxy->ck_opts & PR_CK_DYNAMIC) ||
s->proxy->dyncookie_key == NULL)
return;
+   key_len = strlen(p->dyncookie_key);
 
if (s->addr.ss_family != AF_INET &&
s->addr.ss_family != AF_INET6)
-- 
2.9.3

From c576bb09e643b34798056c83182752e872e578c1 Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Wed, 15 Mar 2017 15:12:06 +0100
Subject: [PATCH 2/2] MINOR: Typo in comment.

---
 src/cfgparse.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/cfgparse.c b/src/cfgparse.c
index 5a09589..2eb25ed 100644
--- a/src/cfgparse.c
+++ b/src/cfgparse.c
@@ -3310,7 +3310,7 @@ int cfg_parse_listen(const char *file, int linenum, char 
**args, int kwm)
curproxy->cookie_maxlife = maxlife;
cur_arg++;
}
-   else if (!strcmp(args[cur_arg], "dynamic")) { /* 
Dynamic persitent cookies secret key */
+   else if (!strcmp(args[cur_arg], "dynamic")) { /* 
Dynamic persistent cookies secret key */
 
if (warnifnotcap(curproxy, PR_CAP_BE, file, 
linenum, args[cur_arg], NULL))
err_code |= ERR_WARN;
-- 
2.9.3



Re: Possible regression on HAProxy 1.6, related to ACLs and dynamic payload buffers

2017-03-15 Thread Felipe Guerreiro Barbosa Ruiz
Sure thing, I'll get it tested and submit the patch. Thanks for the swift
response.

Cheers,
Felipe

On 15 March 2017 at 07:06, Willy Tarreau  wrote:

> On Wed, Mar 15, 2017 at 10:56:10AM +0100, Willy Tarreau wrote:
> > > I did some digging and found that the ability to handle dynamic buffers
> > > was added in 00f0084752eab236af80e61291d672e835790cff
> > > but it seems to have been removed in
> > > d7bdcb874bcbd82737e51f080b9b0092863399f9,
> > > but I can't tell if the removal was done on purpose or if it was an
> > > accident. Either way, d7bdcb874bcb was not backported to 1.5, making
> > > their behaviour wildly different. Also, due to d7bdcb874bcb, 1.6 does
> > > not work as stated in the docs, which say that "As a special case, if
> > > the  argument is zero, the whole buffer from  to the end is
> > > extracted."
> >
> > That's a very acute observation.
> >
> > > Was that on purpose or is it an actual bug?
> >
> > Both. It was made on purpose to address a specific case without thinking
> > about this one in my opinion. I think we should return "don't know yet"
> > when the current buffer size is zero (not allocated yet) and use the
> > buffer's size when the argument is zero. I'm going to have a look at it.
>
> Looking closer, I was wrong, it's only a bug. In fact the same condition
> was used to address both payload_lv() and payload() while only the latter
> makes a special case of length zero. So I think this should be changed
> like this :
>
> -   if (!buf_size || buf_size > global.tune.bufsize || buf_offset + buf_size > global.tune.bufsize) {
> +   if (buf_size > global.tune.bufsize || buf_offset + buf_size > global.tune.bufsize) {
>
> Looking at the rest of the function it should work. Would you care to test
> and to send us this patch with a suitable commit message ? Please look at
> CONTRIBUTING to get all the info and/or just "git log src/payload.c".
>
> Thanks,
> Willy
>


Re: Dynamic cookies support

2017-03-15 Thread Jarno Huuskonen
Hi Olivier,

On Tue, Mar 14, Olivier Houchard wrote:
> Hi guys,
> 
> You'll find attached patches to add support for dynamically-generated session
> cookies for each server, the said cookies will be a hash of the IP, the
> TCP port, and a secret key provided.
> This adds 2 keywords to the config file, a "dynamic" keyword in the cookie
> line, which just enables dynamic cookie, and a "dynamic-cookie-key" keyword,
> for backends, which specifies the secret key to use.
> This is a first step to be able to add and remove servers on the fly, 
> without modifying the configuration. That way, we can have cookies that are
> automatically usable for all the load-balancers.
> 
> Any comment would be welcome.

If I omit dynamic-cookie-key then I'll get a crash:
(gdb) bt
#0  0x76a72e71 in __strlen_sse2_pminub () from /lib64/libc.so.6
#1  0x0044bbe4 in srv_set_dyncookie (s=s@entry=0x720050)
at src/server.c:88
(I think p->dyncookie_key is NULL).

#2  0x00429720 in check_config_validity () at src/cfgparse.c:8480
#3  0x00487b25 in init (argc=, argc@entry=4, 
argv=, argv@entry=0x7fffdcb8) at src/haproxy.c:814
#4  0x0040820b in main (argc=4, argv=0x7fffdcb8)
at src/haproxy.c:1643

with this config(defaults/global omitted):
frontend test
bind ipv4@127.0.0.1:8080
default_backend test_be

backend test_be
balance roundrobin
#dynamic-cookie-key "dymmykey"
cookie WEBSRV insert dynamic
server wp1 127.0.0.1:8084 id 1
server wp2 127.0.0.1:8085 id 2

Is the hash the same with little/big endian processors ?
memcpy(&(tmpbuf[key_len]),
s->addr.ss_family == AF_INET ?
(void *)&((struct sockaddr_in *)&s->addr)->sin_addr.s_addr :
(void *)&(((struct sockaddr_in6 *)&s->addr)->sin6_addr.s6_addr),
addr_len);

-Jarno

(Also there's a minor typo in comments:
> +/* Calculates the dynamic persitent cookie for a server, if a secret key has
> + * been provided.
> + */
...
> + else if (!strcmp(args[cur_arg], "dynamic")) { /* 
> Dynamic persitent cookies secret key */

s/persitent/persistent/)

-- 
Jarno Huuskonen



Re: OpenSSL engine and async support

2017-03-15 Thread Emeric Brun
Hi Grant,

On 03/15/2017 12:05 PM, Emeric Brun wrote:
> Hi Grant,
> 
> On 02/04/2017 12:55 AM, Grant Zhang wrote:
>> This patch set adds the basic support for OpenSSL crypto engine and 
>> async mode.
>>
>> Changes since V2:
>> - support keyword "algo"
>> - ensure SSL engines are initialized before loading certs.
>> - limit one async fd per SSL connection
>> - better integrate with event cache
>>
>> Changes since V1:
>> - add multiple engine support
>> - allow default algorithms to be specified for an engine
>> - remove the support for engine identifier "all" since (a) it is not possible
>>   to specify default algorithms for all engines and (b) "all" makes it hard to
>>   figure out which engine handles which crypto algorithms.
>> - address Willy's other comments.
>>
> 

Another issue:

I'm using this configuration:

global
ssl-engine qat algo RSA
ssl-async
tune.ssl.default-dh-param 2048

listen ss
mode tcp 
bind 0.0.0.0:8080
server ssl 127.0.0.1:8443 ssl no-ssl-reuse verify none

listen gg
mode http
bind 0.0.0.0:8443 ssl crt /root/2048.pem
redirect location /

Unable to perform a clear request through 8080. There is no issue if I 
disable the engine or if I make the request directly over SSL on 8443. 

R,
Emeric



Re: Some compilation SSL errors/warnings on debian testing

2017-03-15 Thread Emmanuel Hocdet

> On 14 March 2017 at 19:11, Willy Tarreau wrote:
>> 
>> For the little story: openssl-1.1.0 and boringssl have 
>> SSL_CTX_set_min_proto_version/SSL_CTX_set_max_proto_version
>> and other methods to set protocol version are deprecated (or not 
>> implemented).
>> It will be boring to keep compat with haproxy ssl directive no- and 
>> force-.
>> And perhaps the add of some min- and max-.
>> 
>> Willy, what do you think?
> 
> Well, that means we'll definitely break all setups and piss-off users :-(
> 
> What we can possibly do is to determine the appropriate min/max based on
> the existing no- and force- and complain if there are holes.
> Ie, if a user has "no-sslv3 no-tls10 no-tls11" then the min version is
> TLS 1.2 and that could work. If a user has "no-sslv3 no-tls11 no-tls12"
> then the min and max versions are TLS 1.0. But if a user has
> "no-sslv3 no-tls11" then we don't know and we have to complain. Hopefully
> it would affect very few users (those with strange setups or not aware of
> their holes).
> What bothers me with this API change is that it doesn't provide the same
> flexibility. If you have a vulnerability coming with an implementation or
> simply a known incompatibility and want to disable it and only this one,
> you're screwed. With the previous API it was possible. That's why I'm
> still not 100% sold on the API change :-/
> 

ssl_options still seems valid: all directives can be mapped to it while
keeping compatibility.

Manu



Re: Force connection close after a haproxy reload

2017-03-15 Thread Willy Tarreau
Hi Cyril!

On Wed, Mar 15, 2017 at 11:48:01AM +0100, Cyril Bonté wrote:
> As a reminder (to me), I sent a patch in december (just before the 1.7.0
> release), which immediately closes the HTTP keep-alived connections.
> Currently, during the soft stop, HTTP connections are only closed when a
> request is processed, it doesn't do anything on connections already in an
> idle state.

Ah yes I vaguely remember about this discussion now.

> I didn't spend more time on it but having a quick look at it, it may be ready
> to merge soon.

Cool!

> About TCP connections, while I wrote the patch, I was thinking about a global
> "grace timeout", which will enforce haproxy exit if the soft stop takes too
> long (for example when tcp connections don't expire). Something like :
> 
> global
>   grace 30s

It's pretty similar to what we have with the "grace" setting in frontends
which maintains the listener alive for some time and automatically shuts
it down. And yes I think it does make sense because in the end people
complain about "too old" processes.

Please note, if you go down that route, you don't need to kill idle sessions
upon signal receipt anymore, all of them should be dealt with by this grace
period. That will make http quit even more cleanly as well.

In the past I thought about shortening the client/server timeouts upon
reload, or switching them to client-fin and server-fin only. But that
wouldn't work because some connections can remain very active for a long
time and would never die.

> Would it be something acceptable Willy ?

Yes I think that this grace setting would solve all use cases and avoid
having to kill each and every pending connection (they would simply go
away with the process). And since it's a new setting we could possibly
even imagine backporting it to 1.7 if there's strong demand and it's
riskless.

Cheers,
Willy



Re: Force connection close after a haproxy reload

2017-03-15 Thread Pavlos Parissis
On 15/03/2017 11:48 AM, Cyril Bonté wrote:
> Hi all,
> 
>> From: "Willy Tarreau"  To: "Robson Roberto Souza Peixoto"
>>  Cc: haproxy@formilux.org Sent: Tuesday 14 March
>> 2017 13:20:46 Subject: Re: Force connection close after a haproxy reload
>> 
>> On Tue, Mar 14, 2017 at 11:16:26AM +, Robson Roberto Souza Peixoto
>> wrote:
>>> But will `-st` mode wait for current http requests finish? Or will 
>>> interrupt all connections without waiting for the responses?
>> 
>> It will interrupt them all but you said you were running TCP and not HTTP 
>> so with TCP there's no notion of end of request nor response.
> 
> As a reminder (to me), I sent a patch in december (just before the 1.7.0
> release), which immediately closes the HTTP keep-alived connections.
> Currently, during the soft stop, HTTP connections are only closed when a
> request is processed, it doesn't do anything on connections already in an
> idle state. I didn't spend more time on it but having a quick look at it, it
> may be ready to merge soon.
> 

What I have observed in production is that in HTTP mode the timeout settings
(client, http-request) determine how long older processes stay around: longer
timeouts make older processes stay around longer.
Having said this, it all depends on the traffic composition. For instance, when
you have radio streaming, older processes can stay around for days regardless
of the timeout settings.


> About TCP connections, while I wrote the patch, I was thinking about a global
> "grace timeout", which will enforce haproxy exit if the soft stop takes too
> long (for example when tcp connections don't expire). Something like :
> 
> global grace 30s

This is a very interesting approach, as it solves the issue of old processes
staying around for days for TCP mode. +1 from me.


Cheers,
Pavlos





Re: Force connection close after a haproxy reload

2017-03-15 Thread Cyril Bonté
Hi all,

> From: "Willy Tarreau" 
> To: "Robson Roberto Souza Peixoto" 
> Cc: haproxy@formilux.org
> Sent: Tuesday 14 March 2017 13:20:46
> Subject: Re: Force connection close after a haproxy reload
> 
> On Tue, Mar 14, 2017 at 11:16:26AM +, Robson Roberto Souza
> Peixoto wrote:
> > But will `-st` mode wait for current http requests finish? Or will
> > interrupt all connections without waiting for the responses?
> 
> It will interrupt them all but you said you were running TCP and not
> HTTP
> so with TCP there's no notion of end of request nor response.

As a reminder (to me), I sent a patch in december (just before the 1.7.0 
release), which immediately closes the HTTP keep-alived connections. Currently, 
during the soft stop, HTTP connections are only closed when a request is 
processed, it doesn't do anything on connections already in an idle state.
I didn't spend more time on it but having a quick look at it, it may be ready 
to merge soon.

About TCP connections, while I wrote the patch, I was thinking about a global 
"grace timeout", which will enforce haproxy exit if the soft stop takes too 
long (for example when tcp connections don't expire). Something like :

global
  grace 30s

Would it be something acceptable Willy ?

Cheers,
Cyril Bonté



Re: [RFC PATCH] MEDIUM: persistent connections for SSL checks

2017-03-15 Thread Willy Tarreau
On Mon, Mar 13, 2017 at 06:10:23PM +0100, Willy Tarreau wrote:
> > Just wanted to follow up. I've been running this patch for a couple days on
> > an idle system and haven't noticed any problems.
> > Could this be merged? Is there anything else I can test?
> 
> I'm personally fine with it but I'd rather have Emeric approve it, as
> he knows better than me the possible impacts of shutting down cleanly
> or not on SSL.
> 
> Emeric, I've re-attached the patch. Using conn_data_shutw() instead of
> conn_data_shutw_hard() causes the "clean" flag to be set when calling
> ssl_sock_shutw() and SSL_set_quiet_shutdown() not to be called so that
> we perform a clean shutdown. The purpose is to give a chance to the
> server to store the context and avoid renegotiating.

Now applied with Emeric's blessing.

Thanks,
Willy



Re: Dynamic cookies support

2017-03-15 Thread Willy Tarreau
On Tue, Mar 14, 2017 at 11:22:35PM +0100, Willy Tarreau wrote:
> Well that looks pretty good, I have no comment to make about it. 
> Do you want me to merge them now or are you seeking more comments first ?

now applied, thanks!
Willy



Re: Possible regression on HAProxy 1.6, related to ACLs and dynamic payload buffers

2017-03-15 Thread Willy Tarreau
On Wed, Mar 15, 2017 at 10:56:10AM +0100, Willy Tarreau wrote:
> > I did some digging and found that the ability to handle dynamic buffers was
> > added in 00f0084752eab236af80e61291d672e835790cff
> > but it seems to have been removed in
> > d7bdcb874bcbd82737e51f080b9b0092863399f9,
> > but I can't tell if the removal was done on purpose or if it was an
> > accident. Either way, d7bdcb874bcb was not backported to 1.5, making their
> > behaviour wildly different. Also, due to d7bdcb874bcb, 1.6 does not work as
> > stated in the docs, which say that "As a special case, if the  argument is
> > zero, the whole buffer from  to the end is extracted."
> 
> That's a very acute observation.
> 
> > Was that on purpose or is it an actual bug?
> 
> Both. It was made on purpose to address a specific case without thinking about
> this one in my opinion. I think we should return "don't know yet" when the
> current buffer size is zero (not allocated yet) and use the buffer's size
> when the argument is zero. I'm going to have a look at it.

Looking closer, I was wrong, it's only a bug. In fact the same condition
was used to address both payload_lv() and payload() while only the latter
makes a special case of length zero. So I think this should be changed
like this :

-   if (!buf_size || buf_size > global.tune.bufsize || buf_offset + buf_size > global.tune.bufsize) {
+   if (buf_size > global.tune.bufsize || buf_offset + buf_size > global.tune.bufsize) {

Looking at the rest of the function it should work. Would you care to test
and to send us this patch with a suitable commit message ? Please look at
CONTRIBUTING to get all the info and/or just "git log src/payload.c".

Thanks,
Willy



Re: Possible regression on HAProxy 1.6, related to ACLs and dynamic payload buffers

2017-03-15 Thread Willy Tarreau
Hi Felipe,

On Tue, Mar 14, 2017 at 04:03:22PM -0300, Felipe Guerreiro Barbosa Ruiz wrote:
> Hi all,
> 
> After upgrading from 1.5 to 1.6 I noticed some ACLs stopped working. All of
> them looked like: acl some_name req.payload(0,0) <>
> 
> I did some digging and found that the ability to handle dynamic buffers was
> added in 00f0084752eab236af80e61291d672e835790cff
> but it seems to have been removed in
> d7bdcb874bcbd82737e51f080b9b0092863399f9,
> but I can't tell if the removal was done on purpose or if it was an
> accident. Either way, d7bdcb874bcb was not backported to 1.5, making their
> behaviour wildly different. Also, due to d7bdcb874bcb, 1.6 does not work as
> stated in the docs, which say that "As a special case, if the  argument is
> zero, the whole buffer from  to the end is extracted."

That's a very acute observation.

> Was that on purpose or is it an actual bug?

Both. It was made on purpose to address a specific case without thinking about
this one in my opinion. I think we should return "don't know yet" when the
current buffer size is zero (not allocated yet) and use the buffer's size
when the argument is zero. I'm going to have a look at it.

Thanks,
Willy