Re: directing requests to a specific server

2019-05-23 Thread Paul Lockaby
Perfect! Thanks Tim. There are so many options in the HAProxy configuration that
I sometimes get lost in it.


> On May 23, 2019, at 12:09 PM, Tim Düsterhus  wrote:
> 
> Paul,
> 
> Am 23.05.19 um 20:17 schrieb Paul Lockaby:
>> If there is a way that I can direct a request to a specific server in a 
>> backend rather than duplicating backends with different server lists that 
>> would be ideal. Is that possible?
> 
> I believe you are searching for use-server:
> https://cbonte.github.io/haproxy-dconv/1.9/configuration.html#4.2-use-server
> 
> Best regards
> Tim Düsterhus



Re: directing requests to a specific server

2019-05-23 Thread Tim Düsterhus
Paul,

Am 23.05.19 um 20:17 schrieb Paul Lockaby:
> If there is a way that I can direct a request to a specific server in a 
> backend rather than duplicating backends with different server lists that 
> would be ideal. Is that possible?

I believe you are searching for use-server:
https://cbonte.github.io/haproxy-dconv/1.9/configuration.html#4.2-use-server
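
For the /monitor/node/... idea from your mail, a minimal sketch could look like
the following (reusing your host01/host02/host03 servers; untested, so please
treat it as a starting point only):

backend monitor_cluster
mode http
log global
balance source
hash-type consistent
option httpchk GET /haproxy/alive.txt
http-check disable-on-404

server host01 host01.example.com:1234 check
server host02 host02.example.com:1234 check
server host03 host03.example.com:1234 check

# pin /monitor/node/<name> to one node; other paths keep the normal balancing
acl want_host01 path_beg /monitor/node/host01
acl want_host02 path_beg /monitor/node/host02
acl want_host03 path_beg /monitor/node/host03
use-server host01 if want_host01
use-server host02 if want_host02
use-server host03 if want_host03

(The frontend would also need a path_beg /monitor/node rule sending those
requests to this backend.)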

Best regards
Tim Düsterhus



directing requests to a specific server

2019-05-23 Thread Paul Lockaby
Hello!

I have a frontend/backend that looks kind of like below, obviously very 
simplified.



frontend myhost-frontend
bind *:443 ssl crt /usr/local/ssl/certs/host.pem
mode http
log global

acl request_monitor_cluster path_beg /monitor/cluster
use_backend monitor_cluster if request_monitor_cluster

# otherwise send requests to this backend
default_backend myhost-backend

backend monitor_cluster
mode http
log global
balance source
hash-type consistent
option httpchk GET /haproxy/alive.txt
http-check disable-on-404

server host01 host01.example.com:1234 check
server host02 host02.example.com:1234 check
server host03 host03.example.com:1234 check



The key thing here is that if someone goes to 
https://myhost.example.com/monitor/cluster, they access the monitoring system 
on some arbitrary host in the cluster. I'd like to be able to do something 
where, if I go to, say, https://myhost.example.com/monitor/node/host01, I get 
to the monitoring system on a specific node.

My first thought is that I would have to make three more backends, one for each 
node in the cluster, and each backend would only have one server in it. If 
there is a way that I can direct a request to a specific server in a backend 
rather than duplicating backends with different server lists, that would be 
ideal. Is that possible?


Re: Sticky-table persistence in a Kubernetes environment

2019-05-23 Thread Willy Tarreau
Hi Eduardo,

On Thu, May 23, 2019 at 10:09:55AM -0300, Eduardo Doria Lima wrote:
> Hi Aleks,
> 
> I don't understand what you means with "local host". But could be nice if
> new process get data of old process.

That's exactly the principle. A peers section contains a number of peers,
including the local one. For example, let's say you have 4 haproxy nodes;
all of them will have the exact same section:

   peers my-cluster
   peer node1 10.0.0.1:1200
   peer node2 10.0.0.2:1200
   peer node3 10.0.0.3:1200
   peer node4 10.0.0.4:1200

When you start haproxy it checks if there is a peer with the same name
as the local machine; if so, it considers it the local peer and will
try to synchronize the full tables with it. Normally what this means is
that the old process connects to the new one to teach it everything.
When your peer names don't match the machine's hostname, you can force it
on the command line using -L to give the local peer name, e.g. "-L node3".

Also, be sure to properly reload, not restart! The restart (-st) will
kill the old process without leaving it a chance to resynchronize! The
reload (-sf) will tell it to finish its work and then quit, and among its
work there's the resync job ;-)
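
For example, a reload on the node named node3 could look roughly like this
(a sketch; the config and pid file paths are made up and depend on how your
haproxy is started and supervised):

   haproxy -f /etc/haproxy/haproxy.cfg -L node3 \
           -sf $(cat /var/run/haproxy.pid)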

> As I said to João Morais, we "solve" this problem adding a sidecar HAProxy
> (another container in same pod) only to store the sticky-table of main
> HAProxy. In my opinion it's a resource waste, but this is best solution now.

That's a shame because the peers naturally support not losing tables on
reload, so indeed your solution is way more complex!

Hoping this helps,
Willy



Re: RFE: insert server into peer section

2019-05-23 Thread Willy Tarreau
Hi Aleks,

On Thu, May 23, 2019 at 01:12:48PM +0200, Aleksandar Lazic wrote:
> We had a interesting discussion on Kubeconf how a session table in peer's can
> survive a restart of a haproxy instance.
> 
> We came to a request for enhancement (RFE) to be able to add a peer server,
> not a peer section, to a existing peers section, similar to add server for
> backend.

It's not entirely clear to me in which case it would be useful. Do you
need to deploy new haproxy nodes on the fly and avoid restarting the
other ones so that they see the new node as a peer, maybe?

Thanks to the changes Fred did which made the peers become totally regular
servers, I suspect it wouldn't be too hard to make the server-template
mechanism work the same way for peers. It's just that I'm not sure about
the expected benefits.
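
For reference, the backend mechanism I'm referring to looks like this today
(a sketch; the prefix, count, FQDN and resolvers name are made up):

   backend app
       server-template srv 1-4 app.internal.example:8080 check resolvers mydns

A peers variant would presumably declare peer slots the same way, but that's
exactly the part which doesn't exist today.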

Cheers,
Willy



Re: [ANNOUNCE] haproxy-2.0-dev4

2019-05-23 Thread Willy Tarreau
On Thu, May 23, 2019 at 07:35:43PM +0500, Илья Шипицин wrote:
> we can definetly cache "git clone" for BoringSSL, I'll send patch.

OK!

> as for "build cache", it might be not that trivial.

No problem, I'm only suggesting. What matters the most to me is that
it works fine and causes few false positives (one every once in a while is OK).
The second goal is to make efficient use of the resources they assign
us for free. The third one is that it shows a low latency. The last two
goals tend to depend on the same principles :-)

Willy



Re: [ANNOUNCE] haproxy-2.0-dev4

2019-05-23 Thread Илья Шипицин
On Thu, 23 May 2019 at 18:45, Willy Tarreau wrote:

> On Thu, May 23, 2019 at 04:17:33PM +0500, Илья Шипицин wrote:
> > I'd like to run sanitizers on vaious combinations, like ZLIB / SLZ, PCRE
> /
> > PCRE2 ...
> > ok, let us do it before Wednesday
>
> OK, why not. Feel free to send patches once you can test them. Please
> make sure not to unreasonably increase the build time by multiplying
> the build combinations, right now it provides good value because you
> get the result while still working on the subject, this is an essential
> feature! And I'd even say that the boringssl build is extremely long and
> hinders this a little bit, I don't know why it is like this. If we could
> also save some resources on their infrastructure by keeping some prebuilt
> stuff somewhere, that would be great, but I have no idea whether it's
> possible to cache some data (e.g. rebuild the components only once a
> day).
>

we can definitely cache "git clone" for BoringSSL, I'll send a patch.

as for "build cache", it might be not that trivial.



>
> Cheers,
> Willy
>


Re: [ANNOUNCE] haproxy-2.0-dev4

2019-05-23 Thread Willy Tarreau
On Thu, May 23, 2019 at 04:17:33PM +0500, Илья Шипицин wrote:
> I'd like to run sanitizers on vaious combinations, like ZLIB / SLZ, PCRE /
> PCRE2 ...
> ok, let us do it before Wednesday

OK, why not. Feel free to send patches once you can test them. Please
make sure not to unreasonably increase the build time by multiplying
the build combinations, right now it provides good value because you
get the result while still working on the subject, this is an essential
feature! And I'd even say that the boringssl build is extremely long and
hinders this a little bit, I don't know why it is like this. If we could
also save some resources on their infrastructure by keeping some prebuilt
stuff somewhere, that would be great, but I have no idea whether it's
possible to cache some data (e.g. rebuild the components only once a
day).

Cheers,
Willy



Re: Sticky-table persistence in a Kubernetes environment

2019-05-23 Thread Eduardo Doria Lima
Hi Aleks,

I don't understand what you mean by "local host". But it would be nice if
the new process got the data from the old process.

As I said to João Morais, we "solve" this problem by adding a sidecar HAProxy
(another container in the same pod) only to store the sticky table of the main
HAProxy. In my opinion it's a waste of resources, but it's the best solution now.

I know João doesn't have time to implement the peers part now. But I'm trying
to run some tests; if successful, I can make a pull request.


Att,
Eduardo

On Thu, 23 May 2019 at 09:40, Aleksandar Lazic wrote:

>
> Hi Eduardo.
>
> Thu May 23 14:30:46 GMT+02:00 2019 Eduardo Doria Lima :
>
> > HI Aleks,
>  > "First why do you restart all haproxies at the same time and don't use
> rolling updates ?"
>  > We restarts all HAProxys at the same time because they watch Kubernetes
> API. The ingress ( https://github.com/jcmoraisjr/haproxy-ingress [
> https://github.com/jcmoraisjr/haproxy-ingress] ) do this automatic. I was
> talking with ingress creator João Morais about the possibility of use a
> random value to restart but we agree it's not 100% secure to keep the
> table. The ingress don't use rolling update because it's fast to realod
> HAProxy than kill entire Pod. I think. I will find more about this.
>
> João, Baptiste and I talked about this topic on the kubeconf here and the
> was the suggestion to add the "local host" in the peers section.
>  When a restart happen then haproxy new process ask haproxy old process to
> get the data.
>
> I don't know when joao have the time to implement the peers part.
>
> Regards
>  Aleks
>
> > "Maybe you can add a init container to update the peers in the current
> running haproxy pod's with socket commands, if possible."
>  > The problem is not update the peers, we can do this. The problem is all
> the peers reload at same time.
>  > "* how often happen such a restart?"
>  > Not to much, but enough to affect some users when it occurs.
>  >
>  > "* how many entries are in the tables?"
>  > I don't know exactly, maybe between thousand and ten thousand.
>  >
>  > Thanks!
>  > Att, Eduardo
>  >
>  >
>  >
>  > Em qua, 22 de mai de 2019 às 16:10, Aleksandar Lazic <
> al-hapr...@none.at [] > escreveu:
>  >
>  >>
>  >> Hi Eduardo.
>  >>
>  >> That's a pretty interesting question, at least for me.
>  >>
>  >> First why do you restart all haproxies at the same time and don't use
> rolling updates ?
>  >>
>  >>
> https://kubernetes.io/docs/tutorials/kubernetes-basics/update/update-intro/
> [
> https://kubernetes.io/docs/tutorials/kubernetes-basics/update/update-intro/
> ]
>  >>
>  >> Maybe you can add a init container to update the peers in the current
> running haproxy pod's with socket commands, if possible.
>  >>
>  >> https://kubernetes.io/docs/concepts/workloads/pods/init-containers/ [
> https://kubernetes.io/docs/concepts/workloads/pods/init-containers/]
>  >>
>  >> http://cbonte.github.io/haproxy-dconv/1.9/management.html#9.3 [
> http://cbonte.github.io/haproxy-dconv/1.9/management.html#9.3]
>  >>
>  >> Agree with you that peers possibility would be nice.
>  >>
>  >> Some other questions are.
>  >>
>  >> * how often happen such a restart?
>  >> * how many entries are in the tables?
>  >>
>  >> I don't see anything wrong to use a "quorum" Server. This is a pretty
> common solution even on contained setups.
>  >>
>  >> Regards
>  >> Aleks
>  >>
>  >> Wed May 22 15:36:10 GMT+02:00 2019 Eduardo Doria Lima <
> eduardo.l...@trt20.jus.br [] >:
>  >>
>  >>> Hi,
>  >>> I'm using HAProxy to support a system that was initially developed
> for Apache (AJP) and JBoss. Now we are migrating it's infrastructure to a
> Kubernetes cluster with HAProxy as ingress (load balancer).
>  >>> The big problem is this system depends strict to JSESSIONID. Some
> internal requests made in Javascript or Angular don't respect browser
> cookies and send requests only with original Jboss JSESSIONID value.
>  >>> Because of this we need a sticky-table to map JSESSIONID values. But
> in a cluster environment ( https://github.com/jcmoraisjr/haproxy-ingress [
> https://github.com/jcmoraisjr/haproxy-ingress] ) HAProxy has many
> instances and this instances don't have fixed IP, they are volatile.
>  >>> Also, in Kubernetes cluster everything is in constant change and any
> change is a reload of all HAProxy instances. So, we lost the sticky-table.
>  >>> Even we use "peers" feature as described in this issue (
> https://github.com/jcmoraisjr/haproxy-ingress/issues/296 [
> https://github.com/jcmoraisjr/haproxy-ingress/issues/296] ) by me, we
> don't know if table will persist because all instances will reload in the
> same time.
>  >>> We thought to use a separate HAProxy server only to cache this table.
> This HAProxy will never reload. But I'm not comfortable to use a HAProxy
> server instance only for this.
>  >>> I appreciate if you help me. Thanks!
>  >>>
>  >>> Att,
>  >>> Eduardo
>  >>>
>  >>
>  >
>
>
>


Re: Haproxy infront of exim cluster - SMTP protocol synchronization error

2019-05-23 Thread Jarno Huuskonen
Hi,

On Wed, May 22, Brent Clark wrote:
> 2019-05-22 12:23:15 SMTP protocol synchronization error (input sent
> without waiting for greeting): rejected connection from
> H=smtpgatewayserver [IP_OF_LB_SERVER] input="PROXY TCP4 $MY_IP
> $IP_OF_LB_SERVER 39156 587\r\n"

It seems like the proxy protocol is not enabled on exim.

> We use Exim and I set:
> hostlist haproxy_hosts = IP.OF.LB

Do you have hosts_proxy
(https://www.exim.org/exim-html-current/doc/html/spec_html/ch-proxies.html)
set/enabled?
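
A minimal sketch of the two sides, assuming the hostlist from your config and a
placeholder address for the exim backend (untested):

# exim main configuration
hostlist haproxy_hosts = IP.OF.LB
hosts_proxy = +haproxy_hosts

# haproxy backend server line; send-proxy is only useful once exim expects
# the PROXY header from that source
server smtp1 IP.OF.EXIM:587 send-proxy check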

-Jarno

> My haproxy config:
> https://pastebin.com/raw/JYAXkAq4
> 
> If I run
> openssl s_client -host smtpgatewayserver -port 587 -starttls smtp -crlf
> 
> openssl says connected, but SSL-Session is empty.
> 
> I would like to say, if I change 'send-proxy' to 'check', the
> everything works, BUT the IP logged by Exim, is that of the LB, and
> not the client.
> 
> If anyone could please review the haproxy config / my setup, it
> would be appreciated.
> 
> Many thanks
> Brent Clark
> 
> 

-- 
Jarno Huuskonen



Re: Sticky-table persistence in a Kubernetes environment

2019-05-23 Thread Aleksandar Lazic


Hi Eduardo.

Thu May 23 14:30:46 GMT+02:00 2019 Eduardo Doria Lima :

> HI Aleks,
 > "First why do you restart all haproxies at the same time and don't use 
 > rolling updates ?"
 > We restarts all HAProxys at the same time because they watch Kubernetes API. 
 > The ingress ( https://github.com/jcmoraisjr/haproxy-ingress 
 > [https://github.com/jcmoraisjr/haproxy-ingress] ) do this automatic. I was 
 > talking with ingress creator João Morais about the possibility of use a 
 > random value to restart but we agree it's not 100% secure to keep the table. 
 > The ingress don't use rolling update because it's fast to realod HAProxy 
 > than kill entire Pod. I think. I will find more about this.

João, Baptiste and I talked about this topic at KubeCon here, and the
suggestion was to add the "local host" in the peers section.
When a restart happens, the new haproxy process asks the old haproxy process to
get the data.

I don't know when João will have the time to implement the peers part.

Regards
 Aleks

> "Maybe you can add a init container to update the peers in the current 
> running haproxy pod's with socket commands, if possible."
 > The problem is not update the peers, we can do this. The problem is all the 
 > peers reload at same time.
 > "* how often happen such a restart?"
 > Not to much, but enough to affect some users when it occurs.
 >
 > "* how many entries are in the tables?"
 > I don't know exactly, maybe between thousand and ten thousand.
 >
 > Thanks!
 > Att, Eduardo
 >
 >
 >
 > Em qua, 22 de mai de 2019 às 16:10, Aleksandar Lazic < al-hapr...@none.at [] 
 > > escreveu:
 >
 >>
 >> Hi Eduardo.
 >>
 >> That's a pretty interesting question, at least for me.
 >>
 >> First why do you restart all haproxies at the same time and don't use 
 >> rolling updates ?
 >>
 >> https://kubernetes.io/docs/tutorials/kubernetes-basics/update/update-intro/ 
 >> [https://kubernetes.io/docs/tutorials/kubernetes-basics/update/update-intro/]
 >>
 >> Maybe you can add a init container to update the peers in the current 
 >> running haproxy pod's with socket commands, if possible.
 >>
 >> https://kubernetes.io/docs/concepts/workloads/pods/init-containers/ 
 >> [https://kubernetes.io/docs/concepts/workloads/pods/init-containers/]
 >>
 >> http://cbonte.github.io/haproxy-dconv/1.9/management.html#9.3 
 >> [http://cbonte.github.io/haproxy-dconv/1.9/management.html#9.3]
 >>
 >> Agree with you that peers possibility would be nice.
 >>
 >> Some other questions are.
 >>
 >> * how often happen such a restart?
 >> * how many entries are in the tables?
 >>
 >> I don't see anything wrong to use a "quorum" Server. This is a pretty 
 >> common solution even on contained setups.
 >>
 >> Regards
 >> Aleks
 >>
 >> Wed May 22 15:36:10 GMT+02:00 2019 Eduardo Doria Lima < 
 >> eduardo.l...@trt20.jus.br [] >:
 >>
 >>> Hi,
 >>> I'm using HAProxy to support a system that was initially developed for 
 >>> Apache (AJP) and JBoss. Now we are migrating it's infrastructure to a 
 >>> Kubernetes cluster with HAProxy as ingress (load balancer).
 >>> The big problem is this system depends strict to JSESSIONID. Some internal 
 >>> requests made in Javascript or Angular don't respect browser cookies and 
 >>> send requests only with original Jboss JSESSIONID value.
 >>> Because of this we need a sticky-table to map JSESSIONID values. But in a 
 >>> cluster environment ( https://github.com/jcmoraisjr/haproxy-ingress 
 >>> [https://github.com/jcmoraisjr/haproxy-ingress] ) HAProxy has many 
 >>> instances and this instances don't have fixed IP, they are volatile.
 >>> Also, in Kubernetes cluster everything is in constant change and any 
 >>> change is a reload of all HAProxy instances. So, we lost the sticky-table.
 >>> Even we use "peers" feature as described in this issue ( 
 >>> https://github.com/jcmoraisjr/haproxy-ingress/issues/296 
 >>> [https://github.com/jcmoraisjr/haproxy-ingress/issues/296] ) by me, we 
 >>> don't know if table will persist because all instances will reload in the 
 >>> same time.
 >>> We thought to use a separate HAProxy server only to cache this table. This 
 >>> HAProxy will never reload. But I'm not comfortable to use a HAProxy server 
 >>> instance only for this.
 >>> I appreciate if you help me. Thanks!
 >>>
 >>> Att,
 >>> Eduardo
 >>>
 >>
 >





Re: Sticky-table persistence in a Kubernetes environment

2019-05-23 Thread Eduardo Doria Lima
Hi Aleks,

"First why do you restart all haproxies at the same  time and don't use
rolling updates ?"

We restart all HAProxys at the same time because they watch the Kubernetes
API. The ingress (https://github.com/jcmoraisjr/haproxy-ingress) does this
automatically. I was talking with the ingress creator João Morais about the
possibility of using a random value for the restart, but we agreed it's not 100%
safe for keeping the table.
The ingress doesn't use rolling updates because it's faster to reload HAProxy
than to kill the entire Pod. I think. I will find out more about this.

"Maybe you can add a init container to update the peers in the current
running haproxy pod's  with socket commands, if possible."

The problem is not updating the peers, we can do this. The problem is that all
the peers reload at the same time.

"* how often happen such a restart?"

Not too much, but enough to affect some users when it occurs.

"* how many entries are in the tables?"

I don't know exactly, maybe between a thousand and ten thousand.


Thanks!

Att,
Eduardo



On Wed, 22 May 2019 at 16:10, Aleksandar Lazic wrote:

> Hi Eduardo.
>
> That's a pretty interesting question, at least for me.
>
> First why do you restart all haproxies at the same  time and don't use
> rolling updates ?
>
> https://kubernetes.io/docs/tutorials/kubernetes-basics/update/update-intro/
>
> Maybe you can add a init container to update the peers in the current
> running haproxy pod's  with socket commands, if possible.
>
> https://kubernetes.io/docs/concepts/workloads/pods/init-containers/
>
> http://cbonte.github.io/haproxy-dconv/1.9/management.html#9.3
>
> Agree with you that peers possibility would be nice.
>
> Some other questions are.
>
> * how often happen such a restart?
> * how many entries are in the tables?
>
> I don't see anything wrong to use a "quorum" Server. This is a pretty
> common solution even on contained setups.
>
> Regards
> Aleks
>
> Wed May 22 15:36:10 GMT+02:00 2019 Eduardo Doria Lima <
> eduardo.l...@trt20.jus.br>:
>
> Hi,
>
> I'm using HAProxy to support a system that was initially developed for
> Apache (AJP) and JBoss. Now we are migrating it's infrastructure to a
> Kubernetes cluster with HAProxy as ingress (load balancer).
>
> The big problem is this system depends strict to JSESSIONID. Some internal
> requests made in Javascript or Angular don't respect browser cookies and
> send requests only with original Jboss JSESSIONID value.
>
> Because of this we need a sticky-table to map JSESSIONID values. But in a
> cluster environment (https://github.com/jcmoraisjr/haproxy-ingress)
> HAProxy has many instances and this instances don't have fixed IP, they are
> volatile.
>
> Also, in Kubernetes cluster everything is in constant change and any
> change is a reload of all HAProxy instances. So, we lost the sticky-table.
>
> Even we use "peers" feature as described in this issue (
> https://github.com/jcmoraisjr/haproxy-ingress/issues/296) by me, we don't
> know if table will persist because all instances will reload in the same
> time.
>
> We thought to use a separate HAProxy server only to cache this table. This
> HAProxy will never reload. But I'm not comfortable to use a HAProxy server
> instance only for this.
>
> I appreciate if you help me. Thanks!
>
>
> Att,
> Eduardo
>
>


Re: [ANNOUNCE] haproxy-2.0-dev4

2019-05-23 Thread Илья Шипицин
On Thu, 23 May 2019 at 01:28, Willy Tarreau wrote:

> Hi,
>
> HAProxy 2.0-dev4 was released on 2019/05/22. It added 83 new commits
> after version 2.0-dev3.
>
> This release completes the integration of a few pending features and
> the ongoing necessary cleanups before 2.0.
>
> A few bugs were addressed in the way to deal with certain connection
> errors, but overall there was nothing dramatic, which indicates we're
> stabilizing (it has been running flawlessly for 1 week now on
> haproxy.org).
>
> There are a few new features that were already planned. One is the support
> of event ports as an alternate (read "faster") polling method on Solaris,
> by Manu. Another one is the replacement of the slow stream processing by
> a better and more reliable watchdog. It currently only supports Linux
> however, but a FreeBSD port seems reasonably easy to do. It will detect
> inter-thread deadlocks as well as tasks stuck looping in an endless list
> which has been corrupted, and will provoke a panic, dumping all threads
> states, then doing an abort (in hope to get a core). This will allow the
> problem to be immediately detected and even the service to be automatically
> restarted when the service manager supports it. It's also possible to
> consult all threads' states on the CLI using "show threads".
>
> As previously discussed we have also deprecated the very old req* and rsp*
> directives with warnings suggesting what to use instead. They still work
> but the goal is to kill them in 2.1, so there's no rush to convert your
> configs given that 2.0 is LTS but you will be encouraged to progressively
> adapt your future configs. Likewise "option forceclose" now warns and
> "resolution_pool_size" is an error (it never existed in any release).
>
> WURFL is now HTX-aware. There are some new developer-friendly commands
> on the CLI when built with -DDEBUG_DEV, they allow to inspect memory
> areas or send signals, which is convenient during development. It should
> have been done earlier!
>
> Cirrus-CI is enabled to test builds on FreeBSD. To be honest at this point
> it's still not completely clear to me how to fully use it as their
> interface
> is a bit limited but it has the merit of existing. It doesn't build as
> often
> as Travis-CI, and it decided to build the last fix after I tagged this
> release, showing that apparently there's still a build error on FreeBSD,
> that I don't understand for now.
>
> Lots of code cleanups were done, and some old build options were refreshed
> to match their equivalent makefile option.
>
> Overall, aside the possible occasional build issues here and there, it's
> expected to be a bit more stable than dev3, which I'm currently already
> satisfied with.
>
> Let's set on -dev5 around next Wednesday with the final polishing.
> Depending
> on the amount of issues we'll be able to decide on a release date.
>


I'd like to run sanitizers on various combinations, like ZLIB / SLZ, PCRE /
PCRE2 ...
OK, let us do it before Wednesday


>
> Please find the usual URLs below :
>Site index   : http://www.haproxy.org/
>Discourse: http://discourse.haproxy.org/
>Slack channel: https://slack.haproxy.org/
>Issue tracker: https://github.com/haproxy/haproxy/issues
>Sources  : http://www.haproxy.org/download/2.0/src/
>Git repository   : http://git.haproxy.org/git/haproxy.git/
>Git Web browsing : http://git.haproxy.org/?p=haproxy.git
>Changelog: http://www.haproxy.org/download/2.0/src/CHANGELOG
>Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/
>
> Willy
> ---
> Complete changelog :
> Bertrand Jacquin (1):
>   DOC: fix "successful" typo
>
> Christopher Faulet (1):
>   BUG/MINOR: http_fetch: Rely on the smp direction for "cookie()" and
> "hdr()"
>
> Emmanuel Hocdet (3):
>   BUILD: makefile: use USE_OBSOLETE_LINKER for solaris
>   BUILD: makefile: remove -fomit-frame-pointer optimisation (solaris)
>   MAJOR: polling: add event ports support (Solaris)
>
> Ilya Shipitsin (2):
>   BUILD: enable freebsd builds on cirrus-ci
>   BUILD: travis: add sanitizers to travis-ci builds
>
> Olivier Houchard (3):
>   BUG/MEDIUM: streams: Don't use CF_EOI to decide if the request is
> complete.
>   BUG/MEDIUM: streams: Try to L7 retry before aborting the connection.
>   BUG/MEDIUM: streams: Don't switch from SI_ST_CON to SI_ST_DIS on
> read0.
>
> Tim Duesterhus (3):
>   MEDIUM: Make 'option forceclose' actually warn
>   MEDIUM: Make 'resolution_pool_size' directive fatal
>   BUG/MINOR: mworker: Fix memory leak of mworker_proc members
>
> William Lallemand (1):
>   MINOR: init: setenv HAPROXY_CFGFILES
>
> Willy Tarreau (61):
>   DOC: management: place "show activity" at the right place
>   MINOR: cli/activity: show the dumping thread ID starting at 1
>   MINOR: task: export global_task_mask
>   MINOR: cli/debug: add a thread dump function
>   BUG/MINOR: debug: make ha_task_dump() 

RFE: insert server into peer section

2019-05-23 Thread Aleksandar Lazic

Hi.

We had an interesting discussion at KubeCon about how a session table in a
peers section can survive a restart of a haproxy instance.

We came up with a request for enhancement (RFE): to be able to add a peer server,
not a peer section, to an existing peers section, similar to add server for a
backend.

Opinions?

Regards
 Aleks



Re: cirrus-ci is red

2019-05-23 Thread Илья Шипицин
On Thu, 23 May 2019 at 14:03, Willy Tarreau wrote:

> On Wed, May 22, 2019 at 07:15:21PM +0200, Willy Tarreau wrote:
> > On Wed, May 22, 2019 at 03:24:25PM +0500, Илья Шипицин wrote:
> > > Hello,
> > >
> > > someone is reviewing this
> https://github.com/haproxy/haproxy/runs/133866993
> > > ?
> >
> > So apparently we don't have _POSIX_C_SOURCE >= 199309L there, which
> > contradicts the promise in my linux man pages :-/ The docs on opengroup
> > do not mention this define but indicate that the extension was derived
> > from another spec. I'm going to remove the version test, as I think that
> > the POSIX_TIMERS will not be set anyway in this case. We'll see if it
> > breaks anywhere else.
>
> I could build haproxy on a FreeBSD 11.1 machine. I had other issues to
> fix, but I didn't get the error on clock_gettime(). I'm not sure how
> it can happen since the defines would need to be incompatible between
> two files. I'm not even certain it's running on the last source, I can't
> find how to navigate between the builds on their interface.
>

I'll have a look.
Probably we'll add FreeBSD 11 to the Cirrus matrix.


>
> For now I've addressed the issues which will lead to a failure with
> timerfd_{create,settime,delete}() that I detected here in my VM. We'll
> see if Cirrus' status changes. Once it works we could try again with
> USE_RT on FreeBSD, but well, one thing at a time :-)
>
> Cheers,
> Willy
>


Re: cirrus-ci is red

2019-05-23 Thread Willy Tarreau
On Wed, May 22, 2019 at 07:15:21PM +0200, Willy Tarreau wrote:
> On Wed, May 22, 2019 at 03:24:25PM +0500, Илья Шипицин wrote:
> > Hello,
> > 
> > someone is reviewing this https://github.com/haproxy/haproxy/runs/133866993
> > ?
> 
> So apparently we don't have _POSIX_C_SOURCE >= 199309L there, which
> contradicts the promise in my linux man pages :-/ The docs on opengroup
> do not mention this define but indicate that the extension was derived
> from another spec. I'm going to remove the version test, as I think that
> the POSIX_TIMERS will not be set anyway in this case. We'll see if it
> breaks anywhere else.
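
For illustration, the test in question is roughly of this shape (a sketch; the
actual symbol haproxy defines may be named differently):

#include <unistd.h>   /* may expose _POSIX_TIMERS */
#include <time.h>     /* clock_gettime(), CLOCK_MONOTONIC, ... */

/*
 * The version check being removed additionally required
 *   defined(_POSIX_C_SOURCE) && (_POSIX_C_SOURCE >= 199309L)
 * which FreeBSD apparently does not guarantee even when the timers are there.
 */
#if defined(_POSIX_TIMERS) && (_POSIX_TIMERS > 0)
#  define HAVE_POSIX_TIMERS 1
#endif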

I could build haproxy on a FreeBSD 11.1 machine. I had other issues to
fix, but I didn't get the error on clock_gettime(). I'm not sure how
it can happen since the defines would need to be incompatible between
two files. I'm not even certain it's running on the last source, I can't
find how to navigate between the builds on their interface.

For now I've addressed the issues which will lead to a failure with
timerfd_{create,settime,delete}() that I detected here in my VM. We'll
see if Cirrus' status changes. Once it works we could try again with
USE_RT on FreeBSD, but well, one thing at a time :-)

Cheers,
Willy



Re: SD-termination cause

2019-05-23 Thread Willy Tarreau
Hi Maksim,

On Thu, May 23, 2019 at 10:00:19AM +0300, Максим Куприянов wrote:
> 2nd session (from haproxy to ssl-enabled backend A, dumped with tshark for
> better readability):
> 1 09:10:48.222518 HAPROXY -> BACKEND_A TCP 94 36568 -> 9790 [SYN] Seq=0
> Win=26520 Len=0 MSS=8840 SACK_PERM=1 TSval=3064071282 TSecr=0 WS=2048
> 2 09:10:48.222624 BACKEND_A -> HAPROXY TCP 94 9790 -> 36568 [SYN, ACK] Seq=0
> Ack=1 Win=26784 Len=0 MSS=8940 SACK_PERM=1 TSval=3366865490
> TSecr=3064071282 WS=256
> 3 09:10:48.222639 HAPROXY -> BACKEND_A TCP 86 36568 -> 9790 [ACK] Seq=1 Ack=1
> Win=26624 Len=0 TSval=3064071283 TSecr=3366865490
> 4 09:10:48.222658 HAPROXY -> BACKEND_A TLSv1 603 Client Hello
> 5 09:10:48.222741 BACKEND_A -> HAPROXY TCP 86 9790 -> 36568 [ACK] Seq=1
> Ack=518 Win=27904 Len=0 TSval=3366865490 TSecr=3064071283
> 6 09:10:48.272165 HAPROXY -> BACKEND_A TCP 86 36568 -> 9790 [RST, ACK]
> Seq=518 Ack=1 Win=26624 Len=0 TSval=3064071332 TSecr=3366865490
> 
> Backend didn't answer with Server Hello in 49.5ms after tcp-handshed has
> finished for some reason. That is the root case of the error!!!

Indeed, this is an interesting case. I'm not sure why it's reported like
this but it definitely is a corner case as the L4 connection is established
and the handshake was aborted. It should have been reported as "sC" (timeout
during connect). But I can easily understand how we can make some wrong
assumptions based on the available elements when reporting an error (e.g.
the connection is valid, only the handshake is incomplete). We definitely
need to figure what's happening before releasing 2.0, as it could indicate
a bigger issue in the connection setup error path. By the way I'm thinking
that for 2.1 we should probably think about reporting a separate step for
the server-side handshake, but that's another story.

> The last session (from haproxy to plain-http backend B):
> 1 09:10:48.272235 HAPROXY -> BACKEND_B TCP 94 33532 -> 9791 [SYN] Seq=0
> Win=26520 Len=0 MSS=8840 SACK_PERM=1 TSval=561683483 TSecr=0 WS=2048
> 2 09:10:48.272358 BACKEND_B -> HAPROXY TCP 94 9791 -> 33532 [SYN, ACK] Seq=0
> Ack=1 Win=26784 Len=0 MSS=8940 SACK_PERM=1 TSval=874005989 TSecr=561683483
> WS=256
> 3 09:10:48.272369 HAPROXY -> BACKEND_B TCP 86 33532 -> 9791 [ACK] Seq=1 Ack=1
> Win=26624 Len=0 TSval=561683483 TSecr=874005989
> 4 09:10:48.272396 HAPROXY -> BACKEND_B HTTP 3590 GET /xx/xx/xxx
> HTTP/1.1
> 5 09:10:48.272448 HAPROXY -> BACKEND_B TCP 86 33532 -> 9791 [FIN, ACK]
> Seq=3505 Ack=1 Win=26624 Len=0 TSval=561683483 TSecr=874005989
> 6 09:10:48.272529 BACKEND_B -> HAPROXY TCP 86 9791 -> 33532 [ACK] Seq=1
> Ack=3505 Win=33792 Len=0 TSval=874005989 TSecr=561683483
> 7 09:10:48.272729 BACKEND_B -> HAPROXY TCP 86 9791 -> 33532 [FIN, ACK] Seq=1
> Ack=3506 Win=33792 Len=0 TSval=874005989 TSecr=561683483
> 8 09:10:48.272736 HAPROXY -> BACKEND_B TCP 86 33532 -> 9791 [ACK] Seq=3506
> Ack=2 Win=26624 Len=0 TSval=561683484 TSecr=874005989
> 
> As you can see, haproxy instance made another try to establish connection
> and it did succeed but 50ms are over, and FIN was send right after
> GET-request.

This should never happen either, or you may quickly run out of source ports
by having your ports in TIME_WAIT state :-(

> Conclusion:
> * Haproxy does not respond with 502 in case of timing out on ssl-connection
> establishing to backends

So for this case since it's a timeout, it should be a 504.

> * Seems strange to me that connection timer was not reset after the first
> unsuccessfull connection ("retries 1" was set)

Indeed you're right, that might be the reason for the FIN just after
the GET.

> * SD-status of error is confusing :)

I suspect there are in fact 2 or 3 issues in the outgoing connection
code that result in all of this. This code is very complex since it
has to deal with reuse, server pools and redispatch at the same time.
We need to have a look into this. I'll wait for Olivier's availability
since he knows this area better (especially the reuse stuff that I
would break just by approaching it).

Many thanks for your detailed traces and analysis, this is very informative!

Willy



Re: SD-termination cause

2019-05-23 Thread Максим Куприянов
Hi, Willy!

This kind of error only happens on proxy sections with ssl-enabled backends
('ssl verify none' in the server lines).
In order to find out what really happens from the network point of view, I added
one plain-http backend to one of the proxy sections.
Then I captured the situation when a request failed on this plain-http
backend.

interesting parameters from config:
  timeout connect 50
  timeout queue   1s
  retries 1
servers lines look like this:
  default-server weight 50 on-error fastinter
  server BACKEND_A:9790 10.10.10.10:9790 weight 100 check ssl verify none
observe layer7
  server BACKEND_B:9791 10.10.10.11:9791 weight 100 check observe layer7
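
(Note: timers without a unit suffix are in milliseconds in haproxy, so the values
above mean the whole connection attempt to BACKEND_A, apparently TLS handshake
included, had to fit within 50 ms, the same ~50 ms cut-off visible in the
captures below. With explicit units the same settings would read:)

  timeout connect 50ms
  timeout queue   1s
  retries 1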

Now I'll show you the 3 TCP sessions I've captured:
The first session (from client to haproxy instance):
1 09:10:48.222378 IP6 127.0.0.1.52726 > 127.0.0.1.link: Flags [S], seq
3359830899, win 43690, options [mss 65476,sackOK,TS val 3131804957 ecr
0,nop,wscale 11], length 0
2 09:10:48.222388 IP6 127.0.0.1.link > 127.0.0.1.52726: Flags [S.], seq
1294278968, ack 3359830900, win 43690, options [mss 65476,sackOK,TS val
3131804957 ecr 3131804957,nop,wscale 11], length 0
3 09:10:48.222397 IP6 127.0.0.1.52726 > 127.0.0.1.link: Flags [.], ack 1,
win 22, options [nop,nop,TS val 3131804957 ecr 3131804957], length 0
4 09:10:48.222449 IP6 127.0.0.1.52726 > 127.0.0.1.link: Flags [P.], seq
1:3505, ack 1, win 22, options [nop,nop,TS val 3131804957 ecr 3131804957],
length 3504
5 09:10:48.222458 IP6 127.0.0.1.link > 127.0.0.1.52726: Flags [.], ack
3505, win 86, options [nop,nop,TS val 3131804957 ecr 3131804957], length 0
6 09:10:48.272790 IP6 127.0.0.1.link > 127.0.0.1.52726: Flags [F.], seq 1,
ack 3505, win 86, options [nop,nop,TS val 3131805008 ecr 3131804957],
length 0
7 09:10:48.272836 IP6 127.0.0.1.52726 > 127.0.0.1.link: Flags [F.], seq
3505, ack 2, win 22, options [nop,nop,TS val 3131805008 ecr 3131805008],
length 0
8 09:10:48.272844 IP6 127.0.0.1.link > 127.0.0.1.52726: Flags [.], ack
3506, win 86, options [nop,nop,TS val 3131805008 ecr 3131805008], length 0

As you can see, the client sent the request to the haproxy instance (packet #4).
The instance acknowledged it (packet #5).
And then, 50.332 ms later, haproxy answered with a FIN with no data (packet #6,
"length 0").

2nd session (from haproxy to ssl-enabled backend A, dumped with tshark for
better readability):
1 09:10:48.222518 HAPROXY → BACKEND_A TCP 94 36568 → 9790 [SYN] Seq=0
Win=26520 Len=0 MSS=8840 SACK_PERM=1 TSval=3064071282 TSecr=0 WS=2048
2 09:10:48.222624 BACKEND_A → HAPROXY TCP 94 9790 → 36568 [SYN, ACK] Seq=0
Ack=1 Win=26784 Len=0 MSS=8940 SACK_PERM=1 TSval=3366865490
TSecr=3064071282 WS=256
3 09:10:48.222639 HAPROXY → BACKEND_A TCP 86 36568 → 9790 [ACK] Seq=1 Ack=1
Win=26624 Len=0 TSval=3064071283 TSecr=3366865490
4 09:10:48.222658 HAPROXY → BACKEND_A TLSv1 603 Client Hello
5 09:10:48.222741 BACKEND_A → HAPROXY TCP 86 9790 → 36568 [ACK] Seq=1
Ack=518 Win=27904 Len=0 TSval=3366865490 TSecr=3064071283
6 09:10:48.272165 HAPROXY → BACKEND_A TCP 86 36568 → 9790 [RST, ACK]
Seq=518 Ack=1 Win=26624 Len=0 TSval=3064071332 TSecr=3366865490

The backend didn't answer with a Server Hello within 49.5 ms after the TCP
handshake finished, for some reason. That is the root cause of the error!!!

The last session (from haproxy to plain-http backend B):
1 09:10:48.272235 HAPROXY → BACKEND_B TCP 94 33532 → 9791 [SYN] Seq=0
Win=26520 Len=0 MSS=8840 SACK_PERM=1 TSval=561683483 TSecr=0 WS=2048
2 09:10:48.272358 BACKEND_B → HAPROXY TCP 94 9791 → 33532 [SYN, ACK] Seq=0
Ack=1 Win=26784 Len=0 MSS=8940 SACK_PERM=1 TSval=874005989 TSecr=561683483
WS=256
3 09:10:48.272369 HAPROXY → BACKEND_B TCP 86 33532 → 9791 [ACK] Seq=1 Ack=1
Win=26624 Len=0 TSval=561683483 TSecr=874005989
4 09:10:48.272396 HAPROXY → BACKEND_B HTTP 3590 GET /xx/xx/xxx
HTTP/1.1
5 09:10:48.272448 HAPROXY → BACKEND_B TCP 86 33532 → 9791 [FIN, ACK]
Seq=3505 Ack=1 Win=26624 Len=0 TSval=561683483 TSecr=874005989
6 09:10:48.272529 BACKEND_B → HAPROXY TCP 86 9791 → 33532 [ACK] Seq=1
Ack=3505 Win=33792 Len=0 TSval=874005989 TSecr=561683483
7 09:10:48.272729 BACKEND_B → HAPROXY TCP 86 9791 → 33532 [FIN, ACK] Seq=1
Ack=3506 Win=33792 Len=0 TSval=874005989 TSecr=561683483
8 09:10:48.272736 HAPROXY → BACKEND_B TCP 86 33532 → 9791 [ACK] Seq=3506
Ack=2 Win=26624 Len=0 TSval=561683484 TSecr=874005989

As you can see, the haproxy instance made another try to establish a connection
and it did succeed, but the 50 ms were over, and a FIN was sent right after the
GET request.

Conclusion:
* Haproxy does not respond with a 502 when it times out while establishing an
ssl connection to a backend
* It seems strange to me that the connection timer was not reset after the first
unsuccessful connection ("retries 1" was set)
* The SD status of the error is confusing :)

--
Best regards,
Maksim Kupriianov

On Thu, 23 May 2019 at 06:40, Willy Tarreau wrote:

> Hi Maksim,
>
> On Tue, May 21, 2019 at 01:47:30PM +0300, Максим Куприянов wrote:
> > Hi!
> >
> > I've run into some weird problem of many connections failed with SD
> status
> > in log. And I 

Re: do we consider using patchwork ?

2019-05-23 Thread Willy Tarreau
Hi Aleks,

On Thu, May 23, 2019 at 08:05:18AM +0200, Aleksandar Lazic wrote:
> From my point of view is the ci and issue tacker a good step forward but for
> now we should try to focus on the list as it is still the main communication
> channel.

I mean, there are multiple valid communication channels, since there
are multiple communications at the same time. For example, Lukas, is
doing an awesome job at helping people on Discourse and only brings
here qualified issues so that I almost never have to go there. Same
with the github issues that we've wanted to have for quite some time
without me being at the center of this, they work pretty well right
now, and if something is ignored for too long, someone will ping here
about the problem so it works remarkably well.

> How about to add into the tools mailing forward and reply?

Given that there are people who manage to sort this info first, I'd
rather not for now. This is less stuff to concentrate on. For me it
is very important to have a set of trusted people who I know do the
right thing, because when an issue is escalated here or when I get
a patch that was said to be validated regarding a GitHub issue, in
general, I apply it without looking at it and it's a big relief not
to have to review a patch.

> We can then use the mail clients for communication and the tools will
> receive the answers automatically.

Someone would have to set this up, and possibly to develop a bot like
Lukas did for the PRs. At the moment the stuff is more or less well
balanced, it's just that we have added lots of useful tools in a short
time and that these ones still need to be cared about because, well,
it's the beginning. Also for me it's important that we don't forget
the real goals : the goal is to improve haproxy, not to improve the
tools. If improving the tools improves haproxy, fine. But the tools
are not the goal but a way to reach the goal faster. For example, the
CI is very useful since we now detect build breakage much earlier.
However we need to keep in mind that it's an indication that we broke
something, it must not be a goal to have green lights all the time. If
something is broken on a platform (even an important one) because of an
ongoing change and we consider it's more important to finish the changes
than to fix this platform, I'm perfectly fine with this. It eases the
developers' work by giving them the feedback they need without having
to actively re-test their changes everywhere (and Travis is particularly
good at this because it triggers a build almost instantly after a push
so the loop feedback is very fast). For example, the problem that was
reported by Cirrus on the FreeBSD build breakage by the recent watchdog
changes annoys me, not because Cirrus is red, which I don't care about,
but because it's still broken after the fix that I thought valid, and
now I know that the supposedly valid POSIX defines I used to detect
support are not portable, so I will need to be extra careful about this
and to fix it while it's still fresh in my head.

So let's just try to put a pause in the tooling improvements so that we
still have a bit of time available for the code, and see what can be
improved in 3-6 months once we feel that things are going well except
a few that need to be addressed. It will save us from wasting time
doing mistakes.

Cheers,
Willy



Re: do we consider using patchwork ?

2019-05-23 Thread Aleksandar Lazic


Hi.

Wed May 22 23:41:13 GMT+02:00 2019 Willy Tarreau :

> Hi Ilya,
 >
> On Thu, May 23, 2019 at 01:29:53AM +0500, Илья Шипицин wrote:
 > > Hello,
 > >
 > > if we do not like using github PR and Willy receives 2k emails a day...
 > > do we consider using something like that
 > > https://patchwork.openvpn.net/project/openvpn2/list/ ?
 >
 > At least not now, please let's slow down on process changes, I cannot
 > catch up anymore. Really. I find myself spending 10 times more time in
 > a browser than what I used to do 6 months ago, for me it's becoming
 > very difficult. Between the issue tracker, the CI, github settings, the
 > links to dumps, confs or logs that are lazily copy-pasted instead of
 > sending the info itself etc... In the end I find myself working far
 > less efficiently for now, having to spend more time at work to produce
 > the same, however it helps others work more efficiently, which is nice.
 > But since I've always been a bottleneck, it remains important that we
 > don't forget to optimize my time, or everyone will spend their time
 > waiting for me, which I cannot accept. And the worst that can happen
 > is that I become a bottleneck due to processes because this would be
 > something I wouldn't be able to improve at all.


Full ack.

From my point of view the CI and issue tracker are a good step forward, but for
now we should try to focus on the list as it is still the main communication
channel.

How about adding mail forward and reply to the tools?
We can then use the mail clients for communication and the tools will receive
the answers automatically.

Jm2c

> I hope you can understand that no change comes with zero cost and that
 > for some people (like me) they come with a higher cost. Sometimes this
 > cost can be recovered over time, sometimes it's a pure loss. So let's
 > not engage too many changes at once and keep some time to observe the
 > outcome of everything we've done for now.
 >
 > Cheers,
 > Willy

Regards
 Aleks