Re: Haproxy 1.7.11 log problems

2019-11-20 Thread Alexander Kasantsev
I updated haproxy to 1.7.12 but nothing changed


> On 20 Nov 2019, at 15:38, Aleksandar Lazic wrote:
> 
> 
> A 1.7.12 is listed on this page; is this the repo you use?
> 
> https://repo.ius.io/6/x86_64/packages/h/
> 
> Can you please try 1.7.12?
> 
> Do you know that the EOL is next year?
> https://wiki.centos.org/Download
> 
> Regards
> Aleks
> 
> Nov 20, 2019 12:45:37 PM Alexander Kasantsev :
> 
>> I’m on CentOS 6.10; the latest version available to me is 1.7.11 from the IUS repo.
>> 
>>> On 20 Nov 2019, at 14:17, Aleksandar Lazic wrote:
>>> 
>>> 
>>> Hi.
>>> 
>>> Please can you use the latest 1.7, latest 1.8 or 2.0 and tell us if the 
>>> problem still exists.
>>> 
>>> Best regards
>>> Aleks
>>> 
>>> Nov 20, 2019 9:52:01 AM Alexander Kasantsev :
>>> 
 Good day everyone!
 
 I migrated from haproxy 1.5 to 1.7.11 and I have some trouble with logging.
 
 I have the following in the config file for logging:
 
 capture request  header Host len 200
 capture request  header Referer len 200
 capture request  header User-Agent len 200
 capture request  header Content-Type len 200
 capture request  header Cookie len 300
 log-format %[capture.req.hdr(0),lower]\ %ci\ -\ [%t]\ \"%HM\ %HP\ %HV\"\ 
 %ST\ \"%[capture.req.hdr(3)]\"\ %U\ \"%[capture.req.hdr(1)]\"\ 
 \"%[capture.req.hdr(2)]\"\ \"%[capture.req.hdr(4)]\"\ %Tq\ \"%s\"\ 
 'NGINX-CACHE-- "-"'\ \"%ts\"
 
 
 The log format is almost the same as the Nginx one.
 
 But in some cases it works incorrectly.
 
 For example log output
 
 Nov 20 10:41:56 lb.loc haproxy[12633]: example.com 81.4.227.173 - 
 [20/Nov/2019:10:41:56.095] "GET /piwik.php H" 200 "-" 2396 
 "https://example.com/" "Mozilla/5.0" "some.cookie data" 19 
 "vm06.lb.rsl.loc" NGINX-CACHE-- "-" "--"
 
 The problem is that "GET /piwik.php H" should be "GET /piwik.php HTTP/1.1"; 
 it’s the %HV parameter in the log-format.
 
 Part of "HTTP/1.1" is randomly cut off; it may end up as "HT", "HTT" or 
 "HTTP/1.".
 
>> 




Re: [PATCH] MINOR: contrib/prometheus-exporter: allow to select the exported metrics

2019-11-20 Thread William Dauchy
Hi Christopher,

On Wed, Nov 20, 2019 at 02:56:28PM +0100, Christopher Faulet wrote:
> Nice, thanks for your feedback. It is merged now. And I'm working on the backports
> for 2.0.

You apparently forgot to backport
commit 0d1c2a65e8370a770d01 (MINOR: stats: Report max times in addition of the 
averages for sessions)

The 2.0 tree does not build anymore because ST_F_QT_MAX is not defined. 
-- 
William



RE: native prometheus exporter: retrieving check_status

2019-11-20 Thread Pierre Cheynier
>> My only fear for this point would be to make the code too complicated
>> and harder to maintain.
>>
>
> And it would slow down the exporter execution. Moreover, everyone will have a 
> different opinion on how to aggregate the stats. My first idea was to sum all 
> server counters, but Pierre's reply showed me that it's not what he expects.

I agree it's probably too complex and opinionated. Let's see how it goes with 
server aggregations only, done on the Prometheus side, since it's a 
server-related field initially.
If we identify issues/bottlenecks with output size we'll reopen this thread.
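
For reference, a minimal sketch of what that server-to-backend aggregation could
look like as a Prometheus recording rule. The metric and label names below
(haproxy_server_bytes_in_total, the "proxy" label) are assumptions about the
exporter's naming and should be checked against the actual /metrics output:

  # Hypothetical Prometheus rules file: aggregate per-server counters up to
  # the backend level, so the raw per-server series can be dropped afterwards.
  groups:
    - name: haproxy_backend_aggregates
      rules:
        - record: haproxy_backend:server_bytes_in_total:sum
          expr: sum by (proxy) (haproxy_server_bytes_in_total)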

-- 
Pierre


RE: native prometheus exporter: retrieving check_status

2019-11-20 Thread Pierre Cheynier
> Ok, so it is a new kind of metric. I mean, not exposed by HAProxy. It would 
> require an extra loop on all servers for each backend. It is probably doable 
> for the check_status. For the code, I don't know, because it is not exclusive 
> to HTTP checks; it is also used for SMTP and LDAP checks. In the end, I think 
> a better idea would be to have a way to get specific metrics in each scope and 
> let Prometheus handle the aggregation. This way, everyone is free to choose 
> how to proceed while limiting the number of metrics exported.

Fair enough; as stated in the other thread with William, we'll see how it goes 
doing it this way. If we have issues related to output size we'll start a new 
discussion.

Thanks!

-- 
Pierre




Re: native prometheus exporter: retrieving check_status

2019-11-20 Thread Christopher Faulet

On 19/11/2019 at 17:12, Pierre Cheynier wrote:

* also for `check_status`, there is the case of L7STS and its associated values 
that are present in another field. Most probably it could benefit from a better 
representation in the Prometheus output (thanks to labels)?


We can also export the metric ST_F_CHECK_CODE. For the use of labels, I have no
idea. For now, the labels are static in the exporter, and I don't know if it is
pertinent to add dynamic info in labels. If so, what is your idea? Add a "code"
label associated with the check_status metric?


Here again, my maybe-not-so-good idea was to keep the ability to retrieve all the
underlying details at backend level, such as:
* 100 servers are L7OK
* 1 server is L4TOUT
* 2 servers are L4CON
* 2 servers are L7STS
** 1 due to an HTTP 429
** 1 due to an HTTP 503

But this is maybe overkill in terms of complexity; we could instead push more on
our ability to retrieve the status of non-maintenance servers.



Ok, so it is a new kind of metric. I mean, not exposed by HAProxy. It would 
require an extra loop on all servers for each backend. It is probably doable for 
the check_status. For the code, I don't know, because it is not exclusive to 
HTTP checks; it is also used for SMTP and LDAP checks. In the end, I think a 
better idea would be to have a way to get specific metrics in each scope and 
let Prometheus handle the aggregation. This way, everyone is free to choose 
how to proceed while limiting the number of metrics exported.


--
Christopher Faulet



Re: native prometheus exporter: retrieving check_status

2019-11-20 Thread Christopher Faulet

On 19/11/2019 at 16:48, William Dauchy wrote:

On Tue, Nov 19, 2019 at 03:31:28PM +0100, Christopher Faulet wrote:

* also for `check_status`, there is the case of L7STS and its associated
values that are present in another field. Most probably it could benefit
from a better representation in the Prometheus output (thanks to labels)?


We can also export the metric ST_F_CHECK_CODE. For the use of labels, I
have no idea. For now, the labels are static in the exporter, and I don't
know if it is pertinent to add dynamic info in labels. If so, what is your
idea? Add a "code" label associated with the check_status metric?


we need to be very careful here indeed. It's not very clear in my mind
how many values we are talking about, but labels trigger the creation of
a new metric for each key/value pair. So it can quickly explode your
memory on the scraping side.


If there is a different metric for each label, it is probably not the right way 
to do it. However, I may be wrong, I'm not a Prometheus expert, far from it :) I 
will probably start by exporting the metrics as present in HAProxy, using a 
mapping to represent the check_status.
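
As a rough illustration only, such a mapping could end up looking like this on
the wire; the metric names, the numeric mapping and the sample values below are
assumptions made for the sake of the example, not the exporter's actual output:

  # HELP haproxy_server_check_status Last health check status (hypothetical mapping: 0=UNK, 4=L4OK, 5=L4TOUT, 6=L4CON, 10=L7OK, 13=L7STS)
  # TYPE haproxy_server_check_status gauge
  haproxy_server_check_status{proxy="be_app",server="srv1"} 10
  haproxy_server_check_status{proxy="be_app",server="srv2"} 13
  # A separate metric could carry the HTTP code seen by L7 checks:
  haproxy_server_check_code{proxy="be_app",server="srv2"} 503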





* what about getting some backend-level aggregation of server metrics, such as 
the one that was previously mentioned, to avoid retrieving all the server 
metrics but still be able to get some insights?
I'm thinking about an aggregation of some fields at backend level, which was 
not previously done with the CSV output.


It is feasible, but only counters may be aggregated. It may be enabled using
a parameter in the query-string. However, it is probably pertinent only when
the server metrics are filtered out, because otherwise Prometheus can
handle the aggregation itself.


My only fear for this point would be to make the code too complicated
and harder to maintain.


And it would slow down the exporter execution. Moreover, everyone will have a 
different opinion on how to aggregate the stats. My first idea was to sum all 
server counters, but Pierre's reply showed me that it's not what he expects.


--
Christopher Faulet



Re: [PATCH] MINOR: contrib/prometheus-exporter: allow to select the exported metrics

2019-11-20 Thread Christopher Faulet

On 20/11/2019 at 13:03, William Dauchy wrote:

On Tue, Nov 19, 2019 at 04:35:47PM +0100, Christopher Faulet wrote:

Here are updated patches with support for the "scope" and "no-maint"
parameters. If this solution is good enough for you (and if it works :), I
will push it.


$ curl "http://127.0.0.1:8080/metrics?scope=global&scope=frontend&scope=backend&scope=server"
151M
$ curl "http://127.0.0.1:8080/metrics?scope=global&scope=frontend&scope=backend&scope=server"
13.9M

looks very useful from here :)
I think you can push this last version!



Nice, thanks for your feedback. It is merged now. And I'm working on the backports 
for 2.0.


--
Christopher Faulet



Re: Haproxy 1.7.11 log problems

2019-11-20 Thread Aleksandar Lazic


A 1.7.12 is listed on this page; is this the repo you use?
 
https://repo.ius.io/6/x86_64/packages/h/
 
Can you please try 1.7.12?
 
Do you know that the EOL is next year?
https://wiki.centos.org/Download
 
Regards
Aleks

Nov 20, 2019 12:45:37 PM Alexander Kasantsev :
 
> I’m on CentOS 6.10; the latest version available to me is 1.7.11 from the IUS repo.
> 
> > On 20 Nov 2019, at 14:17, Aleksandar Lazic wrote:
> > 
> > 
> > Hi.
> > 
> > Please can you use the latest 1.7, latest 1.8 or 2.0 and tell us if the 
> > problem still exists.
> > 
> > Best regards
> > Aleks
> > 
> > Nov 20, 2019 9:52:01 AM Alexander Kasantsev :
> > 
> >> Good day everyone!
> >> 
> >> I migrated from haproxy 1.5 to 1.7.11 and I have some trouble with logging.
> >> 
> >> I have the following in the config file for logging:
> >> 
> >>  capture request  header Host len 200
> >>  capture request  header Referer len 200
> >>  capture request  header User-Agent len 200
> >>  capture request  header Content-Type len 200
> >>  capture request  header Cookie len 300
> >>  log-format %[capture.req.hdr(0),lower]\ %ci\ -\ [%t]\ \"%HM\ %HP\ %HV\"\ 
> >> %ST\ \"%[capture.req.hdr(3)]\"\ %U\ \"%[capture.req.hdr(1)]\"\ 
> >> \"%[capture.req.hdr(2)]\"\ \"%[capture.req.hdr(4)]\"\ %Tq\ \"%s\"\ 
> >> 'NGINX-CACHE-- "-"'\ \"%ts\"
> >> 
> >> 
> >> The log format is almost the same as the Nginx one.
> >> 
> >> But in some cases it works incorrectly.
> >> 
> >> For example log output
> >> 
> >> Nov 20 10:41:56 lb.loc haproxy[12633]: example.com 81.4.227.173 - 
> >> [20/Nov/2019:10:41:56.095] "GET /piwik.php H" 200 "-" 2396 
> >> "https://example.com/" "Mozilla/5.0" "some.cookie data" 19 
> >> "vm06.lb.rsl.loc" NGINX-CACHE-- "-" "--"
> >> 
> >> The problem is that "GET /piwik.php H" should be "GET /piwik.php HTTP/1.1"; 
> >> it’s the %HV parameter in the log-format.
> >> 
> >> Part of "HTTP/1.1" is randomly cut off; it may end up as "HT", "HTT" or 
> >> "HTTP/1.".
> >> 
> 



Re: [PATCH] MINOR: contrib/prometheus-exporter: allow to select the exported metrics

2019-11-20 Thread William Dauchy
On Tue, Nov 19, 2019 at 04:35:47PM +0100, Christopher Faulet wrote:
> Here are updated patches with support for the "scope" and "no-maint"
> parameters. If this solution is good enough for you (and if it works :), I
> will push it.

$ curl "http://127.0.0.1:8080/metrics?scope=global&scope=frontend&scope=backend&scope=server"
151M
$ curl "http://127.0.0.1:8080/metrics?scope=global&scope=frontend&scope=backend&scope=server"
13.9M

looks very useful from here :)
I think you can push this last version!
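
For reference, a quick sketch of how the new parameters might be combined on a
single request; the parameter names follow the patch discussion above and the
exact spelling should be checked against the merged documentation:

  # Keep only the server scope, and skip servers in maintenance (hypothetical usage).
  $ curl -s "http://127.0.0.1:8080/metrics?scope=server&no-maint"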
-- 
William



Re: Haproxy 1.7.11 log problems

2019-11-20 Thread Alexander Kasantsev
I’m on CentOS 6.10; the latest version available to me is 1.7.11 from the IUS repo.

> On 20 Nov 2019, at 14:17, Aleksandar Lazic wrote:
> 
> 
> Hi.
> 
> Please can you use the latest 1.7, latest 1.8 or 2.0 and tell us if the 
> problem still exists.
> 
> Best regards
> Aleks
> 
> Nov 20, 2019 9:52:01 AM Alexander Kasantsev :
> 
>> Good day everyone!
>> 
>> I migrated from haproxy 1.5 to 1.7.11 and I have some trouble with logging.
>> 
>> I have the following in the config file for logging:
>> 
>>  capture request  header Host len 200
>>  capture request  header Referer len 200
>>  capture request  header User-Agent len 200
>>  capture request  header Content-Type len 200
>>  capture request  header Cookie len 300
>>  log-format %[capture.req.hdr(0),lower]\ %ci\ -\ [%t]\ \"%HM\ %HP\ %HV\"\ 
>> %ST\ \"%[capture.req.hdr(3)]\"\ %U\ \"%[capture.req.hdr(1)]\"\ 
>> \"%[capture.req.hdr(2)]\"\ \"%[capture.req.hdr(4)]\"\ %Tq\ \"%s\"\ 
>> 'NGINX-CACHE-- "-"'\ \"%ts\"
>> 
>> 
>> The log format is almost the same as the Nginx one.
>> 
>> But in some cases it works incorrectly.
>> 
>> For example log output
>> 
>> Nov 20 10:41:56 lb.loc haproxy[12633]: example.com 81.4.227.173 - 
>> [20/Nov/2019:10:41:56.095] "GET /piwik.php H" 200 "-" 2396 
>> "https://example.com/" "Mozilla/5.0" "some.cookie data" 19 "vm06.lb.rsl.loc" 
>> NGINX-CACHE-- "-" "--"
>> 
>> The problem is that "GET /piwik.php H" should be "GET /piwik.php HTTP/1.1"; 
>> it’s the %HV parameter in the log-format.
>> 
>> Part of "HTTP/1.1" is randomly cut off; it may end up as "HT", "HTT" or 
>> "HTTP/1.".
>> 




Re: Haproxy 1.7.11 log problems

2019-11-20 Thread Aleksandar Lazic


Hi.
 
Please can you use the latest 1.7, latest 1.8 or 2.0 and tell us if the problem 
still exists.
 
Best regards
Aleks

Nov 20, 2019 9:52:01 AM Alexander Kasantsev :
 
> Good day everyone!
> 
> I migrated from haproxy 1.5 to 1.7.11 and I have some trouble with logging.
> 
> I have the following in the config file for logging:
> 
>   capture request  header Host len 200
>   capture request  header Referer len 200
>   capture request  header User-Agent len 200
>   capture request  header Content-Type len 200
>   capture request  header Cookie len 300
>   log-format %[capture.req.hdr(0),lower]\ %ci\ -\ [%t]\ \"%HM\ %HP\ %HV\"\ 
> %ST\ \"%[capture.req.hdr(3)]\"\ %U\ \"%[capture.req.hdr(1)]\"\ 
> \"%[capture.req.hdr(2)]\"\ \"%[capture.req.hdr(4)]\"\ %Tq\ \"%s\"\ 
> 'NGINX-CACHE-- "-"'\ \"%ts\"
> 
> 
> The log format is almost the same as the Nginx one.
> 
> But in some cases it works incorrectly.
> 
> For example log output
> 
> Nov 20 10:41:56 lb.loc haproxy[12633]: example.com 81.4.227.173 - 
> [20/Nov/2019:10:41:56.095] "GET /piwik.php H" 200 "-" 2396 
> "https://example.com/" "Mozilla/5.0" "some.cookie data" 19 "vm06.lb.rsl.loc" 
> NGINX-CACHE-- "-" "--"
> 
> The problem is that "GET /piwik.php H" should be "GET /piwik.php HTTP/1.1"; 
> it’s the %HV parameter in the log-format.
> 
> Part of "HTTP/1.1" is randomly cut off; it may end up as "HT", "HTT" or "HTTP/1.".
> 



Re: master-worker no-exit-on-failure with SO_REUSEPORT and a port being already in use

2019-11-20 Thread William Lallemand
On Wed, Nov 20, 2019 at 10:19:20AM +0100, Christian Ruppert wrote:
> Hi William,
> 
> thanks for the patch. I'll test it later today.  What I actually wanted to
> achieve is: https://cbonte.github.io/haproxy-dconv/2.0/management.html#4 Then
> HAProxy tries to bind to all listening ports. If some fatal errors happen
> (eg: address not present on the system, permission denied), the process quits
> with an error. If a socket binding fails because a port is already in use,
> then the process will first send a SIGTTOU signal to all the pids specified
> in the "-st" or "-sf" pid list. This is what is called the "pause" signal. It
> instructs all existing haproxy processes to temporarily stop listening to
> their ports so that the new process can try to bind again. During this time,
> the old process continues to process existing connections. If the binding
> still fails (because for example a port is shared with another daemon), then
> the new process sends a SIGTTIN signal to the old processes to instruct them
> to resume operations just as if nothing happened. The old processes will then
> restart listening to the ports and continue to accept connections. Note that
> this mechanism is system
> 
> In my test case though it failed to do so.

Well, it only works with HAProxy processes, not with other processes. There is
no mechanism to ask for the port back when the other process is neither an haproxy
process nor a process that uses SO_REUSEPORT.

With HAProxy processes it will bind with SO_REUSEPORT, and will only use the
SIGTTOU/SIGTTIN signals if it fails to do so.

This part of the documentation is for HAProxy without master-worker mode.
In master-worker mode, once the master is launched successfully it is never
supposed to quit upon a reload (kill -USR2).

During a reload in master-worker mode, the master will do a -sf.
If the reload fails for any reason (bad configuration, unable to bind, etc.),
the behavior is to keep the previous workers. It only tries to kill the workers
if the reload succeeds. So this is the default behavior.
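
For completeness, a sketch of the reload sequence the quoted documentation
describes for the non-master-worker case; the file paths and pidfile location
are assumptions, adjust them to your setup:

  # Start a new haproxy that takes over the listeners of the old one.
  # If a bind fails because the old haproxy still holds the port, the new
  # process sends SIGTTOU to the pids given with -sf so they release their
  # listeners, retries the bind, and sends SIGTTIN to roll them back if the
  # bind still fails.
  haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid \
          -sf $(cat /run/haproxy.pid)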

-- 
William Lallemand



Re: travis-ci: should we drop openssl-1.1.0 and replace it with 3.0 ?

2019-11-20 Thread Willy Tarreau
On Tue, Nov 19, 2019 at 11:57:56PM +0100, Lukas Tribus wrote:
> Testing and implementing build fixes for APIs while they are under active
> development not only takes away precious dev time, it also causes our own
> code to be messed up with workarounds possibly only needed for specific
> openssl development code at one point in time.

This actually is a pretty valid point I hadn't thought about and which
we experienced already in the past. It's not rare that a change gets
reverted in other projects, and wasting time working around it just to
see it finally cancelled is not cool.

With all this said, I tend to see the CI as a way to lower the number
of surprises. This means that the most relevant stuff to test there is
what we can reasonably expect to encounter in the field. If some mainstream
distros ship with specific openssl versions and they take care of the
support themselves, it seems reasonable to keep these versions. That
does not mean we have to test all combinations, as we can reasonably
expect that testing a wide enough spectrum increases the likelihood
that what is located between both extremities will also work. So if
1.1.0 is still shipped and maintained in relevant distros, we can
keep it.

Just my two cents,
Willy



Re: master-worker no-exit-on-failure with SO_REUSEPORT and a port being already in use

2019-11-20 Thread Christian Ruppert

Hi William,

thanks for the patch. I'll test it later today.
What I actually wanted to achieve is:
https://cbonte.github.io/haproxy-dconv/2.0/management.html#4
Then HAProxy tries to bind to all listening ports. If some fatal errors happen
(eg: address not present on the system, permission denied), the process quits
with an error. If a socket binding fails because a port is already in use, then
the process will first send a SIGTTOU signal to all the pids specified in the
"-st" or "-sf" pid list. This is what is called the "pause" signal. It instructs
all existing haproxy processes to temporarily stop listening to their ports so
that the new process can try to bind again. During this time, the old process
continues to process existing connections. If the binding still fails (because
for example a port is shared with another daemon), then the new process sends a
SIGTTIN signal to the old processes to instruct them to resume operations just
as if nothing happened. The old processes will then restart listening to the
ports and continue to accept connections. Note that this mechanism is system


In my test case though it failed to do so.

On 2019-11-19 17:27, William Lallemand wrote:

On Tue, Nov 19, 2019 at 04:19:26PM +0100, William Lallemand wrote:

> I then add another bind for port 80, which is in use by squid already
> and try to reload HAProxy. It takes some time until it fails:
>
> Nov 19 14:39:21 894a0f616fec haproxy[2978]: [WARNING] 322/143921 (2978)
> : Reexecuting Master process
> ...
> Nov 19 14:39:28 894a0f616fec haproxy[2978]: [ALERT] 322/143922 (2978) :
> Starting frontend somefrontend: cannot bind socket [0.0.0.0:80]
> ...
> Nov 19 14:39:28 894a0f616fec systemd[1]: haproxy.service: Main process
> exited, code=exited, status=1/FAILURE
>
> The reload itself is still running (systemd) and will timeout after
> about 90s. After that, because of the Restart=always, I guess, it ends
> up in a restart loop.
>
> So I would have expected that the master process would fall back to the
> old process and proceed with the old child until the problem has been
> fixed.
>


The attached patch fixes a bug where haproxy could reexecute itself in
waitpid mode with -sf -1.

I'm not sure this is your bug, but if this is the case you should see haproxy
in waitpid mode, then the master exiting with the usage message in your logs.


--
Regards,
Christian Ruppert



Haproxy 1.7.11 log problems

2019-11-20 Thread Alexander Kasantsev
Good day everyone!

I migrated from haproxy 1.5 to 1.7.11 and I have some trouble with logging.

I have the following in the config file for logging:

  capture request  header Host len 200
  capture request  header Referer len 200
  capture request  header User-Agent len 200
  capture request  header Content-Type len 200
  capture request  header Cookie len 300
  log-format %[capture.req.hdr(0),lower]\ %ci\ -\ [%t]\ \"%HM\ %HP\ %HV\"\ %ST\ 
\"%[capture.req.hdr(3)]\"\ %U\ \"%[capture.req.hdr(1)]\"\ 
\"%[capture.req.hdr(2)]\"\ \"%[capture.req.hdr(4)]\"\ %Tq\ \"%s\"\ 
'NGINX-CACHE-- "-"'\ \"%ts\"
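
Purely as a readability sketch (not a fix for the truncation itself), the same
log-format can usually be written as a single double-quoted argument so that
spaces no longer need to be escaped; assuming the quoted-argument config syntax
behaves the same on this 1.7 build, it would look roughly like:

  log-format "%[capture.req.hdr(0),lower] %ci - [%t] \"%HM %HP %HV\" %ST \"%[capture.req.hdr(3)]\" %U \"%[capture.req.hdr(1)]\" \"%[capture.req.hdr(2)]\" \"%[capture.req.hdr(4)]\" %Tq \"%s\" NGINX-CACHE-- \"-\" \"%ts\""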


The log format is almost the same as the Nginx one.

But in some cases it works incorrectly.

For example log output

Nov 20 10:41:56 lb.loc haproxy[12633]: example.com 81.4.227.173 - 
[20/Nov/2019:10:41:56.095] "GET /piwik.php H" 200 "-" 2396 
"https://example.com/" "Mozilla/5.0" "some.cookie data" 19 "vm06.lb.rsl.loc" 
NGINX-CACHE-- "-" "--"

The problem is that "GET /piwik.php H" should be "GET /piwik.php HTTP/1.1"; 
it’s the %HV parameter in the log-format.

Part of "HTTP/1.1" is randomly cut off; it may end up as "HT", "HTT" or "HTTP/1.".