Re: [PATCH] nbsrv() behavior when backend is disabled

2016-12-22 Thread Willy Tarreau
Hi Marcin,

On Thu, Dec 22, 2016 at 04:44:10PM +0100, Marcin Deranek wrote:
> Hi,
> According to the nbsrv() documentation, this fetcher should return "an
> integer value corresponding to the number of usable servers".
> In case the backend is disabled, none of the servers is usable, so I believe
> the fetcher should return 0.

Hmmm, I think that makes sense indeed. Just out of curiosity, in what
situation did you encounter this condition? A reload maybe?

> The attached patch addresses this condition.

Thanks. I've copied your analysis above as the commit message and adjusted
the subject to mention it's a minor bug in the backend part. I'm tagging it
for backport down to 1.5.

Thanks,
Willy



Fw:

2016-12-22 Thread Victor


Status code -1 with HA-Proxy version 1.5.15

2016-12-22 Thread Alexey Zilber
Hi All,

I'm seeing the 'status code' as -1 in haproxy logs, whereas the
documentation specifies:

"The status code is always 3-digit."

I do see the 'normal' result codes, but I also see a lot of -1's.


Any idea what causes this?


-Alex


subscribe

2016-12-22 Thread Alexey Zilber
Systems Architect
SFG Media Group, LLC


[PATCH] nbsrv() behavior when backend is disabled

2016-12-22 Thread Marcin Deranek
Hi,
According to the nbsrv() documentation, this fetcher should return "an
integer value corresponding to the number of usable servers".
In case the backend is disabled, none of the servers is usable, so I believe
the fetcher should return 0.
The attached patch addresses this condition.
Regards,

Marcin Deranek
From 5b3eeb5d9a214d880773c03e0213a62772f7b271 Mon Sep 17 00:00:00 2001
From: Marcin Deranek 
Date: Thu, 22 Dec 2016 16:21:08 +0100
Subject: [PATCH] nbsrv() should return 0 if backend is disabled

---
 src/backend.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/backend.c b/src/backend.c
index 57f811f..69e2c90 100644
--- a/src/backend.c
+++ b/src/backend.c
@@ -1608,7 +1608,9 @@ smp_fetch_nbsrv(const struct arg *args, struct sample *smp, const char *kw, void
 	smp->data.type = SMP_T_SINT;
 	px = args->data.prx;
 
-	if (px->srv_act)
+	if (px->state == PR_STSTOPPED)
+		smp->data.u.sint = 0;
+	else if (px->srv_act)
 		smp->data.u.sint = px->srv_act;
 	else if (px->lbprm.fbck)
 		smp->data.u.sint = 1;
-- 
2.10.2
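
For illustration, here is a minimal configuration sketch (hypothetical names
and addresses, not part of the patch) of where nbsrv() is typically evaluated;
with the change above, a disabled backend also reports 0 usable servers, so
the same ACL covers it:

    frontend fe_main
        bind :80
        # Route away from be_app once it has no usable servers.
        # With the patch, a disabled be_app also yields nbsrv(be_app) == 0.
        acl app_down nbsrv(be_app) lt 1
        use_backend be_fallback if app_down
        default_backend be_app

    backend be_app
        server s1 192.0.2.10:8080 check
        server s2 192.0.2.11:8080 check

    backend be_fallback
        server sf 192.0.2.20:8080 check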



Re: [PATCH] MEDIUM/RFC: Implement time-based server latency metrics

2016-12-22 Thread Krishna Kumar (Engineering)
Thanks Willy for your detailed review, especially some design-related points
that I was not aware of. I will go through these and respond accordingly.

Regards,
- Krishna


On Thu, Dec 22, 2016 at 4:23 PM, Willy Tarreau  wrote:

> Hi Krishna,
>
> On Thu, Dec 22, 2016 at 09:41:49AM +0530, Krishna Kumar (Engineering) wrote:
> > We have found that the current mechanism of qtime, ctime, rtime, and ttime
> > based on the last 1024 requests is not the most suitable to debug/visualize
> > latency issues with servers, especially if they happen to last a very short
> > time. For live dashboards showing server timings, we found additional
> > last-'n' seconds metrics useful. The logs could also be parsed to derive
> > these values, but this suffers from delays at high volume, requiring higher
> > processing power and enabling logs.
> >
> > The 'last-n' seconds metrics per server/backend can be configured as follows
> > in the HAProxy configuration file:
> > backend backend-1
> > stats-period 32
> > ...
> >
> > To retrieve these stats at the CLI (in addition to existing metrics), run:
> > echo show stat-duration time 3 | socat /var/run/admin.sock stdio
> >
> > These are also available on the GUI.
> >
> > The justifications for this patch are:
> > 1. Allows capturing spikes for a server during a short period. This helps
> >    having dashboards that show server response times every few seconds (e.g.
> >    every 1 second), so as to be able to chart them across timelines.
> > 2. Be able to get an average across different time intervals, e.g. the
> >    configuration file may specify to save the last 32 seconds, but the CLI
> >    interface can request an average across any interval up to 32 seconds.
> >    E.g. the following command prints the existing metrics appended by the
> >    time-based ones for the last 1 second:
> > echo show stat-duration time 1 | socat /var/run/admin.sock stdio
> >    Running the following existing command appends the time-based metric
> >    values based on the time period configured in the configuration file per
> >    backend/server:
> > echo show stat | socat /var/run/admin.sock stdio
> > 3. Option per backend for configuring the server stats' time interval, and
> >    no API breakage to stats (new metrics are added at the end of the line).
> >
> > Please review, any feedback on the code/usability/extensibility is very much
> > appreciated.
>
> First, thanks for this work. I have several concerns and comments about it,
> however.
>
> The first one is that the amount of storage is overkill if the output can
> only emit an average over a few periods. I mean, the purpose of stats is
> to emit what we know internally. Some people might want to see histograms,
> and while we have everything internally with your patch, it's not possible
> to produce them.
>
> For this reason I think we should proceed differently and always emit these
> stats over a few hard-coded periods. You proved that they don't take that
> much space, and I think it would probably make sense to emit them over a
> small series of powers of 4 seconds: 1s, 4s, 16s, 64s. That's quite cheap
> to store and easy to compute because it's no longer needed to store all
> individual values; you can cascade them while filling a bucket.
>
> And if you go down that route then there's no need anymore for adding yet
> another setting in the configuration, it will be done by default. Another
> point regarding this setting "stats-period" is that you currently do not
> support the "defaults" section while I guess almost all users will want to
> have it there. That's another reason for keeping it hard-coded.
>
> Now some general comments and guidance on the code itself:
>
> > diff --git a/include/proto/stats.h b/include/proto/stats.h
> > index ac893b8..f33ceb1 100644
> > --- a/include/proto/stats.h
> > +++ b/include/proto/stats.h
> > @@ -95,7 +95,8 @@ int stats_fill_fe_stats(struct proxy *px, struct field *stats, int len);
> >  int stats_fill_li_stats(struct proxy *px, struct listener *l, int flags,
> >  struct field *stats, int len);
> >  int stats_fill_sv_stats(struct proxy *px, struct server *sv, int flags,
> > -struct field *stats, int len);
> > +struct field *stats, int len,
> > +struct stream_interface *si);
>
> Please avoid re-introducing the stream_interface in stats, we're trying hard
> to abstract it. From what I'm seeing, you only need it to know whether there's
> a different period. Better to pass the period directly as an argument from
> the caller.
>
> > diff --git a/include/types/counters.h b/include/types/counters.h
> > index 06ce617..a33e01f 100644
> > --- a/include/types/counters.h
> > +++ b/include/types/counters.h
> > @@ -105,6 +105,41 @@ struct be_counters {
> >   } p;/* protocol-specific stats */
> >  };
> >
> > +/*
> > + * Struct time-stat-counters:
>

Re: [PATCH] MEDIUM/RFC: Implement time-based server latency metrics

2016-12-22 Thread Willy Tarreau
Hi Krishna,

On Thu, Dec 22, 2016 at 09:41:49AM +0530, Krishna Kumar (Engineering) wrote:
> We have found that the current mechanism of qtime, ctime, rtime, and ttime
> based on the last 1024 requests is not the most suitable to debug/visualize
> latency issues with servers, especially if they happen to last a very short
> time. For live dashboards showing server timings, we found additional
> last-'n' seconds metrics useful. The logs could also be parsed to derive
> these values, but this suffers from delays at high volume, requiring higher
> processing power and enabling logs.
> 
> The 'last-n' seconds metrics per server/backend can be configured as follows
> in the HAProxy configuration file:
> backend backend-1
> stats-period 32
> ...
> 
> To retrieve these stats at the CLI (in addition to existing metrics), run:
> echo show stat-duration time 3 | socat /var/run/admin.sock stdio
> 
> These are also available on the GUI.
> 
> The justifications for this patch are:
> 1. Allows capturing spikes for a server during a short period. This helps
>    having dashboards that show server response times every few seconds (e.g.
>    every 1 second), so as to be able to chart them across timelines.
> 2. Be able to get an average across different time intervals, e.g. the
>    configuration file may specify to save the last 32 seconds, but the CLI
>    interface can request an average across any interval up to 32 seconds.
>    E.g. the following command prints the existing metrics appended by the
>    time-based ones for the last 1 second:
> echo show stat-duration time 1 | socat /var/run/admin.sock stdio
>    Running the following existing command appends the time-based metric
>    values based on the time period configured in the configuration file per
>    backend/server:
> echo show stat | socat /var/run/admin.sock stdio
> 3. Option per backend for configuring the server stats' time interval, and
>    no API breakage to stats (new metrics are added at the end of the line).
> 
> Please review, any feedback on the code/usability/extensibility is very much
> appreciated.

First, thanks for this work. I have several concerns and comments about it,
however.

The first one is that the amount of storage is overkill if the output can
only emit an average over a few periods. I mean, the purpose of stats is
to emit what we know internally. Some people might want to see histograms,
and while we have everything internally with your patch, it's not possible
to produce them.

For this reason I think we should proceed differently and always emit these
stats over a few hard-coded periods. You proved that they don't take that
much space, and I think it would probably make sense to emit them over a
small series of powers of 4 seconds: 1s, 4s, 16s, 64s. That's quite cheap
to store and easy to compute because it's no longer needed to store all
individual values; you can cascade them while filling a bucket.
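
As an illustration of the cascading idea (a rough sketch with made-up names,
not the submitted patch): each period keeps only a sum/count pair, the finest
bucket is filled per sample, and once per second finished buckets are folded
into the next coarser one:

    /*
     * Sketch only: cascading time buckets over 1s, 4s, 16s and 64s.
     * Each period keeps a single sum/count pair; no per-request history.
     */
    #include <stdint.h>

    #define TSTAT_PERIODS 4               /* period p covers 4^p seconds */

    struct tstat_bucket {
            uint64_t sum;                 /* accumulated time, e.g. in ms  */
            uint64_t count;               /* number of samples accumulated */
    };

    struct tstat {
            struct tstat_bucket cur[TSTAT_PERIODS];   /* buckets being filled   */
            struct tstat_bucket done[TSTAT_PERIODS];  /* last completed buckets */
            unsigned int ticks;                       /* seconds elapsed        */
    };

    /* record one sample into the finest (1s) bucket */
    static inline void tstat_add(struct tstat *ts, uint64_t value)
    {
            ts->cur[0].sum   += value;
            ts->cur[0].count += 1;
    }

    /* called once per second: close finished buckets and cascade upward */
    static void tstat_tick(struct tstat *ts)
    {
            unsigned int p;

            ts->ticks++;
            for (p = 0; p < TSTAT_PERIODS; p++) {
                    /* period p closes every 4^p seconds */
                    if (ts->ticks % (1U << (2 * p)))
                            break;
                    ts->done[p] = ts->cur[p];
                    if (p + 1 < TSTAT_PERIODS) {
                            ts->cur[p + 1].sum   += ts->cur[p].sum;
                            ts->cur[p + 1].count += ts->cur[p].count;
                    }
                    ts->cur[p].sum = ts->cur[p].count = 0;
            }
    }

Reading done[p] then gives the last completed 1s/4s/16s/64s average without
storing any individual samples.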

And if you go down that route then there's no need anymore for adding yet
another setting in the configuration, it will be done by default. Another
point regarding this setting "stats-period" is that you currently do not
support the "defaults" section while I guess almost all users will want to
have it there. That's another reason for keeping it hard-coded.

Now some general comments and guidance on the code itself:

> diff --git a/include/proto/stats.h b/include/proto/stats.h
> index ac893b8..f33ceb1 100644
> --- a/include/proto/stats.h
> +++ b/include/proto/stats.h
> @@ -95,7 +95,8 @@ int stats_fill_fe_stats(struct proxy *px, struct field *stats, int len);
>  int stats_fill_li_stats(struct proxy *px, struct listener *l, int flags,
>  struct field *stats, int len);
>  int stats_fill_sv_stats(struct proxy *px, struct server *sv, int flags,
> -struct field *stats, int len);
> +struct field *stats, int len,
> +struct stream_interface *si);

Please avoid re-introducing the stream_interface in stats, we're trying hard
to abstract it. From what I'm seeing, you only need it to know whether there's
a different period. Better to pass the period directly as an argument from
the caller.
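
For instance, something along these lines (a hypothetical signature just to
sketch the idea, not the final API):

    /* Sketch only: pass the requested averaging period (in seconds)
     * explicitly, so the stats code does not need the stream_interface. */
    struct proxy;
    struct server;
    struct field;

    int stats_fill_sv_stats(struct proxy *px, struct server *sv, int flags,
                            struct field *stats, int len,
                            unsigned int period_s);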

> diff --git a/include/types/counters.h b/include/types/counters.h
> index 06ce617..a33e01f 100644
> --- a/include/types/counters.h
> +++ b/include/types/counters.h
> @@ -105,6 +105,41 @@ struct be_counters {
>   } p;/* protocol-specific stats */
>  };
>  
> +/*
> + * Struct time-stat-counters:
> + *
> + * Implement a last-n seconds based counters for backend/server's q/c/r/t
> + * times. Total memory consumption on x86_64 to maintain all 4 event time
> + * values per backend/server is: 16 + #seconds * 56 bytes.
> + *   for 1 second:   72 bytes
> + *   for 16 seconds: 912 bytes
> + *   for 32 seconds: 1808 bytes
> + *   for 64 seconds: 3600 bytes
> + */
> +struct __tstat_counters {

Avoid struct names using '__' as a prefix. In 

Re: multiproc ssl recommendations

2016-12-22 Thread Willy Tarreau
Hi Elias,

On Wed, Dec 21, 2016 at 10:06:13PM +0100, Elias Abacioglu wrote:
> How about nginx style? nbproc auto + cpu-map auto?

Well, based on my experience with many different setups, I can tell
you that there isn't any single default setting which will be even
basically right for every use case :-/

At least we don't have to specify bitmaps so that's probably easier
for most users.

> +1 on a per-process bind line (or auto).
> (auto would mean a good enough default setup)

"good enough" :-)

"good enough" for SSL means "bad enough" for network and vice-versa
for example. I tend to think that the "per-process" directive might
be the best trade-off because you still have control over where you
want to bind your frontend and then you request that you have one bind
line per process. *This* currently is the part that is painful to
configure for most users, especially since every improperly bound
listener will end up on process #1.
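
To make that concrete, here is a sketch of what "one bind line per process"
looks like today (core and process numbers are made up for the example):

    global
        nbproc 4
        cpu-map 1 2
        cpu-map 2 3
        cpu-map 3 4
        cpu-map 4 5

    frontend fe_ssl
        # one bind line per process; anything not bound explicitly
        # ends up on process #1
        bind :443 ssl crt /etc/haproxy/site.pem process 1
        bind :443 ssl crt /etc/haproxy/site.pem process 2
        bind :443 ssl crt /etc/haproxy/site.pem process 3
        bind :443 ssl crt /etc/haproxy/site.pem process 4
        default_backend be_app

    backend be_app
        server app1 127.0.0.1:8080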

> As for my multi proc ssl setup in case anyone was wondering:
> I did an ssl-offload listener that runs on all cores except core0 on each
> cpu + its HT sibling, relaying via unix sockets to a frontend that runs on
> core0 on each cpu and its HT siblings, so (0,1,28,29 in my case).

So you have cross-CPU communications, and that's really bad for performance and
latency. However, you can have your cpu0 cores relay to the cpu0 listener and
the cpu1 cores relay to the cpu1 listener.

> I haven't configured any "dedicated" cores for network interrupts.. Maybe I
> should?

You must always do this, otherwise the system will place them where it annoys
you the most. There's a reason for this. Historically, many servers used to eat
lots of resources while the system part was minimal. Thus it made sense to
try to place interrupts on the same cores as the application in order to at
least reduce the required memory bandwidth, which was scarce back then.
Nowadays we have huge L3 caches with improved arbitration algorithms, they
have a tremendous bandwidth, we're talking about hundreds of gigabytes per
second, so inter-core communication becomes cheap as long as you stay on
the same L3 cache. And that's nice because on the type of workloads haproxy
has to deal with, between 2/3 and 5/6 of the CPU time is spent in the network
stack. There's nothing wrong with this, haproxy is just a lazy toy instructing
the kernel about which sockets the data must flow between. But when you realize
this, you understand that your network stack definitely needs dedicated CPUs,
and must not be polluted by haproxy. That's an unusual workload and the
scheduler often makes a bad decision here. That's why you have to tune it
manually.

When dealing with only clear-text traffic, I generally recommend dedicating
2 cores and their respective HT siblings to the network stack, and using other
cores for haproxy. With 2 cores for the network, you're generally fine up to
20-40 Gbps. However, if you have to deal with a lot of SSL, then there's a
rule of thumb: SSL is so much more expensive than the network that you'll
never need the two at once. Hence it makes sense to turn back to the good
old principle of binding IRQs on the same CPUs as the SSL cores. Not to
reduce latency, but just because the network traffic required to saturate SSL
is so low that neither will annoy the other one by being on the same cores.

> My idea would be something like core 0 + 1 on each cpu + their HT siblings,
> which would result in 2 different cpus with 2 cores + 2 HT each. According to
> Willy's rule of thumb that would be "enough" (80 Gbps or 40 Gbps depending on
> how he counted HT virtual cores in his rule of thumb).
> And then I would exclude these network IRQ cores from haproxy
> frontend/listen binds.
> Does that sound like a sane setup?

Yep, that sounds good. For "large" objects you should be able to achieve 40 Gbps
on a single socket with this, and theoretically more with two sockets (but that's
not linear, you have bottlenecks everywhere, including down the PCIe bus which
can get congested by low-bandwidth cards). When I say "large", I mean that 2
years ago I managed to reach 55 Gbps with objects of 64kB and larger on a 4-core
machine. That was 100k req/s.

Willy