Re: multiproc ssl recommendations

2016-12-27 Thread Willy Tarreau
On Tue, Dec 27, 2016 at 09:47:12AM +0100, Elias Abacioglu wrote:
> On Thu, Dec 22, 2016 at 11:06 AM, Willy Tarreau wrote:
> 
> > > As for my multi proc ssl setup in case anyone was wondering:
> > > I did an ssl-offload listener that runs on all cores except core0 on each
> > > cpu + its HT sibling, relaying via unix sockets to a frontend that runs
> > > on core0 on each cpu and its HT siblings, so (0,1,28,29 in my case).
> >
> > So you have cross-CPU communications, that's really bad for performance and
> > latency. However you can have your cpu0 cores relay to the cpu0 listener
> > and the cpu1 cores relay to the cpu1 listener.
> 
> 
> How would I achieve this configuration-wise?
> How do I tell the server line to only send traffic from cpu0 or cpu1?

You can't do this, since you have no control over it. The correct
practice is to have distinct network interfaces physically attached
to their respective CPU sockets, and to have dedicated routes to reach
various destinations via the respective interfaces, or to force the
source address to match one of the interfaces you want.

Eg: let's say you have eth0 and eth2 attached to cpu0, and eth1+eth3 attached
to cpu1. You can imagine having eth0 and eth1 as the frontend interfaces,
and eth2+eth3 as the backend interfaces.

Then you can have eth0 and eth1 in the same LAN and a shared IP address
for the frontend in another LAN. The front switch/router/L3/L4 LB will
split the incoming traffic across these NICs to reach the shared
IP address. Then you have one haproxy process responsible for eth0+eth2
and another one for eth1+eth3. Similarly to the frontend, on the backend
side you have eth2 and eth3 in the same LAN, and your haproxy processes
reach the various servers using these interfaces. Since each process uses a
different source IP address, there's never any doubt regarding the return route.

Example :

listen lb1
    bind 192.168.1.1:80 interface eth0 process 1
    server srv1 192.168.2.101:80 source 192.168.2.1
    server srv2 192.168.2.102:80 source 192.168.2.1
    server srv3 192.168.2.103:80 source 192.168.2.1
    server srv4 192.168.2.104:80 source 192.168.2.1

listen lb2
    bind 192.168.1.1:80 interface eth1 process 2
    server srv1 192.168.2.101:80 source 192.168.2.2
    server srv2 192.168.2.102:80 source 192.168.2.2
    server srv3 192.168.2.103:80 source 192.168.2.2
    server srv4 192.168.2.104:80 source 192.168.2.2

eth0 : 192.168.0.1/24
eth1 : 192.168.0.2/24
lo   : 192.168.1.1/24
eth2 : 192.168.2.1/24
eth3 : 192.168.2.2/24

With eth0 and eth2 attached to cpu0, you pin process 1 to the other cores of the
same CPU socket. As you can see above, traffic flowing through process 1
never touches the other socket's NICs. Same for process 2.
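For illustration, the matching process placement could look roughly like the
following (the core numbers are hypothetical, assuming two 4-core sockets with
cores 0-3 on the socket carrying eth0+eth2 and cores 4-7 on the one carrying
eth1+eth3; adjust them to your real topology):

global
    nbproc 2
    # cores 0 and 4 are left free for each socket's local NIC interrupts
    cpu-map 1 1 2 3    # process 1 (lb1) stays on the first socket
    cpu-map 2 5 6 7    # process 2 (lb2) stays on the second socket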

I hope it's clear,
Willy




Re: multiproc ssl recommendations

2016-12-27 Thread Elias Abacioglu
On Thu, Dec 22, 2016 at 11:06 AM, Willy Tarreau wrote:

> > As for my multi proc ssl setup in case anyone was wondering:
> > I did an ssl-offload listener that runs on all cores except core0 on each
> > cpu + its HT sibling, relaying via unix sockets to a frontend that runs
> > on core0 on each cpu and its HT siblings, so (0,1,28,29 in my case).
>
> So you have cross-CPU communications, that's really bad for performance and
> latency. However you can have your cpu0 cores relay to the cpu0 listener
> and the cpu1 cores relay to the cpu1 listener.


How would I achieve this configuration-wise?
How do I tell the server line to only send traffic from cpu0 or cpu1?


Re: multiproc ssl recommendations

2016-12-22 Thread Willy Tarreau
Hi Elias,

On Wed, Dec 21, 2016 at 10:06:13PM +0100, Elias Abacioglu wrote:
> How about nginx style? nbproc auto + cpu-map auto?

Well, based on my experience with many different setups, I can tell
you that there isn't any single default setting which will be even
basically right for more than a single use case :-/

At least we don't have to specify bitmaps so that's probably easier
for most users.

> +1 on a per-process bind line (or auto).
> (auto would mean a good enough default setup)

"good enough" :-)

"good enough" for SSL means "bad enough" for network and vice-versa
for example. I tend to think that the "per-process" directive might
be the best trade-off, because you still have control over where you
want to bind your frontend and then just request one listening socket
per process. Writing one bind line per process is currently the part
that is painful to configure for most users, especially since every
improperly bound listener will end up on process #1.
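To make the pain concrete, the explicit form currently looks like the sketch
below (made-up process numbers, certificate path reused from elsewhere in the
thread); the proposed "per-process" keyword would collapse the repeated bind
lines into a single one:

frontend https
    bind-process 1-4
    bind :443 ssl crt /etc/ssl/xx.pem process 1
    bind :443 ssl crt /etc/ssl/xx.pem process 2
    bind :443 ssl crt /etc/ssl/xx.pem process 3
    bind :443 ssl crt /etc/ssl/xx.pem process 4
    # proposed sugar, not implemented yet:
    # bind :443 ssl crt /etc/ssl/xx.pem per-process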

> As for my multi proc ssl setup in case anyone was wondering:
> I did an ssl-offload listener that runs on all cores except core0 on each
> cpu + its HT sibling, relaying via unix sockets to a frontend that runs on
> core0 on each cpu and its HT siblings, so (0,1,28,29 in my case).

So you have cross-CPU communications, that's really bad for performance and
latency. However you can have your cpu0 cores relay to the cpu0 listener and
the cpu1 cores relay to the cpu1 listener.

> I haven't configured any "dedicated" cores for network interrupts.. Maybe I
> should?

You must always do this, otherwise the system will place them where it annoys
you the most. There's a reason for this. Historically, many servers used to eat
lots of resources while the system part was minimal. Thus it made sense to
try to place interrupts on the same cores as the application in order to at
least reduce the required memory bandwidth, which was scarce back then.
Nowadays we have huge L3 caches with improved arbitration algorithms; they
have a tremendous bandwidth, we're talking about hundreds of gigabytes per
second, so inter-core communication becomes cheap as long as you stay on
the same L3 cache. And that's nice because on the type of workloads haproxy
has to deal with, between 2/3 and 5/6 of the CPU time is spent in the network
stack. There's nothing wrong with this, haproxy is just a lazy toy instructing
the kernel which sockets the data must flow between. But when you realize this,
you understand that your network stack definitely needs dedicated CPUs, and
must not be polluted by haproxy. That's an unusual workload and the scheduler
often makes bad decisions with it. That's why you have to tune it manually.

When dealing with only clear-text traffic, I generally recommend dedicating
2 cores and their respective HT siblings to the network stack, and using the
other cores for haproxy. With 2 cores for the network, you're generally fine up
to 20-40 Gbps. However if you have to deal with a lot of SSL, then there's a
rule of thumb : SSL is so much more expensive than the network that you'll
never need the two at once. Hence it makes sense to turn back to the good
old principle of binding IRQs on the same CPUs as the SSL cores. Not to
reduce latency, but just because the network traffic required to saturate SSL
is so low that neither will annoy the other by being on the same cores.
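As a rough illustration of that rule of thumb (hypothetical core numbers, and
the IRQ affinity itself is set outside haproxy, e.g. via
/proc/irq/<n>/smp_affinity_list), the SSL processes can simply share the cores
the NIC queues are pinned to:

global
    nbproc 4
    # assumption: the NIC queue IRQs are pinned to cores 0-3, and processes
    # 1-4 are the SSL offloading processes bound to the same cores
    cpu-map 1 0
    cpu-map 2 1
    cpu-map 3 2
    cpu-map 4 3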

> My idea would be like core 0 + 1 on each cpu + its HT siblings, which
> would result in 2 different CPUs with 2 cores + 2 HT siblings each. According
> to Willy's rule of thumb that would be "enough" (80 Gbps or 40 Gbps depending
> on how he counted HT virtual cores).
> And then I would exclude these network IRQ cores from the haproxy
> frontend/listen binds.
> Sounds good as a sane setup?

Yep that sounds good. For "large" objects you should be able to achieve 40 Gbps
on a single socket with this, and theoretically more with two sockets (but that's
not linear, you have bottlenecks everywhere including down the PCIe bus, which
can get congested by low-bandwidth cards). When I say "large", I mean that 2
years ago I managed to reach 55 Gbps with objects 64kB and larger on a 4-core
machine. That was 100k req/s.
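In haproxy terms, excluding the IRQ cores could look roughly like this (the
logical CPU numbers are only a guess based on the enumeration quoted above,
where core0 of each socket shows up as 0/1 and its HT siblings as 28/29;
check your own numbering with lscpu before reusing anything):

global
    # CPUs 0-3 and 28-31 (core 0+1 of each socket plus their HT siblings)
    # are assumed to be reserved for network IRQs, so they are not mapped here
    nbproc 48
    cpu-map 1 4
    cpu-map 2 5
    # ... one process per remaining logical CPU, skipping 28-31, up to:
    cpu-map 48 55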

Willy



Re: multiproc ssl recommendations

2016-12-21 Thread Elias Abacioglu
On Sat, Dec 10, 2016 at 8:52 AM, Willy Tarreau wrote:

> On Fri, Dec 09, 2016 at 08:18:45PM +0100, Pavlos Parissis wrote:
> > On 9 December 2016 at 20:07, Apollon Oikonomopoulos wrote:
> (...)
> > >> > I wonder if a `per-process' keyword would make sense here. I find
> > >> >
> > >> >   bind :443 ssl  per-process
> > >> >
> > >> > more concise than 15 or 20 individual bind lines. This would have
> the
> > >> > same effect as N bind lines, one for each process in the
> bind-process
> > >> > list.
> (...)
> > Indeed, that would be nice. I guess it isn't a big issue as most of the
> > people use a configuration management tool, which does the expansion.
>
> I find that this is a very good idea. We need to be careful when
> implementing
> it because it will definitely come with problematic cases, but the idea is
> good. In fact, Manu suggested to me in private that using multiple bind
> lines is not convenient for him because he loads tons of certificates and
> it would require him to load them multiple times (takes more time, eats
> more
> memory). Something like the above, if properly designed, would solve that
> as
> well.
>
> I think we have to think a bit about a reusable implementation because we
> also need to implement something comparable for the DNS resolvers so that
> there's a per-process socket. In the end I suspect that we'll end up having
> a list of FDs instead of a single FD for each process.
>
> Also recently I noted that the cpu-map statement is boring when you deal
> with many processes, and even more when you want to experiment with
> different nbproc values because often you have to comment out many lines
> and try again with many new ones. Most often we just want to have one
> CPU for one process and they have to follow a regular pattern, eg +1 for
> the process means +1 for the CPU. But sometimes due to hyperthreading or
> NUMA you may need to use +2 or +8. Thus I was thinking we could have an
> automatic cpu-set value by using something more or less like this :
>
> cpu-map  1-10 2+1
> cpu-map 11-20 16+1
>
> This would do the same as this :
>
> cpu-map 1 2
> cpu-map 2 3
> cpu-map 3 4
> ...
> cpu-map 10 11
> cpu-map 11 16
> ...
> cpu-map 20 25
>
> We could also have this :
>
> cpu-map  1-10 2+2
>
>   equivalent to :
> cpu-map 1 2
> cpu-map 2 4
> cpu-map 3 6
> cpu-map 4 8
> ...
>
> And maybe we can add a "/X" statement to apply a modulo after the increment
> and limit the number of CPUs used in the loop :
>
> cpu-map  1-7 2+8/14
>
>   equivalent to :
> cpu-map 1 2
> cpu-map 2 10
> cpu-map 3 4
> cpu-map 4 12
> cpu-map 5 6
> cpu-map 6 14
> cpu-map 7 8
>
> This can be useful to automatically enable use of some NUMA nodes or not
> depending on the nbproc value.
>
> Maybe others have other ideas, they're welcome.
>
> Cheers,
> Willy


Hi,

How about nginx style? nbproc auto + cpu-map auto?
+1 on a per-process bind line (or auto).
(auto would mean a good enough default setup)



As for my multi proc ssl setup in case anyone was wondering:
I did a ssl-offload listener that runs on all cores except core0 on each
cpu + it's HT sibling.
relaying via unix sockets to a frontend that runs on core0 on each cpu and
it's HT siblings, so (0,1,28,29 in my case).


I haven't configured any "dedicated" cores for network interrupts.. Maybe I
should?
My idea would be like core 0 + 1 on each cpu + its HT siblings, which
would result in 2 different CPUs with 2 cores + 2 HT siblings each. According
to Willy's rule of thumb that would be "enough" (80 Gbps or 40 Gbps depending
on how he counted HT virtual cores).
And then I would exclude these network IRQ cores from the haproxy
frontend/listen binds.
Sounds good as a sane setup?

/Elias


Re: multiproc ssl recommendations

2016-12-09 Thread Willy Tarreau
On Fri, Dec 09, 2016 at 08:18:45PM +0100, Pavlos Parissis wrote:
> On 9 December 2016 at 20:07, Apollon Oikonomopoulos  
> wrote:
(...)
> >> > I wonder if a `per-process' keyword would make sense here. I find
> >> >
> >> >   bind :443 ssl  per-process
> >> >
> >> > more concise than 15 or 20 individual bind lines. This would have the
> >> > same effect as N bind lines, one for each process in the bind-process
> >> > list.
(...)
> Indeed, that would be nice. I guess it isn't a big issue as most of the
> people use a configuration management tool, which does the expansion.

I find that this is a very good idea. We need to be careful when implementing
it because it will definitely come with problematic cases, but the idea is
good. In fact, Manu suggested to me in private that using multiple bind
lines is not convenient for him because he loads tons of certificates and
it would require him to load them multiple times (takes more time, eats more
memory). Something like the above, if properly designed, would solve that as
well.

I think we have to think a bit about a reusable implementation because we
also need to implement something comparable for the DNS resolvers so that
there's a per-process socket. In the end I suspect that we'll end up having
a list of FDs instead of a single FD for each process.

Also recently I noted that the cpu-map statement is boring when you deal
with many processes, and even more when you want to experiment with
different nbproc values because often you have to comment out many lines
and try again with many new ones. Most often we just want to have one
CPU for one process and they have to follow a regular pattern, eg +1 for
the process means +1 for the CPU. But sometimes due to hyperthreading or
NUMA you may need to use +2 or +8. Thus I was thinking we could have an
automatic cpu-set value by using something more or less like this :

cpu-map  1-10 2+1
cpu-map 11-20 16+1

This would do the same as this :

cpu-map 1 2
cpu-map 2 3
cpu-map 3 4
...
cpu-map 10 11
cpu-map 11 16
...
cpu-map 20 25

We could also have this :

cpu-map  1-10 2+2

  equivalent to :
cpu-map 1 2
cpu-map 2 4
cpu-map 3 6
cpu-map 4 8
...

And maybe we can add a "/X" statement to apply a modulo after the increment
and limit the number of CPUs used in the loop :

cpu-map  1-7 2+8/14

  equivalent to :
cpu-map 1 2
cpu-map 2 10
cpu-map 3 4
cpu-map 4 12
cpu-map 5 6
cpu-map 6 14
cpu-map 7 8

This can be useful to automatically enable use of some NUMA nodes or not
depending on the nbproc value.

Maybe others have other ideas, they're welcome.

Cheers,
Willy



Re: multiproc ssl recommendations

2016-12-09 Thread Pavlos Parissis
On 9 December 2016 at 20:07, Apollon Oikonomopoulos  wrote:
> Hi Pavlos,
>
> On 17:31 Fri 09 Dec , Pavlos Parissis wrote:
>> On 09/12/2016 08:54 πμ, Apollon Oikonomopoulos wrote:
>> > Hi Willy, Elias,
>> >
>> > On 08:33 Fri 09 Dec , Willy Tarreau wrote:
>> >> On Thu, Dec 01, 2016 at 02:53:25PM +0100, Elias Abacioglu wrote:
>> >>> # Should I use core 0 on each CPU for backends (proc 1+15) or should
>> >>> I
>> >>> use core 1(proc 2+16)?
>> >>
>> >> Backends are processed on the same CPU as the frontend which passes them
>> >> the traffic, so the bind-process has no effect there. In fact bind-process
>> >> on a backend means "at least on these processes".
>> >>
>> >> That's why it's better to proceed like this (stupid numbers, just so that
>> >> you get the idea):
>> >>
>> >>listen ssl-offload
> >> >>   bind-process 2-50
>> >>   bind :443 ssl  process 2
>> >>   ...
>> >>   bind :443 ssl  process 50
>> >
>> > I wonder if a `per-process' keyword would make sense here. I find
>> >
>> >   bind :443 ssl  per-process
>> >
>> > more concise than 15 or 20 individual bind lines. This would have the
>> > same effect as N bind lines, one for each process in the bind-process
>> > list.
>>
>> If you have one bind per process then all sockets are bound separately and
>> you get X listening sockets on port 443, which results in one distinct socket
>> in each process with its own queue (SYN backlog queue, etc.), and the kernel's
>> load balancing works much better.
>
> That's true, yes. However what I'm saying is that some syntactic sugar
> to have the parser auto-expand a single "bind" directive to create N
> sockets instead of one, would be nice.
>

Indeed, that would be nice. I guess it isn't a big issue as most people use a
configuration management tool, which does the expansion.

Cheers,
Pavlos



Re: multiproc ssl recommendations

2016-12-09 Thread Apollon Oikonomopoulos
Hi Pavlos,

On 17:31 Fri 09 Dec , Pavlos Parissis wrote:
> On 09/12/2016 08:54 πμ, Apollon Oikonomopoulos wrote:
> > Hi Willy, Elias,
> > 
> > On 08:33 Fri 09 Dec , Willy Tarreau wrote:
> >> On Thu, Dec 01, 2016 at 02:53:25PM +0100, Elias Abacioglu wrote:
> >>> # Should I use core 0 on each CPU for backends (proc 1+15) or should 
> >>> I
> >>> use core 1(proc 2+16)?
> >>
> >> Backends are processed on the same CPU as the frontend which passes them
> >> the traffic, so the bind-process has no effect there. In fact bind-process
> >> on a backend means "at least on these processes".
> >>
> >> That's why it's better to proceed like this (stupid numbers, just so that
> >> you get the idea):
> >>
> >>listen ssl-offload
> >>   bind-process 2-50
> >>   bind :443 ssl  process 2
> >>   ...
> >>   bind :443 ssl  process 50
> > 
> > I wonder if a `per-process' keyword would make sense here. I find
> > 
> >   bind :443 ssl  per-process
> > 
> > more concise than 15 or 20 individual bind lines. This would have the 
> > same effect as N bind lines, one for each process in the bind-process 
> > list.
> 
> If you have one bind per process then all sockets are bound separately and you
> get X listening sockets on port 443, which results in one distinct socket
> in each process with its own queue (SYN backlog queue, etc.), and the kernel's
> load balancing works much better.

That's true, yes. However what I'm saying is that some syntactic sugar 
to have the parser auto-expand a single "bind" directive to create N 
sockets instead of one, would be nice.

Regards,
Apollon



Re: multiproc ssl recommendations

2016-12-09 Thread Pavlos Parissis
On 09/12/2016 08:54 πμ, Apollon Oikonomopoulos wrote:
> Hi Willy, Elias,
> 
> On 08:33 Fri 09 Dec , Willy Tarreau wrote:
>> On Thu, Dec 01, 2016 at 02:53:25PM +0100, Elias Abacioglu wrote:
>>> # Should I use core 0 on each CPU for backends (proc 1+15) or should 
>>> I
>>> use core 1(proc 2+16)?
>>
>> Backends are processed on the same CPU as the frontend which passes them
>> the traffic, so the bind-process has no effect there. In fact bind-process
>> on a backend means "at least on these processes".
>>
>> That's why it's better to proceed like this (stupid numbers, just so that
>> you get the idea):
>>
>>listen ssl-offload
>>   bind-process 2-50
>>   bind :443 ssl  process 2
>>   ...
>>   bind :443 ssl  process 50
> 
> I wonder if a `per-process' keyword would make sense here. I find
> 
>   bind :443 ssl  per-process
> 
> more concise than 15 or 20 individual bind lines. This would have the 
> same effect as N bind lines, one for each process in the bind-process 
> list.

If you have one bind per process then all sockets are bound separately and you
get X listening sockets on port 443, which results in one distinct socket
in each process with its own queue (SYN backlog queue, etc.), and the kernel's
load balancing works much better.

So, haproxy performs better!

Cheers,
Pavlos





Re: multiproc ssl recommendations

2016-12-08 Thread Apollon Oikonomopoulos
Hi Willy, Elias,

On 08:33 Fri 09 Dec , Willy Tarreau wrote:
> On Thu, Dec 01, 2016 at 02:53:25PM +0100, Elias Abacioglu wrote:
> > # Should I use core 0 on each CPU for backends (proc 1+15) or should 
> > I
> > use core 1(proc 2+16)?
> 
> Backends are processed on the same CPU as the frontend which passes them
> the traffic, so the bind-process has no effect there. In fact bind-process
> on a backend means "at least on these processes".
> 
> That's why it's better to proceed like this (stupid numbers, just so that
> you get the idea):
> 
>listen ssl-offload
>   bind-process 2-50
>   bind :443 ssl  process 2
>   ...
>   bind :443 ssl  process 50

I wonder if a `per-process' keyword would make sense here. I find

  bind :443 ssl  per-process

more concise than 15 or 20 individual bind lines. This would have the 
same effect as N bind lines, one for each process in the bind-process 
list.

>   server clear 127.0.0.1:1 send-proxy-v2
> 
>frontend clear
>   bind-process 1
>   bind 127.0.0.1:1 accept-proxy

Would you recommend using unix sockets for process-to-process 
communication, at least to get rid of the TCP state overhead?

Regards,
Apollon



Re: multiproc ssl recommendations

2016-12-08 Thread Willy Tarreau
Hi Elias,

On Thu, Dec 01, 2016 at 02:53:25PM +0100, Elias Abacioglu wrote:
> Hi,
> 
> Similar to what Christian asked about a few days ago, I would like help
> summarizing the recommendations for running haproxy as an SSL LB on
> a multi-CPU, multi-core machine.
> 
> I have a machine with two sockets equipped with Intel Xeon E5-2680 v4.
> 56 cores in total with HT enabled, 28 with HT disabled.
> 
> Summary(correct me where I'm wrong):
> Hyperthreading - should be disabled

HT is useful with SSL, just like it performs well for most NIC drivers,
provided you never mix two different workloads on two threads of the
same core (typically you put two SSL processes on the same core, or two
network interrupts).

For regular L7 traffic, it doesn't bring anything and can even sometimes
have a slightly negative effect, so it's better not to use the second
thread of cores that are used for non-SSL traffic.

My recommendation here remains the same : just have dedicated SSL
offloading processes running on their own cores, and forward all this
traffic to the L7 running on a single process. This allows you to get
better stats, better checks, etc.

> As for network IRQ pinning, should I not pin the NIC to core 0 of each
> CPU? Should I split it up between the cores when doing SSL termination
> on most cores?

It's more complicated, and it depends on what workload you can achieve on
a single core. For my use cases, I tend to assign network IRQs to pairs
of threads and give them "enough cores" so that I don't see ksoftirqd
pop up anymore. But there's no one-size-fits-all. For example on our
ALOHA appliances, a single CPU core (2 threads) is enough for the kernel
NICs to let our anti-DDoS PacketShield deal with line-rate SYN floods at
10 Gbps, or to forward 10 Gbps of traffic. On another machine I tested two
years ago, on which we reached ~60 Gbps of L7 traffic, we needed about
4 cores to reach the highest performance level, but unfortunately we had
only 4 :-/ I tend to think that one rough rule of thumb could be about
1 core per 10 Gbps of traffic.

(...)
> # (didn't see any cpu-map in Christian's configuration, is it not
> needed anymore?)

It has never been "needed"; use it if you want to pin specific processes
to specific CPUs (i.e. useful for SSL vs L7).

> frontend https
> bind :443 ssl crt /etc/ssl/xx.pem process 1
> 
> bind :443 ssl crt /etc/ssl/xx.pem process 28
> # should I omit process 1/2 and 15/16 here?

No reason for this.

> # Should I use core 0 on each CPU for backends (proc 1+15) or should I
> use core 1(proc 2+16)?

Backends are processed on the same CPU as the frontend which passes them
the traffic, so the bind-process has no effect there. In fact bind-process
on a backend means "at least on these processes".

That's why it's better to proceed like this (stupid numbers, just so that
you get the idea):

   listen ssl-offload
  bind-process 2-50
  bind :443 ssl  process 2
  ...
  bind :443 ssl  process 50
  server clear 127.0.0.1:1 send-proxy-v2

   frontend clear
  bind-process 1
  bind 127.0.0.1:1 accept-proxy
  ...
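A variant of the same relay over a unix socket instead of the loopback would
avoid the TCP state overhead; a sketch (the socket path is made up):

listen ssl-offload
   bind-process 2-50
   bind :443 ssl  process 2
   ...
   bind :443 ssl  process 50
   server clear unix@/var/run/haproxy-clear.sock send-proxy-v2

frontend clear
   bind-process 1
   bind unix@/var/run/haproxy-clear.sock accept-proxy
   ...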

> I just read the release notes for haproxy 1.7. There was mention of
> mixing different certificate types under "support of multi-certs".
> 
> > Previously I would have a tcp-ssl that would send traffic to an ecc
> > backend and an rsa backend depending on req.ssl_ec_ext, which in turn
> would send traffic to a frontend with two sockets, one for each
> certificate type.
> To something like this (hope my ascii doesn't get messed up).

it was messed up, you need to have fixed size fonts for ascii art
and I suspect you were using a proportional font to read it in your
mailer :-)

> +-->backend01
> +--->backend-ecc+ 
> |
> frontend-tcp-ssl:443 |
> +--->frontend-https+-->backend02
> +--->backend-rsa +
>  |
> 
> +-->backend03
> 
> Does this mean I can now mix RSA and ECC certificates on the same
> frontend and skip the entire frontend-backend-frontend trick?

Absolutely! That was the purpose.
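For instance, with 1.7 a single bind line can carry both certificate types;
a sketch assuming the 1.7 cert-bundle naming, i.e. /etc/ssl/xx.pem.rsa and
/etc/ssl/xx.pem.ecdsa sitting next to each other so that haproxy picks the
right one based on the client hello:

frontend https
   bind :443 ssl crt /etc/ssl/xx.pem process 1
   ...
   bind :443 ssl crt /etc/ssl/xx.pem process 28
   # backends attached as usual, no tcp-ssl relay needed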

Regards,
Willy



multiproc ssl recommendations

2016-12-01 Thread Elias Abacioglu
Hi,

Similar to what Christian asked about a few days ago, I would like help
summarizing the recommendations for running haproxy as an SSL LB on
a multi-CPU, multi-core machine.

I have a machine with two sockets equipped with Intel Xeon E5-2680 v4.
56 cores in total with HT enabled, 28 with HT disabled.

Summary (correct me where I'm wrong):
Hyperthreading - should be disabled
As for network IRQ pinning, should I not pin the NIC to core 0 of each
CPU? Should I split it up between the cores when doing SSL termination
on most cores?

global
nbproc 28 # equal to the number of non-HT cores (28 in my case)
# cpu-maps equal to the number of non-HT cores.
# (didn't see any cpu-map in Christian's configuration, is it not
# needed anymore?)
cpu-map 1 0
cpu-map 2 1
...
cpu-map 28 27

frontend https
bind :443 ssl crt /etc/ssl/xx.pem process 1

bind :443 ssl crt /etc/ssl/xx.pem process 28
# should I omit process 1/2 and 15/16 here?

# Should I use core 0 on each CPU for backends (proc 1+15) or should I
use core 1(proc 2+16)?
backend x
bind-process 1 15
backend y
bind-process 2 16
backend z
# or should I use a bind-process like the frontend's bind?
bind-process 2
bind-process 16


I just read the release notes for haproxy 1.7. There was mention of
mixing different certificate types under "support of multi-certs".

Previously I would have a tcp-ssl frontend that would send traffic to an ecc
backend and an rsa backend depending on req.ssl_ec_ext, which in turn
would send traffic to a frontend with two sockets, one for each
certificate type. Something like this (hope my ascii doesn't get messed up):


                     +--->backend-ecc---+                 +-->backend01
                     |                  |                 |
frontend-tcp-ssl:443-+                  +->frontend-https-+-->backend02
                     |                  |                 |
                     +--->backend-rsa---+                 +-->backend03

Does this mean I can now mix RSA and ECC certificates on the same
frontend and skip the entire frontend-backend-frontend trick?

  +-->backend01
  |
frontend-https:443+-->backend02
  |
  +-->backend03

Thanks,
Elias