Feature suggestion: Check for same binding on multiple frontends

2018-03-06 Thread Moomjian, Chad
Haproxy Developers,

I recently modified a configuration file for haproxy, and after setting it up, 
I noticed that about half of my requests came back with a 503 error, and the 
other half came back with the correct elements being returned.

After doing troubleshooting involving a test haproxy instance and changing the 
IP address, I realized that I had mistakenly added the same IP binding, 
10.x.x.11:443, in two different frontends. As a result, half of my requests had 
no matching path (we don't use a default backend), and the other half were 
responding correctly.

Since I cannot think of a time when this would be desired behavior, would it be 
possible to add a check on haproxy startup for the exact same IP binding in 
multiple frontends of the same config file? This could save me and others from 
possibly making this mistake in the future.
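
To illustrate the kind of check meant here, below is a minimal sketch in Python of an external lint script; it is not something haproxy does itself. It compares bind addresses as plain strings, so it would not notice overlaps such as *:443 versus a specific IP. The section keyword list is an assumption and may be incomplete, and the function name and the 192.0.2.x addresses are made up for the example.

```python
from collections import defaultdict

# Keywords assumed to start a new section (possibly incomplete).
SECTION_KEYWORDS = {
    "global", "defaults", "frontend", "backend", "listen",
    "peers", "userlist", "resolvers", "mailers",
}


def find_duplicate_binds(config_text):
    """Return {address: [section names]} for addresses bound in 2+ sections."""
    seen = defaultdict(list)
    section = None
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # naive comment stripping
        if not line:
            continue
        words = line.split()
        if words[0] in SECTION_KEYWORDS:
            # Only frontend and listen sections are tracked for bind lines.
            tracked = words[0] in ("frontend", "listen") and len(words) > 1
            section = words[1] if tracked else None
            continue
        if words[0] == "bind" and section and len(words) > 1:
            # One bind line may carry several comma-separated addresses.
            for addr in words[1].split(","):
                if section not in seen[addr]:
                    seen[addr].append(section)
    return {addr: names for addr, names in seen.items() if len(names) > 1}


if __name__ == "__main__":
    sample = """
frontend fe_a
    bind 192.0.2.11:443
frontend fe_b
    bind 192.0.2.11:443
"""
    for addr, names in find_duplicate_binds(sample).items():
        print(f"warning: {addr} is bound in {', '.join(names)}")
```

Run against a config file before a reload, a script like this would flag the exact situation described above.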

Thanks for the consideration and for the great product. We switched our entire 
production and pre-production environments from F5's to haproxy load balancers, 
and everyone has been really happy with the move. The haproxy load balancers 
actually outperform the expensive, commercial, physical devices.

Regards,

Chad Moomjian
Cloud Systems Administrator

OutMatch



Re: segfault in haproxy=1.8.4

2018-03-06 Thread Максим Куприянов
Hi, Willy!

If it could be interesting, I got a new core with exactly the same
backtrace:
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x563f5373bf21 in __pendconn_free (p=0x563f560db0c8) at src/queue.c:296
296 HA_ATOMIC_SUB(&p->strm->be->nbpend, 1);
(gdb) bt
#0  0x563f5373bf21 in __pendconn_free (p=0x563f560db0c8) at src/queue.c:296
#1  0x563f5373c1de in pendconn_get_next_strm (px=0x563f560da290, srv=0x563f560df380) at src/queue.c:122
#2  process_srv_queue (s=0x563f560df380) at src/queue.c:153
#3  0x563f536bf192 in sess_update_st_cer (s=s@entry=0x7f42841d2f60) at src/stream.c:742
#4  0x563f536c356d in process_stream (t=0x7f42841d32c0) at src/stream.c:1783
#5  0x563f5373baa8 in process_runnable_tasks () at src/task.c:311
#6  0x563f536f0c84 in run_poll_loop () at src/haproxy.c:2398
#7  run_thread_poll_loop (data=) at src/haproxy.c:2460
#8  0x7f42af227184 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#9  0x7f42ae4a503d in clone () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) print p
$9 = (struct pendconn *) 0x563f560db0c8
(gdb) print p->strm
$10 = (struct stream *) 0x7f4385291f91
(gdb) print p->strm->uniq_id
Cannot access memory at address 0x7f4385291f95

--
Best regards,
Maksim Kupriianov


2018-03-05 19:35 GMT+03:00 Willy Tarreau :

> On Mon, Mar 05, 2018 at 09:19:16PM +0500, ?? ? wrote:
> > Hi Willy!
> >
> > I have 2 more haproxy servers with exactly the same configuration and
> > load. Both have threads compiled in but not enabled in the config (no
> > nbthread), and there are no segfaults at all. So I'm sure everything is
> > fine without threads.
> > Haproxy's config file itself is way too large to find the exact proxy
> > section where the fault occurred and tell you about its configuration :(
> > But if it helps: we have a few heavily loaded proxy sections with
> > hundreds of servers inside, the round-robin algorithm and maxconn=2.
> > Stick tables are enabled, but only to calculate average traffic
> > characteristics for use in ACLs, so there is no real stickiness on the
> > backends.
> > I still have the core file available, so I can extract some more detailed
> > traces if you need them.
>
> OK, thanks for these details, this is already quite valuable information.
> Please don't lose the executable file that produced the core, in case we'd
> need to dig deeper into it.
>
> Cheers,
> Willy
>


Re: Dynamically adding/deleting SSL certificates

2018-03-06 Thread Aurélien Nephtali
Willy,

On Tue, Mar 6, 2018 at 2:30 PM, Willy Tarreau  wrote:
> More or less. I'd rather return a 3rd case, like we do with samples :
> "not sure yet" (need more data to decide). That allows the failure and
> success cases to remain definitive.

Indeed, that's what I was trying to say with: "to wait until a keyword
handler says it is ready to process the command" :) : it will keep
saying "need more data" until it can decide and say "OK I continued
and processed the command" or "I couldn't process it".

> I just don't know exactly what is *currently* used. And we make it a
> hard rule not to break existing deployments on purpose. People write
> management scripts, utilities etc and purposely breaking their API is
> really not fun at all. This is the *only* reason here.

And I fully agree with that. I want to add the feature without being
too intrusive or breaking backward compatibility.

> I think we're in sync on this. Please take a look at OCSP to see how it's
> currently handled so that we don't have to imagine all possibly stupid
> cases. Also take a look at {set|add} {acl|map} and I think that should
> be all for now to get the whole picture of the compatibility we have to
> maintain.
>

I will do that and check if everything can fit in.

Thanks !

-- 
Aurélien Nephtali



Re: Dynamically adding/deleting SSL certificates

2018-03-06 Thread Willy Tarreau
Hello Aurélien,

On Tue, Mar 06, 2018 at 02:13:31PM +0100, Aurélien Nephtali wrote:
> > Probably that we could in fact extend the CLI syntax in a backwards
> > compatible way :
> >
> >[ ]* *
> >[optional body]
> >
> > Most commands don't use a body. Those using a body have to terminate it
> > using either its own representation known by the command, or an empty line.
> 
> Wouldn't it be easier/clearer to always require an empty line if there
> is a body ?

In theory I fully agree. The only thing is that I'm uncertain about the
format used by OCSP right now. I think it starts as an argument, but I'm
not certain whether it continues on multiple lines.

> Even if it can be present but optional, an empty line would still
> signal the end of it.

If that doesn't break OCSP, I'm all in favor of this, at least for the
sake of avoiding layering violation : the CLI layer transports the body
without having to understand it and it's only the consumer which decides
if it's complete or not while decoding it.

> > "set ssl ocsp-response" takes a series of words in arguments, or a body.
> > Some commands might even simply support the concatenation of both, ie they
> > start immediately on the argument, detect the block is not completed, and
> > read the rest on the following lines till the end is found (either defined
> > by the data format or by the empty line). This simplifies the syntax of
> > long commands, to support both the first line as the first series of
> > arguments, or as the first line of the body.
> 
> Basically, it boils down to adding multi-lines support to the CLI, right ?

Looks like this indeed.

> The parser will have to be modified to wait until a keyword handler
> says it is ready to process the command if it has every argument it
> needs (continue) or some are missing (fail).

More or less. I'd rather return a 3rd case, like we do with samples :
"not sure yet" (need more data to decide). That allows the failure and
success cases to remain definitive.
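
As a rough illustration of that three-way answer, here is a minimal sketch in Python. It is not haproxy code: the Verdict names, the add_map_handler function and the feed loop are all hypothetical, and the handler only models the proposed "add map" forms (inline key and value, or a body ended by an empty line).

```python
from enum import Enum


class Verdict(Enum):
    NEED_MORE = "need more data"  # the "not sure yet" case
    DONE = "processed"            # definitive success
    FAILED = "rejected"           # definitive failure


def add_map_handler(lines):
    """Hypothetical handler for 'add map <file> [<key> <value>]'.

    `lines` holds every line received so far for this command.
    """
    args = lines[0].split()
    if len(args) < 3 or args[:2] != ["add", "map"]:
        return Verdict.FAILED
    if len(args) == 5:        # key and value given inline: no body expected
        return Verdict.DONE
    if len(args) != 3:
        return Verdict.FAILED
    if lines[-1] != "":       # body still open: no empty line seen yet
        return Verdict.NEED_MORE
    body = lines[1:-1]
    ok = bool(body) and all(len(entry.split()) == 2 for entry in body)
    return Verdict.DONE if ok else Verdict.FAILED


def feed(handler, incoming):
    """Pass lines to the handler until it gives a definitive verdict."""
    received = []
    for line in incoming:
        received.append(line)
        verdict = handler(received)
        if verdict is not Verdict.NEED_MORE:
            return verdict, received
    return Verdict.NEED_MORE, received
```

With this shape, the transport loop never has to guess: it keeps reading only while the handler answers NEED_MORE, and both other answers are final.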

> > E.g. we could write (just using maps as a simple example) :
> >
> >add map foo.txt foo1 bar1
> >
> > or:
> >
> >add map foo.txt
> >foo1 bar1
> >foo2 bar2
> >LF
> 
> [1]
> 
> >
> > For OCSP, we could have :
> >
> >set ssl ocsp-response line1
> >line2
> >line3
> >line4===
> >
> > or :
> >set ssl ocsp-response line1 line2 line3 line4===
> >
> > or :
> >set ssl ocsp-response
> >line1
> >line2
> >line3
> >line4===
> 
> Is there an advantage to allow these two syntaxes versus always the
> one in [1] ? Why not always requiring an empty line if multi-line is
> used ?

I just don't know exactly what is *currently* used. And we make it a
hard rule not to break existing deployments on purpose. People write
management scripts, utilities etc and purposely breaking their API is
really not fun at all. This is the *only* reason here. However since
I'm not sure about the current OCSP syntax, anything which still works
will be fine. For example, if it supports words passed as arguments
and doesn't use any extra line, we can simply make it accept either
an OCSP response passed as arguments, in which case there is no body,
or no argument at all in which case a body is expected, and terminated
by the empty line.

> That would still require cooperation from the handlers but it
> will stay simple: do I have enough data to proceed ?

That's why the "need more info" feedback from the handlers is needed,
and why (if possible), avoiding having to feed unterminated bodies to
these handlers "just in case" is better. Now that I'm thinking about
it, I don't see how the OCSP would currently consume extra lines, so
maybe it only works using arguments and we're already fine.

> In the case of ocsp-response, without an empty line, it would mean
> trying to decode the whole buffer at each new data - or maybe just
> what was added if base64dec() is modified to support partial data.
> Waiting for an empty line would simplify the logic by only decoding
> the buffer once - this is an example for ocsp-response but I think
> processing arguments once is easier than processing them at each new
> data.

I wholeheartedly agree as long as we don't break existing ones ;-)
I'm even fine with not documenting the deprecated behaviours and only
documenting the new one to encourage migration.

> Requiring an empty line would also solve the case of commands with an
> unknown number of arguments. I do not know if there are some yet but
> they could then safely be added.

Indeed, like what I proposed for maps/acls.

> In the particular case of a PEM certificate, the handler would wait
> for an empty line and it would know it can treat what is after the
> known arguments as being the certificate without trying to guess it is
> complete - in the case where there would be no empty line.

Definitely!

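I think we're in sync on this. Please take a look at OCSP to see how it's
currently handled so that we don't have to imagine all possibly stupid
cases. Also take a look at {set|add} {acl|map} and I think that should
be all for now to get the whole picture of the compatibility we have to
maintain.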

Re: What is a nice way to bypass the maintenance mode for certain IP's?

2018-03-06 Thread Willy Tarreau
On Tue, Mar 06, 2018 at 09:48:11AM +, Pieter Vogelaar wrote:
> Does use-server also accept some keyword to address the first server in the
> backend instead of a specific valid server name of the backend?

Hmmm no, there's no such feature. I'm not sure I clearly see the real
use case to be honest, considering that you should have a farm with
many servers. Or maybe you'd only want to target a specific server
position, like server #1, server #2 etc, to be sure to iterate over all
of them using enough rules ?

> That would save quite a bit of logic complexity in Puppet.

It could be useful if you could shortly describe what the challenge is
in this case and how you would see it addressed. Maybe it's something
not too difficult to implement if it can be useful.

Cheers,
Willy



Re: Dynamically adding/deleting SSL certificates

2018-03-06 Thread Aurélien Nephtali
Hello Willy,

On Mon, Mar 5, 2018 at 8:37 PM, Willy Tarreau  wrote:
> Quotes could be part of some future statements and we'd
> possibly regret having used them if already used for this. For example we
> could imagine one day uploading some JSON parts for certain things.

True, although that could also be addressed by switching the input mode
to a certain flavour (JSON, XML, whatever). Still, handling multiple
lines also solves the problem and does not require supporting quotes.

> Probably that we could in fact extend the CLI syntax in a backwards compatible
> way :
>
>[ ]* *
>[optional body]
>
> Most commands don't use a body. Those using a body have to terminate it using
> either its own representation known by the command, or an empty line.

Wouldn't it be easier/clearer to always require an empty line if there
is a body ?
Even if it can be present but optional, an empty line would still
signal the end of it.

> "set ssl ocsp-response" takes a series of words in arguments, or a body.
> Some commands might even simply support the concatenation of both, ie they
> start immediately on the argument, detect the block is not completed, and
> read the rest on the following lines till the end is found (either defined
> by the data format or by the empty line). This simplifies the syntax of
> long commands, to support both the first line as the first series of
> arguments, or as the first line of the body.

Basically, it boils down to adding multi-line support to the CLI, right ?
The parser will have to be modified to wait until a keyword handler
says whether it has every argument it needs and is ready to process the
command (continue), or some are missing (fail).

>
> E.g. we could write (just using maps as a simple example) :
>
>add map foo.txt foo1 bar1
>
> or:
>
>add map foo.txt
>foo1 bar1
>foo2 bar2
>LF

[1]

>
> For OCSP, we could have :
>
>set ssl ocsp-response line1
>line2
>line3
>line4===
>
> or :
>set ssl ocsp-response line1 line2 line3 line4===
>
> or :
>set ssl ocsp-response
>line1
>line2
>line3
>line4===

Is there an advantage to allowing these two syntaxes versus always the
one in [1] ? Why not always require an empty line if multi-line is
used ? That would still require cooperation from the handlers but it
will stay simple: do I have enough data to proceed ?
In the case of ocsp-response, without an empty line, it would mean
trying to decode the whole buffer at each new data - or maybe just
what was added if base64dec() is modified to support partial data.
Waiting for an empty line would simplify the logic by only decoding
the buffer once - this is an example for ocsp-response but I think
processing arguments once is easier than processing them at each new
data.
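
To make the "decode once" point concrete, here is a minimal sketch in Python rather than haproxy's C. The class name and the decode_calls counter are made up for the illustration, and the real base64dec() is not used: the sketch buffers the base64 lines and only decodes when the empty line arrives.

```python
import base64
import binascii


class OcspBodyCollector:
    """Buffers base64 lines and decodes them once, on the empty line."""

    def __init__(self):
        self.chunks = []
        self.decode_calls = 0  # only here to show decoding happens once

    def feed_line(self, line):
        """Return None while the body is open, else the decoded bytes.

        Raises ValueError if the completed body is not valid base64.
        """
        if line != "":
            self.chunks.append(line.strip())
            return None
        self.decode_calls += 1
        try:
            return base64.b64decode("".join(self.chunks), validate=True)
        except binascii.Error as exc:
            raise ValueError("invalid base64 body") from exc
```

Without the terminating empty line, the alternative would be to attempt a decode inside feed_line on every call and treat a failure as "maybe incomplete", which is exactly the guessing this avoids.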

Requiring an empty line would also solve the case of commands with an
unknown number of arguments. I do not know if there are some yet but
they could then safely be added.

In the particular case of a PEM certificate, the handler would wait
for an empty line and it would know it can treat what is after the
known arguments as being the certificate without trying to guess it is
complete - in the case where there would be no empty line.

Thanks !

-- 
Aurélien Nephtali



Re: nss_getpwnam: name 't...@my.dom@localdomain' does not map into domain 'nix.my.dom'

2018-03-06 Thread Tom
Yeah I did put it in the wrong queue.  Sorry about that guys.

Sent from my iPhone

> On Mar 6, 2018, at 1:09 AM, Johan Hendriks  wrote:
> 
> Tom, this list is for the load balancer software haproxy, so I think you 
> mailed the wrong list. 
> 
> Regards 
> Johan Hendriks
> 
> On 6 Mar 2018 at 06:41, "TomK" wrote:
>> Hey guys,
>> 
>> I'm getting the message below, and as a result the proper UID / GID are not
>> listed on NFSv4 mounts from within an unprivileged account. All files show
>> up with owner and group nobody / nobody.
>> 
>> Has anyone seen this, and what could the solution be?
>> 
>> [root@client01 etc]# cat /etc/idmapd.conf|grep -v "#"| sed -e "/^$/d"
>> [General]
>> Verbosity = 7
>> Domain = nix.my.dom
>> [Mapping]
>> [Translation]
>> [Static]
>> [UMICH_SCHEMA]
>> LDAP_server = ldap-server.local.domain.edu
>> LDAP_base = dc=local,dc=domain,dc=edu
>> [root@client01 etc]#
>> 
>> Mount looks like this:
>> 
>> nfs-c01.nix.my.dom:/n/my.dom on /n/my.dom type nfs4 
>> (rw,relatime,vers=4.0,rsize=8192,wsize=8192,namlen=255,hard,proto=tcp,port=0,timeo=10,retrans=2,sec=sys,clientaddr=192.168.0.236,local_lock=none,addr=192.168.0.80)
>> 
>> /var/log/messages
>> 
>> Mar  6 00:17:27 client01 nfsidmap[14396]: key: 0x3f2c257b type: uid value: 
>> t...@my.dom@localdomain timeout 600
>> Mar  6 00:17:27 client01 nfsidmap[14396]: nfs4_name_to_uid: calling 
>> nsswitch->name_to_uid
>> Mar  6 00:17:27 client01 nfsidmap[14396]: nss_getpwnam: name 
>> 't...@my.dom@localdomain' domain 'nix.my.dom': resulting localname '(null)'
>> Mar  6 00:17:27 client01 nfsidmap[14396]: nss_getpwnam: name 
>> 't...@my.dom@localdomain' does not map into domain 'nix.my.dom'
>> Mar  6 00:17:27 client01 nfsidmap[14396]: nfs4_name_to_uid: 
>> nsswitch->name_to_uid returned -22
>> Mar  6 00:17:27 client01 nfsidmap[14396]: nfs4_name_to_uid: final return 
>> value is -22
>> Mar  6 00:17:27 client01 nfsidmap[14396]: nfs4_name_to_uid: calling 
>> nsswitch->name_to_uid
>> Mar  6 00:17:27 client01 nfsidmap[14396]: nss_getpwnam: name 
>> 'nob...@nix.my.dom' domain 'nix.my.dom': resulting localname 'nobody'
>> Mar  6 00:17:27 client01 nfsidmap[14396]: nfs4_name_to_uid: 
>> nsswitch->name_to_uid returned 0
>> Mar  6 00:17:27 client01 nfsidmap[14396]: nfs4_name_to_uid: final return 
>> value is 0
>> Mar  6 00:17:27 client01 nfsidmap[14398]: key: 0x324b0048 type: gid value: 
>> t...@my.dom@localdomain timeout 600
>> Mar  6 00:17:27 client01 nfsidmap[14398]: nfs4_name_to_gid: calling 
>> nsswitch->name_to_gid
>> Mar  6 00:17:27 client01 nfsidmap[14398]: nfs4_name_to_gid: 
>> nsswitch->name_to_gid returned -22
>> Mar  6 00:17:27 client01 nfsidmap[14398]: nfs4_name_to_gid: final return 
>> value is -22
>> Mar  6 00:17:27 client01 nfsidmap[14398]: nfs4_name_to_gid: calling 
>> nsswitch->name_to_gid
>> Mar  6 00:17:27 client01 nfsidmap[14398]: nfs4_name_to_gid: 
>> nsswitch->name_to_gid returned 0
>> Mar  6 00:17:27 client01 nfsidmap[14398]: nfs4_name_to_gid: final return 
>> value is 0
>> Mar  6 00:17:31 client01 systemd-logind: Removed session 23.
>> 
>> 
>> 
>> 
>> Result of:
>> 
>> systemctl restart rpcidmapd
>> 
>> /var/log/messages
>> ---
>> Mar  5 23:46:12 client01 systemd: Stopping Automounts filesystems on 
>> demand...
>> Mar  5 23:46:13 client01 systemd: Stopped Automounts filesystems on demand.
>> Mar  5 23:48:51 client01 systemd: Stopping NFSv4 ID-name mapping service...
>> Mar  5 23:48:51 client01 systemd: Starting Preprocess NFS configuration...
>> Mar  5 23:48:51 client01 systemd: Started Preprocess NFS configuration.
>> Mar  5 23:48:51 client01 systemd: Starting NFSv4 ID-name mapping service...
>> Mar  5 23:48:51 client01 rpc.idmapd[14117]: libnfsidmap: using domain: 
>> nix.my.dom
>> Mar  5 23:48:51 client01 rpc.idmapd[14117]: libnfsidmap: Realms list: 
>> 'NIX.MY.DOM'
>> Mar  5 23:48:51 client01 rpc.idmapd: rpc.idmapd: libnfsidmap: using domain: 
>> nix.my.dom
>> Mar  5 23:48:51 client01 rpc.idmapd: rpc.idmapd: libnfsidmap: Realms list: 
>> 'NIX.MY.DOM'
>> Mar  5 23:48:51 client01 rpc.idmapd: rpc.idmapd: libnfsidmap: loaded plugin 
>> /lib64/libnfsidmap/nsswitch.so for method nsswitch
>> Mar  5 23:48:51 client01 rpc.idmapd[14117]: libnfsidmap: loaded plugin 
>> /lib64/libnfsidmap/nsswitch.so for method nsswitch
>> Mar  5 23:48:51 client01 rpc.idmapd[14118]: Expiration time is 600 seconds.
>> Mar  5 23:48:51 client01 systemd: Started NFSv4 ID-name mapping service.
>> Mar  5 23:48:51 client01 rpc.idmapd[14118]: Opened 
>> /proc/net/rpc/nfs4.nametoid/channel
>> Mar  5 23:48:51 client01 rpc.idmapd[14118]: Opened 
>> /proc/net/rpc/nfs4.idtoname/channel
>> 
>> 
>> -- 
>> Cheers,
>> Tom K.
>> -
>> 
>> Living on earth is expensive, but it includes a free trip around the sun.
>> 
>> 


Re: haproxy 1.8.4-1 hangs on kernel 4.16.0-041600rc1

2018-03-06 Thread Adrian Veith
Thanks, I am trying 4.16-rc4 now and it looks good.

cheers Adrian.


On 06.03.2018 at 12:05, Lukas Tribus wrote:
> On 6 March 2018 at 11:38, Adrian Veith  wrote:
>> I had this hang in haproxy after trying out kernel 4.16.0-041600rc1
>> after starting haproxy for some minutes. Now I am back on kernel
>> 4.15.0-10-generic and everything seems ok so far.
> Yeah, this is a kernel bug, you need the fix:
> netfilter: drop outermost socket lock in getsockopt()
>
> Which should be in 4.16-rc3 (we are already at 4.16-rc4).
>
> If you use kernel.org stable or RC kernel, always upgrade to the
> latest kernel first.
>




Re: haproxy 1.8.4-1 hangs on kernel 4.16.0-041600rc1

2018-03-06 Thread Lukas Tribus
Hello,


On 6 March 2018 at 11:38, Adrian Veith  wrote:
> I had this hang in haproxy after trying out kernel 4.16.0-041600rc1
> after starting haproxy for some minutes. Now I am back on kernel
> 4.15.0-10-generic and everything seems ok so far.

Yeah, this is a kernel bug, you need the fix:
netfilter: drop outermost socket lock in getsockopt()

Which should be in 4.16-rc3 (we are already at 4.16-rc4).

If you use kernel.org stable or RC kernel, always upgrade to the
latest kernel first.


cheers,
lukas



Re: Dynamically adding/deleting SSL certificates

2018-03-06 Thread Aurélien Nephtali
Hello Willy,

On Mon, Mar 5, 2018 at 7:25 PM, Willy Tarreau  wrote:
> I tend to think (first idea out of my head) that for such file types,
> we could very well consider that the command reads multiple lines and
> stops at the first empty line. That's very convenient to use in scripts
> and even by hand in copy-paste sessions. It would work with almost all
> of the data types we have to feed via the CLI, including the maps/acls.

It looks like a clean way to do it.

> And a script writing there would just have to run grep -v "^$" to be
> safe, which is pretty easy.
>
> In fact that's already the format used for the output : the output of
> each command is defined as running till the first empty line.
>
> I also thought about escaping end of lines with a backslash but that
> becomes very painful to place in scripts.

Yes, I also thought about that but discarded the idea since I wanted
something that could easily be used on the command line without major
data preprocessing.
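
For scripts, the same precaution as the grep -v "^$" above can be sketched in Python. This is only an illustration of the proposed format under discussion, not an existing haproxy interface, and build_cli_payload is a made-up helper name; body lines are assumed not to contain newline characters themselves.

```python
def build_cli_payload(command, body_lines):
    """Return the text to send: command line, body lines, then an empty line."""
    # Same effect as `grep -v "^$"`: an empty line inside the body
    # would be taken as the end of the body, so drop such lines.
    kept = [line for line in body_lines if line != ""]
    # The trailing "" yields the empty line that terminates the body.
    return "\n".join([command, *kept, ""]) + "\n"


if __name__ == "__main__":
    print(repr(build_cli_payload("add map foo.txt", ["foo1 bar1", "", "foo2 bar2"])))
```

The only preprocessing needed is dropping empty lines, which keeps the format easy to produce by hand as well.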

> Just my two cents, I'm also interested in people's ideas regarding this.

Thanks for the comments, I will think about it and continue to monitor
other ideas!

-- 
Aurélien Nephtali



haproxy 1.8.4-1 hangs on kernel 4.16.0-041600rc1

2018-03-06 Thread Adrian Veith
I got this hang in haproxy while trying out kernel 4.16.0-041600rc1, a few
minutes after starting haproxy. Now I am back on kernel
4.15.0-10-generic and everything seems OK so far.

Adrian Veith

[ 9063.536247] INFO: task haproxy:1234 blocked for more than 120 seconds.
[ 9063.536334]   Tainted: G C   4.16.0-041600rc1-generic
#201802120030
[ 9063.536456] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 9063.536582] haproxy D    0  1234   1232 0x
[ 9063.536586] Call Trace:
[ 9063.536595]  __schedule+0x297/0x870
[ 9063.536598]  schedule+0x2c/0x80
[ 9063.536601]  __lock_sock+0x7d/0xc0
[ 9063.536605]  ? wait_woken+0x80/0x80
[ 9063.536607]  lock_sock_nested+0x64/0x70
[ 9063.536613]  getorigdst+0x59/0x230 [nf_conntrack_ipv4]
[ 9063.536617]  nf_getsockopt+0x4f/0x80
[ 9063.536619]  ip_getsockopt+0x81/0xc0
[ 9063.536622]  tcp_getsockopt+0x28/0x40
[ 9063.536624]  sock_common_getsockopt+0x1a/0x20
[ 9063.536625]  SyS_getsockopt+0x7d/0xe0
[ 9063.536629]  do_syscall_64+0x76/0x130
[ 9063.536631]  entry_SYSCALL_64_after_hwframe+0x21/0x86
[ 9063.536633] RIP: 0033:0x7fbca66f7a0a
[ 9063.536635] RSP: 002b:7ffccfe718b8 EFLAGS: 0246 ORIG_RAX:
0037
[ 9063.536637] RAX: ffda RBX:  RCX:
7fbca66f7a0a
[ 9063.536638] RDX: 0050 RSI:  RDI:
0001
[ 9063.536639] RBP: 55c721ace048 R08: 7ffccfe718cc R09:
0001
[ 9063.536640] R10: 55c721ace048 R11: 0246 R12:

[ 9063.536641] R13: 0001 R14: 7ffccfe718cc R15:
00908002
[ 9184.370248] INFO: task haproxy:1234 blocked for more than 120 seconds.
[ 9184.370336]   Tainted: G C   4.16.0-041600rc1-generic
#201802120030
[ 9184.370457] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 9184.370583] haproxy D    0  1234   1232 0x
[ 9184.370586] Call Trace:
[ 9184.370595]  __schedule+0x297/0x870
[ 9184.370598]  schedule+0x2c/0x80
[ 9184.370601]  __lock_sock+0x7d/0xc0
[ 9184.370605]  ? wait_woken+0x80/0x80
[ 9184.370607]  lock_sock_nested+0x64/0x70
[ 9184.370614]  getorigdst+0x59/0x230 [nf_conntrack_ipv4]
[ 9184.370617]  nf_getsockopt+0x4f/0x80
[ 9184.370620]  ip_getsockopt+0x81/0xc0
[ 9184.370622]  tcp_getsockopt+0x28/0x40
[ 9184.370624]  sock_common_getsockopt+0x1a/0x20
[ 9184.370626]  SyS_getsockopt+0x7d/0xe0
[ 9184.370629]  do_syscall_64+0x76/0x130
[ 9184.370632]  entry_SYSCALL_64_after_hwframe+0x21/0x86
[ 9184.370634] RIP: 0033:0x7fbca66f7a0a
[ 9184.370635] RSP: 002b:7ffccfe718b8 EFLAGS: 0246 ORIG_RAX:
0037
[ 9184.370637] RAX: ffda RBX:  RCX:
7fbca66f7a0a
[ 9184.370638] RDX: 0050 RSI:  RDI:
0001
[ 9184.370639] RBP: 55c721ace048 R08: 7ffccfe718cc R09:
0001
[ 9184.370640] R10: 55c721ace048 R11: 0246 R12:

[ 9184.370641] R13: 0001 R14: 7ffccfe718cc R15:
00908002
[ 9305.204082] INFO: task haproxy:1234 blocked for more than 120 seconds.
[ 9305.204170]   Tainted: G C   4.16.0-041600rc1-generic
#201802120030
[ 9305.204291] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 9305.204417] haproxy D    0  1234   1232 0x
[ 9305.204421] Call Trace:
[ 9305.204430]  __schedule+0x297/0x870
[ 9305.204433]  schedule+0x2c/0x80
[ 9305.204437]  __lock_sock+0x7d/0xc0
[ 9305.204441]  ? wait_woken+0x80/0x80
[ 9305.204443]  lock_sock_nested+0x64/0x70
[ 9305.204449]  getorigdst+0x59/0x230 [nf_conntrack_ipv4]
[ 9305.204452]  nf_getsockopt+0x4f/0x80
[ 9305.204455]  ip_getsockopt+0x81/0xc0
[ 9305.204457]  tcp_getsockopt+0x28/0x40
[ 9305.204459]  sock_common_getsockopt+0x1a/0x20
[ 9305.204461]  SyS_getsockopt+0x7d/0xe0
[ 9305.204464]  do_syscall_64+0x76/0x130
[ 9305.204467]  entry_SYSCALL_64_after_hwframe+0x21/0x86
[ 9305.204469] RIP: 0033:0x7fbca66f7a0a
[ 9305.204470] RSP: 002b:7ffccfe718b8 EFLAGS: 0246 ORIG_RAX:
0037
[ 9305.204473] RAX: ffda RBX:  RCX:
7fbca66f7a0a
[ 9305.204474] RDX: 0050 RSI:  RDI:
0001
[ 9305.204475] RBP: 55c721ace048 R08: 7ffccfe718cc R09:
0001
[ 9305.204475] R10: 55c721ace048 R11: 0246 R12:

[ 9305.204476] R13: 0001 R14: 7ffccfe718cc R15:
00908002
[ 9426.037943] INFO: task haproxy:1234 blocked for more than 120 seconds.
[ 9426.038031]   Tainted: G C   4.16.0-041600rc1-generic
#201802120030
[ 9426.038151] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 9426.038277] haproxy D    0  1234   1232 0x
[ 9426.038281] Call Trace:
[ 9426.038291]  __schedule+0x297/0x870
[ 9426.038294]  schedule+0x2c/0x80
[ 9426.038298]  __lock_sock+0x7d/0xc0
[ 9426.038301]  ? wait_woken+0x80/0x80
[ 9426.038303]  lo

Re: What is a nice way to bypass the maintenance mode for certain IP's?

2018-03-06 Thread Pieter Vogelaar
Does use-server also accept some keyword to address the first server in the 
backend instead of a specific valid server name of the backend?

That would save quite a bit of logic complexity in Puppet.
 
Best regards,
Pieter Vogelaar
 

On 02-03-18 at 15:41, Willy Tarreau wrote:

On Fri, Mar 02, 2018 at 01:51:42PM +, Pieter Vogelaar wrote:
> When I move force-persist to the backend, it indeed works.

Great, thanks for the feedback.

> From some other post I understand it's only possible to bypass the
> maintenance mode where stickiness is used?

Yes, or you can use the "use-server" directive which respects force-persist
as well. What I've seen a number of people use to test deployments was a
static page at a hidden URL where you had a list of all configured servers
and a radio button (or a link), allowing you to decide what server to
connect to. Then clicking on the server would set the cookie so that you can
directly try to connect there and see if it works or not. If it doesn't, you
just have to go back to the server selection page and try another one. I
remember having implemented one such page with extra stuff like
setting/clearing the cookie, setting/clearing a "force-persist" cookie that
was used to decide whether to go there forcing the access or as a regular
user. Thus using stickiness it's very convenient. But if it doesn't suit
your needs well, use-server will let you do exactly what you want.

Willy