Re: Segfault on haproxy 1.7.10 with state file and slowstart

2018-08-13 Thread Raghu Udiyar
Hi Willy, Baptiste

On Fri, 29 Jun 2018 at 08:12 Willy Tarreau  wrote:

> Hi Raghu,
> [...]
>
>[ALERT] 248/130258 (21960) : parsing [/etc/haproxy/test.cfg:53] :
> 'server bla' : no method found to resolve address '(null)'
>[ALERT] 248/130258 (21960) : Failed to initialize server(s) addr.
>
> According to Nenad :
>   "It's not a good way to fix the issue we were experiencing
>before. It will need a bigger rewrite, because the logic in
>srv_iterate_initaddr needs to be changed."
>
> Could you please try the example above in the commit message with and
> without
> your patch ? Maybe the problem is still there, or maybe the logic change
> mentioned by Nenad has already been done and this is no longer a problem,
> in which case we could take your patch this time.
>

Yes this is a problem - I missed to test this case. I have a proposal.

The init-addr option only makes sense when fqdn is used. But last method is
useful when using IP addresses. This patch enables
the last method to work by default when IP addresses are used. Can you
please review?

Thanks
Raghu


0001-MEDIUM-init-allow-init-addr-to-work-with-server-ip-a.patch
Description: Binary data


Re: Segfault on haproxy 1.7.10 with state file and slowstart

2018-06-28 Thread Raghu Udiyar
Hi Willy, Baptiste

I have been running this in production without any issues. Can you please
review, verified that this will apply on current devel as well.

On Fri, 6 Apr 2018 at 22:36 Willy Tarreau  wrote:

> Hi Raghu,
>
> On Wed, Mar 28, 2018 at 08:30:39PM +0000, Raghu Udiyar wrote:
> > Hi Willy, Baptiste
> >
> > It looks like this was the intended behaviour as per the documentation.
> But
> > it looks to be simple to enable the init-addr behaviour for IP addresses
> as
> > well. Can you please review the attached patch, this is against devel.
>
> Baptiste, I missed this one. Could you please take a look ?
>
> Thanks,
> Willy
>


Re: Segfault on haproxy 1.7.10 with state file and slowstart

2018-03-28 Thread Raghu Udiyar
Hi Willy, Baptiste

It looks like this was the intended behaviour as per the documentation. But
it looks to be simple to enable the init-addr behaviour for IP addresses as
well. Can you please review the attached patch, this is against devel.


0001-MEDIUM-init-allow-init-addr-to-work-with-server-ip-a.patch
Description: Binary data


Re: Segfault on haproxy 1.7.10 with state file and slowstart

2018-02-24 Thread Raghu Udiyar
Hi Baptiste

Were you able to look into this? A workaround is to use a dummy hostname,
and init-addr like so :

   server testserver13 dummyhost:9003 check init-addr last,10.0.19.10
   server testserver14 dummyhost:9003 check init-addr last,10.0.19.12

   # disabled placeholder
   server testserver15 dummyhost:9003 disabled check init-addr
last,169.254.0.9


This works as expected, but seems like a hack to me. And the blog post
should be updated to indicate this behaviour.


-- Raghu

On Mon, Jan 15, 2018 at 1:10 AM, Willy Tarreau  wrote:

> Hi Raghu,
>
> On Sun, Jan 14, 2018 at 05:59:16PM +0530, Raghu Udiyar wrote:
> > Hi Willy
> >
> > Yes, this patch works, no more segfaults.
> >
> > However, unrelated to this issue, while testing I noticed that the
> address
> > from the state file is not applied if the configuration does not use a
> > hostname.
> >
> > In the example I provided, the configuration originally has the IP
> address
> > set to 169.254.0.9, which is then updated through the socket to say
> 10.0.19.17.
>
> I noticed it as well but I thought I was the one doing something wrong.
> CCing Baptiste in case he remembers certain cases that have to be
> addressed a specific way by the state file. I remember this stuff was
> tested quite a bit so I suspect that it could depend on certain
> conditions.
>
> > Expectation was that upon haproxy restart the address in the state fill
> > will override the address in the configuration, but this does not happen.
> >
> > I see the relevant code in srv_init_addr :
> >
> >   if (srv->hostname)
> > return_code |= srv_iterate_initaddr(srv);
> >
> > As per the server-template example in the dynamic scaling blog post, this
> > should have been supported. Am I missing some configuration?
>
> I can't really tell, let's wait for Baptiste's opinion on this.
>
> Thanks,
> Willy
>


Re: Segfault on haproxy 1.7.10 with state file and slowstart

2018-01-14 Thread Raghu Udiyar
Hi Willy

Yes, this patch works, no more segfaults.

However, unrelated to this issue, while testing I noticed that the address
from the state file is not applied if the configuration does not use a
hostname.

In the example I provided, the configuration originally has the IP address
set to 169.254.0.9, which is then updated through the socket to say 10.0.19.17.
Expectation was that upon haproxy restart the address in the state fill
will override the address in the configuration, but this does not happen.

I see the relevant code in srv_init_addr :

  if (srv->hostname)
return_code |= srv_iterate_initaddr(srv);

As per the server-template example in the dynamic scaling blog post, this
should have been supported. Am I missing some configuration?

Thanks!
-- Raghu


-- Raghu

On Fri, Jan 12, 2018 at 4:14 PM, Willy Tarreau  wrote:

> Hello Raghu,
>
> On Thu, Jan 11, 2018 at 02:20:34PM +0530, Raghu Udiyar wrote:
> > Hello,
> >
> > Haproxy 1.7.10 segfaults when the srv_admin_state is set to
> > SRV_ADMF_CMAINT (0x04)
> > for a backend server, and that backend has the `slowstart` option set.
> >
> > The following configuration reproduces it :
> (...)
>
> Thanks for all the details, they made it easy to reproduce.
>
> From what I'm seeing, it's a fundamental design issue in the state
> file handling in 1.7. It starts checks before they have been
> initialized, and try to wake up a NULL task. In 1.8 due to the more
> dynamic config, the initialization sequence has changed and checks
> are initialized before parsing the state file, but I don't feel at
> ease with doing in 1.7 since I don't know if some config elements
> may remain non-updated.
>
> So instead I've just protected against using the task wakeups during
> the state file parsing, and they will be initialized later with the
> appropriate parameters.
>
> Could you please check the attached patch on top of 1.7.10 ?
>
> Thanks,
> Willy
>


Segfault on haproxy 1.7.10 with state file and slowstart

2018-01-11 Thread Raghu Udiyar
Hello,

Haproxy 1.7.10 segfaults when the srv_admin_state is set to
SRV_ADMF_CMAINT (0x04)
for a backend server, and that backend has the `slowstart` option set.

The following configuration reproduces it :

-
# haproxy.cfg (replace  below)

global
maxconn 3
user haproxy
group haproxy
server-state-file //servers.state

log-tag haproxy
nbproc 1
cpu-map 1 2
stats socket /run/haproxy.sock level admin
stats socket /run/haproxy_op.sock mode 666 level operator

defaults
mode http
option forwardfor

option dontlognull
option httplog
log 127.0.0.1 local1 debug

timeout connect 5s
timeout client 50s
timeout server 50s
timeout http-request 8s

load-server-state-from-file global
listen admin
bind *:9002
stats enable
stats auth haproxyadmin:xxx

frontend testserver
bind *:9000
option tcp-smart-accept
option splice-request
option splice-response
default_backend testservers

backend testservers
balance roundrobin
option tcp-smart-connect
option splice-request
option splice-response
timeout server 2s
timeout queue 2s
default-server maxconn 10 *slowstart 10s* weight 1
server testserver15 10.0.19.10:9003check
server testserver16 10.0.19.12:9003check

server testserver17 169.254.0.9:9003 disabledcheck
server testserver20 169.254.0.9:9003 disabledcheck


# servers.state file

1
# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state
srv_uweight srv_iweight srv_time_since_last_change srv_check_status
srv_check_result srv_check_health srv_check_state srv_agent_state
bk_f_forced_id srv_f_forced_id
4 testservers 1 testserver15 10.0.19.10 2 0 1 1 924 6 3 4 6 0 0 0
4 testservers 2 testserver16 10.0.19.12 2 0 1 1 924 6 3 4 6 0 0 0
4 testservers 3 testserver17 169.254.0.9 0 5 1 1 924 1 0 0 14 0 0 0
4 testservers 4 testserver20 10.0.19.17 0 *4* 1 1 454 6 3 4 6 0 0 0



The state *4* above for testserver20 causes the segfault, and only occurs
when slowstart is set.

The configuration check can reproduce it ie: haproxy -c -f haproxy.cfg

The backtrace :

(gdb) bt
#0  task_schedule (when=-508447097, task=0x0) at include/proto/task.h:244
#1  srv_clr_admin_flag (mode=SRV_ADMF_FMAINT, s=0x1fb0fd0) at
src/server.c:626
#2  srv_adm_set_ready (s=0x1fb0fd0) at include/proto/server.h:231
#3  srv_update_state (params=0x7ffe4f15e7d0, version=1, srv=0x1fb0fd0) at
src/server.c:2289
#4  apply_server_state () at src/server.c:2664
#5  0x0044b60f in init (argc=, argc@entry=4,
argv=,
argv@entry=0x7ffe4f160d38) at src/haproxy.c:975
#6  0x004491be in main (argc=4, argv=0x7ffe4f160d38) at
src/haproxy.c:1795


The way we use the state file is to have servers with `disabled` option in
the configuration; and during scaling update the backend address and mark
as active using the socket. The 169.254.0.9 address is a dummy address for
the disabled servers.

Can someone take a look? I couldn't find any related bugs fixed in 1.8.

Thanks
-- Raghu


Re: Herald - Generic and extensible agent-check service for Haproxy

2016-11-19 Thread Raghu Udiyar
On Thu, Nov 17, 2016 at 8:03 PM, Willy Tarreau  wrote:

> > Yes actually. When we started, herald was multi service, which is when we
> > realised its not possible to distinguish which backend agent request was
> > for. An "agent-send" feature that sends something like
> > "/\n" would be useful. Herald would require
> > minor modifications to support this.
>
> Right now haproxy already supports "agent-send" but you have to pass the
> string yourself in the configuration. So you can already update Herald
> to make use of this when configs support it. And later we may add a new
> directive so that users don't even have to configure the string anymore.
>

Ah, yes I see it in 1.7; I'll add this to Herald.


>
> > > Do you want me to add a link to the main haproxy page, and if so to
> > > what location ?
> > >
> >
> > Yes, I think the external link sections here is appropriate :
> > http://www.haproxy.org/#link
>
> Sure, that's what I intented, but I meant what should be the target of
> the link ? The github page, your article, anything else ?


Please link to the github page : https://github.com/helpshift/herald

Thanks,
Raghu


Re: Herald - Generic and extensible agent-check service for Haproxy

2016-11-13 Thread Raghu Udiyar
On Thu, Nov 10, 2016 at 4:16 PM, Willy Tarreau  wrote:

> Hello Raghu,
> > Blog post here :
> > https://engineering.helpshift.com/2016/herald-haproxy-
> loadfeedback-agent/
> >
> > Hope this is useful for the haproxy community.
>
> This is excellent, thank you for contributing this!
>

Thanks Willy!


> I'm having a few questions : did you miss anything from haproxy when
> doing this ? For example, did you have to workaround the impossibility
> to do something via the agent just because of the agent protocol or
> because of some validity checks performed by haproxy ? I'm asking
> because if we have to perform very minor tweaks, better do them before
> the release. Do you think you could benefit from "agent-send" so that
> the same agent is used for multiple servers, when haproxy would then
> indicate with the connection for what server it is connecting ? If so,
> would an automatic string such as "agent-send-name" sending the backend
> and the server name be an useful improvement for this ?
>

Yes actually. When we started, herald was multi service, which is when we
realised its not possible to distinguish which backend agent request was
for. An "agent-send" feature that sends something like
"/\n" would be useful. Herald would require
minor modifications to support this.


> Do you want me to add a link to the main haproxy page, and if so to
> what location ?
>

Yes, I think the external link sections here is appropriate :
http://www.haproxy.org/#link

Thanks,
Raghu


Herald - Generic and extensible agent-check service for Haproxy

2016-11-10 Thread Raghu Udiyar
Hello

We have developed a generic service (using python gevent) to serve as an
agent check service for Haproxy. Its extensible via plugins for application
feedback, and supports result cacheing, json expressions, arithmetic, regex
match, and fallback in case of failure.

The source is here : https://github.com/helpshift/herald

Blog post here :
https://engineering.helpshift.com/2016/herald-haproxy-loadfeedback-agent/

Hope this is useful for the haproxy community.

Thanks,
-- Raghu


[PATCH] BUG/MINOR: stats: fix missing comma in stats on agent drain

2016-02-05 Thread Raghu Udiyar
The csv stats format breaks when agent changes server state to drain.
Tools like hatop, metric or check agents will fail due to this. This
should be backported to 1.6.
---
 src/dumpstats.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/dumpstats.c b/src/dumpstats.c
index f43a5cc..b868ef5 100644
--- a/src/dumpstats.c
+++ b/src/dumpstats.c
@@ -3584,7 +3584,7 @@ static int stats_dump_sv_stats(struct stream_interface 
*si, struct proxy *px, in
[SRV_STATS_STATE_NOLB]  = "NOLB,",
[SRV_STATS_STATE_DRAIN_GOING_DOWN]  = "DRAIN 
%d/%d,",
[SRV_STATS_STATE_DRAIN] = "DRAIN,",
-   [SRV_STATS_STATE_DRAIN_AGENT]   = "DRAIN 
(agent)",
+   [SRV_STATS_STATE_DRAIN_AGENT]   = "DRAIN 
(agent),",
[SRV_STATS_STATE_NO_CHECK]  = "no check,"
};
 
-- 
2.7.0