Re: MySQL + Haproxy Question

2009-10-24 Thread Krzysztof Oledzki



On Sat, 24 Oct 2009, Joseph Hardeman wrote:


Hey Guys,

Hi,

I was wondering if there was a way to have HAProxy handle MySQL requests.  I 
know that I can use the TCP option instead of HTTP and it will work, but I 
was wondering if anyone has a way to make haproxy send all requests for 
SELECT statements to a set of servers and all INSERTs, UPDATEs, and DELETEs 
to a master MySQL server.


I was just thinking about it and was wondering if this was possible and if 
anyone has done it.  If you have, would you be willing to share your setup?


Currently, there is no MySQL support in HAProxy. However, you should try
MySQL_Proxy:
 http://forge.mysql.com/wiki/MySQL_Proxy
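
For the TCP-mode balancing that does already work, a minimal setup might look like the sketch below (names and addresses are invented for illustration; since haproxy cannot inspect the SQL, it can only spread connections across read replicas, and the application still has to point writes at the master itself):

```
listen mysql-read :3306
	mode tcp
	balance roundrobin
	server db1 192.168.0.101:3306 check
	server db2 192.168.0.102:3306 check
```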

Best regards,

Krzysztof Olędzki

[PATCH] [RFC] Decrease server health based on http responses / events

2009-10-24 Thread Krzysztof Piotr Oledzki
Subject: [RFC] Decrease server health based on http responses / events

This RFC quality patch implements decreasing server health based on
observing communication between HAProxy and servers.

I have had a working patch for this for a long time, but I needed to rewrite
nearly everything to remove hardcoded values, add more modes and port it to
1.4, so after the rework there is nearly nothing left of the old code. :| In
its current state the code is expected to work, but it definitely needs more
testing.

BTW: I'm not very happy with the names of both the functions and the
parameters, so if you have a better idea please don't hesitate to propose it. ;)

TODO: documentation, comments, pure TCP support.
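
For context, the intended usage would be something along these lines (the keyword names are provisional, per the note above, and purely illustrative: one server setting selects what to observe, one the action taken on error, and one the consecutive-error limit, roughly matching the HANA_OBS_*, HANA_ONERR_* and DEF_CELIMIT values in the patch below):

```
backend app
	option httpchk GET /check.txt HTTP/1.0
	server srv1 10.0.0.1:80 check observe layer7 on-error fastinter error-limit 10
```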

diff --git a/include/common/defaults.h b/include/common/defaults.h
index b0aee86..ae2f65c 100644
--- a/include/common/defaults.h
+++ b/include/common/defaults.h
@@ -120,6 +120,9 @@
 #define DEF_CHECK_REQ   "OPTIONS / HTTP/1.0\r\n\r\n"
 #define DEF_SMTP_CHECK_REQ   "HELO localhost\r\n"
 
+#define DEF_HANA_ONERR HANA_ONERR_FAILCHK
+#define DEF_CELIMIT 10
+
 // X-Forwarded-For header default
 #define DEF_XFORWARDFOR_HDR"X-Forwarded-For"
 
diff --git a/include/proto/checks.h b/include/proto/checks.h
index bd70164..2d16976 100644
--- a/include/proto/checks.h
+++ b/include/proto/checks.h
@@ -29,6 +29,7 @@ const char *get_check_status_description(short check_status);
 const char *get_check_status_info(short check_status);
 struct task *process_chk(struct task *t);
 int start_checks();
+void health_analyze(struct server *s, short status);
 
 #endif /* _PROTO_CHECKS_H */
 
diff --git a/include/types/checks.h b/include/types/checks.h
index 1b04608..3690aa5 100644
--- a/include/types/checks.h
+++ b/include/types/checks.h
@@ -18,6 +18,9 @@ enum {
 
/* Below we have finished checks */
HCHK_STATUS_CHECKED,/* DUMMY STATUS */
+
+   HCHK_STATUS_HANA,   /* Detected enough consecutive errors */
+
HCHK_STATUS_SOCKERR,/* Socket error */
 
 HCHK_STATUS_L4OK,   /* L4 check passed, for example tcp connect */
@@ -41,6 +44,39 @@ enum {
HCHK_STATUS_SIZE
 };
 
+enum {
+   HANA_UNKNOWN= 0,
+
+   HANA_TCP_OK,
+
+   HANA_HTTP_OK,
+   HANA_HTTP_STS,
+   HANA_HTTP_HDRRSP,
+   HANA_HTTP_RSP,
+
+   HANA_READ_ERROR,
+   HANA_READ_TIMEOUT,
+   HANA_BROKEN_PIPE,
+
+   HANA_SIZE
+};
+
+enum {
+   HANA_ONERR_UNKNOWN  = 0,
+
+   HANA_ONERR_FASTINTER,
+   HANA_ONERR_FAILCHK,
+   HANA_ONERR_SUDDTH,
+   HANA_ONERR_MARKDWN,
+};
+
+enum {
+   HANA_OBS_NONE   = 0,
+
+   HANA_OBS_EVENTS,
+   HANA_OBS_HTTP_RSPS,
+};
+
 struct check_status {
short result;   /* one of SRV_CHK_* */
char *info; /* human readable short info */
diff --git a/include/types/server.h b/include/types/server.h
index b3fe83d..b163190 100644
--- a/include/types/server.h
+++ b/include/types/server.h
@@ -115,7 +115,10 @@ struct server {
 struct sockaddr_in check_addr;  /* the address to check, if different from  */
 short check_port;   /* the port to use for the health checks */
 int health; /* 0->rise-1 = bad; rise->rise+fall-1 = good */
+   int consecutive_errors; /* */
 int rise, fall; /* time in iterations */
+   int consecutive_errors_limit;   /* */
+   short observe, onerror; /* */
 int inter, fastinter, downinter;/* checks: time in milliseconds */
 int slowstart;  /* slowstart time in seconds (ms in the conf) */
 int result; /* health-check result : SRV_CHK_* */
@@ -137,7 +140,7 @@ struct server {
 unsigned down_time; /* total time the server was down */
 time_t last_change; /* last time, when the state was changed */
 struct timeval check_start; /* last health check start time */
-   unsigned long check_duration;   /* time in ms took to finish last health check */
+   long check_duration;/* time in ms took to finish last health check */
 short check_status, check_code; /* check result, check code */
 char check_desc[HCHK_DESC_LEN]; /* health check description */
 
diff --git a/src/cfgparse.c b/src/cfgparse.c
index 428d7b9..889fc74 100644
--- a/src/cfgparse.c
+++ b/src/cfgparse.c
@@ -2541,6 +2541,8 @@ int cfg_parse_listen(const char *file, int linenum, char **args, int kwm)
newsrv->uweight = newsrv->iweight = 1;
newsrv->maxqueue = 0;
newsrv->slowstart = 0;
+   newsrv->onerror = DEF_HANA_ONERR;
+   newsrv->consecutive_errors_limit = DEF_CELIMIT;
 
cur_arg = 3;
while (

Re: MySQL + Haproxy Question

2009-10-24 Thread Joseph Hardeman

Hi Mariusz

That's actually what I thought, but I wanted to ask to be sure. *S*  I am 
going to look into that solution again; the last time I tried it, many 
months ago now, I couldn't get it to work right, and I would have had to 
replace all of the libmysql* .so files on my web servers. 


Thanks for the reply.

Joe

XANi wrote:

Hi
On Sat, 24 Oct 2009 16:01:26 -0400, Joseph Hardeman
 wrote:
  

Hey Guys,

I was wondering if there was a way to have HAProxy handle MySQL 
requests.  I know that I can use the TCP option instead of HTTP and
it will work, but I was wondering if anyone has a way to make haproxy
send all requests for SELECT statements to a set of servers and all
INSERTs, UPDATEs, and DELETEs to a master MySQL server.

I was just thinking about it and was wondering if this was possible
and if anyone has done it.  If you have, would you be willing to share
your setup?

You can't do that; you either have to use something like 
http://forge.mysql.com/wiki/MySQL_Proxy_RW_Splitting

or (better) rewrite your app to split write and read requests.

Regards,
Mariusz
  


--
This message has been scanned for viruses by Colocube's AV Scanner



Re: MySQL + Haproxy Question

2009-10-24 Thread XANi
Hi
On Sat, 24 Oct 2009 16:01:26 -0400, Joseph Hardeman
 wrote:
> Hey Guys,
> 
> I was wondering if there was a way to have HAProxy handle MySQL 
> requests.  I know that I can use the TCP option instead of HTTP and
> it will work, but I was wondering if anyone has a way to make haproxy
> send all requests for SELECT statements to a set of servers and all
> INSERTs, UPDATEs, and DELETEs to a master MySQL server.
> 
> I was just thinking about it and was wondering if this was possible
> and if anyone has done it.  If you have, would you be willing to share
> your setup?
You can't do that; you either have to use something like 
http://forge.mysql.com/wiki/MySQL_Proxy_RW_Splitting
or (better) rewrite your app to split write and read requests.

Regards
Mariusz
-- 
Mariusz Gronczewski (XANi) 
GnuPG: 0xEA8ACE64
http://devrandom.pl





MySQL + Haproxy Question

2009-10-24 Thread Joseph Hardeman

Hey Guys,

I was wondering if there was a way to have HAProxy handle MySQL 
requests.  I know that I can use the TCP option instead of HTTP and it 
will work, but I was wondering if anyone has a way to make haproxy send 
all requests for SELECT statements to a set of servers and all INSERTs, 
UPDATEs, and DELETEs to a master MySQL server.


I was just thinking about it and was wondering if this was possible and 
if anyone has done it.  If you have, would you be willing to share your 
setup?


Thanks

Joe





[PATCH 2/2] [MINOR] Collect & provide http response codes for frontends, fix backends

2009-10-24 Thread Krzysztof Piotr Oledzki
From d529a265e60ea3366452329e2f462a0bd02d8e58 Mon Sep 17 00:00:00 2001
From: Krzysztof Piotr Oledzki 
Date: Sat, 24 Oct 2009 15:36:15 +0200
Subject: [MINOR] Collect & provide http response codes for frontends, fix backends

This patch extends and corrects the functionality introduced by
"Collect & provide http response codes received from servers":
 - responses are now also accounted for frontends
 - backend's and frontend's counters are incremented based
   on responses sent to client, not received from servers
---
 src/dumpstats.c  |   47 ++-
 src/proto_http.c |1 -
 src/session.c|   15 +++
 3 files changed, 53 insertions(+), 10 deletions(-)

diff --git a/src/dumpstats.c b/src/dumpstats.c
index 1f0ae90..866f499 100644
--- a/src/dumpstats.c
+++ b/src/dumpstats.c
@@ -1278,16 +1278,33 @@ int stats_dump_proxy(struct session *s, struct proxy *px, struct uri_auth *uri)
 "Frontend"
 /* sessions rate : current, max, limit */
 "%s%s%s"
-/* sessions : current, max, limit, total, 
lbtot */
+/* sessions: current, max, limit */
 "%s%s%s"
-"%s"
-/* bytes : in, out */
-"%s%s"
+"id, px->id,
 U2H0(read_freq_ctr(&px->fe_sess_per_sec)),
 U2H1(px->counters.fe_sps_max), 
LIM2A2(px->fe_sps_lim, "-"),
-U2H3(px->feconn), 
U2H4(px->counters.feconn_max), U2H5(px->maxconn),
+U2H3(px->feconn), 
U2H4(px->counters.feconn_max), U2H5(px->maxconn));
+
+   /* http response (via td title): 1xx, 2xx, 3xx, 4xx, 5xx, other */
+   if (px->mode == PR_MODE_HTTP) {
+   int i;
+
+   chunk_printf(&msg, " title=\"rsp 
codes:");
+
+   for (i = 1; i < 6; i++)
+   chunk_printf(&msg, " %dxx=%lld,", i, px->counters.p.http.rsp[i]);
+
+   chunk_printf(&msg, " other=%lld\"", px->counters.p.http.rsp[0]);
+   }
+
+   chunk_printf(&msg,
+/* sessions: total, lbtot */
+">%s"
+/* bytes : in, out */
+"%s%s"
+"",
 U2H6(px->counters.cum_feconn), U2H7(px->counters.bytes_in), U2H8(px->counters.bytes_out));
 
chunk_printf(&msg,
@@ -1329,10 +1346,7 @@ int stats_dump_proxy(struct session *s, struct proxy *px, struct uri_auth *uri)
 /* rate, rate_lim, rate_max */
 "%u,%u,%u,"
 /* check_status, check_code, check_duration */
-",,,"
-/* http response: 1xx, 2xx, 3xx, 4xx, 5xx, 
other */
-",,"
-"\n",
+",,,",
 px->id,
 px->feconn, px->counters.feconn_max, px->maxconn, px->counters.cum_feconn,
 px->counters.bytes_in, px->counters.bytes_out,
@@ -1343,6 +1357,21 @@ int stats_dump_proxy(struct session *s, struct proxy *px, struct uri_auth *uri)
 relative_pid, px->uuid, STATS_TYPE_FE,
 read_freq_ctr(&px->fe_sess_per_sec),
 px->fe_sps_lim, px->counters.fe_sps_max);
+
+   /* http response: 1xx, 2xx, 3xx, 4xx, 5xx, other */
+   if (px->mode == PR_MODE_HTTP) {
+   int i;
+
+   for (i=1; i<6; i++)
+   chunk_printf(&msg, "%lld,", px->counters.p.http.rsp[i]);
+
+   chunk_printf(&msg, "%lld,", px->counters.p.http.rsp[0]);
+   } else {
+   chunk_printf(&msg, ",,");
+   }
+
+   /* finish with EOL */
+   chunk_printf(&msg, "\n");
}
 
if (buffer_feed_chunk(rep, &msg) >= 0)
diff --git a/s

[PATCH 1/2] [MINOR] add additional "a href"s to stats page

2009-10-24 Thread Krzysztof Piotr Oledzki
From e0d3f887ccd891548364c629e7a2db278e35082a Mon Sep 17 00:00:00 2001
From: Krzysztof Piotr Oledzki 
Date: Sat, 24 Oct 2009 14:24:30 +0200
Subject: [MINOR] add additional "a href"s to stats page

This patch adds <a href> html links for proxies, frontends, servers
and backends. Once located, they can be clicked. Users no longer have to
manually add #anchor to the stats url.
---
 src/dumpstats.c |   29 +
 1 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/src/dumpstats.c b/src/dumpstats.c
index 70b96b5..1f0ae90 100644
--- a/src/dumpstats.c
+++ b/src/dumpstats.c
@@ -989,6 +989,13 @@ int stats_dump_http(struct session *s, struct buffer *rep, struct uri_auth *uri)
 ".backup6  {background: #e0e0e0;}\n"
".rls  {letter-spacing: 0.2em; margin-right: 1px;}\n" /* right letter spacing (used for grouping digits) */
"\n"
+"a.px:link {color: #40; text-decoration: none;}"
+"a.px:visited {color: #40; text-decoration: none;}"
+"a.px:hover {color: #ff; text-decoration: none;}"
+"a.lfsb:link {color: #00; text-decoration: none;}"
+"a.lfsb:visited {color: #00; text-decoration: none;}"
+"a.lfsb:hover {color: #505050; text-decoration: none;}"
+"\n"
"table.tbl { border-collapse: collapse; border-style: none;}\n"
"table.tbl td { text-align: right; border-width: 1px 1px 1px 1px; border-style: solid solid solid solid; padding: 2px 3px; border-color: gray; white-space: nowrap;}\n"
 "table.tbl td.ac { text-align: center;}\n"
@@ -1224,7 +1231,9 @@ int stats_dump_proxy(struct session *s, struct proxy *px, struct uri_auth *uri)
chunk_printf(&msg,
 "\n"
 ""
-"%s"
+""
+""
+"%s"
 "%s"
 "\n"
 "\n"
@@ -1247,7 +1256,7 @@ int stats_dump_proxy(struct session *s, struct proxy *px, struct uri_auth *uri)
 
"BckChkDwnDwntme"
 "Thrtle\n"
 "",
-px->id, px->id,
+px->id, px->id, px->id,
 px->desc ? "desc" : "empty", px->desc ? 
px->desc : "");
 
if (buffer_feed_chunk(rep, &msg) >= 0)
@@ -1265,7 +1274,8 @@ int stats_dump_proxy(struct session *s, struct proxy *px, struct uri_auth *uri)
chunk_printf(&msg,
 /* name, queue */
 ""
-"Frontend"
+""
+"Frontend"
 /* sessions rate : current, max, limit */
 "%s%s%s"
 /* sessions : current, max, limit, total, 
lbtot */
@@ -1274,7 +1284,7 @@ int stats_dump_proxy(struct session *s, struct proxy *px, struct uri_auth *uri)
 /* bytes : in, out */
 "%s%s"
 "",
-px->id,
+px->id, px->id,
 U2H0(read_freq_ctr(&px->fe_sess_per_sec)),
 U2H1(px->counters.fe_sps_max), 
LIM2A2(px->fe_sps_lim, "-"),
 U2H3(px->feconn), 
U2H4(px->counters.feconn_max), U2H5(px->maxconn),
@@ -1491,7 +1501,9 @@ int stats_dump_proxy(struct session *s, struct proxy *px, struct uri_auth *uri)
   "no check" };
chunk_printf(&msg,
 /* name */
-"%s"
+""
+""
+"%s"
 /* queue : current, max, limit */
 "%s%s%s"
 /* sessions rate : current, max, limit */
@@ -1501,7 +1513,7 @@ int stats_dump_proxy(struct session *s, struct proxy *px, struct uri_auth *uri)
 "state & SRV_BACKUP) ? "backup" : 
"active",
-sv_state, px->id, sv-

Re: [PATCH] [MINOR] Add "a name" to stats page

2009-10-24 Thread Krzysztof Olędzki

On 2009-10-24 09:54, Willy Tarreau wrote:

On Thu, Oct 22, 2009 at 10:49:59PM +0200, Krzysztof Piotr Oledzki wrote:

From ad5198f6e8c143b0f070d98d64b507d343d697fa Mon Sep 17 00:00:00 2001
From: Krzysztof Piotr Oledzki 
Date: Thu, 22 Oct 2009 22:48:09 +0200
Subject: [MINOR] Add "a name" to stats page

If you have a lot of proxies/servers in your stats page it is
not easy to locate the one you are interested in. You can
of course use the search function of your favorite web browser,
but browsers often lose their focus when reloading stats.

This patch adds <a name> html tags for proxies, frontends, servers
and backends. You can use it to access a specific place, for example:


Simple and efficient, I like the idea. I'm merging the patch. We can
even improve it by adding a link on each front/back name that references
itself, so that users don't have to manually add the #proxy/Frontend on
the URL.


Good idea, I'll do it.

Best regards,

Krzysztof Olędzki



Re: Query Regarding the HAProxy and TCP

2009-10-24 Thread Willy Tarreau
Hi,

On Sat, Oct 24, 2009 at 12:41:39AM +0200, XANi wrote:
(...)
> > As all the 16 messages are sent on the same connection (Connection is
> > made only once in the TCP Client), the HAProxy is not re routing the
> > remaining message to another node.
> > 
> > Is this scenario achievable through HAProxy?
> > 
> > And my next query is that, as I mentioned that my Client creates only
> > one Connection and sends Business oriented logical messages, for the
> > rest of the life time of the client, and can that traffic be load
> > balanced?
> > 
> >  
> > 
> > Eagerly waiting for your reply with comments and solutions if
> > applicable.
> AFAIK haproxy knows nothing about what's going on inside non-HTTP TCP
> connections (it doesn't know whether the next packet is another "request");
> for that to work you would have to use one transaction per connection or
> do a heavy rewrite of haproxy to support your application protocol.

Exactly. As Mariusz explains it, there is no way to know what part of
your stream may be extracted and sent anywhere else. TCP does not
transport messages, it transports a continuous stream. If you use it
for messages, it means you have a specific protocol for this. You
could very well try to implement support for it into haproxy, but it
could take some energy and will only push the problem one bit further
and never solve the issue completely, because from your description,
I understand that the client does not correctly cope with transmission
errors.
 
Regards,
Willy




Re: Servers seen as going up and down using option httpchk

2009-10-24 Thread Willy Tarreau
Hello,

On Fri, Oct 23, 2009 at 04:34:49PM -0400, Michael Kushnir wrote:
> Hello,
> 
> I have the following configuration for my webfarm:
> 
> 7 servers, all running Centos 5. Two (LB1 and LB2) are HAproxy load
> balancers, Web1-5 are Apache2 and Lighttpd web servers (I tried both
> as a method of elimination test on the side of the web servers). All
> of the servers are dual xeon Dell PE1850s with 4GB of ram, or better.
> I am using HAproxy 1.3.15.2. The webfarm is used for adserving using
> OpenX.
> 
> Here is the problem I am having:
> 
> Once I get in the neighborhood of 1000 sessions the stats page on the
> active LB starts showing that the web servers are going up and down
> randomly (all servers even those set as backup and not getting any
> traffic). When I monitor the servers in real time and check on the
> stats page of the standby LB (the hot spare that still monitors
> servers) everything looks fine and all servers are running green with
> no problems. As a result of HAproxy thinking that servers are going up
> and down, clients are getting errors.
> 
> The apache servers are set up in prefork mode with a ServerLimit at
> 2000, MaxClients at 2000, and MaxRequestsPerChild at 4000.

Can you check that you have enough processes started when the problem
happens? I'm asking because apache takes a very long time to start
additional processes. As a workaround, you can set StartServers to
your desired value (and I believe MinSpareServers too, though I'm
not certain).

> The server
> load per server has never gotten to 2000, and there is plenty of CPU
> and Ram to spare on each machine even during heavy load.
> 
> I am guessing that the issue has to do with the active HAproxy server
> running out of something or other and losing the ability to poll the
> web servers under this load. There is barely any usage of CPU or Ram
> on the LB itself, so I don't think its a hardware issue. Below is my
> HAproxy config file. Please note that because this is an ad server
> distribution system, I don't need the proxy to keep track of sessions
> or send the same user to the same web server as each request is a full
> self contained operation.

It is very possible that you're running out of file descriptors on your
haproxy process. Also, do you have ip_conntrack / nf_conntrack loaded
on there ? Maybe you have a wrong setting limiting the number of concurrent
connections ? "dmesg" should tell you in this case.

> global
> log 127.0.0.1   local0 notice
> #log 127.0.0.1   local1 notice
> #log loghostlocal0 info
> maxconn 2
> #debug
> #quiet
> ulimit-n 25000
> user root   < set because lit says ulimit-n above requires
> group root  < set because lit says ulimit-n above requires

This is not required. You need to *start* the process as root, but
it sets the ulimit before dropping privileges. So you can safely
use another user/group setting here.
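
In other words, a global section along these lines is sufficient when haproxy is *started* as root; it raises the fd limit first and then drops to the unprivileged account (account names are illustrative):

```
global
	ulimit-n 25000
	user haproxy
	group haproxy
```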

(...)
> listen webfarm
>bind :80
>mode http
>stats enable
>stats auth ***:***
>balance roundrobin
>#cookie SERVERID insert indirect nocache
>option httpclose
>option forwardfor
>option httpchk GET /check.txt HTTP/1.0
>   #server web1..com ***.***.***.***:80 weight 1 check maxconn
> 200 inter 1000 rise 2 fall 5 <--- commented out, reserved for admin
> interface
>   server web2..com ***.***.***.***:80 weight 3 check maxconn
> 3000 inter 1000 rise 2 fall 5 backup
>   server web3..com ***.***.***.***:80 weight 3 check maxconn
> 3000 inter 1000 rise 2 fall 5
>   server web4..com ***.***.***.***:80  weight 3 check maxconn
> 3000 inter 1000 rise 2 fall 5
> 
> and so forth, same for the rest of the servers...
> 
> I have tried using HTTP/1.1 for the httpchk option, but that results
> in all servers being shown as down in the stats. I have also tried
> varying the inter variable from 500 to 4000 with no change in
> behavior. Please let me know if you can suggest something. I am
> guessing some operating system variables need to be tweaked.

You should try to disable "option httpchk", and see if it makes any
difference. If it does, it means that apache is not responding to the
request (most likely not enough processes started). If it does not
change anything, it is very possible that you have trouble creating
a new outgoing connection for one of the reasons above. Then you should
not try to hide the issue using larger intervals because it means that
your production traffic is affected by the issue too.

By the way, please check your logs for connection failures or response
timeouts.

If you're interested, in version 1.4-dev, there is a new feature which
tells you on the stats page where a check failed (L4, L7, ...). It can
help in circumstances like yours.

I'm thinking about something else. I suppose you have a second LB serving
as a backup. Could you check if it sees failed checks too ? This will t

Re: [PATCH] [MINOR] Add "a name" to stats page

2009-10-24 Thread Willy Tarreau
On Thu, Oct 22, 2009 at 10:49:59PM +0200, Krzysztof Piotr Oledzki wrote:
> From ad5198f6e8c143b0f070d98d64b507d343d697fa Mon Sep 17 00:00:00 2001
> From: Krzysztof Piotr Oledzki 
> Date: Thu, 22 Oct 2009 22:48:09 +0200
> Subject: [MINOR] Add "a name" to stats page
> 
> If you have a lot of proxies/servers in your stats page it is
> not easy to locate the one you are interested in. You can
> of course use the search function of your favorite web browser,
> but browsers often lose their focus when reloading stats.
> 
> This patch adds <a name> html tags for proxies, frontends, servers
> and backends. You can use it to access a specific place, for example:

Simple and efficient, I like the idea. I'm merging the patch. We can
even improve it by adding a link on each front/back name that references
itself, so that users don't have to manually add the #proxy/Frontend on
the URL.

Thanks,
Willy




Re: HAPROXY in zLinux is presenting Segmentation fault

2009-10-24 Thread Willy Tarreau
Hi alexandre,

On Thu, Oct 22, 2009 at 01:52:05PM +, alexandre oliveira wrote:
> 
> Willy, I did what you have suggested.

thanks.

(...)
> holb001:~/haproxy-1.3.22 # haproxy -vv
> HA-Proxy version 1.3.22 2009/10/14
> Copyright 2000-2009 Willy Tarreau 
> 
> Build options :
>   TARGET  = linux26
>   CPU = generic
>   CC  = gcc
>   CFLAGS  = -O2 -g
>   OPTIONS =
> 
> Default settings :
>   maxconn = 2000, maxpollevents = 200
> 
> Available polling systems :
>  sepoll : pref=400,  test result OK
>   epoll : pref=300,  test result OK
>poll : pref=200,  test result OK
>  select : pref=150,  test result OK
> Total: 4 (4 usable), will use sepoll.

OK pretty much common.

> holb001:~ # uname -a
> Linux holb001 2.6.16.60-0.37_f594963d-default #1 SMP Mon Mar 23 13:39:48 UTC 
> 2009 s390x s390x s390x GNU/Linux

Less common ;-)

(...)
> # Ive started haproxy and did a test. The result is as follow:
> holb001:~/haproxy-1.3.22 # haproxy -f /etc/haproxy/haproxy.cfg -db
> Available polling systems :
>  sepoll : pref=400,  test result OK
>   epoll : pref=300,  test result OK
>poll : pref=200,  test result OK
>  select : pref=150,  test result OK
> Total: 4 (4 usable), will use sepoll.
> Using sepoll() as the polling mechanism.
> :uat.accept(0005)=0007 from [192.168.0.10:4047]
> 0001:uat.accept(0005)=0009 from [192.168.0.10:4048]
> 0002:uat.accept(0005)=000b from [192.168.0.10:4049]
> 0003:uat.accept(0005)=000d from [192.168.0.10:4050]
> 0004:uat.accept(0005)=000f from [192.168.0.10:4051]
> 0001:uat.srvcls[0009:000a]
> 0001:uat.clicls[0009:000a]
> 0001:uat.closed[0009:000a]
> :uat.srvcls[0007:0008]
> :uat.clicls[0007:0008]
> :uat.closed[0007:0008]
> Segmentation fault

Pretty fast to die... I really don't like that at all; it makes me
think about some uninitialized variable which has a visible effect on
your arch only.

> Remember that this server is a zLinux, I mean, it runs under a mainframe.

yes, but that's not an excuse for crashing. Do you have gdb on this
machine ? Would it be possible then to run haproxy inside gdb and
check where it dies, and with what variables, pointers, etc... ?

> Suggestions?

Oh yes I'm thinking about something. Could you send your process
a SIGQUIT while it's waiting for a connection ? This will dump all
the memory pools, and we'll see if some of them are merged. It is
possible that some pointers are initialized and never overwritten
on other archs, but reused on yours due to different structure sizes.
This happened once already. So just do "killall -QUIT haproxy" and
send the output. It should look like this :

Dumping pools usage.
  - Pool pipe (16 bytes) : 0 allocated (0 bytes), 0 used, 2 users [SHARED]
  - Pool capture (64 bytes) : 0 allocated (0 bytes), 0 used, 1 users [SHARED]
  - Pool task (80 bytes) : 0 allocated (0 bytes), 0 used, 1 users [SHARED]
  - Pool hdr_idx (416 bytes) : 0 allocated (0 bytes), 0 used, 2 users [SHARED]
  - Pool session (816 bytes) : 0 allocated (0 bytes), 0 used, 1 users [SHARED]
  - Pool requri (1024 bytes) : 0 allocated (0 bytes), 0 used, 1 users [SHARED]
  - Pool buffer (32864 bytes) : 0 allocated (0 bytes), 0 used, 1 users [SHARED]
Total: 7 pools, 0 bytes allocated, 0 used.

Thanks !
Willy