Re: [PATCH] MEDIUM: mailer: try sending a mail up to 3 times
On Sat, Nov 21, 2015 at 01:21:16AM +0100, PiBa-NL wrote: > Hi Simon, > > Ok, ill try and see if i can add a config setting for it. And move the > whitespace changes to a separate patch. > Will take me some time.. And for what it's worth, I was about to make the same comments as Simon. Please however split the two : - 1 patch to change the default timeout that we can backport - another patch to make it configurable that we will not backport. I think your choice of 10s for the default timeout is reasonable. We had to start with something, the check interval made sense given that we reused that code, but it turns out that the default interval is 2s and is too short for this specific use. Thanks, Willy
Re: [PATCH] MEDIUM: mailer: try sending a mail up to 3 times
Hi Simon, Ok, ill try and see if i can add a config setting for it. And move the whitespace changes to a separate patch. Will take me some time.. Thanks for your feedback, PiBa-NL Op 21-11-2015 om 0:30 schreef Simon Horman: On Fri, Nov 20, 2015 at 11:58:19PM +0100, PiBa-NL wrote: Hi Willy, Op 16-11-2015 om 19:57 schreef Willy Tarreau: I agree with you since we don't know the timeout value nor what it applies to (connection or anything). Thus I think that we should first find and change that value, and maybe after that take your patch to permit real retries in case of connection failures. Thanks, Willy Alright found the timeout, which was actually marked with a rather clear 'warning message'. Perhaps attached patch could be applied to increase that timeout to 10 seconds? Should a similar 'warning' still be added to say this allows for some retransmits? (In my test it sends 4 SYN packets if it cannot connect at all.) Or should it be approached in a different way? Perhaps as a configuration option on the mailers section? I think it should be configurable, the mailers section sounds logical to me. Thanks, PiBa-NL >From eaf95bea0af6aa3b553a6e038997b5d339b507da Mon Sep 17 00:00:00 2001 From: Pieter Baauw Date: Fri, 20 Nov 2015 23:39:48 +0100 Subject: [PATCH] mailer: set longer timeout on mail alert to allow for a few tcp retransmits --- include/common/defaults.h | 9 + src/checks.c | 4 +--- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/include/common/defaults.h b/include/common/defaults.h index d1994e8..b0d7c07 100644 --- a/include/common/defaults.h +++ b/include/common/defaults.h @@ -144,10 +144,11 @@ #define CONN_RETRIES3 -#define CHK_CONNTIME2000 -#defineDEF_CHKINTR 2000 -#define DEF_FALLTIME3 -#define DEF_RISETIME2 +#define CHK_CONNTIME 2000 +#define DEF_CHKINTR 2000 +#define DEF_MAILALERTTIME 1 +#define DEF_FALLTIME 3 +#define DEF_RISETIME 2 #define DEF_AGENT_FALLTIME1 #define DEF_AGENT_RISETIME1 #define DEF_CHECK_REQ "OPTIONS / HTTP/1.0\r\n" I would prefer if the whitespace changes, that is all changes other than the line that adds DEF_MAILALERTTIME, were not part of this patch. diff --git a/src/checks.c b/src/checks.c index e77926a..cecd7ed 100644 --- a/src/checks.c +++ b/src/checks.c @@ -3108,9 +3108,7 @@ static int init_email_alert_checks(struct server *s) LIST_INIT(&q->email_alerts); - check->inter = DEF_CHKINTR; /* XXX: Would like to Skip to the next alert, if any, ASAP. -* But need enough time so that timeouts don't occur -* during tcp check procssing. For now just us an arbitrary default. */ + check->inter = DEF_MAILALERTTIME; I would prefer if the comment was retained, it is still valid. check->rise = DEF_AGENT_RISETIME; check->fall = DEF_AGENT_FALLTIME; err_str = init_check(check, PR_O2_TCPCHK_CHK); -- 1.9.5.msysgit.1
Re: [PATCH] MEDIUM: mailer: try sending a mail up to 3 times
On Fri, Nov 20, 2015 at 11:58:19PM +0100, PiBa-NL wrote: > Hi Willy, > > Op 16-11-2015 om 19:57 schreef Willy Tarreau: > >I agree with you since we don't know the timeout value nor what it applies > >to (connection or anything). Thus I think that we should first find and > >change that value, and maybe after that take your patch to permit real > >retries in case of connection failures. > > > >Thanks, > >Willy > Alright found the timeout, which was actually marked with a rather clear > 'warning message'. > Perhaps attached patch could be applied to increase that timeout to 10 > seconds? Should a similar 'warning' still be added to say this allows for > some retransmits? (In my test it sends 4 SYN packets if it cannot connect at > all.) > > Or should it be approached in a different way? Perhaps as a configuration > option on the mailers section? I think it should be configurable, the mailers section sounds logical to me. > Thanks, > PiBa-NL > >From eaf95bea0af6aa3b553a6e038997b5d339b507da Mon Sep 17 00:00:00 2001 > From: Pieter Baauw > Date: Fri, 20 Nov 2015 23:39:48 +0100 > Subject: [PATCH] mailer: set longer timeout on mail alert to allow for a few > tcp retransmits > > --- > include/common/defaults.h | 9 + > src/checks.c | 4 +--- > 2 files changed, 6 insertions(+), 7 deletions(-) > > diff --git a/include/common/defaults.h b/include/common/defaults.h > index d1994e8..b0d7c07 100644 > --- a/include/common/defaults.h > +++ b/include/common/defaults.h > @@ -144,10 +144,11 @@ > > #define CONN_RETRIES3 > > -#define CHK_CONNTIME2000 > -#define DEF_CHKINTR 2000 > -#define DEF_FALLTIME3 > -#define DEF_RISETIME2 > +#define CHK_CONNTIME 2000 > +#define DEF_CHKINTR 2000 > +#define DEF_MAILALERTTIME 1 > +#define DEF_FALLTIME 3 > +#define DEF_RISETIME 2 > #define DEF_AGENT_FALLTIME1 > #define DEF_AGENT_RISETIME1 > #define DEF_CHECK_REQ "OPTIONS / HTTP/1.0\r\n" I would prefer if the whitespace changes, that is all changes other than the line that adds DEF_MAILALERTTIME, were not part of this patch. > diff --git a/src/checks.c b/src/checks.c > index e77926a..cecd7ed 100644 > --- a/src/checks.c > +++ b/src/checks.c > @@ -3108,9 +3108,7 @@ static int init_email_alert_checks(struct server *s) > > LIST_INIT(&q->email_alerts); > > - check->inter = DEF_CHKINTR; /* XXX: Would like to Skip to the > next alert, if any, ASAP. > - * But need enough time so that > timeouts don't occur > - * during tcp check procssing. For > now just us an arbitrary default. */ > + check->inter = DEF_MAILALERTTIME; I would prefer if the comment was retained, it is still valid. > check->rise = DEF_AGENT_RISETIME; > check->fall = DEF_AGENT_FALLTIME; > err_str = init_check(check, PR_O2_TCPCHK_CHK); > -- > 1.9.5.msysgit.1 >
Re: [PATCH] MEDIUM: mailer: try sending a mail up to 3 times
Hi Willy, Op 16-11-2015 om 19:57 schreef Willy Tarreau: I agree with you since we don't know the timeout value nor what it applies to (connection or anything). Thus I think that we should first find and change that value, and maybe after that take your patch to permit real retries in case of connection failures. Thanks, Willy Alright found the timeout, which was actually marked with a rather clear 'warning message'. Perhaps attached patch could be applied to increase that timeout to 10 seconds? Should a similar 'warning' still be added to say this allows for some retransmits? (In my test it sends 4 SYN packets if it cannot connect at all.) Or should it be approached in a different way? Perhaps as a configuration option on the mailers section? Thanks, PiBa-NL >From eaf95bea0af6aa3b553a6e038997b5d339b507da Mon Sep 17 00:00:00 2001 From: Pieter Baauw Date: Fri, 20 Nov 2015 23:39:48 +0100 Subject: [PATCH] mailer: set longer timeout on mail alert to allow for a few tcp retransmits --- include/common/defaults.h | 9 + src/checks.c | 4 +--- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/include/common/defaults.h b/include/common/defaults.h index d1994e8..b0d7c07 100644 --- a/include/common/defaults.h +++ b/include/common/defaults.h @@ -144,10 +144,11 @@ #define CONN_RETRIES3 -#defineCHK_CONNTIME2000 -#defineDEF_CHKINTR 2000 -#define DEF_FALLTIME3 -#define DEF_RISETIME2 +#define CHK_CONNTIME 2000 +#define DEF_CHKINTR 2000 +#define DEF_MAILALERTTIME 1 +#define DEF_FALLTIME 3 +#define DEF_RISETIME 2 #define DEF_AGENT_FALLTIME1 #define DEF_AGENT_RISETIME1 #define DEF_CHECK_REQ "OPTIONS / HTTP/1.0\r\n" diff --git a/src/checks.c b/src/checks.c index e77926a..cecd7ed 100644 --- a/src/checks.c +++ b/src/checks.c @@ -3108,9 +3108,7 @@ static int init_email_alert_checks(struct server *s) LIST_INIT(&q->email_alerts); - check->inter = DEF_CHKINTR; /* XXX: Would like to Skip to the next alert, if any, ASAP. -* But need enough time so that timeouts don't occur -* during tcp check procssing. For now just us an arbitrary default. */ + check->inter = DEF_MAILALERTTIME; check->rise = DEF_AGENT_RISETIME; check->fall = DEF_AGENT_FALLTIME; err_str = init_check(check, PR_O2_TCPCHK_CHK); -- 1.9.5.msysgit.1
Re: [PATCH] MEDIUM: mailer: try sending a mail up to 3 times
Hi Pieter, On Mon, Nov 16, 2015 at 07:14:33PM +0100, PiBa-NL wrote: > >>But i think thats not so much a problem, it does still make > >>me wonder a little what happens if a packet is lost in the middle of a > >>tcp connection, will it resend like a normal tcp connection? Its > >>difficult to test though.. > >Haproxy doesn't affect how TCP works. We never see packets, the TCP stack > >guarantees a reliable connection below us. > But without the patch only 1 SYN packet is send, shouldn't the normal > tcp stack always send 3 SYN packets when a connection is not getting > established? Absolutely, so my guess is that it's purely a timeout issue, probably that the default mailer timeout is too short to allow a retransmit to be sent. The default SYN retransmit to an unknown target is 3s usually, maybe we're running with a 3s (or lower) timeout here ? I don't know how they're set, I haven't looked at this part recently I must confess. > >>If you can apply the v2 patch i think that solves most issues that one > >>or two lost packets can cause in any case. > >Now I'm having doubts, because I think your motivation to develop this > >patch was more related to some fears of dropped *packets* (that are > >already properly handled below us) than any real world issue. > My initial motivation was to get 3 SYN packets, but if the tcp > connection is handled through the normal stack i don't understand why > only 1 is being send without the patch, and because only 1 syn packet is > being send i am not sure other ack / push-ack packets would be retried > by the normal stack either.?.(i have not yet tested if that is the case) I agree with you since we don't know the timeout value nor what it applies to (connection or anything). Thus I think that we should first find and change that value, and maybe after that take your patch to permit real retries in case of connection failures. Thanks, Willy
Re: [PATCH] MEDIUM: mailer: try sending a mail up to 3 times
Hi Willy, Op 16-11-2015 om 7:20 schreef Willy Tarreau: Hi Pieter, On Mon, Nov 16, 2015 at 12:13:50AM +0100, PiBa-NL wrote: -but check->conn->flags & 0xFF is a bit of s guess from observing the flags when it could connect but the server did not respond properly.. is there a other better way? This one is ugly. First, you should never use the numeric values when there are defines or enums because if these flags happen to change or move to something else, your value will not be spotted and will not be converted. Agreed it was ugly, but i could not find the enums based equivalent for that value.. Now i think its only checking 1 bit of it. But that seems to work alright to. You could have ORed all the respective flags but even so it didn't really make sense to have all of them. Thus I'm attaching two proposals that I'd like you to test, the first one verifies if the connection was established or not. The second one checks if we've reached the last rule (implying all of them were executed). From my tests both work as you describe. v1 one retries the connection part, v2 also retries if the mail sending did not complete normally. I think v2 would be the preferred solution. OK fine, thanks for the test. Though looking through my tcpdumps again i do see it tries to connect with 3 different client ports thats not how a normal tcp socket would retry right? Do not confuse haproxy and the tcp stack, that's important. Dropped *packets* are dealt with by the TCP stack which retransmits themm over the same connection, thus the same ports. When haproxy retries, it does not manipulate packets, it retries failed connections, ie the ones that TCP failed to fix (eg: multiple dropped packets at the TCP level causing a connection to fail for whatever reason, such as a blocked source port or a dead link requiring a new connection to be attempted via a different path. But i think thats not so much a problem, it does still make me wonder a little what happens if a packet is lost in the middle of a tcp connection, will it resend like a normal tcp connection? Its difficult to test though.. Haproxy doesn't affect how TCP works. We never see packets, the TCP stack guarantees a reliable connection below us. But without the patch only 1 SYN packet is send, shouldn't the normal tcp stack always send 3 SYN packets when a connection is not getting established? If you can apply the v2 patch i think that solves most issues that one or two lost packets can cause in any case. Now I'm having doubts, because I think your motivation to develop this patch was more related to some fears of dropped *packets* (that are already properly handled below us) than any real world issue. My initial motivation was to get 3 SYN packets, but if the tcp connection is handled through the normal stack i don't understand why only 1 is being send without the patch, and because only 1 syn packet is being send i am not sure other ack / push-ack packets would be retried by the normal stack either.?.(i have not yet tested if that is the case) Could you please confirm exactly what case you wanted to cover here ? Thanks, Willy Regards PiBa-NL
Re: [PATCH] MEDIUM: mailer: try sending a mail up to 3 times
Hi Pieter, On Mon, Nov 16, 2015 at 12:13:50AM +0100, PiBa-NL wrote: > >-but check->conn->flags & 0xFF is a bit of s guess from observing the > >flags when it could connect but the server did not respond > >properly.. is there a other better way? > >This one is ugly. First, you should never use the numeric values > >when there are defines or enums because if these flags happen to > >change or move to something else, your value will not be spotted > >and will not be converted. > Agreed it was ugly, but i could not find the enums based equivalent for > that value.. Now i think its only checking 1 bit of it. But that seems > to work alright to. You could have ORed all the respective flags but even so it didn't really make sense to have all of them. > >Thus I'm attaching two proposals that I'd like you to test, the > >first one verifies if the connection was established or not. The > >second one checks if we've reached the last rule (implying all > >of them were executed). > From my tests both work as you describe. > v1 one retries the connection part, v2 also retries if the mail sending > did not complete normally. > I think v2 would be the preferred solution. OK fine, thanks for the test. > Though looking through my tcpdumps again i do see it tries to connect > with 3 different client ports thats not how a normal tcp socket would > retry right? Do not confuse haproxy and the tcp stack, that's important. Dropped *packets* are dealt with by the TCP stack which retransmits themm over the same connection, thus the same ports. When haproxy retries, it does not manipulate packets, it retries failed connections, ie the ones that TCP failed to fix (eg: multiple dropped packets at the TCP level causing a connection to fail for whatever reason, such as a blocked source port or a dead link requiring a new connection to be attempted via a different path. > But i think thats not so much a problem, it does still make > me wonder a little what happens if a packet is lost in the middle of a > tcp connection, will it resend like a normal tcp connection? Its > difficult to test though.. Haproxy doesn't affect how TCP works. We never see packets, the TCP stack guarantees a reliable connection below us. > If you can apply the v2 patch i think that solves most issues that one > or two lost packets can cause in any case. Now I'm having doubts, because I think your motivation to develop this patch was more related to some fears of dropped *packets* (that are already properly handled below us) than any real world issue. Could you please confirm exactly what case you wanted to cover here ? Thanks, Willy
Re: [PATCH] MEDIUM: mailer: try sending a mail up to 3 times
Hi Willy, Op 15-11-2015 om 8:48 schreef Willy Tarreau: Pieter, I'm just seeing this part in your description while merging the patch : On Sun, Nov 08, 2015 at 07:19:21PM +0100, PiBa-NL wrote: HOWEVER. -i have not checked for memoryleaks, sockets not being closed properly (i dont know how to..) I'm not worried for this one, at first glance it looks OK. -is setting current and last steps to null the proper way to reset the step of rule evaluation? Yes that's apparently it. -CO_FL_ERROR is set when there is a connection error.. this seems to be the proper check. Indeed. -but check->conn->flags & 0xFF is a bit of s guess from observing the flags when it could connect but the server did not respond properly.. is there a other better way? This one is ugly. First, you should never use the numeric values when there are defines or enums because if these flags happen to change or move to something else, your value will not be spotted and will not be converted. Agreed it was ugly, but i could not find the enums based equivalent for that value.. Now i think its only checking 1 bit of it. But that seems to work alright to. Second, normally you would just need "!CONNECTED", as this flag is set once the connection is established. Note that I understood from the commit message that you wanted to cover from connection issues, but the code makes me think that you wanted to retry even if the connection was properly established but the mail could not be completely delivered. I was thinking the state might include 'more' than only !CONNECTED. Thus I'm attaching two proposals that I'd like you to test, the first one verifies if the connection was established or not. The second one checks if we've reached the last rule (implying all of them were executed). From my tests both work as you describe. v1 one retries the connection part, v2 also retries if the mail sending did not complete normally. I think v2 would be the preferred solution. Though looking through my tcpdumps again i do see it tries to connect with 3 different client ports thats not how a normal tcp socket would retry right? But i think thats not so much a problem, it does still make me wonder a little what happens if a packet is lost in the middle of a tcp connection, will it resend like a normal tcp connection? Its difficult to test though.. If you can apply the v2 patch i think that solves most issues that one or two lost packets can cause in any case. Thanks, Willy Thanks, PiBa-NL
Re: [PATCH] MEDIUM: mailer: try sending a mail up to 3 times
Pieter, I'm just seeing this part in your description while merging the patch : On Sun, Nov 08, 2015 at 07:19:21PM +0100, PiBa-NL wrote: > >>>HOWEVER. > >>>-i have not checked for memoryleaks, sockets not being closed properly > >>>(i dont know how to..) I'm not worried for this one, at first glance it looks OK. > >>>-is setting current and last steps to null the proper way to reset the > >>>step of rule evaluation? Yes that's apparently it. > >>>-CO_FL_ERROR is set when there is a connection error.. this seems to be > >>>the proper check. Indeed. > >>>-but check->conn->flags & 0xFF is a bit of s guess from observing the > >>>flags when it could connect but the server did not respond > >>>properly.. is there a other better way? This one is ugly. First, you should never use the numeric values when there are defines or enums because if these flags happen to change or move to something else, your value will not be spotted and will not be converted. Second, normally you would just need "!CONNECTED", as this flag is set once the connection is established. Note that I understood from the commit message that you wanted to cover from connection issues, but the code makes me think that you wanted to retry even if the connection was properly established but the mail could not be completely delivered. Thus I'm attaching two proposals that I'd like you to test, the first one verifies if the connection was established or not. The second one checks if we've reached the last rule (implying all of them were executed). Thanks, Willy >From 8b2520d15340f96e6e7934870600fd80c86a8942 Mon Sep 17 00:00:00 2001 From: Pieter Baauw Date: Sun, 26 Jul 2015 20:47:27 +0200 Subject: MEDIUM: mailer: retry sending a mail up to 3 times Currently only 1 connection attempt (syn packet) was send, this patch increases that to 3 attempts. This to make it less likely the mail is lost due to a single lost packet. --- src/checks.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/src/checks.c b/src/checks.c index e77926a..32ff983 100644 --- a/src/checks.c +++ b/src/checks.c @@ -1408,7 +1408,7 @@ static struct task *server_warmup(struct task *t) * * It can return one of : * - SF_ERR_NONE if everything's OK and tcpcheck_main() was not called - * - SF_ERR_UP if if everything's OK and tcpcheck_main() was called + * - SF_ERR_UP if everything's OK and tcpcheck_main() was called * - SF_ERR_SRVTO if there are no more servers * - SF_ERR_SRVCL if the connection was refused by the server * - SF_ERR_PRXCOND if the connection has been limited by the proxy (maxconn) @@ -3065,6 +3065,7 @@ static struct task *process_email_alert(struct task *t) LIST_DEL(&alert->list); check->state |= CHK_ST_ENABLED; + check->fall = 0; } } @@ -3074,6 +3075,16 @@ static struct task *process_email_alert(struct task *t) if (!(check->state & CHK_ST_INPROGRESS) && check->tcpcheck_rules) { struct email_alert *alert; + if ((check->conn->flags & CO_FL_ERROR) || // connection failed, try again + !(check->conn->flags & CO_FL_CONNECTED)) { // did not manage to connect, try again + if (check->fall < 2) { // 3 attempts max + check->current_step = NULL; + check->last_started_step = NULL; + check->fall++; + return t; + } + } + alert = container_of(check->tcpcheck_rules, typeof(*alert), tcpcheck_rules); email_alert_free(alert); -- 1.7.12.2.21.g234cd45.dirty >From 1601982888b700e454f4cb8bfb436048a3e8d71d Mon Sep 17 00:00:00 2001 From: Pieter Baauw Date: Sun, 26 Jul 2015 20:47:27 +0200 Subject: MEDIUM: mailer: retry sending a mail up to 3 times Currently only 1 connection attempt (syn packet) was send, this patch increases that to 3 attempts. This to make it less likely the mail is lost due to a single lost packet. --- src/checks.c | 15 ++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/src/checks.c b/src/checks.c index e77926a..98011ec 100644 --- a/src/checks.c +++ b/src/checks.c @@ -1408,7 +1408,7 @@ static struct task *server_warmup(struct task *t) * * It can return one of : * - SF_ERR_NONE if everything's OK and tcpcheck_main() was not called - * - SF_ERR_UP if if everything's OK and tcpcheck_main() was called + * - SF_ERR_UP if everything's OK and tcpcheck_main() was called * - SF_ERR_SRVTO if there are no more servers * - SF_ERR_SRVCL if the connection was refused by the server * - SF_ERR_PRXCOND if the connection has been limited by the proxy (maxconn) @@ -3065,6 +3065,7 @@ static struct task *process_email_alert(struct task *t) LIST_DEL(&alert->list); check-
Re: [PATCH] MEDIUM: mailer: try sending a mail up to 3 times
On Tue, Nov 10, 2015 at 09:52:21AM +0900, Simon Horman wrote: > I would slightly prefer if there was a more substantial comment in > process_email_alert() noting that retry occurs 3 times. Good point, I'll add this when merging. > But regardless: > Acked-by: Simon Horman Thanks Simon! Willy
Re: [PATCH] MEDIUM: mailer: try sending a mail up to 3 times
On Mon, Nov 09, 2015 at 11:11:53AM +0100, Willy Tarreau wrote: > Hi Pieter, > > > Hi Ben, Willy, Simon, > > > > Ben, thanks for the review. > > Hoping 'release pressure' has cleared for Willy i'm resending the > > patch now, with with your comments incorporated. > > > > CC, to Simon as maintainer of mailers part so he can give approval (or > > not). > > > > The original reservations i had when sending this patch still apply. > > See the "HOWEVER." part in the bottom mail. > > > > Hoping it might get merged to improve mailer reliability. So no > > 'server down' email gets lost.. > > Thanks everyone for your time :) . > > Looks good to me. Just waiting for Simon's approval. I would slightly prefer if there was a more substantial comment in process_email_alert() noting that retry occurs 3 times. But regardless: Acked-by: Simon Horman
Re: [PATCH] MEDIUM: mailer: try sending a mail up to 3 times
Hi Pieter, > Hi Ben, Willy, Simon, > > Ben, thanks for the review. > Hoping 'release pressure' has cleared for Willy i'm resending the > patch now, with with your comments incorporated. > > CC, to Simon as maintainer of mailers part so he can give approval (or > not). > > The original reservations i had when sending this patch still apply. > See the "HOWEVER." part in the bottom mail. > > Hoping it might get merged to improve mailer reliability. So no > 'server down' email gets lost.. > Thanks everyone for your time :) . Looks good to me. Just waiting for Simon's approval. Willy
Re: [PATCH] MEDIUM: mailer: try sending a mail up to 3 times
Forgot to include list, sorry. And then the attachment dropped of.. Resending. Op 8-11-2015 om 17:33 schreef PiBa-NL: Hi Ben, Willy, Simon, Ben, thanks for the review. Hoping 'release pressure' has cleared for Willy i'm resending the patch now, with with your comments incorporated. CC, to Simon as maintainer of mailers part so he can give approval (or not). The original reservations i had when sending this patch still apply. See the "HOWEVER." part in the bottom mail. Hoping it might get merged to improve mailer reliability. So no 'server down' email gets lost.. Thanks everyone for your time :) . Regards, PiBa-NL Op 22-9-2015 om 16:43 schreef Ben Cabot: Hi PiBa-NL, Malcolm has asked me to take a look at this. While I don't know enough to answer the questions about the the design and implementation I have tested the patch. In my testing it works well and I have a couple of comments. I had a warning when building, struct email_alert *alert; should be before process_chk(t); or gcc moans (Warning: ISO C90 forbids mixed declarations and code). Ive moved the stuct to the top of the if statement where it was before my patch. I expect that to fix the warning. It makes in total 4 attempts to send the mail where I believe it should be 3? If the total desired attempts is 3 It looks like "if (check->fall < 3) { " should be "if (check->fall < 2)" with "check->fall++;" inside the if statement. I may be wrong I've only briefly looked. Yes it did '3 retries'. Ive changed to make it a total of '3 attempts' which is more like a normal 3x SYN packet when opening a failing connection. While testing this I've realised it would also be nice to log when the email fails to send after 3 attempts but that is a job for another day. Thanks for submitting this as its helpful for us, also for helping with my patch. I am still waiting for Willy to come back to me about mine as well. As he is in the middle of a release I expect he is very busy at the moment so I'll wait a while before giving him a poke and following up. Hopefully I've been of some help to you. Thanks for testing! Kind Regards, Ben On 4 August 2015 at 20:35, PiBa-NL wrote: bump? Doorgestuurd bericht Onderwerp: request for comment - [PATCH] MEDIUM: mailer: retry sending a mail up to 3 times Datum: Sun, 26 Jul 2015 21:08:41 +0200 Van:PiBa-NL Aan:HAproxy Mailing Lists Hi guys, Ive created a small patch that will retry sending a mail 3 times if it fails the first time. Its seems to work in my limited testing.. HOWEVER. -i have not checked for memoryleaks, sockets not being closed properly (i dont know how to..) -is setting current and last steps to null the proper way to reset the step of rule evaluation? -CO_FL_ERROR is set when there is a connection error.. this seems to be the proper check. -but check->conn->flags & 0xFF is a bit of s guess from observing the flags when it could connect but the server did not respond properly.. is there a other better way? -i used the 'fall' variable to track the number of retries.. should i have created a separate 'retries' variable? Thanks for any feedback you can give me. Best regards, PiBa-NL From 18fd2740b7c9f511e03afe9ebb8237f6a640a141 Mon Sep 17 00:00:00 2001 From: Pieter Baauw Date: Sun, 26 Jul 2015 20:47:27 +0200 Subject: [PATCH] MEDIUM: mailer: retry sending a mail up to 3 times Currently only 1 connection attempt (syn packet) was send, this patch increases that to 3 attempts. This to make it less likely them mail is lost due to a single lost packet. --- src/checks.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/src/checks.c b/src/checks.c index e77926a..335eb9a 100644 --- a/src/checks.c +++ b/src/checks.c @@ -1408,7 +1408,7 @@ static struct task *server_warmup(struct task *t) * * It can return one of : * - SF_ERR_NONE if everything's OK and tcpcheck_main() was not called - * - SF_ERR_UP if if everything's OK and tcpcheck_main() was called + * - SF_ERR_UP if everything's OK and tcpcheck_main() was called * - SF_ERR_SRVTO if there are no more servers * - SF_ERR_SRVCL if the connection was refused by the server * - SF_ERR_PRXCOND if the connection has been limited by the proxy (maxconn) @@ -3065,6 +3065,7 @@ static struct task *process_email_alert(struct task *t) LIST_DEL(&alert->list); check->state |= CHK_ST_ENABLED; + check->fall = 0; } } @@ -3074,6 +3075,17 @@ static struct task *process_email_alert(struct task *t) if (!(check->state & CHK_ST_INPROGRESS) && check->tcpcheck_rules) { struct email_alert *alert; + if ((check->conn->flags & CO_FL_ERROR) || // connection failed, try again + (check->conn->flags & 0xFF) // did not reach the 'normal end', try again + ) { + if (check->fa
Re: [PATCH] MEDIUM: mailer: try sending a mail up to 3 times
Forgot to include list, sorry. Op 8-11-2015 om 17:33 schreef PiBa-NL: Hi Ben, Willy, Simon, Ben, thanks for the review. Hoping 'release pressure' has cleared for Willy i'm resending the patch now, with with your comments incorporated. CC, to Simon as maintainer of mailers part so he can give approval (or not). The original reservations i had when sending this patch still apply. See the "HOWEVER." part in the bottom mail. Hoping it might get merged to improve mailer reliability. So no 'server down' email gets lost.. Thanks everyone for your time :) . Regards, PiBa-NL Op 22-9-2015 om 16:43 schreef Ben Cabot: Hi PiBa-NL, Malcolm has asked me to take a look at this. While I don't know enough to answer the questions about the the design and implementation I have tested the patch. In my testing it works well and I have a couple of comments. I had a warning when building, struct email_alert *alert; should be before process_chk(t); or gcc moans (Warning: ISO C90 forbids mixed declarations and code). Ive moved the stuct to the top of the if statement where it was before my patch. I expect that to fix the warning. It makes in total 4 attempts to send the mail where I believe it should be 3? If the total desired attempts is 3 It looks like "if (check->fall < 3) { " should be "if (check->fall < 2)" with "check->fall++;" inside the if statement. I may be wrong I've only briefly looked. Yes it did '3 retries'. Ive changed to make it a total of '3 attempts' which is more like a normal 3x SYN packet when opening a failing connection. While testing this I've realised it would also be nice to log when the email fails to send after 3 attempts but that is a job for another day. Thanks for submitting this as its helpful for us, also for helping with my patch. I am still waiting for Willy to come back to me about mine as well. As he is in the middle of a release I expect he is very busy at the moment so I'll wait a while before giving him a poke and following up. Hopefully I've been of some help to you. Thanks for testing! Kind Regards, Ben On 4 August 2015 at 20:35, PiBa-NL wrote: bump? Doorgestuurd bericht Onderwerp: request for comment - [PATCH] MEDIUM: mailer: retry sending a mail up to 3 times Datum: Sun, 26 Jul 2015 21:08:41 +0200 Van:PiBa-NL Aan:HAproxy Mailing Lists Hi guys, Ive created a small patch that will retry sending a mail 3 times if it fails the first time. Its seems to work in my limited testing.. HOWEVER. -i have not checked for memoryleaks, sockets not being closed properly (i dont know how to..) -is setting current and last steps to null the proper way to reset the step of rule evaluation? -CO_FL_ERROR is set when there is a connection error.. this seems to be the proper check. -but check->conn->flags & 0xFF is a bit of s guess from observing the flags when it could connect but the server did not respond properly.. is there a other better way? -i used the 'fall' variable to track the number of retries.. should i have created a separate 'retries' variable? Thanks for any feedback you can give me. Best regards, PiBa-NL