Re: Observations about reloads and DNS SRV records

2018-07-04 Thread Tait Clarridge
Hey Baptiste,

I’ll try it out next week when I get back (currently on vacation) and let
you know.

Thanks!
Tait
On Tue., Jul. 3, 2018 at 06:24 Baptiste  wrote:

> Hi,
>
> Actually, the problem was deeper than I first thought.
> As it stands, the state file and SRV records are simply not compatible.
> I had to add a new field to the state file format to support this.
>
> Could you please confirm that the attached patch fixes your issues?
>
> Baptiste
>
>
>
> On Mon, Jun 25, 2018 at 11:48 AM, Baptiste  wrote:
>
>> Hi,
>>
>> Forget the backend id, it's the wrong answer to that problem.
>> I was investigating another potential issue, but this does not fix the
>> original problem reported here.
>>
>> Here is the answer I posted today on Discourse, where other people
>> have also reported the same issue:
>>
>>Just to let you know that I think I found the cause of the issue but I
>> don’t have a fix yet.
>>I’ll come back to you this week with more info and hopefully a fix.
>>The issue seems to be in srv_init_addr(), because srv->hostname is not
>> set (null).
>>
>> Baptiste
>>
>>
>>
>
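For anyone following the diagnosis above in the code, here is a minimal,
self-contained sketch of the failure mode, using a reduced stand-in for
HAProxy's struct server. It is illustrative only; the actual patch instead
extends the server-state file format.

#include <stdio.h>

/* Reduced stand-in for HAProxy's struct server. */
struct server_sketch {
    char *hostname;    /* FQDN from the config; NULL when the server
                        * is populated from a DNS SRV record */
};

/* Illustrative guard: any init path that resolves the configured
 * hostname must tolerate a NULL srv->hostname, which is what
 * srv_init_addr() reportedly did not. */
static int init_addr_sketch(struct server_sketch *srv)
{
    if (!srv->hostname)
        return 0;    /* nothing to resolve yet; wait for SRV resolution */
    printf("resolving %s\n", srv->hostname);
    return 1;
}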


Re: Using different sources when connecting to a server

2018-07-04 Thread Aurélien Nephtali
Hello Baptiste,

On Wed, Jul 4, 2018 at 1:07 PM, Baptiste  wrote:
> Hi Aurélien,
>
> My 2 cents.
>
>> I'm trying to add a feature which allows HAProxy to use more than one
>> source when connecting to a server of a backend. The main reason is to
>> avoid duplicating the 'server' lines to reach more than 64k connections
>> from HAProxy to one server.
>
>
> Cool!
>
>>
>> So far I thought of two ways:
>> - each time the 'source' keyword is encountered on a 'server' line,
>>   duplicate the original 'struct server' and fill 'conn_src' with
>>   the correct source information. It's easy to implement but does
>>   not scale at all. In fact it mimics the multiple 'server' lines.
>>   The big advantage is that it can use all existing features that
>>   deal with 'struct server' (balance keyword, for example).
>> - use a list of 'struct conn_src' in 'struct server' and 'struct
>>   proxy' and choose the best source (using round-robin, leastconn,
>>   etc...) when a connection is about to be established.
>
>
> I also prefer the second option.
> So we would have two LBing algorithms? One to choose the server and one to
> choose the source IP to use?

It depends. Considering this feature is perhaps only useful to work around
the 64k connection limit, maybe hardcoding a leastconn algorithm is enough.

>>
>> The config. syntax would look like this:
>>
>> server srv 127.0.0.1:9000 source 127.0.0.2 source 127.0.0.3 source
>> 127.0.0.4 source 127.0.0.5 source 127.0.0.6 source 127.0.1.0/24
>>
>> Not using ip1,ip2,ip/cidr,... avoids confusion when using keywords like
>> usesrc, interface, etc...
>
>
> Sure, but at the least, I don't want to write 255 sources for a "source
> 10.0.0.0/24", so please confirm you'll still allow CIDR notation.

Yes, look at the last 'source' in my example config line. What I find
tedious is using something like this:

server srv 127.0.0.1:9000 source 127.0.0.2,127.0.0.3,127.0.0.4,127.0.1.0/24
usesrc clientip,client [...]

>>
>> Checks to the server would be done from each source, but covering the
>> whole range could be very slow.
>
>
> I would make this optional. From a pure LBing safety point of view, I
> understand the requirement.
> That said, in some cases, we may not want to run tens or hundreds of health
> checks per second.
> I see different options:
> - check from all source IPs
> - check from the host IP address (as if no source were configured)
> - check from one source IP per source subnet
>
>>
>> The main problem I see is how to efficiently store all sources for each
>> server. Using the CIDR syntax can quickly allow millions of sources to
>> be used and if we want to use algorithms like 'leastconn', we need to
>> remember how many connections are still active on a particular source
>> (using round-robin + an index into the range would otherwise have been
>> one solution).
>> I have some ideas but I would like to know the preferred way.
>
>
> Well, store a 32-bit hash of the (source IP, destination IP) pair and count
> on this pattern (and automatically eject source+destination pairs which
> have reached 64k concurrent connections).

Using a leastconn algorithm with very long connections will quickly fill the
list/tree with entries whose counter is 1.
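
To make the trade-off concrete, here is a minimal, self-contained sketch of
the counting scheme described above, assuming a fixed-size open-addressed
table rather than HAProxy's ebtrees; all names (pair_take, pair_release, ...)
are hypothetical. The 64k ceiling exists because a single (source IP,
destination IP, destination port) tuple only has 16 bits of source ports to
draw from.

#include <stdint.h>

#define SRC_TAB_SIZE (1 << 16)   /* demo size, not tuned */
#define SRC_CONN_MAX 65536       /* ~64k source ports per (src, dst) pair */

struct src_slot {
    uint32_t hash;    /* 32-bit hash of (source IP, destination IP) */
    uint32_t conns;   /* concurrent connections using this pair */
};

static struct src_slot src_tab[SRC_TAB_SIZE];

/* Toy 32-bit hash of an IPv4 (src, dst) pair; HAProxy would use its own. */
static uint32_t pair_hash(uint32_t src, uint32_t dst)
{
    uint32_t h = (src * 2654435761u) ^ dst;
    return h ? h : 1;   /* reserve 0 to mean "empty slot" */
}

/* Find or claim the slot for this pair (linear probing, no deletion,
 * and hash collisions share a slot -- fine for a sketch, not for real). */
static struct src_slot *pair_slot(uint32_t src, uint32_t dst)
{
    uint32_t h = pair_hash(src, dst);
    uint32_t i;

    for (i = 0; i < SRC_TAB_SIZE; i++) {
        struct src_slot *s = &src_tab[(h + i) % SRC_TAB_SIZE];
        if (s->hash == h || s->hash == 0) {
            s->hash = h;
            return s;
        }
    }
    return NULL;   /* table full */
}

/* Take one connection on this pair; -1 means the pair is saturated
 * and the caller should eject it and pick another source IP. */
static int pair_take(uint32_t src, uint32_t dst)
{
    struct src_slot *s = pair_slot(src, dst);

    if (!s || s->conns >= SRC_CONN_MAX)
        return -1;
    s->conns++;
    return 0;
}

static void pair_release(uint32_t src, uint32_t dst)
{
    struct src_slot *s = pair_slot(src, dst);

    if (s && s->conns)
        s->conns--;
}

Note that with long-lived connections and a wide source range, such a table
does indeed fill up with entries whose count never exceeds 1, which is
exactly the growth problem mentioned above.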

>
> I have a question: what would be the impact on "retries"? At first, we
> could keep it working as it does today. But later, we may want to retry
> from a different source IP.

-- 
Aurélien Nephtali



Re: haproxy 1.9 status update

2018-07-04 Thread Baptiste
Sorry to wake up an old thread, but I'm very concerned by the lack of an
"architecture guide" in the HAProxy documentation.
Did we make any progress on this topic?

Baptiste


Re: Using different sources when connecting to a server

2018-07-04 Thread Baptiste
Hi Aurélien,

My 2 cents.

I'm trying to add a feature which allows HAProxy to use more than one
> source when connecting to a server of a backend. The main reason is to
> avoid duplicating the 'server' lines to reach more than 64k connections
> from HAProxy to one server.
>

Cool!


> So far I thought of two ways:
> - each time the 'source' keyword is encountered on a 'server' line,
>   duplicate the original 'struct server' and fill 'conn_src' with
>   the correct source information. It's easy to implement but does
>   not scale at all. In fact it mimics the multiple 'server' lines.
>   The big advantage is that it can use all existing features that
>   deal with 'struct server' (balance keyword, for example).
> - use a list of 'struct conn_src' in 'struct server' and 'struct
>   proxy' and choose the best source (using round-robin, leastconn,
>   etc...) when a connection is about to be established.
>

I also prefer the second option.
So we would have two LBing algorithms? One to choose the server and one to
choose the source IP to use?



> The config. syntax would look like this:
>
> server srv 127.0.0.1:9000 source 127.0.0.2 source 127.0.0.3 source
> 127.0.0.4 source 127.0.0.5 source 127.0.0.6 source 127.0.1.0/24
>
> Not using ip1,ip2,ip/cidr,... avoids confusion when using keywords like
> usesrc, interface, etc...
>

Sure, but at the least, I don't want to write 255 sources for a "source
10.0.0.0/24", so please confirm you'll still allow CIDR notation.


> Checks to the server would be done from each source, but covering the
> whole range could be very slow.
>

I would make this optional. From a pure LBing safety point of view, I
understand the requirement.
That said, in some cases, we may not want to run tens or hundreds of health
checks per second.
I see different options:
- check from all source IPs
- check from the host IP address (as if no source were configured)
- check from one source IP per source subnet (sketched below)
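
As an illustration of that last option, here is a small, self-contained
sketch that derives one health-check source per configured subnet (the first
host address of the network); check_source_for_subnet() is a hypothetical
helper, not an HAProxy API:

#include <arpa/inet.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: given "a.b.c.d/len", write the first host address
 * of the subnet into 'out', to be used as the check source. Prefix /32
 * is rejected for brevity. */
static int check_source_for_subnet(const char *cidr, struct in_addr *out)
{
    char buf[32];
    char *slash;
    unsigned long len;
    struct in_addr net;

    if (strlen(cidr) >= sizeof(buf))
        return -1;
    strcpy(buf, cidr);
    slash = strchr(buf, '/');
    if (!slash)
        return -1;
    *slash = '\0';
    len = strtoul(slash + 1, NULL, 10);
    if (len > 31 || inet_pton(AF_INET, buf, &net) != 1)
        return -1;
    out->s_addr = htonl(ntohl(net.s_addr) + 1);   /* network address + 1 */
    return 0;
}

int main(void)
{
    struct in_addr src;

    if (check_source_for_subnet("10.0.0.0/24", &src) == 0)
        printf("check source: %s\n", inet_ntoa(src));   /* 10.0.0.1 */
    return 0;
}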


> The main problem I see is how to efficiently store all sources for each
> server. Using the CIDR syntax can quickly allow millions of sources to
> be used and if we want to use algorithms like 'leastconn', we need to
> remember how many connections are still active on a particular source
> (using round-robin + an index into the range would otherwise have been
> one solution).
> I have some ideas but I would like to know the preferred way.
>

Well, store a 32-bit hash of the (source IP, destination IP) pair and count
on this pattern (and automatically eject source+destination pairs which have
reached 64k concurrent connections).

I have a question: what would be the impact on "retries"? At first, we could
keep it working as it does today. But later, we may want to retry from a
different source IP.

Baptiste


Re: Connections stuck in CLOSE_WAIT state with h2

2018-07-04 Thread Milan Petruželka
>
> 20180629.1347 mpeh2 fd25 h2c_error - st04 fl0002 err05
> Just hit h2c_error - H2_ERR_STREAM_CLOSED
>

After adding more debug, I found the following pattern around h2c_error in
hanging connections:

... everything OK until now
20180629.1826 e901:backend.srvrep[000e:001a]: HTTP/1.1 200 OK
20180629.1826 e901:backend.srvcls[000e:adfd]
20180629.1826 mpeh2 fd14 h2s_close/h2c  - id006d h2c_st04 h2c_fl streams:12
20180629.1826 mpeh2 fd14 h2s_close/real - id006d st04 fl3101 streams:12 -> 11

h2c_error
20180629.1826 mpeh2 fd14 h2_process_demux/11 - st04 fl
20180629.1826 mpeh2 fd14 h2c_error - st04 fl err05
20180629.1826 e8e7:backend.srvcls[000e:adfd]
20180629.1826 mpeh2 fd14 h2_process_mux/01a - st06 fl
20180629.1826 mpeh2 fd14 h2_process_mux/01b - st07 fl0100
20180629.1826 e8dd:backend.srvcls[000e:adfd]

A few more streams close before the log for file descriptor 14 ends and the
connection hangs in CLOSE_WAIT:
20180629.1827 x01.clicls[000e:adfd]
20180629.1827 e8dd:backend.closed[000e:adfd]
20180629.1827 e8df:backend.clicls[000e:adfd]
20180629.1827 e8df:backend.closed[000e:adfd]
20180629.1827 mpeh2 fd14 h2s_destroy - id0045 st06 fl3081 streams:11
20180629.1827 mpeh2 fd14 h2s_close/h2c  - id0045 h2c_st07 h2c_fl0100 streams:11
20180629.1827 mpeh2 fd14 h2s_close/real - id0045 st06 fl3081 streams:11 -> 10

I have not seen this pattern (h2_process_demux/11 followed by h2c_error and
h2_process_mux/01a + h2_process_mux/01b) in other connections, only in those
stuck in the CLOSE_WAIT state.

Here is the piece of code with the added h2_process_demux/11 debug, taken
from the HAProxy 1.8.12 sources:

/* RFC7540#5.1:closed: if this state is reached as a
 * result of sending a RST_STREAM frame, the peer that
 * receives the RST_STREAM might have already sent
 * frames on the stream that cannot be withdrawn. An
 * endpoint MUST ignore frames that it receives on
 * closed streams after it has sent a RST_STREAM
 * frame. An endpoint MAY choose to limit the period
 * over which it ignores frames and treat frames that
 * arrive after this time as being in error.
 */
if (!(h2s->flags & H2_SF_RST_SENT)) {
    /* RFC7540#5.1:closed: any frame other than
     * PRIO/WU/RST in this state MUST be treated as
     * a connection error
     */
    if (h2c->dft != H2_FT_RST_STREAM &&
        h2c->dft != H2_FT_PRIORITY &&
        h2c->dft != H2_FT_WINDOW_UPDATE) {
        send_log(NULL, LOG_NOTICE,
                 "mpeh2 fd%d h2_process_demux/11 - st%02x fl%08x\n",
                 mpeh2_h2c_fd(h2c), mpeh2_h2c_st0(h2c), mpeh2_h2c_flags(h2c));
        h2c_error(h2c, H2_ERR_STREAM_CLOSED);
        goto strm_err;
    }
}

Here is the piece of code with the added h2_process_mux/01a and /01b debug,
taken from the HAProxy 1.8.12 sources:

fail:
    if (unlikely(h2c->st0 >= H2_CS_ERROR)) {
        send_log(NULL, LOG_NOTICE,
                 "mpeh2 fd%d h2_process_mux/01a - st%02x fl%08x\n",
                 mpeh2_h2c_fd(h2c), mpeh2_h2c_st0(h2c), mpeh2_h2c_flags(h2c));
        if (h2c->st0 == H2_CS_ERROR) {
            if (h2c->max_id >= 0) {
                h2c_send_goaway_error(h2c, NULL);
                if (h2c->flags & H2_CF_MUX_BLOCK_ANY)
                    return 0;
            }

            h2c->st0 = H2_CS_ERROR2; // sent (or failed hard) !
        }
        send_log(NULL, LOG_NOTICE,
                 "mpeh2 fd%d h2_process_mux/01b - st%02x fl%08x\n",
                 mpeh2_h2c_fd(h2c), mpeh2_h2c_st0(h2c), mpeh2_h2c_flags(h2c));
        return 1;
    }

I hope this helps in isolating the problem. If it's not enough, I can add
more debug to the h2 mux if someone with better knowledge of the source code
and the h2 protocol suggests where.

Milan


Using different sources when connecting to a server

2018-07-04 Thread Aurélien Nephtali
Hello,

I'm trying to add a feature which allows HAProxy to use more than one
source when connecting to a server of a backend. The main reason is to
avoid duplicating the 'server' lines to reach more than 64k connections
from HAProxy to one server.

So far I thought of two ways:
- each time the 'source' keyword is encountered on a 'server' line,
  duplicate the original 'struct server' and fill 'conn_src' with
  the correct source information. It's easy to implement but does
  not scale at all. In fact it mimics the multiple 'server' lines.
  The big advantage is that it can use all existing features that
  deal with 'struct server' (balance keyword, for example).
- use a list of 'struct conn_src' in 'struct server' and 'struct
  proxy' and choose the best source (using round-robin, leastconn,
  etc...) when a connection is about to be established (sketched below).
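
To make the second option more concrete, here is a rough, self-contained
sketch of what a per-server source list and a leastconn pick could look
like; the *_sketch types are reduced stand-ins, not HAProxy's actual
structures or list macros:

#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for HAProxy's struct conn_src: one configured
 * 'source' entry, extended with a link and a usage counter. */
struct conn_src_sketch {
    uint32_t addr;                   /* source IPv4 address (host order) */
    uint32_t cur_conns;              /* connections currently using it */
    struct conn_src_sketch *next;    /* next source on the server's list */
};

/* Hypothetical field added to struct server: head of the source list. */
struct server_sketch {
    struct conn_src_sketch *src_list;
};

/* Pick the least-loaded source right before connect(); returns NULL
 * when no source is configured (fall back to the host address). */
static struct conn_src_sketch *pick_source(struct server_sketch *srv)
{
    struct conn_src_sketch *s, *best = NULL;

    for (s = srv->src_list; s; s = s->next)
        if (!best || s->cur_conns < best->cur_conns)
            best = s;
    if (best)
        best->cur_conns++;    /* decremented when the connection closes */
    return best;
}

A round-robin variant would simply keep a rotating cursor into the list
instead of comparing counters.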

The config. syntax would look like this:

server srv 127.0.0.1:9000 source 127.0.0.2 source 127.0.0.3 source 127.0.0.4 
source 127.0.0.5 source 127.0.0.6 source 127.0.1.0/24

Not using ip1,ip2,ip/cidr,... avoids confusion when using keywords like
usesrc, interface, etc...

Checks to the server would be done from each source, but covering the
whole range could be very slow.

The main problem I see is how to efficiently store all sources for each
server. Using the CIDR syntax can quickly allow millions of sources to
be used and if we want to use algorithms like 'leastconn', we need to
remember how many connections are still active on a particular source
(using round-robin + an index into the range would otherwise have been
one solution).
I have some ideas but I would like to know the preferred way.

Thanks.

-- 
Aurélien Nephtali