> > (a) It's not clear to me how the threshold upgrade is determined? What
> > triggers the record size bump internally?
>
> The forwarding mechanism does two things:
>   - the read side counts the number of consecutive iterations in which
>     read() filled the whole receive buffer. After 3 consecutive times,
>     it considers that it's a streaming transfer and sets the
>     CF_STREAMER flag on the communication channel. After 2 consecutive
>     incomplete reads, the flag disappears.
>
>   - the send side counts the number of times it can send the whole
>     buffer at once. It sets CF_STREAMER_FAST if it can flush the
>     whole buffer 3 times in a row. After 2 incomplete writes, the
>     flag disappears.
>
> I preferred to only rely on CF_STREAMER and ignore the _FAST variant
> because it would only favor high bandwidth clients (it's used to
> enable splice() in fact). But I thought that CF_STREAMER alone would
> do the right job. And your WPT test seems to confirm this, when we
> look at the bandwidth usage!
>
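
To check that I have the read-side rule right, here is a minimal sketch in
C of how I picture it. This is not HAProxy's actual code: the struct, the
function name and the constants are made up for illustration, and only the
3-full / 2-incomplete thresholds come from your description.

```c
#include <stdbool.h>
#include <stddef.h>

/* Thresholds as described in the thread; the macro names are hypothetical. */
#define FULL_READS_TO_SET      3
#define PARTIAL_READS_TO_CLEAR 2

/* Hypothetical per-channel state; "streamer" stands in for CF_STREAMER. */
struct stream_detector {
    int  full_streak;    /* consecutive reads that filled the buffer */
    int  partial_streak; /* consecutive reads that did not fill it */
    bool streamer;       /* true while the transfer looks like streaming */
};

/* Call once per read() with the byte count returned and the buffer size. */
static void detector_update(struct stream_detector *d, size_t got, size_t bufsize)
{
    if (got >= bufsize) {
        /* Full read: break any run of incomplete reads. */
        d->partial_streak = 0;
        if (d->full_streak < FULL_READS_TO_SET)
            d->full_streak++;
        if (d->full_streak >= FULL_READS_TO_SET)
            d->streamer = true;
    } else {
        /* Incomplete read: break any run of full reads. */
        d->full_streak = 0;
        if (d->partial_streak < PARTIAL_READS_TO_CLEAR)
            d->partial_streak++;
        if (d->partial_streak >= PARTIAL_READS_TO_CLEAR)
            d->streamer = false;
    }
}
```

With a zero-initialized detector, three full 16KB reads in a row set the
flag, a single short read leaves it set, and a second short read right
after it clears it again.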

Gotcha, thanks. As a follow-up question, is it possible for me to control
the size of the read buffer?


> > (b) If I understood your earlier comment correctly, HAProxy will
> > automatically begin each new request with small record size... when it
> > detects that it's a new request.
>
> Indeed. In HTTP mode, it processes transactions (request+response), not
> connections, and each new transaction starts in a fresh state where these
> flags are cleared.


Awesome.


> > This works great if we're talking to a
> > backend in "http" mode: we parse the HTTP/1.x protocol and detect when a
> > new request is being processed, etc. However, what if I'm using HAProxy to
> > terminate TLS (+alpn negotiate) and then route the data to a "tcp" mode
> > backend, which is my spdy / http/2 server talking over a non-encrypted
> > channel?
>
> Ah good point. I *suspect* that in practice it will work because :
>
>   - the last segment of the first transfer will almost always be incomplete
>     (you don't always transfer exact multiples of the buffer size);
>   - the first response for the next request will almost always be
>     incomplete (headers and not all data).
>

Ah, clever. To make this more interesting, say we have multiple streams in
flight: the frames may be interleaved and some streams may finish sooner
than others, but since multiple are in flight, chances are we'll be able to
fill the read buffer until the last stream completes, which is actually
exactly what we want: we wouldn't want to reset the window at end of each
stream, but only when the connection goes quiet!


> So if we're in this situation, this will be enough to reset the CF_STREAMER
> flag (2 consecutive incomplete reads). I think it would be worth testing it.
> A very simple way to test it in your environment would be to chain two
> instances, one in TCP mode deciphering, and one in HTTP mode.
>

That's clever. I think for a realistic test we'd need a SPDY backend
though, since that's the only way we can actually get the multiplexed
streams flowing in parallel.


> > In this instance this logic wouldn't work, since HAProxy doesn't
> > have any knowledge or understanding of spdy / http/2 streams -- we'd start
> > the entire connection with small records, but then eventually upgrade it to
> > 16KB and keep it there, correct?
>
> It's not kept, it really depends on the transfer sizes all along. It matches
> more or less what you explained at the beginning of this thread, but based
> on transfer sizes at the lower layers.


Yep, this makes sense now - thanks.


> > Any clever solutions for this? And on that note, are there future plans to
> > add "http/2" smarts to HAProxy, such that we can pick apart different
> > streams within a session, etc?
>
> Yes, I absolutely want to implement HTTP/2 but it will be time consuming and
> we won't have this for 1.5 at all. I also don't want to implement SPDY or
> too-early releases of 2.0, just because whatever we do will take a lot of
> time. HAProxy is a low level component, and each protocol adaptation is
> expensive to do. Not as expensive as what people have to do with ASICs,
> but still harder than what some other products can do by using a small lib
> to perform the abstraction.
>

Makes sense, and great to hear!


> One of the huge difficulties we'll face will be to manage multiple streams
> over one connection. I think it will change the current paradigm of how
> requests are instantiated (which already started). From the very first
> version, we instantiated one "session" upon accept(), and this session
> contains buffers on which analyzers are plugged. The HTTP parsers are
> such analyzers. All the states and counters are stored at the session
> level. In 1.5, we started to change a few things. A connection is
> instantiated upon accept, then the session is allocated after the connection
> is initialized (eg: SSL handshake complete). But splitting the sessions
> between multiple requests will be quite complex. For example, I fear
> that we'll have to always copy data because we'll have multiple
> connections on one side and a single multiplexed one on the other side.
> You can take a look at doc/internal/entities.pdf if you're interested.
>

Yep, and you guys are not the only ones that will have to go through this
architectural shift... I think many of the popular servers (Apache in
particular comes to mind) might have to seriously reconsider their
internal architecture. Not an easy thing to do, but I think it'll be worth
it. :-)

ig
