Hi Willy,
Thanks for the nice, detailed feedback.
Overall, I agree with all of your listed points, so no need for further
discussions.
I will hopefully send the separated patches at the beginning of next week.
Just a comment and two small questions to make sure I got things correct:
> As such I'd like them to be renamed to remove this confusion.
> Maybe just call them HA_PP2_* ?
Yes, such a prefix will clean it up quite nicely IMO. Will add that to the
first patch of the series.
> [...]
> It may even encourage us to create
> smaller pools if ever deemed really useful (e.g. a 4- or 8- byte
> load balancing hash key for end-to-end consistency would only
> take a total of 32 or 40, malloc included).
Just to make sure: Right now, we don't want to optimize further for small TLVs
other than removing the second pool for the values and using the new struct
with the VLA to reduce the overhead.
For normal use, a pool with 160 B could be used now to accommodate for the new
conn_tlv_list struct and 128 B TLVs (e.g., UNIQUE_ID)?
For the authority type, it would then be a 255 + 32 B sized pool? Maybe that
could be used for the value range 128 <= x <= 255, and then, for > 255, malloc?
Other, smaller pools are future work?
>struct conn_tlv_list {
> struct list list;
>unsigned short len; // 65535 should be more than enough!
>unsigned char type;
>char contents[0];
> };
I am also a bit confused about the second struct variant of this. IMO this is
the optimal struct layout in regards to size, that I would like to use.
What is the other struct for, where `len` and `type` are switched?
Thanks again for your efforts, it's really interesting for me to work on
HAProxy.
Best,
Alexander
-Original Message-
From: Willy Tarreau
Sent: Thursday, August 10, 2023 9:18 AM
To: Stephan, Alexander
Cc: haproxy@formilux.org
Subject: Re: [PATCH] MEDIUM: sample: Implement sample fetch for arbitrary PROXY
protocol v2 TLV values
[You don't often get email from w...@1wt.eu. Learn why this is important at
https://aka.ms/LearnAboutSenderIdentification ]
Hi Alexander,
On Mon, Jul 31, 2023 at 01:11:35PM +, Stephan, Alexander wrote:
> Dear HAProxy-maintainers,
>
> As proposed by my colleague Christian Menges in [1], I've implemented
> support for fetching arbitrary TLV values for PROXY protocol V2 via a sample
> fetch.
I'm afraid I don't remember seeing this message, and it was left without
response, so it fell through the cracks, that's not cool for your colleague.
> It can be used by calling 'fc_pp_tlv' with the numerical value of the
> desired TLV type. This also fixes issue [2].
Indeed.
> Some design considerations for my approach:
> o *Do not break the existing API*: Keep methods for authority and
> unique ID, but refactor them internally with the updated, generified logic.
That's indeed preferable :-)
> o *Keep existing behavior and performance*: Pools for authority and
> unique ID are still used. Already implemented parsing logic is still
> applied. All information about a TLV payload that is already should be
> used for validation.
OK.
> o *Small memory footprint*: As there might be up to 50k connections,
> an array or hash map is too memory intensive. Therefore, a simple list
> is used. It is the best choice here in my opinion, as there are
> typically only a handful of TLVs.
I agree. At first I thought "Hmmm lists?" but we're speaking about very few
items here, less than a handfull in general, so yes, I agree that the list
might probably be the cheapest option.
> Additionally, memory pooling is used where possible. When TLV values
> become too large there is a fallback to 'malloc'.
At first glance I like this approach. We can see in field how it deals with the
traffic, but it should be OK. However I'm noticing that you put a limit to 127,
which is a bit unfortunate, because we switched the sockaddr_storage to a pool
a few years ago, and they're of size 128, so for one extra byte you could pick
items from that shared pool.
> Note that I choose to not overwrite existing TLVs in the list. This
> way, we return always the first found but would allow duplicate
> entries. This would be relevant when there are duplicate TLVs.
Good point, I hadn't thought about this. And this is desirable because our ACLs
can iterate over multiple instances of a sample. It seems that the current
function smp_fetch_fc_pp_tlv() doesn't support it by the way, and will only
return the first one. Please have a look at
smp_fetch_ssl_hello_alpn() or more generally some HTTP sample fetches that set
SMP_F_NOT_LAST. This tells the ACL engine that if the last returned entry does
not match, it can come back with the same context to retrieve the next one, and
so on. Once you reach the end, you just drop that flag and the ACL engine has
its final status. In your context you would store a retry point (maybe even the
list's pointer). There's even a