Re: [PATCH] MEDIUM: sample: Modify fetchers for req.hdrs and res.hdrs to selectively include / exclude headers

2024-01-22 Thread Ruei-Bang Chen
Hi Willy,

Thank you for the response. This is a very interesting topic. Different 
approaches do have their own pros and cons and no matter which way we decide to 
use, we have to make sure it can be extended easily.

I'll think more about this and have some more discussions within the team 
before getting back to you.

Thanks,
Ruei-Bang

From: Willy Tarreau 
Sent: Friday, January 19, 2024 6:28 AM
To: Ruei-Bang Chen 
Cc: haproxy@formilux.org 
Subject: Re: [PATCH] MEDIUM: sample: Modify fetchers for req.hdrs and res.hdrs 
to selectively include / exclude headers

Hi Ruei-Bang,

On Tue, Jan 09, 2024 at 07:18:14PM +, Ruei-Bang Chen wrote:
> Hi Willy,
>
> Hope you are doing well in the New Year!

Yep, thank you!

> I just want to bump this thread so that we can continue the discussion.
>
> I understand you probably have a lot of emails to catch up on recently so
> there is no urgency for this. Just want to make sure this thread does not get
> lost.

I've looked everywhere, both in my own mbox and the list archive and
couldn't find any trace of it, so I suspect that it possibly remained
only in your outbox, or was victim of a hiccup on the send side, so
thanks for resending!

Some comments below.

> 
> From: Ruei-Bang Chen 
> Sent: Tuesday, December 12, 2023 5:32 PM
> To: Willy Tarreau 
> Cc: haproxy@formilux.org 
> Subject: Re: [PATCH] MEDIUM: sample: Modify fetchers for req.hdrs and 
> res.hdrs to selectively include / exclude headers
>
> Hi Willy,
>
> Sorry for the late reply. I know it has been some time since our last 
> discussion.
> I finally have a chance to follow up on this  after working on some other 
> tasks
> and the Thanksgiving break.
>
> What I'm suspecting is that if
> users find it useful, it will not be long before some ask for a way to
> designate prefixes (e.g. select x-companyname-*, or exclude accept-* etc).
> The way you did it will make it easy to extend it for this, for example,
> by appending '*' to a header name, so that's fine. Hmmm well, '*' is
> actually permitted in header field names, so maybe another solution could
> be that we consider that we'd use prefixes by default and terminante names
> with ':' to designate full names. E.g. req.hdrs(accept:) would match only
> "accept" while req.hdrs(accept) would also match accept-encoding,
> accept-range etc. It's just up to us to decide (and you first since you're
> the first one to need this, so the ability to extend this and/or to factor
> arguments may be relevant to your use case.
> Yeah, I think that would be a possible ask / need from the client.
> Currently, we don't have the immediate need for this but I agree at some point
> this feature might come in handy even for us as well. I am thinking maybe
> it can be left as a separate follow-up patch after this one? Let me know if 
> you
> feel strongly that we should include this in the current patch.

Yes, I'm perfectly fine with a later follow-up. I mentioned this to make
sure that we think about it early, in case it requires to refine the syntax
*before* we corner ourselves at the wrong place, because once it's part
of a released version, you cannot change it in a breaking way anymore.
For example if we decide that an argument currently is an exact match,
it cannot later become a prefix. We still have some choice of other chars
to denote prefixes if needed, as listed in RFC9110 section 5.6.2 (header
field names are tokens):

  Tokens are short textual identifiers that do not include whitespace or
  delimiters.

  token  = 1*tchar
  tchar  = "!" / "#" / "$" / "%" / "&" / "'" / "*"
 / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
 / DIGIT / ALPHA
 ; any VCHAR, except delimiters

  Many HTTP field values are defined using common syntax components, separated
  by whitespace or specific delimiting characters. Delimiters are chosen from
  the set of US-ASCII visual characters not allowed in a token (DQUOTE and
  "(),/:;<=>?@[\]{}").

Parenthesis and comma are already used to delimit the names in the
expression, but others like "@", "?" or "/" would not cause problems
if needed.

> Similarly I'm seeing that having the '!' in the first argument does add
> some special cases everywhere down the inner loop. Maybe just having a
> new name such as "req.hdrs_exc()" to enumerate exclusion would simplify
> everything, and possibly make it easier even for APIs which will feed
> these arguments, so as to remove the special case of the first argument.
> I don't know, but feel free to explore such possibilities, it's important
> to think about how configs will be used and updated. Very often the amount
> of work needed to make the core easy to use is less than the work needed
> externally to adapt to it ;-)
> This makes sense. After discussing within the team, we think it might be 
> easier
> to just add separate new fetchers like "req.hdrs_inc", "req.hdrs_exc", 
> 

Re: [PATCH] MEDIUM: sample: Modify fetchers for req.hdrs and res.hdrs to selectively include / exclude headers

2024-01-19 Thread Willy Tarreau
Hi Ruei-Bang,

On Tue, Jan 09, 2024 at 07:18:14PM +, Ruei-Bang Chen wrote:
> Hi Willy,
> 
> Hope you are doing well in the New Year!

Yep, thank you!

> I just want to bump this thread so that we can continue the discussion.
> 
> I understand you probably have a lot of emails to catch up on recently so
> there is no urgency for this. Just want to make sure this thread does not get
> lost.

I've looked everywhere, both in my own mbox and the list archive and
couldn't find any trace of it, so I suspect that it possibly remained
only in your outbox, or was victim of a hiccup on the send side, so
thanks for resending!

Some comments below.

> 
> From: Ruei-Bang Chen 
> Sent: Tuesday, December 12, 2023 5:32 PM
> To: Willy Tarreau 
> Cc: haproxy@formilux.org 
> Subject: Re: [PATCH] MEDIUM: sample: Modify fetchers for req.hdrs and 
> res.hdrs to selectively include / exclude headers
> 
> Hi Willy,
> 
> Sorry for the late reply. I know it has been some time since our last 
> discussion.
> I finally have a chance to follow up on this  after working on some other 
> tasks
> and the Thanksgiving break.
> 
> What I'm suspecting is that if
> users find it useful, it will not be long before some ask for a way to
> designate prefixes (e.g. select x-companyname-*, or exclude accept-* etc).
> The way you did it will make it easy to extend it for this, for example,
> by appending '*' to a header name, so that's fine. Hmmm well, '*' is
> actually permitted in header field names, so maybe another solution could
> be that we consider that we'd use prefixes by default and terminante names
> with ':' to designate full names. E.g. req.hdrs(accept:) would match only
> "accept" while req.hdrs(accept) would also match accept-encoding,
> accept-range etc. It's just up to us to decide (and you first since you're
> the first one to need this, so the ability to extend this and/or to factor
> arguments may be relevant to your use case.
> Yeah, I think that would be a possible ask / need from the client.
> Currently, we don't have the immediate need for this but I agree at some point
> this feature might come in handy even for us as well. I am thinking maybe
> it can be left as a separate follow-up patch after this one? Let me know if 
> you
> feel strongly that we should include this in the current patch.

Yes, I'm perfectly fine with a later follow-up. I mentioned this to make
sure that we think about it early, in case it requires to refine the syntax
*before* we corner ourselves at the wrong place, because once it's part
of a released version, you cannot change it in a breaking way anymore.
For example if we decide that an argument currently is an exact match,
it cannot later become a prefix. We still have some choice of other chars
to denote prefixes if needed, as listed in RFC9110 section 5.6.2 (header
field names are tokens):

  Tokens are short textual identifiers that do not include whitespace or
  delimiters.

  token  = 1*tchar
  tchar  = "!" / "#" / "$" / "%" / "&" / "'" / "*"
 / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
 / DIGIT / ALPHA
 ; any VCHAR, except delimiters

  Many HTTP field values are defined using common syntax components, separated
  by whitespace or specific delimiting characters. Delimiters are chosen from
  the set of US-ASCII visual characters not allowed in a token (DQUOTE and
  "(),/:;<=>?@[\]{}").

Parenthesis and comma are already used to delimit the names in the
expression, but others like "@", "?" or "/" would not cause problems
if needed.

> Similarly I'm seeing that having the '!' in the first argument does add
> some special cases everywhere down the inner loop. Maybe just having a
> new name such as "req.hdrs_exc()" to enumerate exclusion would simplify
> everything, and possibly make it easier even for APIs which will feed
> these arguments, so as to remove the special case of the first argument.
> I don't know, but feel free to explore such possibilities, it's important
> to think about how configs will be used and updated. Very often the amount
> of work needed to make the core easy to use is less than the work needed
> externally to adapt to it ;-)
> This makes sense. After discussing within the team, we think it might be 
> easier
> to just add separate new fetchers like "req.hdrs_inc", "req.hdrs_exc", 
> "res.hdrs_inc",
> "res.hdrs_exc" to be clearer what each fetcher is doing. This way, we can 
> also make
> sure to not impact the existing flow for "req.hdrs" and "res.hdrs", wondering 
> what
> are your thoughts on this?

This could also be an idea, indeed. However, do you think about also adding
the _bin variant that builds a serialized binary block ? I think it's not
much a change, but if not done, it could be strange that only the
unrestricted list has a binary version and not the others. However I do
agree with you that it could simplify the function's implementation to have
distinct 

Re: [PATCH] MEDIUM: sample: Modify fetchers for req.hdrs and res.hdrs to selectively include / exclude headers

2024-01-09 Thread Ruei-Bang Chen
Hi Willy,

Hope you are doing well in the New Year!

I just want to bump this thread so that we can continue the discussion.

I understand you probably have a lot of emails to catch up on recently so there 
is no urgency for this. Just want to make sure this thread does not get lost.

Thanks,
Ruei-Bang

From: Ruei-Bang Chen 
Sent: Tuesday, December 12, 2023 5:32 PM
To: Willy Tarreau 
Cc: haproxy@formilux.org 
Subject: Re: [PATCH] MEDIUM: sample: Modify fetchers for req.hdrs and res.hdrs 
to selectively include / exclude headers

Hi Willy,

Sorry for the late reply. I know it has been some time since our last 
discussion.
I finally have a chance to follow up on this  after working on some other tasks
and the Thanksgiving break.

What I'm suspecting is that if
users find it useful, it will not be long before some ask for a way to
designate prefixes (e.g. select x-companyname-*, or exclude accept-* etc).
The way you did it will make it easy to extend it for this, for example,
by appending '*' to a header name, so that's fine. Hmmm well, '*' is
actually permitted in header field names, so maybe another solution could
be that we consider that we'd use prefixes by default and terminante names
with ':' to designate full names. E.g. req.hdrs(accept:) would match only
"accept" while req.hdrs(accept) would also match accept-encoding,
accept-range etc. It's just up to us to decide (and you first since you're
the first one to need this, so the ability to extend this and/or to factor
arguments may be relevant to your use case.
Yeah, I think that would be a possible ask / need from the client.
Currently, we don't have the immediate need for this but I agree at some point
this feature might come in handy even for us as well. I am thinking maybe
it can be left as a separate follow-up patch after this one? Let me know if you
feel strongly that we should include this in the current patch.

Similarly I'm seeing that having the '!' in the first argument does add
some special cases everywhere down the inner loop. Maybe just having a
new name such as "req.hdrs_exc()" to enumerate exclusion would simplify
everything, and possibly make it easier even for APIs which will feed
these arguments, so as to remove the special case of the first argument.
I don't know, but feel free to explore such possibilities, it's important
to think about how configs will be used and updated. Very often the amount
of work needed to make the core easy to use is less than the work needed
externally to adapt to it ;-)
This makes sense. After discussing within the team, we think it might be easier
to just add separate new fetchers like "req.hdrs_inc", "req.hdrs_exc", 
"res.hdrs_inc",
"res.hdrs_exc" to be clearer what each fetcher is doing. This way, we can also 
make
sure to not impact the existing flow for "req.hdrs" and "res.hdrs", wondering 
what
are your thoughts on this?

Thanks,
Ruei-Bang


From: Willy Tarreau 
Sent: Thursday, November 9, 2023 9:22 PM
To: Ruei-Bang Chen 
Cc: haproxy@formilux.org 
Subject: Re: [PATCH] MEDIUM: sample: Modify fetchers for req.hdrs and res.hdrs 
to selectively include / exclude headers

Hi Ruei-Bang,

On Fri, Nov 10, 2023 at 01:33:01AM +, Ruei-Bang Chen wrote:
> Hi team,
>
> Based on the feedback from
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.mail-archive.com%2Fhaproxy%40formilux.org%2Fmsg44153.html=05%7C01%7Cruechen%40linkedin.com%7Ca24985ccb6534318c6ba08dbe1ad10e5%7C72f988bf86f141af91ab2d7cd011db47%7C0%7C0%7C638351905684956832%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C=htOFRDEBc0GIbCF6EbcFmFZK0A6U%2FM8HGhP6OIbsi2c%3D=0,
>  I have
> attached a patch for modifying fetchers for req.hdrs and res.hdrs to
> selectively include / exclude headers.

Great!

> I have the following 3 questions while I was doing this change:
>
>   1.  Should I place the helper function skip_hdr_for_fetch  within
>   http_fetch.c  or there is a better place for this?

I think it's fine like this.

>   2.  Should I do this to smp_fetch_hdrs_bin  as well? I can update the code,
>   comments, documentation if that is needed.

Indeed, that would be an excellent idea, in order to maintain consistency
between the two.

>   3.> Not sure if this is the best style for modifying the fetch to support
>   10 arguments as it does not align with other fetchers due to the huge
>   number of characters required.

I had a look at it, and it's fine to me. What I'm suspecting is that if
users find it useful, it will not be long before some ask for a way to
designate prefixes (e.g. select x-companyname-*, or exclude accept-* etc).
The way you did it will make it easy to extend it for this, for example,
by appending '*' to a header name, so that's fine. Hmmm well, '*' is
actually permitted in header field names, so maybe another 

Re: [PATCH] MEDIUM: sample: Modify fetchers for req.hdrs and res.hdrs to selectively include / exclude headers

2023-12-12 Thread Ruei-Bang Chen
Hi Willy,

Sorry for the late reply. I know it has been some time since our last 
discussion.
I finally have a chance to follow up on this  after working on some other tasks
and the Thanksgiving break.

What I'm suspecting is that if
users find it useful, it will not be long before some ask for a way to
designate prefixes (e.g. select x-companyname-*, or exclude accept-* etc).
The way you did it will make it easy to extend it for this, for example,
by appending '*' to a header name, so that's fine. Hmmm well, '*' is
actually permitted in header field names, so maybe another solution could
be that we consider that we'd use prefixes by default and terminante names
with ':' to designate full names. E.g. req.hdrs(accept:) would match only
"accept" while req.hdrs(accept) would also match accept-encoding,
accept-range etc. It's just up to us to decide (and you first since you're
the first one to need this, so the ability to extend this and/or to factor
arguments may be relevant to your use case.
Yeah, I think that would be a possible ask / need from the client.
Currently, we don't have the immediate need for this but I agree at some point
this feature might come in handy even for us as well. I am thinking maybe
it can be left as a separate follow-up patch after this one? Let me know if you
feel strongly that we should include this in the current patch.

Similarly I'm seeing that having the '!' in the first argument does add
some special cases everywhere down the inner loop. Maybe just having a
new name such as "req.hdrs_exc()" to enumerate exclusion would simplify
everything, and possibly make it easier even for APIs which will feed
these arguments, so as to remove the special case of the first argument.
I don't know, but feel free to explore such possibilities, it's important
to think about how configs will be used and updated. Very often the amount
of work needed to make the core easy to use is less than the work needed
externally to adapt to it ;-)
This makes sense. After discussing within the team, we think it might be easier
to just add separate new fetchers like "req.hdrs_inc", "req.hdrs_exc", 
"res.hdrs_inc",
"res.hdrs_exc" to be clearer what each fetcher is doing. This way, we can also 
make
sure to not impact the existing flow for "req.hdrs" and "res.hdrs", wondering 
what
are your thoughts on this?

Thanks,
Ruei-Bang


From: Willy Tarreau 
Sent: Thursday, November 9, 2023 9:22 PM
To: Ruei-Bang Chen 
Cc: haproxy@formilux.org 
Subject: Re: [PATCH] MEDIUM: sample: Modify fetchers for req.hdrs and res.hdrs 
to selectively include / exclude headers

Hi Ruei-Bang,

On Fri, Nov 10, 2023 at 01:33:01AM +, Ruei-Bang Chen wrote:
> Hi team,
>
> Based on the feedback from
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.mail-archive.com%2Fhaproxy%40formilux.org%2Fmsg44153.html=05%7C01%7Cruechen%40linkedin.com%7Ca24985ccb6534318c6ba08dbe1ad10e5%7C72f988bf86f141af91ab2d7cd011db47%7C0%7C0%7C638351905684956832%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C=htOFRDEBc0GIbCF6EbcFmFZK0A6U%2FM8HGhP6OIbsi2c%3D=0,
>  I have
> attached a patch for modifying fetchers for req.hdrs and res.hdrs to
> selectively include / exclude headers.

Great!

> I have the following 3 questions while I was doing this change:
>
>   1.  Should I place the helper function skip_hdr_for_fetch  within
>   http_fetch.c  or there is a better place for this?

I think it's fine like this.

>   2.  Should I do this to smp_fetch_hdrs_bin  as well? I can update the code,
>   comments, documentation if that is needed.

Indeed, that would be an excellent idea, in order to maintain consistency
between the two.

>   3.> Not sure if this is the best style for modifying the fetch to support
>   10 arguments as it does not align with other fetchers due to the huge
>   number of characters required.

I had a look at it, and it's fine to me. What I'm suspecting is that if
users find it useful, it will not be long before some ask for a way to
designate prefixes (e.g. select x-companyname-*, or exclude accept-* etc).
The way you did it will make it easy to extend it for this, for example,
by appending '*' to a header name, so that's fine. Hmmm well, '*' is
actually permitted in header field names, so maybe another solution could
be that we consider that we'd use prefixes by default and terminante names
with ':' to designate full names. E.g. req.hdrs(accept:) would match only
"accept" while req.hdrs(accept) would also match accept-encoding,
accept-range etc. It's just up to us to decide (and you first since you're
the first one to need this, so the ability to extend this and/or to factor
arguments may be relevant to your use case.

If you want to slightly optimize the loop in smp_fetch_hdrs(), you should
call skip_hdr_for_fetch() only when you have at least one argument, so
that 

Re: [PATCH] MEDIUM: sample: Modify fetchers for req.hdrs and res.hdrs to selectively include / exclude headers

2023-11-09 Thread Willy Tarreau
Hi Ruei-Bang,

On Fri, Nov 10, 2023 at 01:33:01AM +, Ruei-Bang Chen wrote:
> Hi team,
> 
> Based on the feedback from
> https://www.mail-archive.com/haproxy@formilux.org/msg44153.html, I have
> attached a patch for modifying fetchers for req.hdrs and res.hdrs to
> selectively include / exclude headers.

Great!

> I have the following 3 questions while I was doing this change:
> 
>   1.  Should I place the helper function skip_hdr_for_fetch  within
>   http_fetch.c  or there is a better place for this?

I think it's fine like this.

>   2.  Should I do this to smp_fetch_hdrs_bin  as well? I can update the code,
>   comments, documentation if that is needed.

Indeed, that would be an excellent idea, in order to maintain consistency
between the two.

>   3.> Not sure if this is the best style for modifying the fetch to support
>   10 arguments as it does not align with other fetchers due to the huge
>   number of characters required.

I had a look at it, and it's fine to me. What I'm suspecting is that if
users find it useful, it will not be long before some ask for a way to
designate prefixes (e.g. select x-companyname-*, or exclude accept-* etc).
The way you did it will make it easy to extend it for this, for example,
by appending '*' to a header name, so that's fine. Hmmm well, '*' is
actually permitted in header field names, so maybe another solution could
be that we consider that we'd use prefixes by default and terminante names
with ':' to designate full names. E.g. req.hdrs(accept:) would match only
"accept" while req.hdrs(accept) would also match accept-encoding,
accept-range etc. It's just up to us to decide (and you first since you're
the first one to need this, so the ability to extend this and/or to factor
arguments may be relevant to your use case.

If you want to slightly optimize the loop in smp_fetch_hdrs(), you should
call skip_hdr_for_fetch() only when you have at least one argument, so
that by default when called with no argument, it skips it. I don't know
what the most common use case will be, to be honest, so just take this
as a hint, maybe we don't care that much about all headers given that
the overall processing around it will be expensive anyway.

> Please let me know what are your thoughts on this.

I think it's quite nice like this and you patch looks clean. Please just
consider the points above in case that gives you some ideas.

Similarly I'm seeing that having the '!' in the first argument does add
some special cases everywhere down the inner loop. Maybe just having a
new name such as "req.hdrs_exc()" to enumerate exclusion would simplify
everything, and possibly make it easier even for APIs which will feed
these arguments, so as to remove the special case of the first argument.
I don't know, but feel free to explore such possibilities, it's important
to think about how configs will be used and updated. Very often the amount
of work needed to make the core easy to use is less than the work needed
externally to adapt to it ;-)

Maybe other participants here have any idea on this matter ?

Thanks!
Willy