On Tue, Oct 10, 2023 at 03:04:26PM +0200, Aleksandar Lazic wrote:
> > WASM on the other hand would provide more performance and compile-time
> > checks but I fear that it could also bring new classes of issues such as
> > higher memory usage, higher latencies, and would make it less convenient
> > to deploy updates since these would need to be rebuilt. Also we don't
> > want to put too much of the application into the load balancer. But as I
> > said I haven't had a look at the details so I don't know if we can yield
> > like in Lua, implement our own bindings for internal functions, or limit
> > the memory usage per request being processed.
> 
> Hm, how could WASM be integrated into HAProxy if not with SPOE? Right now
> I don't have any idea what the best way could be.

I don't know; maybe, just like Lua, it would run in a virtual machine that
supports yielding and bounded memory usage? Because I know that many
people don't like Lua, but that's what its ecosystem provides us: safe
usage.
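
To illustrate what I mean by yielding (just a sketch, the action name and
variable are made up), this is the kind of cooperative behaviour the Lua
engine already gives us, and that anything else we embed would need an
equivalent of:

  -- a Lua action that gives the hand back to haproxy's scheduler
  -- instead of blocking the whole process
  core.register_action("slow_check", { "http-req" }, function(txn)
      for i = 1, 10 do
          -- core.msleep() yields while waiting, so other streams
          -- keep being processed in the meantime
          core.msleep(10)
      end
      txn:set_var("txn.checked", true)
  end)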

> Willy, please take a stable seat :-)
> 
> How about using HTTP/1/2/3, gRPC or FCGI as filter protocols, to be able
> to handle the body, instead of SPOE?

I just do not know. Some already mentioned gRPC for such use cases, maybe
that could make sense for certain things (e.g. Norman's authentication),
however it seems at first glance that it's not well suited to stream
analysis. Also it's not as much the protocol as it is the way it's plugged
into the analysers that is causing us difficulties. The fact that it's
evaluated before deciding to pass the request currently prevents us from
sending streams to a remote process, even though the protocol supports
it and filters support it. The mechanism based on applets that handle
their own idle connections has reached its limits. We didn't have other
options back then in 1.7, but it seems like nowadays it should be done as
a mux with idle conns and would work way better. It's not just a matter of
choosing a protocol but really of integrating it in the middle of something
that only has two ends: a client and a server, and at multiple places in
the conversation.

> > So I feel like it's here to stay with its design limitations making it
> > unsuitable for many of the tasks it was imagined for, and that it could
> > actually be much less effort to simply remove it. Of course that's not
> > something to do between an odd and an even version, but maybe it's not
> > even too late to drop it from 2.9 if nobody cares anymore.
> 
> Well, this could be an option, from my point of view.
> 
> @Community: Could you be so kind as to tell us for which use cases you use
> SPOE, similar to Norman (
> https://www.mail-archive.com/haproxy@formilux.org/msg44127.html ), and how
> big the effort would be to migrate to a Lua filter?

That would be nice (at least to help define a more suitable successor
that doesn't miss a specific use case).

> > Or to put it in a blunt way: does anyone want 2.9 to still support
> > SPOE, whose necessary redesign never happened in 7 years despite trying
> > to find time for it, and will likely never happen? Or can we just
> > remove it? I have nothing against preserving it a little bit more if
> > there really are users, but it would be nice if their use cases,
> > successes or issues were known, and even more if the effort could be
> > spread over multiple persons.
> 
> I think it would be nice if some of the use cases relying on SPOE were
> written up in
> https://github.com/haproxy/wiki/wiki/SPOE:-Stream-Processing-Offloading-Engine
> to see how often this feature is used in HAProxy.

Well, there seem to be a few examples, all "beta" or "not prod ready",
which says it all. There are also bindings for various languages, so it's
possible that some people are using it with their own code (and that would
be perfectly fine, it was designed for this initially).

> Well, to make it simpler I would vote to remove SPOE and migrate the SPOE
> workloads to Lua, but as I currently don't use SPOE it would be really great
> to hear from users how SPOE is used and how big the work would be to migrate
> to Lua.

I'm pretty sure it's not that simple. For example, I'm aware that at haptech
we have an SSO agent that uses SPOE for the simple reason that when you
want to use LDAP, good luck finding a truly non-blocking lib! Thus you
definitely do not want that inside your process, or each attempt to
establish a connection to the LDAP server will freeze the whole process
until the server responds. That's precisely the type of stupidity that
SPOE was aimed at solving, and it does that pretty well. However, maybe
similar authentication mechanisms exist over HTTP nowadays, I don't know
(and to be honest I don't really want to know, I'm sufficiently busy with
lots of other stuff). I'm just using this as an example of why remote
process communication can remain necessary (possibly through other
protocols if needed).
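
Just to make the offloading pattern concrete (a rough sketch, all names
invented, the real directives are described in doc/SPOE.txt): the frontend
hands the request to an external agent over TCP, so the blocking LDAP work
never runs inside the haproxy process:

  # haproxy.cfg (sketch)
  frontend fe_web
      bind :8080
      filter spoe engine sso config /etc/haproxy/spoe-sso.conf
      # assuming the agent sets an "allowed" variable in session scope
      http-request deny unless { var(sess.sso.allowed) -m bool }
      default_backend be_app

  backend be_sso_agents
      mode tcp
      server agent1 127.0.0.1:12345 check

  # /etc/haproxy/spoe-sso.conf (sketch)
  [sso]
  spoe-agent sso-agent
      messages           check-request
      option var-prefix  sso
      timeout hello      2s
      timeout idle       30s
      timeout processing 100ms
      use-backend        be_sso_agents

  spoe-message check-request
      args  authz=req.hdr(Authorization)
      event on-frontend-http-request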

> > > That said I'd definitely be very interested as well. As much as
> > > handcrafted configurations are nice, one quickly reaches their
> > > maintainability limits. And if we're to stop abusing DNS again and
> > > again, proper service discovery is the way.
> > 
> > Yes I think so. I remember Marko telling us at HaproxyConf 2022 that
> > the dpapi can now consume and produce about everything that's valid
> > from an haproxy standpoint and can do it from other representations
> > (I think YAML was mentioned). This can also help some users maintain
> > and generate their configs in a way they find more convenient based
> > on the tools available to them.
> 
> Well, this implies that the dpapi would always have to run together with
> HAProxy if you want something like DNS resolving for servers or anything
> else? I don't think that's a good approach, but I understand that some
> parts have to be cleaned up; difficult decision.

No, I remember Tim raised this point a while ago, basically saying "hey,
don't break the DNS, I use it for my servers". For me simple server
resolution is fine (the original intent in fact), just like the
"do_resolve()" action. I draw a clear distinction between "this server
is known for changing address from time to time, please try to refresh
it" (e.g. a rebooting AWS instance), and "we can learn a complete farm
by sending a DNS request that leads to incomplete responses requiring
O(N^4) matching algorithms to preserve existing servers as much as
possible while still detecting the ones that disappeared and assigning
the remaining addresses to new servers".

In the first case it's more or less a periodic update of what the
libc's gethostbyname() or getaddrinfo() would have provided if the
process had been reloaded, but without having to reload. In the
second case, it's abusing the DNS protocol to emulate service
discovery.
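
To make the distinction concrete, the first case is roughly this (a sketch
from memory, names and addresses invented, check the doc for the exact
syntax):

  # plain name resolution, no service discovery
  resolvers mydns
      nameserver ns1 10.0.0.53:53
      hold valid 10s

  backend be_app
      # periodically refresh this single server's address
      server app1 app.example.com:443 check resolvers mydns resolve-prefer ipv4

  backend be_forward
      # or resolve a name at runtime, which is what do-resolve is for
      http-request do-resolve(txn.dstip,mydns,ipv4) hdr(host),host_only
      http-request set-dst var(txn.dstip)
      server clear 0.0.0.0:0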

> I think that the DNS stuff should be kept there and maybe enhanced, as it
> looks to me like some new security topics are relying on DNS more and more,
> such as ESNI, ECH, SVCB, ...

I just don't want to conflate DNS and service discovery. I'm even
fine with DNS load balancing if there are some legitimate use cases.
I just want it to go back to the basics: DNS to resolve a host name
to an IPv4/v6 address, period.

> Should this be handled by dpapi and configured via socket api or any
> upcoming api in HAProxy?

I've given some thought to this in the past, and I remember saying that,
given that the dpapi is responsible for reloading haproxy if needed,
any such important changes should go through it so that it's always
aware of the exact state. It could possibly support a snoop mode where
raw commands are fed directly through it while it remains aware of
what's being done, I don't know. It's also obvious to me that it must
remain possible for anyone to use haproxy without it. The dpapi
should only bring benefits where relevant, not constraints. That's also
why it's important to cut properly around features. I just don't want
to enter multiple such design discussions at once; it's already difficult
for me to keep my focus on the low-level stuff that needs to be finished
(e.g. CPU detection, fixing mt_lists locking, getting rid of the ring's
lock, etc.), and it's extremely hard to make such large stretches by
jumping to OSI layer 9 like this :-)

Willy
