I suppose we could start writing the advice in a draft. One more point below.

On 5/3/2022 9:09 PM, Willy Tarreau wrote:
Hi Christian,

On Tue, May 03, 2022 at 05:42:56PM -0700, Christian Huitema wrote:
As I said, there appear to be two concerns. One is to protect the local
server from floods, and, as you say, this is much better done by blocking
the suspicious traffic ahead of the server. The other is, provide at least a
modicum of defense against attacks that use the QUIC server as a reflector.
For example:

* attacker sends a "carefully crafted" UDP packet to QUIC server, spoofing
the source address and port of a target,

* QUIC server processes the packet and produces the expected "response"

* target attempts to process the packet and something bad happens.

The concern here is not so much volumetric attacks, which can be mitigated
with a combination of not amplifying traffic and carefully using retry. We
are also concerned about "Request Forgery" attacks, in which the "response"
activates a bug in the target. For example, QUIC sends a long packet to a
poorly written game server, and the game server crashes when trying to
process that packet.
Sure, but this problem has existed for as long as UDP services have been reachable on the net, using ECHO, CHARGEN or RPCSS as generators or reflectors, and a variety of other services as targets.
Yes indeed. However, see below for the extra attack avenue that bounces through clients when a server pretends to "move from anycast to unicast".
There may also be variants of that attack in which an
attacking server bounces packets through a QUIC client.
This is much less likely to happen since most clients nowadays are behind
stateful NAT, so the attacking server must use the same source address as
an existing connection whose server will be attacked. However, I predict
that we'll see new privacy attacks based on IP ID generation so that an
attacker can try to guess whether or not a client currently has an
established QUIC connection with a given server. The principle will be
the same as with TCP and will consist of trying to forge a packet that appears to come from the supposed server and seeing if the client responds/retries,
causing a change in the IP ID generation of subsequent packets. I think
by now most operating systems are safe regarding this, but we'll see in
the long term.

QUIC includes a feature to facilitate moving a connection from an anycast address to a unicast address. The client finds the anycast address through the DNS, contacts the server, learns a "preferred server address" during the handshake, and validates that preferred address before using it for the rest of the connection, without risking the loss of the connection to a routing event that redirects the anycast address to a different server. Which is nice, but of course a villainous server could feed the client a bogus preferred address. That's the client bounce for you. All the server packets before the bounce are sent with the expected 5-tuple, and of course the "preferred address" itself is carried in encrypted packets.
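To make the mechanics concrete, here is a minimal client-side sketch in Python (the names are mine for illustration, not from any particular stack): the client keeps sending on the original 5-tuple until a PATH_CHALLENGE sent to the preferred address comes back in a matching PATH_RESPONSE.

import os
from dataclasses import dataclass

@dataclass(frozen=True)
class PreferredAddress:
    # Hypothetical container for the server's preferred-address parameter.
    ip: str
    port: int

class PreferredAddressMigration:
    """Client-side gate: keep the handshake 5-tuple until the new path is validated."""

    def __init__(self, handshake_addr, preferred):
        self.handshake_addr = handshake_addr
        self.preferred = preferred
        self.challenge = os.urandom(8)   # 8-byte PATH_CHALLENGE payload
        self.validated = False

    def on_path_response(self, data):
        # Only a PATH_RESPONSE echoing our exact challenge validates the new path.
        if data == self.challenge:
            self.validated = True

    def destination(self):
        # Until validation succeeds, every packet keeps the original 5-tuple, so a
        # bogus or unreachable preferred address cannot break the connection.
        if self.validated:
            return (self.preferred.ip, self.preferred.port)
        return self.handshake_addr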


Clearly, servers that can be crashed by a few magic packets have no business
being reachable from the Internet. But the concern here is that by spoofing
the source address, an attacker located outside of the server network can
reflect an attack towards a vulnerable server inside that network, even if
firewalls isolate that vulnerable server from the Internet.
This one is already solved by anti-spoofing at the edge and is not specific to QUIC; the same holds for other protocols (including ICMP
and TCP). If you do not filter your own sources at the edge, your
internal network is already exposed to attacks and scans. Such attacks
have been widely known and documented for 2 decades now, and similarly
to what you mentioned above, infrastructure that doesn't filter its own
source at the edge has no business being reachable from the net.

But it's very possible that we'll see a reload of such attacks with
QUIC connection attempts from ports 514 (syslog), 162 (snmp trap) or
even 161 (snmp) as an attempt to catch such misconfigured networks.
Again this is another reason for servers not to accept connections
from privileged source ports.

I would suggest that port filtering at the application layer be used only to decide whether to respond or not (i.e. should I send a retry for a packet that parses correctly). For example it could be suggested that packets coming from suspicious but valid source ports ought to be double-checked with a retry to preserve local resources. But that will not be used to save CPU anyway.
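Roughly something along these lines, as a sketch (the port sets below are purely illustrative and would be configurable, not a recommendation):

from enum import Enum, auto

class PortAction(Enum):
    ACCEPT = auto()   # process normally
    RETRY = auto()    # answer Initial packets with a Retry before committing state
    DROP = auto()     # do not respond at all

# Purely illustrative sets; a real deployment would make both configurable.
NEVER_CLIENT_PORTS = {0, 7, 19, 53, 111, 123, 161, 162, 389, 514, 1900, 11211}
GREYLISTED_PORTS = {443, 853}   # plausible server-to-server sources, worth a Retry

def classify_source_port(port):
    if port in GREYLISTED_PORTS:
        return PortAction.RETRY
    if port in NEVER_CLIENT_PORTS or port < 1024:
        return PortAction.DROP
    return PortAction.ACCEPT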
Yes. Makes sense. But there are four ways by which QUIC can bounce packets:

* Initial packets from the client will cause the server to send a flight of
up to 3 Handshake packets. The Retry mechanism could be used there.

* Initial packets from the client with an unsupported version number cause
the server to send a Version Negotiation packet. These are short packets,
but up to 256 bytes of the VN packet content can be predetermined by the
attacker.
256 bytes are sufficient to cause a large DNS or NTP request, so what could happen is an attacker sending a packet to a victim QUIC server while spoofing port 53 or 123 of a server known to respond with large packets; the QUIC server will respond to that server, which in turn will send a large response back to the QUIC server. As such the attacker only needs to send 256 bytes to the victim, and the victim will itself elicit the extra few kB from another location. This is another form of amplification attack, which is properly addressed by filtering these invalid source ports.

* 1RTT packets sent to a server with an unknown Connection ID may cause the
server to send a Stateless Reset packet. The Stateless Reset packet must be
shorter than the incoming packet, so there is no amplification. The content
is hard to predict, but who knows.

* After a connection has been established and verified, either party can
send a "path validation" packet from a "new" address and port. Clients or
servers will respond with a path validation packet of their own, trying to
reach the new address. The validation attempt could be repeated multiple times, typically 3 times. At least 20 bytes at the beginning of the packet
can be controlled by the attacker, and possibly many more.

If an application just drops packets from a suspicious port, it mitigates all 4 avenues of attack. If it wants precision instead, it has to write specific code at four locations. RFC 9000 details these attacks, but mostly says "protect using ingress filtering, per BCP38".
There *will* inevitably be some problems at the beginning, but one good
thing is that solutions are reasonably simple, can instantly be deployed
in the field using port filtering, and will quickly land in the code itself
because the various stack authors have full control of their code.

Port filtering at the edge will not catch the "preferred address attack", in which the server proposes a non-anycast address to the client, so I guess QUIC stacks will have to protect against that themselves. As in, if a server says "please move the connection to 10.0.0.1:53", the client should just decline and stick to the anycast address. And yes, using a configurable list makes sense.
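As a sketch, that client-side check could be as simple as the following (the declined ranges and ports are illustrative and would come from configuration):

import ipaddress

# Illustrative policy: refuse preferred addresses in private, loopback, link-local
# or multicast space, or on well-known service ports.
DECLINED_PREFERRED_PORTS = {0, 53, 123, 161, 162, 514}

def accept_preferred_address(ip, port):
    addr = ipaddress.ip_address(ip)
    if addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_multicast:
        return False
    return port not in DECLINED_PREFERRED_PORTS

# The example from above is declined; the client just keeps the anycast address.
assert accept_preferred_address("10.0.0.1", 53) is False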



It could have been great to decide that the first byte in the protocol should be random. This would have significantly reduced the possibilities of abuse by making the first byte neither predictable nor controllable. But this is easy to say afterwards and it could come with its own set of problems (e.g. fingerprinting from the random sequence) ;-)

That one will have to wait for V2. Or V3, actually.

Maintaining a public list of well-known suspicious ports can be both helpful and dangerous. It's helpful in that it shows implementers that some ranges should never appear as they're normally reserved for listening services, that some numbers are well known and widely deployed, and that some are less common but can appear anywhere, indicating that a range alone is not sufficient. This should help design a flexible filtering mechanism. But it can also be dangerous if applied as-is: this list cannot remain static, as new entries may have to be added within a few minutes to hours, and each addition will cause extra breakage, so some older entries will likely be dropped once the corresponding service stops being widely attacked. I.e. the list in question likely needs to be adjustable by configuration in the field and not be hard-coded.
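One possible shape for that, as a rough sketch (the file path and format are made up):

import os

class ReloadablePortList:
    """Suspicious source ports read from a plain text file (one port per line,
    '#' comments), re-read whenever the file changes, so no restart is needed."""

    def __init__(self, path="/etc/quic/suspicious-ports"):   # path is made up
        self.path = path
        self.mtime = None
        self.ports = set()

    def current(self):
        try:
            mtime = os.stat(self.path).st_mtime
        except OSError:
            return self.ports            # keep the last known list if the file is gone
        if mtime != self.mtime:
            with open(self.path) as f:
                lines = (line.strip() for line in f)
                self.ports = {int(s) for s in lines if s and not s.startswith("#")}
            self.mtime = mtime
        return self.ports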
AFAIK, such lists are hard coded in quite a few implementations. Even if
they are provided in a configuration file, changing the file requires
restarting the server. So "within a few minutes" may be problematic.
It's not a problem anymore once under heavy attack, believe me ;-)
I believe you. Memories of Blaster and all that.
Usually production management rules are very strict, but once a site is
under DDoS and zero useful traffic passes, all the rules suddenly change
and you don't even need a permission anymore to change a config file and
restart a process multiple times until the site recovers! The worst
moment is when it starts to work again and you would like to clean up
all the mess in your config files but you're not allowed to touch
anything anymore, as if your actions were the ones that caused the
problem in the first place.

Other approaches could work fairly well, such as keeping a rate counter of incoming packets per source port (e.g. unparsable or initial packets) which, above a moderate value, would trigger sending a retry, and above a critical value, would justify blocking the port (possibly earlier in the chain). This remains reasonably cheap to implement, though it may end up causing some flapping, as the rate will fall after the port is blocked upstream, which may lead to it being reopened before the attack is finished.
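Roughly like this, as a sketch (thresholds purely illustrative):

import time
from collections import defaultdict

class SourcePortRates:
    """Counts incoming packets per source port over a one-second window; above a
    moderate threshold answer Initials with Retry, above a critical one block the
    port (ideally earlier in the chain). Thresholds are illustrative only."""

    def __init__(self, retry_above=1000, block_above=10000):
        self.retry_above = retry_above
        self.block_above = block_above
        self.window_start = time.monotonic()
        self.counts = defaultdict(int)

    def record(self, src_port):
        now = time.monotonic()
        if now - self.window_start >= 1.0:
            self.counts.clear()          # crude fixed window, enough for a sketch
            self.window_start = now
        self.counts[src_port] += 1
        if self.counts[src_port] > self.block_above:
            return "block"
        if self.counts[src_port] > self.retry_above:
            return "retry"
        return "accept"

To limit the flapping mentioned above, a real implementation would probably add some hysteresis and only unblock a port once its rate has stayed low for a while.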
Yes. Should that really be "per source port" or "per source IP + port"?
Per port should be sufficient. From the server's perspective, there is no reason for multiple clients to collectively decide to use the same source port to send legit traffic. And using the port alone allows quickly detecting a new internet-wide attack coming from a random vulnerable service (IoT gateways, ToIP etc) regardless of their addresses. I don't know how this would be distributed, but I guess that if a source port is responsible for 100 times more traffic than all the other ones combined, it will usually be a good indicator that you don't want to deal with that traffic anymore.
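As a sketch, the indicator could be as simple as:

from collections import Counter

def dominant_port(per_port_counts, factor=100):
    # Return a source port responsible for `factor` times more traffic than all
    # the other ports combined, or None. The factor of 100 is just the rough
    # indicator mentioned above, not a tuned value.
    total = sum(per_port_counts.values())
    for port, count in per_port_counts.items():
        if count > factor * (total - count):
            return port
    return None

# Example: one port sending far more than everything else combined.
assert dominant_port(Counter({123: 50000, 51234: 200, 49152: 150})) == 123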
I would expect quite a bit of legit server-to-server traffic on ports 443 or 853, but yes, keeping a dynamic list of the most abused ports makes sense. Although that's mostly an issue for volumetric attacks, where host-based defenses are not too efficient, it still feels better than just receiving the attack and waiting until it stops.
I think we'll discover new fun things over time and will learn new
attacks and workarounds as deployments grow.
I have not heard of actual attacks "in the wild", but who knows...
It's too early, stacks are still evolving and the level of deployment
is not high yet. That doesn't mean nobody is trying to elaborate new
attacks. For example yesterday on my home servers, which do not advertise QUIC, I received one short packet from port 443 to port 443 from a host named "scanners.labs.something", and a 1238-byte one from a random port to port 443 from a host named "scan-19a.somethingelse". This shows that studies are already in progress. We just need to be patient :-)


We probably will not have to wait too long...

Thanks for the feedback.

-- Christian Huitema
