Lots to digest - responses below

Jan Scheurich <jan.scheur...@ericsson.com> writes:

> Hi Darrel,
> Let me try respond to your points below.
> Regards, Jan
>> -----Original Message-----
>> From: Darrell Ball [mailto:db...@vmware.com]
>> Sent: Thursday, 30 November, 2017 01:33
>> 
>> The idea of creating an “Conntrack Established state” specific to
>> each protocol layer, as you propose, does not adhere to any protocol
>> specifications that I am aware of.
>> 
>> 1/ UDP and ICMP do not even have such a concept as “Established”
>> connection, so having those specific protocols track “Conntrack
>> Established state” has no technical basis.
>> 
>> 2/ Even if we somehow ignored UDP and ICMP, which we cannot, TCP end
>> to end connection established state is based on a 3-WAY handshake.
>> You propose to have a “Conntrack Established state” based on a 2-way
>> handshake and build knowledge of the kernel derived Conntrack
>> Established state into the TCP state machine itself.
>> I think that is wrong as the TCP state machine should not know
>> anything about the arbitrary “kernel defined established conntrack
>> state”, based on a 2-way handshake or any other Conntrack
>> state. These are different layers.
>> Moreover, it is not even necessary, since the TCP state machine
>> generically tracks packets as valid (as does UDP and ICMP) including
>> replies.
>> So (valid && reply) tells us if we have a valid reply, in a protocol 
>> agnostic manner.
> After having re-read the kernel conntrack specification for userland
> states in http://www.iptables.info/en/connection-state.html (table
> 7-1), I tend to agree with you that the generic definition of
> "established" in the main conntrack module as "valid reply packet
> seen" would be in line with kernel semantics. And we would acknowledge
> your corresponding patch.
>> 3/ This brings us back to the question of bringing userspace
>> conntrack established state in line with the kernel conntrack
>> definition of established.
>> 
>> We don’t see a requirement of blindly copying the kernel; proper
>> modelling and technical merit is more important.
>> The discussion around points “1” and “2” above make this all the more clear.
>> 
>> The kernel uses an arbitrary definition of “conntrack established”
>> based on having received a reply packet and at a midpoint b/w the 2
>> endpoints.
> The Linux kernel definition of ESTABLISHED may be arbitrary and not
> entirely without problems in some corner cases, but we have to admit
> that it is well known and has proven to be useful in very many
> deployments of Linux conntrack at the heart of a stateful firewall
> within an OVS context, but more generally outside OVS. So it has set a
> de-facto standard.
>> The userspace connection tracker defines Established as any packet
>> that hits an existing conntrack entry.
>> Everyone thinks this definition is semantically correct and
>> intuitive. I also think that even you agree with this, based on your
>> previous e-mails and discussions.
>> The existence of a conntrack entry vs none is the key difference
>> that should be tracked for EST and this is exactly what the
>> userspace connection tracker does.
> The userspace conntrack definition of ESTABLISHED would be simple and
> logical, if we were starting from a clean slate. Unfortunately we
> aren't. The new semantics clashes with the Linux de-facto standard and
> the OVS legacy. In my eyes that is where simplicity ends and we create
> problems for users and developers.
> I would also argue that the userspace definition is just as arbitrary
> as the Linux kernel. It is different, perhaps slightly simpler, but I
> wouldn't say it is more correct than the kernel.
>> 
>> Furthermore, as discussed on another thread, any pipeline written
>> properly, where the commit of a conntrack entry happens in the
>> trusted direction, will never observe a difference b/w kernel and
>> userspace EST, since forward direction packets in the trusted
>> direction are always trusted and should not be subject to drop
>> whether NEW or EST.
>> 
>> Hence, I think we should keep userspace conntrack EST as is for now.
>> We can always revisit if we find differences w.r.t. real/proper conntrack 
>> pipelines.
>> Don’t be surprised, if I look at a pipeline and say it is not proper
>> with a reason provided, as I have done earlier. I might even say
>> something like “since your VM accepts all incoming traffic without
>> filtering, why do you even use conntrack rules for that VM”, for
>> example.
>> I know there is an infinite number of broken conntrack pipelines that can be 
>> written.
> I don't think we should lead this discussion based on whether a
> specific example conntrack pipeline is considered "proper" or
> not. Features that OVS offers to its users should have a well-defined
> semantics, no matter what datapath is present and how users choose to
> use them.
> If OVS cannot guarantee well-defined semantics across datapaths, it
> should warn users so that they can avoid pitfalls and limit themselves
> to an uncritical subset.
> With the current mismatch between kernel and userspace conntrack, the
> documentation would have to say something like:
> "It is unspecified exactly when the ct() lookup of a packet pertaining
> to a newly committed connection returns ESTABLISHED. Only when a first
> reply packet to that new connection has been looked up, that packet
> and all subsequent packets are guaranteed to be marked as
> ESTABLISHED. Before the first reply packet the lookup of packets in
> the originating direction may return NEW or ESTABLISHED, depending on
> the datapath."
> I find this difficult and confusing, defeating the purpose of
> providing simple semantics in OVS.
> I sincerely hope we can return to our agreement on the OVS conference
> and pursue your patch to align the userspace conntrack EST with the
> kernel.

I see a lot of differing viewpoints, but largely I think this default
commit behaviour looks like different preferences.  What would be better
would be a semantic the user could rely on, regardless of datapath
implementation.

I can think of good reasons to commit a connection, and have that
connection be in either a "reply unseen", or an "+est" state, and those
apply equally to both datapaths.  Given that, I'm wondering if it makes
sense to expand the commit action to something like (just illustrative):

  "actions=ct(commit:-rpl)"
  "actions=ct(commit:+est)"

and let the datapaths choose their default (when actions=ct(commit)).
In that manner, cases where the interrim node wouldn't always see a
reply packet (think async routing cases) can have the connection be
defaulted to "+est" matches (which is the default for the userspace
now).  This gives the user ultimate flexibility for how the datapath
would behave, and on a per-connection basis.  It has downsides, for
sure.  As a first approximation, the user would need to be aware of which
behavior they want, and codifying this new state would need to happen.
Then again, I see this discussion as being mostly a preference on
default behavior, and maybe there's room for an extra knob.

It should be noted, this still doesn't address users who have flow rules
which don't semantically match when moving between datapaths, but
presumably it would be easier and more predictable to give a well-known
semantic option for them to migrate.  Some users won't care; I am not
sure who would.

Just spit-balling an idea in case it's assumed that changing the default
userspace behavior is out of the question.
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to