Dear Dumitru,
Over the past few days I ran additional tests to understand the issue
better before sending you the details.
Thank you for your earlier explanation. It clarified how conntrack and
sampling work in the simple "vm1 --- ls --- vm2" topology. However, I
believe my original observations still hold in topologies that include a
router.
------------------------------
Setup Recap
*Topology*: vm_a (10.2.1.5) --- ls1 --- router --- ls2 --- vm_b (10.2.3.5)
ACLs applied to a shared Port Group (pg_d559...):
- *ACL A*: from-lport – allow-related IPv4 (sample_est = 2000000)
- *ACL B*: to-lport – allow-related ICMP (sample_est = 1000000)
*Sample configuration*:
- ACL A: direction=from-lport, match="inport == @pg && ip4",
sample_est=2000000
- ACL B: direction=to-lport, match="outport == @pg && ip4 && icmp4",
sample_est=1000000
# ovn-nbctl acl-list pg_d559bf91_b95f_49c0_8e4a_bf35f15e1dcc
from-lport 1002 (inport == @pg_d559bf91_b95f_49c0_8e4a_bf35f15e1dcc && ip4) allow-related
to-lport 1002 (outport == @pg_d559bf91_b95f_49c0_8e4a_bf35f15e1dcc && ip4 && ip4.src == 0.0.0.0/0 && icmp4) allow-related
------------------------------
Expected Behavior (based on your explanation)
- *First ICMP request*: no sample (ct=new).
- *First ICMP reply*:
  - One sample from the *ingress pipeline* (sample_est = 1000000)
  - One sample from the *egress pipeline* (sample_est = 2000000)
  → *Total: 2 samples* for the reply. I agree this is the correct result.
------------------------------
Actual Behavior Observed
On the *first ICMP reply*, I see:
- *3 samples total*:
  - *2 samples* in the *ingress pipeline*, both with obs_point_id=1000000
  - *1 sample* in the egress pipeline, with obs_point_id=2000000
This results in *duplicated sampling actions for a single logical
datapath flow* within the ingress pipeline.
Evidence:
# ovs-dpctl dump-flows | grep 10.2.1.5
recirc_id(0x1d5),in_port(6),ct_state(-new+est-rel+rpl-inv+trk),ct_mark(0x20020/0xff0031),ct_label(0xf4240000000000000000000000000),eth(src=fa:16:3e:6b:42:8e,dst=fa:16:3e:dd:02:c0),eth_type(0x0800),ipv4(src=10.2.1.5,dst=10.2.3.5,proto=1,ttl=64,frag=no),
packets:299, bytes:29302, used:0.376s,
actions:userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554437,obs_point_id=1000000,output_port=4294967295)),userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554437,obs_point_id=1000000,output_port=4294967295)),ct_clear,set(eth(src=fa:16:3e:d5:7b:d1,dst=fa:16:3e:f8:af:7d)),set(ipv4(ttl=63)),ct(zone=21),recirc(0x1d6)
# recirc_id(0x1d5): two flow_sample(...) actions with same metadata (1000000)
recirc_id(0x1d6),in_port(6),ct_state(-new+est-rel+rpl-inv+trk),ct_mark(0x20000/0xff0031),ct_label(0x1e8480000000000000000000000000),eth(dst=fa:16:3e:f8:af:7d),eth_type(0x0800),ipv4(dst=10.2.3.5,frag=no),
packets:299, bytes:29302, used:0.376s,
actions:userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554439,obs_point_id=2000000,output_port=4294967295)),9
# plus one flow_sample(...) later in the pipeline with metadata (2000000)
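To make the duplication easier to spot in long dumps, here is a minimal Python sketch that counts identical flow_sample(...) actions within one datapath flow. It is my own helper, not part of OVS: the function names and the regular expression are assumptions based only on the text format printed above, and the action lists are abbreviated versions of the two flows.

```python
import re
from collections import Counter

# Captures the fields that identify a sample in the dump-flows text format.
SAMPLE_RE = re.compile(
    r"flow_sample\(probability=\d+,collector_set_id=(\d+),"
    r"obs_domain_id=(\d+),obs_point_id=(\d+)"
)


def duplicated_samples(flow_line):
    """Return {(collector_set_id, obs_domain_id, obs_point_id): count} for
    every sample tuple that appears more than once in one datapath flow."""
    counts = Counter(SAMPLE_RE.findall(flow_line))
    return {key: n for key, n in counts.items() if n > 1}


def sample(obs_domain_id, obs_point_id):
    # Hypothetical helper: rebuilds the action text as it appears in the dump.
    return ("userspace(pid=4294967295,flow_sample(probability=65535,"
            f"collector_set_id=2,obs_domain_id={obs_domain_id},"
            f"obs_point_id={obs_point_id},output_port=4294967295))")


# Abbreviated action lists of the recirc_id(0x1d5) and recirc_id(0x1d6) flows.
flow_1d5 = "recirc_id(0x1d5), actions:" + ",".join(
    [sample(33554437, 1000000), sample(33554437, 1000000),
     "ct_clear", "recirc(0x1d6)"])
flow_1d6 = "recirc_id(0x1d6), actions:" + ",".join(
    [sample(33554439, 2000000), "9"])

print(duplicated_samples(flow_1d5))  # the ingress flow reports one duplicate
print(duplicated_samples(flow_1d6))  # the egress flow reports none
```

Feeding each line of the real dump-flows output to duplicated_samples() should flag only the recirc_id(0x1d5) flow.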
Also confirmed via IPFIX stats:
# IPFIX before ping
sampled pkts: 192758
# After a single ping
sampled pkts: 192761 → Δ = 3
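For repeated runs, the delta can be computed from two captures of the dump-ipfix-flow output. This is a small Python sketch under the assumption that only one collector set id is present and that the "sampled pkts=" field keeps the format shown in my attachment; the function name is my own.

```python
import re


def sampled_pkts(dump_output):
    """Extract the first 'sampled pkts' counter from dump-ipfix-flow text."""
    match = re.search(r"sampled pkts=(\d+)", dump_output)
    if match is None:
        raise ValueError("no 'sampled pkts=' field found")
    return int(match.group(1))


# Counter lines captured before and after a single ping.
before = ("id 2: flows=2227, current flows=2, sampled pkts=192758, "
          "ipv4 ok=192758, ipv6 ok=0, tx pkts=24193")
after = ("id 2: flows=2227, current flows=2, sampled pkts=192761, "
         "ipv4 ok=192761, ipv6 ok=0, tx pkts=24193")

delta = sampled_pkts(after) - sampled_pkts(before)
print(delta)
```

With the two captures above, the difference is the 3 samples attributed to the single reply.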
------------------------------
Additional Findings
- The issue *only occurs* when the VMs are on *separate logical switches
connected by a router*.
- If both VMs are on the *same logical switch*, each ACL is correctly
sampled only once.
- The duplicated sampling occurs *even if the sampled ACLs are unrelated*
(for example a to-lport ICMP ACL and a from-lport IPv6 ACL), as long as
both have sample_est and belong to the same Port Group.
- The error can be reproduced *even when only vm_a's Port Group has the
sampling ACLs*. vm_b does not require any sampling configuration for the
issue to occur.
------------------------------
Another Reproducible Scenario (Minimal)
Port Group A on vm_a with:
- ACL A: from-lport IPv4 (with or without sample_est)
- ACL B: to-lport ICMP, sample_est=1000000
- ACL C: from-lport IPv6, sample_est=2000000
Port Group B on vm_b:
- No sampling required
- ACLs that allow from-lport and to-lport traffic
When pinging vm_a from vm_b, the ICMP reply still results in *two samples
with obs_point_id=1000000*.
------------------------------
📌 Key Takeaway
I believe this confirms that the IPFIX duplication is *not caused by
conntrack behavior*. Instead, *multiple ACLs with sample_est on the same
Port Group (in different directions) produce two identical
userspace(flow_sample(...)) actions* in the same datapath flow.
------------------------------
To avoid overloading the email, I’ve included more detailed output and
explanations in the attachment. This email uses formatting elements such as
icons, headers, and dividers for clarity. If you experience any display
issues, please let me know and I’ll avoid using them in future messages.
Please tell me if I can run any additional traces. I’m happy to assist
further.
Best regards,
*Oscar*
On Fri, May 9, 2025 at 7:16 PM Dumitru Ceara <[email protected]> wrote:
> On 5/9/25 2:14 PM, Dumitru Ceara wrote:
> > On 5/9/25 5:38 AM, Trọng Đạt Trần wrote:
> >> Hi Dimitru,
> >>
> >
> > Hi Oscar,
> >
> >
> >> Thank you for pointing that out.
> >>
> >> To clarify: the terms “inbound” and “outbound” in my previous message
> >> were used from the *VM’s perspective*.
> >>
> >>
> >> Topology:
> >>
> >> |vm_a ---- network1 ---- router ---- network2 ---- vm_b |
> >>
> >>
> >> ACLs:
> >>
> >> *
> >>
> >> *ACL A*: allow-related VMs to *send* IPv4 traffic (|direction=from-
> >> lport|)
> >>
> >> *
> >>
> >> *ACL B*: allow-related VMs to *receive* ICMP traffic (|direction=to-
> >> lport|)
> >>
> >> I’ve attached both the *Northbound and Southbound database dumps* to
> >> ensure the full context is available.
> >>
> >
> > Thanks for the info, I tried locally with a simplified setup where I
> > emulate your topology:
> >
> > switch c9c171ef-849c-436d-b3f9-73d83b9c4e5d (ls)
> > port vm2
> > addresses: ["00:00:00:00:00:02"]
> > port vm1
> > addresses: ["00:00:00:00:00:01"]
> >
> > Those two VIFs are in a port group:
> >
> > # ovn-nbctl list port_group
> > _uuid : 7e7a96b9-e708-4eea-b380-018314f2435c
> > acls : [1d0e7b71-ff03-4c78-ace4-2448bf237e11,
> > 7cb023e9-fee5-4576-a67d-ce1f5d98805b]
> > external_ids : {}
> > name : pg
> > ports : [d991baa6-21b0-4d46-a15d-71b9e8d6708d,
> > f2c5679c-d891-4d34-8402-8bc2047fba61]
> >
> > With two ACLs applied:
> > # ovn-nbctl acl-list pg
> > from-lport 100 (inport==@pg && ip4) allow-related
> > to-lport 200 (outport==@pg && ip4 && icmp4) allow-related
> >
> > Both ACLs have only sampling for established traffic (sample_est) set:
> > # ovn-nbctl list acl
> > _uuid : 1d0e7b71-ff03-4c78-ace4-2448bf237e11
> > action : allow-related
> > direction : from-lport
> > match : "inport==@pg && ip4"
> > priority : 100
> > sample_est : 23153fae-0a73-4f86-bdf2-137e76647da8
> > sample_new : []
> >
> > _uuid : 7cb023e9-fee5-4576-a67d-ce1f5d98805b
> > action : allow-related
> > direction : to-lport
> > match : "outport==@pg && ip4 && icmp4"
> > priority : 200
> > sample_est : 42391c82-23d2-4f2b-a7b9-88afaa68282c
> > sample_new : []
> >
> > # ovn-nbctl list sample
> > _uuid : 23153fae-0a73-4f86-bdf2-137e76647da8
> > collectors : [82540855-dcd4-44e4-8354-e08a972500cd]
> > metadata : 2000000
> >
> > _uuid : 42391c82-23d2-4f2b-a7b9-88afaa68282c
> > collectors : [82540855-dcd4-44e4-8354-e08a972500cd]
> > metadata : 1000000
> >
> > Then I send a single ICMP echo packet from vm2 towards vm1. The ICMP
> > echo hits both ACLs but because it's the packet initiating the session
> > doesn't generate a sample (sample_new is not set in the ACLs). Instead
> > 2 conntrack entries are created for the ICMP session:
> >
> > - one in the CT zone of vm2 - here the from-lport ACL is hit so the
> > sample_est metadata of the from-lport ACL (200000) is stored along in
> > the conntrack state
> >
> > - one in the CT zone of vm1 - here the tolport ACL is hit so the
> > sample_est metadata of the to-lport ACL (100000) is stored along in the
> > conntrack state
> >
> > The ICMP echo packet reaches vm1 which replies with ICMP ECHO Reply.
> >
> > For the reply the CT zone of vm1 is first checked, we match the existing
> > conntrack entry (its state moves to "established") and a sample for the
> > stored metadata, 100000, is generated. Then, in the egress pipeline,
> > the CT zone of vm2 is checked, we match the other existing conntrack
> > entry (its state also moves to "established") and a sample for the
> > stored metadata, 200000, is generated.
> >
> > This seems correct to me. Stats also seem to confirm that:
> > # ip netns exec vm2 ping 42.42.42.2 -c1
> > PING 42.42.42.2 (42.42.42.2) 56(84) bytes of data.
> > 64 bytes from 42.42.42.2: icmp_seq=1 ttl=64 time=1.46 ms
> >
> > --- 42.42.42.2 ping statistics ---
> > 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> > rtt min/avg/max/mdev = 1.455/1.455/1.455/0.000 ms
> >
> > # ovs-ofctl dump-ipfix-flow br-int
> > NXST_IPFIX_FLOW reply (xid=0x2): 1 ids
> > id 2: flows=2, current flows=0, sampled pkts=2, ipv4 ok=2, ipv6
> > ok=0, tx pkts=11
> > pkts errs=0, ipv4 errs=0, ipv6 errs=0, tx errs=11
> >
> > But then, when I increase the number of packets things become more
> > interesting. ICMP echos also generate samples. And while that might
> > seem like a bug, it's not. :)
> >
> > When ping sends multiple packets for a single invocation it uses the
> > same ICMP ID and just increments the ICMP seq, e.g.:
> >
> > 14:07:41.986618 00:00:00:00:00:02 > 00:00:00:00:00:01, ethertype IPv4
> > (0x0800), length 98: (tos 0x0, ttl 64, id 58647, offset 0, flags [DF],
> > proto ICMP (1), length 84)
> > 42.42.42.3 > 42.42.42.2: ICMP echo request, id 35717, seq 1, length
> 64
> >
> > 14:07:42.988077 00:00:00:00:00:02 > 00:00:00:00:00:01, ethertype IPv4
> > (0x0800), length 98: (tos 0x0, ttl 64, id 59085, offset 0, flags [DF],
> > proto ICMP (1), length 84)
> > 42.42.42.3 > 42.42.42.2: ICMP echo request, id 35717, seq 2, length
> 64
> >
> > But conntrack doesn't use the ICMP ID in the key for the session it
> > installs:
>
> Sorry about the typo, I meant to say "conntrack doesn't use the ICMP SEQ
> in the key for the session it installs, it only uses the ICMP ID".
>
> >
> > # ovs-appctl dpctl/dump-conntrack | grep 42.42.42
> >
> icmp,orig=(src=42.42.42.3,dst=42.42.42.2,id=35628,type=8,code=0),reply=(src=42.42.42.2,dst=42.42.42.3,id=35628,type=0,code=0),zone=4,mark=131104,labels=0xf4240000000000000000000000000
> >
> icmp,orig=(src=42.42.42.3,dst=42.42.42.2,id=35628,type=8,code=0),reply=(src=42.42.42.2,dst=42.42.42.3,id=35628,type=0,code=0),zone=6,mark=131072,labels=0x1e8480000000000000000000000000
> >
> > So, subsequent ICMP requests will match on these two existing
> > established entries and (because sampling_est) is configured samples are
> > generated for them too.
> >
> > That's also visible in the datapath flows that forward packets in the
> > "original" direction (ICMP ECHOs in our case):
> >
> > # ovs-appctl dpctl/dump-flows | grep sample | grep '\-rpl'
> >
> recirc_id(0x29),in_port(3),ct_state(-new+est-rel-rpl-inv+trk),ct_mark(0x20000/0xff0071),ct_label(0x1e8480000000000000000000000000),eth(src=00:00:00:00:00:02,dst=00:00:00:00:00:01),eth_type(0x0800),ipv4(proto=1,frag=no),
> > packets:8, bytes:784, used:2.342s,
> >
> actions:userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554434,obs_point_id=2000000,output_port=4294967295)),ct(commit,zone=6,mark=0x20000/0xff0071,label=0x1e8480000000000000000000000000/0xffffffffffff00000000000000000000,nat(src)),ct(zone=4),recirc(0x2a)
> >
> >
> recirc_id(0x2a),in_port(3),ct_state(-new+est-rel-rpl-inv+trk),ct_mark(0x20020/0xff0071),ct_label(0xf4240000000000000000000000000),eth(src=00:00:00:00:00:02,dst=00:00:00:00:00:00/ff:ff:00:00:00:00),eth_type(0x0800),ipv4(proto=1,frag=no),
> > packets:8, bytes:784, used:2.342s,
> >
> actions:userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554434,obs_point_id=1000000,output_port=4294967295)),ct(commit,zone=4,mark=0x20020/0xff0071,label=0xf4240000000000000000000000000/0xffffffffffff00000000000000000000,nat(src)),1
> >
> > So, for a less complicated test, maybe you should try with UDP/TCP
> instead.
> >
> > I hope that clarifies your doubts.
> >
> > Best regards,
> > Dumitru
> >
> >> Best regards,
> >>
> >> Oscar
> >>
> >>
> >> On Thu, May 8, 2025 at 8:11 PM Dumitru Ceara <[email protected]
> >> <mailto:[email protected]>> wrote:
> >>
> >> Hi Oscar,
> >>
> >> On 5/6/25 12:31 PM, Trọng Đạt Trần wrote:
> >> > As requested, I’ve attached additional tracing information
> related to
> >> > the sampling duplication issue.
> >> >
> >> > *
> >> >
> >> > The file |ofproto_trace.log| contains the full output of
> |ofproto/
> >> > trace| commands.
> >> >
> >> > *
> >> >
> >> > The archive |ovn-detrace.tar.gz| includes six separate files,
> each
> >> > corresponding to an |ovn-detrace| output for a flow I believe
> is
> >> > involved in the duplicated sampling.
> >> >
> >> > Since I’m not fully confident in how to use |--ct-next option|,
> I’ve
> >> > included traces for all six related flows to ensure completeness.
> >> >
> >> > Please let me know if you need further details, or if I should
> re-run
> >> > any commands with additional options.
> >> >
> >>
> >> This seems fairly easy to reproduce locally for investigation; I
> didn't
> >> try yet though. However, would you mind sharing your OVN NB
> database
> >> file (I'm assuming this is a test environment)?
> >>
> >> I would like to make sure we don't have any misunderstanding
> because the
> >> terms you use below in your ACL description (e.g.,
> "outbound"/"inbound")
> >> are not standard terms. Having the actual ACL (and the rest of the
> NB)
> >> contents will make it easier to debug.
> >>
> >> Thanks,
> >> Dumitru
> >>
> >> > Best regards,
> >> >
> >> > *Oscar*
> >> >
> >> >
> >> > On Tue, May 6, 2025 at 4:15 PM Adrián Moreno <[email protected]
> >> <mailto:[email protected]>
> >> > <mailto:[email protected] <mailto:[email protected]>>> wrote:
> >> >
> >> > On Tue, May 06, 2025 at 11:48:07AM +0700, Trọng Đạt Trần
> wrote:
> >> > > Dear Adrián,
> >> > >
> >> > > Thank you for your response. I’ve applied your suggestion
> to use
> >> > separate
> >> > > sample entries for each ACL. However, I am still seeing
> >> unexpected
> >> > behavior
> >> > > in the IPFIX output that I’d like to clarify.
> >> > > Test Setup (Same as Before)
> >> > >
> >> > > vm_a ---- network1 ---- router ---- network2 ---- vm_b
> >> > >
> >> > >
> >> > > -
> >> > >
> >> > > Two ACLs:
> >> > > -
> >> > >
> >> > > ACL A: allow-related *outbound* IPv4
> >> > > -
> >> > >
> >> > > ACL B: allow-related *inbound* ICMP
> >> > > -
> >> > >
> >> > > ACLs applied symmetrically to both VMs.
> >> > > -
> >> > >
> >> > > Test traffic: ICMP request from vm_b to vm_a, and reply
> from
> >> > vm_a to vm_b
> >> > > .
> >> > >
> >> > > Key Problem Observed
> >> > >
> >> > > When sampling is enabled on *both* ACLs, the IPFIX record
> for
> >> > *flow (3)*
> >> > > (the ICMP reply from vm_a → router) shows *120 packets/min*.
> >> > >
> >> > > However:
> >> > >
> >> > > -
> >> > >
> >> > > If *only ACL B* (inbound ICMP) is sampled → (3) = 60
> >> packets/min
> >> > > -
> >> > >
> >> > > If *only ACL A* (outbound IP4) is sampled → (3) not
> present
> >> > > -
> >> > >
> >> > > If both are sampled → (3) = 120 packets/min
> >> > >
> >> > > This suggests that *flow (3) is being sampled twice* — even
> >> though it
> >> > > represents a *single logical flow and matches only ACL B*.
> >> > > IPFIX Observations
> >> > > FlowDescriptionExpectedActual
> >> > > (1) vm_b → router (ICMP request) 60 pkt/m 60
> >> > > (2) router → vm_a (ICMP request) 60 pkt/m 60
> >> > > (3) vm_a → router (ICMP reply) 60 pkt/m 120 ⚠️
> >> > > (4) router → vm_b (ICMP reply) 60 pkt/m 60
> >> >
> >> > This is not what I'd expect, maybe Dumitru knows?
> >> >
> >> > Could you attach ofproto/trace and ovn-detrce outputs from
> both
> >> > directions?
> >> >
> >> > Thanks.
> >> > Adrián
> >> >
> >>
>
>
With the setup you are using (vm1 --- logical_switch --- vm2), IPFIX samples are NOT duplicated and sampling works correctly, as you explained.
However, IPFIX sampling behaves incorrectly when vm1 and vm2 are on different logical switches connected by a router (vm_a ---- ls1 ---- router ---- ls2 ---- vm_b).
I. Problem
My setup: vm_a (10.2.1.5) ---- ls1 ---- router ---- ls2 ---- vm_b (10.2.3.5)
With two ACLs applied:
- ACL A: allow-related VMs to send IPv4 traffic (direction=from-lport)
- ACL B: allow-related VMs to receive ICMP traffic (direction=to-lport)
# ovn-nbctl acl-list pg_d559bf91_b95f_49c0_8e4a_bf35f15e1dcc
from-lport 1002 (inport == @pg_d559bf91_b95f_49c0_8e4a_bf35f15e1dcc && ip4) allow-related
to-lport 1002 (outport == @pg_d559bf91_b95f_49c0_8e4a_bf35f15e1dcc && ip4 && ip4.src == 0.0.0.0/0 && icmp4) allow-related
Both ACLs set sample_est:
# ovn-nbctl list acl
(ovn-nb-db)[root@Openstack-controller-1-ovn-scale-test /]# ovn-nbctl list acl f2886845-e9de-4a4e-a104-f0e06b7e7cbd
_uuid : f2886845-e9de-4a4e-a104-f0e06b7e7cbd
action : allow-related
direction : to-lport
match : "outport == @pg_d559bf91_b95f_49c0_8e4a_bf35f15e1dcc && ip4 && ip4.src == 0.0.0.0/0 && icmp4"
priority : 1002
sample_est : 62bfeefb-30c0-471d-9775-05274b29c8c7
sample_new : []
tier : 0
(ovn-nb-db)[root@Openstack-controller-1-ovn-scale-test /]# ovn-nbctl list acl dbf38067-85e2-44cf-b37e-362f1c8eb6ae
_uuid : dbf38067-85e2-44cf-b37e-362f1c8eb6ae
action : allow-related
direction : from-lport
match : "inport == @pg_d559bf91_b95f_49c0_8e4a_bf35f15e1dcc && ip4"
priority : 1002
sample_est : dd173a59-c090-403a-9212-6dfec53e999a
sample_new : []
tier : 0
# ovn-nbctl list sample
(ovn-nb-db)[root@Openstack-controller-1-ovn-scale-test /]# ovn-nbctl --no-leader-only list sample
_uuid : dd173a59-c090-403a-9212-6dfec53e999a
collectors : [7720cdf4-e6f4-4cc8-a51a-9b3d8d7df516]
metadata : 2000000
_uuid : 62bfeefb-30c0-471d-9775-05274b29c8c7
collectors : [7720cdf4-e6f4-4cc8-a51a-9b3d8d7df516]
metadata : 1000000
Result of this setup:
- The first ICMP flow:
  - The ICMP request packet from vm_b to vm_a is not sampled because it belongs to a new connection (ct=new) -> Correct, as you mentioned
  - The ICMP reply packet from vm_a to vm_b is where the result is not as expected:
    + You said: "For the reply the CT zone of vm1 is first checked, we match the existing conntrack entry (its state moves to "established") and a sample for the stored metadata, 100000, is generated."
    ---> But with the setup I provided, the CT zone of vm_a (vm1 in your example) is checked and TWO SAMPLES for the stored metadata (1000000) are generated. ==> Incorrect, should be ONE sample
    + In the egress pipeline, a sample for the stored metadata, 2000000, is generated. -> Correct
Stats taken from my environment:
#### IPFIX before ping:
# ovs-ofctl dump-ipfix-flow br-int
NXST_IPFIX_FLOW reply (xid=0x2): 1 ids
id 2: flows=2227, current flows=2, sampled pkts=192758, ipv4 ok=192758, ipv6 ok=0, tx pkts=24193
pkts errs=0, ipv4 errs=0, ipv6 errs=0, tx errs=0
#### IPFIX after a single ping:
# ovs-ofctl dump-ipfix-flow br-int
NXST_IPFIX_FLOW reply (xid=0x2): 1 ids
id 2: flows=2227, current flows=2, sampled pkts=192761, ipv4 ok=192761, ipv6 ok=0, tx pkts=24193
pkts errs=0, ipv4 errs=0, ipv6 errs=0, tx errs=0
=> The ICMP reply packet has been sampled 3 times: TWICE in the ingress pipeline and once in the egress pipeline
#### That's also visible in the datapath flows
# ovs-dpctl dump-flows system@ovs-system | egrep "10.2.1.5|10.2.3.5"
recirc_id(0),in_port(6),eth(src=fa:16:3e:6b:42:8e,dst=fa:16:3e:dd:02:c0),eth_type(0x0800),ipv4(src=10.2.1.5,proto=1,frag=no), packets:299, bytes:29302, used:0.376s, actions:ct(zone=10),recirc(0x1d5)
recirc_id(0x1d5),in_port(6),ct_state(-new+est-rel+rpl-inv+trk),ct_mark(0x20020/0xff0031),ct_label(0xf4240000000000000000000000000),eth(src=fa:16:3e:6b:42:8e,dst=fa:16:3e:dd:02:c0),eth_type(0x0800),ipv4(src=10.2.1.5,dst=10.2.3.5,proto=1,ttl=64,frag=no), packets:299, bytes:29302, used:0.376s, actions:userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554437,obs_point_id=1000000,output_port=4294967295)),userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554437,obs_point_id=1000000,output_port=4294967295)),ct_clear,set(eth(src=fa:16:3e:d5:7b:d1,dst=fa:16:3e:f8:af:7d)),set(ipv4(ttl=63)),ct(zone=21),recirc(0x1d6)
recirc_id(0x1d6),in_port(6),ct_state(-new+est-rel+rpl-inv+trk),ct_mark(0x20000/0xff0031),ct_label(0x1e8480000000000000000000000000),eth(dst=fa:16:3e:f8:af:7d),eth_type(0x0800),ipv4(dst=10.2.3.5,frag=no), packets:299, bytes:29302, used:0.376s, actions:userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554439,obs_point_id=2000000,output_port=4294967295)),9
recirc_id(0),in_port(9),eth(src=fa:16:3e:f8:af:7d,dst=fa:16:3e:d5:7b:d1),eth_type(0x0800),ipv4(src=10.2.3.5,proto=1,frag=no), packets:299, bytes:29302, used:0.376s, actions:ct(zone=21),recirc(0x1d3)
recirc_id(0x1d3),in_port(9),ct_state(+new-est-rel-rpl-inv+trk),ct_mark(0/0xff0031),ct_label(0/0xffffffffffffffffffffffff),eth(src=fa:16:3e:f8:af:7d,dst=fa:16:3e:d5:7b:d1),eth_type(0x0800),ipv4(src=10.2.3.5,dst=10.2.1.5,proto=1,ttl=64,frag=no), packets:299, bytes:29302, used:0.376s, actions:ct(commit,zone=21,mark=0x20000/0xff0031,label=0x1e8480000000000000000000000000/0xffffffff000000000000000000000000,nat(src)),set(eth(src=fa:16:3e:dd:02:c0,dst=fa:16:3e:6b:42:8e)),set(ipv4(ttl=63)),ct(zone=10),recirc(0x1d4)
recirc_id(0x1d4),in_port(9),ct_state(+new-est-rel-rpl-inv+trk),ct_mark(0/0xff0031),ct_label(0/0xffffffffffffffffffffffff),eth(src=fa:16:3e:dd:02:c0,dst=fa:16:3e:6b:42:8e),eth_type(0x0800),ipv4(dst=10.2.1.5,proto=1,frag=no), packets:299, bytes:29302, used:0.376s, actions:ct(commit,zone=10,mark=0x20020/0xff0031,label=0xf4240000000000000000000000000/0xffffffff000000000000000000000000,nat(src)),6
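The ct_label values in these flows appear to carry the sample_est metadata in their upper 32 bits, which matches the mask 0xffffffff000000000000000000000000 used in the ct(commit, ...) actions. The following Python sketch decodes a label under that assumption; the bit layout is inferred from the dumps in this email, not taken from the OVN source, and the function name is my own.

```python
def label_to_metadata(ct_label_hex):
    """Decode the sample metadata from a ct_label hex string.

    Assumption (inferred from the flows, not from OVN code): the metadata
    occupies bits 96..127 of the 128-bit label.
    """
    return (int(ct_label_hex, 16) >> 96) & 0xFFFFFFFF


# Labels committed in zone 10 and zone 21, written as 0x<value> plus 24 zeros.
zone10_label = "0xf4240" + "0" * 24
zone21_label = "0x1e8480" + "0" * 24

print(label_to_metadata(zone10_label))  # metadata of the to-lport ICMP ACL
print(label_to_metadata(zone21_label))  # metadata of the from-lport IPv4 ACL
```

If this reading is right, the obs_point_id in each flow_sample action is simply the metadata stored in the conntrack entry of the zone being checked, so two identical actions would mean the same entry is sampled twice.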
- The second and subsequent ICMP flows:
  - ICMP request packets from vm_b to vm_a are sampled because they match an "established" connection -> Correct
  - ICMP reply packets from vm_a to vm_b behave as in the first flow:
    + The CT zone of vm_a is checked and TWO SAMPLES for the stored metadata (1000000) are generated ==> Incorrect, should be ONE sample
    + In the egress pipeline, a sample for the stored metadata, 2000000, is generated. -> Correct
#### The datapath flows from the second ICMP flow onward:
recirc_id(0),in_port(6),eth(src=fa:16:3e:6b:42:8e,dst=fa:16:3e:dd:02:c0),eth_type(0x0800),ipv4(src=10.2.1.5,proto=1,frag=no), packets:538, bytes:52724, used:0.673s, actions:ct(zone=10),recirc(0x1d5)
recirc_id(0x1d5),in_port(6),ct_state(-new+est-rel+rpl-inv+trk),ct_mark(0x20020/0xff0031),ct_label(0xf4240000000000000000000000000),eth(src=fa:16:3e:6b:42:8e,dst=fa:16:3e:dd:02:c0),eth_type(0x0800),ipv4(src=10.2.1.5,dst=10.2.3.5,proto=1,ttl=64,frag=no), packets:538, bytes:52724, used:0.673s, actions:userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554437,obs_point_id=1000000,output_port=4294967295)),userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554437,obs_point_id=1000000,output_port=4294967295)),ct_clear,set(eth(src=fa:16:3e:d5:7b:d1,dst=fa:16:3e:f8:af:7d)),set(ipv4(ttl=63)),ct(zone=21),recirc(0x1d6)
recirc_id(0x1d6),in_port(6),ct_state(-new+est-rel+rpl-inv+trk),ct_mark(0x20000/0xff0031),ct_label(0x1e8480000000000000000000000000),eth(dst=fa:16:3e:f8:af:7d),eth_type(0x0800),ipv4(dst=10.2.3.5,frag=no), packets:538, bytes:52724, used:0.673s, actions:userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554439,obs_point_id=2000000,output_port=4294967295)),9
recirc_id(0),in_port(9),eth(src=fa:16:3e:f8:af:7d,dst=fa:16:3e:d5:7b:d1),eth_type(0x0800),ipv4(src=10.2.3.5,proto=1,frag=no), packets:538, bytes:52724, used:0.673s, actions:ct(zone=21),recirc(0x1d3)
recirc_id(0x1d3),in_port(9),ct_state(-new+est-rel-rpl-inv+trk),ct_mark(0x20000/0xff0031),ct_label(0x1e8480000000000000000000000000),eth(src=fa:16:3e:f8:af:7d,dst=fa:16:3e:d5:7b:d1),eth_type(0x0800),ipv4(src=10.2.3.5,dst=10.2.1.5,proto=1,ttl=64,frag=no), packets:83, bytes:8134, used:0.672s, actions:userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554439,obs_point_id=2000000,output_port=4294967295)),ct_clear,set(eth(src=fa:16:3e:dd:02:c0,dst=fa:16:3e:6b:42:8e)),set(ipv4(ttl=63)),ct(zone=10),recirc(0x1d4)
recirc_id(0x1d4),in_port(9),ct_state(-new+est-rel-rpl-inv+trk),ct_mark(0x20020/0xff0031),ct_label(0xf4240000000000000000000000000),eth(src=fa:16:3e:dd:02:c0,dst=fa:16:3e:6b:42:8e),eth_type(0x0800),ipv4(dst=10.2.1.5,proto=1,frag=no), packets:83, bytes:8134, used:0.673s, actions:userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554437,obs_point_id=1000000,output_port=4294967295)),6
=> Each ICMP request/reply exchange is sampled 5 times:
- ICMP request: once in the ingress pipeline and once in the egress pipeline
- ICMP reply: TWICE in the ingress pipeline and once in the egress pipeline
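The per-direction totals can be tallied from the dump with a short Python sketch. It is only an illustration: the flow strings below are heavily abbreviated versions of the datapath flows above (match fields and most action arguments removed), and the function name is my own. The direction is taken from the +rpl/-rpl flag in ct_state.

```python
import re


def tally_by_direction(flow_lines):
    """Count flow_sample actions per direction using +rpl/-rpl in ct_state."""
    totals = {"request": 0, "reply": 0}
    for line in flow_lines:
        state = re.search(r"ct_state\(([^)]*)\)", line)
        if state is None:
            continue  # untracked recirc_id(0) flows carry no ct_state
        direction = "reply" if "+rpl" in state.group(1) else "request"
        totals[direction] += line.count("flow_sample(")
    return totals


# Abbreviated flows: only recirc_id, ct_state and the sample actions are kept.
flows = [
    "recirc_id(0),actions:ct(zone=10),recirc(0x1d5)",
    "recirc_id(0x1d5),ct_state(-new+est-rel+rpl-inv+trk),actions:"
    "flow_sample(obs_point_id=1000000),flow_sample(obs_point_id=1000000),"
    "recirc(0x1d6)",
    "recirc_id(0x1d6),ct_state(-new+est-rel+rpl-inv+trk),actions:"
    "flow_sample(obs_point_id=2000000),9",
    "recirc_id(0x1d3),ct_state(-new+est-rel-rpl-inv+trk),actions:"
    "flow_sample(obs_point_id=2000000),recirc(0x1d4)",
    "recirc_id(0x1d4),ct_state(-new+est-rel-rpl-inv+trk),actions:"
    "flow_sample(obs_point_id=1000000),6",
]

totals = tally_by_direction(flows)
print(totals, sum(totals.values()))
```

The request direction contributes 2 samples and the reply direction 3, which gives the 5 samples per exchange listed above.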
II. Further Testing
- After running more test cases, I noticed that this behavior occurs whenever both directions are sampled in a Port Group, even when the sampled ACLs are unrelated to each other.
For example, in our test case I added a from-lport ACL that allows IPv6 and enabled sampling on only two ACLs: the from-lport IPv6 ACL and the to-lport ICMP ACL. The duplicated IPFIX behavior still occurs.
Moreover, vm_b does not need to be sampled, or to share a Port Group or ACL with vm_a, for the error to happen. Configuring only vm_a's Port Group with sampling as above triggers the error, regardless of vm_b's ACL or Port Group configuration.
Another setup to demonstrate my statements:
vm_a (10.2.1.5) ---- ls1 ---- router ---- ls2 ---- vm_b (10.2.3.5)
Port_group_a:
- ACL A: allow-related VMs to send IPv4 traffic (direction=from-lport)
- ACL B: allow-related VMs to receive ICMP traffic (direction=to-lport) (enable sample_est)
- ACL C: allow-related VMs to send IPv6 traffic (direction=from-lport) (enable sample_est)
# ovn-nbctl acl-list pg_d559bf91_b95f_49c0_8e4a_bf35f15e1dcc
from-lport 1002 (inport == @pg_d559bf91_b95f_49c0_8e4a_bf35f15e1dcc && ip4) allow-related
from-lport 1002 (inport == @pg_d559bf91_b95f_49c0_8e4a_bf35f15e1dcc && ip6) allow-related (sample_est=dd173a59-c090-403a-9212-6dfec53e999a)
to-lport 1002 (outport == @pg_d559bf91_b95f_49c0_8e4a_bf35f15e1dcc && ip4 && ip4.src == 0.0.0.0/0 && icmp4) allow-related (sample_est=62bfeefb-30c0-471d-9775-05274b29c8c7)
Port_group_b:
- ACL A: allow-related VMs to send IPv4 traffic (direction=from-lport)
- ACL B: allow-related VMs to receive ICMP traffic (direction=to-lport)
(ovn-nb-db)[root@Openstack-controller-1-ovn-scale-test /]# ovn-nbctl acl-list e5bda342-5123-4848-a4f0-58c81adbeeaa
from-lport 1002 (inport == @pg_89adc8f2_07a9_4b5b_9901_af3f0dc192ad && ip4) allow-related
to-lport 1002 (outport == @pg_89adc8f2_07a9_4b5b_9901_af3f0dc192ad && ip4 && ip4.src == 0.0.0.0/0 && icmp4) allow-related
Ping from vm_b to vm_a:
- First ICMP request -> no sample
- First ICMP reply -> TWO samples in the ingress pipeline (Incorrect)
# ovs-dpctl dump-flows system@ovs-system | egrep "10.2.1.5|10.2.3.5"
recirc_id(0),in_port(6),eth(src=fa:16:3e:6b:42:8e,dst=fa:16:3e:dd:02:c0),eth_type(0x0800),ipv4(src=10.2.1.5,proto=1,frag=no), packets:90, bytes:8820, used:0.740s, actions:ct(zone=10),recirc(0x1e5)
recirc_id(0x1e5),in_port(6),ct_state(-new+est-rel+rpl-inv+trk),ct_mark(0x20020/0xff0031),ct_label(0xf4240000000000000000000000000),eth(src=fa:16:3e:6b:42:8e,dst=fa:16:3e:dd:02:c0),eth_type(0x0800),ipv4(src=10.2.1.5,dst=10.2.3.5,proto=1,ttl=64,frag=no), packets:90, bytes:8820, used:0.740s, actions:userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554437,obs_point_id=1000000,output_port=4294967295)),userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554437,obs_point_id=1000000,output_port=4294967295)),ct_clear,set(eth(src=fa:16:3e:d5:7b:d1,dst=fa:16:3e:f8:af:7d)),set(ipv4(ttl=63)),ct(zone=21),recirc(0x1e6)
recirc_id(0x1e6),in_port(6),ct_state(-new+est-rel+rpl-inv+trk),ct_mark(0/0xff0031),ct_label(0/0xffffffffffffffffffffffff),eth(dst=fa:16:3e:f8:af:7d),eth_type(0x0800),ipv4(dst=10.2.3.5,frag=no), packets:90, bytes:8820, used:0.740s, actions:9
recirc_id(0),in_port(9),eth(src=fa:16:3e:f8:af:7d,dst=fa:16:3e:d5:7b:d1),eth_type(0x0800),ipv4(src=10.2.3.5,proto=1,frag=no), packets:90, bytes:8820, used:0.740s, actions:ct(zone=21),recirc(0x1e3)
recirc_id(0x1e3),in_port(9),ct_state(-new+est-rel-rpl-inv+trk),ct_mark(0/0xff0031),ct_label(0/0xffffffffffffffffffffffff),eth(src=fa:16:3e:f8:af:7d,dst=fa:16:3e:d5:7b:d1),eth_type(0x0800),ipv4(src=10.2.3.5,dst=10.2.1.5,proto=1,ttl=64,frag=no), packets:89, bytes:8722, used:0.740s, actions:ct_clear,set(eth(src=fa:16:3e:dd:02:c0,dst=fa:16:3e:6b:42:8e)),set(ipv4(ttl=63)),ct(zone=10),recirc(0x1e4)
recirc_id(0x1e4),in_port(9),ct_state(-new+est-rel-rpl-inv+trk),ct_mark(0x20020/0xff0031),ct_label(0xf4240000000000000000000000000),eth(src=fa:16:3e:dd:02:c0,dst=fa:16:3e:6b:42:8e),eth_type(0x0800),ipv4(dst=10.2.1.5,proto=1,frag=no), packets:89, bytes:8722, used:0.740s, actions:userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554437,obs_point_id=1000000,output_port=4294967295)),6
_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss