Re: [ovs-dev] [PATCH v3] dpif-netdev: dfc_process optimization by prefetching EMC entry.

2019-03-24 Thread Yanqin Wei (Arm Technology China)
Hi Ian,

I also observed a minor throughput drop (around 1%) with a single flow on an Arm
platform, but not a 25% drop.  Maybe the additional prefetch operation causes it.
Anyway, when you come back next week, let's discuss this patch again.

Best Regards,
Wei Yanqin

-Original Message-
From: Ian Stokes  
Sent: Monday, March 25, 2019 6:16 AM
To: Yanqin Wei (Arm Technology China) ; d...@openvswitch.org
Cc: nd ; Gavin Hu (Arm Technology China) ; Ilya 
Maximets 
Subject: Re: [ovs-dev] [PATCH v3] dpif-netdev: dfc_process optimization by 
prefetching EMC entry.

On 3/13/2019 5:27 AM, Yanqin Wei wrote:
> It is observed that multi-flow throughput is worse than single-flow 
> throughput in the EMC NIC-to-NIC cases, because CPU cache misses 
> increase during EMC lookup. Each flow needs to load at least one EMC 
> entry into the CPU cache (several cache lines) and compare it with the
> packet miniflow.
> This patch improves this by prefetching the EMC entry in advance. The 
> hash value can be obtained from the DPDK RSS hash, so this step can be 
> moved ahead of
> miniflow_extract() and the EMC entry prefetched there. The prefetch size 
> is defined as ROUND_UP(128,CACHE_LINE_SIZE), which covers the majority 
> of traffic, including TCP/UDP, and needs 2 cache lines on most modern CPUs.
> Performance tests were run on some Arm platforms. 1000/1 flows 
> NIC2NIC tests achieved around 10% throughput improvement on 
> ThunderX2 (an aarch64 platform).
> 

Thanks for this Wei. Not a full review, but please see some minor comments below 
WRT style issues.

I've also run some benchmarks on this. I was typically seeing a ~3% drop on x86 
with single flows with RFC2544. However, once or twice I saw a drop of up to 
25% in achievable lossless packet rate, but I suspect it could be an anomaly in 
my setup.

Ilya, if you are testing this week on x86, it would be great if you could confirm 
whether you see something similar in your benchmarks?

For vsperf phy2phy_scalability flow tests on x86 I saw an improvement of 
+3% after applying the patch for zero loss tests and +5% in the case of
phy2phy_scalability_cont so this looks promising.

As an FYI, I'm out of office this coming week, so I will not have an 
opportunity to investigate further until I'm back in the office. I'll be 
able to review and benchmark further then.


> Signed-off-by: Yanqin Wei 
> Reviewed-by: Gavin Hu 
Although it doesn't appear here or in patchwork, after downloading the 
patch, the sign-off and review tags above appear duplicated once the patch 
is applied. Examining the mbox, I can confirm they are duplicated; can you 
check this on your side as well?

> ---
>   lib/dpif-netdev.c | 80 
> ---
>   1 file changed, 52 insertions(+), 28 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 4d6d0c3..982082c 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -189,6 +189,10 @@ struct netdev_flow_key {
>   #define DEFAULT_EM_FLOW_INSERT_MIN (UINT32_MAX / \
>   DEFAULT_EM_FLOW_INSERT_INV_PROB)
>   
> +/* DEFAULT_EMC_PREFETCH_SIZE can cover majority traffic including TCP/UDP
> + * protocol. */
> +#define DEFAULT_EMC_PREFETCH_SIZE ROUND_UP(128,CACHE_LINE_SIZE)
> +
>   struct emc_entry {
>   struct dp_netdev_flow *flow;
>   struct netdev_flow_key key;   /* key.hash used for emc hash value. */
> @@ -6166,15 +6170,20 @@ dp_netdev_upcall(struct dp_netdev_pmd_thread *pmd, 
> struct dp_packet *packet_,
>   }
>   
>   static inline uint32_t
> -dpif_netdev_packet_get_rss_hash_orig_pkt(struct dp_packet *packet,
> -const struct miniflow *mf)
> +dpif_netdev_packet_get_packet_rss_hash(struct dp_packet *packet,
> +bool md_is_valid)
>   {
> -uint32_t hash;
> +uint32_t hash,recirc_depth;
>   
> -if (OVS_LIKELY(dp_packet_rss_valid(packet))) {
> -hash = dp_packet_get_rss_hash(packet);
> -} else {
> -hash = miniflow_hash_5tuple(mf, 0);
> +hash = dp_packet_get_rss_hash(packet);
> +
> +if (md_is_valid) {
> +/* The RSS hash must account for the recirculation depth to avoid
> + * collisions in the exact match cache */
Minor, comment style, missing period at end of comment.

> +recirc_depth = *recirc_depth_get_unsafe();
> +if (OVS_UNLIKELY(recirc_depth)) {
> +hash = hash_finish(hash, recirc_depth);
> +}
>   dp_packet_set_rss_hash(packet, hash);
>   }
>   
> @@ -6182,24 +6191,23 @@ dpif_netdev_packet_get_rss_hash_orig_pkt(struct 
> dp_packet *packet,
>   }
>   
>   static inline uint32_t
> -dpif_netdev_packet_get_rss_hash(struct dp_packet *packet,
> -const struct miniflow *mf)
> +dpif_netdev_packet_get_hash_5tuple(struct dp_packet *packet,
> +const struct miniflow *mf,
> +bool md_is_valid)
>   {
> -uint32_t hash, recirc_depth;
> +

[ovs-dev] [RFC] Introduce "OpenFlow Controller as Shared Library"

2019-03-24 Thread Ansis Atteka
From: Ansis Atteka 

Currently ovs-vswitchd process can communicate with an OpenFlow
controller only through tcp, unix and ssl sockets.

This patch would allow ovs-vswitchd process to communicate
with an OpenFlow controller by directly calling into its
code that provides interface similar to a socket (ie
implements read() and write() functions).

There are a few benefits to using a shared library as the OpenFlow
controller:

1. Better performance by
   a) avoiding copying OpenFlow messages to socket buffers; AND
   b) reducing context switches.
   The preliminary tests that I did improved performance by ~30% for
   an OpenFlow controller that handles PACKET_INs and resubmits packets
   with PACKET_OUTs.
2. Better parallelization in the future by distributing the load
   over ovs-vswitchd handler threads (currently only one thread calls into
   the shared library code).
3. Elimination of nondeterministic thread blocking that may occur when
   socket buffers are full.
4. In some cases better security (e.g. by allowing the ovs-vswitchd
   process to be confined to a stricter Access Control policy). Although
   in some cases security may get worse (e.g. because the controller would
   run in the same virtual memory space as the ovs-vswitchd process).

While the code is enough to demonstrate a PoC, I have left some TODOs.
Because of that, I am sending this code as an RFC to hear more feedback
from the community on subjects such as:
1. what the API that shared libraries export should be.
2. whether I am possibly missing something critical that makes this
   approach infeasible (e.g. race conditions; something that OVS does
   behind the scenes, like vlog module initialization for the plugin,
   that would require an overhaul in other components; code that is
   impossible to integrate with the event loop; impractical cleanups
   that would make it impossible to unload plugins).

In this patch I am proposing an API that requires the library to export
functions that mimic socket read() and write().  In some cases this allows
existing OpenFlow controllers to be easily retrofitted as shared library
plugins.  To test it out, one can set up the test controller with:

ovs-vsctl add-br brX
ovs-vsctl set-controller brX dl:/ovs/tests/.libs/libtest-dl.so
sudo ovs-ofctl add-flow brX "actions=controller"

And observe that packet-ins get to OpenFlow controller plugin:

2019-03-25T04:10:35.615Z|01230|libtestdl|INFO|received 255 byte message from 
Open vSwitch
2019-03-25T04:10:35.615Z|01231|libtestdl|INFO|Received OFPTYPE_PACKET_IN
2019-03-25T04:10:36.616Z|01232|libtestdl|INFO|received 255 byte message from 
Open vSwitch
2019-03-25T04:10:36.616Z|01233|libtestdl|INFO|Received OFPTYPE_PACKET_IN
2019-03-25T04:10:38.143Z|01234|libtestdl|INFO|received 8 byte message from Open 
vSwitch

Another approach would be for shared libraries to export init() and
finit() functions that would self-register and self-unregister certain
class implementations.

Signed-off-by: Ansis Atteka 
---
 Makefile.am   |   2 +-
 lib/automake.mk   |   3 +
 lib/stream-dl.c   | 161 +++
 lib/stream-dl.h   |  28 +
 lib/stream-dlopen.c   |  79 +++
 lib/stream-provider.h |   3 +
 lib/stream.c  |   1 +
 lib/vconn-provider.h  |   1 +
 lib/vconn-stream.c|   2 +
 lib/vconn.c   |   1 +
 tests/automake.mk |   6 ++
 tests/test-dl.c   | 169 ++
 12 files changed, 455 insertions(+), 1 deletion(-)
 create mode 100644 lib/stream-dl.c
 create mode 100644 lib/stream-dl.h
 create mode 100644 lib/stream-dlopen.c
 create mode 100644 tests/test-dl.c

diff --git a/Makefile.am b/Makefile.am
index ff1f94b48..7cb0a6b55 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -338,7 +338,7 @@ thread-safety-check:
if test -e .git && (git --version) >/dev/null 2>&1 && \
  grep -n -f build-aux/thread-safety-blacklist \
`git ls-files | grep '\.[ch]$$' \
- | $(EGREP) -v '^datapath|^lib/sflow|^third-party'` /dev/null \
+ | $(EGREP) -v '^datapath|^lib/sflow|^lib/stream-dl|^third-party'` 
/dev/null \
  | $(EGREP) -v ':[ ]*/?\*'; \
then \
  echo "See above for list of calls to functions that are"; \
diff --git a/lib/automake.mk b/lib/automake.mk
index cc5dccf39..1e9a6eefa 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -269,6 +269,8 @@ lib_libopenvswitch_la_SOURCES = \
lib/sset.h \
lib/stp.c \
lib/stp.h \
+lib/stream-dl.c \
+lib/stream-dl.h \
lib/stream-fd.c \
lib/stream-fd.h \
lib/stream-provider.h \
@@ -346,6 +348,7 @@ lib_libopenvswitch_la_SOURCES += \
lib/signals.c \
lib/signals.h \
lib/socket-util-unix.c \
+lib/stream-dlopen.c \
lib/stream-unix.c
 endif
 
diff --git a/lib/stream-dl.c b/lib/stream-dl.c
new file mode 100644
index 0..1ff7dfd8e
--- /dev/null
+++ b/lib/stream-dl.c
@@ -0,0 +1,161 @@
+/*
+ * 

Re: [ovs-dev] [PATCH net-next] openvswitch: Make metadata_dst tunnel work in IP_TUNNEL_INFO_BRIDGE mode

2019-03-24 Thread wenxu

On 3/25/2019 9:47 AM, Tonghao Zhang wrote:
> On Mon, Mar 25, 2019 at 9:24 AM wenxu  wrote:
>> On 2019/3/25 2:46 AM, Pravin Shelar wrote:
>>> On Sun, Mar 24, 2019 at 12:03 AM wenxu  wrote:
 On 2019/3/24 5:39 AM, Pravin Shelar wrote:
> On Sat, Mar 23, 2019 at 2:18 AM wenxu  wrote:
>> On 2019/3/23 3:50 PM, Pravin Shelar wrote:
>>
>> On Thu, Mar 21, 2019 at 3:34 AM  wrote:
>>
>> From: wenxu 
>>
>> There is currently no support for the multicast/broadcast aspects
>> of VXLAN in ovs. In the datapath flow the tun_dst must be specified,
>> but in IP_TUNNEL_INFO_BRIDGE mode the tun_dst cannot be specified,
>> and the packet can be forwarded through the fdb of the vxlan device. In
>> this mode the broadcast/multicast packet can be sent in the
>> following ways in ovs.
>>
>> ovs-vsctl add-port br0 vxlan -- set in vxlan type=vxlan \
>> options:key=1000 options:remote_ip=flow
>> ovs-ofctl add-flow br0 in_port=LOCAL,dl_dst=ff:ff:ff:ff:ff:ff,\
>> action=output:vxlan
>>
>> bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.1 \
>> src_vni 1000 vni 1000 self
>> bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.2 \
>> src_vni 1000 vni 1000 self
>>
>> This would make the datapath a bit complicated; can you give an example of 
>> such a use-case?
>>
>> There is currently no support for the multicast/broadcast aspects
>> of VXLAN in ovs. To get around the lack of multicast support, it is 
>> possible to
>> pre-provision MAC to IP address mappings either manually or from a 
>> controller.
>>
>> With this patch we can achieve this through the fdb of the lower vxlan
>> device.
>>
>> For example, three servers connected with vxlan:
>> server1 IP 10.0.0.1 tunnel IP  172.168.0.1 vni 1000
>> server2 IP 10.0.0.2 tunnel IP  172.168.0.2 vni 1000
>> server3 IP 10.0.0.3 tunnel IP  172.168.0.3 vni 1000
>>
>> All the broadcast ARP requests from server1 can be sent to vxlan_sys_4789
>> in IP_TUNNEL_INFO_BRIDGE mode. Then the broadcast packet can be sent through
>> the fdb table in the vxlan device as follows:
>>
>> bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.1 \
>> src_vni 1000 vni 1000 self
>> bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.2 \
>> src_vni 1000 vni 1000 self
>>
>>
>> It is the same for the multicast case. This patch makes the ovs vxlan 
>> tunnel use the fdb
>> table of the lower vxlan device.
> Have you tried OVS mac learning?
>
 The key point is that it lets the ovs vxlan tunnel make use of the fdb 
 table of the lower vxlan device.

 The fdb table can be configured, or MACs can be learned from outside.

 For the broadcast example: in ovs, this can only be achieved through 
 multiple output actions to simulate the broadcast.

 ovs-ofctl add-flow br0 \
 in_port=server1,dl_dst=ff:ff:ff:ff:ff:ff,actions=set_field:172.168.0.1->tun_dst,output:vxlan,\
 set_field:172.168.0.2->tun_dst,output:vxlan

 But there are limits on the number of output actions.

>>> I was referring to mac-learning feature in OVS i.e. using learn
>>> action. I wanted to see if there is something that you are not able to
>>> do with OVS learn action.
>>>
>> The ovs mac learn action only works for a specific vxlan tunnel port (fixed 
>> tun_dst, tun_id), like the following:
>>
>> ovs-vsctl set in vxlan options:remote_ip=172.168.0.1 options:key=1000
>>
>> (This is the same problem for the Linux bridge; it solves it through 
>> IP_TUNNEL_INFO_BRIDGE mode working with the fdb of the lower vxlan 
>> device.)
>>
>> But it does not work for a flow-based tunnel (remote_ip=flow); there will 
>> be a huge number of tunnel peers.
> One question: why do you flood the ff:ff:ff:ff:ff:ff packets to so many
> tunnel peers? In the cloud this should be avoided (e.g. for ARP), because
> we use the control plane to deliver packets only to the related VMs/hosts,
> not to all of them.
>
> It's not only for ARP; some users also need the IP broadcast service. 
> Anyway, it's a use-case.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next] openvswitch: Make metadata_dst tunnel work in IP_TUNNEL_INFO_BRIDGE mode

2019-03-24 Thread Tonghao Zhang
On Mon, Mar 25, 2019 at 9:24 AM wenxu  wrote:
>
> On 2019/3/25 2:46 AM, Pravin Shelar wrote:
> > On Sun, Mar 24, 2019 at 12:03 AM wenxu  wrote:
> >> On 2019/3/24 5:39 AM, Pravin Shelar wrote:
> >>> On Sat, Mar 23, 2019 at 2:18 AM wenxu  wrote:
>  On 2019/3/23 3:50 PM, Pravin Shelar wrote:
> 
>  On Thu, Mar 21, 2019 at 3:34 AM  wrote:
> 
>  From: wenxu 
> 
>  There is currently no support for the multicast/broadcast aspects
>  of VXLAN in ovs. In the datapath flow the tun_dst must be specified,
>  but in IP_TUNNEL_INFO_BRIDGE mode the tun_dst cannot be specified,
>  and the packet can be forwarded through the fdb of the vxlan device. In
>  this mode the broadcast/multicast packet can be sent in the
>  following ways in ovs.
> 
>  ovs-vsctl add-port br0 vxlan -- set in vxlan type=vxlan \
>  options:key=1000 options:remote_ip=flow
>  ovs-ofctl add-flow br0 in_port=LOCAL,dl_dst=ff:ff:ff:ff:ff:ff,\
>  action=output:vxlan
> 
>  bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.1 \
>  src_vni 1000 vni 1000 self
>  bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.2 \
>  src_vni 1000 vni 1000 self
> 
>  This would make the datapath a bit complicated; can you give an example of 
>  such a use-case?
> 
>  There is currently no support for the multicast/broadcast aspects
>  of VXLAN in ovs. To get around the lack of multicast support, it is 
>  possible to
>  pre-provision MAC to IP address mappings either manually or from a 
>  controller.
> 
>  With this patch we can achieve this through the fdb of the lower vxlan
>  device.
> 
>  For example, three servers connected with vxlan:
>  server1 IP 10.0.0.1 tunnel IP  172.168.0.1 vni 1000
>  server2 IP 10.0.0.2 tunnel IP  172.168.0.2 vni 1000
>  server3 IP 10.0.0.3 tunnel IP  172.168.0.3 vni 1000
> 
>  All the broadcast ARP requests from server1 can be sent to vxlan_sys_4789
>  in IP_TUNNEL_INFO_BRIDGE mode. Then the broadcast packet can be sent through
>  the fdb table in the vxlan device as follows:
> 
>  bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.1 \
>  src_vni 1000 vni 1000 self
>  bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.2 \
>  src_vni 1000 vni 1000 self
> 
> 
>  It is the same for the multicast case. This patch makes the ovs vxlan 
>  tunnel use the fdb
>  table of the lower vxlan device.
> >>> Have you tried OVS mac learning?
> >>>
> >> The key point is that it lets the ovs vxlan tunnel make use of the fdb 
> >> table of the lower vxlan device.
> >>
> >> The fdb table can be configured, or MACs can be learned from outside.
> >>
> >> For the broadcast example: in ovs, this can only be achieved through 
> >> multiple output actions to simulate the broadcast.
> >>
> >> ovs-ofctl add-flow br0 \
> >> in_port=server1,dl_dst=ff:ff:ff:ff:ff:ff,actions=set_field:172.168.0.1->tun_dst,output:vxlan,\
> >> set_field:172.168.0.2->tun_dst,output:vxlan
> >>
> >> But there are limits on the number of output actions.
> >>
> > I was referring to mac-learning feature in OVS i.e. using learn
> > action. I wanted to see if there is something that you are not able to
> > do with OVS learn action.
> >
> The ovs mac learn action only works for a specific vxlan tunnel port (fixed 
> tun_dst, tun_id), like the following:
>
> ovs-vsctl set in vxlan options:remote_ip=172.168.0.1 options:key=1000
>
> (This is the same problem for the Linux bridge; it solves it through 
> IP_TUNNEL_INFO_BRIDGE mode working with the fdb of the lower vxlan 
> device.)
>
> But it does not work for a flow-based tunnel (remote_ip=flow); there will 
> be a huge number of tunnel peers.
One question: why do you flood the ff:ff:ff:ff:ff:ff packets to so many
tunnel peers? In the cloud this should be avoided (e.g. for ARP), because
we use the control plane to deliver packets only to the related VMs/hosts,
not to all of them.
> It's hard to manage the tunnel ports with the specific mode.
>
>
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next] openvswitch: Make metadata_dst tunnel work in IP_TUNNEL_INFO_BRIDGE mode

2019-03-24 Thread wenxu
On 2019/3/25 2:46 AM, Pravin Shelar wrote:
> On Sun, Mar 24, 2019 at 12:03 AM wenxu  wrote:
>> On 2019/3/24 5:39 AM, Pravin Shelar wrote:
>>> On Sat, Mar 23, 2019 at 2:18 AM wenxu  wrote:
 On 2019/3/23 3:50 PM, Pravin Shelar wrote:

 On Thu, Mar 21, 2019 at 3:34 AM  wrote:

 From: wenxu 

 There is currently no support for the multicast/broadcast aspects
 of VXLAN in ovs. In the datapath flow the tun_dst must be specified,
 but in IP_TUNNEL_INFO_BRIDGE mode the tun_dst cannot be specified,
 and the packet can be forwarded through the fdb of the vxlan device. In
 this mode the broadcast/multicast packet can be sent in the
 following ways in ovs.

 ovs-vsctl add-port br0 vxlan -- set in vxlan type=vxlan \
 options:key=1000 options:remote_ip=flow
 ovs-ofctl add-flow br0 in_port=LOCAL,dl_dst=ff:ff:ff:ff:ff:ff,\
 action=output:vxlan

 bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.1 \
 src_vni 1000 vni 1000 self
 bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.2 \
 src_vni 1000 vni 1000 self

 This would make the datapath a bit complicated; can you give an example of 
 such a use-case?

 There is currently no support for the multicast/broadcast aspects
 of VXLAN in ovs. To get around the lack of multicast support, it is 
 possible to
 pre-provision MAC to IP address mappings either manually or from a 
 controller.

 With this patch we can achieve this through the fdb of the lower vxlan
 device.

 For example, three servers connected with vxlan:
 server1 IP 10.0.0.1 tunnel IP  172.168.0.1 vni 1000
 server2 IP 10.0.0.2 tunnel IP  172.168.0.2 vni 1000
 server3 IP 10.0.0.3 tunnel IP  172.168.0.3 vni 1000

 All the broadcast ARP requests from server1 can be sent to vxlan_sys_4789
 in IP_TUNNEL_INFO_BRIDGE mode. Then the broadcast packet can be sent through
 the fdb table in the vxlan device as follows:

 bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.1 \
 src_vni 1000 vni 1000 self
 bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.2 \
 src_vni 1000 vni 1000 self


 It is the same for the multicast case. This patch makes the ovs vxlan tunnel
 use the fdb table of the lower vxlan device.
>>> Have you tried OVS mac learning?
>>>
>> The key point is that it lets the ovs vxlan tunnel make use of the fdb 
>> table of the lower vxlan device.
>>
>> The fdb table can be configured, or MACs can be learned from outside.
>>
>> For the broadcast example: in ovs, this can only be achieved through 
>> multiple output actions to simulate the broadcast.
>>
>> ovs-ofctl add-flow br0 
>> in_port=server1,dl_dst=ff:ff:ff:ff:ff:ff,actions=set_field:172.168.0.1->tun_dst,output:vxlan,\
>>
>> set_field:172.168.0.2->tun_dst,output:vxlan.
>>
>> But there are limits on the number of output actions.
>>
> I was referring to mac-learning feature in OVS i.e. using learn
> action. I wanted to see if there is something that you are not able to
> do with OVS learn action.
>
The ovs mac learn action only works for a specific vxlan tunnel port (fixed 
tun_dst, tun_id), like the following:

ovs-vsctl set in vxlan options:remote_ip=172.168.0.1 options:key=1000

(This is the same problem for the Linux bridge; it solves it through 
IP_TUNNEL_INFO_BRIDGE mode working with the fdb of the lower vxlan 
device.)

But it does not work for a flow-based tunnel (remote_ip=flow); there will 
be a huge number of tunnel peers.

It's hard to manage the tunnel ports with the specific mode.



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next] openvswitch: Make metadata_dst tunnel work in IP_TUNNEL_INFO_BRIDGE mode

2019-03-24 Thread Tonghao Zhang
On Sat, Mar 23, 2019 at 5:28 PM wenxu  wrote:
>
> On 2019/3/23 3:50 PM, Pravin Shelar wrote:
> > On Thu, Mar 21, 2019 at 3:34 AM  wrote:
> >> From: wenxu 
> >>
> >> There is currently no support for the multicast/broadcast aspects
> >> of VXLAN in ovs. In the datapath flow the tun_dst must be specified,
> >> but in IP_TUNNEL_INFO_BRIDGE mode the tun_dst cannot be specified,
> >> and the packet can be forwarded through the fdb of the vxlan device. In
> >> this mode the broadcast/multicast packet can be sent in the
> >> following ways in ovs.
> >>
> >> ovs-vsctl add-port br0 vxlan -- set in vxlan type=vxlan \
> >> options:key=1000 options:remote_ip=flow
> >> ovs-ofctl add-flow br0 in_port=LOCAL,dl_dst=ff:ff:ff:ff:ff:ff,\
> >> action=output:vxlan
> >>
> >> bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.1 \
> >> src_vni 1000 vni 1000 self
> >> bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.2 \
> >> src_vni 1000 vni 1000 self
> >>
> > This would make the datapath a bit complicated; can you give an example of 
> > such a use-case?
> >
>
> There is currently no support for the multicast aspects
> of VXLAN in ovs.
> With this patch we can achieve this through the fdb of the lower vxlan
> device.
You can create multiple vxlan devices in ovs, for example:
ovs-vsctl add-port br0 vxlan0 -- set in vxlan0 type=vxlan \
options:key=1000 options:remote_ip=172.168.0.1
ovs-vsctl add-port br0 vxlan1 -- set in vxlan1 type=vxlan \
options:key=1000 options:remote_ip=172.168.0.2

so the ff:ff:ff:ff:ff:ff packets will be sent to both the vxlan0 and vxlan1 ports.
> For example, three servers connected with vxlan:
> server1 IP 10.0.0.1 tunnel IP  172.168.0.1 vni 1000
> server2 IP 10.0.0.2 tunnel IP  172.168.0.2 vni 1000
> server3 IP 10.0.0.3 tunnel IP  172.168.0.3 vni 1000
>
> All the broadcast ARP requests from server1 can be sent to vxlan_sys_4789
> in IP_TUNNEL_INFO_BRIDGE mode. Then the broadcast packet can be sent through
> the fdb table in the vxlan device as follows:
>
> bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.1 \
> src_vni 1000 vni 1000 self
> bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.2 \
> src_vni 1000 vni 1000 self
>
>
> It is the same for the multicast case. This patch makes the ovs vxlan tunnel
> use the fdb table of the lower vxlan device.
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] dpif-netdev: dfc_process optimization by prefetching EMC entry.

2019-03-24 Thread Ian Stokes

On 3/13/2019 5:27 AM, Yanqin Wei wrote:

It is observed that multi-flow throughput is worse than single-flow
throughput in the EMC NIC-to-NIC cases, because CPU cache misses increase
during EMC lookup. Each flow needs to load at least one EMC entry into the
CPU cache (several cache lines) and compare it with the packet miniflow.
This patch improves this by prefetching the EMC entry in advance. The hash
value can be obtained from the DPDK RSS hash, so this step can be moved
ahead of miniflow_extract() and the EMC entry prefetched there. The
prefetch size is defined as ROUND_UP(128,CACHE_LINE_SIZE), which covers
the majority of traffic, including TCP/UDP, and needs 2 cache lines on
most modern CPUs.
Performance tests were run on some Arm platforms. 1000/1 flows NIC2NIC
tests achieved around 10% throughput improvement on ThunderX2 (an aarch64
platform).



Thanks for this Wei. Not a full review, but please see some minor comments 
below WRT style issues.


I've also run some benchmarks on this. I was typically seeing a ~3% drop 
on x86 with single flows with RFC2544. However, once or twice I saw a 
drop of up to 25% in achievable lossless packet rate, but I suspect it 
could be an anomaly in my setup.


Ilya, if you are testing this week on x86, it would be great if you could 
confirm whether you see something similar in your benchmarks?


For vsperf phy2phy_scalability flow tests on x86 I saw an improvement of 
+3% after applying the patch for zero loss tests and +5% in the case of 
phy2phy_scalability_cont so this looks promising.


As an FYI, I'm out of office this coming week, so I will not have an 
opportunity to investigate further until I'm back in the office. I'll be 
able to review and benchmark further then.




Signed-off-by: Yanqin Wei 
Reviewed-by: Gavin Hu 
Although it doesn't appear here or in patchwork, after downloading the 
patch, the sign-off and review tags above appear duplicated once the patch 
is applied. Examining the mbox, I can confirm they are duplicated; can you 
check this on your side as well?



---
  lib/dpif-netdev.c | 80 ---
  1 file changed, 52 insertions(+), 28 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 4d6d0c3..982082c 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -189,6 +189,10 @@ struct netdev_flow_key {
  #define DEFAULT_EM_FLOW_INSERT_MIN (UINT32_MAX / \
  DEFAULT_EM_FLOW_INSERT_INV_PROB)
  
+/* DEFAULT_EMC_PREFETCH_SIZE can cover majority traffic including TCP/UDP

+ * protocol. */
+#define DEFAULT_EMC_PREFETCH_SIZE ROUND_UP(128,CACHE_LINE_SIZE)
+
  struct emc_entry {
  struct dp_netdev_flow *flow;
  struct netdev_flow_key key;   /* key.hash used for emc hash value. */
@@ -6166,15 +6170,20 @@ dp_netdev_upcall(struct dp_netdev_pmd_thread *pmd, 
struct dp_packet *packet_,
  }
  
  static inline uint32_t

-dpif_netdev_packet_get_rss_hash_orig_pkt(struct dp_packet *packet,
-const struct miniflow *mf)
+dpif_netdev_packet_get_packet_rss_hash(struct dp_packet *packet,
+bool md_is_valid)
  {
-uint32_t hash;
+uint32_t hash,recirc_depth;
  
-if (OVS_LIKELY(dp_packet_rss_valid(packet))) {

-hash = dp_packet_get_rss_hash(packet);
-} else {
-hash = miniflow_hash_5tuple(mf, 0);
+hash = dp_packet_get_rss_hash(packet);
+
+if (md_is_valid) {
+/* The RSS hash must account for the recirculation depth to avoid
+ * collisions in the exact match cache */

Minor, comment style, missing period at end of comment.


+recirc_depth = *recirc_depth_get_unsafe();
+if (OVS_UNLIKELY(recirc_depth)) {
+hash = hash_finish(hash, recirc_depth);
+}
  dp_packet_set_rss_hash(packet, hash);
  }
  
@@ -6182,24 +6191,23 @@ dpif_netdev_packet_get_rss_hash_orig_pkt(struct dp_packet *packet,

  }
  
  static inline uint32_t

-dpif_netdev_packet_get_rss_hash(struct dp_packet *packet,
-const struct miniflow *mf)
+dpif_netdev_packet_get_hash_5tuple(struct dp_packet *packet,
+const struct miniflow *mf,
+bool md_is_valid)
  {
-uint32_t hash, recirc_depth;
+uint32_t hash,recirc_depth;

Coding style, missing space between , and recirc_depth.
  
-if (OVS_LIKELY(dp_packet_rss_valid(packet))) {

-hash = dp_packet_get_rss_hash(packet);
-} else {
-hash = miniflow_hash_5tuple(mf, 0);
-dp_packet_set_rss_hash(packet, hash);
-}
+hash = miniflow_hash_5tuple(mf, 0);
+dp_packet_set_rss_hash(packet, hash);
  
-/* The RSS hash must account for the recirculation depth to avoid

- * collisions in the exact match cache */
-recirc_depth = *recirc_depth_get_unsafe();
-if (OVS_UNLIKELY(recirc_depth)) {
-hash = hash_finish(hash, recirc_depth);
-dp_packet_set_rss_hash(packet, hash);
+if (md_is_valid) {
+   





Re: [ovs-dev] [PATCH net-next] openvswitch: Make metadata_dst tunnel work in IP_TUNNEL_INFO_BRIDGE mode

2019-03-24 Thread Pravin Shelar
On Sun, Mar 24, 2019 at 12:03 AM wenxu  wrote:
>
> On 2019/3/24 5:39 AM, Pravin Shelar wrote:
> > On Sat, Mar 23, 2019 at 2:18 AM wenxu  wrote:
> >> On 2019/3/23 3:50 PM, Pravin Shelar wrote:
> >>
> >> On Thu, Mar 21, 2019 at 3:34 AM  wrote:
> >>
> >> From: wenxu 
> >>
> >> There is currently no support for the multicast/broadcast aspects
> >> of VXLAN in ovs. In the datapath flow the tun_dst must be specified,
> >> but in IP_TUNNEL_INFO_BRIDGE mode the tun_dst cannot be specified,
> >> and the packet can be forwarded through the fdb of the vxlan device. In
> >> this mode the broadcast/multicast packet can be sent in the
> >> following ways in ovs.
> >>
> >> ovs-vsctl add-port br0 vxlan -- set in vxlan type=vxlan \
> >> options:key=1000 options:remote_ip=flow
> >> ovs-ofctl add-flow br0 in_port=LOCAL,dl_dst=ff:ff:ff:ff:ff:ff,\
> >> action=output:vxlan
> >>
> >> bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.1 \
> >> src_vni 1000 vni 1000 self
> >> bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.2 \
> >> src_vni 1000 vni 1000 self
> >>
> >> This would make the datapath a bit more complicated; can you give an 
> >> example of such a use case?
> >>
> >> There is currently no support for the multicast/broadcast aspects
> >> of VXLAN in ovs. To get around the lack of multicast support, it is 
> >> possible to
> >> pre-provision MAC to IP address mappings either manually or from a 
> >> controller.
> >>
> >> With this patch we can achieve this through the fdb of the lower vxlan
> >> device.
> >>
> >> For example, three servers connected with vxlan:
> >> server1 IP 10.0.0.1 tunnel IP  172.168.0.1 vni 1000
> >> server2 IP 10.0.0.2 tunnel IP  172.168.0.2 vni 1000
> >> server3 IP 10.0.0.3 tunnel IP  172.168.0.3 vni 1000
> >>
> >> All the broadcast ARP requests from server1 can be sent to vxlan_sys_4789
> >> in IP_TUNNEL_INFO_BRIDGE mode. The broadcast packet is then replicated
> >> through the fdb table of the vxlan device as follows:
> >>
> >> bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.1 \
> >> src_vni 1000 vni 1000 self
> >> bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.2 \
> >> src_vni 1000 vni 1000 self
> >>
> >>
> >> Likewise for the multicast case. This patch makes the ovs vxlan tunnel
> >> use the fdb table of the lower vxlan device.
> > Have you tried OVS mac learning?
> >
> The key point is that it lets the ovs vxlan tunnel make use of the fdb table 
> of the lower vxlan device.
>
> The fdb table can be configured statically or populated by mac learning from outside.
>
> For the broadcast example, ovs today can only achieve this through 
> multiple output actions that simulate the broadcast.
>
> ovs-ofctl add-flow br0 
> in_port=server1,dl_dst=ff:ff:ff:ff:ff:ff,actions=set_field:172.168.0.1->tun_dst,output:vxlan,\
>
> set_field:172.168.0.2->tun_dst,output:vxlan.
>
> But there are limits on the number of output actions.
>
I was referring to the mac-learning feature in OVS, i.e. using the learn
action. I wanted to see if there is something that you are not able to
do with the OVS learn action.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC v4 1/1] datapath: Add a new action check_pkt_len

2019-03-24 Thread Numan Siddique
On Fri, Mar 22, 2019 at 9:10 PM Gregory Rose  wrote:

>
> On 3/22/2019 2:58 AM, Numan Siddique wrote:
>
>
>
> On Fri, Mar 22, 2019 at 6:16 AM Gregory Rose  wrote:
>
>>
>>
>> On 3/21/2019 5:38 PM, Gregory Rose wrote:
>> >
>> >
>> > On 3/21/2019 10:37 AM, Numan Siddique wrote:
>> >> This is the datapath patch -
>> https://patchwork.ozlabs.org/patch/1046370/
>> >> and this is the corresponding ovs-vswitchd patch -
>> >> https://patchwork.ozlabs.org/patch/1059081/ (this is part of the
>> >> series -
>> >> https://patchwork.ozlabs.org/project/openvswitch/list/?series=98190,
>> >> but probably you would be interested in only ovs patch)
>> >>
>> >> Sharing the links so that you can find it easily.
>> >>
>> >> Thanks
>> >> Numan
>> >>
>> >>
>> >
>> > This patch:
>> >
>> > https://patchwork.ozlabs.org/patch/1059081/
>> >
>> > shows this when applied:
>> >
>> > Applying: Add a new OVS action check_pkt_larger
>> > .git/rebase-apply/patch:1097: new blank line at EOF.
>> > +
>> > warning: 1 line adds whitespace errors.
>> >
>> > In regards to the datapath patch 1046370
>> > 
>> >
>> > In execute_check_pkt_len():
>> >
>> > +
>> > +   actual_pkt_len = skb->len + (skb_vlan_tag_present(skb) ?
>> > VLAN_HLEN : 0);
>> > +
>> >
>> > This doesn't seem right to me - the skb length should include the
>> > length of the entire packet, including any
>> > VLAN tags, or at least that is my understanding.  Please check it.
>>
>
Hi Greg,

I checked and tested it. I can confirm that skb->len doesn't include the
vlan header.
Existing code in flow.c also uses the same way to get the packet len -
https://github.com/openvswitch/ovs/blob/master/datapath/flow.c#L77

Thanks
Numan

>
>> > In validate_and_copy_check_pkt_len() in flow_netlink.c:
>> >
>> > +   static const struct nla_policy pol[OVS_CHECK_PKT_LEN_ATTR_MAX
>> > + 1] = {
>> > +   [OVS_CHECK_PKT_LEN_ATTR_PKT_LEN] = {.type = NLA_U16 },
>> > +   [OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_GREATER] = {
>> > +   .type = NLA_NESTED },
>> > +   [OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_LESS_EQUAL] = {
>> > +   .type = NLA_NESTED },
>> > +   };
>> >
>> > I don't care for declaring these things within function scope and it
>> > is not generally done.  I see that
>> > flow_netlink.c has one other instance of the nla_policy structure
>> > statically declared within the function scope
>> > but if you look at datapath.c none of them are.  I prefer the way it's
>> > done in datapath.c.  I also grepped around
>> > in other kernel code in the ./net tree and that is also the way it's
>> > done there, i.e. I didn't see any other
>> > instances of it declared within function scope.
>> >
>> > I compiled both the ovs-vswitchd and openvswitch kernel module
>> > components with no issues.  I wanted to use
>> > clang but the version of clang on Ubuntu right now doesn't have
>> > retpoline support so it won't compile
>> > kernel modules.
>> >
>> > :-/
>> >
>> > I did some quick regression testing and found no problems.  If you can
>> > address the two coding issues I brought up
>> > then I'd be glad to add my reviewed and tested by tags.
>>
>> Oh wait, this is just an RFC.
>>
>> I'll review and test the patches again when they officially come out.
>> Maybe clang will have retpoline support
>> by then.
>>
>>
> Thank you for the review. I will address them. I am planning to submit the
> patch
> to net-next ML. Looks like the window is open now -
> http://vger.kernel.org/~davem/net-next.html
>
> Please let me know in case you prefer to submit another RFC version here
> before submitting to  net-dev ML.
>
>
> I think you're good to go.
>
> Thanks,
>
> - Greg
>
>
> Thanks
> Numan
>
> - Greg
>>
>>
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/2] ovs: tests:Adjust the test cases about group stats

2019-03-24 Thread Roi Dayan



On 22/03/2019 12:26, solomon wrote:
> The bucket id field is added to the ofp11_bucket_counter structure,
> so adjust the corresponding test cases.
> 
> 
> Signed-off-by: solomon 
> ---
>  tests/ofp-print.at | 48 
>  1 file changed, 24 insertions(+), 24 deletions(-)
> 
> diff --git a/tests/ofp-print.at b/tests/ofp-print.at
> index e38ca4ae5..93aa7ec3f 100644
> --- a/tests/ofp-print.at
> +++ b/tests/ofp-print.at
> @@ -2171,18 +2171,18 @@ AT_CLEANUP
>  AT_SETUP([NXST_GROUP reply - OF1.0])
>  AT_KEYWORDS([ofp-print OFPT_STATS_REPLY])
>  AT_CHECK([ovs-ofctl ofp-print "\
> -01 11 00 b8 00 00 00 04 ff ff 00 00 00 00 23 20 00 00 00 07 00 00 00 00 \
> -00 58 00 00 87 65 43 21 00 00 00 04 00 00 00 00 \
> +01 11 00 e0 00 00 00 04 ff ff 00 00 00 00 23 20 00 00 00 07 00 00 00 00 \
> +00 70 00 00 87 65 43 21 00 00 00 04 00 00 00 00 \
>  00 00 00 00 00 00 88 88 00 00 00 00 00 77 77 77 \
>  00 00 00 12 1d cd 65 00 \
> -00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> -00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> -00 00 00 00 00 00 66 66 00 00 00 00 00 33 33 33 \
> -00 48 00 00 00 00 00 05 00 00 00 02 00 00 00 00 \
> +00 00 00 00 00 00 00 00 00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> +00 00 00 01 00 00 00 00 00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> +00 00 00 02 00 00 00 00 00 00 00 00 00 00 66 66 00 00 00 00 00 33 33 33 \
> +00 58 00 00 00 00 00 05 00 00 00 02 00 00 00 00 \
>  00 00 00 00 00 00 88 88 00 00 00 00 00 77 77 77 \
>  00 00 00 10 1d cd 65 00 \
> -00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> -00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> +00 00 00 00 00 00 00 00 00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> +00 00 00 01 00 00 00 00 00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
>  "], [0], [dnl
>  NXST_GROUP reply (xid=0x4):
>   
> group_id=2271560481,duration=18.500s,ref_count=4,packet_count=34952,byte_count=7829367,bucket0:packet_count=4369,byte_count=2236962,bucket1:packet_count=4369,byte_count=2236962,bucket2:packet_count=26214,byte_count=3355443
> @@ -2193,16 +2193,16 @@ AT_CLEANUP
>  AT_SETUP([OFPST_GROUP reply - OF1.1])
>  AT_KEYWORDS([ofp-print OFPT_STATS_REPLY])
>  AT_CHECK([ovs-ofctl ofp-print "\
> -02 13 00 a0 00 00 00 02 00 06 00 00 00 00 00 00 \
> -00 50 00 00 87 65 43 21 00 00 00 04 00 00 00 00 \
> +02 13 00 c8 00 00 00 02 00 06 00 00 00 00 00 00 \
> +00 68 00 00 87 65 43 21 00 00 00 04 00 00 00 00 \
>  00 00 00 00 00 00 88 88 00 00 00 00 00 77 77 77 \
> -00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> -00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> -00 00 00 00 00 00 66 66 00 00 00 00 00 33 33 33 \
> -00 40 00 00 00 00 00 05 00 00 00 02 00 00 00 00 \
> +00 00 00 00 00 00 00 00 00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> +00 00 00 01 00 00 00 00 00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> +00 00 00 02 00 00 00 00 00 00 00 00 00 00 66 66 00 00 00 00 00 33 33 33 \
> +00 50 00 00 00 00 00 05 00 00 00 02 00 00 00 00 \
>  00 00 00 00 00 00 88 88 00 00 00 00 00 77 77 77 \
> -00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> -00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> +00 00 00 00 00 00 00 00 00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> +00 00 00 01 00 00 00 00 00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
>  "], [0], [dnl
>  OFPST_GROUP reply (OF1.1) (xid=0x2):
>   
> group_id=2271560481,ref_count=4,packet_count=34952,byte_count=7829367,bucket0:packet_count=4369,byte_count=2236962,bucket1:packet_count=4369,byte_count=2236962,bucket2:packet_count=26214,byte_count=3355443
> @@ -2213,18 +2213,18 @@ AT_CLEANUP
>  AT_SETUP([OFPST_GROUP reply - OF1.3])
>  AT_KEYWORDS([ofp-print OFPT_STATS_REPLY])
>  AT_CHECK([ovs-ofctl ofp-print "\
> -04 13 00 b0 00 00 00 02 00 06 00 00 00 00 00 00 \
> -00 58 00 00 87 65 43 21 00 00 00 04 00 00 00 00 \
> +04 13 00 d8 00 00 00 02 00 06 00 00 00 00 00 00 \
> +00 70 00 00 87 65 43 21 00 00 00 04 00 00 00 00 \
>  00 00 00 00 00 00 88 88 00 00 00 00 00 77 77 77 \
>  00 00 00 12 1d cd 65 00 \
> -00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> -00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> -00 00 00 00 00 00 66 66 00 00 00 00 00 33 33 33 \
> -00 48 00 00 00 00 00 05 00 00 00 02 00 00 00 00 \
> +00 00 00 00 00 00 00 00 00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> +00 00 00 01 00 00 00 00 00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> +00 00 00 02 00 00 00 00 00 00 00 00 00 00 66 66 00 00 00 00 00 33 33 33 \
> +00 58 00 00 00 00 00 05 00 00 00 02 00 00 00 00 \
>  00 00 00 00 00 00 88 88 00 00 00 00 00 77 77 77 \
>  00 00 00 10 1d cd 65 00 \
> -00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> -00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> +00 00 00 00 00 00 00 00 00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
> +00 00 00 01 00 00 00 00 00 00 00 00 00 00 11 11 00 00 00 00 00 22 22 22 \
>  "], [0], [dnl
>  OFPST_GROUP reply (OF1.3) (xid=0x2):
>   
> 

Re: [ovs-dev] [PATCH net-next] openvswitch: Make metadata_dst tunnel work in IP_TUNNEL_INFO_BRIDGE mode

2019-03-24 Thread wenxu
On 2019/3/24 上午5:39, Pravin Shelar wrote:
> On Sat, Mar 23, 2019 at 2:18 AM wenxu  wrote:
>> On 2019/3/23 下午3:50, Pravin Shelar wrote:
>>
>> On Thu, Mar 21, 2019 at 3:34 AM  wrote:
>>
>> From: wenxu 
>>
>> There is currently no support for the multicast/broadcast aspects
>> of VXLAN in ovs. In the datapath flow the tun_dst must be specified,
>> but in the IP_TUNNEL_INFO_BRIDGE mode the tun_dst cannot be specified
>> and the packet can instead be forwarded through the fdb of the vxlan
>> device. In this mode the broadcast/multicast packet can be sent in
>> the following way in ovs.
>>
>> ovs-vsctl add-port br0 vxlan -- set in vxlan type=vxlan \
>> options:key=1000 options:remote_ip=flow
>> ovs-ofctl add-flow br0 in_port=LOCAL,dl_dst=ff:ff:ff:ff:ff:ff,\
>> action=output:vxlan
>>
>> bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.1 \
>> src_vni 1000 vni 1000 self
>> bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.2 \
>> src_vni 1000 vni 1000 self
>>
>> This would make the datapath a bit more complicated; can you give an 
>> example of such a use case?
>>
>> There is currently no support for the multicast/broadcast aspects
>> of VXLAN in ovs. To get around the lack of multicast support, it is possible 
>> to
>> pre-provision MAC to IP address mappings either manually or from a 
>> controller.
>>
>> With this patch we can achieve this through the fdb of the lower vxlan
>> device.
>>
>> For example, three servers connected with vxlan:
>> server1 IP 10.0.0.1 tunnel IP  172.168.0.1 vni 1000
>> server2 IP 10.0.0.2 tunnel IP  172.168.0.2 vni 1000
>> server3 IP 10.0.0.3 tunnel IP  172.168.0.3 vni 1000
>>
>> All the broadcast ARP requests from server1 can be sent to vxlan_sys_4789
>> in IP_TUNNEL_INFO_BRIDGE mode. The broadcast packet is then replicated
>> through the fdb table of the vxlan device as follows:
>>
>> bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.1 \
>> src_vni 1000 vni 1000 self
>> bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.2 \
>> src_vni 1000 vni 1000 self
>>
>>
>> Likewise for the multicast case. This patch makes the ovs vxlan tunnel
>> use the fdb table of the lower vxlan device.
> Have you tried OVS mac learning?
>
The key point is that it lets the ovs vxlan tunnel make use of the fdb table 
of the lower vxlan device.

The fdb table can be configured statically or populated by mac learning from outside.

For the broadcast example, ovs today can only achieve this through 
multiple output actions that simulate the broadcast.

ovs-ofctl add-flow br0 
in_port=server1,dl_dst=ff:ff:ff:ff:ff:ff,actions=set_field:172.168.0.1->tun_dst,output:vxlan,\

    set_field:172.168.0.2->tun_dst,output:vxlan.

But there are limits on the number of output actions.


   


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev