The dp_hash selection method for select groups overcomes the scalability problems of the current default selection method. Because the default method hashes L2-L4 fields during translation and un-wildcards the hashed fields, it effectively requires an upcall to the slow path to load-balance every L4 connection. The consequences are an explosion of datapath flows (megaflows degenerate to microflows) and a limit on the connection setup rate OVS can handle.
This commit changes the default selection method to dp_hash, provided the bucket configuration is such that the dp_hash method can accurately represent the bucket weights with up to 64 hash values. Otherwise we keep the original default hash method.

We use the new dp_hash algorithm OVS_HASH_ALG_SYM_L4 to maintain the symmetry property of the old default hash method.

A controller can explicitly request the old default hash selection method by specifying selection method "hash" with an empty list of fields in the Group properties of the OpenFlow 1.5 Group Mod message.

Update the documentation about selection method in the ovs-ofctl man page.

Revise and complete the ofproto-dpif unit test cases for select groups.

Signed-off-by: Jan Scheurich <jan.scheur...@ericsson.com>
Signed-off-by: Nitin Katiyar <nitin.kati...@ericsson.com>
Co-authored-by: Nitin Katiyar <nitin.kati...@ericsson.com>
---
 NEWS                       |   2 +
 lib/ofp-group.c            |  15 ++-
 ofproto/ofproto-dpif.c     |  30 +++--
 ofproto/ofproto-dpif.h     |   1 +
 ofproto/ofproto-provider.h |   2 +-
 tests/mpls-xlate.at        |  26 ++--
 tests/ofproto-dpif.at      | 316 +++++++++++++++++++++++++++++++++++----------
 tests/ofproto-macros.at    |   7 +-
 utilities/ovs-ofctl.8.in   |  47 ++++---
 9 files changed, 334 insertions(+), 112 deletions(-)

diff --git a/NEWS b/NEWS index ec548b0..2b2be1e 100644 --- a/NEWS +++ b/NEWS @@ -17,6 +17,8 @@ Post-v2.9.0 * OFPT_ROLE_STATUS is now available in OpenFlow 1.3. * OpenFlow 1.5 extensible statistics (OXS) now implemented. * New OpenFlow 1.0 extensions for group support. + * Default selection method for select groups is now dp_hash with improved + accuracy.
- Linux kernel 4.14 * Add support for compiling OVS with the latest Linux 4.14 kernel - ovn: diff --git a/lib/ofp-group.c b/lib/ofp-group.c index f5b0af8..697208f 100644 --- a/lib/ofp-group.c +++ b/lib/ofp-group.c @@ -1600,12 +1600,17 @@ parse_group_prop_ntr_selection_method(struct ofpbuf *payload, return OFPERR_OFPBPC_BAD_VALUE; } - error = oxm_pull_field_array(payload->data, fields_len, - &gp->fields); - if (error) { - OFPPROP_LOG(&rl, false, + if (fields_len > 0) { + error = oxm_pull_field_array(payload->data, fields_len, + &gp->fields); + if (error) { + OFPPROP_LOG(&rl, false, "ntr selection method fields are invalid"); - return error; + return error; + } + } else { + /* Selection_method "hash" w/o fields means default hash method. */ + gp->fields.values_size = 0; } return 0; diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index c9c2e51..a45d6ea 100644 --- a/ofproto/ofproto-dpif.c +++ b/ofproto/ofproto-dpif.c @@ -1,5 +1,4 @@ /* - * Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017 Nicira, Inc. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -4787,7 +4786,7 @@ group_setup_dp_hash_table(struct group_dpif *group, size_t max_hash) } *webster; if (n_buckets == 0) { - VLOG_DBG(" Don't apply dp_hash method without buckets"); + VLOG_DBG(" Don't apply dp_hash method without buckets."); return false; } @@ -4862,9 +4861,24 @@ group_set_selection_method(struct group_dpif *group) const struct ofputil_group_props *props = &group->up.props; const char *selection_method = props->selection_method; + VLOG_DBG("Constructing select group %"PRIu32, group->up.group_id); if (selection_method[0] == '\0') { - VLOG_DBG("No selection method specified."); - group->selection_method = SEL_METHOD_DEFAULT; + VLOG_DBG("No selection method specified. 
Trying dp_hash."); + /* If the controller has not specified a selection method, check if + * the dp_hash selection method with max 64 hash values is appropriate + * for the given bucket configuration. */ + if (group_setup_dp_hash_table(group, 64)) { + /* Use dp_hash selection method with symmetric L4 hash. */ + group->selection_method = SEL_METHOD_DP_HASH; + group->hash_alg = OVS_HASH_ALG_SYM_L4; + group->hash_basis = 0; + VLOG_DBG("Use dp_hash with %d hash values using algorithm %d.", + group->hash_mask + 1, group->hash_alg); + } else { + /* Fall back to original default hashing in slow path. */ + VLOG_DBG("Falling back to default hash method."); + group->selection_method = SEL_METHOD_DEFAULT; + } } else if (!strcmp(selection_method, "dp_hash")) { VLOG_DBG("Selection method specified: dp_hash."); /* Try to use dp_hash if possible at all. */ @@ -4872,7 +4886,7 @@ group_set_selection_method(struct group_dpif *group) group->selection_method = SEL_METHOD_DP_HASH; group->hash_alg = props->selection_method_param >> 32; if (group->hash_alg >= __OVS_HASH_MAX) { - VLOG_DBG(" Invalid dp_hash algorithm %d. " + VLOG_DBG("Invalid dp_hash algorithm %d. " "Defaulting to OVS_HASH_ALG_L4", group->hash_alg); group->hash_alg = OVS_HASH_ALG_L4; } @@ -4881,7 +4895,7 @@ group_set_selection_method(struct group_dpif *group) group->hash_mask + 1, group->hash_alg); } else { /* Fall back to original default hashing in slow path. */ - VLOG_DBG(" Falling back to default hash method."); + VLOG_DBG("Falling back to default hash method."); group->selection_method = SEL_METHOD_DEFAULT; } } else if (!strcmp(selection_method, "hash")) { @@ -4890,12 +4904,12 @@ group_set_selection_method(struct group_dpif *group) /* Controller has specified hash fields. 
*/ struct ds s = DS_EMPTY_INITIALIZER; oxm_format_field_array(&s, &props->fields); - VLOG_DBG(" Hash fields: %s", ds_cstr(&s)); + VLOG_DBG("Hash fields: %s", ds_cstr(&s)); ds_destroy(&s); group->selection_method = SEL_METHOD_HASH; } else { /* No hash fields. Fall back to original default hashing. */ - VLOG_DBG(" No hash fields. Falling back to default hash method."); + VLOG_DBG("No hash fields. Falling back to default hash method."); group->selection_method = SEL_METHOD_DEFAULT; } } else { diff --git a/ofproto/ofproto-dpif.h b/ofproto/ofproto-dpif.h index e95fead..1a404c8 100644 --- a/ofproto/ofproto-dpif.h +++ b/ofproto/ofproto-dpif.h @@ -61,6 +61,7 @@ struct ofproto_async_msg; struct ofproto_dpif; struct uuid; struct xlate_cache; +struct xlate_ctx; /* Number of implemented OpenFlow tables. */ enum { N_TABLES = 255 }; diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h index d636fb3..2b77b89 100644 --- a/ofproto/ofproto-provider.h +++ b/ofproto/ofproto-provider.h @@ -572,7 +572,7 @@ struct ofgroup { const struct ovs_list buckets; /* Contains "struct ofputil_bucket"s. */ const uint32_t n_buckets; - const struct ofputil_group_props props; + struct ofputil_group_props props; struct rule_collection rules OVS_GUARDED; /* Referring rules. */ }; diff --git a/tests/mpls-xlate.at b/tests/mpls-xlate.at index 9bbf22a..34d82a3 100644 --- a/tests/mpls-xlate.at +++ b/tests/mpls-xlate.at @@ -25,7 +25,7 @@ dummy@ovs-dummy: hit:0 missed:0 ]) dnl Setup single MPLS tags. 
-AT_CHECK([ovs-ofctl -O OpenFlow13 add-group br0 group_id=1232,type=select,bucket=output:LOCAL]) +AT_CHECK([ovs-ofctl -O OpenFlow15 add-group br0 group_id=1232,type=select,selection_method=hash,bucket=output:LOCAL]) AT_CHECK([ovs-ofctl -O OpenFlow13 add-group br0 group_id=1233,type=all,bucket=output:LOCAL]) AT_CHECK([ovs-ofctl -O OpenFlow13 add-group br0 group_id=1234,type=all,bucket=dec_ttl,output:LOCAL]) AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 in_port=local,dl_type=0x0800,action=push_mpls:0x8847,set_field:10-\>mpls_label,output:1]) @@ -71,9 +71,15 @@ AT_CHECK([tail -1 stdout], [0], [Datapath actions: pop_mpls(eth_type=0x800),recirc(0x2) ]) -AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'recirc_id(2),in_port(1),eth(src=f8:bc:12:44:34:b6,dst=f8:bc:12:46:58:e0),eth_type(0x0800),ipv4(src=1.1.2.92,dst=1.1.2.88,proto=47,tos=0,ttl=64,frag=no)'], [0], [stdout]) -AT_CHECK([tail -1 stdout], [0], - [Datapath actions: 100 +for d in 0 1 2 3; do + pkt="in_port(1),eth(src=f8:bc:12:44:34:b6,dst=f8:bc:12:46:58:e0),eth_type(0x8847),mpls(label=22,tc=0,ttl=64,bos=1)" + AT_CHECK([ovs-appctl netdev-dummy/receive p0 $pkt]) +done + +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/packets.*actions:1/actions:1/' | strip_used | strip_ufid | sort], [0], [dnl +flow-dump from non-dpdk interfaces: +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth(src=f8:bc:12:44:34:b6,dst=f8:bc:12:46:58:e0),eth_type(0x8847),mpls(label=22/0xfffff,tc=0/0,ttl=64/0x0,bos=1/1), packets:3, bytes:54, used:0.0s, actions:pop_mpls(eth_type=0x800),recirc(0x3) +recirc_id(0x3),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), actions:100 ]) dnl Test MPLS pop then all group output (bucket actions do not trigger recirculation) @@ -85,10 +91,10 @@ AT_CHECK([tail -1 stdout], [0], dnl Test MPLS pop then all group output (bucket actions trigger recirculation) AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 
'in_port(1),eth(src=f8:bc:12:44:34:b6,dst=f8:bc:12:46:58:e0),eth_type(0x8847),mpls(label=24,tc=0,ttl=64,bos=1)'], [0], [stdout]) AT_CHECK([tail -1 stdout], [0], - [Datapath actions: pop_mpls(eth_type=0x800),recirc(0x3) + [Datapath actions: pop_mpls(eth_type=0x800),recirc(0x4) ]) -AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'recirc_id(3),in_port(1),eth(src=f8:bc:12:44:34:b6,dst=f8:bc:12:46:58:e0),eth_type(0x0800),ipv4(src=1.1.2.92,dst=1.1.2.88,proto=47,tos=0,ttl=64,frag=no)'], [0], [stdout]) +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'recirc_id(4),in_port(1),eth(src=f8:bc:12:44:34:b6,dst=f8:bc:12:46:58:e0),eth_type(0x0800),ipv4(src=1.1.2.92,dst=1.1.2.88,proto=47,tos=0,ttl=64,frag=no)'], [0], [stdout]) AT_CHECK([tail -1 stdout], [0], [Datapath actions: set(ipv4(ttl=63)),100 ]) @@ -96,10 +102,10 @@ AT_CHECK([tail -1 stdout], [0], dnl Test MPLS pop then all output to patch port AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'in_port(1),eth(src=f8:bc:12:44:34:b6,dst=f8:bc:12:46:58:e0),eth_type(0x8847),mpls(label=25,tc=0,ttl=64,bos=1)'], [0], [stdout]) AT_CHECK([tail -1 stdout], [0], - [Datapath actions: pop_mpls(eth_type=0x800),recirc(0x4) + [Datapath actions: pop_mpls(eth_type=0x800),recirc(0x5) ]) -AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'recirc_id(4),in_port(1),eth(src=f8:bc:12:44:34:b6,dst=f8:bc:12:46:58:e0),eth_type(0x0800),ipv4(src=1.1.2.92,dst=1.1.2.88,proto=47,tos=0,ttl=64,frag=no)'], [0], [stdout]) +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'recirc_id(5),in_port(1),eth(src=f8:bc:12:44:34:b6,dst=f8:bc:12:46:58:e0),eth_type(0x0800),ipv4(src=1.1.2.92,dst=1.1.2.88,proto=47,tos=0,ttl=64,frag=no)'], [0], [stdout]) AT_CHECK([tail -1 stdout], [0], [Datapath actions: 101 ]) @@ -124,10 +130,10 @@ AT_CHECK([tail -1 stdout], [0], dnl Double MPLS pop AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'in_port(1),eth(src=f8:bc:12:44:34:b6,dst=f8:bc:12:46:58:e0),eth_type(0x8847),mpls(label=60,tc=0,ttl=64,bos=0,label=50,tc=0,ttl=64,bos=1)'], [0], [stdout]) AT_CHECK([tail 
-1 stdout], [0], - [Datapath actions: pop_mpls(eth_type=0x8847),pop_mpls(eth_type=0x800),recirc(0x5) + [Datapath actions: pop_mpls(eth_type=0x8847),pop_mpls(eth_type=0x800),recirc(0x7) ]) -AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'recirc_id(5),in_port(1),eth(src=f8:bc:12:44:34:b6,dst=f8:bc:12:46:58:e0),eth_type(0x0800),ipv4(src=1.1.2.92,dst=1.1.2.88,proto=47,tos=0,ttl=64,frag=no)'], [0], [stdout]) +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'recirc_id(7),in_port(1),eth(src=f8:bc:12:44:34:b6,dst=f8:bc:12:46:58:e0),eth_type(0x0800),ipv4(src=1.1.2.92,dst=1.1.2.88,proto=47,tos=0,ttl=64,frag=no)'], [0], [stdout]) AT_CHECK([tail -1 stdout], [0], [Datapath actions: set(ipv4(ttl=10)),100 ]) diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at index 6d87951..00ab97b 100644 --- a/tests/ofproto-dpif.at +++ b/tests/ofproto-dpif.at @@ -337,10 +337,18 @@ OVS_VSWITCHD_START add_of_ports br0 1 10 AT_CHECK([ovs-ofctl -O OpenFlow12 add-group br0 'group_id=1234,type=select,bucket=set_field:192.168.3.90->ip_src,output:10']) AT_CHECK([ovs-ofctl -O OpenFlow12 add-flow br0 'ip actions=group:1234,output:10']) -AT_CHECK([ovs-appctl ofproto/trace br0 'in_port=1,dl_src=50:54:00:00:00:05,dl_dst=50:54:00:00:00:07,dl_type=0x0800,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_proto=1,nw_tos=0,nw_ttl=128,icmp_type=8,icmp_code=0'], [0], [stdout]) -AT_CHECK([tail -1 stdout], [0], - [Datapath actions: set(ipv4(src=192.168.3.90,dst=192.168.0.2)),10,set(ipv4(src=192.168.0.1,dst=192.168.0.2)),10 + +for d in 0 1 2 3; do + pkt="in_port(1),eth(src=50:54:00:00:00:07,dst=50:54:00:00:00:1),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.1.2,proto=1,tos=0,ttl=128,frag=no),icmp(type=8,code=0)" + AT_CHECK([ovs-appctl netdev-dummy/receive p1 $pkt]) +done + +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/dp_hash(.*\/0xf)/dp_hash(0xXXXX\/0xf)/' | sed 's/packets.*actions:/actions:/' | strip_ufid | strip_used | sort], [0], [dnl +flow-dump from non-dpdk interfaces: 
+recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), actions:hash(sym_l4(0)),recirc(0x1) +recirc_id(0x1),dp_hash(0xXXXX/0xf),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(src=192.168.0.1,frag=no), actions:set(ipv4(src=192.168.3.90)),10,set(ipv4(src=192.168.0.1)),10 ]) + OVS_VSWITCHD_STOP AT_CLEANUP @@ -397,81 +405,265 @@ AT_CLEANUP AT_SETUP([ofproto-dpif - select group]) + +# Helper function to check the spread of dp_hash flows over buckets in the datapath +check_dpflow_stats () { + min_flows=$1 + min_buckets=$2 + read -d '' dpflows + hash_flow=`echo "$dpflows" | grep "actions:hash"` + n_flows=`echo "$dpflows" | grep -c dp_hash` + n_buckets=`echo "$dpflows" | grep dp_hash | grep -o "actions:[[0-9]]*" | sort | uniq -c | wc -l` + if [[ $n_flows -ge $min_flows ]]; then flows=ok; else flows=nok; fi + if [[ $n_buckets -ge $min_buckets ]]; then buckets=ok; else buckets=nok; fi + echo $hash_flow + echo "n_flows=$flows n_buckets=$buckets" +} + OVS_VSWITCHD_START add_of_ports br0 1 10 11 + +ovs-appctl vlog/set ofproto_dpif:file:dbg AT_CHECK([ovs-ofctl -O OpenFlow12 add-group br0 'group_id=1234,type=select,bucket=output:10,bucket=output:11']) +AT_CHECK([grep -A6 "Constructing select group 1234" ovs-vswitchd.log | sed 's/^.*ofproto_dpif/ofproto_dpif/'], [0], [dnl +ofproto_dpif|DBG|Constructing select group 1234 +ofproto_dpif|DBG|No selection method specified. Trying dp_hash. +ofproto_dpif|DBG| Minimum weight: 1, total weight: 2 +ofproto_dpif|DBG| Using 16 hash values: +ofproto_dpif|DBG| Bucket 0: weight=1, target=8.00 hits=8 +ofproto_dpif|DBG| Bucket 1: weight=1, target=8.00 hits=8 +ofproto_dpif|DBG|Use dp_hash with 16 hash values using algorithm 1. +]) AT_CHECK([ovs-ofctl -O OpenFlow12 add-flow br0 'ip actions=write_actions(group:1234)']) # Try a bunch of different flows and make sure that they get distributed -# at least somewhat. 
-for d in 0 1 2 3 4 5 6 7 8 9 a b c d e f; do - AT_CHECK([ovs-appctl ofproto/trace br0 "in_port=1,dl_src=50:54:00:00:00:07,dl_dst=50:54:00:00:00:0$d,dl_type=0x0800,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_proto=1,nw_tos=0,nw_ttl=128,icmp_type=8,icmp_code=0"], [0], [stdout]) - tail -1 stdout >> results +# # at least somewhat. +for d in 0 1 2 3; do + for s in 1 2 3 4 ; do + pkt="in_port(1),eth(src=50:54:00:00:00:07,dst=50:54:00:00:00:1),eth_type(0x0800),ipv4(src=192.168.0.$s,dst=192.168.1.$d,proto=1,tos=0,ttl=128,frag=no),icmp(type=8,code=0)" + AT_CHECK([ovs-appctl netdev-dummy/receive p1 $pkt]) + done done -sort results | uniq -c -AT_CHECK([sort results | uniq], [0], - [Datapath actions: 10 -Datapath actions: 11 + +AT_CHECK([ovs-appctl dpctl/dump-flows | sort | strip_ufid | strip_used | check_dpflow_stats 5 2], [0], [dnl +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:15, bytes:1590, used:0.0s, actions:hash(sym_l4(0)),recirc(0x1) +n_flows=ok n_buckets=ok ]) + OVS_VSWITCHD_STOP AT_CLEANUP AT_SETUP([ofproto-dpif - select group with watch port]) + OVS_VSWITCHD_START add_of_ports br0 1 10 11 AT_CHECK([ovs-ofctl -O OpenFlow12 add-group br0 'group_id=1234,type=select,bucket=watch_port:10,output:10,bucket=output:11']) AT_CHECK([ovs-ofctl -O OpenFlow12 add-flow br0 'ip actions=write_actions(group:1234)']) -AT_CHECK([ovs-appctl ofproto/trace br0 'in_port=1,dl_src=50:54:00:00:00:07,dl_dst=50:54:00:00:00:07,dl_type=0x0800,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_proto=1,nw_tos=0,nw_ttl=128,icmp_type=8,icmp_code=0'], [0], [stdout]) -AT_CHECK([tail -1 stdout], [0], - [Datapath actions: 11 + +for d in 0 1 2 3; do + pkt="in_port(1),eth(src=50:54:00:00:00:07,dst=50:54:00:00:00:1),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=1,tos=0,ttl=128,frag=no),icmp(type=8,code=0)" + AT_CHECK([ovs-appctl netdev-dummy/receive p1 $pkt]) +done + +AT_CHECK([ovs-appctl dpctl/dump-flows | sort| sed 's/dp_hash(.*\/0xf)/dp_hash(0xXXXX\/0xf)/' | 
strip_ufid | strip_used], [0], [dnl +flow-dump from non-dpdk interfaces: +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:3, bytes:318, used:0.0s, actions:hash(sym_l4(0)),recirc(0x1) +recirc_id(0x1),dp_hash(0xXXXX/0xf),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:3, bytes:318, used:0.0s, actions:11 ]) + OVS_VSWITCHD_STOP AT_CLEANUP -AT_SETUP([ofproto-dpif - select group with weight]) +AT_SETUP([ofproto-dpif - select group with weights]) + +# Helper function to check the spread of dp_hash flows over buckets in the datapath +check_dpflow_stats () { + min_flows=$1 + min_buckets=$2 + read -d '' dpflows + hash_flow=`echo "$dpflows" | grep "actions:hash"` + n_flows=`echo "$dpflows" | grep -c dp_hash` + n_buckets=`echo "$dpflows" | grep dp_hash | grep -o "actions:[[0-9]]*" | sort | uniq -c | wc -l` + if [[ $n_flows -ge $min_flows ]]; then flows=ok; else flows=nok; fi + if [[ $n_buckets -ge $min_buckets ]]; then buckets=ok; else buckets=nok; fi + echo $hash_flow + echo "n_flows=$flows n_buckets=$buckets" +} + +# Helper function to check the accuracy of distribution of packets over buckets +check_group_stats () { + min=($1 $2 $3 $4) + buckets=`grep -o 'packet_count=[[0-9]]*' | cut -d'=' -f2 | tail -n +2` + i=0 + for bucket in $buckets; do + if [[ $bucket -ge ${min[i]} ]]; then + echo "bucket$i >= ${min[[$i]]}" + else + echo "bucket$i < ${min[[$i]]}" + fi + (( i++ )) + if [[ $i -ge 4 ]]; then break; fi + done +} + OVS_VSWITCHD_START -add_of_ports br0 1 10 11 12 -AT_CHECK([ovs-ofctl -O OpenFlow12 add-group br0 'group_id=1234,type=select,bucket=output:10,bucket=output:11,weight=2000,bucket=output:12,weight=0']) +add_of_ports br0 1 10 11 12 13 14 + +ovs-appctl vlog/set ofproto_dpif:file:dbg +AT_CHECK([ovs-ofctl -O OpenFlow13 add-group br0 'group_id=1234,type=select,bucket=weight:5,output:10,bucket=weight:10,output:11,bucket=weight:25,output:12,bucket=weight:60,output:13,bucket=weight:0,output:14']) 
+AT_CHECK([grep -A9 "Constructing select group 1234" ovs-vswitchd.log | sed 's/^.*ofproto_dpif/ofproto_dpif/'], [0], [dnl +ofproto_dpif|DBG|Constructing select group 1234 +ofproto_dpif|DBG|No selection method specified. Trying dp_hash. +ofproto_dpif|DBG| Minimum weight: 5, total weight: 100 +ofproto_dpif|DBG| Using 32 hash values: +ofproto_dpif|DBG| Bucket 0: weight=5, target=1.60 hits=2 +ofproto_dpif|DBG| Bucket 1: weight=10, target=3.20 hits=3 +ofproto_dpif|DBG| Bucket 2: weight=25, target=8.00 hits=8 +ofproto_dpif|DBG| Bucket 3: weight=60, target=19.20 hits=19 +ofproto_dpif|DBG| Bucket 4: weight=0, target=0.00 hits=0 +ofproto_dpif|DBG|Use dp_hash with 32 hash values using algorithm 1. +]) AT_CHECK([ovs-ofctl -O OpenFlow12 add-flow br0 'ip actions=write_actions(group:1234)']) -AT_CHECK([ovs-appctl ofproto/trace br0 'in_port=1,dl_src=50:54:00:00:00:07,dl_dst=50:54:00:00:00:07,dl_type=0x0800,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_proto=1,nw_tos=0,nw_ttl=128,icmp_type=8,icmp_code=0'], [0], [stdout]) -AT_CHECK([tail -1 stdout], [0], - [Datapath actions: 11 + +# Try 1000 different flows and make sure that they get distributed according to weights +for d1 in 0 1 2 3 4 5 6 7 8 9 ; do + for d2 in 0 1 2 3 4 5 6 7 8 9 ; do + for s in 0 1 2 3 4 5 6 7 8 9 ; do + pkt="in_port(1),eth(src=50:54:00:00:00:07,dst=50:54:00:00:00:1),eth_type(0x0800),ipv4(src=192.168.1.$s,dst=192.168.$d1.$d2,proto=6,tos=0,ttl=128,frag=no),tcp(src=1000$s,dst=1000)" + AT_CHECK([ovs-appctl netdev-dummy/receive p1 $pkt]) + done + done +done + +# Check balanced distribution over 32 dp_hash values +AT_CHECK([ovs-appctl dpctl/dump-flows | sort | strip_ufid | strip_used | check_dpflow_stats 32 4 ], [0], [dnl +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:999, bytes:117882, used:0.0s, actions:hash(sym_l4(0)),recirc(0x1) +n_flows=ok n_buckets=ok +]) + +# Check that actual distribution over the buckets is reasonably accurate: + ideal weights dp_hash values +# bucket0: 
5%*1000 = 50 2/32*1000 = 63 +# bucket1: 10%*1000 = 100 3/32*1000 = 94 +# bucket2: 25%*1000 = 250 8/32*1000 = 250 +# bucket3: 60%*1000 = 600 19/32*1000 = 594 +# bucket4: 0 0 + +ovs-appctl time/warp 1000 +AT_CHECK([ovs-ofctl -O OpenFlow13 dump-group-stats br0 | sed 's/duration=[[0-9]]\.[[0-9]]*s,//' | check_group_stats 40 80 200 500], +[0], [dnl +bucket0 >= 40 +bucket1 >= 80 +bucket2 >= 200 +bucket3 >= 500 ]) + OVS_VSWITCHD_STOP AT_CLEANUP -AT_SETUP([ofproto-dpif - select group with hash selection method]) +AT_SETUP([ofproto-dpif - select group with explicit dp_hash selection method]) + OVS_VSWITCHD_START add_of_ports br0 1 10 11 -# Check that parse failures after 'fields' parsing work -AT_CHECK([ovs-ofctl -O OpenFlow10 add-group br0 'group_id=1,type=select,fields(eth_dst),bukket=output:10'], [1], ,[dnl -ovs-ofctl: unknown keyword bukket + +ovs-appctl vlog/set ofproto_dpif:file:dbg +AT_CHECK([ovs-ofctl -O OpenFlow15 add-group br0 'group_id=1234,type=select,selection_method=dp_hash,bucket=output:10,bucket=output:11']) +AT_CHECK([grep -A6 "Constructing select group 1234" ovs-vswitchd.log | sed 's/^.*ofproto_dpif/ofproto_dpif/'], [0], [dnl +ofproto_dpif|DBG|Constructing select group 1234 +ofproto_dpif|DBG|Selection method specified: dp_hash. +ofproto_dpif|DBG| Minimum weight: 1, total weight: 2 +ofproto_dpif|DBG| Using 16 hash values: +ofproto_dpif|DBG| Bucket 0: weight=1, target=8.00 hits=8 +ofproto_dpif|DBG| Bucket 1: weight=1, target=8.00 hits=8 +ofproto_dpif|DBG|Use dp_hash with 16 hash values using algorithm 0. +]) + +# Fall back to legacy hash with zero buckets +AT_CHECK([ovs-ofctl -O OpenFlow15 add-group br0 'group_id=1235,type=select,selection_method=dp_hash']) +AT_CHECK([grep -A3 "Constructing select group 1235" ovs-vswitchd.log | sed 's/^.*ofproto_dpif/ofproto_dpif/'], [0], [dnl +ofproto_dpif|DBG|Constructing select group 1235 +ofproto_dpif|DBG|Selection method specified: dp_hash. +ofproto_dpif|DBG| Don't apply dp_hash method without buckets. 
+ofproto_dpif|DBG|Falling back to default hash method. +]) + +# Fall back to legacy hash when more than 64 hash values would be required +AT_CHECK([ovs-ofctl -O OpenFlow15 add-group br0 'group_id=1236,type=select,selection_method=dp_hash,bucket=weight=1,output:10,bucket=weight=1000,output:11']) +AT_CHECK([grep -A4 "Constructing select group 1236" ovs-vswitchd.log | sed 's/^.*ofproto_dpif/ofproto_dpif/'], [0], [dnl +ofproto_dpif|DBG|Constructing select group 1236 +ofproto_dpif|DBG|Selection method specified: dp_hash. +ofproto_dpif|DBG| Minimum weight: 1, total weight: 1001 +ofproto_dpif|DBG| Too many hash values required: 1024 +ofproto_dpif|DBG|Falling back to default hash method. +]) + +OVS_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([ofproto-dpif - select group with legacy hash selection method]) + +# Helper function to check the spread of dp_hash flows over buckets in the datapath +check_dpflow_stats () { + min_flows=$1 + min_buckets=$2 + read -d '' dpflows + n_flows=`echo "$dpflows" | wc -l` + n_buckets=`echo "$dpflows" | grep -o "actions:[[0-9]]*" | sort | uniq -c | wc -l` + if [[ $n_flows -ge $min_flows ]]; then flows=ok; else flows=nok; fi + if [[ $n_buckets -ge $min_buckets ]]; then buckets=ok; else buckets=nok; fi + echo "n_flows=$flows n_buckets=$buckets" +} + +OVS_VSWITCHD_START +add_of_ports br0 1 10 11 + +ovs-appctl vlog/set ofproto_dpif:file:dbg +AT_CHECK([ovs-ofctl -O OpenFlow15 add-group br0 'group_id=1234,type=select,selection_method=hash,bucket=output:10,bucket=output:11']) +AT_CHECK([grep -A2 "Constructing select group 1234" ovs-vswitchd.log | sed 's/^.*ofproto_dpif/ofproto_dpif/'], [0], [dnl +ofproto_dpif|DBG|Constructing select group 1234 +ofproto_dpif|DBG|Selection method specified: hash. +ofproto_dpif|DBG|No hash fields. Falling back to default hash method. 
]) -AT_CHECK([ovs-ofctl -O OpenFlow15 add-group br0 'group_id=1234,type=select,selection_method=hash,fields(eth_dst,ip_dst,tcp_dst),bucket=output:10,bucket=output:11']) + AT_CHECK([ovs-ofctl -O OpenFlow15 add-flow br0 'ip actions=write_actions(group:1234)']) -# Try a bunch of different flows and make sure that they get distributed -# at least somewhat. -for d in 0 1 2 3 4 5 6 7 8 9 a b c d e f; do - AT_CHECK([ovs-appctl ofproto/trace br0 "in_port=1,dl_src=50:54:00:00:00:07,dl_dst=50:54:00:00:00:0$d,dl_type=0x0800,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_proto=1,nw_tos=0,nw_ttl=128,icmp_type=8,icmp_code=0"], [0], [stdout]) - tail -1 stdout >> results +# Try 16 flows with differing default hash values. +for d in 0 1 2 3; do + for s in 1 2 3 4 ; do + pkt="in_port(1),eth(src=50:54:00:00:00:07,dst=50:54:00:00:00:1),eth_type(0x0800),ipv4(src=192.168.0.$s,dst=192.168.1.$d,proto=1,tos=0,ttl=128,frag=no),icmp(type=8,code=0)" + AT_CHECK([ovs-appctl netdev-dummy/receive p1 $pkt]) + done done -sort results | uniq -c -AT_CHECK([sort results | uniq], [0], - [Datapath actions: 10 -Datapath actions: 11 + +# Check that the packets installed 16 data path flows and each of the two +# buckets is hit at least once. +AT_CHECK([ovs-appctl dpctl/dump-flows | strip_ufid | strip_used | sort | check_dpflow_stats 16 2], [0], [dnl +n_flows=ok n_buckets=ok ]) -> results -# Try a bunch of different flows and make sure that they are not distributed -# as they only vary a field that is not hashed -for d in 0 1 2 3 4 5 6 7 8 9 a b c d e f; do - AT_CHECK([ovs-appctl ofproto/trace br0 "in_port=1,dl_src=50:54:00:00:00:0$d,dl_dst=50:54:00:00:00:07,dl_type=0x0800,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_proto=1,nw_tos=0,nw_ttl=128,icmp_type=8,icmp_code=0"], [0], [stdout]) - tail -1 stdout >> results -done -sort results | uniq -c -AT_CHECK([sort results | uniq | sed 's/1[[01]]/1?/'], [0], - [Datapath actions: 1? 
+OVS_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([ofproto-dpif - select group with custom hash selection method]) + +# Helper function to check the spread of dp_hash flows over buckets in the datapath +check_dpflow_stats () { + min_flows=$1 + min_buckets=$2 + read -d '' dpflows + n_flows=`echo "$dpflows" | wc -l` + n_buckets=`echo "$dpflows" | grep -o "actions:[[0-9]]*" | sort | uniq -c | wc -l` + if [[ $n_flows -ge $min_flows ]]; then flows=ok; else flows=nok; fi + if [[ $n_buckets -ge $min_buckets ]]; then buckets=ok; else buckets=nok; fi + echo "n_flows=$flows n_buckets=$buckets" +} + +OVS_VSWITCHD_START +add_of_ports br0 1 10 11 + +# Check that parse failures after 'fields' parsing work +AT_CHECK([ovs-ofctl -O OpenFlow10 add-group br0 'group_id=1,type=select,fields(eth_dst),bukket=output:10'], [1], ,[dnl +ovs-ofctl: unknown keyword bukket ]) # Check that fields are rejected without "selection_method=hash". @@ -484,43 +676,31 @@ AT_CHECK([ovs-ofctl -O OpenFlow15 add-group br0 'group_id=1235,type=select,selec ovs-ofctl: selection_method_param is only allowed with "selection_method" ]) -OVS_VSWITCHD_STOP -AT_CLEANUP - -AT_SETUP([ofproto-dpif - select group with dp_hash selection method]) -OVS_VSWITCHD_START -add_of_ports br0 1 10 11 -AT_CHECK([ovs-ofctl -O OpenFlow15 add-group br0 'group_id=1234,type=select,selection_method=dp_hash,bucket=output:10,bucket=output:11']) -AT_CHECK([ovs-ofctl -O OpenFlow15 add-flow br0 'ip,nw_src=192.168.0.1 actions=group:1234']) +AT_CHECK([ovs-ofctl -O OpenFlow15 add-group br0 'group_id=1234,type=select,selection_method=hash,fields(eth_dst,ip_dst,tcp_dst),bucket=output:10,bucket=output:11']) +AT_CHECK([ovs-ofctl -O OpenFlow15 add-flow br0 'ip actions=write_actions(group:1234)']) -# Try a bunch of different flows and make sure that they get distributed -# at least somewhat. 
-for d in 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15; do - pkt="in_port(1),eth(src=50:54:00:00:00:07,dst=50:54:00:00:00:01),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.1.$d,proto=1,tos=0,ttl=128,frag=no),icmp(type=8,code=0)" +# Try 16 flows with differing custom hash and check that they give rise to +# 16 data path flows and each of the two buckets is hit at least once +for d in 0 1 2 3 4 5 6 7 8 9 a b c d e f; do + pkt="in_port(1),eth(src=50:54:00:00:00:07,dst=50:54:00:00:00:$d),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=1,tos=0,ttl=128,frag=no),icmp(type=8,code=0)" AT_CHECK([ovs-appctl netdev-dummy/receive p1 $pkt]) done -AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/dp_hash(.*\/0xf)/dp_hash(0xXXXX\/0xf)/' | sed 's/packets.*actions:1/actions:1/' | \ - strip_ufid | strip_used | sort | uniq], [0], [dnl -flow-dump from non-dpdk interfaces: -recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(src=192.168.0.1,frag=no), packets:15, bytes:1590, used:0.0s, actions:hash(l4(0)),recirc(0x1) -recirc_id(0x1),dp_hash(0xXXXX/0xf),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), actions:10 -recirc_id(0x1),dp_hash(0xXXXX/0xf),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), actions:11 +AT_CHECK([ovs-appctl dpctl/dump-flows | strip_ufid | strip_used | sort | check_dpflow_stats 16 2], [0], [dnl +n_flows=ok n_buckets=ok ]) AT_CHECK([ovs-appctl revalidator/purge], [0]) -# Try a bunch of different flows and make sure that they are not distributed -# as they only vary a field that is not hashed +# Try 16 flows that differ only in fields that are not part of the custom +# hash and check that there is only a single datapath flow for d in 0 1 2 3 4 5 6 7 8 9 a b c d e f; do pkt="in_port(1),eth(src=50:54:00:00:00:$d,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=1,tos=0,ttl=128,frag=no),icmp(type=8,code=0)" AT_CHECK([ovs-appctl netdev-dummy/receive p1 $pkt]) done -AT_CHECK([ovs-appctl 
dpctl/dump-flows | sed 's/dp_hash(.*\/0xf)/dp_hash(0xXXXX\/0xf)/' | sed 's/\(actions:1\)[[01]]/\1X/' | strip_ufid | strip_used | sort], [0], [dnl -flow-dump from non-dpdk interfaces: -recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(src=192.168.0.1,frag=no), packets:15, bytes:1590, used:0.0s, actions:hash(l4(0)),recirc(0x2) -recirc_id(0x2),dp_hash(0xXXXX/0xf),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:15, bytes:1590, used:0.0s, actions:1X +AT_CHECK([ovs-appctl dpctl/dump-flows | grep -c recirc_id], [0], [dnl +1 ]) OVS_VSWITCHD_STOP diff --git a/tests/ofproto-macros.at b/tests/ofproto-macros.at index 9a37464..8923ce0 100644 --- a/tests/ofproto-macros.at +++ b/tests/ofproto-macros.at @@ -300,6 +300,11 @@ strip_used () { sed 's/used:[[0-9]]\.[[0-9]]*/used:0.0/' } +# Removes all 'duration=...' to make output easier to compare. +strip_duration () { + sed 's/duration=[[0-9]]*\.[[0-9]]*s,//' +} + # Strips 'ufid:...' from output, to make it easier to compare. # (ufids are random.) strip_ufid () { @@ -318,7 +323,7 @@ m4_define([_OVS_VSWITCHD_START], [dnl Create database. touch .conf.db.~lock~ AT_CHECK([ovsdb-tool create conf.db $abs_top_srcdir/vswitchd/vswitch.ovsschema]) - +q dnl Start ovsdb-server. AT_CHECK([ovsdb-server --detach --no-chdir --pidfile --log-file --remote=punix:$OVS_RUNDIR/db.sock], [0], [], [stderr]) on_exit "kill `cat ovsdb-server.pid`" diff --git a/utilities/ovs-ofctl.8.in b/utilities/ovs-ofctl.8.in index 2e2f696..4f8555a 100644 --- a/utilities/ovs-ofctl.8.in +++ b/utilities/ovs-ofctl.8.in @@ -2120,28 +2120,23 @@ The selection method used to select a bucket for a select group. This is a string of 1 to 15 bytes in length known to lower layers. This field is optional for \fBadd\-group\fR, \fBadd\-groups\fR and \fBmod\-group\fR commands on groups of type \fBselect\fR. Prohibited -otherwise. The default value is the empty string. +otherwise. 
If no selection method is specified, Open vSwitch up to
+release 2.9 applies the \fBhash\fR method with default fields. From
+2.10 onwards Open vSwitch defaults to the \fBdp_hash\fR method with a symmetric
+L3/L4 hash algorithm, unless the weighted group buckets cannot be mapped to
+a maximum of 64 dp_hash values with sufficient accuracy.
+In those rare cases Open vSwitch 2.10 and later fall back to the \fBhash\fR
+method with the default set of hash fields.
 .RS
-.IP \fBhash\fR
-Use a hash computed over the fields specified with the \fBfields\fR
-option, see below. \fBhash\fR uses the \fBselection_method_param\fR
-as the hash basis.
-.IP
-Note that the hashed fields become exact matched by the datapath
-flows. For example, if the TCP source port is hashed, the created
-datapath flows will match the specific TCP source port value present
-in the packet received. Since each TCP connection generally has a
-different source port value, a separate datapath flow will be need to
-be inserted for each TCP connection thus hashed to a select group
-bucket.
 .IP \fBdp_hash\fR
 Use a datapath computed hash value. The hash algorithm varies accross
 different datapath implementations. \fBdp_hash\fR uses the upper 32
 bits of the \fBselection_method_param\fR as the datapath hash
-algorithm selector, which currently must always be 0, corresponding to
-hash computation over the IP 5-tuple (selecting specific fields with
-the \fBfields\fR option is not allowed with \fBdp_hash\fR). The lower
-32 bits are used as the hash basis.
+algorithm selector. The supported values are \fB0\fR (corresponding to
+hash computation over the IP 5-tuple) and \fB1\fR (corresponding to a
+\fIsymmetric\fR hash computation over the IP 5-tuple). Selecting specific
+fields with the \fBfields\fR option is not supported with \fBdp_hash\fR.
+The lower 32 bits are used as the hash basis.
 .IP
 Using \fBdp_hash\fR has the advantage that it does not require the
 generated datapath flows to exact match any additional packet header
@@ -2155,9 +2150,23 @@ when needed, and a second match is required to match some bits of its
 value. This double-matching incurs a small additional latency cost
 for each packet, but this latency is orders of magnitude less than the
 latency of creating new datapath flows for new TCP connections.
+.IP \fBhash\fR
+Use a hash computed over the fields specified with the \fBfields\fR
+option, see below. If no hash fields are specified, \fBhash\fR defaults
+to a symmetric hash over the combination of MAC addresses, VLAN tags,
+Ether type, IP addresses and L4 port numbers. \fBhash\fR uses the
+\fBselection_method_param\fR as the hash basis.
+.IP
+Note that the hashed fields become exact matched by the datapath
+flows. For example, if the TCP source port is hashed, the created
+datapath flows will match the specific TCP source port value present
+in the packet received. Since each TCP connection generally has a
+different source port value, a separate datapath flow will need to
+be inserted for each TCP connection thus hashed to a select group
+bucket.
 .RE
 .IP
-This option will use a Netronome OpenFlow extension which is only supported
+This option uses a Netronome OpenFlow extension which is only supported
 when using Open vSwitch 2.4 and later with OpenFlow 1.5 and later.
 .IP \fBselection_method_param\fR=\fIparam\fR
@@ -2167,7 +2176,7 @@ lower-layer that implements the \fBselection_method\fR. It is optional if
 the \fBselection_method\fR field is specified as a non-empty string.
 Prohibited otherwise. The default value is zero.
 .IP
-This option will use a Netronome OpenFlow extension which is only supported
+This option uses a Netronome OpenFlow extension which is only supported
 when using Open vSwitch 2.4 and later with OpenFlow 1.5 and later.
 .IP \fBfields\fR=\fIfield\fR
-- 
1.9.1
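
For anyone who wants to try the new algorithm selector by hand: the ovs-ofctl man page text in this patch says that dp_hash reads the upper 32 bits of selection_method_param as the hash algorithm selector (0 or 1) and the lower 32 bits as the hash basis. A minimal Python sketch of that packing follows. The helper and constant names are made up for illustration and are not part of Open vSwitch; the sketch only produces the option fragment and does not talk to a switch.

```python
# Illustrative only: these helpers are hypothetical and not part of Open vSwitch.
# They follow the man page wording in this patch:
#   upper 32 bits of selection_method_param = dp_hash algorithm selector
#   lower 32 bits of selection_method_param = hash basis

ALG_L4 = 0            # hash computation over the IP 5-tuple
ALG_L4_SYMMETRIC = 1  # symmetric hash computation over the IP 5-tuple

U32_MAX = 0xFFFFFFFF


def pack_dp_hash_param(algorithm, basis=0):
    """Combine selector and basis into one 64-bit selection_method_param."""
    if algorithm not in (ALG_L4, ALG_L4_SYMMETRIC):
        raise ValueError("the man page lists only selectors 0 and 1")
    if not 0 <= basis <= U32_MAX:
        raise ValueError("hash basis must fit in 32 bits")
    return (algorithm << 32) | basis


def unpack_dp_hash_param(param):
    """Split a selection_method_param back into (selector, basis)."""
    return param >> 32, param & U32_MAX


def group_options(algorithm, basis=0):
    """Build the option fragment for a select group specification."""
    param = pack_dp_hash_param(algorithm, basis)
    # Decimal is used here; the text above does not say which other
    # number bases the parser accepts.
    return "selection_method=dp_hash,selection_method_param=%d" % param


if __name__ == "__main__":
    print(group_options(ALG_L4_SYMMETRIC, basis=7))
```

The printed fragment is meant to be placed in the group specification of an add-group or mod-group command on a group of type select, next to the bucket definitions.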