The latency stats published in v3 (https://mail.openvswitch.org/pipermail/ovs-dev/2017-February/328363.html) seem to be erroneous due to the way the RFC2544 test was configured in IXIA. Please find the updated latency stats below. Only the case 1 and case 2 stats, where the burst size is 32, are published.
BTW, while calculating latency, the stats parameter was set to 'Cut Through', meaning latency is calculated as first bit in, first bit out. Also, the acceptable Frame Loss % is set to 1%. Note that the results below are aggregated over approximately 9 iterations. Benchmarks are done on the same commit (83ede47a48eb92053f66815e462e94a39d8a1f2c) as v3.

Case 1: Matching IP flow rules for each IXIA stream
###############################################################

Packet           64                 128                 256                 512
Branch    Master     Patch    Master     Patch    Master     Patch    Master     Patch
Min        25360    199000     30260    208890     23490    131320     19620    118700  (ns)
Max       854260    577600    868680    302440    197420    195090    160930    184740  (ns)
Avg       384182    261213    412612    262091    190386    166025    133661    154787  (ns)

Packet         1024                1280                1518
Branch    Master     Patch    Master     Patch    Master     Patch
Min        20290    180650     30370    157260     19680    147550  (ns)
Max       304290    239750    178570    216650    199140    209050  (ns)
Avg       260350    209316    149328    185930    170091    177033  (ns)

Case 2: ovs-ofctl add-flow br0 in_port=1,action=output:2
###############################################################

Packet           64                 128                 256                 512
Branch    Master     Patch    Master     Patch    Master     Patch    Master     Patch
Min        27870     30680     13080     29160     12000     18970     14520     14610  (ns)
Max       323790    205930    282360    289470     39170     51610     48340     80670  (ns)
Avg       162219    163582     40685     41677     21582     41546     35017     66192  (ns)

Packet         1024                1280                1518
Branch    Master     Patch    Master     Patch    Master     Patch
Min        10820     29670     11270     24740     11510     24780  (ns)
Max        29480     70300     29900     39010     32460     40010  (ns)
Avg        18926     54582     19239     30636     19087     16722  (ns)

Regards,
Bhanuprakash.

>-----Original Message-----
>From: Bodireddy, Bhanuprakash
>Sent: Thursday, February 2, 2017 10:15 PM
>To: d...@openvswitch.org
>Cc: i.maxim...@samsung.com; ktray...@redhat.com; diproiet...@ovn.org;
>Bodireddy, Bhanuprakash <bhanuprakash.bodire...@intel.com>; Fischetti,
>Antonio <antonio.fische...@intel.com>; Markus Magnusson
><markus.magnus...@ericsson.com>
>Subject: [PATCH v3] netdev-dpdk: Implement Tx intermediate queue for
>dpdk ports.
>
>After packet classification, packets are queued into batches depending on the
>matching netdev flow. Thereafter each batch is processed to execute the
>related actions. This becomes particularly inefficient if there are few packets
>in each batch as rte_eth_tx_burst() incurs expensive MMIO writes.
>
>This commit adds back intermediate queue implementation. Packets are
>queued and burst when the packet count exceeds threshold. Also drain logic
>is refactored to handle packets hanging in the tx queues. Testing shows
>significant performance gains with this implementation.
>
>Fixes: b59cc14e032d("netdev-dpdk: Use instant sending instead of queueing of packets.")
>Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodire...@intel.com>
>Signed-off-by: Antonio Fischetti <antonio.fische...@intel.com>
>Co-authored-by: Antonio Fischetti <antonio.fische...@intel.com>
>Signed-off-by: Markus Magnusson <markus.magnus...@ericsson.com>
>Co-authored-by: Markus Magnusson <markus.magnus...@ericsson.com>
>---
>v2->v3
>  * Refactor the code
>  * Use thread local copy 'send_port_cache' instead of 'tx_port' while draining
>  * Invoke dp_netdev_drain_txq_ports() to drain the packets from the queue as
>    part of pmd reconfiguration that gets triggered due to port addition/deletion
>    or change in pmd-cpu-mask.
>  * Invoke netdev_txq_drain() from xps_get_tx_qid() to drain packets in old
>    queue. This is possible in XPS case where the tx queue can change after
>    timeout.
>  * Fix another bug in netdev_dpdk_eth_tx_burst() w.r.t 'txq->count'.
>
>Latency stats:
>Collected the latency stats with PHY2PHY loopback case using 30 IXIA streams
>/UDP packets/uni direction traffic. All the stats are in nanoseconds. Results
>below compare latency results between Master vs patch.
>
>case 1: Matching IP flow rules for each IXIA stream
>        E.g.: For an IXIA stream with src IP: 2.2.2.1, dst IP: 5.5.5.1
>              ovs-ofctl add-flow br0 dl_type=0x0800,nw_src=2.2.2.1,actions=output:2
>
>              For an IXIA stream with src IP: 4.4.4.1, dst IP: 15.15.15.1
>              ovs-ofctl add-flow br0 dl_type=0x0800,nw_src=4.4.4.1,actions=output:2
>
>Packet           64                 128                 256                 512
>Branch    Master     Patch    Master     Patch    Master     Patch    Master     Patch
>Min        26100    222000     26190    217930     23890    199000     30370    212440  (ns)
>Max      1239100    906910   1168740    691040    575470    574240    724360    734050  (ns)
>Avg      1189501    763908    913602    662941    486187    440482    470060    479376  (ns)
>
>Packet         1024                1280                1518
>Branch    Master     Patch    Master     Patch    Master     Patch
>Min        28320    189610     26520    220580     23950    200480  (ns)
>Max       701040  67584670    670390  19783490    685930    747040  (ns)
>Avg       444033    469297    415602    506215    429587    491593  (ns)
>
>
>case 2: ovs-ofctl add-flow br0 in_port=1,action=output:2
>
>Packet           64                 128                 256                 512
>Branch    Master     Patch    Master     Patch    Master     Patch    Master     Patch
>Min        18800     33970     19980     30350     22610     26800     13500     20220  (ns)
>Max       506140    596690    363010    363370    544520    541570    549120  77414700  (ns)
>Avg       459509    473536    254817    256801    287872    287277    290642    301572  (ns)
>
>Packet         1024                1280                1518
>Branch    Master     Patch    Master     Patch    Master     Patch
>Min        22530     15850     21350     36020     25970     34300  (ns)
>Max       549680 131964240    543390  81549210    552060  98207410  (ns)
>Avg       292436    294388    285468    305727    295133    300080  (ns)
>
>
>case 3 is same as case 1 with INTERIM_QUEUE_BURST_THRESHOLD=16,
>instead of 32.
>
>(w) patch
>case 3        64       128       256       512      1024      1280      1518
>Min       122700    119890    135200    117530    118900    116640    123710  (ns)
>Max       972830    808960    574180    696820  36717550    720500    726790  (ns)
>Avg       783315    674814    463256    439412    467041    463093    471967  (ns)
>
>case 4 is same as case 2 with INTERIM_QUEUE_BURST_THRESHOLD=16,
>instead of 32.
>
>(w) patch
>case 4        64       128       256       512      1024      1280      1518
>Min        31750     26140     25250     17570     14750     28600     31460  (ns)
>Max       722690    363200    539760    538320 301845040  12556210 132114800  (ns)
>Avg       485710    253497    285589    284095    293189    282834    285829  (ns)
>
>v1->v2
>  * xps_get_tx_qid() is no longer called twice. The last used qid is stored so
>    the drain function will flush the right queue also when XPS is enabled.
>  * netdev_txq_drain() is called unconditionally and not just for dpdk ports.
>  * txq_drain() takes the 'tx_lock' for queue in case of dynamic tx queues.
>  * Restored counting of dropped packets.
>  * Changed scheduling of drain function.
>  * Updated comments in netdev-provider.h
>  * Fixed a comment in dp-packet.h
>
>Details:
>  * In worst case scenario with fewer packets in batch, significant
>    bottleneck is observed at netdev_dpdk_eth_send() function due to
>    expensive MMIO writes.
>
>  * Also it's observed that CPI (cycles per instruction) rate for the function
>    stood between 3.15 and 4.1 which is significantly higher than acceptable
>    limit of 1.0 for HPC applications and theoretical limit of 0.25 (as backend
>    pipeline can retire 4 micro-operations in a cycle).
>
>  * With this patch, CPI for netdev_dpdk_eth_send() is at 0.55 and the overall
>    throughput improved significantly.
>
> lib/dp-packet.h       |  2 +-
> lib/dpif-netdev.c     | 66 ++++++++++++++++++++++++++++++++++++--
> lib/netdev-bsd.c      |  1 +
> lib/netdev-dpdk.c     | 87 ++++++++++++++++++++++++++++++++++++++++++++++-----
> lib/netdev-dummy.c    |  1 +
> lib/netdev-linux.c    |  1 +
> lib/netdev-provider.h |  8 +++++
> lib/netdev-vport.c    |  3 +-
> lib/netdev.c          |  9 ++++++
> lib/netdev.h          |  1 +
> 10 files changed, 166 insertions(+), 13 deletions(-)
>
>diff --git a/lib/dp-packet.h b/lib/dp-packet.h
>index 17b7026..9e3912a 100644
>--- a/lib/dp-packet.h
>+++ b/lib/dp-packet.h
>@@ -39,7 +39,7 @@ enum OVS_PACKED_ENUM dp_packet_source {
>     DPBUF_STACK,               /* Un-movable stack space or static buffer. */
>     DPBUF_STUB,                /* Starts on stack, may expand into heap.
*/
>     DPBUF_DPDK,                /* buffer data is from DPDK allocated memory.
>-                                * ref to build_dp_packet() in netdev-dpdk. */
>+                                * Ref dp_packet_init_dpdk() in dp-packet.c */
> };
>
> #define DP_PACKET_CONTEXT_SIZE 64
>diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>index 719a518..b0d47fa 100644
>--- a/lib/dpif-netdev.c
>+++ b/lib/dpif-netdev.c
>@@ -289,6 +289,8 @@ struct dp_netdev_rxq {
>     struct dp_netdev_pmd_thread *pmd;  /* pmd thread that will poll this queue. */
> };
>
>+#define LAST_USED_QID_NONE -1
>+
> /* A port in a netdev-based datapath. */
> struct dp_netdev_port {
>     odp_port_t port_no;
>@@ -437,8 +439,14 @@ struct rxq_poll {
> struct tx_port {
>     struct dp_netdev_port *port;
>     int qid;
>-    long long last_used;
>+    long long last_used;        /* In case XPS is enabled, it contains the
>+                                 * timestamp of the last time the port was
>+                                 * used by the thread to send data. After
>+                                 * XPS_TIMEOUT_MS elapses the qid will be
>+                                 * marked as -1. */
>     struct hmap_node node;
>+    int last_used_qid;          /* Last queue id where packets could be
>+                                   enqueued. */
> };
>
> /* PMD: Poll modes drivers. PMD accesses devices via polling to eliminate
>@@ -2900,6 +2908,25 @@ cycles_count_end(struct dp_netdev_pmd_thread *pmd,
> }
>
> static void
>+dp_netdev_drain_txq_ports(struct dp_netdev_pmd_thread *pmd)
>+{
>+    struct tx_port *cached_tx_port;
>+    int tx_qid;
>+
>+    HMAP_FOR_EACH (cached_tx_port, node, &pmd->send_port_cache) {
>+        tx_qid = cached_tx_port->last_used_qid;
>+
>+        if (tx_qid != LAST_USED_QID_NONE) {
>+            netdev_txq_drain(cached_tx_port->port->netdev, tx_qid,
>+                             cached_tx_port->port->dynamic_txqs);
>+
>+            /* Queue drained and mark it empty.
*/
>+            cached_tx_port->last_used_qid = LAST_USED_QID_NONE;
>+        }
>+    }
>+}
>+
>+static void
> dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
>                            struct netdev_rxq *rx,
>                            odp_port_t port_no)
>@@ -3514,15 +3541,18 @@ pmd_load_queues_and_ports(struct dp_netdev_pmd_thread *pmd,
>     return i;
> }
>
>+enum { DRAIN_TSC = 20000ULL };
>+
> static void *
> pmd_thread_main(void *f_)
> {
>     struct dp_netdev_pmd_thread *pmd = f_;
>-    unsigned int lc = 0;
>+    unsigned int lc = 0, lc_drain = 0;
>     struct polled_queue *poll_list;
>     bool exiting;
>     int poll_cnt;
>     int i;
>+    uint64_t prev = 0, now = 0;
>
>     poll_list = NULL;
>
>@@ -3555,6 +3585,17 @@ reload:
>                                        poll_list[i].port_no);
>         }
>
>+#define MAX_LOOP_TO_DRAIN 128
>+        if (lc_drain++ > MAX_LOOP_TO_DRAIN) {
>+            lc_drain = 0;
>+            prev = now;
>+            now = pmd->last_cycles;
>+
>+            if ((now - prev) > DRAIN_TSC) {
>+                dp_netdev_drain_txq_ports(pmd);
>+            }
>+        }
>+
>         if (lc++ > 1024) {
>             bool reload;
>
>@@ -3573,6 +3614,9 @@ reload:
>         }
>     }
>
>+    /* Drain the queues as part of reconfiguration */
>+    dp_netdev_drain_txq_ports(pmd);
>+
>     poll_cnt = pmd_load_queues_and_ports(pmd, &poll_list);
>     exiting = latch_is_set(&pmd->exit_latch);
>     /* Signal here to make sure the pmd finishes
>@@ -3890,6 +3934,7 @@ dp_netdev_add_port_tx_to_pmd(struct dp_netdev_pmd_thread *pmd,
>
>     tx->port = port;
>     tx->qid = -1;
>+    tx->last_used_qid = LAST_USED_QID_NONE;
>
>     hmap_insert(&pmd->tx_ports, &tx->node, hash_port_no(tx->port->port_no));
>     pmd->need_reload = true;
>@@ -4454,6 +4499,14 @@ dpif_netdev_xps_get_tx_qid(const struct dp_netdev_pmd_thread *pmd,
>
>     dpif_netdev_xps_revalidate_pmd(pmd, now, false);
>
>+    /* The tx queue can change in XPS case, make sure packets in previous
>+     * queue is drained properly.
*/
>+    if (tx->last_used_qid != LAST_USED_QID_NONE &&
>+        tx->qid != tx->last_used_qid) {
>+        netdev_txq_drain(port->netdev, tx->last_used_qid, port->dynamic_txqs);
>+        tx->last_used_qid = LAST_USED_QID_NONE;
>+    }
>+
>     VLOG_DBG("Core %d: New TX queue ID %d for port \'%s\'.",
>              pmd->core_id, tx->qid, netdev_get_name(tx->port->netdev));
>     return min_qid;
>@@ -4548,6 +4601,13 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
>                 tx_qid = pmd->static_tx_qid;
>             }
>
>+            /* In case these packets gets buffered into an intermediate
>+             * queue and XPS is enabled the drain function could find a
>+             * different Tx qid assigned to its thread. We keep track
>+             * of the qid we're now using, that will trigger the drain
>+             * function and will select the right queue to flush. */
>+            p->last_used_qid = tx_qid;
>+
>             netdev_send(p->port->netdev, tx_qid, packets_, may_steal,
>                         dynamic_txqs);
>             return;
>@@ -4960,7 +5020,7 @@ dpif_dummy_register(enum dummy_level level)
>                              "dp port new-number",
>                              3, 3, dpif_dummy_change_port_number, NULL);
> }
>-
>+
> /* Datapath Classifier. */
>
> /* A set of rules that all have the same fields wildcarded.
*/
>diff --git a/lib/netdev-bsd.c b/lib/netdev-bsd.c
>index 94c515d..00d5263 100644
>--- a/lib/netdev-bsd.c
>+++ b/lib/netdev-bsd.c
>@@ -1547,6 +1547,7 @@ netdev_bsd_update_flags(struct netdev *netdev_, enum netdev_flags off,
>     netdev_bsd_rxq_recv,                             \
>     netdev_bsd_rxq_wait,                             \
>     netdev_bsd_rxq_drain,                            \
>+    NULL,                                            \
> }
>
> const struct netdev_class netdev_bsd_class =
>diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>index 94568a1..3def755 100644
>--- a/lib/netdev-dpdk.c
>+++ b/lib/netdev-dpdk.c
>@@ -166,7 +166,6 @@ static const struct rte_eth_conf port_conf = {
>
> enum { DPDK_RING_SIZE = 256 };
> BUILD_ASSERT_DECL(IS_POW2(DPDK_RING_SIZE));
>-enum { DRAIN_TSC = 200000ULL };
>
> enum dpdk_dev_type {
>     DPDK_DEV_ETH = 0,
>@@ -286,15 +285,26 @@ struct dpdk_mp {
>     struct ovs_list list_node OVS_GUARDED_BY(dpdk_mp_mutex);
> };
>
>+/* Queue 'INTERIM_QUEUE_BURST_THRESHOLD' packets before tranmitting.
>+ * Defaults to 'NETDEV_MAX_BURST'(32) now.
>+ */
>+#define INTERIM_QUEUE_BURST_THRESHOLD NETDEV_MAX_BURST
>+
> /* There should be one 'struct dpdk_tx_queue' created for
>  * each cpu core. */
> struct dpdk_tx_queue {
>+    int count;                     /* Number of buffered packets waiting to
>+                                      be sent. */
>     rte_spinlock_t tx_lock;        /* Protects the members and the NIC queue
>                                     * from concurrent access. It is used only
>                                     * if the queue is shared among different
>                                     * pmd threads (see 'concurrent_txq'). */
>     int map;                       /* Mapping of configured vhost-user queues
>                                     * to enabled by guest. */
>+    struct rte_mbuf *burst_pkts[INTERIM_QUEUE_BURST_THRESHOLD];
>+                                   /* Intermediate queues where packets can
>+                                    * be buffered to amortize the cost of MMIO
>+                                    * writes.
*/
> };
>
> /* dpdk has no way to remove dpdk ring ethernet devices
>@@ -1381,6 +1391,7 @@ static inline int
> netdev_dpdk_eth_tx_burst(struct netdev_dpdk *dev, int qid,
>                          struct rte_mbuf **pkts, int cnt)
> {
>+    struct dpdk_tx_queue *txq = &dev->tx_q[qid];
>     uint32_t nb_tx = 0;
>
>     while (nb_tx != cnt) {
>@@ -1404,6 +1415,7 @@ netdev_dpdk_eth_tx_burst(struct netdev_dpdk *dev, int qid,
>         }
>     }
>
>+    txq->count = 0;
>     return cnt - nb_tx;
> }
>
>@@ -1788,12 +1800,42 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
>     }
> }
>
>+/* Enqueue packets in an intermediate queue and call the burst
>+ * function when the queue is full. This way we can amortize the
>+ * cost of MMIO writes. */
>+static inline int
>+netdev_dpdk_eth_tx_queue(struct netdev_dpdk *dev, int qid,
>+                         struct rte_mbuf **pkts, int cnt)
>+{
>+    struct dpdk_tx_queue *txq = &dev->tx_q[qid];
>+
>+    int i = 0;
>+    int dropped = 0;
>+
>+    while (i < cnt) {
>+        int freeslots = INTERIM_QUEUE_BURST_THRESHOLD - txq->count;
>+        int tocopy = MIN(freeslots, cnt-i);
>+
>+        memcpy(&txq->burst_pkts[txq->count], &pkts[i],
>+               tocopy * sizeof (struct rte_mbuf *));
>+
>+        txq->count += tocopy;
>+        i += tocopy;
>+
>+        /* Queue full, burst the packets */
>+        if (txq->count >= INTERIM_QUEUE_BURST_THRESHOLD) {
>+            dropped += netdev_dpdk_eth_tx_burst(dev, qid, txq->burst_pkts,
>+                                                txq->count);
>+        }
>+    }
>+    return dropped;
>+}
>+
> static int
> netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
>                        struct dp_packet_batch *batch,
>                        bool may_steal, bool concurrent_txq OVS_UNUSED)
> {
>-
>     if (OVS_UNLIKELY(!may_steal || batch->packets[0]->source != DPBUF_DPDK)) {
>         dpdk_do_tx_copy(netdev, qid, batch);
>         dp_packet_delete_batch(batch, may_steal);
>@@ -1836,7 +1878,7 @@ netdev_dpdk_send__(struct netdev_dpdk *dev, int qid,
>         cnt = netdev_dpdk_qos_run(dev, pkts, cnt);
>         dropped = batch->count - cnt;
>
>-        dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, cnt);
>+        dropped += netdev_dpdk_eth_tx_queue(dev, qid, pkts, cnt);
>
>     if (OVS_UNLIKELY(dropped)) {
>         rte_spinlock_lock(&dev->stats_lock);
>@@ -1850,6 +1892,30 @@ netdev_dpdk_send__(struct netdev_dpdk *dev, int qid,
>     }
> }
>
>+/* Drain tx queues, this is called periodically to empty the
>+ * intermediate queue in case of few packets (< INTERIM_QUEUE_BURST_THRESHOLD)
>+ * are buffered into the queue. */
>+static int
>+netdev_dpdk_txq_drain(struct netdev *netdev, int qid, bool concurrent_txq)
>+{
>+    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>+    struct dpdk_tx_queue *txq = &dev->tx_q[qid];
>+
>+    if (OVS_LIKELY(txq->count)) {
>+        if (OVS_UNLIKELY(concurrent_txq)) {
>+            qid = qid % dev->up.n_txq;
>+            rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
>+        }
>+
>+        netdev_dpdk_eth_tx_burst(dev, qid, txq->burst_pkts,
>+                                 txq->count);
>+
>+        if (OVS_UNLIKELY(concurrent_txq)) {
>+            rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
>+        }
>+    }
>+    return 0;
>+}
>+
> static int
> netdev_dpdk_eth_send(struct netdev *netdev, int qid,
>                      struct dp_packet_batch *batch, bool may_steal,
>@@ -3243,7 +3309,7 @@ unlock:
>                           SET_CONFIG, SET_TX_MULTIQ, SEND,    \
>                           GET_CARRIER, GET_STATS,             \
>                           GET_FEATURES, GET_STATUS,           \
>-                          RECONFIGURE, RXQ_RECV)              \
>+                          RECONFIGURE, RXQ_RECV, TXQ_DRAIN)   \
> {                                                             \
>     NAME,                                                     \
>     true, /* is_pmd */                                        \
>@@ -3310,6 +3376,7 @@ unlock:
>     RXQ_RECV,                                                 \
>     NULL, /* rx_wait */                                       \
>     NULL, /* rxq_drain */                                     \
>+    TXQ_DRAIN, /* txq_drain */                                \
> }
>
> static const struct netdev_class dpdk_class =
>@@ -3326,7 +3393,8 @@ static const struct netdev_class dpdk_class =
>         netdev_dpdk_get_features,
>         netdev_dpdk_get_status,
>         netdev_dpdk_reconfigure,
>-        netdev_dpdk_rxq_recv);
>+        netdev_dpdk_rxq_recv,
>+        netdev_dpdk_txq_drain);
>
> static const struct netdev_class dpdk_ring_class =
>     NETDEV_DPDK_CLASS(
>@@ -3342,7 +3410,8 @@ static const struct netdev_class dpdk_ring_class =
>         netdev_dpdk_get_features,
>         netdev_dpdk_get_status,
>         netdev_dpdk_reconfigure,
>-        netdev_dpdk_rxq_recv);
>+        netdev_dpdk_rxq_recv,
>+        NULL);
>
> static const struct netdev_class dpdk_vhost_class =
>
NETDEV_DPDK_CLASS(
>@@ -3358,7 +3427,8 @@ static const struct netdev_class dpdk_vhost_class =
>         NULL,
>         NULL,
>         netdev_dpdk_vhost_reconfigure,
>-        netdev_dpdk_vhost_rxq_recv);
>+        netdev_dpdk_vhost_rxq_recv,
>+        NULL);
> static const struct netdev_class dpdk_vhost_client_class =
>     NETDEV_DPDK_CLASS(
>         "dpdkvhostuserclient",
>@@ -3373,7 +3443,8 @@ static const struct netdev_class dpdk_vhost_client_class =
>         NULL,
>         NULL,
>         netdev_dpdk_vhost_client_reconfigure,
>-        netdev_dpdk_vhost_rxq_recv);
>+        netdev_dpdk_vhost_rxq_recv,
>+        NULL);
>
> void
> netdev_dpdk_register(void)
>diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c
>index 0657434..4ef659e 100644
>--- a/lib/netdev-dummy.c
>+++ b/lib/netdev-dummy.c
>@@ -1409,6 +1409,7 @@ netdev_dummy_update_flags(struct netdev *netdev_,
>     netdev_dummy_rxq_recv,                              \
>     netdev_dummy_rxq_wait,                              \
>     netdev_dummy_rxq_drain,                             \
>+    NULL,                                               \
> }
>
> static const struct netdev_class dummy_class =
>diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
>index 9ff1333..79478ee 100644
>--- a/lib/netdev-linux.c
>+++ b/lib/netdev-linux.c
>@@ -2830,6 +2830,7 @@ netdev_linux_update_flags(struct netdev *netdev_, enum netdev_flags off,
>     netdev_linux_rxq_recv,                                      \
>     netdev_linux_rxq_wait,                                      \
>     netdev_linux_rxq_drain,                                     \
>+    NULL,                                                       \
> }
>
> const struct netdev_class netdev_linux_class =
>diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h
>index 8346fc4..97e72c6 100644
>--- a/lib/netdev-provider.h
>+++ b/lib/netdev-provider.h
>@@ -335,6 +335,11 @@ struct netdev_class {
>      * If the function returns a non-zero value, some of the packets might have
>      * been sent anyway.
>      *
>+     * Some netdev provider - like in case of 'dpdk' - may buffer the batch
>+     * of packets into an intermediate queue. Buffered packets will be sent
>+     * out when their number will exceed a threshold or by the periodic call
>+     * to the drain function.
>+     *
>      * If 'may_steal' is false, the caller retains ownership of all the
>      * packets.
If 'may_steal' is true, the caller transfers ownership of all
>      * the packets to the network device, regardless of success.
>@@ -769,6 +774,9 @@ struct netdev_class {
>
>     /* Discards all packets waiting to be received from 'rx'. */
>     int (*rxq_drain)(struct netdev_rxq *rx);
>+
>+    /* Drain all packets waiting to be sent on queue 'qid'. */
>+    int (*txq_drain)(struct netdev *netdev, int qid, bool concurrent_txq);
> };
>
> int netdev_register_provider(const struct netdev_class *);
>diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
>index 2d0aa43..64cf617 100644
>--- a/lib/netdev-vport.c
>+++ b/lib/netdev-vport.c
>@@ -847,7 +847,8 @@ get_stats(const struct netdev *netdev, struct netdev_stats *stats)
>     NULL, /* rx_dealloc */                          \
>     NULL, /* rx_recv */                             \
>     NULL, /* rx_wait */                             \
>-    NULL, /* rx_drain */
>+    NULL, /* rx_drain */                            \
>+    NULL, /* tx_drain */
>
>
> #define TUNNEL_CLASS(NAME, DPIF_PORT, BUILD_HEADER, PUSH_HEADER, POP_HEADER) \
>diff --git a/lib/netdev.c b/lib/netdev.c
>index 1e6bb2b..5e0c53f 100644
>--- a/lib/netdev.c
>+++ b/lib/netdev.c
>@@ -670,6 +670,15 @@ netdev_rxq_drain(struct netdev_rxq *rx)
>             : 0);
> }
>
>+/* Flush packets on the queue 'qid'. */
>+int
>+netdev_txq_drain(struct netdev *netdev, int qid, bool netdev_txq_drain)
>+{
>+    return (netdev->netdev_class->txq_drain
>+            ? netdev->netdev_class->txq_drain(netdev, qid, netdev_txq_drain)
>+            : EOPNOTSUPP);
>+}
>+
> /* Configures the number of tx queues of 'netdev'. Returns 0 if successful,
>  * otherwise a positive errno value.
>  *
>diff --git a/lib/netdev.h b/lib/netdev.h
>index d6c07c1..7ddd790 100644
>--- a/lib/netdev.h
>+++ b/lib/netdev.h
>@@ -155,6 +155,7 @@ int netdev_rxq_drain(struct netdev_rxq *);
> int netdev_send(struct netdev *, int qid, struct dp_packet_batch *,
>                 bool may_steal, bool concurrent_txq);
> void netdev_send_wait(struct netdev *, int qid);
>+int netdev_txq_drain(struct netdev *, int qid, bool concurrent_txq);
>
> /* native tunnel APIs */
> /* Structure to pass parameters required to build a tunnel header. */
>--
>2.4.11

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
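
For anyone who wants to look at the queueing behaviour in isolation, here is a minimal standalone sketch of the intermediate queue idea from the patch above: packets are buffered until a threshold is reached, sent in one burst, and a drain call flushes whatever is left. This is not the OVS code. struct pkt, struct tx_queue, mock_tx_burst(), txq_flush(), txq_enqueue() and run_small_batches() are hypothetical names made up for illustration, and mock_tx_burst() only counts calls (assuming the NIC accepts every packet) instead of calling rte_eth_tx_burst().

```c
#include <assert.h>
#include <string.h>

/* Mirrors INTERIM_QUEUE_BURST_THRESHOLD (32) from the patch. */
#define BURST_THRESHOLD 32

/* Hypothetical stand-in for 'struct rte_mbuf'. */
struct pkt {
    int id;
};

/* Hypothetical, simplified version of 'struct dpdk_tx_queue'. */
struct tx_queue {
    int count;                               /* Packets currently buffered. */
    struct pkt *burst_pkts[BURST_THRESHOLD]; /* The intermediate queue. */
    int bursts;                              /* Mock bursts issued so far. */
    int sent;                                /* Packets given to the mock NIC. */
};

/* Mock transmit: assumes the NIC accepts every packet. Each call stands
 * for one burst, i.e. one expensive MMIO write. */
int
mock_tx_burst(struct tx_queue *txq, struct pkt **pkts, int cnt)
{
    (void) pkts;
    txq->bursts++;
    txq->sent += cnt;
    return cnt;
}

/* Send whatever is buffered; plays the role of the periodic drain. */
void
txq_flush(struct tx_queue *txq)
{
    if (txq->count) {
        mock_tx_burst(txq, txq->burst_pkts, txq->count);
        txq->count = 0;
    }
}

/* Buffer 'cnt' packets and burst each time the queue fills up. */
void
txq_enqueue(struct tx_queue *txq, struct pkt **pkts, int cnt)
{
    int i = 0;

    assert(cnt >= 0);
    while (i < cnt) {
        int freeslots = BURST_THRESHOLD - txq->count;
        int tocopy = cnt - i < freeslots ? cnt - i : freeslots;

        memcpy(&txq->burst_pkts[txq->count], &pkts[i],
               tocopy * sizeof pkts[0]);
        txq->count += tocopy;
        i += tocopy;

        /* Queue full, burst the packets. */
        if (txq->count >= BURST_THRESHOLD) {
            txq_flush(txq);
        }
    }
}

/* Feed 'batches' batches of 'batch_size' packets into the queue. */
void
run_small_batches(struct tx_queue *txq, int batches, int batch_size)
{
    static struct pkt dummy[BURST_THRESHOLD]; /* Reused packet storage. */
    struct pkt *batch[BURST_THRESHOLD];

    assert(batch_size >= 0 && batch_size <= BURST_THRESHOLD);
    for (int k = 0; k < batch_size; k++) {
        batch[k] = &dummy[k];
    }
    for (int b = 0; b < batches; b++) {
        txq_enqueue(txq, batch, batch_size);
    }
}
```

Starting from a zeroed struct tx_queue, run_small_batches(&txq, 10, 4) issues a single 32-packet burst and leaves eight packets buffered until txq_flush() is called; sending each batch directly would have taken ten bursts. Those leftover packets are the reason the patch needs a periodic drain, and the time they wait in the queue is the latency cost the tables above try to measure.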