I see following behaviour: 1. Configure low -us (like 100) 2. After that I see many logs about suspicious iterations (expected).
2018-03-27T13:58:27Z|03574|pmd_perf(pmd7)|WARN|Suspicious iteration (Excessive total cycles): tsc=520415762246435 duration=106 us 2018-03-27T13:58:27Z|03575|pmd_perf(pmd7)|WARN|Neighborhood of suspicious iteration: tsc cycles packets cycles/pkt pkts/batch vhost qlen upcalls cycles/upcall 520415762297985 9711 32 303 32 424 0 0 520415762287041 10667 32 333 32 419 0 0 520415762277319 9722 32 303 32 429 0 0 520415762267083 9971 32 311 32 443 0 0 520415762257413 9670 32 302 32 451 0 0 520415762246435 10699 32 334 32 448 0 0 520415762235033 11109 32 347 32 455 0 0 520415762180220 9826 32 307 32 399 0 0 520415762169792 10229 32 319 32 413 0 0 520415762160385 9407 32 293 32 408 0 0 520415762150221 9891 32 309 32 434 0 0 2018-03-27T13:58:27Z|03576|pmd_perf(pmd7)|WARN|Suspicious iteration (Excessive total cycles): tsc=520415762469997 duration=104 us 2018-03-27T13:58:27Z|03577|pmd_perf(pmd7)|WARN|Neighborhood of suspicious iteration: tsc cycles packets cycles/pkt pkts/batch vhost qlen upcalls cycles/upcall 520415762519119 9462 32 295 32 505 0 0 520415762509595 9319 32 291 32 537 0 0 520415762500154 9283 32 290 32 569 0 0 520415762490585 9287 32 290 32 601 0 0 520415762480693 9730 32 304 32 633 0 0 520415762469997 10414 32 325 32 665 0 0 520415762459348 10342 32 323 32 697 0 0 520415762297985 9711 32 303 32 424 0 0 520415762287041 10667 32 333 32 419 0 0 520415762277319 9722 32 303 32 429 0 0 520415762267083 9971 32 311 32 443 0 0 3. Configure back high -us (like 1000). 4. Logs are still there with zero duration. Logs printed every second like this: 2018-03-27T14:02:08Z|04140|pmd_perf(pmd7)|WARN|Suspicious iteration (Excessive total cycles): tsc=520437806368099 duration=0 us [Thread 0x7fb56f2910 (LWP 19754) exited] [New Thread 0x7fb56f2910 (LWP 19755)] 2018-03-27T14:02:08Z|04141|pmd_perf(pmd7)|WARN|Neighborhood of suspicious iteration: tsc cycles packets cycles/pkt pkts/batch vhost qlen upcalls cycles/upcall 520437806368309 44 0 0 0 0 0 0 520437806368266 43 0 0 0 0 0 0 520437806368223 43 0 0 0 0 0 0 520437806368179 44 0 0 0 0 0 0 520437806368134 45 0 0 0 0 0 0 520437806368099 35 0 0 0 0 0 0 520437806005193 362819 0 0 0 0 0 0 520437806005149 44 0 0 0 0 0 0 520437806005105 44 0 0 0 0 0 0 520437806005061 44 0 0 0 0 0 0 520437806005017 44 0 0 0 0 0 0 2018-03-27T14:02:09Z|04142|pmd_perf(pmd7)|WARN|Suspicious iteration (Excessive total cycles): tsc=520437909369235 duration=0 us [Thread 0x7fb56f2910 (LWP 19755) exited] [New Thread 0x7fb56f2910 (LWP 19756)] 2018-03-27T14:02:09Z|04143|pmd_perf(pmd7)|WARN|Neighborhood of suspicious iteration: tsc cycles packets cycles/pkt pkts/batch vhost qlen upcalls cycles/upcall 520437909369437 44 0 0 0 0 0 0 520437909369393 44 0 0 0 0 0 0 520437909369350 43 0 0 0 0 0 0 520437909369306 44 0 0 0 0 0 0 520437909369262 44 0 0 0 0 0 0 520437909369235 27 0 0 0 0 0 0 520437909006124 362999 0 0 0 0 0 0 520437909006080 44 0 0 0 0 0 0 520437909006035 45 0 0 0 0 0 0 520437909005991 44 0 0 0 0 0 0 520437909005947 44 0 0 0 0 0 0 5. Logs could be stopped only by setting -us to extremely high value like 10000. Is it expected? One more comment inline. On 18.03.2018 20:55, Jan Scheurich wrote: > This patch enhances dpif-netdev-perf to detect iterations with > suspicious statistics according to the following criteria: > > - iteration lasts longer than US_THR microseconds (default 250). > This can be used to capture events where a PMD is blocked or > interrupted for such a period of time that there is a risk for > dropped packets on any of its Rx queues. > > - max vhost qlen exceeds a threshold Q_THR (default 128). This can > be used to infer virtio queue overruns and dropped packets inside > a VM, which are not visible in OVS otherwise. > > Such suspicious iterations can be logged together with their iteration > statistics to be able to correlate them to packet drop or other events > outside OVS. > > A new command is introduced to enable/disable logging at run-time and > to adjust the above thresholds for suspicious iterations: > > ovs-appctl dpif-netdev/pmd-perf-log-set on | off > [-b before] [-a after] [-e|-ne] [-us usec] [-q qlen] > > Turn logging on or off at run-time (on|off). > > -b before: The number of iterations before the suspicious iteration to > be logged (default 5). > -a after: The number of iterations after the suspicious iteration to > be logged (default 5). > -e: Extend logging interval if another suspicious iteration is > detected before logging occurs. > -ne: Do not extend logging interval (default). > -q qlen: Suspicious vhost queue fill level threshold. Increase this > to 512 if the Qemu supports 1024 virtio queue length. > (default 128). > -us usec: change the duration threshold for a suspicious iteration > (default 250 us). > > Note: Logging of suspicious iterations itself consumes a considerable > amount of processing cycles of a PMD which may be visible in the iteration > history. In the worst case this can lead OVS to detect another > suspicious iteration caused by logging. > > If more than 100 iterations around a suspicious iteration have been > logged once, OVS falls back to the safe default values (-b 5/-a 5/-ne) > to avoid that logging itself causes continuos further logging. > > Signed-off-by: Jan Scheurich <jan.scheur...@ericsson.com> > Acked-by: Billy O'Mahony <billy.o.mah...@intel.com> > --- > NEWS | 2 + > lib/dpif-netdev-perf.c | 201 > ++++++++++++++++++++++++++++++++++++++++++++ > lib/dpif-netdev-perf.h | 42 +++++++++ > lib/dpif-netdev-unixctl.man | 59 +++++++++++++ > lib/dpif-netdev.c | 5 ++ > 5 files changed, 309 insertions(+) > > diff --git a/NEWS b/NEWS > index 8f66fd3..61148b1 100644 > --- a/NEWS > +++ b/NEWS > @@ -76,6 +76,8 @@ v2.9.0 - 19 Feb 2018 > * Commands ovs-appctl dpif-netdev/pmd-*-show can now work on a single > PMD > * Detailed PMD performance metrics available with new command > ovs-appctl dpif-netdev/pmd-perf-show > + * Supervision of PMD performance metrics and logging of suspicious > + iterations Same Post-v2.9.0. > - vswitchd: > * Datapath IDs may now be specified as 0x1 (etc.) instead of 16 digits. > * Configuring a controller, or unconfiguring all controllers, now > deletes > diff --git a/lib/dpif-netdev-perf.c b/lib/dpif-netdev-perf.c > index 2b36410..410a209 100644 > --- a/lib/dpif-netdev-perf.c > +++ b/lib/dpif-netdev-perf.c > @@ -25,6 +25,24 @@ > > VLOG_DEFINE_THIS_MODULE(pmd_perf); > > +#define ITER_US_THRESHOLD 250 /* Warning threshold for iteration duration > + in microseconds. */ > +#define VHOST_QUEUE_FULL 128 /* Size of the virtio TX queue. */ > +#define LOG_IT_BEFORE 5 /* Number of iterations to log before > + suspicious iteration. */ > +#define LOG_IT_AFTER 5 /* Number of iterations to log after > + suspicious iteration. */ > + > +bool log_enabled = false; > +bool log_extend = false; > +static uint32_t log_it_before = LOG_IT_BEFORE; > +static uint32_t log_it_after = LOG_IT_AFTER; > +static uint32_t log_us_thr = ITER_US_THRESHOLD; > +uint32_t log_q_thr = VHOST_QUEUE_FULL; > +uint64_t iter_cycle_threshold; > + > +static struct vlog_rate_limit latency_rl = VLOG_RATE_LIMIT_INIT(600, 600); > + > #ifdef DPDK_NETDEV > static uint64_t > get_tsc_hz(void) > @@ -127,6 +145,10 @@ pmd_perf_stats_init(struct pmd_perf_stats *s) > histogram_walls_set_log(&s->cycles_per_upcall, 1000, 1000000); > histogram_walls_set_log(&s->max_vhost_qfill, 0, 512); > s->start_ms = time_msec(); > + s->log_susp_it = UINT32_MAX; > + s->log_begin_it = UINT32_MAX; > + s->log_end_it = UINT32_MAX; > + s->log_reason = NULL; > } > > void > @@ -382,6 +404,10 @@ pmd_perf_stats_clear_lock(struct pmd_perf_stats *s) > history_init(&s->milliseconds); > s->start_ms = time_msec(); > s->milliseconds.sample[0].timestamp = s->start_ms; > + s->log_susp_it = UINT32_MAX; > + s->log_begin_it = UINT32_MAX; > + s->log_end_it = UINT32_MAX; > + s->log_reason = NULL; > /* Clearing finished. */ > s->clear = false; > ovs_mutex_unlock(&s->clear_mutex); > @@ -402,3 +428,178 @@ pmd_perf_stats_clear(struct pmd_perf_stats *s) > s->clear = true; > } > } > + > +/* Delay logging of the suspicious iteration and the range of iterations > + * around it until after the last iteration in the range to be logged. > + * This avoids any distortion of the measurements through the cost of > + * logging itself. */ > + > +void > +pmd_perf_set_log_susp_iteration(struct pmd_perf_stats *s, > + char *reason) > +{ > + if (s->log_susp_it == UINT32_MAX) { > + /* No logging scheduled yet. */ > + s->log_susp_it = s->iterations.idx; > + s->log_reason = reason; > + s->log_begin_it = history_sub(s->iterations.idx, log_it_before); > + s->log_end_it = history_add(s->iterations.idx, log_it_after + 1); > + } else if (log_extend) { > + /* Logging was initiated earlier, we the previous suspicious > iteration > + * now and extend the logging interval, if possible. */ > + struct iter_stats *susp = &s->iterations.sample[s->log_susp_it]; > + uint32_t new_end_it, old_range, new_range; > + > + VLOG_WARN_RL(&latency_rl, > + "Suspicious iteration (%s): tsc=%"PRIu64 > + " duration=%"PRIu64" us\n", > + s->log_reason, > + susp->timestamp, > + (1000000L * susp->cycles) / get_tsc_hz()); > + > + new_end_it = history_add(s->iterations.idx, log_it_after + 1); > + new_range = history_sub(new_end_it, s->log_begin_it); > + old_range = history_sub(s->log_end_it, s->log_begin_it); > + if (new_range < old_range) { > + /* Extended range exceeds history length. */ > + new_end_it = s->log_begin_it; > + } > + s->log_susp_it = s->iterations.idx; > + s->log_reason = reason; > + s->log_end_it = new_end_it; > + } > +} > + > +void > +pmd_perf_log_susp_iteration_neighborhood(struct pmd_perf_stats *s) > +{ > + ovs_assert(s->log_reason != NULL); > + ovs_assert(s->log_susp_it != UINT32_MAX); > + > + struct ds log = DS_EMPTY_INITIALIZER; > + struct iter_stats *susp = &s->iterations.sample[s->log_susp_it]; > + uint32_t range = history_sub(s->log_end_it, s->log_begin_it); > + > + VLOG_WARN_RL(&latency_rl, > + "Suspicious iteration (%s): tsc=%"PRIu64 > + " duration=%"PRIu64" us\n", > + s->log_reason, > + susp->timestamp, > + (1000000L * susp->cycles) / get_tsc_hz()); > + > + pmd_perf_format_iteration_history(&log, s, range); > + VLOG_WARN_RL(&latency_rl, > + "Neighborhood of suspicious iteration:\n" > + "%s", ds_cstr(&log)); > + ds_destroy(&log); > + s->log_susp_it = s->log_end_it = s->log_begin_it = UINT32_MAX; > + s->log_reason = NULL; > + > + if (range > 100) { > + /* Reset to safe default values to avoid disturbance. */ > + log_it_before = LOG_IT_BEFORE; > + log_it_after = LOG_IT_AFTER; > + log_extend = false; > + } > +} > + > +void > +pmd_perf_log_set_cmd(struct unixctl_conn *conn, > + int argc, const char *argv[], > + void *aux OVS_UNUSED) > +{ > + unsigned int it_before, it_after, us_thr, q_thr; > + bool on, extend; > + bool usage = false; > + > + on = log_enabled; > + extend = log_extend; > + it_before = log_it_before; > + it_after = log_it_after; > + q_thr = log_q_thr; > + us_thr = log_us_thr; > + > + while (argc > 1) { > + if (!strcmp(argv[1], "on")) { > + on = true; > + argc--; > + argv++; > + } else if (!strcmp(argv[1], "off")) { > + on = false; > + argc--; > + argv++; > + } else if (!strcmp(argv[1], "-e")) { > + extend = true; > + argc--; > + argv++; > + } else if (!strcmp(argv[1], "-ne")) { > + extend = false; > + argc--; > + argv++; > + } else if (!strcmp(argv[1], "-a") && argc > 2) { > + if (str_to_uint(argv[2], 10, &it_after)) { > + if (it_after > HISTORY_LEN - 2) { > + it_after = HISTORY_LEN - 2; > + } > + } else { > + usage = true; > + break; > + } > + argc -= 2; > + argv += 2; > + } else if (!strcmp(argv[1], "-b") && argc > 2) { > + if (str_to_uint(argv[2], 10, &it_before)) { > + if (it_before > HISTORY_LEN - 2) { > + it_before = HISTORY_LEN - 2; > + } > + } else { > + usage = true; > + break; > + } > + argc -= 2; > + argv += 2; > + } else if (!strcmp(argv[1], "-q") && argc > 2) { > + if (!str_to_uint(argv[2], 10, &q_thr)) { > + usage = true; > + break; > + } > + argc -= 2; > + argv += 2; > + } else if (!strcmp(argv[1], "-us") && argc > 2) { > + if (!str_to_uint(argv[2], 10, &us_thr)) { > + usage = true; > + break; > + } > + argc -= 2; > + argv += 2; > + } else { > + usage = true; > + break; > + } > + } > + if (it_before + it_after > HISTORY_LEN - 2) { > + it_after = HISTORY_LEN - 2 - it_before; > + } > + > + if (usage) { > + unixctl_command_reply_error(conn, > + "Usage: ovs-appctl dpif-netdev/pmd-perf-log-set " > + "[on|off] [-b before] [-a after] [-e|-ne] " > + "[-us usec] [-q qlen]"); > + return; > + } > + > + VLOG_INFO("pmd-perf-log-set: %s, before=%d, after=%d, extend=%s, " > + "us_thr=%d, q_thr=%d\n", > + on ? "on" : "off", it_before, it_after, > + extend ? "true" : "false", us_thr, q_thr); > + log_enabled = on; > + log_extend = extend; > + log_it_before = it_before; > + log_it_after = it_after; > + log_q_thr = q_thr; > + log_us_thr = us_thr; > + iter_cycle_threshold = (log_us_thr * get_tsc_hz()) / 1000000L; > + > + unixctl_command_reply(conn, ""); > +} > diff --git a/lib/dpif-netdev-perf.h b/lib/dpif-netdev-perf.h > index fd9b0fc..985aa5a 100644 > --- a/lib/dpif-netdev-perf.h > +++ b/lib/dpif-netdev-perf.h > @@ -171,6 +171,14 @@ struct pmd_perf_stats { > struct history iterations; > /* Millisecond history buffer. */ > struct history milliseconds; > + /* Suspicious iteration log. */ > + uint32_t log_susp_it; > + /* Start of iteration range to log. */ > + uint32_t log_begin_it; > + /* End of iteration range to log. */ > + uint32_t log_end_it; > + /* Reason for logging suspicious iteration. */ > + char *log_reason; > }; > > /* Support for accurate timing of PMD execution on TSC clock cycle level. > @@ -341,6 +349,15 @@ history_store(struct history *h, struct iter_stats *is) > return history_next(h); > } > > +/* Data and function related to logging of suspicious iterations. */ > + > +extern bool log_enabled; > +extern uint32_t log_q_thr; > +extern uint64_t iter_cycle_threshold; > + > +void pmd_perf_set_log_susp_iteration(struct pmd_perf_stats *s, char *reason); > +void pmd_perf_log_susp_iteration_neighborhood(struct pmd_perf_stats *s); > + > /* Functions recording PMD metrics per iteration. */ > > static inline void > @@ -370,6 +387,7 @@ pmd_perf_end_iteration(struct pmd_perf_stats *s, int > rx_packets, > uint64_t now_tsc = cycles_counter_update(s); > struct iter_stats *cum_ms; > uint64_t cycles, cycles_per_pkt = 0; > + char *reason = NULL; > > cycles = now_tsc - s->current.timestamp; > s->current.cycles = cycles; > @@ -421,6 +439,27 @@ pmd_perf_end_iteration(struct pmd_perf_stats *s, int > rx_packets, > /* Store in iteration history. This advances the iteration idx and > * clears the next slot in the iteration history. */ > history_store(&s->iterations, &s->current); > + > + if (log_enabled) { > + /* Log suspicious iterations. */ > + if (cycles > iter_cycle_threshold) { > + reason = "Excessive total cycles"; > + } else if (s->current.max_vhost_qfill >= log_q_thr) { > + reason = "Vhost RX queue full"; > + } > + if (OVS_UNLIKELY(reason)) { > + pmd_perf_set_log_susp_iteration(s, reason); > + cycles_counter_update(s); > + } > + > + /* Log iteration interval around suspicious iteration when reaching > + * the end of the range to be logged. */ > + if (OVS_UNLIKELY(s->log_end_it == s->iterations.idx)) { > + pmd_perf_log_susp_iteration_neighborhood(s); > + cycles_counter_update(s); > + } > + } > + > if (now_tsc > s->next_check_tsc) { > /* Check if ms is completed and store in milliseconds history. */ > uint64_t now = time_msec(); > @@ -458,6 +497,9 @@ void pmd_perf_format_iteration_history(struct ds *str, > int n_iter); > void pmd_perf_format_ms_history(struct ds *str, struct pmd_perf_stats *s, > int n_ms); > +void pmd_perf_log_set_cmd(struct unixctl_conn *conn, > + int argc, const char *argv[], > + void *aux OVS_UNUSED); > > #ifdef __cplusplus > } > diff --git a/lib/dpif-netdev-unixctl.man b/lib/dpif-netdev-unixctl.man > index 76c3e4e..ab8619e 100644 > --- a/lib/dpif-netdev-unixctl.man > +++ b/lib/dpif-netdev-unixctl.man > @@ -149,6 +149,65 @@ reported by the \fBdpif-netdev/pmd-stats-show\fR command. > To reset the counters and start a new measurement use > \fBdpif-netdev/pmd-stats-clear\fR. > . > +.IP "\fBdpif-netdev/pmd-perf-log-set\fR \fBon\fR|\fBoff\fR \ > +[\fB-b\fR \fIbefore\fR] [\fB-a\fR \fIafter\fR] [\fB-e\fR|\fB-ne\fR] \ > +[\fB-us\fR \fIusec\fR] [\fB-q\fR \fIqlen\fR]" > +. > +The userspace "netdev" datapath is able to supervise the PMD performance > +metrics and detect iterations with suspicious statistics according to the > +following criteria: > +.RS > +.IP \(em > +The iteration lasts longer than \fIusec\fR microseconds (default 250). > +This can be used to capture events where a PMD is blocked or interrupted for > +such a period of time that there is a risk for dropped packets on any of its > Rx > +queues. > +.IP \(em > +The max vhost qlen exceeds a threshold \fIqlen\fR (default 128). This can be > +used to infer virtio queue overruns and dropped packets inside a VM, which > are > +not visible in OVS otherwise. > +.RE > +.IP > +Such suspicious iterations can be logged together with their iteration > +statistics in the \fBovs-vswitchd.log\fR to be able to correlate them to > +packet drop or other events outside OVS. > + > +The above command enables (\fBon\fR) or disables (\fBoff\fR) supervision and > +logging at run-time and can be used to adjust the above thresholds for > +detecting suspicious iterations. By default supervision and logging is > +disabled. > + > +The command options are: > +.RS > +.IP "\fB-b\fR \fIbefore\fR" > +The number of iterations before the suspicious iteration to be logged > +(default 5). > +.IP "\fB-a\fR \fIafter\fR" > +The number of iterations after the suspicious iteration to be logged > +(default 5). > +.IP "\fB-e\fR" > +Extend logging interval if another suspicious iteration is detected > +before logging occurs. > +.IP "\fB-ne\fR" > +Do not extend logging interval if another suspicious iteration is detected > +before logging occurs (default). > +.IP "\fB-q\fR \fIqlen\fR" > +Suspicious vhost queue fill level threshold. Increase this to 512 if the Qemu > +supports 1024 virtio queue length (default 128). > +.IP "\fB-us\fR \fIusec\fR" > +Change the duration threshold for a suspicious iteration (default 250 us). > +.RE > + > +Note: Logging of suspicious iterations itself consumes a considerable amount > +of processing cycles of a PMD which may be visible in the iteration history. > +In the worst case this can lead OVS to detect another suspicious iteration > +caused by logging. > + > +If more than 100 iterations around a suspicious iteration have been logged > +once, OVS falls back to the safe default values (-b 5 -a 5 -ne) to avoid > +that logging itself continuously causes logging of further suspicious > +iterations. > +. > .IP "\fBdpif-netdev/pmd-rxq-show\fR [\fB-pmd\fR \fIcore\fR] [\fIdp\fR]" > For one or all pmd threads of the datapath \fIdp\fR show the list of > queue-ids > with port names, which this thread polls. > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > index f245ce2..670e5c1 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -1235,6 +1235,11 @@ dpif_netdev_init(void) > unixctl_command_register("dpif-netdev/pmd-rxq-rebalance", "[dp]", > 0, 1, dpif_netdev_pmd_rebalance, > NULL); > + unixctl_command_register("dpif-netdev/pmd-perf-log-set", > + "on|off [-b before] [-a after] [-e|-ne] " > + "[-us usec] [-q qlen]", > + 0, 10, pmd_perf_log_set_cmd, > + NULL); > return 0; > } > > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev