git sha: [email protected]
Author: Gaetan Rivet <[email protected]>
One line subject: metrics: Introduce metrics framework.

This commit introduces a substantial metrics framework for OVS with
Prometheus data model support and a Python dashboard utility.

> diff --git a/lib/metrics-array.c b/lib/metrics-array.c
> new file mode 100644
> index 0000000000..ceb74ab8b5
> --- /dev/null
> +++ b/lib/metrics-array.c
> @@ -0,0 +1,112 @@

[ ... ]

> +void
> +metrics_array_read_one(double *values OVS_UNUSED,
> +                     void *it OVS_UNUSED)
> +{
> +    /* This is a dummy function serving as a placeholder. */
> +}

Does this function properly initialize the values array? The function
appears to be a no-op placeholder but is used as a callback in
metrics_array_read_values.

> diff --git a/lib/metrics-histogram.c b/lib/metrics-histogram.c
> new file mode 100644
> index 0000000000..022ee52c17
> --- /dev/null
> +++ b/lib/metrics-histogram.c

[ ... ]

> +    while (n_read_retries-- && sum != histogram->sum) {
> +        sum = histogram->sum;
> +        memcpy(buckets, histogram->bin, sizeof buckets);
> +    }

Can this loop fail to converge under heavy load conditions? The retry
mechanism attempts to get consistent data but may not handle cases where
the histogram is being updated faster than the read can complete.

> diff --git a/lib/metrics-visitor.c b/lib/metrics-visitor.c
> new file mode 100644
> index 0000000000..c1f7e81574
> --- /dev/null
> +++ b/lib/metrics-visitor.c

[ ... ]

> +    if (n == visitor->iterators.capacity) {
> +        visitor->iterators.stack = x2nrealloc(visitor->iterators.stack,
> +                                              &visitor->iterators.capacity,
> +                                              
> sizeof(visitor->iterators.stack[0]));
> +    }

Does this code handle the case where x2nrealloc might fail? While
x2nrealloc typically aborts on failure, the allocation pattern could
potentially grow unbounded.

> +    if (n == visitor->labels.capacity) {
> +        visitor->labels.stack = x2nrealloc(visitor->labels.stack,
> +                                       &visitor->labels.capacity,
> +                                       sizeof(visitor->labels.stack[0]));
> +    }

Similar concern about unbounded growth of the labels stack allocation.

> diff --git a/lib/metrics.c b/lib/metrics.c
> new file mode 100644
> index 0000000000..a879263e85
> --- /dev/null
> +++ b/lib/metrics.c

[ ... ]

> +void
> +metrics_root_set_name(const char *name)
> +{
> +    snprintf(metrics_root_name, sizeof metrics_root_name,
> +             "%s", name);
> +}

Does this function validate that name is not NULL? A NULL pointer passed
to snprintf would cause undefined behavior.

> diff --git a/lib/metrics.h b/lib/metrics.h
> new file mode 100644
> index 0000000000..2942194cb9
> --- /dev/null
> +++ b/lib/metrics.h

[ ... ]

> +#define METRICS_FOREACH(UP, NAME, ITERATION_CALLBACK, ...) \
> +    METRICS_DECLARE_INIT(NAME##_coll); \
> +    static struct metrics_foreach METRICS(NAME##_coll) = { \
> +        .node = METRICS_NODE_(NAME##_coll, NULL, FOREACH), \
> +        .map = ITERATION_CALLBACK, \
> +    }; \
> +    METRICS_DEFINE_INIT(UP, NAME##_coll); \
> +    METRICS_LABEL(NAME##_coll, NAME, NULL, __VA_ARGS__)

Does this macro handle the case where __VA_ARGS__ is empty? The macro
expansion might create invalid C code if no label keys are provided.

> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index 9dfa52d82e..e997258006 100644
> --- a/ofproto/ofproto-dpif-upcall.c

[ ... ]

> +                if (used && used < now - max_idle) {
> +                    histogram_add_sample(&revalidator->flow_del_latency,
> +                                         (now - max_idle) - used);
> +                }

Does this arithmetic handle potential underflow correctly? The expression
(now - max_idle) - used could underflow if the timing calculations are
incorrect.

> diff --git a/python/ovs/metrics.py b/python/ovs/metrics.py
> new file mode 100644
> index 0000000000..7a4ceb2651
> --- /dev/null
> +++ b/python/ovs/metrics.py

[ ... ]

> +    def wait(self, period):
> +        if period < self.last_query_duration:
> +            sys.stderr.write("Increasing period to %d ms\n" % 
> self.last_query_duration)
> +            period = self.last_query_duration
> +        else:
> +            # Avoid 'drifting' away from the target period,
> +            # try to remain close to requested reads
> +            next_wake = period
> +            if len(self.update_ts) > 0:
> +                next_wake -= self.last_ts() % period
> +                next_wake -= self.last_query_duration
> +            next_wake /= 1000
> +            if next_wake > 0:
> +                time.sleep(next_wake)
> +        return period

Can the modulo operation cause issues if period is zero? The expression
self.last_ts() % period would raise ZeroDivisionError if period becomes
zero.

> diff --git a/utilities/ovs-metrics.in b/utilities/ovs-metrics.in
> new file mode 100755
> index 0000000000..a6530b1288
> --- /dev/null
> +++ b/utilities/ovs-metrics.in

[ ... ]

> +    while abs(value) > power and n in range(len(labels) - 1):
> +        value /= power
> +        n += 1

Does this loop handle the case where labels is empty? If len(labels) is 0,
then range(len(labels) - 1) would be range(-1) which is empty, but the
function might still have logic issues with an empty labels list.

> +            except KeyError:
> +                # Some metrics can be disabled, such as hw-offload ones if
> +                # hw-offload=false in OVS config.
> +                self.disabled = True
> +        return 0

Are there potential race conditions when accessing metrics_names? The list
is accessed by index without bounds checking in the elif condition.

The commit appears to introduce a comprehensive metrics framework but has
several potential safety issues around bounds checking, NULL pointer
handling, and arithmetic operations that should be addressed.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to